#416183

Anonymous
Quote:
First of all, to have an outlier you need several samples, not just one sample as was the case in Slovakia. Secondly, before assigning a Slovak population label, most other genetic scientists would wait until they had several individuals rather than giving it on the basis of one. The Dodecad project, for example, requires a minimum of 5 people before creating a population reference group. Thirdly, yes, an outlier is an extreme that deviates from the mean; however, there is a limit to that as well. At some point it is no longer an outlier but something that should not even be part of that sample.

Firstly, a sample could be as large as 1,000 subjects or as small as 30 subjects, or a stratification technique could be employed, taking several samples of any size from 30 to 200 from a population. In the case of a small sample, a well-known simulation technique called bootstrapping is employed, which generates average values by repeatedly resampling from the empirical probability distribution of the data. This is all discussed in a technical paper published by Novembre et al., which has been cited hundreds of times by other researchers in their work, meaning the graph you are not fond of has been accepted by the scientific community. In any case, an outlier is just that: an observation lying outside the cluster of values, usually more than 3 standard deviations from the mean, regardless of how many samples are used.
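To make the two ideas above concrete, here is a minimal Python sketch. It is only an illustration and not the procedure used by Novembre et al.: the numbers in `sample` are made up, the function names are hypothetical, and it works on a single coordinate (say, one principal component) rather than real genotype data. It resamples the data with replacement to get bootstrap means, then flags anything more than 3 sample standard deviations from the mean.

```python
import random
import statistics

# Made-up values standing in for one coordinate (e.g. a PCA axis)
# of 31 individuals; the last value is deliberately extreme.
sample = [i / 10 for i in range(-15, 15)] + [12.0]


def bootstrap_means(values, n_resamples=2000, seed=0):
    """Resample with replacement and return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(values)
    return [statistics.fmean(rng.choices(values, k=n))
            for _ in range(n_resamples)]


def percentile_interval(means, alpha=0.05):
    """Simple percentile interval taken from the sorted bootstrap means."""
    ordered = sorted(means)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper


def flag_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample SDs from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * sd]


means = bootstrap_means(sample)
interval = percentile_interval(means)
outliers = flag_outliers(sample)
```

Note that the flagged value is only reported here, not dropped. Whether to remove it or keep it is a separate decision.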

For example, in Eurogenes, in order to get the country label (i.e. SK1, SK2, SK3, etc. for Slovakia), the individual must be within a certain range from the mean. This means a "Russian" that clusters with Greeks on Eurogenes would likely not get the RU label, as it is safe to assume that they are not an ethnic Russian. Not to mention that you must know the ancestry of your participants. This "Slovak" should have been omitted; as I will repeat once more, they are likely not a Slovak. The same goes for the "Serbs/Yugoslavs", whoever they may be, who cluster with Greeks.

Once again, these are some very basic errors this scientist made: creating populations based on a single sample, and not restricting the sample to people of that specific ethnicity. It does not take a scientist to see that this is wrong.

Secondly, it does not matter what the guys from Eurogenes or Dodecad do. Their work does not receive much consideration in the scientific community, so they can remove outliers or use non-variable data in their private work at will. In formal work, some outliers are removed, while others remain in the study and are mentioned in supplementary notes for further investigation. The treatment of outliers in data analysis is a separate subject in itself. Nowhere have I seen the authors mention that Slovakian or Russian subjects claimed non-Slovakian or non-Russian ancestry in that particular study.

Thirdly, the study is specific to another population and the emphasis is on another population. Do you understand this? Now, cover the bottom of the graph with a piece of paper each time you are looking at it and be done with it.