The role of Southeastern Europe in origins and diffusion of major paternal lineages

ABSTRACT - The aim of this study is to explore the existing data based on high-resolution phylogenetic studies of Y chromosome variation in populations from Southeastern Europe and elsewhere in Eurasia in order to evaluate the role of the region in the process of the prehistoric colonization of the European continent and the structuring of the modern paternal genetic pool. Even though the distribution and estimated range expansions of major paternal lineages in Southeastern Europe are consistent with the typical European Y chromosome gene pool, the specific role of this region in the process of structuring the European paternal genetic landscape is evident in prehistoric episodes of significant gene flow that diffused from or into the region.

Introduction
The human Y chromosome defines male sex through the action of the sex determining region (SRY). It is an atypical segment of the human genome, since it is haploid in most of its length, escapes recombination with the X chromosome, and undergoes uniparental transmission. These properties make the Y chromosome sequence a valuable tool for the purposes of human history reconstruction and studies focused on the dispersal of anatomically modern humans. The non-recombining region of the Y (NRY) is inherited as a single locus that changes exclusively via mutations accumulating over time, thus allowing the preservation of a relatively simple record of genetic history in comparison to nuclear DNA (autosomes).
There has been an interest in studying paternal genetic history since the mid-1980s (e.g. Casanova et al. 1985; Hammer 1994; Underhill et al. 2000). A constantly growing number of evolutionarily informative polymorphisms provides a deeper resolution of human paternal history and evolution. Currently, there are more than 300 known SNPs (single nucleotide polymorphisms) and small indels (YCC 2002; Jobling and Tyler-Smith 2003). In evolutionary genetics terminology, the set of alleles at different biallelic loci along the chromosome is called a haplogroup.
Assuming a 1:1 sex ratio, the effective population size of the Y chromosome in a population would be about one-quarter of that of any autosome. Consequently, genetic differences depicted by Y chromosomes are, in comparison to those depicted by autosomes, more susceptible to the effects of random genetic drift, which accelerates geographic clustering and differentiation between different (especially small) populations. The general structure of paternal genealogies is compatible with, and indicative of, the common origin of all non-African contemporary populations from a small subset of Africans. Despite disagreement about the time to the most recent common ancestor (TMRCA) of the Y chromosome, its phylogeny is rooted in Africa at around 100 KYA (thousand years ago) (e.g. Hammer et al. 1998; Underhill et al. 2001; Underhill 2003).
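The one-quarter figure can be checked by counting chromosome copies in a breeding population. The sketch below is only an illustration under idealized assumptions (every individual breeds, and variance in reproductive success is ignored, so copy number stands in for effective population size); the function names are hypothetical and do not come from any cited study.

```python
from fractions import Fraction


def chromosome_copies(n_males: int, n_females: int) -> dict:
    """Count the chromosome copies carried by a breeding population."""
    return {
        "autosome": 2 * (n_males + n_females),  # two copies in every individual
        "X": n_males + 2 * n_females,           # one per male, two per female
        "Y": n_males,                           # one per male, none in females
    }


def relative_to_autosome(n_males: int, n_females: int) -> dict:
    """Express each copy count as an exact fraction of the autosomal count."""
    copies = chromosome_copies(n_males, n_females)
    return {
        name: Fraction(count, copies["autosome"])
        for name, count in copies.items()
    }


# Idealized population with a 1:1 sex ratio (assumed numbers, not data).
ratios = relative_to_autosome(n_males=500, n_females=500)
print(ratios["Y"])  # 1/4
print(ratios["X"])  # 3/4
```

Under this simplification, a sex ratio skewed towards females lowers the Y chromosome share further (for example, 100 males and 300 females give 1/8), which is one way to see why drift acts more strongly on paternal lineages in small populations.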
The first extensive studies of European Y chromosome dispersal by Semino et al. (2000) and Rosser et al. (2000) showed clinal patterns for the most frequent European haplogroups. Moreover, Semino et al. (2000) grouped more than 95% of European Y chromosomes into 10 phylogenetically distinct haplogroups, of which 70-80% of the Y chromosome gene pool was represented by R1a, R1b, I and N3, and the remaining 20% by J2, E3b, and G.

Palaeolithic haplogroups
Haplogroup I is the only autochthonous European haplogroup and is assumed to have arisen in an Epi-Gravettian group among the descendants of people who arrived in Europe from the Near East around 25 KYA (Semino et al. 2000). This haplogroup is almost entirely restricted to the European continent, where it shows frequency peaks in two areas: Scandinavia and Southeastern Europe (Semino et al. 2000). Further phylogenetic subdivision revealed subclades I1a, I1b*, I1b2, and I1c (Rootsi et al. 2004). The geographical distribution of I1a (the highest frequencies in Northern Europe among Norwegians, Swedes and Saami) is considered to be a result of the recolonization of Europe after the Last Glacial Maximum (LGM) from the Franco-Cantabrian refugial area (Rootsi et al. 2004). The origin of the less frequent I1c, which covers a wide range of Europe and peaks in northwest coastal Europe, is in concordance with that of I1a (Rootsi et al. 2004).
A completely different distribution pattern is observed in I1b* Y chromosomes, the most frequent haplogroup I clade in Eastern Europe and on the Balkan Peninsula. I1b* reaches maximum frequencies in Southeastern Europe in Bosnia and Herzegovina (Fig. 1). Our results indicate that the homogeneous distribution of elevated I1b* frequency among different populations in Southeastern Europe could support the hypothesis of their having a common paternal history shared over a long period of time (Peričić et al. 2005). Rootsi et al. (2004) estimated that I1b* diverged from I* at 10.7 ± 4.8 KYA, possibly in relation to the post Younger Dryas (YD) climate amelioration in Europe, and that I1b* expansion occurred around the early Holocene at 7.6 ± 2.7 KYA. Our coalescent estimate of I1b* (Peričić et al. 2005) is substantially older (11.1 ± 4.8 KYA). This finding suggests that the I1b* lineages might have expanded from Southeastern to Central, Eastern and Southern Europe in a period not earlier than the YD to Holocene transition and not later than the early Neolithic (Peričić et al. 2005). Although not yet supported by archaeological evidence, the I1b* spread in Europe suggests that Southeastern Europe could have served as an LGM refugium, as previously suggested by Semino et al. (2000) and Barać et al. (2003). This scenario could be indirectly supported by the recolonization of Northern Europe from the direction of Southeastern Europe by at least two species: the brown bear Ursus arctos (Taberlet and Bouvet 1994) and the European hedgehog Erinaceus europaeus (Hewitt 2000).

Fig. 1. I1b* frequency distribution in Europe, Northern Africa and Asia Minor (panel a) as well as in Southeastern Europe (panel b). Frequency distribution surfaces are taken from Peričić et al. (2005). I1b* frequency data for different Eurasian populations were generated from literature, as listed in Table 1 in Peričić et al. (2005).

Another widespread haplogroup in Europe, R1a, is characteristic of Eastern European populations (Fig. 2a). The age of this haplogroup has been approximated to 15 KYA (Semino et al. 2000; Wells et al. 2001). Kivisild et al. (2003) suggested that Southern and Western Asia might be the source of R1 and R1a differentiation. Present R1a distribution in Europe shows an increasing west-east frequency gradient, with the highest frequencies among Finno-Ugric and Slavic speakers (Fig. 2a). R1a frequency shows a decrease in the north-south direction in Southeastern Europe (Fig. 2b), where its age is estimated at 15.8 ± 2.1 KYA (Peričić et al. 2005). This estimate is consistent with the deep Palaeolithic time depth of R1a previously suggested by Semino et al. (2000) and Wells et al. (2001). At the current level of resolution it is not possible to determine which of three potential episodes of gene flow might have influenced the estimated age in Southeastern Europe: early post-LGM recolonizations from the direction of the Ukrainian refugium, migrations from the northern Pontic steppe in the period between 3000 and 1000 BC, or Slavic migrations between the 5th and 7th centuries AD.

Fig. 2. R1a frequency distribution in Europe, Northern Africa and Asia Minor (panel a) as well as in Southeastern Europe (panel b). Frequency distribution surfaces are taken from Peričić et al. (2005). R1a frequency data for different Eurasian populations were generated from literature, as listed in Table 1 in Peričić et al. (2005).

Its sister clade, haplogroup R1b, was introduced by or arose in an Aurignacian group who entered Europe and diffused from east to west about 40 to 35 KYA (Semino et al. 2000). R1b shows a frequency peak in Western Europe and a decrease in Eastern and Southern Europe (Fig. 3a). Even though the R1b frequency decline continues from Western to Southeastern and Southern Europe, two intermediate local peaks are evident in Southeastern Europe (Fig. 3b). According to our data, in Southeastern Europe the coalescent estimate of R1b (11.6 ± 1.4 KYA) closely matches the estimate for the I1b* lineages, pointing to the Younger Dryas to Holocene transition as a possible expansion period of these two major Y chromosome lineages (Peričić et al. 2005).

Neolithic haplogroups
Approximately 20% of European Y chromosomes belong to haplogroups E3b, J2 and G that, due to their decreasing frequency gradients from the Near East to Europe, have been traditionally considered to represent the male contribution of a demic diffusion of farmers (e.g. Semino et al. 2000; Semino et al. 2004; Cruciani et al. 2004). E3b1 shows a frequency peak in Southern and Southeastern Europe (Fig. 4a). In fact, E3b1 shows a rather continuous frequency decline in Southeastern Europe (Fig. 4a). Populations of the Adriatic-Dinaric complex are distinguished from neighboring populations of the Vardar-Morava-Danube river system by a lower frequency of E3b1 (Fig. 4b), possibly due to its different dispersal modes in two proximate geographic regions. Moreover, the Vardar-Morava-Danube river system could have been one of the major routes for E3b1 expansion from South and Southeastern to continental Europe, as evidenced in the archaeological record (e.g. Tringham 2000). The estimated age of this haplogroup in Southeastern Europe, 7.3 ± 2.8 KYA, accords with the time of expansion of the Neolithic in Europe (Cruciani et al. 2004; Semino et al. 2004).

Fig. 4. E3b1 frequency distribution in Europe, Northern Africa and Asia Minor (panel a) as well as in Southeastern Europe (panel b). Frequency distribution surfaces are taken from Peričić et al. (2005). E3b1 frequency data for different Eurasian populations were generated from literature, as listed in Table 1 in Peričić et al. (2005).

Haplogroup J is subdivided into two major clades, J1-M267 and J2-M172 (Cinnioglu et al. 2004). J2-M172 is more frequent in Europe (Semino et al. 2004). In Southeastern Europe the most frequent J clade is haplogroup J2e, which comprises 5% of all chromosomes (Peričić et al. 2005), while haplogroup J2, the main J2 cluster among Greeks and Italians (Di Giacomo et al. 2004), is present at a frequency of less than 1%. The estimated age of haplogroup J2e in Southeastern Europe (2.8 ± 1.6 KYA), together with its spatial distribution (two frequency peaks positioned in the Balkans and central Italy, Figs. 5a and 5b), may be explained by the maritime spread of J2e lineages from the southern Balkans towards the Apennines later than is traditionally suggested by the demic expansion model (Peričić et al. 2005).

Fig. 5. J2e frequency distribution in Europe, Northern Africa and Asia Minor (panel a) as well as in Southeastern Europe (panel b). Frequency distribution surfaces are taken from Peričić et al. (2005). J2e frequency data for different Eurasian populations were generated from literature, as listed in Table 1 in Peričić et al. (2005).

Concluding remarks
Even though the distribution and estimated range expansions of major paternal lineages in Southeastern Europe are consistent with the typical European Y chromosome gene pool, the specific role of this region in the process of structuring the European paternal genetic landscape is evident in the following prehistoric episodes of significant gene flow: the post-LGM R1a expansion from Eastern to Western Europe, the YD-Holocene I1b* diffusion out of the Balkans, with subsequent R1a and I1b* gene flows between Eastern and Southeastern Europe, and the weaker extent of E3b1 dispersal out of Southern and Southeastern Europe towards Eastern Europe than towards Western (especially Mediterranean) Europe.