Phylogenetic analysis of Colchicum (Colchicaceae) genus by molecular markers from nuclear and chloroplast genome

We evaluated patterns of genetic variation in 16 Colchicum species, comprising 37 genotypes, using RAPD markers and trnL–trnF chloroplast DNA sequences. RAPD analysis yielded a total of 861 polymorphic alleles, with a mean of 33.88 ± 3.80 alleles per primer and a mean major allele frequency of 0.067 ± 0.05. The sequence length of trnL–trnF ranged from 1022 bp to 1081 bp. Phylogenetic trees were constructed to examine the relationships among Colchicum species and to compare the power of the nuclear and chloroplast markers to discriminate species. The trnL–trnF region grouped Colchicum species better than RAPD analysis did. This result was also supported by haplotype network analysis, structure analysis and PCA (Principal Component Analysis). The study shows that a characterization combining more molecular and morphological methods is needed to distinguish Colchicum species correctly.


Introduction
Colchicum L. is a geophytic member of the widely distributed flowering plant family Colchicaceae. The family's nineteen genera occur naturally throughout Asia, Eurasia, Africa and North America (Vinnersten & Reeves, 2003). The major speciation and diversification center for the genera is thought to be the Balkans and Turkey because of the high frequencies of species and endemics there (Persson, 1993). Turkey harbours 39 taxa of the family Colchicaceae, eighteen of which are endemic to Turkey (Brickell, 1984; Akan & Satil, 2005; Persson, 2000, 2005, 2007). Colchicaceae was first used as a family name by Candolle in 1805 (Vinnersten & Reeves, 2003). Since then, the taxonomic placement of several genera within Colchicaceae has remained uncertain (Kahraman & Celep, 2010). Some studies have suggested that two genera (Bulbocodium L. and Merendera Ramond) are separated from Colchicum L. on the basis of their style and sepal characteristics, although they are commonly included in Colchicaceae (Kahraman & Celep, 2010).
The morpho-anatomical characteristics of different Colchicum species have been investigated several times (Akan & Eker, 2005; Persson, 2005; Düşen & Sümbül, 2007; Kahraman & Celep, 2010; Sevgi & Kucuker, 2011). Colchicum is nevertheless a problematic genus, and classifying the species in their different periods of flowering and foliage is difficult. The characterization of Colchicum species has therefore not been fully achieved. In such a problematic genus, a single morphological or molecular parameter is not sufficient to describe the species accurately; because of its complexity, diversity should be evaluated with several research methods. Different types of molecular marker techniques covering the plant genome can be used to determine genetic diversity and evolutionary relationships. In this study, nuclear (RAPD) and chloroplast (trnL-trnF) genome markers, which differ in mode of inheritance and rate of mutation, were used for phylogenetic analysis. Random Amplified Polymorphic DNA (RAPD) markers are non-specific markers that are popular tools for investigating the nuclear genome and generating genetic fingerprint data. RAPD can help to resolve phylogenies among closely related species or at the intraspecific level and can detect high levels of polymorphism without requiring any prior information on the genome (Williams et al., 1990). The non-coding intergenic spacer regions of the chloroplast genome are widely used for molecular characterization and phylogenetic analysis in plants. Chloroplast DNA (cpDNA) is maternally inherited in plants and is particularly useful for phylogenetic studies because of its high degree of base sequence conservation (Palmer, 1987; Taberlet et al., 1991; Hodkinson et al., 2000; Zhang et al., 2001).
Although cpDNA is highly conserved within species, non-coding regions within the chloroplast genome have higher rates of mutation (Shaw et al., 2005). The non-coding intergenic spacer between trnL (leucine) and trnF (phenylalanine) is convenient to use at lower taxonomic levels because it is under fewer functional constraints and therefore provides more phylogenetically informative characters (Shaw et al., 2005; Downie & Palmer, 1992; Soltis et al., 1990). The analysis of non-coding cpDNA regions has proved extremely useful for determining phylogenetic relationships at lower taxonomic levels, between closely related species and even within species. So far, these regions have been widely used for phylogenetic inference at the intra- and interspecific levels in the family Colchicaceae (Vinnersten & Reeves, 2003; Türktaş et al., 2012). In the most recent phylogenetic analysis of the genus, Persson et al. (2011) used Colchicum morphology and six plastid regions (trnL intron, trnL-trnF IGS, trnY-trnD IGS, trnH-psbA IGS, atpB-rbcL IGS, rps16 intron) to reconstruct the Colchicum phylogeny and revealed important results about the phylogenetic relationships among Colchicum species. In this study, we seek to characterize the phylogenetic relationships of Colchicum collected from all regions of Turkey, including 16 species and 37 genotypes, using RAPD markers and sequence data of the chloroplast intergenic region trnL-trnF (c-f region) to resolve relationships in the genus.

Plant Materials
Leaf samples of 37 Colchicum L. genotypes, representing 16 species of the Turkish flora and collected from 32 locations in seven regions, were obtained from the Republic of Turkey, Ministry of Agriculture and Livestock, General Directorate of Agricultural Research and Policy, Atatürk Central Research Institute of Gardening Plants (ACRIGP) at Yalova (Table 1). The samples consisted of 10 autumn species including 18 genotypes, 6 spring species including 19 genotypes, and 6 endemic species including 16 genotypes.

DNA Extraction
The CTAB (hexadecyltrimethylammonium bromide) protocol (Doyle & Doyle, 1987) was used to isolate genomic DNA from Colchicum leaf tissues. DNA quantity was measured with a Qubit 2.0 fluorometer, and the DNA was diluted for PCR. The quality of the isolated DNA was checked on a 0.8% agarose gel.

Genotyping by Molecular Markers from Nuclear Genome
We generated RAPD fingerprints using Operon Technologies primers synthesized by Invitrogen Technologies. We used 51 RAPD primers in this study (Table 2). The optimized RAPD reaction, in a volume of 25 µl, contained 1× buffer, 0.2 mM dNTP mix, 3.75 mM MgCl2, 0.025 U µl⁻¹ Taq DNA polymerase, 1.8 µM primer and 1 µl gDNA. The PCR program consisted of initial denaturation for 3 min at 94 °C, followed by 40 cycles of denaturation for 1 min at 94 °C, annealing for 1 min at the temperature determined for each primer and elongation for 1 min at 72 °C, followed by a final elongation step at 72 °C for 5 min. We ran the PCR products on a 1.5% agarose gel prepared in 0.5× TBE buffer with 2 µl RedSafe per 100 ml for 90 min at 150 V. We visualized the gels with a gel documentation system under UV light, and the images were analyzed with Genosoft version 3.8.2 software.

Genotyping by Molecular Markers from Chloroplast Genome
The non-coding region between the trnL (UAA) 5' exon (c region) and trnF (GAA) (f region) of the chloroplast genome was amplified using the primers 5'-CGA AAT CGG TAG ACG CTA CG-3' and 5'-ATT TGA ACT GGT GAC ACG AG-3' (Taberlet et al., 1991; Mummenhoff et al., 2001; Figure 1). The amplification reactions were performed in a final volume of 25 µl containing 1× PCR buffer, 2.5 mM MgCl2, 0.025 U Taq DNA polymerase, 0.2 mM dNTPs, 0.3 µM forward primer, 0.5 µM reverse primer and 4 ng of genomic DNA. The PCR program was as follows: initial denaturation at 94 °C for 1 min 30 s; 35 cycles of 94 °C for 1 min, 60 °C for 35 s and 72 °C for 1 min 50 s; and a final extension step at 72 °C for 10 min. The PCR products were separated on a 1.5% agarose gel with RedSafe nucleic acid staining solution and photographed. The amplification products were purified with the "charge-switch-Pro PCR clean-up purification" kit. Following purification, cycle sequencing of the trnL-trnF fragment was carried out with the "ABI BigDye Terminator v3.1 Cycle Sequencing" kit. The cycle sequencing reaction mixture, with a total volume of 10 µL, included 1 µL 5× buffer, 2 µL BigDye terminator, 1.8 µL nuclease-free water, 3.2 µL primer (1 µM) and 2 µL of genomic DNA (4 ng µL⁻¹). The sequencing reactions were run at 96 °C for 1 min (denaturing), followed by 25 cycles of 10 s at 96 °C, 5 s at 50 °C and 4 min at 60 °C (final

Data Scoring and Statistical Analysis
After RAPD analysis, band sizes were determined and analyzed with Genosoft software version 3.8.2. The fragments were scored as binary (1-0) data according to their presence or absence. PIC (polymorphism information content) values, major allele frequency and gene diversity were calculated with PowerMarker version 3.25 (Liu & Muse, 2005). The model-based software STRUCTURE 2.3.3 was used to infer population structure. STRUCTURE 2.3.3 was run with a burn-in length of 10,000-100,000 to determine the number of subpopulations (K) (Mucciarelli et al., 2014). To determine the genetic structure of the 37 Colchicum genotypes using RAPD, Principal Component Analysis (PCA) was performed with GenAlEx 6.41 (Peakall & Smouse, 2012). For this purpose, pairwise genetic distances were calculated for each marker, and the principal components were extracted from the pairwise genetic distance matrix. Cluster analysis was performed and a phylogenetic tree for the RAPD markers was constructed with Phylip v3.695 using the neighbor-joining method with 100 bootstrap replicates (Felsenstein, 2005). The tree was visualized with the TREEVIEW program (Page & Holmes, 1998). After the 37 Colchicum genotypes had been sequenced, the trnL-trnF sequences were aligned with MEGA 6.06 (Tamura et al., 2013). Single nucleotide polymorphisms (SNPs) were extracted with MEGA 6.06 and scored as loci after alignment. Population structure was evaluated with STRUCTURE 2.3.3 using the SNP data coded as 0 and 1. Genetic diversity was assessed by PCA: pairwise distances were computed in MEGA 6.06, exported to Excel format and analyzed in GenAlEx 6.4.
Phylogenetic analysis of the trnL-trnF chloroplast intergenic region sequence data was performed in MEGA 6.06, and a neighbor-joining dendrogram was constructed. To support the phylogenetic tree of the trnL-trnF chloroplast region, a median-joining haplotype network was constructed using the PopART program (Bandelt, 1999).
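The 1-0 scoring of RAPD bands and the pairwise distance matrix used for PCA and clustering can be illustrated with a minimal Python sketch. It is not the Genosoft or GenAlEx procedure: the band sizes and genotype names are hypothetical, band sizes are assumed to be already binned to common values, and the Jaccard distance is an assumption, since the text does not name the distance coefficient that was used.

```python
from itertools import combinations


def score_bands(band_sizes, bins):
    """Return a 0/1 vector: 1 if a band of that size is present, else 0."""
    present = set(band_sizes)
    return [1 if size in present else 0 for size in bins]


def jaccard_distance(a, b):
    """1 - (bands shared by both profiles / bands present in either profile)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - shared / either


def distance_matrix(profiles):
    """Symmetric pairwise distance matrix stored as a dict keyed by name pairs."""
    names = sorted(profiles)
    dist = {(name, name): 0.0 for name in names}
    for p, q in combinations(names, 2):
        d = jaccard_distance(profiles[p], profiles[q])
        dist[(p, q)] = dist[(q, p)] = d
    return dist


# Hypothetical band sizes (bp) for three genotypes amplified with one primer.
bins = [120, 450, 800, 1200, 1870]
observed = {"G1": [120, 800, 1870], "G2": [120, 800], "G3": [450, 1200]}

profiles = {name: score_bands(sizes, bins) for name, sizes in observed.items()}
dist = distance_matrix(profiles)

for (p, q), d in sorted(dist.items()):
    if p < q:
        print(p, q, round(d, 3))
```

A matrix of this kind is the input that distance-based methods such as neighbor joining or principal coordinate extraction operate on; shared absences are ignored here because the absence of a dominant RAPD band in two genotypes is weak evidence of similarity.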

Genotyping Results from Nuclear and Chloroplast Genome
RAPD analyses were carried out to assess 37 genotypes gathered from 32 locations across Turkey. Annealing temperatures, the number of bands per primer and the minimum and maximum band sizes obtained with the 51 RAPD primers are listed in Table 2. A total of 861 polymorphic alleles were found through RAPD marker analyses. The smallest fragment observed was 120 bp (primer OPM 11) and the largest was 1870 bp (primer OPAW 7). All alleles were weighted equally and analyzed statistically with PowerMarker 3.25. The analyses showed a mean of 33.88 ± 3.80 alleles per primer, and the mean major allele frequency was 0.067 ± 0.05. Mean gene diversity was 0.964 ± 0.018 and mean PIC was 0.963 ± 0.19. The summary statistics are given in Table 2. The sequence length of trnL-trnF ranged from 1022 bp to 1081 bp. The 37 sequences obtained from the chloroplast intergenic region trnL-trnF of the 37 Colchicum genotypes were submitted to NCBI GenBank (accession numbers: KY860481-KY860517).
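For readers unfamiliar with the summary statistics reported above, the following minimal Python sketch computes major allele frequency, gene diversity and PIC for one locus from the textbook definitions (gene diversity as 1 − Σp², PIC after Botstein et al., 1980). It is an illustration only: the allele labels are hypothetical, the text does not state how RAPD banding patterns were coded as alleles, and PowerMarker may apply sample-size corrections that are not reproduced here.

```python
from collections import Counter


def locus_statistics(alleles):
    """Summary statistics for one locus from a list of observed allele labels."""
    counts = Counter(alleles)
    total = sum(counts.values())
    freqs = [count / total for count in counts.values()]

    sum_squares = sum(p * p for p in freqs)
    # Gene diversity (expected heterozygosity), uncorrected: 1 - sum(p_i^2)
    gene_diversity = 1.0 - sum_squares
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    cross_term = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {
        "allele_count": len(freqs),
        "major_allele_frequency": max(freqs),
        "gene_diversity": gene_diversity,
        "pic": gene_diversity - cross_term,
    }


# Hypothetical allele calls for four genotypes at one locus.
stats = locus_statistics(["a", "a", "b", "c"])
for name, value in stats.items():
    print(name, value)
```

With many alleles at similar, low frequencies, the cross term becomes small and PIC approaches gene diversity, which is the pattern seen in the reported means.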

Phylogenetic Analysis
The phylogenetic trees constructed from the RAPD markers and from the trnL-trnF sequence data for the 37 Colchicum L. genotypes (Figure 2A, B) differed from one another. The dendrogram based on the 51 RAPD markers separated the 37 Colchicum L. genotypes into three outgroups and two main groups, one of which contained eight subgroups. The tree based on the sequence data showed two main groups, each containing two subgroups. Eight species were represented by a single genotype. Two of them (C. boissieri Orph., C. polyphyllum Boiss. & Heldr.) appeared in one of the main outgroups of the RAPD tree, and four of them (C. boissieri Orph., C. chalcedonicum Azn. ssp. chalcedonicum, C. decaisnei Boiss. and C. polyphyllum Boiss. & Heldr.) appeared in one of the main outgroups of the trnL-trnF tree. In both the RAPD and the trnL-trnF trees, the singly represented species C. boissieri Orph. and C. polyphyllum Boiss. & Heldr. fell in the same main group. Each marker system has its own advantages and disadvantages for phylogenetic analysis. Some genotypes belonging to the same species were grouped well by RAPD analysis and others by trnL-trnF analysis (Figure 2A, B). It is quite difficult to determine the relationships between species using a single marker system. In addition, haplotype network analysis was performed to support the accuracy of the phylogenetic tree generated from the trnL-trnF sequences. The haplotype network visualizes the relationships among all haplotypes as well as their relative abundance. In the haplotype network (Figure 3), haplotypes are represented by circles. Two haplotypes are connected by a line if they are separated by one mutation; each additional mutation is indicated by a black line. The configuration of the haplotype network is consistent with the phylogenetic tree of the trnL-trnF chloroplast region.
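The two quantities a haplotype network displays, haplotype abundance (circle size) and the number of mutations separating haplotypes (line marks), can be sketched in a few lines of Python. This is not the median-joining algorithm implemented in PopART; it only collapses identical aligned sequences into haplotypes and counts pairwise differences. The sequences and sample names are hypothetical, and alignment gaps, if present, would be counted as an ordinary character state.

```python
from collections import defaultdict
from itertools import combinations


def collapse_haplotypes(aligned):
    """Group sample names by identical aligned sequence (one group per haplotype)."""
    groups = defaultdict(list)
    for name, sequence in aligned.items():
        groups[sequence.upper()].append(name)
    return dict(groups)


def mutation_steps(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


# Hypothetical aligned fragments for four samples.
aligned = {"s1": "ACGTAC", "s2": "ACGTAC", "s3": "ACGTAT", "s4": "TCGTAT"}

haplotypes = collapse_haplotypes(aligned)
steps = {
    (h1, h2): mutation_steps(h1, h2)
    for h1, h2 in combinations(sorted(haplotypes), 2)
}

for sequence, members in haplotypes.items():
    print(sequence, "abundance:", len(members), members)
for (h1, h2), n in steps.items():
    print(h1, h2, "mutations:", n)
```

Pairs separated by a single mutation correspond to the directly connected circles described above; a network method then decides which of the longer connections to draw.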

Structure Analysis
STRUCTURE software was used to estimate the likely assignment of Colchicum genotypes to subgroups. The structure of the genotypes was evaluated at the optimal K value, which indicates the most probable number of subgroups. Variation in the RAPD and trnL-trnF markers among the 37 genotypes of Colchicum L. resulted in four and two meaningful clusters (K=4 and K=2), respectively (Figure 4A, B), as identified with the statistic developed by Evanno et al. (2005). In Figure 4A, deltaK showed a high peak at K=4, so the genotypes were divided into four groups. In Figure 4B, a deltaK peak was observed only at K=2. The STRUCTURE results were processed with Structure Harvester (Earl and Vonholdt, 2012), and the four and two groups are shown in Figure 4C and D, respectively. The subpopulations inferred from the RAPD (yellow, red, green and blue) and trnL-trnF (red and green) marker data indicated that the 37 Colchicum L. genotypes were not clustered well.
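The deltaK statistic of Evanno et al. (2005) used to choose K can be written as a short Python sketch. It assumes the common formulation in which the absolute second-order difference of the mean log-likelihood, |L(K+1) − 2L(K) + L(K−1)|, is divided by the standard deviation of L(K) across replicate runs. The log-likelihood values below are hypothetical, and the use of the sample standard deviation is an assumption rather than a statement about Structure Harvester's internals.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style deltaK for each K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of Ln P(D) values from replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    sds = {k: stdev(values) for k, values in lnp_by_k.items()}

    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined for the smallest and largest K tested
        if sds[k] == 0:
            continue  # avoid division by zero when replicate runs are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / sds[k]
    return result


# Hypothetical Ln P(D) values from three replicate runs per K.
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-895.0, -897.0, -893.0],
    4: [-893.0, -896.0, -890.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

Because the statistic needs both neighbours, it cannot be evaluated at the smallest or largest K tested, so the range of K values run in STRUCTURE should extend beyond the values of interest.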

The Principal Component Analysis (PCA)
PCA of the RAPD marker data indicated that 36.81% of the variation was represented by coordinates 1 (18.76%) and 2 (18.05%). The first three principal coordinates represented 53.57% of the total variation (Table 3). PCA of the trnL-trnF sequence data showed that 56.58% of the variation was represented by coordinates 1 (37.67%) and 2 (18.91%). The first three principal coordinates represented 71.37% of the total variation (Table 3). The PCA graph of the RAPD dataset indicated that the 37 genotypes of Colchicum L. were not divided into any groups (Figure 5A), while the PCA graph of the trnL-trnF sequence dataset indicated that the 37 genotypes were divided into three main groups (Figure 5B).
This study used RAPD markers and the chloroplast intergenic region trnL-trnF to characterize the genus Colchicum and to show the phylogenetic relations, genetic diversity and population structure among 37 Colchicum L. genotypes belonging to 16 species. The PCA results for the RAPD markers did not divide the 37 genotypes into any groups, whereas the PCA results for trnL-trnF divided them into three main groups. According to the PCA results, there was no correlation between the RAPD markers and trnL-trnF. Although very few phylogenetic studies on Colchicum have been performed, there are many studies on the family Colchicaceae, particularly on Androcymbium species. In studies covering some Colchicum and Androcymbium species, Manning et al. (2007) used the rps16 intron, the atpB-rbcL spacer and the trnL-F region, and Del Hoyo & Pedrola-Monfort (2008) used the trnL intron, trnL-trnF IGS, trnY-trnD IGS, trnH-psbA IGS and RNApol2 intron 23 as specific markers and created a phylogenetic tree of Colchicum species based on both trnL-trnF sequences and morphological data. That study reported that the clades were in many cases not very well supported and that the origin of the Colchicum clade near the base of the Androcymbium group indicates that this dispersal event occurred early.
Arzate-Fernández et al. (2005) used RAPD, ISSR and cpDNA markers for the molecular characterization of Lilium maculatum var. bukosanense; they reported a significant correlation between gene diversity and population size and concluded that the taxon should be included in in situ conservation and breeding programs. The latest phylogenetic analysis of the genus Colchicum, including Bulbocodium and Merendera, was published by Persson et al. (2011), who studied phylogenetic relationships among Colchicum species using six plastid regions together with morphological and chromosomal characteristics; their phylogeny of almost all known Colchicum species was based on sequence data of noncoding chloroplast DNA (cpDNA). According to their results, ideal data sampling for such a group would combine morphology, plastid sequence data and biparentally inherited single-copy nuclear sequence data. A study of the phylogenetic relationships among 14 Colchicum taxa from Turkey, performed with a fluorescent-based AFLP technique, reported that the genetic diversity among these Turkish genotypes is relatively high (Metin et al., 2014). These researchers grouped the 14 Colchicum taxa into 3 clusters. Each marker system has its advantages as well as its disadvantages, so it is difficult to determine the phylogenetic relationships of species clearly with only one type of marker. Colchicum L. species should therefore be characterized from multiple perspectives, such as agronomic traits and molecular and cytogenetic data, to resolve the phylogeny of the genus (Del Hoyo et al., 2009).

Conclusion
The phylogenetic data obtained in this study may contribute significantly to future studies that aim to increase the resolution of the phylogenetic pattern of Colchicum. Because the species possess different ploidy levels, adding plastid sequence data will be important for better resolving the phylogenetic classification of the genus in future investigations. Additional data, such as selected morphometric characteristics and single-copy nuclear DNA sequence data, will be required to resolve all the relationships within the genus Colchicum.