Molecular phylogeny of angiospermic plant families using rbcL gene sequences

Abstract: The present study was undertaken to assess the role of plastid rbcL (ribulose-1,5-bisphosphate carboxylase large-subunit) gene sequences in addressing the evolutionary relationships within the angiosperms at the inter- and intra-familial levels through a computational experiment. To elucidate these relationships, a set of 92 chloroplast rbcL sequences representing 90 taxa from 12 genera and 10 angiosperm families (dicots and monocots) was retrieved from the GenBank database. Multiple sequence alignment was performed using the Genebee-ClustalW service to identify conserved regions and indels among the sequences. The phylogenetic tree was inferred from these sequences by the UPGMA (Unweighted Pair Group Method with Arithmetic Mean) method with bootstrap support, using MEGA (Molecular Evolutionary Genetics Analysis) software. The consistency of the genus-level groupings was further confirmed by the MP (Maximum-Parsimony), ME (Minimum-Evolution) and NJ (Neighbor-Joining) methods. The analyses strongly indicate that, of the 12 selected genera, Trichosanthes (Cucurbitaceae), Phyllanthus (Phyllanthaceae), Austrobryonia (Cucurbitaceae), Solanum (Solanaceae), Piper (Piperaceae) and Saxifraga (Saxifragaceae) group into separate clusters and are monophyletic, whereas Drypetes (Putranjivaceae), Asparagus (Asparagaceae), Cassia (subfamily Caesalpinioideae), Canna (Cannaceae) and Mentha (Lamiaceae) are paraphyletic. The members of Salvia (Lamiaceae) are distributed across several clades, confirming the polyphyly of this large genus. Similar results were obtained with all four methods. Thus, chloroplast rbcL gene sequences can unambiguously resolve these relationships, provide a good indication of the major supra-generic groupings among the selected angiosperm families, and offer many clues for future studies.


INTRODUCTION
The use of DNA nucleotide sequence data has stimulated a rapidly growing interest in molecular systematic analyses. Over the last decade, molecular systematics in plants has progressed rapidly with in-vitro DNA amplification and direct sequencing methods. In angiosperm systematics, this molecular approach has been effective in addressing many phylogenetic questions that could not be solved using phenotypic characters alone. The gene for the large subunit of ribulose-bisphosphate carboxylase/oxygenase (rbcL), located on the chloroplast genome, is an appropriate choice for inferring phylogenetic relationships at higher taxonomic levels [1][2][3], because of its slow synonymous nucleotide substitution rate in comparison with nuclear genes and its functional constraint, which reduces the evolutionary rate of non-synonymous substitution [4]. The substitution rate of rbcL appears appropriate for studies involving taxa that diverged tens of millions of years ago [5]. In recent years, there has been growing interest in using rbcL sequences to estimate plant phylogenies [6]. Several research groups have shown the usefulness of this gene for resolving intergeneric and interspecific relationships among flowering plants, for example in Cucurbitaceae [7], Orchidaceae [8], Rhamnaceae [9] and Araucariaceae [10]. The present study was undertaken to evaluate relationships at the intra- and inter-familial, intergeneric and interspecific levels, and to examine the rates, patterns and types of nucleotide substitutions within the rbcL gene in the selected taxa of each genus, through a computational experiment using 92 chloroplast rbcL gene sequences from 90 taxa, 12 genera and 10 families. The utility of the plastid rbcL gene for molecular phylogeny is also discussed.

MATERIALS AND METHODS
A total of 92 rbcL nucleotide sequences, representing 90 species (Spp) plus one duplicate sequence each for Austrobryonia micrantha and Mentha longifolia, from 12 genera and 10 angiospermic plant families (both dicots and monocots), were retrieved from the GenBank database for cladistic parsimony analyses (Table 1). Multiple sequence alignment was performed using the Genebee-ClustalW service. All gap characters were scored as missing data rather than as a fifth character. The sequence data were analyzed with UPGMA using MEGA software. The consistency of the genus-level groupings was further confirmed by the MP, NJ and ME methods as implemented in MEGA 4.1 software [11]. Tree statistics included the consistency index (CI) [12], retention index (RI) [13] and rescaled consistency index (RCI) [13]. Branch lengths were calculated with ACCTRAN optimization and equal weights, and the level of support for branches of the phylogenetic trees was evaluated by bootstrap analysis [14] based on 100 replicates, using a branch-and-bound search. Bootstrap percentages are described as high (85-100%), moderate (75-84%) or low (50-74%) [15]. The number of nucleotide substitutions per site was estimated by Kimura's two-parameter method [16] with the DNADIST program.
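
As an illustration of the distance estimate named above, the following is a minimal Python sketch of Kimura's two-parameter formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is not the DNADIST implementation; the function name k2p_distance and the choice to skip every site containing a gap or ambiguous base (in line with scoring gaps as missing data) are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites with a gap or an
    ambiguous base in either sequence are ignored (assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: treat as missing data
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # One transition (A<->G) in ten compared sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

For two real rbcL sequences the inputs would be the corresponding rows of the alignment; the toy strings here only exercise the formula.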

RESULTS
The 12 genera of angiosperms selected for the study, namely Trichosanthes (4 Spp), Solanum (2 Spp), Austrobryonia (4 Spp and 5 sequences), Piper (11 Spp), Phyllanthus (9 Spp), Saxifraga (2 Spp), Drypetes (10 Spp), Asparagus (3 Spp), Salvia (35 Spp), Cassia (4 Spp), Canna (3 Spp) and Mentha (3 Spp and 4 sequences), are highly divergent from one another in several respects. However, the biochemical, morphological and cytological characteristics of species within each genus show a greater level of similarity, which suggests that the molecular makeup of these organisms at the generic and family levels is also highly conserved. The analyses strongly indicate that, of the 12 selected genera, the members of Trichosanthes, Austrobryonia, Phyllanthus, Solanum, Piper and Saxifraga are nested in separate clusters as monophyletic groups within the major clades and sub-clades of the tree topology; this is strongly supported by high bootstrap values (100%). In contrast, Drypetes, Asparagus, Cassia, Canna and Mentha are paraphyletic, and the members of Salvia are distributed across several clades, confirming the polyphyletic condition of this large genus. These observations held for all four methods (UPGMA, MP, ME and NJ); only the positions of the genus-level clusters within the tree topology differed among them. The members of Asparagus (Asparagaceae) and Canna (Cannaceae) were nested within the dicot families, which also indicates the paraphyletic condition of these genera with respect to the selected dicot families. The total length of the tree is 14029. The 92 sequences in the data matrix range in length from 968 to 1474 nucleotides, of which only 877 nucleotide positions were used to infer the phylogenetic relationships among the selected taxa; 305 positions were variable and the remainder invariant.
These trees were characterized by a consistency index (CI) of 0.187540, a retention index (RI) of 0.733592 and a rescaled consistency index (RCI) of 0.137578 (for all sites). The strict consensus of these weighted trees is shown in Fig. 1. The sequence length of rbcL in the 35 members of the genus Salvia ranges from 968 to 1474. The members of this genus located in subclade I of clade V contain exactly 1323 nucleotide residues, and those in subclade I of clade I contain exactly 1371; these show the highest level of sequence similarity and originated from a common ancestral form. In contrast, the members of this genus located in subclade IV of clade I, in clade II and in clade III are highly variable. The UPGMA analysis strongly suggests that the members of the genus Trichosanthes evolved first and form a primitive group, and that Phyllanthus, Drypetes, Saxifraga, Solanum and Austrobryonia then evolved in sequential order. The members of the genus Piper evolved late and are inferred to be the most advanced among the selected genera. The members of the genus Mentha appear to have evolved twice, together with members of Salvia: Mentha suaveolens and M. longifolia evolved early, are found in cluster I and form a basal group, whereas M. rotundifolia and M. longifolia (2) evolved late and are nested in cluster V, indicating that they are advanced. Among the four species of Cassia, only C. senna and C. didymobotrya show a high level of similarity; this pairing is supported by a bootstrap value of 100% and occupies the basal group. Cassia grandis is nested within the members of Salvia in clade II, and Cassia fistula is sister to the members of the genus Saxifraga, a relationship strongly supported by a bootstrap value of 96%.
In the genus Canna, only C. indica and C. glauca are highly similar, supported by a bootstrap value of 100%, and are located in the major clade I. Another member of the same genus, C. tuerckheimii, is located in clade IV. Because of the high sequence diversity among them, the three selected species of Asparagus are distributed distantly across the tree topology, indicating that this genus evolved several times. A. capensis is found in clade I and is regarded as less evolved or primitive, whereas A. cochinchinensis is in cluster III and A. officinalis is in clade V; A. officinalis is inferred to be the most highly evolved of the selected species. Among the 35 Salvia species, the members located in subclade I of clade I and in subclade I of clade V show a greater level of sequence similarity, while members of the genus located in the other clades and subclades are much more variable. It is evident that the members of this large genus evolved several times and are distributed in several patches across the tree topology (Fig. 1). The same monophyletic, paraphyletic and polyphyletic conditions of the selected genera were also confirmed by the NJ, MP and ME methods, indicating that these conditions are strictly conserved. Only the relative positions of the genus-level groupings differed (Figs. 2-4).
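
The tree statistics reported above can be cross-checked with a short calculation. The rescaled consistency index is conventionally defined as the product of the consistency index and the retention index (RCI = CI x RI); the Python sketch below applies that definition to the reported values. The variable names are chosen for this example only.

```python
# Values as reported in the Results (for all sites).
ci = 0.187540            # consistency index
ri = 0.733592            # retention index
reported_rci = 0.137578  # rescaled consistency index as reported

# Standard definition: RCI = CI * RI.
rci = ci * ri

print(f"computed RCI = {rci:.6f}")
print("consistent with reported value:", abs(rci - reported_rci) < 1e-6)
```

Under that definition the three reported figures agree with one another to six decimal places.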

DISCUSSION
The study presented here is an attempt to employ a set of 92 plastid rbcL gene sequences of 90 taxa belonging to 12 genera of 10 angiospermic plant families to address inter- and intra-familial relationships. The analyses of rbcL nucleotide sequences presented here provide a great deal of support for previous hypotheses of relationships at the family and generic levels in some angiosperms. The members of the genera Asparagus (Asparagaceae) and Canna (Cannaceae), which belong to the monocotyledons, were nested within the dicot families, indicating the paraphyletic condition of these genera with respect to the selected dicot families. The present data also reject the views of Burger, 1997 and Taylor and Hickey, 1992 [17][18] that Piperaceae are basal among the most primitive angiosperms and that the monocots included in the present study were derived from them. In contrast to Shu-Miaw et al., 1997 [19], the monocots included in the present study did not cluster into a separate monophyletic group; instead, Canna and Asparagus are nested within the major clades of the dicot families. Therefore, the plastid rbcL gene sequence data lend support to Hutchinson's view that the single cotyledon, parallel-veined leaves, absence of cambium, dissected stele and adventitious root system of monocots are all apomorphies within the angiosperms, and that monocots are derived from dicots, the point of origin being Ranales [20]. These results show that analysis of the chloroplast rbcL gene is a useful approach for inferring phylogenetic relationships, especially at the supra-generic level.
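
The distinction drawn above between genera that form a single cluster and genera whose members are scattered across the tree can be made concrete with a small check on a rooted tree. The Python sketch below is an illustration only: the nested-tuple tree representation, the function names and the taxon labels are all hypothetical and are not taken from the trees in this study. A group is treated as monophyletic when some internal node (or leaf) has exactly that group as its set of descendant leaves; the check does not by itself separate paraphyly from polyphyly.

```python
def leaves(node):
    """Return the set of leaf labels below a node.

    A leaf is a string; an internal node is a tuple of child nodes.
    """
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly `group`."""
    group = set(group)

    def visit(node):
        if leaves(node) == group:
            return True
        if isinstance(node, str):
            return False
        return any(visit(child) for child in node)

    return visit(tree)


if __name__ == "__main__":
    # Toy rooted tree with made-up labels (not the study's data).
    toy_tree = (
        ("Piper_a", "Piper_b"),
        (("Salvia_a", "Mentha_a"), "Salvia_b"),
    )
    print(is_monophyletic(toy_tree, ["Piper_a", "Piper_b"]))
    print(is_monophyletic(toy_tree, ["Salvia_a", "Salvia_b"]))
```

In the toy tree the two Piper labels share a clade that contains nothing else, whereas the two Salvia labels can only be united in a clade that also includes a Mentha label, so the second check fails.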

CONCLUSION
In the present study, the relationships among the genera are well resolved and provide support for the monophyly of large clades and subclades for many of the selected plant genera, such as Trichosanthes, Austrobryonia, Saxifraga, Solanum, Phyllanthus and Piper. The polyphyletic condition of the genus Salvia is supported by the high transition/transversion ratio (2.7), the 554 gaps, and the significant variation in the total number of nucleotide residues among the 35 species of the genus (from 968 in Salvia nubicola to 1474 in Salvia divinorum).
The present investigation also concludes that the plastid rbcL gene alone is not likely to provide robust estimates of phylogeny. Data from additional molecular sources (complete nuclear and chloroplast genetic systems) and morphological sources, together with their combined analysis, are necessary to establish a stable and firm basis for a plant systematics (family classification) that better reflects phylogenetic relationships.