Marine Genomics xxx (xxxx) xxxx
Contents lists available at ScienceDirect
Marine Genomics journal homepage: www.elsevier.com/locate/margen
Method paper
The chloroplast genome sequence of the green macroalga Caulerpa okamurae (Ulvophyceae, Chlorophyta): Its structural features, organization and phylogenetic analysis
Fengrong Zheng a,b,⁎⁎, Bo Wang a, Zhen Shen a, Zongxing Wang a, Wei Wang a, Hongzhan Liu c,⁎, Claire Wang d, Maosheng Xin e
a First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
b Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266000, China
c Marine College of Shandong University, Weihai 264209, China
d Qingdao Haiputao Organic Green Algae Research and Development Breed CO., LTD, Qingdao 266000, China
e Qingdao Heroes Group, Qingdao 266000, China
ARTICLE INFO
ABSTRACT
Keywords: Caulerpa okamurae; Chloroplast genome; Genome sequencing; Phylogenetic relationship
To clarify the evolutionary characteristics and phylogenetic relationships of Caulerpa okamurae, and to support its species identification, we determined its chloroplast genome (cpDNA) sequence by de novo sequencing. The cpDNA of C. okamurae was 148,274 bp in length and lacked the inverted repeat commonly found in vascular green plants. It was highly compact, with a gene density of 71.7%, and AT-rich (65.46% AT), comprising 76 protein-coding genes (PCGs), 27 transfer RNA (tRNA) genes, three ribosomal RNA (rRNA) genes, 32 putative open reading frames (ORFs) and six introns. The six introns were annotated in six genes: psbA, rpoB, ftsH, psbD, atpF and cysA. A total of 56 genes were encoded on the light strand, while the other 50 chloroplast genes were encoded on the heavy strand. All of the PCGs had ATG as their start codon and TAA, TGA or TAG as their termination codon. Phylogenetic analyses placed the complete cpDNA sequence of C. okamurae within Chlorophyta, Ulvophyceae, Bryopsidales and Caulerpaceae, where it most closely resembled the cpDNAs of C. racemosa, C. cliftonii and Tydemania expeditionis. Taken together, our data provide useful information for studies of the evolutionary characteristics, phylogenetic relationships and species identification of C. okamurae.
1. Introduction
Caulerpa okamurae, a marine green macroalga, belongs to Chlorophyta, Ulvophyceae, Bryopsidales, Caulerpaceae, and is distributed worldwide in tropical and subtropical oceans (Eun et al., 2003). C. okamurae is an edible seaweed species with high nutritional content and economic value (Eun et al., 2003). Caulerpa is a conspicuous member of the marine ulvophycean order Bryopsidales, with more than 20 species in northern Australia (Ian, 2011) and more than 97 species names listed in the online database AlgaeBase (http://www.algaebase.org). Caulerpa is one of the favored edible seaweed genera owing to its high nutritional content and economic value (Kudaka et al., 2008; Eun et al., 2003). Wang et al. (2018) analyzed the nutritional components of C. okamurae and reported Fe and Se contents of 381 mg/kg and 14.66 mg/kg, respectively. Besides the high nutritional value of Caulerpa as an edible seaweed, anticancer, antioxidative, antibacterial, antifungal and antiviral activities of Caulerpa extracts have been documented (Patama and Anong, 2006; Patricia et al., 2008; Reiko et al., 2012; Vanderlei et al., 2010; Da et al., 2014). Chloroplast genomes (cpDNAs) were the first characterized plant genomes owing to their small size, limited number of repeated elements, and abundance in foliar tissues (Lü et al., 2011; Leliaert et al., 2016). The determination of cpDNA sequences is not only important for the research of functional genomics, but also plays a crucial role in the
⁎ Corresponding author.
⁎⁎ Corresponding author at: Marine Ecology Research Center of the First Institute of Oceanography, MNR, No. 6 Xianxialing Road, Qingdao City, Shandong Province, 266061, PR China.
E-mail addresses: [email protected] (F. Zheng), [email protected] (H. Liu).
https://doi.org/10.1016/j.margen.2020.100752 Received 28 December 2018; Received in revised form 5 December 2019; Accepted 24 January 2020 1874-7787/ © 2020 Elsevier B.V. All rights reserved.
Please cite this article as: Fengrong Zheng, et al., Marine Genomics, https://doi.org/10.1016/j.margen.2020.100752
F. Zheng, et al.
origin and phylogenetic resolution in the green plant lineage (Wu et al., 2013; Lu et al., 2015; Melton et al., 2015). At present, complete chloroplast genome sequences have been obtained from virtually all of the major higher plant and algal lineages. Comparative analyses of those complete cpDNA sequences not only offer information for molecular species identification, but also help clarify the evolutionary relationships among some groups of algae and higher plants (Pombert et al., 2005; Lemieux et al., 2007; Marcelino et al., 2016; Monique et al., 1999; Kudaka et al., 2008; Lü et al., 2011; Zuccarello et al., 2009; Leliaert and Lopez-Bautista, 2015; Sun et al., 2016; Fučíková et al., 2016). Compared with higher plants, algal chloroplast genomes exhibit numerous extreme features (Simpson and Stern, 2002). Ulvophyceae is well represented by marine green macroalgae (Leliaert et al., 2012; Cocquyt et al., 2010). C. okamurae belongs to Ulvophyceae, Bryopsidales. To date, 12 complete cpDNAs have been published from organisms that are currently classified in the Bryopsidales: Caulerpa racemosa (KT946602), Caulerpa cupressoides (MG797569.1), Caulerpa cliftonii voucher HV03798 (NC_031368), Caulerpa lentillifera (NC_039377.1), Caulerpa manorensis (NC_037367.1), Caulerpa verticillata voucher HV05119 (NC_039523.1), Tydemania expeditionis (NC_026796.1), Ostreobium sp. (KU979013.1), Bryopsis hypnoides (NC_013359.1), Codium decorticatum (KT946603), Bryopsis plumosa (NC_026795.1) and Derbesia sp. (NC_031367). These include six complete cpDNAs from Caulerpa (https://www.ncbi.nlm.nih.gov/nuccore+Caulerpa complete chloroplast). Apart from completely sequenced cpDNAs, partial cpDNA data in the Bryopsidales have long been available for Codium fragile (Manhart et al., 1989) and C. sertularoides (Jungho Lee and James, 2003), and partial DNA sequence data are also available for C. filiformis (Zuccarello et al., 2009). However, little attention has been paid to the chloroplast of C. okamurae, especially its evolutionary characteristics and phylogenetic relationships within the Ulvophyceae. Here, we sequenced, assembled and annotated the complete cpDNA sequence of C. okamurae, and performed a phylogenomic investigation based on the genomic data of 28 other core Chlorophyta algae. Our results lay the foundation for future studies on the evolution of chloroplast genomes in Caulerpa, as well as for the molecular identification of C. okamurae varieties.
2. Materials and methods
2.1. Sample collection and genomic DNA isolation
Healthy, fresh C. okamurae specimens were collected from Qingdao, China on 4th July 2016 (36°19′30.3″N, 120°40′19.6″E). Collected samples were directly frozen in liquid nitrogen and then stored at −80 °C prior to genomic DNA isolation. Before DNA extraction, samples were rinsed several times with plenty of autoclaved seawater and brushed with a soft brush to remove surface microbial and epiphytic organisms. Total genomic DNA was purified from approximately 100 mg of tissue using the Plant Tissue DNA Kit (CWBIO Cat. No. CW0553, Beijing, China) according to the manufacturer's instructions. The quality of the isolated genomic DNA was examined using a NanoDrop 2000 spectrometer (Thermo Scientific, USA).
2.2. Illumina library preparation and sequencing
Paired-end DNA libraries were established following the Illumina manufacturer's instructions (Nextera DNA Library Prep Kit). The size-selected, adapter-modified DNA fragments were amplified by PCR. Briefly, after an initial denaturation step at 98 °C for 2 min, amplifications were carried out with 10 cycles at a melting temperature of 98 °C for 30 s, an annealing temperature of 65 °C for 30 s, and an extension temperature of 72 °C for 60 s, followed by an extra extension step at 72 °C for 4 min. DNA libraries were purified with magnetic beads and quantified by real-time PCR. Total DNA was used to generate a paired-end library with an insert size of 500 bp according to the standard protocol of the Illumina HiSeq 2500. Approximately 3.4 Gbp of raw data were generated with a read length of 250 bp, and the cpDNA sequencing depth was about 194×. Adapters and low-quality sequences were removed from these reads using CUTADAPT (Martin, 2011) and FastQC (v0.11.8) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), respectively.
2.3. De novo assembly
The de novo assembler SOAPdenovo2 (r240) was employed to assemble and qualitatively assess the obtained nucleotide sequence reads (http://soap.genomics.org.cn). Since the assembled contigs contain a mixture of sequences from both organellar and nuclear genomes, the chloroplast sequences were isolated based on the high correlation between contig read depth and the number of copies in the genome. First, we sorted the assembled contigs by read depth: the raw reads were mapped to the assembled contigs, and the read depth of each contig was calculated from the mapping. Taking advantage of the difference in read depth among contigs, we could separate the chloroplast contigs with high coverage (more than 500×) from the nuclear contigs. Second, we used all published chloroplast genome sequences to capture reads with BWA, and then assembled those reads into chloroplast contigs (Li and Durbin, 2009). Finally, we combined all isolated chloroplast contigs, recaptured reads against them to recover more chloroplast DNA reads, and then reassembled the reads and extended the contigs to obtain the complete chloroplast genome sequence. A C. okamurae library consisting of 6,809,771 paired-end reads amounting to approximately 3,404,885,500 bases of raw sequence data was constructed.
2.4. Genome annotation and analysis
The online program Dual Organellar GenoMe Annotator (DOGMA) (Cheng et al., 2013) and CPGAVAS (Schattner et al., 2005) were employed to perform preliminary gene annotation using the plastid/bacterial genetic code and default conditions. To verify the exact gene and exon boundaries, putative gene sequences and protein sequences were searched with BLAST against the Nt and Nr databases. tRNAscan-SE v1.21 with default settings (Schattner et al., 2005) and the tRNADB-CE search server (http://trna.ie.niigata-u.ac.jp/cgi-bin/trnadb/index.cgi) (Lowe and Chan, 2016) were used to verify the tRNA genes. Genes for rRNAs were identified and localized using RNAmmer (Lagesen et al., 2007). The OrganellarGenomeDRAW tool (OGDRAW v1.2) was employed to draw the graphical map of the C. okamurae cpDNA, followed by manual modification (Lohse et al., 2013). ORF-Finder at the NCBI was used to predict open reading frames (ORFs) with a minimal ORF length of 300 nt. MEGA 6.06 was employed to analyze the base composition, codon usage and nucleotide substitution (Tamura et al., 2013). The AT and GC asymmetries were investigated in terms of skew, using the following formulae: AT-skew = (A − T) / (A + T) and GC-skew = (G − C) / (G + C) (Abdullah et al., 2014). To further assess the evolutionary adaptation in the Caulerpa lineage, DnaSP 5.10.01 was used to estimate the GC- and AT-skews among the cpDNAs of 10 green algae (Rozas, 2009; Tamura et al., 2013).
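The AT-skew and GC-skew formulae given in Section 2.4 can be applied directly to a nucleotide sequence or to a reported base composition. The following is a minimal Python sketch of that calculation, not the DnaSP or MEGA implementation; the function names `skew` and `sequence_skews` are hypothetical, and ignoring ambiguous symbols such as N is an assumption made for illustration.

```python
from collections import Counter


def skew(x: float, y: float) -> float:
    """Return (x - y) / (x + y), the common form of AT-skew and GC-skew."""
    if x + y == 0:
        raise ValueError("skew is undefined when both counts are zero")
    return (x - y) / (x + y)


def sequence_skews(seq: str) -> tuple[float, float]:
    """Return (AT-skew, GC-skew) for a DNA sequence; non-ACGT symbols are ignored."""
    counts = Counter(seq.upper())
    at_skew = skew(counts["A"], counts["T"])
    gc_skew = skew(counts["G"], counts["C"])
    return at_skew, gc_skew


# Toy sequence (hypothetical): 3 A, 1 T, 1 G, 3 C
print(sequence_skews("AAATGCCC"))

# The same formula applied to the base composition (%) reported for the genome
print(round(skew(32.60, 32.86), 4), round(skew(16.71, 17.83), 4))
```

Applied to the reported base composition (32.60% A, 32.86% T, 16.71% G, 17.83% C), the formulae give about −0.0040 and −0.0324, close to the whole-genome values listed in Table 4 (−0.0041 and −0.0326); the small differences are presumably due to rounding of the percentages.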
2.5. Phylogenetic analysis
Phylogenetic analysis was conducted on 29 green algal organisms. One species, Porphyridium purpureum, was used as the outgroup. Alignments were conducted by MUSCLE v3.8.31 with default parameters after all positions containing gaps or missing data were
Fig. 1. Gene map of the C. okamurae chloroplast genome. Genes outside the circle are transcribed clockwise, while genes inside are transcribed counterclockwise. Gene blocks are color coded by functional group, as shown in the legend.
eliminated (Edgar, 2004). The GTR + G + I nucleotide model was selected using jModelTest (v2.1.7), and the LG + I + G amino acid model was selected using ProtTest (v3.4). The saturation of the nucleotide data matrix was examined in DAMBE v6.0.48. The maximum likelihood (ML) method, as implemented in RAxML 8.1.5, was applied for phylogenetic reconstruction (Stamatakis, 2014) (https://cme.h-its.org/exelixis/web/software/raxml/index.html). Branch support was evaluated by 1000 replications of bootstrap (BS) re-sampling. TreeView v.1.65 (Page, 1996) and Evolview (http://www.evolgenius.info/evolview/) (Zhang et al., 2012) were employed to draw the phylogenetic trees.
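The bootstrap re-sampling used for branch support rests on a simple idea: alignment columns are drawn with replacement to build pseudo-alignments of the original length, and a tree is inferred from each. The sketch below illustrates only that resampling step in Python. It is not how RAxML implements it internally, and the function names, the toy alignment and the fixed seed are all hypothetical.

```python
import random


def bootstrap_replicate(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping each column intact across taxa."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_sites = lengths.pop()
    # Draw as many column indices as there are sites, with replacement
    sampled = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in sampled) for taxon, seq in alignment.items()}


def bootstrap_replicates(alignment: dict[str, str], n_replicates: int = 1000, seed: int = 0):
    """Return a list of pseudo-alignments; a tree would then be inferred from each one."""
    rng = random.Random(seed)
    return [bootstrap_replicate(alignment, rng) for _ in range(n_replicates)]


# Toy alignment (hypothetical data, three taxa and six sites)
toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "TCGTAC"}
replicates = bootstrap_replicates(toy, n_replicates=1000)
print(len(replicates), replicates[0])
```

The bootstrap support of a branch is then the percentage of replicate trees in which that branch is recovered.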
3. Results and discussion
3.1. The cpDNA of C. okamurae
The C. okamurae library consisted of 6.83 M paired-end reads, of which 6.81 M remained as clean data. A 148,274-bp circular sequence, representing the complete cpDNA of C. okamurae (GenBank: KX809677), was generated from a total of 97,122,500 sequenced bases; this sequence had an average sequencing depth of 194.30× and a GC content of 34.54% (Fig. 1, Table 1). Fig. 1 illustrates the gene map of the cpDNA of C. okamurae. The cpDNA features of C. okamurae and comparison of
Table 1 Summary of the chloroplast genome features of C. okamurae and comparison of Ulvophyceae and Trebouxiophyceae cpDNAs.

| Taxon | GenBank accession | Genome size (bp) | AT (%) | Total genes (a) | Protein-coding genes | tRNA genes | rRNA genes | Freestanding ORFs (b) | Coding DNA (%) (c) | Introns (d) | Intronic ORFs | IR size (kb) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ulvophyceae, Bryopsidales | | | | | | | | | | | | |
| Caulerpa okamurae | KX809677 | 148,274 | 65.5 | 136 | 76 | 27 | 3 | 32 | 71.7 | 8 | 0 | none |
| Caulerpa cliftonii | NC_031368 | 131,135 | 62.4 | 142 | 74 | 28 | 3 | 37 | 76.6 | 10 | 4 | none |
| Caulerpa racemosa | KT946602 | 176,522 | 66.4 | 121 | 76 | 27 | 3 | 15 | 60.4 | 10 | 0 | none |
| Tydemania expeditionis | NC_026796 | 105,200 | 67.2 | 123 | 77 | 28 | 3 | 15 | 86.1 | 11 | 8 | none |
| Bryopsis hypnoides | NC_013359 | 153,429 | 66.9 | 111 | 69 | 37 | 5 | 0 | 44.6 | 11 | 0 | none |
| Bryopsis plumosa | NC_026795 | 106,859 | 69.2 | 118 | 80 | 27 | 3 | 8 | 83 | 14 | 3 | none |
| Derbesia sp. | NC_031367 | 115,765 | 70.3 | 126 | 77 | 26 | 3 | 20 | 87.8 | 8 | 4 | none |
| Codium decorticatum | KT946603 | 91,509 | 70.9 | 106 | 78 | 25 | 3 | 0 | 81.2 | 10 | 0 | none |
| Ulvophyceae, Ulvales | | | | | | | | | | | | |
| Ulva fasciata | NC_029040 | 96,005 | 75.1 | 106 | 71 | 27 | 2 | 6 | 86 | 5 | 5 | none |
| Oltmannsiellopsis viridis | DQ291132 | 151,933 | 59.7 | 117 | 75 | 26 | 3 | 13 | 65 | 5 | 10 | 18.5 |
| Pseudendoclonium akinetum | NC_008114 | 195,867 | 62.3 | 105 | 73 | 29 | 3 | 12 | 62.4 | 27 | 19 | 6 |
| Ulva sp. | KP720616 | 99,983 | 74.7 | 102 | 71 | 28 | 3 | 4 | 81.8 | 5 | 4 | none |
| Trebouxiophyceae | | | | | | | | | | | | |
| Chlorella vulgaris | AB001684 | 150,613 | 68.4 | 130 | 78 | 33 | 3 | 16 | 55.7 | 2 | 0 | none |
| Chlorophyceae | | | | | | | | | | | | |
| Tetradesmus obliquus | DQ396875 | 161,452 | 73.1 | 105 | 69 | 27 | 3 | 6 | 61.3 | 7 | 3 | 12.0 |
| Mychonastes homosphaera | KJ806270 | 25,149 | 57 | 42 | 13 | 23 | 6 | 0 | 72.8 | 0 | 0 | none |
| Chlamydomonas reinhardtii | NC_005353 | 203.8 | 65.5 | 99 | 67 | 27 | 5 | 2 | 49.9 | 5 | 0 | 22.2 |
| Leptosira terrestris | NC_009681 | 195.1 | 72.7 | 107 | 76 | 28 | 3 | 1 | 47.3 | 4 | 1 | none |
Note: (a) Sum of protein-coding genes, tRNA genes, rRNA genes and freestanding ORFs; duplicated genes counted only once. (b) ORFs > 300 bp. (c) % of genome consisting of conserved genes (including introns) and ORFs > 300 bp. (d) Duplicated genes with introns counted only once.

Table 2 Gene categories and numbers of the chloroplast genome of C. okamurae.

| Category of genes | Group of gene | Name of gene |
|---|---|---|
| Self-replication | Small subunit of ribosome | rps11, rps12, rps14, rps18, rps19, rps2, rps3, rps4, rps7, rps8, rps9 |
| | Large subunit of ribosome | rpl14, rpl16, rpl19, rpl2, rpl20, rpl23, rpl36, rpl5, rpl32 |
| | DNA-dependent RNA polymerase | rpoA, rpoBa, rpoBb, rpoC1, rpoC2 |
| | Ribosomal RNA genes | 5S-rRNA, 16S-rRNA, 23S-rRNA |
| | Transfer RNA genes | tRNA-Ser(GCT), tRNA-Gly(GCC), tRNA-Cys(GCA), tRNA-Asn(GTT), tRNA-Trp(CCA), tRNA-Lys(TTT), tRNA-Ile(GAT), tRNA-Ala(TGC), tRNA-Asp(GTC), tRNA-Arg(TCT), tRNA-Ser(TGA), tRNA-Tyr(GTA), tRNA-Glu(TTC), tRNA-Leu(TAG), tRNA-Phe(GAA), tRNA-Gln(TTG), tRNA-Arg(CCG), tRNA-Pro(TGG), tRNA-Thr(TGT), tRNA-Leu(GAG), tRNA-Gly(TCC), tRNA-Val(TAC), tRNA-Arg(ACG), tRNA-His(GTG), tRNA-Met(CAT)* |
| Genes for photosynthesis | Large subunit of Rubisco | rbcL |
| | Subunits of photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
| | Subunits of photosystem I | psaA, psaB, psaC, psaI, psaJ, psaM |
| | Subunits of ATP synthase | atpA, atpB, atpE, atpF, atpH, atpI |
| | Protochlorophyllide reductase subunit | chlB, chlI, chlL, chlN |
| | Subunits of cytochrome | petA, petB, petD, petG, petL |
| Other genes | Envelope membrane protein | cemA |
| | C-type cytochrome synthesis gene | ccsA |
| | Subunit of acetyl-CoA | accD |
| | Protease | clpP |
| | Photosystem I assembly | ycf3, ycf4 |
| | Translation initiation factor | infA |
| | Sulfate transport protein | cysA, cysT |
| | Cell division protein | ftsH |
| | Elongation factor Tu | tufA |
| Hypothetical protein gene | | ycf1, ycf12, ycf20 |

Note: * repeated gene.
Ulvophyceae and Trebouxiophyceae cpDNAs were summarized in Table 1. There were 76 protein-coding genes (PCGs), 27 tRNA genes, three rRNA genes, eight introns and 32 ORFs (Fig. 1, Table 2). The overall base composition of the cpDNA was 32.60% A, 32.86% T, 16.71% G and 17.83% C, giving an AT content of 65.46%. A total of 55 genes were encoded on the light strand (36 PCGs, 16 tRNAs and three rRNAs), while the other 51 chloroplast genes were encoded on the heavy strand (40 PCGs and 11 tRNAs). All of the PCGs had ATG as their start codon and employed TAA, TGA or TAG as their termination codon (52 genes employed TAA, 15 genes employed TGA and nine genes employed TAG). The circular cpDNA of C. okamurae was 148,274 bp in length, which was larger than most published cpDNAs of Ulvophyceae and only
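Table 3 Comparison of protein gene content in Ulvophyceae green algal cpDNA genomes (excluding tRNAs).

| Gene | Caulerpa okamurae | Caulerpa racemosa | Caulerpa cliftonii | Tydemania expeditionis | Bryopsis plumosa | Bryopsis hypnoides | Derbesia sp. | Codium decorticatum | Ulva fasciata | Chlorella vulgaris | Oltmannsiellopsis viridis | Tetradesmus obliquus |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chlB | + | + | + | + | + | + | + | + | − | + | + | + |
| chlI | + | + | + | + | + | − | + | + | + | + | + | − |
| chlL | + | + | + | + | + | + | + | + | − | + | + | + |
| chlN | + | + | + | + | + | + | + | + | − | + | + | + |
| cysA | + | + | + | + | + | + | + | + | − | + | − | − |
| cysT | + | + | + | + | + | + | + | + | − | + | − | − |
| ftsH | + | + | + | + | + | − | + | + | + | + | + | + |
| psaI | + | + | + | + | + | − | + | + | + | + | + | − |
| psaM | + | + | + | + | + | − | + | + | + | + | + | − |
| rpl19 | + | + | + | + | + | − | + | + | + | + | + | − |
| ycf1 | + | + | + | + | + | − | + | + | + | − | + | + |
| ycf12 | + | + | − | + | + | + | − | + | + | + | + | + |
| ycf20 | + | + | + | + | + | − | + | + | + | − | + | − |
| petL | + | + | − | − | + | + | + | + | + | + | + | + |
| rpl12 | − | − | − | − | + | + | + | + | + | + | + | + |
| rpl32 | + | + | + | + | + | + | + | + | + | + | + | − |
| ycf47 | − | − | − | + | + | − | + | + | − | − | − | − |
| tilS | − | − | − | − | + | − | − | − | − | − | − | − |
| minD | − | − | − | − | − | − | − | − | − | + | + | − |
| psb30 | − | − | + | − | − | − | + | − | − | − | − | − |
| ycf5 | − | − | − | − | − | − | − | − | − | + | − | − |
| ycf9 | − | − | − | − | − | − | − | − | − | + | − | − |
| ccsA | + | + | + | + | + | + | + | + | + | − | + | + |
| cemA | + | + | + | + | + | + | + | + | + | − | + | + |
| psbZ | + | + | + | + | + | + | + | + | + | − | + | + |
| ycf10 | − | − | − | − | − | − | − | − | − | + | − | − |
| minE | − | − | − | − | − | − | − | − | − | + | − | − |
| I-CvuI | − | − | − | − | − | − | − | − | − | + | − | − |

Note: A common set of 60 genes is shared by these genomes (accD, atpA, atpB, atpE, atpF, atpH, atpI, clpP, infA, petA, petB, petD, petG, psaA, psaB, psaC, psaJ, psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, rbcL, rpl14, rpl16, rpl2, rpl20, rpl23, rpl36, rpl5, rpoA, rpoB, rpoC1, rpoC2, rps11, rps12, rps14, rps18, rps19, rps2, rps3, rps4, rps7, rps8, rps9, tufA, ycf3, ycf4, rrf, rrs, rrl). See Table 1 for accession numbers.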
smaller than C. racemosa (176,522 bp), Chlamydomonas reinhardtii (203 kb) and P. akinetum (195,867 bp); it was similar in size to those of B. hypnoides, C. cliftonii (131.135 kb) and O. viridis (151.933 kb) (Lü et al., 2011; Pombert et al., 2005; Lopez-Bautista and Lam, 2016; Boudreau et al., 1994; Pombert, 2006). Algae have been shown to have a wide range of organellar genome sizes, compositions (such as AT content, number of genes and number of introns) and gene organizations (Leliaert et al., 2012; Lang and Nedelcu, 2012). The cpDNA of Helicosporidium sp., a parasitic, nonphotosynthetic green alga, is only 37.5 kb in length, which is the smallest among the characterized cpDNAs, and it lacks all genes coding for functional proteins in photosynthesis (de Koning and Keeling, 2006). The cpDNA of the siphonous alga Acetabularia sp. is greater than 2000 kb, which is the largest known cpDNA of photosynthetic organisms (Padmanabhan and Green, 1978; Manhart et al., 1989). The AT content of the C. okamurae cpDNA was 65.5%, falling within the range of other ulvophycean and green algal cpDNAs and comparable with that of C. cliftonii (62.4%), C. racemosa (66.4%), T. expeditionis (67.2%), B. hypnoides (66.9%) and P. akinetum (62.3%) (Table 1). The coding DNA of the C. okamurae cpDNA was 71.7%, which resembled that of Mychonastes homosphaera (72.8%), O. viridis (65%) and C. cliftonii (76.6%), falling within the range of other ulvophycean and green algal cpDNAs. Intergenic spacers in the C. okamurae cpDNA ranged from 2 to 5548 bp in size, and 68 spacers were larger than 100 bp. The longest intergenic spacer was 5548 bp in length, and another four intergenic spacer regions exceeded 2000 bp. Moreover, there were eight regions of overlapping genes: rpl16 and rpl14, rpoBb and rpoBa, rpl20 and rps18, 23S-rRNA and ORF243, ORF114 and ORF113, tRNA-Arg and chlI, psbD and psbC, and atpI and rps2 (Table 1).
Many green algal cpDNAs have a quadripartite structure, which is also found in higher plants; such a structure is characterized by the presence of two copies of a large inverted repeat (IR) sequence separating a small and a large single-copy region. The IR sequence is a remarkable characteristic of higher plant cpDNA. Although this architecture is believed to be ancestral in the green algae, many species do not have a quadripartite structure (Wakasugi et al., 1997; de Cambiaire et al., 2007; Civáň et al., 2014; Leliaert and Lopez-Bautista, 2015). The C. okamurae cpDNA lacked a large IR, and a similar situation has been found in seven other completely sequenced cpDNAs in Bryopsidales, namely those of C. cliftonii, C. racemosa, Tydemania expeditionis, B. plumosa, B. hypnoides, Codium decorticatum and Derbesia sp. (Lü et al., 2011; Leliaert and Lopez-Bautista, 2015; Marcelino et al., 2016). The lack of a quadripartite architecture in Bryopsidales was implied earlier, based on Southern hybridization analysis of restriction fragments in Codium fragile and C. sertularoides (Manhart et al., 1989; Lehman and Manhart, 1997). However, the cpDNAs of P. akinetum and O. viridis, belonging to Ulvales, Ulvophyceae, both possess the IR structure (Pombert et al., 2005; Pombert, 2006). These findings suggested great plasticity of the cpDNA in the Ulvophyceae and indicated that the IR was lost multiple times independently in Bryopsidales.
3.2. Gene content
The cpDNA of C. okamurae contained 106 unique genes, including 76 PCGs, 27 tRNA genes and three rRNA genes (Tables 1 and 2, Fig. 1). It had 32 freestanding ORFs (> 300 bp), which was more than in the cpDNAs of most other green algae and fewer only than in C. cliftonii. The gene categories and numbers of the chloroplast genome of C. okamurae are shown in Table 2. The total length of these PCGs was 29,276 bp, representing 19.74% of the entire cpDNA genome. Table 3 shows a comparison of the gene repertoires of C. okamurae and 11 published core chlorophytan cpDNAs. A large proportion of genes (60 genes) were shared among species of Ulvophyceae, while they had a smaller gene repertoire.
Pseudogenes are not common in cpDNAs but have been reported. Zuccarello et al. (2009) reported that the cpDNA of C. filiformis contains a pseudogene, ycf62. The tilS pseudogene is found in the C. racemosa, Bryopsis and Tydemania plastomes (Leliaert and Lopez-Bautista, 2015; Lopez-Bautista and Lam, 2016). Our data revealed that three hypothetical protein genes, ycf1, ycf12 and ycf20, existed in the cpDNA of C. okamurae. It would be useful to assess the evolution of these genes within the Bryopsidales and examine the related orders to determine when the loss of function occurred. Five genes, minD, ycf5, ycf9, ycf10 and I-CvuI, were absent from the C. okamurae cpDNA and from the other seven Bryopsidales green algae (C. racemosa, C. cliftonii, Tydemania expeditionis, B. plumosa, B. hypnoides, Derbesia sp. and Codium decorticatum), but they are present in Chlorella vulgaris, Trebouxiophyceae (Wakasugi et al., 1997) (Table 3). The C. okamurae cpDNA shares a nearly identical gene repertoire with the two other Caulerpaceae, C. racemosa and C. cliftonii (Table 3).
3.3. tRNAs and rRNAs
The lengths of the 27 tRNAs varied from 71 bp (tRNA-Gly) to 88 bp (tRNA-Met), and all of these tRNAs could be folded into the typical cloverleaf secondary structure. Together, the 27 tRNAs were 2038 bp in length, with an AT content of 49.17% (Table 4). There were three tRNA-Met (CAT) genes in the cpDNA of C. okamurae. Three rRNAs, 5S rRNA, 16S rRNA and 23S rRNA, were identified in the cpDNA of C. okamurae, which were shared by the two other Caulerpaceae, C. cliftonii and C. racemosa (Table 1). The three rRNAs were located on the light strand of the C. okamurae cpDNA, and they had slightly longer sequences compared with the other Caulerpaceae, C. cliftonii and C. racemosa (Table 4). There are five rRNAs in the B. hypnoides and Chlamydomonas reinhardtii chloroplast genomes: rrn23, rrn16, rrn7, rrn5 and rrn3 (Lü et al., 2011; Maul et al., 2002). The 5S rRNA gene was 121 bp in length with an AT content of 60.33%, the 16S rRNA gene was 1499 bp in length with an AT content of 52.10%, and the 23S rRNA gene was 2637 bp in length with an AT content of 58.29% (Table 4).
3.4. The nucleotide content, skew characteristics and codon usage
Table 4 shows the skew characteristics of the C. okamurae cpDNA in comparison with the cpDNAs of other green algae. The overall base composition of the C. okamurae cpDNA was 32.60% A, 32.86% T, 16.71% G and 17.83% C. The overall AT content of the C. okamurae cpDNA was 65.46%, which was similar to that of two other green algae (C. racemosa and B. hypnoides), with an AT-skew of −0.0041. The AT- and GC-skews of the C. okamurae cpDNA were similar to most of the strand biases of Ulvophyceae and Chlorophyceae cpDNAs (positive AT-skew and negative GC-skew) (Table 4) (Lopez-Bautista and Lam, 2016; Leliaert and Lopez-Bautista, 2015; Lü et al., 2011; Melton et al., 2015). The PCGs of the C. okamurae cpDNA encoded a total of 29,276 amino acids. Table 5 presents the codon usage and relative synonymous codon usage (RSCU) of the 76 chloroplast PCGs in C. okamurae. The RSCU is the ratio of the observed frequency of a codon to the frequency expected if all synonymous codons for that amino acid were used equally; a value of 1 indicates no bias, and a value below 1 indicates a codon used less frequently than expected (Sharp and Li, 1987). All 64 codons were used, and there was no strong bias toward codons ending in A or U in four-codon families. The most frequently detected codon was AAA (6.47%), followed by TTA (5.70%), TTT (5.56%) and ATT (5.31%) (Table 5). Moreover, the most frequently detected amino acid was Ile, followed by Gln, Phe and Lys. Except for Met and Trp, which have only one codon each, the amino acids exhibited codon usage bias; the highest RSCU was that of TTA (3.271), followed by GTT (2.329), GCT (2.253), TCT (2.212) and TAA (1.896) (Table 5).
3.5. Introns in cpDNA of C. okamurae
A total of eight introns were present in eight genes (psbA, rpoBb, ftsH, psbD, atpF, cysA, 23S-rRNA and 16S-rRNA) of the C. okamurae cpDNA, including two group I introns, one group II intron and three introns of unknown type. Table 6 lists the intron insertion sites and types in the protein genes of the C. okamurae cpDNA. The six introns detected in protein genes varied from 138 bp to 880 bp in length, and the average length was only 532 bp. The number
Table 4 The skew characteristics of chloroplast genomes and comparison with other green algae cpDNAs.

Entire genome and protein-coding genes (PCGs)

| Species | Genome length (bp) | Genome AT (%) | Genome GC-skew | Genome AT-skew | PCG length (aa) | PCG AT (%) (all) | PCG AT (%) (3rd) | PCG AT-skew | PCG GC-skew |
|---|---|---|---|---|---|---|---|---|---|
| Caulerpa okamurae | 148,274 | 65.46 | −0.0326 | −0.0041 | 29,276 | 67.6 | 68.65 | 0.0097 | −0.0228 |
| Caulerpa racemosa | 176,522 | 66.36 | 0.0185 | 0.0048 | 27,581 | 67.16 | 74.2 | −0.0006 | 0.0238 |
| Caulerpa cliftonii | 131,135 | 62.37 | 0.0044 | −0.0101 | 28,763 | 65.1 | 64.33 | −0.016 | 0.0084 |
| Tydemania expeditionis | 105,200 | 67.17 | 0.0088 | −0.0052 | 23,930 | 67.54 | 67.43 | −0.0098 | 0.0044 |
| Bryopsis plumosa | 106,859 | 69.24 | 0.0033 | −0.0373 | 24,240 | 68.18 | 69.16 | −0.0416 | 0.0122 |
| Bryopsis hypnoides | 153,429 | 66.88 | 0.0108 | −0.0302 | 16,693 | 66.6 | 68.24 | −0.062 | 0.0164 |
| Derbesia sp. | 115,765 | 70.35 | −0.0057 | 0.0069 | 26,186 | 69.73 | 70.39 | 0.0078 | −0.0146 |
| Codium decorticatum | 91,509 | 70.94 | 0.0286 | −0.0211 | 19,122 | 70.56 | 74.42 | −0.048 | 0.0301 |
| Oltmannsiellopsis viridis | 151,933 | 59.53 | 0.0044 | 0.0035 | 25,596 | 62.36 | 64.42 | −0.0011 | 0.0133 |
| Ulva fasciata | 96,005 | 75.14 | 0.0053 | 0.0079 | 23,151 | 74.66 | 77.21 | −0.0026 | −0.0131 |

rrnL and rrnS

| Species | rrnL length (bp) | rrnL AT (%) | rrnL AT-skew | rrnL GC-skew | rrnS length (bp) | rrnS AT (%) | rrnS GC-skew | rrnS AT-skew |
|---|---|---|---|---|---|---|---|---|
| Caulerpa okamurae | 2637 | 58.29 | −0.1451 | −0.14 | 1499 | 52.10 | −0.1421 | −0.0832 |
| Caulerpa racemosa | 2941 | 58.14 | 0.138 | 0.1552 | 1519 | 52.21 | 0.1433 | 0.087 |
| Caulerpa cliftonii | 2923 | 58.06 | 0.1161 | 0.1436 | 1531 | 52.38 | −0.1413 | −0.1072 |
| Tydemania expeditionis | 2883 | 57.27 | 0.1266 | 0.1542 | 1461 | 52.77 | 0.142 | 0.1051 |
| Bryopsis plumosa | 2869 | 58.42 | −0.1539 | −0.1685 | 1487 | 52.79 | 0.1766 | 0.1032 |
| Bryopsis hypnoides | 2502 | 56.63 | −0.139 | −0.1613 | 1478 | 51.89 | 0.1814 | 0.1004 |
| Derbesia sp. | 2857 | 59.68 | 0.1402 | 0.1719 | 1490 | 52.75 | 0.1705 | 0.1043 |
| Codium decorticatum | 2873 | 60.49 | 0.1761 | 0.1683 | 1555 | 55.95 | 0.1825 | 0.0667 |
| Oltmannsiellopsis viridis | 2907 | 50.53 | −0.115 | −0.1794 | 1519 | 49.64 | −0.1582 | −0.0981 |
| Ulva fasciata | 2849 | 55.07 | 0.1523 | 0.1797 | 1476 | 54.34 | 0.1543 | 0.1297 |

5S rRNA and tRNAs

| Species | 5S rRNA length (bp) | 5S rRNA AT (%) | 5S rRNA AT-skew | 5S rRNA GC-skew | tRNAs length (bp) | tRNAs AT (%) | tRNAs AT-skew | tRNAs GC-skew |
|---|---|---|---|---|---|---|---|---|
| Caulerpa okamurae | 121 | 60.33 | 0.0959 | −0.1667 | 2038 | 49.17 | 0.008 | −0.0251 |
| Caulerpa racemosa | 122 | 59.02 | −0.1111 | 0.16 | 2039 | 49.19 | −0.0269 | 0.0598 |
| Caulerpa cliftonii | 119 | 61.34 | −0.0137 | 0.1304 | 2152 | 50.37 | −0.0627 | 0.0955 |
| Tydemania expeditionis | 118 | 65.25 | 0.0649 | 0.2195 | 2107 | 51.16 | 0.0482 | −0.0573 |
| Bryopsis plumosa | 122 | 63.11 | −0.013 | −0.0222 | 2039 | 50.71 | 0.0426 | −0.0687 |
| Bryopsis hypnoides | 123 | 84.55 | 0.1923 | 0.0526 | 2789 | 45.11 | 0.0223 | −0.0529 |
| Derbesia sp. | 122 | 68.85 | 0.0238 | 0.0526 | 1990 | 51.96 | 0.0019 | 0.0084 |
| Codium decorticatum | 123 | 77.24 | 0.1158 | 0 | 1890 | 55.19 | 0.0105 | −0.0153 |
| Oltmannsiellopsis viridis | 122 | 46.72 | 0.0175 | −0.1385 | 2110 | 48.25 | 0.0432 | −0.0311 |
| Ulva fasciata | – | – | – | – | 2034 | 50.15 | −0.0059 | −0.002 |

Note: See Table 1 for accession numbers.
Table 5 Codon usage in the 76 protein-coding genes of C. okamurae.
Amino acid  Codon  Number  Frequency (%)  RSCU
Ala         GCT    824     2.804438091    2.25290499
Ala         GCA    400     1.361377714    1.093643199
Ala         GCC    149     0.507113199    0.407382092
Ala         GCG    90      0.306309986    0.24606972
Arg         AGA    416     1.415832823    1.741800419
Arg         CGT    367     1.249064053    1.536636427
Arg         CGA    364     1.23885372     1.524075366
Arg         AGG    108     0.367571983    0.452198186
Arg         CGG    100     0.340344429    0.418702024
Arg         CGC    78      0.265468654    0.326587579
Asn         AAT    1305    4.441494793    1.688227684
Asn         AAC    241     0.820230073    0.311772316
Asp         GAT    962     3.274113403    1.693661972
Asp         GAC    174     0.592199306    0.306338028
Cys         TGT    337     1.146960724    1.715012723
Cys         TGC    56      0.19059288     0.284987277
Gln         CAA    1258    4.281532911    1.758211041
Gln         CAG    173     0.588795861    0.241788959
Glu         GAA    1336    4.547001566    1.72275951
Glu         GAG    215     0.731740521    0.27724049
Gly         GGT    788     2.681914097    1.632314863
Gly         GGA    721     2.45388333     1.49352667
Gly         GGG    267     0.908719624    0.553081305
Gly         GGC    155     0.527533864    0.321077162
His         CAT    440     1.497515486    1.632653061
His         CAC    99      0.336940984    0.367346939
Ile         ATT    1563    5.319583418    1.932014833
Ile         ATA    663     2.256483561    0.819530284
Ile         ATC    201     0.684092301    0.248454883
Leu         TTA    1676    5.704172623    3.271307742
Leu         CTT    628     2.137363011    1.225764476
Leu         TTG    310     1.055067729    0.605074821
Lys         AAA    1901    6.469947587    1.739249771
Lys         AAG    285     0.969981621    0.260750229
Met         ATG    530     1.803825471    1
Phe         TTT    1635    5.564631407    1.70846395
Phe         TTC    279     0.949560956    0.29153605
Pro         CCT    649     2.208835341    2.028125
Pro         CCA    433     1.473691376    1.353125
Pro         CCC    142     0.483289089    0.44375
Pro         CCG    56      0.19059288     0.175
Ser         TCT    755     2.569600436    2.211914063
Ser         TCA    464     1.579198149    1.359375
Ser         AGT    439     1.494112041    1.286132813
Ser         TCC    190     0.646654414    0.556640625
Ser         TCG    112     0.38118576     0.328125
Ser         AGC    88      0.299503097    0.2578125
Stp         TAA    67      0.228030767    1.896226415
Stp         TGA    23      0.078279219    0.650943396
Stp         TAG    16      0.054455109    0.452830189
Thr         ACT    623     2.12034579     1.771144279
Thr         ACA    583     1.984208019    1.65742715
Thr         ACC    138     0.469675311    0.392324094
Thr         ACG    63      0.21441699     0.179104478
Trp         TGG    434     1.47709482     1
Tyr         TAT    976     3.321761623    1.752244165
Tyr         TAC    138     0.469675311    0.247755835
Val         GTT    838     2.852086311    2.329395413
Val         GTA    330     1.123136614    0.917303683
Val         GTG    148     0.503709754    0.411396803
Val         GTC    123     0.418623647    0.3419041
Leu         CTA    241     0.820230073    0.470396877
Leu         CTC    132     0.449254646    0.257644763
Leu         CTG    87      0.296099653    0.169811321
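The RSCU and frequency columns of Table 5 can be reproduced from the codon counts. The following Python sketch assumes the standard definition of relative synonymous codon usage (the observed count divided by the mean count of the synonymous codons for that amino acid). The function names are illustrative, and the total of 29,382 codons is obtained by summing the Number column of the table.

```python
def rscu(family: dict[str, int]) -> dict[str, float]:
    """RSCU for one synonymous codon family: observed count / mean count."""
    total = sum(family.values())
    if total == 0:
        return {codon: 0.0 for codon in family}
    mean = total / len(family)
    return {codon: count / mean for codon, count in family.items()}


def frequency_percent(count: int, total_codons: int) -> float:
    """Share of one codon among all counted codons, in percent."""
    return 100.0 * count / total_codons


# Alanine counts taken from Table 5.
ala = {"GCT": 824, "GCA": 400, "GCC": 149, "GCG": 90}
ala_rscu = rscu(ala)

# 29,382 is the sum of the Number column in Table 5.
gct_freq = frequency_percent(824, 29382)
```

With these inputs, the GCT values agree with the tabulated RSCU (2.25290499) and frequency (2.804438091). The RSCU values within a codon family sum to the number of codons in that family, so a value above 1 marks a codon used more often than expected under equal usage.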
and type of introns in the cpDNA of C. okamurae were much lower than those in other green algae. A total of 27 introns have previously been annotated in the cpDNA of P. akinetum (Pombert et al., 2005), 14 introns in B. plumosa, 11 introns in B. hypnoides and T. expeditionis, 10 introns in C. cliftonii, and eight introns in Derbesia sp. Ulva fasciata and O. viridis each contain five introns, and ten introns have been reported in C. racemosa and Codium decorticatum (Pombert, 2006). Hanyuda et al. (2000) reported that the presence of introns in C. brachypus is not consistent with that in three other species of Caulerpa (C. okamurae, C. lentillifera and C. racemosa). Table 7 compares the intron types and their distribution in the cpDNAs of C. okamurae and other green algae. The intron in psbA of the C. okamurae cpDNA was also found in P. akinetum, G. planctonica and T. expeditionis, but there were obvious differences in intron type, size and insertion site in psbA between C. okamurae and C. cliftonii (Table 7). The intron in atpF was also found in Ostreobium sp., C. cliftonii, T. expeditionis, B. plumosa and G. planctonica; these introns were clearly comparable in size (between 374 and 875 bp) and type (group II). The intron in psbD was also found in P. akinetum, with a homologous insertion site, size and type in the two green algae, C. okamurae and P. akinetum (Pombert et al., 2005). The intron in rpoB of the C. okamurae cpDNA was also found in Ostreobium sp., Derbesia sp. and C. cliftonii (Marcelino et al., 2016; Pombert, 2006) (Table 7).

Table 6 Comparison of intron types and distribution in protein-coding genes of C. okamurae.
Gene   Exon I (bp)  Intron I (bp)  Exon II (bp)
psbA   417          880            648
rpoBb  765          495            1149
ftsH   3285         543            987
psbD   744          564            321
atpF   78           575            429
cysA   306          138            393

3.6. Phylogenetic analysis
In the present study, we analyzed the amino acid sequences derived from the PCGs of the cpDNAs of C. okamurae and 28 other core Chlorophyta algae with ML and ML-distance methods, and Porphyridium purpureum was used as the outgroup. Phylogenetic analysis based on a concatenated alignment of chloroplast protein sequences suggested that C. okamurae was most closely related to C. racemosa, C. cliftonii and T. expeditionis (Fig. 2). In general, the phylogenomic analysis presented here is largely consistent with previously published phylogenies of Bryopsidales based on a 50-gene dataset and on morphological and anatomical observations (Lopez-Bautista and Lam, 2016; Leliaert and Lopez-Bautista, 2015; Hillis-Colinvaux, 1984). The unstable phylogenetic position of Bryopsidales was also apparent when comparing published chloroplast multigene phylogenies, which reveal different relationships depending on gene and taxon sampling (Lü et al., 2011; Zuccarello et al., 2009; Fučíková et al., 2014; Zheng et al., 2018). Phylogenomic analyses of B. plumosa, T. expeditionis and B. hypnoides were largely inconclusive with respect to the monophyly of Ulvophyceae; these taxa were more closely related to Chlorella (Trebouxiophyceae) than to Ulva sp., O. viridis and P. akinetum (Leliaert and Lopez-Bautista, 2015; Lü et al., 2011). Phylogenetic analysis of 23 cp genes indicated that C. filiformis was more closely related to Chlorella (Trebouxiophyceae) than to the other two ulvophycean taxa in the phylogeny (Zuccarello et al., 2009).
4. Conclusions
The complete cpDNA of C. okamurae was circular and 148,274 bp in length, with a coding DNA density of 71.7%, falling within the range of other
Table 7 Comparison of intron types and distribution in the cpDNAs of C. okamurae and other green algae (insertion site/size/type).
Species  Gene  Start to end  Size  Type  All/GI/GII/unknown
Caulerpa okamurae
psbA
17,653–18,533
881
GI
6/2/1/3
petB
14,208–15,476
1269
GI
29/11/15/3
rpoBb ftsH psbD atpF cysA ccsA atpF rpoC1 rpoB rpl23 rpl5 rbcL
25,571–26,066 67,984–68,527 91,435–91,999 133,385–133,960 140,845–140,983 2956–3390 16,624–17,028 27,114–27,540 31,170–31,581 52,932–53,326 56,431–56,848 52,632–54,314
496 544 565 576 139 435 5 85 427 412 395 418 1683
unknow unknow GI GII unknow GII unknow GII GII unknow unknow GI
Gloeotilopsis planctonica
clpP psbB
21,721–23,158 37,769–38,802 38,913–40,200 40,296–41,212 41,538–42,604 45,723–48,746 57,285–58,505 62,245–63,288 63,379–65,131 65,305–66,424 66,589–68,946 77,799–80,311
1438 1034 1288 917 1067 3024 1221 1044 1753 1120 2358 2513
GII GI GI GII GI GII GI GI GI unknow GI GII
psbA
68,488–69,556 69,648–70,768 70,881–72,333 72,357–73,439 73,556–74,772 74,795–75,773 75,795–76,868 76,943–77,276 77,391–78,436 78,574–78,888
1069 1121 1453 1083 1217 979 1074 334 1046 315
GI GI GI GI GI GI GI GI GI GI
psaM ycf12 psbF psaA
90,819–91,937 92,231–93,152 93,703–94,590 94,755–95,662 95,836–96,796 96,887–97,938 111,964–112,930 113,463–114,745 115,075–116,305 141,504–142,320 152,428–152,948 171,392–171,736 171,760–173,397 179,981–181,253 181,785–183,273 184,029–185,397 194,461–195,277 17,038–19,529 30,296–31,072 31,078–31,879 31,887–32,936 33,601–34,343 34,815–34,964 54,109–54,479 59,949–60,809 61,096–62,084 83,602–86,829 91,327–91,700 28,753–30,687 39,344–41,736 47,887–50,030 60,251–62,667 66,334–66,653 69,709–70,033 73,560–73,952 77,351–77,682
1119 922 888 908 961 1052 967 1283 1231 817 521 345 1638 1273 1489 1369 817 2492 777 802 1050 743 150 371 861 989 3228 374 1935 2393 2144 2417 320 325 393 332
GI GI GI GI GI GI GI GI GI GI GI GI GI GI GI GI GI GII GI GI GI GI GI unknow GI GI GII GII GII GII unknow GII unknow GII unknow unknow
atpF atpH atpI psaB atpA
117,808–11,870 118,963–120,626 122,797–124,715 134,039–135,294 135,917–137,383 164,765–165,781 168,301–170,781 171,275–173,976 191,018–193,484 198,416–200,865 201,129–202,594 204,149–205,005 205,241–207,894 210,464–211,818 216,840–218,716 2476–3204 3468–3707 4762–5243 16,740–17,129 19,067–19,915 21,213–21,564 28,358–30,602 34,155–34,226 52,525–52,730 56,036–56,421 58,568–59,671 60,703–61,130 91,130–92,167 95,224–95,597 24,371–25,098 69,487–69,956 72,574–73,587 73,595–74,728 75,225–76,164 76,299–77,439 77,672–77,709 77,811–77,858 88,308–88,957 118,056–118,105
933 1664 1919 1256 1467 1017 2481 2702 2467 2450 1466 857 2654 1355 1877 729 240 482 390 849 352 2245 72 206 386 1104 428 1038 374 25,098 69,956 73,587 74,728 76,164 77,439 77,709 77,858 88,957 118,105
GII GII GII GI unknow GII GII GII GII GII GI GII GII GII unknow GII GI GII GII GII GII GII unknow GI unknow GI GII GI GII 728 470 1014 1134 940 1141 38 48 650 50
Ostreobium sp.
Pseudendoclonium akinetum
psbD psbC
psbB rrl psaA atpA psaB Tydemania expeditionis
rrl psbC rrl
trnL(uaa) ccsA rrs
Derbesia sp.
psbA atpF rps19 psbC psaA atpB psbT ccsA rpoB rpoC1
6/0/3/3
psbH psbA psbC
28/28/0/0
petA
rpl23 atpB psaC atpA
Bryopsis plumosa
11/7/3/1 Caulerpa cliftonii
atpF rpl5 rps19 rpl23 psaA tilS trnL(uaa) ccsA psbB psbT rrl ycf3 rpl16 ccsA rnl
rrn5 atpF rpoB
14/4/8/2
10/0/3/7
Note: See Table 1 for accession numbers.
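Table 7 gives each intron as a start-to-end coordinate pair together with its size. For the C. okamurae entries the size equals end − start + 1, which corresponds to inclusive, 1-based coordinates. A minimal Python sketch of that conversion is given below; the function name is illustrative, and the parsing of thousands separators and en dashes is an assumption based on how the table is printed.

```python
def intron_size(span: str) -> int:
    """Size in bp of an interval written like '17,653–18,533' (inclusive)."""
    # Drop thousands separators and accept an en dash or a hyphen.
    cleaned = span.replace(",", "").replace("–", "-")
    start_text, end_text = cleaned.split("-")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError(f"end precedes start in {span!r}")
    return end - start + 1


# Coordinates of the C. okamurae psbA and rpoBb introns from Table 7.
psba_size = intron_size("17,653–18,533")
rpob_size = intron_size("25,571–26,066")
```

These two calls return 881 and 496, matching the Size column for those entries.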
ulvophycean and green algal cpDNAs. The cpDNA of C. okamurae consisted of 106 functional genes, including 76 PCGs, three rRNA genes and 27 tRNA genes, together with six introns. Similar to the other eight completely sequenced cpDNAs in Bryopsidales (C. racemosa, C. cliftonii, T. expeditionis, B. hypnoides, B. plumosa, Derbesia sp., Codium decorticatum and C. lentillifera), the cpDNA of C. okamurae lacked a large inverted repeat (IR), suggesting that this ancestral character was lost on separate occasions within Bryopsidales. Although C. okamurae is a marine green macroalga, its cpDNA conformation might be similar to that of higher plants, and the phylogenetic analysis indicated that its cpDNA most closely resembled those of C. racemosa, C. cliftonii and T. expeditionis. Collectively, our data provide useful information for studies of the evolutionary characteristics, phylogenetic relationships and species identification of C. okamurae.
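The coding density quoted in the Conclusions is the share of the genome covered by annotated coding features. The exact feature set and software used by the authors are not stated here, so the following Python sketch is only one plausible way to compute such a figure: it merges overlapping intervals (assumed to be 1-based and inclusive) so that no base is counted twice. The function name and the toy coordinates are hypothetical.

```python
def coding_density(features: list[tuple[int, int]], genome_length: int) -> float:
    """Percentage of the genome covered by the given inclusive intervals."""
    covered = 0
    current_start = current_end = None
    for start, end in sorted(features):
        if current_end is None or start > current_end:
            # Close the previous merged block before opening a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping feature: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return 100.0 * covered / genome_length


# Toy annotation (hypothetical coordinates on a 1000 bp genome).
density = coding_density([(1, 100), (50, 150), (301, 400)], 1000)
```

In the toy example the first two features overlap and merge into one 150 bp block, so 250 of 1000 bases are covered.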
Fig. 2. Phylogenetic trees of C. okamurae based on the deduced amino acid sequences from 30 algal cpDNAs. The GTR + G + I nucleotide model was selected using jModelTest (v2.1.7), and the LG + I + G amino acid model was selected using ProtTest (v3.4). The maximum likelihood analyses were run in RAxML with the bootstrap method, using 1000 parsimony trees. The genes used in the concatenated alignment were atpA, atpB, atpE, atpI, petG, psaA, psaB, psbB, psbC, psbD, psbE, psbI, psbK, rpl14, rpl16, rpl20, rpl2, rpl5, rps12, rps14, rps8 and tufA.
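The tree in Fig. 2 is built from a concatenated alignment of the listed genes. As an illustration of what concatenation involves, the following Python sketch joins per-gene alignments into one sequence per taxon. It is not the authors' pipeline: the function name, the toy sequences and the choice to fill a missing gene with gap characters are all assumptions made for the example.

```python
def concatenate(alignments: dict[str, dict[str, str]]) -> dict[str, str]:
    """Join per-gene alignments (gene -> taxon -> aligned sequence)."""
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    parts: dict[str, list[str]] = {taxon: [] for taxon in taxa}
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            # Assumption: a taxon lacking this gene is padded with gaps.
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy alignments (hypothetical sequences, two genes, two taxa).
toy = {
    "atpA": {"taxon1": "MK-L", "taxon2": "MKAL"},
    "tufA": {"taxon1": "GE"},
}
supermatrix = concatenate(toy)
```

Every taxon ends up with a sequence of the same total length, which is what a maximum likelihood program expects from a concatenated alignment.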
Data availability statement
Some or all data, models, or code generated or used during the study are available from the corresponding author by request (Fengrong Zheng, Hongzhan Liu).

Declaration of Competing Interest
The authors declare no conflict of interest.

Acknowledgments
This study was supported by the Key Research and Development Program of Shandong Province (No. 2019GHY112003), the National Key R&D Program of China (No. 2017YFC1404504) and Conservation and Restoration of Bio-resources and Habitats in Typical Coastal Zones in Shandong Province, CF-MEEC/TR/2018-08.

Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.margen.2020.100752.

References
de Cambiaire, J.C., Otis, C., Turmel, M., Lemieux, C., 2007. The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae. BMC Genomics 8, 213–225.
Abdullah, H.S., Matthias, B., Peter, F.S., Kifah, T., 2014. GC skew and mitochondrial origins of replication. Mitochondrion 17, 56–66.
Boudreau, E., Otis, C., Turmel, M., 1994. Conserved gene clusters in the highly rearranged chloroplast genomes of Chlamydomonas moewusii and Chlamydomonas reinhardtii. Plant Mol. Biol. 24, 585–602.
Cheng, J., Zeng, X., Ren, G., Liu, Z., 2013. CGAP: a new comprehensive platform for the comparative analysis of chloroplast genomes. BMC Bioinformatics 14, 95.
Civáň, P., Foster, P.G., Embley, T.M., Séneca, A., Cox, C.J., 2014. Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants. Genome Biol. Evol. (4), 897–911.
Cocquyt, E., Verbruggen, H., Leliaert, F., DeClerck, O., 2010. Evolution and cytological diversification of the green seaweeds (Ulvophyceae). Mol. Biol. Evol. 27, 2052–2061.
Da, C.R.R., Chaves, H.V., do Val, D.R., de Freitas, A.R., Lemos, J.C., Rodrigues, J.A., Pereira, K.M., de Araujo, I.W., Bezerra, M.M., Benevides, N.M., 2014. A lectin from the green seaweed Caulerpa cupressoides reduces mechanical hyper-nociception and inflammation in the rat temporomandibular joint during zymosan-induced arthritis. Int. Immunopharmacol. 2014, 34–43.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32 (5), 1792–1797.
Eun, K.H., Chan, S.P., Jung, W.H., Won, J.S., Chang, G.C., Chul, H.S., 2003. Growth and maturation of a green alga, Caulerpa okamurae Weber van Bosse. Algae 18 (3), 217–223.
Fučíková, K., Leliaert, F., Cooper, E.D., Skaloud, P., D'hondt, S., De Clerck, O., et al., 2014. New phylogenetic hypotheses for the core Chlorophyta based on chloroplast sequence data. Front. Ecol. Evol. 2, 63.
Fučíková, K., Lewis, P.O., Lewis, L.A., 2016. Chloroplast phylogenomic data from the green algal order Sphaeropleales (Chlorophyceae, Chlorophyta) reveal complex patterns of sequence evolution. Mol. Phylogenet. Evol. 98, 176–183.
Hanyuda, T., Ara, I.S., Ueda, K., 2000. Variability in the rbcL introns of caulerpalean algae (Chlorophyta, Ulvophyceae). J. Plant Res. 113, 403–413.
Hillis-Colinvaux, L., 1984. Systematics of the Siphonales. In: Irvine, D.E.G., John, D.M. (Eds.), Systematics of the Green Algae. Academic Press Inc., London, pp. 271–296.
Ian, R.P., 2011. A taxonomic revision of the marine green algal genera Caulerpa and Caulerpella (Chlorophyta, Caulerpaceae) in northern (tropical and subtropical) Australia. Aust. Syst. Bot. 24, 137–213.
de Koning, A.P., Keeling, P.J., 2006. The complete plastid genome sequence of the parasitic green alga Helicosporidium sp. is highly reduced and structured. BMC Biol. 4, 12.
Kudaka, J., Itokazu, K., Taira, K., et al., 2008. Investigation and culture of microbial contaminants of Caulerpa lentillifera (sea grape). Shokuhin Eiseigaku Zasshi 49 (1), 11–15.
Lagesen, K., Hallin, P., Rodland, E.A., Staerfeldt, H.H., Rognes, T., Ussery, D.W., 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 35, 3100–3108.
Lang, B.F., Nedelcu, A.M., 2012. Plastid genomes of algae. In: Bock, R., Knoop, V. (Eds.), Advances in Photosynthesis and Respiration Including Bioenergy and Related Processes: Genomics of Chloroplasts and Mitochondria, pp. 59–87.
Lee, J., James, R.M., 2003. Three ORF-containing group I introns in chloroplast SSU of Caulerpa sertularioides (Ulvophyceae) and their evolutionary implications. Algae 18 (3), 183–190.
Lehman, R.L., Manhart, J.R., 1997. A preliminary comparison of restriction fragment patterns in the genus Caulerpa (Chlorophyta) and the unique structure of the chloroplast genome of Caulerpa sertularioides. J. Phycol. 33 (6), 1055–1062.
Leliaert, F., Lopez-Bautista, J.M., 2015. The chloroplast genomes of Bryopsis plumosa and Tydemania expeditiones (Bryopsidales, Chlorophyta): compact genomes and genes of bacterial origin. BMC Genomics 16 (1), 204.
Leliaert, F., Smith, D.R., Moreau, H., Herron, M.D., Verbruggen, H., Delwiche, C.F., 2012. Phylogeny and molecular evolution of the green algae. Crit. Rev. Plant Sci. 31, 1–46.
Leliaert, F., Tronholm, A., Lemieux, C., Turmel, M., DePriest, M.S., Bhattacharya, D., et al., 2016. Chloroplast phylogenomic analyses reveal the deepest-branching lineage of the Chlorophyta, Palmophyllophyceae class. nov. Sci. Rep. 6, 25367.
Lemieux, C., Otis, C., Turmel, M., 2007. A clade uniting the green algae Mesostigma viride and Chlorokybus atmophyticus represents the deepest branch of the Streptophyta in chloroplast genome-based phylogenies. BMC Biol. 5, 2.
Li, H., Durbin, R., 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760.
Lohse, M., Drechsel, O., Kahlau, S., Bock, R., 2013. OrganellarGenomeDRAW - a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 41, 1–7.
Lopez-Bautista, J., Lam, D., 2016. Complete chloroplast genome for Caulerpa racemosa (Bryopsidales, Chlorophyta) and comparative analyses of siphonous green seaweed plastomes. Cymbella 2 (2), 23–32.
Lowe, T.M., Chan, P.P., 2016. tRNAscan-SE on-line: search and contextual analysis of transfer RNA genes. Nucleic Acids Res. 44, W54–W57.
Lü, F., Xu, W., Tian, C., Wang, G., Niu, J., Pan, G., et al., 2011. The Bryopsis hypnoides plastid genome: multimeric forms and complete nucleotide sequence. PLoS One 6, e14663.
Lu, J.M., Zhang, N., Du, X.Y., Wen, J., Li, D.Z., 2015. Chloroplast phylogenomics resolves key relationships in ferns. J. Syst. Evol. 53, 448–457.
Manhart, J.R., Kelly, K., Dudock, B.S., Palmer, J.D., 1989. Unusual characteristics of Codium fragile chloroplast DNA revealed by physical and gene mapping. Mol. Gen. Genet. 216 (2–3), 417–421.
Marcelino, V.R., Ma, C.M.C., Jackson, C.J., Larkum, A.A.W., Verbruggen, H., 2016. Evolutionary dynamics of chloroplast genomes in low light: a case study of the endolithic green alga Ostreobium quekettii. Genome Biol. Evol. 8 (9), 2939–2951.
Martin, M., 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. Embnet J. 17, 10–12.
Maul, J.E., Lilly, J.W., Cui, L., dePamphilis, C.W., Miller, W., et al., 2002. The Chlamydomonas reinhardtii plastid chromosome: islands of genes in a sea of repeats. Plant Cell 14 (11), 2659–2679.
Melton, J.T., Leliaert, F., Tronholm, A., Lopez-Bautista, J.M., 2015. The complete chloroplast and mitochondrial genomes of the green macroalga Ulva sp. UNA00071828 (Ulvophyceae, Chlorophyta). PLoS One 10, 1–21.
Monique, T., Christian, O., Claude, L., 1999. The complete chloroplast DNA sequence of the green alga Nephroselmis olivacea: insights into the architecture of ancestral chloroplast genomes. Proc. Natl. Acad. Sci. 96, 10248–10253.
Padmanabhan, U., Green, B.R., 1978. The kinetic complexity of Acetabularia chloroplast DNA. BBA - Nucleic Acids and Protein Synthesis 521, 67–73.
Page, R.D., 1996. TREEVIEW: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12, 357–358.
Patama, R., Anong, C., 2006. Nutritional evaluation of tropical green seaweeds Caulerpa lentillifera and Ulva reticulata. Kasetsart J. Nat. Sci. 40, 75–83.
Patricia, M., Suhaila, M., Noordin, M.M., et al., 2008. Antioxidant activities and phenolics content of eight species of seaweeds from North Borneo. J. Appl. Phycol. 20, 367–373.
Pombert, J.F., Lemieux, C., Turmel, M., 2006. The complete chloroplast DNA sequence of the green alga Oltmannsiellopsis viridis reveals a distinctive quadripartite architecture in the chloroplast genome of early diverging ulvophytes. BMC Biol. 4, 3.
Pombert, J.F., Otis, C., Lemieux, C., Turmel, M., 2005. The chloroplast genome sequence of the green alga Pseudendoclonium akinetum (Ulvophyceae) reveals unusual structural features and new insights into the branching order of chlorophyte lineages. Mol. Biol. Evol. 22 (9), 1903–1918.
Reiko, M., Tomoaki, I., Hideshi, I., Tatsuji Sakamoto, T., 2012. Immunostimulatory activity of polysaccharides isolated from Caulerpa lentillifera on macrophage cells. Biosci. Biotechnol. Biochem. 76 (3), 501–505.
Rozas, J., 2009. DNA sequence polymorphism analysis using DnaSP. In: Posada, D. (Ed.), Bioinformatics for DNA Sequence Analysis. Methods in Molecular Biology Series 537. Humana Press, NJ, USA, pp. 337–350.
Schattner, P., Brooks, A.N., Lowe, T.M., 2005. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 33, W686–W689.
Sharp, P.M., Li, W.H., 1987. The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15 (3), 1281–1295.
Simpson, C.L., Stern, D.B., 2002. The treasure trove of algal chloroplast genomes. Surprises in architecture and gene content, and their functional implications. Plant Physiol. 129, 957–966.
Stamatakis, A., 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30 (9), 1312–1313.
Sun, L., Fang, L., Zhang, Z., Chang, X., Penny, D., Zhong, B., 2016. Chloroplast phylogenomic inference of green algae relationships. Sci. Rep. 6, 20528.
Tamura, K., Stecher, G., Peterson, D., Filipski, A., Kumar, S., 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol. Biol. Evol. 30 (12), 2725–2729.
Vanderlei, E.S.O., Patoilo, K.K.N.R., Lima, N.A., Lima, A.P.S., Rodrigues, J.A.G., Silva, L.M.C.M., Lima, M.E.P., Lima, V., Benevides, N.M.B., 2010. Antinociceptive and antiinflammatory activities of lectin from the marine green alga Caulerpa cupressoides. Int. Immunopharmacol. 10, 1113–1118.
Wakasugi, T., Nagai, T., Kapoor, M., Sugita, M., Ito, M., Ito, S., et al., 1997. Complete nucleotide sequence of the chloroplast genome from the green alga Chlorella vulgaris: the existence of genes possibly involved in chloroplast division. Proc. Natl. Acad. Sci. U. S. A. 94 (11), 5967–5972.
Wang, B., Zheng, F.R., Wang, X., Li, J.X., Jiang, M.C., Zhang, J.D., 2018. Nutrient component analysis and evaluation of Caulerpa lentillifera and Caulerpa okamurae. Nutrimenta Sinica 40 (5), 515–517.
Wu, C.S., Chaw, S.M., Huang, Y.Y., 2013. Chloroplast phylogenomics indicates that Ginkgo biloba is sister to cycads. Genome Biol. Evol. 5, 243–254.
Zhang, H.K., Gao, S.H., Martin, J.L., Hu, S.N., Chen, W.H., 2012. EvolView, an online tool for visualizing, annotating and managing phylogenetic trees. Nucleic Acids Res. 40 (Web Server issue), W569–W572.
Zheng, F.R., Liu, H.Z., Jiang, M.J., Xu, Z.J., Wang, Z.X., Wang, C., Du, F., Shen, Z., Wang, B., 2018. The complete mitochondrial genome of the Caulerpa lentillifera (Ulvophyceae, Chlorophyta): sequence, genome content, organization structure and phylogenetic consideration. Gene 673 (2), 25–238.
Zuccarello, G.C., Price, N., Verbruggen, H., Leliaert, F., 2009. Analysis of a plastid multigene data set and the phylogenetic position of the marine macroalga Caulerpa filiformis (Chlorophyta). J. Phycol. 45, 1206–1212.