Volume 36 Issue 4
Aug.  2023

Chloroplast Genome Phylogeny and Codon Preference of Anabasis aphylla L.

  • Corresponding author: JIANG Ping, shzujp@163.com
  • Received Date: 2022-11-26
    Accepted Date: 2023-02-09
  • Abstract: Objective: To analyze the structural characteristics of the Anabasis aphylla chloroplast genome, to clarify the phylogenetic position of Anabasis within Chenopodiaceae, and to characterize its codon usage preference and identify its optimal codons. Method: Total genomic DNA was extracted from fresh assimilating twigs of A. aphylla by the CTAB method and sequenced on the Illumina Genome Analyzer HiSeq 2000 high-throughput platform. The chloroplast genome was assembled with GetOrganelle and annotated with Plann. Simple sequence repeats (SSRs) were identified with MISA. Multiple sequence alignment and maximum-likelihood (ML) tree construction were carried out with MAFFT v7.450 and IQ-TREE v2.1.1. Nucleotide polymorphism was calculated with DnaSP 6.0. Codon usage was analyzed with CUSP and CodonW 1.4.2. Result: The chloroplast genome of A. aphylla was 154 084 bp long and comprised a large single-copy region (LSC, 85 124 bp), a small single-copy region (SSC, 18 934 bp) and a pair of inverted repeats (IRa and IRb, 25 013 bp each). A total of 132 genes were annotated: 83 protein-coding genes, 8 rRNA genes, 37 tRNA genes and 4 pseudogenes. Most SSRs were located in intergenic regions (70.4%), and mononucleotide (A/T) repeats were the most abundant SSR type. The best-fit model for the phylogenetic tree was TVM + F + R3, and Chenopodiaceae was divided into four clusters, in which A. aphylla was most closely related to Haloxylon and Salsola. The regions trnS-trnG (exon 1), ndhF-rpl32, rpl32-trnL, rps16 (exon 1)-trnQ and ycf1 showed high nucleotide polymorphism. Twenty optimal codons (UUU, UAU, UGU, CAU, UCU, UCA, UUA, CUU, CCU, AGA, GAA, ACU, ACA, AAU, GAU, AAA, GUU, GCU, GGU, CAA) were identified, all ending in A/U. Codon usage preference was shaped mainly by natural selection, with mutation and other factors playing a weaker role. Conclusion: The chloroplast genome of A. aphylla is structurally conserved and shows the typical quadripartite organization. Within Chenopodiaceae, A. aphylla is most closely related to Haloxylon and Salsola. The hypervariable regions and SSR loci identified here can be used for molecular identification at the intergeneric level in Chenopodiaceae. Codons in the A. aphylla chloroplast genome preferentially end in A/U, and the 20 optimal codons are useful for codon optimization of exogenous genes. The results provide a reference for molecular marker development, phylogenetic studies and chloroplast genetic engineering of A. aphylla.
  • [1] 朱格麟. 藜科植物的起源、分化和地理分布[J]. 植物分类学报, 1995, 34(5):486-504.

    [2] 楚光明, 王 梅, 张硕新. 准噶尔盆地南缘洪积扇无叶假木贼种群空间点格局[J]. 林业科学, 2014, 50(4):8-14.

    [3] 王婷婷, 楚光明, 江 萍, 等. 不同处理对无叶假木贼种子萌发的影响[J]. 西北林学院学报, 2017, 32(5):125-129. doi: 10.3969/j.issn.1001-7461.2017.05.22

    [4] 陈 华, 李援朝. 假木贼属植物化学成分及生物活性研究进展[J]. 天然产物研究与开发, 2004, 16(6):585-589. doi: 10.3969/j.issn.1001-6880.2004.06.025

    [5] 杜 华, 周立刚, 李 春, 等. 藜科植物化学成分与生物活性的研究进展[J]. 天然产物研究与开发, 2007, 19(5):884-889. doi: 10.3969/j.issn.1001-6880.2007.05.038

    [6] 孙 艳, 沈庆国, 屈玲霞, 等. 无叶假木贼的化学成分及抗菌活性研究[J]. 中草药, 2022, 53(8):2278-2284. doi: 10.7501/j.issn.0253-2670.2022.08.003

    [7] XIAO S Z, XU P, DENG Y T, et al. Comparative analysis of chloroplast genomes of cultivars and wild species of sweetpotato (Ipomoea batatas [L.] Lam)[J]. BMC Genomics, 2021, 22(1): 262. doi: 10.1186/s12864-021-07544-y

    [8] MENG J, LI X P, LI H T, et al. Comparative analysis of the complete chloroplast genomes of four Aconitum medicinal species[J]. Molecules, 2018, 23(5): 1015. doi: 10.3390/molecules23051015

    [9] LI P, LU R S, XU W Q, et al. Comparative genomics and phylogenomics of East Asian Tulips (Amana, Liliaceae)[J]. Frontiers in Plant Science, 2017, 8: 451.

    [10] MENEZES A P A, RESENDE-MOREIRA L C, BUZATTI R S O, et al. Chloroplast genomes of Byrsonima species (Malpighiaceae): Comparative analysis and screening of high divergence sequences[J]. Scientific Reports, 2018, 8(1): 2210. doi: 10.1038/s41598-018-20189-4

    [11] ZHOU J W, ZHANG S, WANG J, et al. Chloroplast genomes in Populus (Salicaceae): comparisons from an intensively sampled genus reveal dynamic patterns of evolution[J]. Scientific Reports, 2021, 11(1): 9471. doi: 10.1038/s41598-021-88160-4

    [12] de SANTANA LOPES A, PACHECO T G, et al. The Linum usitatissimum L. plastome reveals atypical structural evolution, new editing sites, and the phylogenetic position of Linaceae within Malpighiales[J]. Plant Cell Reports, 2018, 37(2): 307-328. doi: 10.1007/s00299-017-2231-z

    [13] VAUGHN J N, CHALUVADI S R, et al. Whole plastome sequences from five ginger species facilitate marker development and define limits to barcode methodology[J]. PLoS ONE, 2014, 9(10): e108581. doi: 10.1371/journal.pone.0108581

    [14] BOEL G, LETSO R, NEELY H, et al. Codon influence on protein expression in E. coli correlates with mRNA levels[J]. Nature, 2016, 529(7586): 358-363. doi: 10.1038/nature16509

    [15] CHENG Y, HE X, PRIYADARSHANI S V G N, et al. Assembly and comparative analysis of the complete mitochondrial genome of Suaeda glauca[J]. BMC Genomics, 2021, 22(1): 167. doi: 10.1186/s12864-021-07490-9

    [16] SHE H, LIU Z, XU Z, et al. Comparative chloroplast genome analyses of cultivated spinach and two wild progenitors shed light on the phylogenetic relationships and variation[J]. Scientific Reports, 2022, 12(1): 856. doi: 10.1038/s41598-022-04918-4

    [17] 张鲁杰, 夏秀英, 徐 娜, 等. 高效提取越橘成熟组织基因组DNA的方法[J]. 华北农学报, 2008, 3(S2):205-208. doi: 10.7668/hbnxb.2008.S2.047

    [18] JIN J J, YU W B, YANG J B, et al. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes[J]. Genome Biology, 2020, 21(1): 241. doi: 10.1186/s13059-020-02154-5

    [19] HUANG D I, CRONK Q C B. Plann: a command-line application for annotating plastome sequences[J]. Applications in Plant Sciences, 2015, 3(8): 1500026. doi: 10.3732/apps.1500026

    [20] BENSON D A, KARSCH-MIZRACHI I, LIPMAN D J, et al. GenBank[J]. Nucleic Acids Research, 2010, 39(suppl_1): D32.

    [21] GREINER S, LEHWARK P, BOCK R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical visualization of organellar genomes[J]. Nucleic Acids Research, 2019, 47(W1): W59. doi: 10.1093/nar/gkz238

    [22] KURTZ S, CHOUDHURI J V, OHLEBUSCH E, et al. REPuter: the manifold applications of repeat analysis on a genomic scale[J]. Nucleic Acids Research, 2001, 29(22): 4633-4642. doi: 10.1093/nar/29.22.4633

    [23] BEIER S, THIEL T, MÜNCH T, et al. MISA-web: a web server for microsatellite prediction[J]. Bioinformatics, 2017, 33(16): 2583-2585. doi: 10.1093/bioinformatics/btx198

    [24] KATOH K, STANDLEY D M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability[J]. Molecular Biology and Evolution, 2013, 30(4): 772-780. doi: 10.1093/molbev/mst010

    [25] MINH B Q, SCHMIDT H A, CHERNOMOR O, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era[J]. Molecular Biology and Evolution, 2020, 37(5): 1530-1534. doi: 10.1093/molbev/msaa015

    [26] ROZAS J, FERRER-MATA A, SÁNCHEZ-DELBARRIO J C, et al. DnaSP 6: DNA sequence polymorphism analysis of large data sets[J]. Molecular Biology and Evolution, 2017, 34(12): 3299-3302. doi: 10.1093/molbev/msx248

    [27] WOLF P G, DER J, DUFFY A, et al. The evolution of chloroplast genes and genomes in ferns[J]. Plant Molecular Biology, 2010, 76(3-5): 251-261.

    [28] SHINOZAKI K, OHME M, TANAKA M, et al. The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression[J]. The EMBO Journal, 1986, 5(9): 2029-2043.

    [29] DUGAS D V, HERNANDEZ D, KOENEN E J M, et al. Mimosoid legume plastome evolution: IR expansion, tandem repeat expansions, and accelerated rate of evolution in clpP[J]. Scientific Reports, 2015, 5(1): 16958. doi: 10.1038/srep16958

    [30] WANG W B, YU H, WANG J H, et al. The complete chloroplast genome sequences of the medicinal plant Forsythia suspensa (Oleaceae)[J]. International Journal of Molecular Sciences, 2017, 18(11): 2288. doi: 10.3390/ijms18112288

    [31] 蒋礼玲, 王琳超, 黄新荣, 等. 藜属植物叶绿体基因组结构与系统进化[J]. 应用与环境生物学报, 2022, 28(5):1255-1261.

    [32] 李泳潭, 张 军, 黄亚丽, 等. 杜梨叶绿体基因组分析[J]. 园艺学报, 2020, 47(6):1021-1032.

    [33] 蒋思思, 袁 军, 周文君, 等. 薄壳山核桃(Carya illinoinensis)叶绿体基因组及其特征分析[J]. 园艺学报, 2022, 49(8):1772-1784.

    [34] GUISINGER M M, CHUMLEY T W, KUEHL J V, et al. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae[J]. Journal of Molecular Evolution, 2010, 70(2): 149-166. doi: 10.1007/s00239-009-9317-3

    [35] KRISHNAN J, MISHRA R K. Code in the Non-Coding[J]. Proceedings of the Indian National Science Academy, 2015, 81(3): 609-628.

    [36] KOROL A B, FAHIMA T, NEVO E. Microsatellites within genes: structure, function, and evolution[J]. Molecular Biology and Evolution, 2004, 21(6): 991-1007. doi: 10.1093/molbev/msh073

    [37] 蒋 明, 柯世省, 王军峰. 多脉铁木叶绿体基因组的序列特征和系统发育[J]. 林业科学, 2020, 56(5):60-68. doi: 10.11707/j.1001-7488.20200507

    [38] 高鸣泽. 不同品种藜麦叶绿体基因组全序列及其系统发育关系[D]. 太原: 山西大学, 2021.

    [39] 努尔古丽·阿木提. 新疆藜科植物系统分类学研究[D]. 乌鲁木齐: 新疆大学, 2013.

    [40] DONG S, YING Z, YU S, et al. Complete chloroplast genome of Stephania tetrandra (Menispermaceae) from Zhejiang Province: insights into molecular structures, comparative genome analysis, mutational hotspots and phylogenetic relationships[J]. BMC Genomics, 2021, 22(1): 880. doi: 10.1186/s12864-021-08193-x

    [41] HUANG X, JIAO Y, GUO J, et al. Analysis of codon usage patterns in Haloxylon ammodendron based on genomic and transcriptomic data[J]. Gene, 2022, 845: 146842. doi: 10.1016/j.gene.2022.146842

    [42] ZHANG Z C, DAI W, WANG Y, et al. Analysis of synonymous codon usage patterns in torque Teno sus virus 1 (TTSuV1)[J]. Archives of Virology, 2013, 158(1): 145-154. doi: 10.1007/s00705-012-1480-y

    [43] CAMPBELL W H, GOWRI G. Codon usage in higher plants, green algae, and cyanobacteria[J]. Plant Physiology, 1990, 92(1): 1-11. doi: 10.1104/pp.92.1.1

  • College of Agriculture, Shihezi University, Shihezi 832003, Xinjiang, China


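  • Anabasis aphylla L. is a semi-shrub of the genus Anabasis (Chenopodiaceae) with strong tolerance to saline-alkali soils; in China it is distributed mainly in the northwest [1-2]. It is a major constructive and dominant species of desert vegetation [3], is often used as plant material for windbreaks and sand fixation, and has high ecological value. In addition, its extracts contain alkaloids, terpenoids, saponins and other bioactive compounds [4], which are effective in treating tinea, scabies and itchy, painful eczema, and in controlling pests such as cabbage caterpillars and aphids [5-6].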

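    The chloroplast is an important plastid that plays a key role in photosynthesis and other biological processes in plant cells [7]. Chloroplast genomes are generally more conserved than nuclear genes and are valuable for plant phylogenetics and species identification [8-9]. During evolution, the chloroplast genome has remained highly conserved in sequence, composition, size and gene content [10], with a quadripartite structure consisting of two inverted repeats (IR), one small single-copy region (SSC) and one large single-copy region (LSC) [11]. Contraction and expansion of the IR regions are the main drivers of variation in chloroplast gene content and genome size [12]. Chloroplast genomes also contain simple sequence repeats (SSRs) and hotspot regions of single-nucleotide polymorphism (SNP) that carry enough information for species classification and identification [13]. In addition, codon usage bias in plant chloroplast genomes reflects the degree of molecular adaptation and the evolutionary pressure experienced during evolution, and it is also involved in gene expression [14]. Chloroplast genome sequences have already been used as super-barcodes in phylogenetic studies of several Chenopodiaceae species [15-16]. However, no chloroplast genome has been reported for any species of Anabasis, and the evolutionary characteristics and genetic diversity of the genus remain unclear.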

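    In this study, the chloroplast genome of A. aphylla, the first for the genus Anabasis, was sequenced, assembled and annotated, and its genome features and codon usage bias were analyzed. A phylogenetic tree was then constructed together with Chenopodiaceae species whose chloroplast genomes have been published, and interspecific hypervariable regions were screened. The aims were: (1) to clarify the evolutionary relationships between A. aphylla and other Chenopodiaceae species and its phylogenetic position; and (2) to screen effective candidate molecular marker sequences and optimal codons, so as to provide a reference for molecular marker development, phylogenetic studies and chloroplast genetic engineering of A. aphylla.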

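    • Plant samples were collected from the southern margin of the Junggar Basin, Xinjiang (84°52' E, 45°22' N; altitude 265 m) and were identified as A. aphylla L. by Professor CHU Guangming of Shihezi University. Young assimilating twigs were frozen in liquid nitrogen, transported in a liquid-nitrogen container and stored at -80 ℃ in the laboratory. Total DNA was extracted by a modified CTAB method [17] and fragmented by ultrasonication. The fragments were purified, end-repaired, A-tailed at the 3' end and ligated to sequencing adapters; fragments of suitable length were then selected by agarose gel electrophoresis and amplified by PCR to construct the sequencing library. After library quality control, the chloroplast genome was sequenced on the Illumina Genome Analyzer HiSeq 2000 platform.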

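    • The chloroplast genome of A. aphylla was assembled with GetOrganelle [18] and annotated with the Perl script Plann [19]; Sequin [20] was used to check for missing or incorrectly annotated genes. The circular map of the chloroplast genome was drawn with OGDRAW v1.3.1 [21]. The chloroplast genome data of A. aphylla have been deposited in GenBank (https://www.ncbi.nlm.nih.gov/genbank/) under accession number OP712667.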

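    • Long repeats were analyzed with the REPuter online tool, with the maximum number of computed repeats set to 100 and the minimum repeat size set to 22 bp [22]. Simple sequence repeats (SSRs) were detected with the MISA online tool [23] using the following minimum repeat numbers: ≥10 for mononucleotides, ≥5 for dinucleotides, ≥4 for trinucleotides, and ≥3 for tetra- to hexanucleotides.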

    • Chloroplast genome sequences of 19 Chenopodiaceae species were downloaded from NCBI GenBank, with the following accession numbers: Spinacia oleracea L. (AJ400848), Dysphania pumilio (R.Br.) Mosyakin & Clemants (MH936550), Dysphania botrys L. (MH898873), Dysphania ambrosioides L. (MK182726), Chenopodium quinoa Willd. (KY419706), Chenopodium ficifolium Sm. (MK182725), Chenopodium album L. (KY419707), Chenopodium acuminatum Willd. (MW057780), Atriplex centralasiatica Iljin (MK867774), Atriplex gmelinii C. A. Mey. ex Bong. (MT810472), Salsola affinis C. A. Mey (ON080842), Salsola abrotanoides (Bunge) Akhani (MW123092), Haloxylon ammodendron (C. A. Mey.) Bunge (KF534478), Haloxylon persicum Bge. (KF534479), Suaeda glauca L. (MK867773), Salicornia europaea L. (KJ629116), Salicornia brachiata Miq. (KJ629115), Salicornia bigelovii Torr. (KJ629117) and Kalidium foliatum (Pall.) Moq. (MW699755). The chloroplast genomes of two Amaranthaceae species, Deeringia amaranthoides (Lam.) Merr. (MK397865) and Celosia argentea L. (MK397861), were used as outgroups and combined with the newly sequenced A. aphylla chloroplast genome to construct the phylogenetic tree. The chloroplast genome sequences of the 22 species were aligned with MAFFT v7.450 [24], and a maximum-likelihood (ML) tree was built with IQ-TREE v2.1.1 [25] under the best-fit model TVM + F + R3 with 1 000 bootstrap replicates.

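    • The chloroplast genome sequences of 20 Chenopodiaceae species, including A. aphylla, were aligned with MAFFT v7.450, and nucleotide polymorphism was calculated from the alignment with DnaSP 6.0 [26] (window length 600 bp, step size 200 bp). GenBank-format chloroplast genome files of A. aphylla and its close relatives were uploaded to an online bioinformatics cloud analysis platform (http://112.86.217.82:9919/#/tool/alltool/detail/296) to visualize the genes at the IR boundaries.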
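The sliding-window calculation can be sketched as follows. This is only an approximation of what DnaSP reports, under the assumption that nucleotide diversity is the mean number of pairwise differences per site and that alignment columns containing gaps or N are excluded; `window_pi` and `sliding_pi` are hypothetical names.

```python
from itertools import combinations


def window_pi(aln, start, end):
    """Mean pairwise differences per site over alignment columns [start, end).

    Columns containing a gap ('-') or 'N' in any sequence are skipped (an assumption).
    """
    columns = [col for col in zip(*(s[start:end] for s in aln))
               if "-" not in col and "N" not in col]
    if not columns:
        return 0.0
    n_pairs = len(aln) * (len(aln) - 1) / 2
    diffs = sum(a != b for col in columns for a, b in combinations(col, 2))
    return diffs / n_pairs / len(columns)


def sliding_pi(aln, window=600, step=200):
    """Return (window_start, pi) for each full window along the alignment."""
    length = len(aln[0])
    return [(s, window_pi(aln, s, s + window))
            for s in range(0, length - window + 1, step)]


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
    for start, pi in sliding_pi(toy, window=4, step=2):
        print(start, round(pi, 4))
```

The defaults mirror the 600 bp window and 200 bp step stated above; windows whose value exceeds a chosen cutoff (0.1 in the results) would then be flagged as hotspots.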

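    • Gene sequences longer than 300 bp were selected from the A. aphylla chloroplast genome, and CodonW 1.4.2 and the CUSP online program (https://emboss.toulouse.inra.fr/cgi-bin/emboss/cusp) were used to calculate the effective number of codons (ENC), relative synonymous codon usage (RSCU), codon GC content and optimal codons. The factors influencing codon usage bias were analyzed with neutrality plots, ENC-plots and PR2-plots.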
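As a reminder of what RSCU measures, the sketch below computes it directly from coding sequences: the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. It assumes the standard genetic code and is not the CodonW or CUSP implementation; `CODON_TABLE` and `rscu` are names introduced here.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (an assumption; stop codons are '*').
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # ignore codons with ambiguous bases
                counts[codon] += 1
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        for c in codons:
            # observed count / (family total / family size)
            result[c] = counts[c] * len(codons) / total
    return result


if __name__ == "__main__":
    values = rscu(["ATGTTTTTTTTTTTC"])
    print(values["TTT"], values["TTC"], values["ATG"])
```

An RSCU above 1 means the codon is used more often than expected under equal synonymous usage; codons are keyed in DNA form (T rather than U).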

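    2.   Results and Analysis
    • The chloroplast genome of A. aphylla has a typical double-stranded circular quadripartite structure (Fig. 1) with a total length of 154 084 bp, comprising an LSC of 85 124 bp, an SSC of 18 934 bp, and IRa and IRb of 25 013 bp each. The overall GC content is 36.25%; the GC contents of the SSC, LSC and IR regions are 29.26%, 33.89% and 42.85%, respectively.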
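The region lengths and GC contents reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; treating the overall GC content as the length-weighted mean of the regional values is an assumption about how the figures relate, and the variable names are illustrative.

```python
# Region lengths (bp) and GC contents (%) as reported in the text.
LSC, SSC, IR = 85_124, 18_934, 25_013
GC = {"LSC": 33.89, "SSC": 29.26, "IR": 42.85}

# A quadripartite genome is LSC + SSC + two IR copies.
total = LSC + SSC + 2 * IR

# Length-weighted mean GC content across the four regions.
weighted_gc = (LSC * GC["LSC"] + SSC * GC["SSC"] + 2 * IR * GC["IR"]) / total

if __name__ == "__main__":
    print(total)
    print(round(weighted_gc, 2))
```

The sum reproduces the reported genome length of 154 084 bp, and the weighted mean comes out near 36.2%, in line with the reported 36.25% once rounding of the regional values is allowed for.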

      Figure 1.  Gene map of the chloroplast genome of Anabasis aphylla

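      A total of 132 genes were annotated in the A. aphylla chloroplast genome, comprising 83 protein-coding genes, 8 rRNA genes, 37 tRNA genes and 4 pseudogenes. Of these, 75 genes are related to self-replication, 45 to photosynthesis, 6 encode other proteins, and 6 are of unknown function (Table 1). Sixteen genes are present in two copies, including 6 protein-coding genes (rpl23, rpl2, rps12, rps7, ndhB, ycf2), 6 tRNA genes (trnA-UGC, trnI-GAU, trnL-CAA, trnN-GUU, trnR-ACG, trnV-GAC) and 4 rRNA genes (rrn4.5S, rrn5S, rrn16S, rrn23S). In addition, one tRNA gene (trnM-CAU) is present in three copies.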

      | Gene function | Gene group | Genes |
      |---|---|---|
      | Photosynthesis | Photosystem I | psaA, psaB, psaC, psaI, psaJ |
      | Photosynthesis | Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
      | Photosynthesis | Subunits of NADH dehydrogenase | ndhA*, ndhB*(2), ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
      | Photosynthesis | Cytochrome b/f complex | petA, petB*, petD*, petG, petL, petN |
      | Photosynthesis | Subunits of ATP synthase | atpA, atpB, atpE, atpF*, atpH, atpI |
      | Photosynthesis | Large subunit of rubisco | rbcL |
      | Self-replication | Proteins of large ribosomal subunit | #rpl23(2), rpl14, rpl16*, rpl2(2), rpl20, rpl22, rpl32, rpl33, rpl36 |
      | Self-replication | Proteins of small ribosomal subunit | #rps19, rps11, rps12(2), rps14, rps15, rps16*, rps18, rps19, rps2, rps3, rps4, rps7(2), rps8 |
      | Self-replication | Subunits of RNA polymerase | rpoA, rpoB, rpoC1*, rpoC2 |
      | Self-replication | Ribosomal RNAs | rrn16S(2), rrn23S(2), rrn4.5S(2), rrn5S(2) |
      | Self-replication | Transfer RNAs | trnA-UGC*(2), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnG-GCC, trnG-UCC*, trnH-GUG, trnI-GAU*(2), trnK-UUU*, trnL-CAA(2), trnL-UAA*, trnL-UAG, trnM-CAU(3), trnN-GUU(2), trnP-UGG, trnQ-UUG, trnR-ACG(2), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC(2), trnV-UAC*, trnW-CCA, trnY-GUA, trnfM-CAU |
      | Other genes | Maturase | matK |
      | Other genes | Protease | clpP** |
      | Other genes | Envelope membrane protein | cemA |
      | Other genes | Acetyl-CoA carboxylase | accD |
      | Other genes | c-type cytochrome synthesis gene | ccsA |
      | Other genes | Translation initiation factor | infA |
      | Unknown function | Hypothetical chloroplast reading frames | #ycf1, ycf1, ycf2(2), ycf3**, ycf4 |

      Notes: * indicates a gene containing one intron; ** indicates a gene containing two introns; # indicates a pseudogene; (n) indicates the copy number of a multi-copy gene.

      Table 1.  Gene annotation in the chloroplast genome of Anabasis aphylla

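      Sixteen genes in the A. aphylla chloroplast genome contain introns: 12 are located in the LSC, 3 in the IR and 1 in the SSC (Table 2). Fourteen genes contain one intron (trnK-UUU, rps16, trnG-UCC, atpF, rpoC1, trnL-UAA, trnI-GAU, petB, petD, rpl16, ndhB, trnV-UAC, trnA-UGC, ndhA), and two genes contain two introns (ycf3, clpP). Intron length ranges from 525 bp (trnL) to 2 500 bp (trnK).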

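      | Gene | Location | Exon I/bp | Intron I/bp | Exon II/bp | Intron II/bp | Exon III/bp |
      |---|---|---|---|---|---|---|
      | trnK-UUU | LSC | 37 | 2 500 | 35 | | |
      | rps16 | LSC | 40 | 892 | 197 | | |
      | trnG-UCC | LSC | 31 | 708 | 60 | | |
      | atpF | LSC | 145 | 777 | 410 | | |
      | rpoC1 | LSC | 432 | 789 | 1 602 | | |
      | ycf3 | LSC | 126 | 779 | 228 | 790 | 153 |
      | trnL-UAA | LSC | 35 | 525 | 50 | | |
      | trnI-GAU | LSC | 39 | 576 | 56 | | |
      | clpP | LSC | 71 | 949 | 294 | 605 | 226 |
      | petB | LSC | 6 | 797 | 642 | | |
      | petD | LSC | 8 | 744 | 475 | | |
      | rpl16 | LSC | 9 | 1 095 | 360 | | |
      | ndhB | IR | 775 | 673 | 758 | | |
      | trnV-UAC | IR | 32 | 946 | 42 | | |
      | trnA-UGC | IR | 37 | 838 | 36 | | |
      | ndhA | SSC | 553 | 1 101 | 539 | | |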

      Table 2.  Information of gene introns in the chloroplast genome of Anabasis aphylla

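    • A total of 41 pairs of long repeats were found in the A. aphylla chloroplast genome: 21 forward and 20 palindromic repeats, with no complement or reverse repeats (Fig. 2A). Repeats of 30 bp were the most numerous, and most long repeats were located in the IR regions. A total of 71 SSR loci belonging to 12 repeat types were identified (Fig. 2B). A/T repeats were the most abundant SSR type, most commonly with 10, 11 or 12 repeat units. SSRs were most numerous in intergenic regions (70.4%), followed by introns (14.1%) and protein-coding sequences (12.7%), and least numerous in tRNA and rRNA genes (1.4%) (Fig. 2C).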

      Figure 2.  Repeat sequences analysis

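    • To determine the phylogenetic position of A. aphylla within Chenopodiaceae, its chloroplast genome was analyzed together with those of 19 Chenopodiaceae species, with two Amaranthaceae species as outgroups, and an ML tree was constructed (Fig. 3). The Chenopodiaceae species formed two major branches with high support; most nodes had 100% support. The first branch contained clusters 1 and 2: cluster 1 comprised 5 species of Kalidium, Salicornia and Suaeda, and cluster 2 comprised 5 species of Anabasis, Haloxylon and Salsola. The second branch contained clusters 3 and 4: cluster 3 comprised 6 species of Atriplex and Chenopodium, and cluster 4 comprised 4 species of Dysphania and Spinacia. The two Amaranthaceae outgroup species formed a separate branch.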

      Figure 3.  The Maximum-likelihood tree of Chenopodiaceae species based on analyses of the chloroplast genomes

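    • Based on the clustering results, mutation hotspots were analyzed in the chloroplast genome sequences of A. aphylla and 9 closely related species. Nucleotide polymorphism in the LSC and SSC regions was clearly higher than in the IR regions (Fig. 4). The alignment was 161 920 bp long, with 138 470 bp of conserved sites and 14 021 variable sites; mean nucleotide polymorphism was 0.039 18, ranging from 0 to 0.143 43. Nineteen sites had nucleotide polymorphism greater than 0.1, 3 in the LSC and 16 in the SSC; they fell in the trnS-trnG (exon 1), ndhF-rpl32, rpl32-trnL and rps16 (exon 1)-trnQ intergenic regions and in the ycf1 gene.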

      Figure 4.  The nucleotide diversity of chloroplast genome sequence of Chenopodiaceae species

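    • Boundary analysis of A. aphylla and its 9 close relatives showed that IR length varied little (23 701 to 25 036 bp), but the transition zones at the four boundaries differed to some extent (Fig. 5). In all 10 Chenopodiaceae species, the rps19 gene was present at the IRb-LSC boundary and extended 148 to 173 bp into the LSC. At the IRb-SSC boundary, the ycf1 pseudogene was absent in 5 species of Haloxylon, Salsola and Suaeda; in the other 5 species, ycf1 extended into the SSC by 18 to 4 440 bp. At the IRa-SSC boundary, ycf1 was expanded to varying degrees in all species, by 3 to 5 426 bp. At the IRa-LSC boundary, rps19 was absent from the IRa region in 6 species of Kalidium, Salicornia, Salsola and Suaeda, and in the remaining 4 species rps19 did not cross the IRa-LSC boundary.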

      Figure 5.  Analysis of IR boundary contraction and expansion in chloroplast genome of Chenopodiaceae species

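    • RSCU values in the A. aphylla chloroplast genome ranged from 0.32 (CUG) to 2.07 (UUA). Thirty codons were high-frequency codons (RSCU > 1); except for the leucine codon UUG, which ends in G, the other 29 all end in A/U (Table 3). Twenty optimal codons were identified (UUU, UAU, UGU, CAU, UCU, UCA, UUA, CUU, CCU, AGA, GAA, ACU, ACA, AAU, GAU, AAA, GUU, GCU, GGU, CAA), all ending in A/U.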

      | Amino acid | Codon | RSCU (genome) | RSCU (high-expression genes) | RSCU (low-expression genes) |
      |---|---|---|---|---|
      | Phe/F | UUU* | 1.38 | 1.56 | 1.06 |
      | Phe/F | UUC | 0.62 | 0.44 | 0.94 |
      | Tyr/Y | UAU* | 1.64 | 1.74 | 0.36 |
      | Tyr/Y | UAC | 0.36 | 0.26 | 1.64 |
      | Cys/C | UGU* | 1.55 | 1.79 | 0.63 |
      | Cys/C | UGC | 0.45 | 0.21 | 1.37 |
      | His/H | CAU* | 1.50 | 1.62 | 1.06 |
      | His/H | CAC | 0.50 | 0.38 | 0.94 |
      | Ser/S | UCU* | 1.70 | 1.87 | 0.58 |
      | Ser/S | UCC | 0.91 | 0.84 | 0.89 |
      | Ser/S | UCA* | 1.19 | 1.36 | 0.72 |
      | Ser/S | UCG | 0.62 | 0.60 | 1.27 |
      | Ser/S | AGU | 1.24 | 1.17 | 1.10 |
      | Ser/S | AGC | 0.34 | 0.16 | 1.44 |
      | Leu/L | UUA* | 2.07 | 2.48 | 0.91 |
      | Leu/L | UUG | 1.17 | 0.99 | 0.82 |
      | Leu/L | CUU* | 1.25 | 1.21 | 0.56 |
      | Leu/L | CUC | 0.36 | 0.34 | 0.99 |
      | Leu/L | CUA | 0.83 | 0.62 | 1.25 |
      | Leu/L | CUG | 0.32 | 0.36 | 1.47 |
      | Pro/P | CCU* | 1.62 | 1.82 | 0.94 |
      | Pro/P | CCC | 0.71 | 0.58 | 1.17 |
      | Pro/P | CCA | 1.18 | 1.24 | 0.66 |
      | Pro/P | CCG | 0.49 | 0.36 | 1.23 |
      | Arg/R | CGU | 1.45 | 1.16 | 0.88 |
      | Arg/R | CGC | 0.38 | 0.10 | 0.54 |
      | Arg/R | CGA | 1.46 | 1.46 | 0.85 |
      | Arg/R | CGG | 0.38 | 0.15 | 1.25 |
      | Arg/R | AGA* | 1.70 | 2.62 | 0.94 |
      | Arg/R | AGG | 0.62 | 0.50 | 1.54 |
      | Gln/Q | CAA* | 1.58 | 1.74 | 1.08 |
      | Gln/Q | CAG | 0.42 | 0.26 | 0.92 |
      | Ile/I | AUU | 1.52 | 1.46 | 0.86 |
      | Ile/I | AUC | 0.52 | 0.30 | 1.50 |
      | Ile/I | AUA | 0.97 | 1.24 | 0.64 |
      | Thr/T | ACU* | 1.63 | 1.46 | 0.81 |
      | Thr/T | ACC | 0.69 | 0.53 | 1.18 |
      | Thr/T | ACA* | 1.27 | 1.55 | 0.91 |
      | Thr/T | ACG | 0.42 | 0.47 | 1.11 |
      | Asn/N | AAU* | 1.56 | 1.61 | 0.88 |
      | Asn/N | AAC | 0.44 | 0.39 | 1.12 |
      | Asp/D | GAU* | 1.59 | 1.56 | 1.00 |
      | Asp/D | GAC | 0.41 | 0.44 | 1.00 |
      | Lys/K | AAA* | 1.55 | 1.66 | 0.86 |
      | Lys/K | AAG | 0.45 | 0.34 | 1.14 |
      | Val/V | GUU* | 1.55 | 1.76 | 0.79 |
      | Val/V | GUC | 0.37 | 0.40 | 0.59 |
      | Val/V | GUA | 1.56 | 1.47 | 1.10 |
      | Val/V | GUG | 0.52 | 0.37 | 1.52 |
      | Ala/A | GCU* | 1.81 | 2.02 | 0.66 |
      | Ala/A | GCC | 0.63 | 0.67 | 1.04 |
      | Ala/A | GCA | 1.11 | 1.11 | 1.21 |
      | Ala/A | GCG | 0.45 | 0.19 | 1.10 |
      | Gly/G | GGU* | 1.34 | 1.45 | 0.64 |
      | Gly/G | GGC | 0.39 | 0.21 | 0.86 |
      | Gly/G | GGA | 1.65 | 1.72 | 1.22 |
      | Gly/G | GGG | 0.62 | 0.62 | 1.28 |
      | Glu/E | GAA* | 1.55 | 1.76 | 0.94 |
      | Glu/E | GAG | 0.45 | 0.24 | 1.06 |
      | Trp/W | UGG | 1.00 | 1.00 | 1.00 |
      | Met/M | AUG | 1.00 | 1.00 | 1.00 |
      | TER | UAA | 1.58 | 2.00 | 0.88 |
      | TER | UAG | 0.57 | 0.00 | 0.97 |
      | TER | UGA | 0.85 | 1.00 | 1.15 |

      Notes: * indicates the optimal codon, and the underline indicates that the RSCU value of the codon in the chloroplast genome is greater than 1.

      Table 3.  Relative synonymous codon usage (RSCU) of genes in the chloroplast genome of Anabasis aphylla

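      The factors influencing codon usage bias in the A. aphylla chloroplast genome were further analyzed with an ENC-plot, an ENC distribution histogram, a PR2-plot and a neutrality plot (Fig. 6). Fig. 6A shows that most genes lie near the expected curve. Fig. 6B shows that most genes have ENC values below the expected ENC, falling mainly in the 0 to 0.1 interval of the histogram. Fig. 6C shows that the numbers of points in the four quadrants differ little, although the lower-right quadrant contains slightly more points than the other three; this indicates that, besides mutation, natural selection also influences codon usage bias in the A. aphylla chloroplast genome. Fig. 6D shows that the correlation coefficient between GC12 and GC3 is 0.45 and the linear regression coefficient is 0.343 6, which further indicates that mutation accounts for 34.36% of the influence on codon usage bias. Codon usage bias in the A. aphylla chloroplast genome is therefore shaped mainly by natural selection, with mutation and other factors having a weaker effect.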
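The quantities behind the ENC-plot and the neutrality plot can be sketched in a few lines. The expected-ENC curve below is the commonly used Wright formula, ENC = 2 + s + 29 / (s² + (1 − s)²) with s the GC content at third codon positions; whether the study used exactly this form, and the definition of the histogram ratio as (expected − observed) / expected, are assumptions. The function names are hypothetical.

```python
def expected_enc(gc3):
    """Expected ENC under mutation pressure alone, as a function of GC3 (0..1)."""
    return 2 + gc3 + 29 / (gc3 ** 2 + (1 - gc3) ** 2)


def enc_ratio(enc_observed, gc3):
    """Relative deviation of observed ENC from the expected curve (assumed definition)."""
    enc_exp = expected_enc(gc3)
    return (enc_exp - enc_observed) / enc_exp


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x, e.g. GC12 (y) against GC3 (x)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


if __name__ == "__main__":
    print(expected_enc(0.5))
    print(enc_ratio(50.0, 0.3))
    print(ols_slope([0.20, 0.25, 0.30], [0.40, 0.42, 0.44]))
```

In a neutrality plot, a slope close to 1 suggests that mutation pressure dominates, while a slope close to 0 points to selection; the text reads the reported slope of 0.343 6 as a 34.36% contribution of mutation.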

      Figure 6.  Analysis of the factors influencing codon bias in the chloroplast genome of Anabasis aphylla

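    3.   Discussion
    • Plant chloroplast genomes are relatively conserved. Previous studies have shown that angiosperm chloroplast genomes are usually 120 to 180 kb long, with IR regions of 20 to 30 kb [27]. In this study, the A. aphylla chloroplast genome was 154 084 bp long with IR regions of 25 013 bp, within the range reported for angiosperms. Like most angiosperm chloroplast genomes, it has a circular quadripartite structure [28]. Chloroplast genome size in angiosperms is closely related to expansion and contraction of the boundaries between the IR and SC regions [29]. Boundary analysis of A. aphylla and its 9 close relatives showed little variation in IR length (23 701 to 25 036 bp), indicating that chloroplast genome structure in Chenopodiaceae is relatively conserved; however, in some Chenopodiaceae species the ycf1 and rps19 genes at the IR-LSC boundaries were lost or expanded to varying degrees, which accounts for the differences in IR length among Chenopodiaceae chloroplast genomes [30]. The mean GC content of the A. aphylla chloroplast genome was 36.25%, which may be related to the preference for codons ending in A/U. A total of 132 genes were annotated; compared with the chloroplast genomes of four Chenopodium species reported by Jiang Liling et al. [31], A. aphylla has one fewer protein-coding gene and four more pseudogenes, which may be related to the slower evolution of species in Anabasis [32]. Similar results were obtained in chloroplast genome studies of Carya (Juglandaceae) [33] and Pyrus (Rosaceae) [32], suggesting that differences in gene number among chloroplast genomes of different Chenopodiaceae species are normal. The authors also speculate that the differences are related to the sequencing platforms and annotation methods used in different studies.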

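      Repeat sequences and polymorphic sites in plant chloroplast genomes are now widely used to study genetic diversity and phylogenetic relationships in many species [34]. In this study, 71 SSR loci of 12 types were identified in the A. aphylla chloroplast genome. Mononucleotide (A/T), dinucleotide (AT/TA) and trinucleotide (TAA/TTA) repeats, together with some longer motifs (AAAT, TAAT, TTAT, TAAAA, TTATT, TTTTA), were all poly-A or poly-T and accounted for 83.33% of all SSR loci, which is an important reason for the high AT content of the A. aphylla chloroplast genome. In eukaryotes, most SSRs are located in non-coding sequences [35]; in the A. aphylla chloroplast genome, SSRs were found mainly in intergenic regions (70.4%), which may be related to the conserved evolution of the chloroplast genome in this species [36]. In addition, nucleotide polymorphism in the LSC and SSC regions was markedly higher than in the IR regions, consistent with the results of Jiang Liling et al. [31] for Chenopodium chloroplast genomes. The regions trnS-trnG (exon 1), ndhF-rpl32, rpl32-trnL, rps16 (exon 1)-trnQ and ycf1 are high-polymorphism regions in the A. aphylla chloroplast genome, and these sequences lay a foundation for intergeneric molecular identification within the family.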

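      Whole chloroplast genome sequences contain more information than single or multiple coding sequences, and phylogenetic trees built from them are more accurate [37]. Gao Mingze [38] built a phylogenetic tree from the protein-coding genes of 41 Chenopodiaceae species and grouped Salicornia, Suaeda and Haloxylon together, indicating that they are closely related. The present study reached a similar conclusion and further showed that Anabasis is closely related to Haloxylon and Salsola, consistent with earlier work that placed Anabasis, Haloxylon and Salsola in the tribe Salsoleae on the basis of morphological characters [39]. This is the first chloroplast genome sequenced for a species of Anabasis, and it clarifies the phylogenetic position of the genus within Chenopodiaceae; however, fully resolving the phylogenetic relationships within the family will require whole chloroplast genome sequences from more Chenopodiaceae species.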

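      Codon usage bias plays an important role in the expression of protein-coding genes in plant chloroplast genomes and is closely related to molecular evolutionary processes such as mutation, natural selection and random genetic drift [40]. In this study, the combined results of the ENC-plot, ENC distribution histogram, PR2-plot and neutrality plot indicated that codon usage bias in the A. aphylla chloroplast genome is shaped mainly by natural selection, with mutation and other factors having a weaker effect; this may be related to the particular environment that desert plants have experienced during their evolution [41]. Previous studies have shown that the GC content at the third codon position also plays an important role in the evolution of genome structure [42]. Here, of the 30 codons with RSCU > 1, all but the leucine codon UUG end in A/U, whereas codons with RSCU < 1 mostly end in G/C. This indicates that synonymous codons ending in A/U are used more often in the protein-coding genes of the A. aphylla chloroplast genome, consistent with earlier studies of codon usage bias in dicotyledons [43]. In addition, 20 optimal codons, all ending in A/U, were identified in the A. aphylla chloroplast genome, providing a theoretical basis for codon optimization of exogenous genes in A. aphylla.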

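    4.   Conclusion
    • The chloroplast genome of A. aphylla is 154 084 bp long and comprises one LSC (85 124 bp), one SSC (18 934 bp) and a pair of IR regions (IRa and IRb, 25 013 bp each), showing the typical quadripartite structure. A total of 71 SSR loci of 12 types were identified. The regions trnS-trnG (exon 1), ndhF-rpl32, rpl32-trnL, rps16 (exon 1)-trnQ and ycf1 are high nucleotide polymorphism regions in the chloroplast genomes of A. aphylla and its 9 close relatives. This information provides a scientific basis for future molecular marker development in A. aphylla. In the phylogenetic analysis, the 20 Chenopodiaceae species were grouped into 4 clusters, in which A. aphylla was most closely related to species of Haloxylon and Salsola. Twenty optimal codons, all ending in A/U, were identified in the A. aphylla chloroplast genome; codon usage bias is shaped mainly by natural selection, with mutation and other factors having a weaker effect. The results provide a reference for phylogenetic studies and chloroplast genetic engineering of A. aphylla.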
