草业学报
PRATACULTURAL SCIENCE
2014年
6期
242-252
共11页
贾新平%叶晓青%梁丽建%邓衍明%孙晓波%佘建明
海滨雀稗%转录组%高通量测序%基因注释%SSR
Paspalum vaginatum%transcriptome%high-throughput sequencing%gene annotation%simple sequence repeat
采用新一代高通量测序技术 Illumina HiSeq 2000对海滨雀稗叶片转录组进行测序,结合生物信息学方法开展基因表达谱研究和功能基因预测。通过测序,获得了47520544个序列读取片段(reads),包含了4752054400个碱基序列(bp)信息。对 reads 进行序列组装,获得81220个单基因簇(unigene),平均长度1077 bp,序列信息达到了87542503 bp。另外从长度分布、GC 含量、表达水平等方面对 unigene 进行评估,数据显示测序质量好,可信度高。数据库中的序列同源性比较表明,46169个 unigene 与其他生物的已知基因具有不同程度的同源性。海滨雀稗转录组中的 unigene 根据 GO 功能大致可分为细胞组分、分子功能和生物学过程三大类48个分支,其中有大量 unigene 与代谢进程、结合活性和细胞进程相关。将 unigene 与 COG 数据库进行比对,根据其功能大致可分为25类。KEGG 数据库作为参考,依据代谢途径可将 unigene 定位到112个代谢途径分支,包括苯丙氨酸代谢通路、植物与病原物互作、植物激素生物合成和信号转导、黄酮类化合物合成、萜类骨架生物合成、脂类代谢、RNA 降解等。SSR 位点查找发现,从81220个 unigene 中共找到22721个 SSR 位点。SSR 不同重复基序类型中,出现频率最高的为 A/T,其次是 CCG/CGG 和 AGC/CTG。本研究首次对海滨雀稗转录组进行了分析,为草坪草的分子生物学研究提供了宝贵的基因组数据来源。
The leaf transcriptome of Paspalum vaginatum was sequenced on an Illumina HiSeq 2000 platform, a new-generation high-throughput sequencing technology, and bioinformatic methods were used to study gene expression profiles and predict functional genes. Sequencing generated a total of 47520544 reads containing 4752054400 bp of sequence information. Assembly of the reads yielded 81220 unigenes with an average length of 1077 bp, together containing 87542503 bp of sequence. Unigene quality was assessed in terms of length distribution, GC content and expression level; the data indicated that the sequencing was of high quality and reliability. BLAST searches against the Nr, Nt and SwissProt databases showed that 46169 unigenes had some degree of homology with known genes from other organisms. By gene ontology (GO), the unigenes could be broadly divided into three main categories (cellular component, molecular function and biological process) comprising 48 branches, with large numbers of unigenes related to metabolic process, binding and cellular process. Comparison with the COG database grouped the unigenes into 25 functional categories. Using the KEGG database as a reference, the unigenes were mapped to 112 metabolic pathway branches, including the phenylalanine metabolism pathway, plant-pathogen interaction, plant hormone biosynthesis and signal transduction, flavonoid biosynthesis, terpenoid backbone biosynthesis, lipid metabolism, and RNA degradation. A total of 22721 SSR loci were found in the 81220 unigenes; among the repeat motif types, A/T was the most frequent, followed by CCG/CGG and AGC/CTG. This study is the first comprehensive transcriptome analysis of Paspalum vaginatum, providing a valuable source of genomic data for molecular biology research on turfgrasses.