东南大学学报(自然科学版)
JOURNAL OF SOUTHEAST UNIVERSITY
2014
Issue 5
pp. 924-928
5 pages
deletion; next-generation sequencing; feature extraction; AdaBoost
To address the relatively high false discovery rate of split-read approaches to genomic deletion detection, an integrated strategy based on detection theories and AdaBoost is proposed. First, paired-end reads undergo an initial mapping followed by a split-read alignment, which yields a set of deletion candidates at 1 bp resolution; this set is made to contain as many candidates as possible. Then, following the basic principles of three classes of detection methods (read-pair mapping analysis, split-read alignment, and read-depth analysis), deletion-related sequence features are extracted from the two alignment results. Finally, an AdaBoost neural network ensemble model with high generalization performance is used as the discriminant model to screen false positives out of the candidate set, giving the final call set. Experimental results show that, compared with the traditional split-read approach, the proposed strategy reduces the false discovery rate more effectively with almost no loss of detection sensitivity.
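The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses neural networks as the AdaBoost weak learners, whereas this sketch swaps in decision stumps so that it runs with the Python standard library alone. The three features (discordant read pairs, split reads, depth ratio) are hypothetical stand-ins for the read-pair, split-read, and read-depth feature classes, and the eight candidates are invented toy values, not data from the paper.

```python
import math

# ASSUMPTION: each candidate deletion is described by three hypothetical features:
# (discordant_read_pairs, split_reads, depth_ratio), where depth_ratio is the
# read depth inside the candidate divided by the depth of its flanking regions.
# Labels: +1 = true deletion, -1 = false positive. All values are toy data.
X = [(8, 6, 0.10), (5, 4, 0.45), (7, 3, 0.20), (6, 5, 0.50),
     (1, 3, 0.95), (0, 2, 1.05), (2, 4, 0.90), (1, 5, 1.00)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                pred = [sign if x[j] >= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, x):
    _, j, thr, sign = stump
    return sign if x[j] >= thr else -sign

def adaboost_fit(X, y, rounds=5):
    """Discrete AdaBoost with decision stumps (stand-in for neural nets)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified candidates, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in model)
    return 1 if score >= 0 else -1

def fdr_and_sensitivity(pred, truth):
    """False discovery rate FP/(TP+FP) and sensitivity TP/(TP+FN)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == -1)
    fn = sum(1 for p, t in zip(pred, truth) if p == -1 and t == 1)
    fdr = fp / (tp + fp) if tp + fp else 0.0
    sens = tp / (tp + fn) if tp + fn else 0.0
    return fdr, sens

model = adaboost_fit(X, y)
baseline = [1] * len(X)  # accept every split-read candidate unfiltered
filtered = [adaboost_predict(model, x) for x in X]
print("baseline (FDR, sensitivity):", fdr_and_sensitivity(baseline, y))
print("filtered (FDR, sensitivity):", fdr_and_sensitivity(filtered, y))
```

Because the toy set is separable and is scored on its own training data, the filtered result is trivially clean; the sketch only shows how accepting every split-read candidate inflates the false discovery rate and how a boosted classifier over the extracted features can remove false positives without discarding true deletions.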