世界复合医学
World Journal of Complex Medicine
2015
Issue 1
pp. 63-67
5 pages
马立伟%曾强%吕秋平%范成烨%程鹏
big data%early cancer prediction%complete blood count (CBC)%blood chemistry%urinalysis
The Chinese Anti-Cancer Association reports that about 90% of early-stage cancers have no obvious symptoms, so about 80% of cancer patients are already at an intermediate or advanced stage when diagnosed. More than one million lives could be saved if cancer were detected early. The purpose of this research is to build a system that predicts cancer risk early with the help of big data technology. A total of 486,394 people were included in the study: 40,217 cancer patients and 446,177 healthy people undergoing routine physical examinations. The data analyzed comprised demographic, CBC (complete blood count), CMP (comprehensive metabolic panel), lipid, and urinalysis data, 48 data points in total. Both logistic analysis and discriminant analysis were used to identify the significant factors and to build seven cancer risk prediction models; the significance level was set at p<0.05. SAS was the primary statistical analysis tool, and all data were extracted from an MS SQL database. The results showed that CBC, CMP, lipid, and urinalysis data can significantly distinguish healthy people from cancer patients and can be used to build cancer risk prediction models; the average accuracy of the models was 95.5%. The seven models were then verified on data from 120,008 people collected from January to July 2014, including 9,931 cancer patients and 110,077 healthy people; the verification accuracy was over 95%. This research shows that routine blood and urine test results can be used to predict cancer risk at an early stage.
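The cohort figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' SAS analysis: the names `cohort_summary` and `accuracy` are hypothetical, and the abstract does not define "accuracy", so the overall fraction of correctly classified people is assumed here.

```python
# Cohort sizes exactly as reported in the abstract.
DEVELOPMENT = {"cancer": 40_217, "normal": 446_177, "reported_total": 486_394}
VALIDATION = {"cancer": 9_931, "normal": 110_077, "reported_total": 120_008}


def cohort_summary(cohort):
    """Check the reported total and compute the share of cancer patients."""
    total = cohort["cancer"] + cohort["normal"]
    return {
        "total": total,
        "matches_reported": total == cohort["reported_total"],
        "cancer_fraction": cohort["cancer"] / total,
    }


def accuracy(tp, tn, fp, fn):
    """Assumed definition: correctly classified people / all classified people.

    tp, tn, fp, fn are confusion-matrix counts; the abstract does not
    report them, so no values from the study are filled in here.
    """
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    for name, cohort in (("development", DEVELOPMENT), ("validation", VALIDATION)):
        summary = cohort_summary(cohort)
        print(name, summary["total"], summary["matches_reported"],
              f"{summary['cancer_fraction']:.2%}")
```

Both reported totals agree with the sum of their subgroups, and cancer patients make up a little over 8% of each cohort. Under the assumed definition, an overall accuracy figure depends on that class mix, so per-class rates would be needed to see how well the cancer patients themselves are identified.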