CAJ | Academic paper

The Chinese Anti-Cancer Association reports that about 90% of early-stage cancers have no obvious symptoms, so that 80% of cancer patients are already at an intermediate or advanced stage when diagnosed. More than one million lives could be saved if cancer risk could be predicted early. The purpose of this research is to build a system that predicts cancer risk early with the help of big data technology. A total of 486,394 people were included in the study: 40,217 cancer patients and 446,177 healthy people undergoing routine physical examinations. The data analyzed comprised demographic, CBC (complete blood count), CMP (comprehensive metabolic panel), lipid, and urinalysis data, 48 data points in total. Logistic regression analysis and discriminant analysis were used to identify the significant factors and to build seven cancer risk prediction models, with the significance level set at p<0.05. SAS was the primary statistical analysis tool, and all data were extracted from an MS SQL database. The results showed that CBC, CMP, lipid, and urinalysis data can significantly distinguish healthy people from cancer patients and can be used to build cancer risk prediction models; the average accuracy of the prediction models was 95.5%. The seven prediction models were then verified on data from a further 120,008 people collected from January 2014 to July 2014, including 9,931 cancer patients and 110,077 healthy people. The accuracy in this verification was over 95%. This research shows that routine blood and urine test results can be used to predict cancer risk at an early stage.
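The modeling workflow summarized above (fit a logistic model on one cohort, then check its accuracy on a separate validation cohort) can be illustrated with a minimal sketch. This is not the authors' SAS analysis. It is pure Python, uses plain gradient descent, and runs on synthetic data with two hypothetical standardized "lab values" instead of the 48 data points in the study. Only the share of cancer cases (40,217 of 486,394) is taken from the abstract; every function name, the effect size, and the sample sizes are assumptions made for illustration.

```python
import math
import random

# Share of cancer patients in the study cohort, taken from the abstract.
CASE_FRACTION = 40217 / 486394


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_risk(w, x):
    """Predicted probability of being a case; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=400):
    """Fit logistic-regression weights by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_risk(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def accuracy(w, X, y, threshold=0.5):
    """Fraction of people whose thresholded risk matches their true label."""
    hits = sum((predict_risk(w, xi) >= threshold) == bool(yi)
               for xi, yi in zip(X, y))
    return hits / len(y)


def make_synthetic(n, rng):
    """Hypothetical cohort: two standardized features, shifted by +2 for cases."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < CASE_FRACTION else 0
        shift = 2.0 if label else 0.0  # assumed effect size, not from the study
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(1000, rng)   # model-building cohort
X_valid, y_valid = make_synthetic(1000, rng)   # separate validation cohort
weights = fit_logistic(X_train, y_train)
valid_accuracy = accuracy(weights, X_valid, y_valid)
print(f"validation accuracy: {valid_accuracy:.3f}")
```

Because only about 8% of the cohort are cancer patients, overall accuracy in a sketch like this is dominated by the healthy group, so per-group rates (for example, the fraction of cases correctly flagged) are worth reporting alongside it.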