CAJ | Academic paper

This paper describes the use of an optimized SPRINT algorithm for massive data processing on the Hadoop platform. It first introduces SPRINT, a traditional data mining algorithm, and then improves and optimizes it by combining it with the MapReduce programming model used in cloud computing. The resulting parallel SPRINT algorithm is ported to the Hadoop platform, and experiments demonstrate distributed data processing.
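The abstract does not say how the algorithm was parallelized, so the following is only a generic illustration and not the authors' optimized method. SPRINT chooses decision-tree splits with the gini index. One plausible way to express that step in map/reduce terms is to have each mapper count (attribute value, class label) pairs in its own data partition, and have a reducer merge the counts and scan the candidate split points. The function names, the record layout, and the toy data are all assumptions made for this sketch, and plain Python stands in for the Hadoop API.

```python
from collections import Counter, defaultdict
from itertools import chain


def map_phase(partition, attribute):
    """Map step (hypothetical): emit ((value, label), 1) for each record in one partition."""
    for record in partition:
        yield (record[attribute], record["label"]), 1


def reduce_phase(pairs):
    """Reduce step (hypothetical): sum the counts for each (value, label) key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts


def gini(class_counts):
    """Gini index of a class histogram: 1 - sum(p_i ** 2)."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in class_counts.values())


def best_numeric_split(counts):
    """Scan the sorted distinct values and return (v, score) for the test
    'attribute <= v' with the lowest weighted gini index."""
    by_value = defaultdict(Counter)
    for (value, label), n in counts.items():
        by_value[value][label] += n

    values = sorted(by_value)
    above = Counter()
    for v in values:
        above.update(by_value[v])
    below = Counter()
    total = sum(above.values())

    best = (None, float("inf"))
    for v in values[:-1]:  # splitting at the largest value leaves one side empty
        below.update(by_value[v])
        above.subtract(by_value[v])
        n_below = sum(below.values())
        n_above = total - n_below
        score = (n_below * gini(below) + n_above * gini(above)) / total
        if score < best[1]:
            best = (v, score)
    return best


# Toy data standing in for two input splits of a larger file (made up for illustration).
partition_1 = [
    {"age": 23, "label": "high"},
    {"age": 17, "label": "high"},
    {"age": 43, "label": "low"},
]
partition_2 = [
    {"age": 68, "label": "low"},
    {"age": 32, "label": "low"},
    {"age": 20, "label": "high"},
]

mapped = chain(map_phase(partition_1, "age"), map_phase(partition_2, "age"))
counts = reduce_phase(mapped)
split = best_numeric_split(counts)
print(split)
```

On this toy data the test `age <= 23` separates the two classes completely, so its weighted gini index is 0. In this arrangement only the per-partition counts would cross the network, not the raw records.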