微型机与应用
Microcomputer & its Applications
2015
Issue 19
pp. 86-89, 92
5 pages in total
Hadoop; spam; Bayes; MapReduce
Although traditional Bayesian spam filtering systems achieve high classification accuracy, they process mail inefficiently and consume substantial resources. This paper studies the Bayesian spam filtering algorithm under Hadoop MapReduce and optimizes the threshold used to determine the category. Experiments show that the proposed algorithm reduces the misclassification rate of legitimate emails, improves the accuracy and F value of spam detection, and improves the efficiency of spam filtering.
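The abstract does not give the paper's implementation, so the following is only a minimal single-process sketch of the general idea: token counting expressed as a map step and a reduce step, followed by a naive Bayes decision against a tunable threshold. The function names, the token-presence model, the add-one smoothing, and the default threshold of 0.9 are all assumptions made for illustration. They are not taken from the paper, and no Hadoop API is used.

```python
import math
from collections import Counter
from itertools import chain


def map_tokens(label, text):
    """Map step: emit ((label, token), 1) for each distinct token in one message."""
    return [((label, tok), 1) for tok in set(text.lower().split())]


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each (label, token) key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts


def train(messages):
    """Build the model from (label, text) pairs; label is 'spam' or 'ham'.

    Assumes at least one message of each label is present.
    """
    messages = list(messages)
    emitted = chain.from_iterable(map_tokens(lbl, txt) for lbl, txt in messages)
    token_counts = reduce_counts(emitted)
    doc_counts = Counter(lbl for lbl, _ in messages)
    return token_counts, doc_counts


def spam_probability(text, token_counts, doc_counts):
    """Posterior P(spam | text) using only the tokens present in the message."""
    log_odds = math.log(doc_counts["spam"] / doc_counts["ham"])
    for tok in set(text.lower().split()):
        # Add-one smoothing (an assumption) keeps unseen tokens from zeroing a class.
        p_spam = (token_counts[("spam", tok)] + 1) / (doc_counts["spam"] + 2)
        p_ham = (token_counts[("ham", tok)] + 1) / (doc_counts["ham"] + 2)
        log_odds += math.log(p_spam / p_ham)
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1 + e)


def classify(text, model, threshold=0.9):
    """Label a message as spam only if its posterior reaches the threshold."""
    token_counts, doc_counts = model
    prob = spam_probability(text, token_counts, doc_counts)
    return "spam" if prob >= threshold else "ham"
```

Raising `threshold` in this sketch makes the filter more reluctant to label a message as spam, which is the usual way to trade spam recall for fewer misclassified legitimate emails. How the paper actually selects its threshold is not stated in the abstract.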