江西理工大学学报
JOURNAL OF JIANGXI UNIVERSITY OF SCIENCE AND TECHNOLOGY
2013
Issue 5
pp. 82-87
6 pages in total
text image%C4.5 decision tree classifier%gray-level histogram%image texture
Targeting the image features specific to text images, this paper proposes a text image classification method based on a combination of low-level image features. Using a two-layer C4.5 decision tree classifier, the method divides text images into caption text images, document images, and scene text images. Classification proceeds in two steps. First, each sample image is converted to a grayscale image and features are extracted from its gray-level histogram; differences in these features are used to separate out the document images. Second, the remaining images are converted to binary images and their GLCM (gray-level co-occurrence matrix) texture features are extracted; these features are used to distinguish scene text images from caption text images. Simulation experiments carried out in the open-source WEKA data mining software show that the method is feasible and achieves high recall and precision.
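
The two-step pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption. The abstract does not say which histogram statistics or GLCM measures were used, so mean, variance, and peak bin share, and contrast, energy, and homogeneity are illustrative choices. The binarization threshold of 128 and the one-pixel horizontal offset are also assumed. The function names and the `is_document` and `is_scene` predicates are hypothetical; the predicates stand in for the two trained C4.5 trees, which the paper builds in WEKA.

```python
def gray_histogram_features(gray, levels=256):
    """Summary statistics of the gray-level histogram of a 2D list of ints.

    The choice of statistics is an assumption, not taken from the paper.
    """
    pixels = [v for row in gray for v in row]
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    probs = [h / len(pixels) for h in hist]
    mean = sum(i * p for i, p in enumerate(probs))
    variance = sum((i - mean) ** 2 * p for i, p in enumerate(probs))
    return {"mean": mean, "variance": variance, "peak": max(probs)}


def binarize(gray, threshold=128):
    """Fixed-threshold binarization (the threshold value is an assumption)."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]


def glcm(image, levels=2, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


def classify(gray, is_document, is_scene):
    """Two-stage cascade mirroring the two classifier layers.

    is_document and is_scene are hypothetical predicates over feature
    dictionaries; in the paper each stage is a trained C4.5 decision tree.
    """
    if is_document(gray_histogram_features(gray)):
        return "document"
    texture = glcm_features(glcm(binarize(gray)))
    return "scene" if is_scene(texture) else "caption"
```

A call such as `classify(gray, is_document, is_scene)` takes a grayscale image as a list of rows of integers in 0 to 255, plus the two decision functions. Running the histogram stage first means the GLCM features are computed only for images that were not classified as documents.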