科普研究
SCIENCE POPULARIZATION
2013
Issue 5
pp. 43-46, 88
5 pages in total
吴晨生%郭金忠%罗植%廖涛
popular science website%characteristic word frequency%vector space
In China, deciding whether a website is a popular science website has long relied mainly on expert judgment. This subjective judgment is time-consuming, and its results are unreliable. The main reason is that website content is extensive and manual browsing is inefficient: within a limited time, experts can process only a small part of a site, so their assessment of the whole site is incomplete, and different people may reach different judgments. A quantitative method based on machine intelligence that supports fast computation is therefore needed. This paper proposes a feature vector for popular science websites: a vector space model that a computer abstracts from a site's content. It represents the site's textual content and meaning reasonably well, and could ultimately allow a machine to determine automatically whether a website contains popular science content and what kind of popular science content it is.
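The idea of a feature word frequency vector can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature word list, the whitespace tokenizer, and the use of cosine similarity to compare a site with a reference profile are all assumptions made for illustration, since the abstract does not specify them. Chinese text would need a word segmenter in place of `split()`.

```python
import math
from collections import Counter

# Hypothetical feature words; the abstract does not list the real vocabulary.
FEATURE_WORDS = ["science", "experiment", "energy", "planet", "cell"]


def feature_vector(text, feature_words=FEATURE_WORDS):
    """Map a text to the relative frequency of each feature word."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return [0.0] * len(feature_words)
    return [counts[word] / total for word in feature_words]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Compare a site's text with a hypothetical popular science reference profile.
site_vector = feature_vector("the planet has energy and the cell uses energy")
reference_vector = feature_vector("science experiment energy planet cell")
similarity = cosine_similarity(site_vector, reference_vector)
```

A higher `similarity` would suggest that the site's vocabulary is closer to the reference profile. Any decision threshold would have to be chosen empirically and is not given in the abstract.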