CAJ | Academic paper

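In China, identifying the content of popular science websites has long depended mainly on expert judgment. Such subjective judgment is laborious and time-consuming, and its results are not good. The main reason is that website content is relatively rich and manual browsing is inefficient: only a limited amount of content can be handled in a given time, so the judgment of a whole website is incomplete and also subjective. Solving this problem requires an artificial-intelligence-based method capable of fast quantitative computation. The popular science website feature vector proposed in this paper is a vector space model that a computer abstracts from website content through processing. It represents the textual content and meaning of a website fairly well, and can ultimately enable a machine to judge automatically whether a website's content contains popular science elements and what kind of popular science content it is.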
In China, deciding whether a website is a popular science website has long relied mainly on expert judgment. This subjective judgment is not only time-consuming but also unreliable. Because website content is rich, browsing and judging by experts is inefficient: within a given amount of time and energy, they can process only a very limited part of any website. In addition, different people may reach different judgments. A quantitative method based on machine intelligence is therefore needed. This paper discusses feature word vectors for Chinese popular science websites, which are abstracted by computer from the real content of a website on the basis of the vector space model. We believe such a vector can represent the textual content and meaning of a site reasonably well. Based on this method, a system could be built to determine automatically whether a website's content contains popular science material and, if so, of what kind.
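As a rough illustration of the vector space idea described in the abstract, the following minimal sketch maps page text to a feature word vector and compares it with a reference vector by cosine similarity. It is not the paper's method: the vocabulary `FEATURE_WORDS`, the raw-count weighting, the cosine measure, the `threshold` value, and all function names are assumptions made here for illustration only.

```python
import math

# Hypothetical feature words; the abstract does not list the actual vocabulary.
FEATURE_WORDS = ["科学", "技术", "实验", "健康", "天文"]


def feature_vector(text, vocabulary=FEATURE_WORDS):
    """Return raw counts of each feature word in the text (assumed weighting)."""
    # str.count counts non-overlapping occurrences of the substring.
    return [text.count(word) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def looks_like_popular_science(text, reference_vector, threshold=0.5):
    """Flag text whose feature vector is close to a reference vector.

    The threshold is an arbitrary placeholder, not a value from the paper.
    """
    return cosine_similarity(feature_vector(text), reference_vector) >= threshold
```

In such a setup, the reference vector would presumably be built from sites that experts have already labelled as popular science, so that expert judgment is needed once for calibration rather than for every site.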