CAJ | Academic paper

Wanfang Data (万方数据)

Journal of Harbin University of Commerce (Natural Sciences Edition) (哈尔滨商业大学学报（自然科学版）)
2015, Issue 5, pp. 573-577 (5 pages)

Pan Qinghe; Xu Yaoqun; Zhao Xingchi (潘庆和; 徐耀群; 赵星驰)

website topology structure; depth-first traversal; Python; crawler

This paper proposes a strategy for obtaining the topology of a website by depth-first traversal. The strategy is implemented in Python and can be extended into a crawler for data collection. It can be used to probe the topology of a target website and understand its internal organization, providing a basis for further research.
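
The abstract does not reproduce the authors' program, so the following is only a minimal sketch of how a depth-first traversal might record a site's link topology in Python. The names `LinkParser`, `crawl_topology`, `fetch`, `max_depth`, and the in-memory `SITE` pages are all hypothetical and are not taken from the paper. Page retrieval is passed in as a function so that the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_topology(start_url, fetch, max_depth=3):
    """Return {page_url: [same-host pages it links to]} by depth-first traversal.

    fetch is an assumed callable that takes a URL and returns HTML text.
    """
    host = urlparse(start_url).netloc
    topology = {}

    def visit(url, depth):
        # Stop at pages already expanded or beyond the depth limit.
        if url in topology or depth > max_depth:
            return
        topology[url] = []
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            target, _ = urldefrag(urljoin(url, href))
            if urlparse(target).netloc != host:
                continue  # external link: not part of this site's topology
            if target not in topology[url]:
                topology[url].append(target)
            # Descend into the child before handling the next sibling link.
            visit(target, depth + 1)

    visit(start_url, 0)
    return topology


# Hypothetical three-page site held in memory in place of real HTTP requests.
SITE = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a> <a href="http://other.example/">ext</a>',
    "http://example.com/b": '<a href="/">home</a>',
}

topology = crawl_topology("http://example.com/", lambda url: SITE.get(url, ""))
for page, children in topology.items():
    print(page, "->", children)
```

To turn a sketch like this into a data-collection crawler, as the abstract suggests is possible, `fetch` could be replaced with a real HTTP client and a content-extraction step added inside `visit`. The depth limit keeps the recursion bounded on large sites.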