TY - GEN
T1 - Adaptive focused crawler based on tunneling and link analysis
AU - Zhang, Xiaoming
AU - Li, Zhoujun
AU - Hu, Chaojian
PY - 2009
Y1 - 2009
N2 - At present, focused crawlers have become a way to seek needed information. The main characteristic of a focused web crawler is that it selects and retrieves only relevant web pages in each crawling process. In this paper, we propose a learnable algorithm that combines link analysis with web content in order to retrieve specific web documents and that can predict the next URL through learning. The algorithm also uses adaptive tunneling to overcome some of the limitations of normal focused crawlers. We apply three metrics to compare its efficiency with other well-known web crawling techniques.
AB - At present, focused crawlers have become a way to seek needed information. The main characteristic of a focused web crawler is that it selects and retrieves only relevant web pages in each crawling process. In this paper, we propose a learnable algorithm that combines link analysis with web content in order to retrieve specific web documents and that can predict the next URL through learning. The algorithm also uses adaptive tunneling to overcome some of the limitations of normal focused crawlers. We apply three metrics to compare its efficiency with other well-known web crawling techniques.
UR - https://www.scopus.com/pages/publications/67649888336
M3 - Conference contribution
AN - SCOPUS:67649888336
SN - 9788955191387
T3 - International Conference on Advanced Communication Technology, ICACT
SP - 2225
EP - 2230
BT - 11th International Conference on Advanced Communication Technology, ICACT 2009 - Proceedings
T2 - 11th International Conference on Advanced Communication Technology, ICACT 2009
Y2 - 15 February 2009 through 18 February 2009
ER -