Automated web data mining using semantic analysis

Wenxiang Dou*, Jinglu Hu

*Corresponding author for this work

Research output: Conference contribution

2 Citations (Scopus)

Abstract

This paper presents an automated approach to extracting product data from commercial web pages. Our web mining method involves the following two phases. First, it analyzes the data located at the leaf nodes of the web page's DOM tree, generates a semantic information vector for each of the other nodes of the DOM tree, and finds the maximum repeated semantic vector pattern. Second, it identifies the product data region and data records, builds a product object template using a semantic tree matching technique, and uses that template to extract all product data from the web page. The main contribution of this study is a fully automated approach to extracting product data from commercial sites without any user assistance. Experimental results show that the proposed technique is highly effective.
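The first phase described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Node` class, the three semantic categories, the regex-based `classify_leaf` rule, and the choice to count repeated vectors among sibling subtrees are all hypothetical stand-ins for details the abstract does not give.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical semantic categories; the abstract does not list the real ones.
CATEGORIES = ("price", "image", "text")


@dataclass
class Node:
    """A toy DOM node: a tag name, optional text, and child nodes."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


def classify_leaf(node):
    """Assign a leaf to one category (an assumed, deliberately crude rule)."""
    if node.tag == "img":
        return "image"
    if re.search(r"[$€¥]\s*\d", node.text):
        return "price"
    return "text"


def semantic_vector(node):
    """Count how many leaves of each category lie under this node."""
    if not node.children:
        category = classify_leaf(node)
        return tuple(int(c == category) for c in CATEGORIES)
    vectors = [semantic_vector(child) for child in node.children]
    return tuple(sum(column) for column in zip(*vectors))


def most_repeated_pattern(root):
    """Return (parent, vector, count) for the vector that repeats most often
    among the non-leaf children of a single parent. Vectors are recomputed
    on each visit, which is fine for a sketch but quadratic on large trees."""
    best = (None, None, 0)
    stack = [root]
    while stack:
        node = stack.pop()
        stack.extend(node.children)
        inner = [semantic_vector(c) for c in node.children if c.children]
        if inner:
            vector, count = Counter(inner).most_common(1)[0]
            if count > best[2]:
                best = (node, vector, count)
    return best


def product(name, price):
    """Build one toy product record: image, name, price."""
    return Node("li", children=[Node("img"), Node("span", name), Node("span", price)])


page = Node("body", children=[
    Node("div", children=[Node("h1", "Shop")]),
    Node("ul", children=[product("Lamp", "$20"), product("Desk", "$150"),
                         product("Chair", "$45")]),
])

region, pattern, repeats = most_repeated_pattern(page)
print(region.tag, pattern, repeats)  # prints: ul (1, 1, 1) 3
```

In this toy page the `ul` element is picked as the candidate data region because its three children share the same vector. The second phase (template building by semantic tree matching) is not sketched here, since the abstract gives no detail on how the matching is performed.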

Original language: English
Title of host publication: Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings
Pages: 539-551
Number of pages: 13
DOI
Publication status: Published - 2012
Externally published: Yes
Event: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012 - Nanjing, China
Duration: 15 Dec 2012 to 18 Dec 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7713 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012
Country/Territory: China
City: Nanjing
Period: 12/12/15 to 12/12/18

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
