Abstract
We designed a system that tracks changes in a particular area of a user's interest on the World Wide Web and generates a summary of emerging topics for the user. The system consists of three main components: the Area View System, the Web Spider, and the Summary Generator. The Area View System, acting as a meta-search engine, forwards the user's keywords to a commercial search engine, obtains the hits, analyzes them further, and derives a number of the most relevant domain sites. The Web Spider then scans all of these domains at regular time intervals to collect all modified and newly added HTML pages. Lastly, the Summary Generator extracts all newly added sentences or changes from the collected HTML pages and computes the weights of the terms in those changes using a novel algorithm called TF∗PDF (Term Frequency ∗ Proportional Document Frequency). Terms deemed to explain the emerging topic are weighted heavily. The sentences with the highest average weight are extracted to form a summary of emerging topics. We refer to our system as the Emerging Topic Tracking System (ETTS).
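The abstract does not state the TF∗PDF formula, so the following is only a minimal sketch of the weighting and sentence-extraction step under stated assumptions. It assumes one commonly cited formulation, in which a term's weight is the sum over channels of its normalized frequency in that channel multiplied by the exponential of the fraction of that channel's documents containing the term. Treating each monitored domain as a "channel", the tokenizer, and the function names `tf_pdf_weights` and `top_sentences` are all hypothetical choices made for illustration, not details taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Simplistic lowercase word tokenizer (an assumption; the paper's
    # preprocessing is not described in the abstract).
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_pdf_weights(channels):
    """Compute assumed TF*PDF weights.

    channels: dict mapping a channel name (here, a domain) to a list of
    document strings (here, the changed text collected from that domain).
    """
    weights = Counter()
    for docs in channels.values():
        if not docs:
            continue
        term_freq = Counter()  # occurrences of each term in the channel
        doc_freq = Counter()   # number of documents containing each term
        for doc in docs:
            tokens = tokenize(doc)
            term_freq.update(tokens)
            doc_freq.update(set(tokens))
        # Normalize frequencies by the channel's Euclidean norm.
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        if norm == 0:
            continue
        for term, freq in term_freq.items():
            weights[term] += (freq / norm) * math.exp(doc_freq[term] / len(docs))
    return dict(weights)


def top_sentences(sentences, weights, k=1):
    # Rank sentences by the average weight of their terms, highest first.
    def avg_weight(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)

    return sorted(sentences, key=avg_weight, reverse=True)[:k]
```

Under these assumptions, a term that appears in many documents across several channels accumulates a larger weight than one confined to a single document, and `top_sentences` returns the sentences whose terms carry the highest average weight.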
Original language | English |
---|---|
Title of host publication | Proceedings - 3rd International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, WECWIS 2001 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 2-11 |
Number of pages | 10 |
ISBN (Print) | 0769512240, 9780769512242 |
Publication status | Published - 2001 |
Externally published | Yes |
Event | 3rd International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, WECWIS 2001 - San Juan, United States; Duration: 2001 Jun 21 → 2001 Jun 22 |
Other
Other | 3rd International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, WECWIS 2001 |
---|---|
Country/Territory | United States |
City | San Juan |
Period | 2001 Jun 21 → 2001 Jun 22 |
ASJC Scopus subject areas
- Computer Science(all)