Text Mining using PrefixSpan constrained by Item Interval and Item Attribute

Issei Sato, Yu Hirate, Hayato Yamana

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Applying conventional sequential pattern mining methods to text data extracts many uninteresting patterns, which increases the time needed to interpret the extracted patterns. To solve this problem, we propose a new sequential pattern mining algorithm that adopts two constraints. The first selects sequences with regard to item intervals (the number of items between any two adjacent items in a sequence), and the second selects sequences with regard to item attributes. Using Amazon customer reviews in the book category, we have confirmed that our method extracts patterns faster than the conventional method and is better able to exclude uninteresting patterns while retaining the patterns of interest.
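The two constraints named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified prefix-growth miner written under assumptions of our own. Items are assumed to be (word, tag) pairs, the item-interval constraint is assumed to be a maximum gap `max_gap` between adjacent pattern items, and the item-attribute constraint is assumed to be a predicate `allowed` that decides which items may appear in a pattern. The names `mine_constrained`, `max_gap`, and `allowed`, as well as the toy review data, are hypothetical.

```python
from collections import defaultdict


def mine_constrained(sequences, min_support, max_gap, allowed=lambda item: True):
    """Prefix-growth mining with a gap limit and an item filter (illustrative sketch).

    sequences   : list of lists of hashable items, e.g. (word, tag) tuples
    min_support : minimum number of sequences that must contain a pattern
    max_gap     : maximum number of items allowed between adjacent pattern items
    allowed     : assumed attribute constraint; only items passing it join patterns
    Returns a dict mapping pattern tuples to their support.
    """
    results = {}

    def extend(pattern, projection):
        # projection maps sequence id -> set of positions where an
        # occurrence of `pattern` ends. All end positions are kept because,
        # under a gap limit, the first occurrence alone is not sufficient.
        candidates = defaultdict(lambda: defaultdict(set))
        for sid, ends in projection.items():
            seq = sequences[sid]
            for end in ends:
                # The next item may sit at most max_gap items after `end`.
                stop = min(len(seq), end + max_gap + 2)
                for pos in range(end + 1, stop):
                    item = seq[pos]
                    if allowed(item):
                        candidates[item][sid].add(pos)
        for item, proj in candidates.items():
            if len(proj) >= min_support:
                grown = pattern + (item,)
                results[grown] = len(proj)
                extend(grown, proj)

    # Frequent single items seed the recursion.
    initial = defaultdict(lambda: defaultdict(set))
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            if allowed(item):
                initial[item][sid].add(pos)
    for item, proj in initial.items():
        if len(proj) >= min_support:
            results[(item,)] = len(proj)
            extend((item,), proj)
    return results


# Hypothetical toy data: tokenised review sentences with part-of-speech tags.
BOOK = ("book", "NOUN")
GREAT = ("great", "ADJ")
reviews = [
    [("this", "DET"), BOOK, ("is", "VERB"), GREAT],
    [GREAT, BOOK],
    [BOOK, ("was", "VERB"), ("not", "ADV"), ("really", "ADV"), GREAT],
]


def content_only(item):
    # Assumed attribute constraint: keep only nouns and adjectives.
    return item[1] in {"NOUN", "ADJ"}


tight = mine_constrained(reviews, min_support=2, max_gap=1, allowed=content_only)
loose = mine_constrained(reviews, min_support=2, max_gap=3, allowed=content_only)
```

On this toy input, `tight` contains only the two single-item patterns, because "book" and "great" are separated by three items in the third review. Raising `max_gap` to 3 lets the pair (book, great) reach a support of 2 in `loose`. Filtered-out items still occupy positions and therefore still count toward the gap; that is a design choice of this sketch, not something the abstract specifies.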

Original language: English
Title of host publication: ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops
Editors: Roger S. Barga, Xiaofang Zhou
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 35-38
Number of pages: 4
ISBN (Electronic): 0769525717, 9780769525716
DOIs
Publication status: Published - 2006
Externally published: Yes
Event: 22nd International Conference on Data Engineering Workshops, ICDEW 2006 - Atlanta, United States
Duration: 2006 Apr 3 to 2006 Apr 7

Publication series

Name: ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops

Other

Other: 22nd International Conference on Data Engineering Workshops, ICDEW 2006
Country/Territory: United States
City: Atlanta
Period: 06/4/3 to 06/4/7

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
