Genetic network programming with parallel processing for association rule mining in large and dense databases

Eloy Gonzales*, Kaoru Shimada, Shingo Mabu, Kotaro Hirasawa, Jinglu Hu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Several methods of extracting association rules have been reported. A new evolutionary computation method named Genetic Network Programming (GNP) has also been developed recently, and its effectiveness has been shown for small datasets. However, it has not been tested on large datasets, particularly datasets with a large number of attributes. The aim of this paper is to extract association rules from large and dense datasets using GNP, considering a real-world database with a huge number of attributes. We propose a new method in which a large database is divided into many small datasets, and each GNP handles one dataset with an appropriate number of attributes, which are selected randomly from the large dataset and generated genetically. These GNPs are processed in parallel. We then propose some new genetic operations to improve both the number of rules extracted and their quality. In simulations, the proposed method shows remarkable improvement. Fig. 1 shows the architecture of the proposed method. We use the CLIENT/SERVER model. The CLIENT side carries out preprocessing of the large database, assignment of files to each server, rule checking, and genetic operations on files. The SERVER side processes each file independently using the conventional GNP-based mining method. The features and advantages of the proposed method are the following: Rule extraction is done in parallel. Each file generates its own local pool of rules. Files (datasets) are treated as individuals so that new genetic operations can be applied to them to improve rule extraction. Extracted rules are stored in a global pool. The rules are checked to avoid redundancy, so that only new rules are stored.
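The client/server flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`split_attributes`, `mine_file`, `mine_in_parallel`), the thresholds, and the row format (dictionaries of 0/1 attribute values) are all hypothetical, and the per-file miner is a brute-force stand-in rather than GNP. The genetic operations on files are omitted.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def split_attributes(attributes, subset_size, n_files, seed=0):
    """Client side (assumed): draw one random attribute subset per file."""
    rng = random.Random(seed)
    return [sorted(rng.sample(attributes, subset_size)) for _ in range(n_files)]


def mine_file(rows, attrs, min_support=0.5, min_confidence=0.8):
    """Server side stand-in: brute-force rules of the form x -> y.

    The paper runs a GNP-based miner here; this placeholder is NOT GNP.
    Returns the file's local pool as a set of (antecedent, consequent) pairs.
    """
    n = len(rows)
    local_pool = set()
    for a, b in combinations(attrs, 2):
        for x, y in ((a, b), (b, a)):
            count_x = sum(1 for r in rows if r[x])
            count_xy = sum(1 for r in rows if r[x] and r[y])
            if (count_x
                    and count_xy / n >= min_support
                    and count_xy / count_x >= min_confidence):
                local_pool.add((x, y))
    return local_pool


def mine_in_parallel(rows, attributes, subset_size=3, n_files=4, seed=0):
    """Mine every file in parallel and merge local pools into a global pool."""
    files = split_attributes(attributes, subset_size, n_files, seed)
    global_pool = set()
    with ThreadPoolExecutor(max_workers=n_files) as pool:
        for local_pool in pool.map(lambda attrs: mine_file(rows, attrs), files):
            # Set union keeps only rules not already in the global pool.
            global_pool |= local_pool
    return global_pool
```

Using a set for the global pool is one simple way to ensure that only new rules are stored; the redundancy check in the paper may be more elaborate.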

Original language: English
Title of host publication: Proceedings of GECCO 2007
Subtitle of host publication: Genetic and Evolutionary Computation Conference
Pages: 1512
Number of pages: 1
Publication status: Published - 2007
Event: 9th Annual Genetic and Evolutionary Computation Conference, GECCO 2007 - London, United Kingdom
Duration: 2007 Jul 7 to 2007 Jul 11

Publication series

Name: Proceedings of GECCO 2007: Genetic and Evolutionary Computation Conference

Conference

Conference: 9th Annual Genetic and Evolutionary Computation Conference, GECCO 2007
Country/Territory: United Kingdom
City: London
Period: 07/7/7 to 07/7/11

Keywords

  • Association rules
  • Genetic network programming
  • Parallel processing

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Theoretical Computer Science
