Time-decaying Bloom Filters for data streams with skewed distributions

Kai Cheng*, Limin Xiang, Haiyan Xu, Mizuho Iwaihara, Mukesh M. Mohania

*Corresponding author for this work

Research output: Paper, peer-reviewed

43 Citations (Scopus)

Abstract

Bloom Filters are space-efficient data structures for membership queries over sets. To enable queries for multiplicities of multi-sets, the bitmap in a Bloom Filter is replaced by an array of counters whose values increment on each occurrence. In a data stream model, however, data items arrive at varying rates and recent occurrences are often regarded as more significant than past ones. In most data stream applications, it is critical to handle this "time-sensitivity". Furthermore, data streams with skewed distributions are common in many emerging applications, e.g., traffic engineering and billing, intrusion detection, trading surveillance and outlier detection. For such applications, it is inefficient to allocate counters of uniform size to all buckets. In this paper, we present the Time-decaying Bloom Filter (TBF), a Bloom Filter that maintains a frequency count for each item in a data stream and whose counter values decay with time. For data streams with highly skewed distributions, we propose a further optimization that dynamically allocates free counters to the "large" items. We performed preliminary experiments to verify the optimization.
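
The core idea in the abstract, an array of counters that increment on each occurrence and decay with time, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decay function, so exponential decay with a configurable half-life is assumed here, and the class name, the parameters (num_counters, num_hashes, half_life), the SHA-256 double-hashing scheme and the lazy per-counter timestamps are all hypothetical choices made for illustration. The dynamic allocation of free counters to "large" items is not covered.

```python
import hashlib
import math


class TimeDecayingBloomFilter:
    """Counting Bloom filter whose counters decay over time.

    Illustrative only: exponential decay is an assumption, not the
    decay function defined in the paper.
    """

    def __init__(self, num_counters=1024, num_hashes=4, half_life=60.0):
        self.m = num_counters
        self.k = num_hashes
        # Assumed decay model: a count halves every `half_life` time units.
        self.decay_rate = math.log(2) / half_life
        self.counters = [0.0] * num_counters
        # Time at which each counter was last brought up to date.
        self.last_update = [0.0] * num_counters

    def _indexes(self, item):
        # Derive k counter positions from one digest (double hashing).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def _decayed(self, idx, now):
        # Apply decay lazily, only when a counter is read or written.
        elapsed = max(0.0, now - self.last_update[idx])
        return self.counters[idx] * math.exp(-self.decay_rate * elapsed)

    def add(self, item, now):
        # set() avoids incrementing one counter twice for a single item.
        for idx in set(self._indexes(item)):
            self.counters[idx] = self._decayed(idx, now) + 1.0
            self.last_update[idx] = now

    def estimate(self, item, now):
        # The minimum over the item's counters is the least-inflated value.
        return min(self._decayed(idx, now) for idx in self._indexes(item))
```

With half_life=10.0, three occurrences of an item recorded at time 0 read back as 3.0 at time 0 and as 1.5 at time 10. Because colliding items can only add to a counter, the estimate under this assumed model never falls below the item's own decayed count; it can only overestimate it.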

Original language: English
Pages: 63-69
Number of pages: 7
Publication status: Published - 2005
Externally published: Yes
Event: 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, RIDE-SDMA 2005 - Tokyo, Japan
Duration: 2005 Apr 3 → 2005 Apr 4

Conference

Conference: 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, RIDE-SDMA 2005
Country/Territory: Japan
City: Tokyo
Period: 05/4/3 → 05/4/4

ASJC Scopus subject areas

  • Software
  • Engineering (miscellaneous)
  • Hardware and Architecture
