Abstract
In this paper, we consider a Bayesian approach to representing a set of documents. Many models have been proposed for this task, including latent semantic analysis (LSA), probabilistic latent semantic analysis (PLSA), the Semantic Aggregate Model (SAM), and Bayesian Latent Semantic Analysis (BLSA). We formulate the Bayes optimal solutions for estimating the parameters and selecting the dimension of the hidden latent class in these models, and analyze their asymptotic properties.
Original language | English |
---|---|
Pages (from-to) | 4637-4642 |
Number of pages | 6 |
Journal | Proceedings of the IEEE International Conference on Systems, Man and Cybernetics |
Volume | 5 |
Publication status | Published - 2003 |
Externally published | Yes |
Event | System Security and Assurance - Washington, DC, United States; Duration: 2003 Oct 5 → 2003 Oct 8 |
Keywords
- Automated document indexing
- Bayesian statistics
- Information retrieval
- Probabilistic latent semantic indexing
ASJC Scopus subject areas
- Control and Systems Engineering
- Hardware and Architecture