Abstract
Two types of similarity between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exists between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarity. Specifically, we propose a method to measure the degree of synonymy between two words using relational similarity between word pairs as a proxy. Given two words, we first represent the semantic relations that hold between those words using lexical patterns. We use a sequential pattern clustering algorithm to identify different lexical patterns that represent the same semantic relation. Second, we compute the degree of synonymy between the two words using an inter-cluster covariance matrix. We compare the proposed method for measuring the degree of synonymy against previously proposed methods on the Miller-Charles dataset and the WordSimilarity-353 dataset. Our proposed method outperforms all existing Web-based similarity measures, achieving a statistically significant Pearson correlation coefficient of 0.867 on the Miller-Charles dataset.
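The evaluation reported above compares a method's similarity scores with human ratings using the Pearson correlation coefficient. As a minimal sketch of that kind of comparison, the Python snippet below computes the coefficient for two score lists. The function name `pearson` and the numeric values are illustrative assumptions only; they are not taken from the Miller-Charles dataset or from the paper's results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: made-up human ratings and made-up system scores
# for four word pairs (NOT values from any real benchmark).
human_ratings = [3.9, 3.5, 0.4, 1.2]
system_scores = [0.95, 0.80, 0.10, 0.30]

r = pearson(human_ratings, system_scores)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 indicates that the system ranks and scales word pairs much as the human annotators do, which is how a figure such as the one quoted in the abstract is read.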
Original language | English |
---|---|
Pages (from-to) | 2116-2123 |
Number of pages | 8 |
Journal | IEICE Transactions on Information and Systems |
Volume | E95-D |
Issue number | 8 |
DOIs | |
Publication status | Published - 2012 Aug |
Externally published | Yes |
Keywords
- Attributional similarity
- Miller-Charles dataset
- Relational similarity
- Synonymy
- WordSimilarity-353 dataset
ASJC Scopus subject areas
- Electrical and Electronic Engineering
- Software
- Artificial Intelligence
- Hardware and Architecture
- Computer Vision and Pattern Recognition