TY - JOUR
T1 - A haplotype inference method based on sparsely connected multi-body Ising model
AU - Kato, Masashi
AU - Gao, Qian Ji
AU - Chigira, Hiroshi
AU - Shindo, Hiroyuki
AU - Inoue, Masato
PY - 2010
Y1 - 2010
AB - Statistical haplotype inference is an indispensable technique in the field of medical science. The method usually has two steps: inference of haplotype frequencies and inference of the diplotype of each subject. The first step can be done by using the expectation-maximization (EM) algorithm, but it incurs an unreasonably large calculation cost when the number of single-nucleotide polymorphism (SNP) loci of concern is large. In this article, we describe an approximate probabilistic model of haplotype frequencies. The model is constructed from several distributions of nearby local SNPs. This approximation seems good because SNPs are generally more strongly correlated when they are close to one another on a chromosome. To implement this approach, we use a log-linear model, the Walsh-Hadamard transform, and a combinatorial optimization method. Results on artificial data suggest that the overall haplotype inference of our method is good if there are nine or more consecutive local SNPs. Some minor problems should be dealt with before this method can be applied to real data.
UR - http://www.scopus.com/inward/record.url?scp=78651104716&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78651104716&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/233/1/012022
DO - 10.1088/1742-6596/233/1/012022
M3 - Article
AN - SCOPUS:78651104716
SN - 1742-6588
VL - 233
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
M1 - 012022
ER -