TY - JOUR

T1 - Configuration model for correlation matrices preserving the node strength

AU - Masuda, Naoki

AU - Kojaku, Sadamori

AU - Sano, Yukie

N1 - Funding Information:
We thank Diego Garlaschelli for valuable discussions. We thank Koh Murayama for providing the academic motivation data used in the present paper. We thank Takahiro Ezaki for calculating the correlation matrices for the stock market data. The fMRI data were provided in part by the Human Connectome Project, Washington University–Minn Consortium (Principal Investigators, David Van Essen and Kamil Ugurbil; Project No. 1U54MH091657), funded by the 16 National Institutes of Health (NIH) Institutes and Centers that support the NIH Blueprint for Neuroscience Research, and by the McDonnell Center for Systems Neuroscience at Washington University. N.M. acknowledges the support provided through Japan Science and Technology Agency (JST) CREST Grant No. JPMJCR1304 and JST ERATO Grant No. JPMJER1201, Japan.
Publisher Copyright:
© 2018 American Physical Society.

PY - 2018/7/20

Y1 - 2018/7/20

AB - Correlation matrices are a major type of multivariate data. To examine properties of a given correlation matrix, a common practice is to compare the same quantity between the original correlation matrix and reference correlation matrices, such as those derived from random matrix theory, that partially preserve properties of the original matrix. We propose a model to generate such reference correlation and covariance matrices for the given matrix. Correlation matrices are often analyzed as networks, in which nodes are heterogeneous in their total connectivity to other nodes. Given this background, the present algorithm generates random networks that preserve the expectation of the total connectivity of each node to other nodes, akin to configuration models for conventional networks. Our algorithm is derived from the maximum entropy principle. We apply the proposed algorithm to the measurement of clustering coefficients and to community detection, both of which require a null model to assess the statistical significance of the obtained results.

UR - http://www.scopus.com/inward/record.url?scp=85050552987&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050552987&partnerID=8YFLogxK

U2 - 10.1103/PhysRevE.98.012312

DO - 10.1103/PhysRevE.98.012312

M3 - Article

C2 - 30110768

AN - SCOPUS:85050552987

SN - 1063-651X

VL - 98

JO - Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics

JF - Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics

IS - 1

M1 - 012312

ER -