Phylogenetic trees are essential tools in evolutionary biology that present information on evolutionary events among organisms and molecules. From a dataset of n sequences, a phylogenetic tree of (2n-5)!! possible topologies exists, and determining the optimum topology using brute force is infeasible. Recently, a recursive graph cut on a graph-represented-similarity matrix has proven accurate in reconstructing a phylogenetic tree containing distantly related sequences. However, identifying the optimum graph cut is challenging, and approximate solutions are currently utilized. Here, a phylogenetic tree was reconstructed with an improved graph cut using a quantum-inspired computer, the Fujitsu Digital Annealer (DA), and the algorithm was named the “Normalized-Minimum cut by Digital Annealer (NMcutDA) method”. First, a criterion for the graph cut, the normalized cut value, was compared with existing clustering methods. Based on the cut, we verified that the simulated phylogenetic tree could be reconstructed with the highest accuracy when sequences were diverged. Moreover, for some actual data from the structure-based protein classification database, only NMcutDA could cluster sequences into correct superfamilies. Conclusively, NMcutDA reconstructed better phylogenetic trees than those using other methods by optimizing the graph cut. We anticipate that when the diversity of sequences is sufficiently high, NMcutDA can be utilized with high efficiency.
ASJC Scopus subject areas