TY - JOUR
T1 - Bridging Between Deviation Indices for Non-Tree-Based Phylogenetic Networks
AU - Suzuki, Takatora
AU - Guo, Han
AU - Hayamizu, Momoko
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Phylogenetic networks are a useful model that can represent reticulate evolution and complex biological data. In recent years, mathematical and computational aspects of tree-based networks have been well studied. However, not all phylogenetic networks are tree-based, so it is meaningful to consider how close a given network is to being tree-based; Francis-Steel-Semple (2018) proposed several different indices to measure the degree of deviation of a phylogenetic network from being tree-based. One is the minimum number of leaves that need to be added to convert a given network to tree-based, and another is the number of vertices that are not included in the largest subtree covering its leaf-set. Both values are zero if and only if the network is tree-based. Both deviation indices can be computed efficiently, but the relationship between the above two is unknown, as each has been studied using different approaches. In this study, we derive a tight inequality for the values of the two measures and also give a characterisation of phylogenetic networks such that they coincide. This characterisation yields a new efficient algorithm for the Maximum Covering Subtree Problem based on the maximal zig-zag trail decomposition.
AB - Phylogenetic networks are a useful model that can represent reticulate evolution and complex biological data. In recent years, mathematical and computational aspects of tree-based networks have been well studied. However, not all phylogenetic networks are tree-based, so it is meaningful to consider how close a given network is to being tree-based; Francis-Steel-Semple (2018) proposed several different indices to measure the degree of deviation of a phylogenetic network from being tree-based. One is the minimum number of leaves that need to be added to convert a given network to tree-based, and another is the number of vertices that are not included in the largest subtree covering its leaf-set. Both values are zero if and only if the network is tree-based. Both deviation indices can be computed efficiently, but the relationship between the above two is unknown, as each has been studied using different approaches. In this study, we derive a tight inequality for the values of the two measures and also give a characterisation of phylogenetic networks such that they coincide. This characterisation yields a new efficient algorithm for the Maximum Covering Subtree Problem based on the maximal zig-zag trail decomposition.
KW - Phylogenetic networks
KW - deviation from tree-based networks
KW - maximal zig-zag trail decomposition
KW - maximum covering subtree problem
UR - http://www.scopus.com/inward/record.url?scp=85204107774&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85204107774&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2024.3456575
DO - 10.1109/TCBB.2024.3456575
M3 - Article
C2 - 39250364
AN - SCOPUS:85204107774
SN - 1545-5963
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
ER -