TY - JOUR
T1 - A STRUCTURE THEOREM FOR ROOTED BINARY PHYLOGENETIC NETWORKS AND ITS IMPLICATIONS FOR TREE-BASED NETWORKS
AU - Hayamizu, Momoko
N1 - Funding Information:
\ast Received by the editors November 4, 2019; accepted for publication (in revised form) June 27, 2021; published electronically October 19, 2021. https://doi.org/10.1137/19M1297403 Funding: This work was funded by JST PRESTO grants JPMJPR16EB and JPMJPR1929. \dagger Department of Statistical Modeling, The Institute of Statistical Mathematics, and JST PRESTO, Tokyo, Japan. Current address: Department of Applied Mathematics, Waseda University, and JST PRESTO, Tokyo, Japan (hayamizu@waseda.jp).
Publisher Copyright:
© 2021 SIAM.
PY - 2021
Y1 - 2021
N2 - Attempting to recognize a tree inside a phylogenetic network is a fundamental undertaking in evolutionary analysis. In the last few years, therefore, "tree-based"phylogenetic networks, which are defined by a spanning tree called a "subdivision tree"that is an embedding of a phylogenetic tree on the same leaf-set, have attracted the attention of many theoretical biologists. However, the application of such networks is still not easy, due to many important computational problems whose time complexities are unknown or not clearly understood. In this paper, we provide a general framework for solving those various old or new problems on tree-based phylogenetic networks from a coherent perspective, rather than analyzing the complexity of each individual problem or developing an algorithm one by one. More precisely, we establish a structure theorem that gives a way to canonically decompose any rooted binary phylogenetic network N into maximal zig-zag trails that are uniquely determined by N, and furthermore use it to characterize the set of subdivision trees of N in the form of a direct product. From these main results, we derive a series of linear time (and linear time delay) algorithms for solving the following problems: given a rooted binary phylogenetic network N, (1) determine whether or not N has a subdivision tree and find one if there exists any (decision/search problems); (2) measure the deviation of N from being tree-based (deviation quantification problem); (3) compute the number of subdivision trees of N (counting problem); (4) list all subdivision trees of N (enumeration problem); and (5) find a subdivision tree to maximize or minimize a prescribed objective function (optimization problem). All algorithms proposed here are optimal in terms of time complexity. Our results not only imply and unify various known results in the relevant literature, but also answer many open questions and moreover enable novel applications, such as the estimation of a maximum likelihood tree underlying a tree-based network. The results and algorithms in this paper also apply for a special class of rooted nonbinary phylogenetic networks.
AB - Attempting to recognize a tree inside a phylogenetic network is a fundamental undertaking in evolutionary analysis. In the last few years, therefore, "tree-based"phylogenetic networks, which are defined by a spanning tree called a "subdivision tree"that is an embedding of a phylogenetic tree on the same leaf-set, have attracted the attention of many theoretical biologists. However, the application of such networks is still not easy, due to many important computational problems whose time complexities are unknown or not clearly understood. In this paper, we provide a general framework for solving those various old or new problems on tree-based phylogenetic networks from a coherent perspective, rather than analyzing the complexity of each individual problem or developing an algorithm one by one. More precisely, we establish a structure theorem that gives a way to canonically decompose any rooted binary phylogenetic network N into maximal zig-zag trails that are uniquely determined by N, and furthermore use it to characterize the set of subdivision trees of N in the form of a direct product. From these main results, we derive a series of linear time (and linear time delay) algorithms for solving the following problems: given a rooted binary phylogenetic network N, (1) determine whether or not N has a subdivision tree and find one if there exists any (decision/search problems); (2) measure the deviation of N from being tree-based (deviation quantification problem); (3) compute the number of subdivision trees of N (counting problem); (4) list all subdivision trees of N (enumeration problem); and (5) find a subdivision tree to maximize or minimize a prescribed objective function (optimization problem). All algorithms proposed here are optimal in terms of time complexity. Our results not only imply and unify various known results in the relevant literature, but also answer many open questions and moreover enable novel applications, such as the estimation of a maximum likelihood tree underlying a tree-based network. The results and algorithms in this paper also apply for a special class of rooted nonbinary phylogenetic networks.
KW - counting
KW - decision/search
KW - deviation quantification
KW - enumeration
KW - optimization
KW - phylogenetic network
KW - phylogenetic tree
KW - subdivision tree
KW - tree-based network
UR - http://www.scopus.com/inward/record.url?scp=85133230113&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133230113&partnerID=8YFLogxK
U2 - 10.1137/19M1297403
DO - 10.1137/19M1297403
M3 - Article
AN - SCOPUS:85133230113
SN - 0895-4801
VL - 35
SP - 2490
EP - 2516
JO - SIAM Journal on Discrete Mathematics
JF - SIAM Journal on Discrete Mathematics
IS - 4
ER -