The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small

Masatoshi Nei; Sudhir Kumar; Kei Takahashi

doi:10.1073/pnas.95.21.12390

Research output: Contribution to journal › Article › peer-review

140 Scopus citations

Abstract

In the maximum parsimony (MP) and minimum evolution (ME) methods of phylogenetic inference, evolutionary trees are constructed by searching for the topology that shows the minimum number of mutational changes required (M) and the smallest sum of branch lengths (S), respectively, whereas in the maximum likelihood (ML) method the topology showing the highest maximum likelihood (A) of observing a given data set is chosen. However, the theoretical basis of the optimization principle remains unclear. We therefore examined the relationships of M, S, and A for the MP, ME, and ML trees with those for the true tree by using computer simulation. The results show that M and S are generally greater for the true tree than for the MP and ME trees when the number of nucleotides examined (n) is relatively small, whereas A is generally lower for the true tree than for the ML tree. This finding indicates that the optimization principle tends to give incorrect topologies when n is small. To deal with this disturbing property of the optimization principle, we suggest that more attention should be given to testing the statistical reliability of an estimated tree rather than to finding the optimal tree with excessive efforts. When a reliability test is conducted, simplified MP, ME, and ML algorithms such as the neighbor-joining method generally give conclusions about phylogenetic inference very similar to those obtained by the more extensive tree search algorithms.
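To make the quantity M in the abstract concrete, the following is a minimal sketch of how the number of required changes can be counted on a fixed topology with Fitch's small-parsimony algorithm. It is not the authors' simulation code. The function names (`fitch_site`, `parsimony_length`), the four taxon labels, and the six-site alignment are all hypothetical, chosen by hand so that a wrong topology needs fewer changes than the tree labelled as "true". It only illustrates how this can happen when n is small.

```python
# Toy illustration only: taxa, sequences and tree labels are made up.

def fitch_site(tree, states):
    """Fitch pass for one site on a rooted binary tree.

    `tree` is a taxon name (str) or a pair (left, right) of subtrees.
    `states` maps taxon name -> character at this site.
    Returns (set of candidate states at this node, changes counted so far).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_site(left, states)
    right_set, right_changes = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no extra change is needed here.
        return common, left_changes + right_changes
    # Disjoint state sets: one additional change is required at this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all sites (the M of the abstract)."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        states = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, states)[1]
    return total


# Hypothetical alignment with n = 6 sites for four taxa.
alignment = {
    "A": "ACGTAG",
    "B": "ACGTCT",
    "C": "ACGAAG",
    "D": "ACGACT",
}

true_tree = (("A", "B"), ("C", "D"))   # assumed to be the true topology
other_tree = (("A", "C"), ("B", "D"))  # an incorrect topology

m_true = parsimony_length(true_tree, alignment)
m_other = parsimony_length(other_tree, alignment)
print("M on assumed true tree:", m_true)
print("M on alternative tree: ", m_other)
```

In this toy alignment one site supports the assumed true grouping and two sites happen to support the alternative, so an MP search that minimizes M would choose the alternative topology. With only a handful of sites, a single chance pattern is enough to tip the comparison.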

Original language	English (US)
Pages (from-to)	12390-12397
Number of pages	8
Journal	Proceedings of the National Academy of Sciences of the United States of America
Volume	95
Issue number	21
DOIs	https://doi.org/10.1073/pnas.95.21.12390
State	Published - Oct 13 1998
Externally published	Yes

ASJC Scopus subject areas

General

Access to Document

10.1073/pnas.95.21.12390

Cite this

The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small. / Nei, Masatoshi; Kumar, Sudhir; Takahashi, Kei.
In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 95, No. 21, 13.10.1998, p. 12390-12397.

@article{284ff555d4fa407c8d25a9f3b248b25e,

title = "The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small",

abstract = "In the maximum parsimony (MP) and minimum evolution (ME) methods of phylogenetic inference, evolutionary trees are constructed by searching for the topology that shows the minimum number of mutational changes required (M) and the smallest sum of branch lengths (S), respectively, whereas in the maximum likelihood (ML) method the topology showing the highest maximum likelihood (A) of observing a given data set is chosen. However, the theoretical basis of the optimization principle remains unclear. We therefore examined the relationships of M, S, and A for the MP, ME, and ML trees with those for the true tree by using computer simulation. The results show that M and S are generally greater for the true tree than for the MP and ME trees when the number of nucleotides examined (n) is relatively small, whereas A is generally lower for the true tree than for the ML tree. This finding indicates that the optimization principle tends to give incorrect topologies when n is small. To deal with this disturbing property of the optimization principle, we suggest that more attention should be given to testing the statistical reliability of an estimated tree rather than to finding the optimal tree with excessive efforts. When a reliability test is conducted, simplified MP, ME, and ML algorithms such as the neighbor-joining method generally give conclusions about phylogenetic inference very similar to those obtained by the more extensive tree search algorithms.",

author = "Masatoshi Nei and Sudhir Kumar and Kei Takahashi",

year = "1998",

month = oct,

day = "13",

doi = "10.1073/pnas.95.21.12390",

language = "English (US)",

volume = "95",

pages = "12390--12397",

journal = "Proceedings of the National Academy of Sciences of the United States of America",

issn = "0027-8424",

publisher = "National Academy of Sciences",

number = "21",

}

TY - JOUR

T1 - The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small

AU - Nei, Masatoshi

AU - Kumar, Sudhir

AU - Takahashi, Kei

PY - 1998/10/13

Y1 - 1998/10/13

N2 - In the maximum parsimony (MP) and minimum evolution (ME) methods of phylogenetic inference, evolutionary trees are constructed by searching for the topology that shows the minimum number of mutational changes required (M) and the smallest sum of branch lengths (S), respectively, whereas in the maximum likelihood (ML) method the topology showing the highest maximum likelihood (A) of observing a given data set is chosen. However, the theoretical basis of the optimization principle remains unclear. We therefore examined the relationships of M, S, and A for the MP, ME, and ML trees with those for the true tree by using computer simulation. The results show that M and S are generally greater for the true tree than for the MP and ME trees when the number of nucleotides examined (n) is relatively small, whereas A is generally lower for the true tree than for the ML tree. This finding indicates that the optimization principle tends to give incorrect topologies when n is small. To deal with this disturbing property of the optimization principle, we suggest that more attention should be given to testing the statistical reliability of an estimated tree rather than to finding the optimal tree with excessive efforts. When a reliability test is conducted, simplified MP, ME, and ML algorithms such as the neighbor-joining method generally give conclusions about phylogenetic inference very similar to those obtained by the more extensive tree search algorithms.

AB - In the maximum parsimony (MP) and minimum evolution (ME) methods of phylogenetic inference, evolutionary trees are constructed by searching for the topology that shows the minimum number of mutational changes required (M) and the smallest sum of branch lengths (S), respectively, whereas in the maximum likelihood (ML) method the topology showing the highest maximum likelihood (A) of observing a given data set is chosen. However, the theoretical basis of the optimization principle remains unclear. We therefore examined the relationships of M, S, and A for the MP, ME, and ML trees with those for the true tree by using computer simulation. The results show that M and S are generally greater for the true tree than for the MP and ME trees when the number of nucleotides examined (n) is relatively small, whereas A is generally lower for the true tree than for the ML tree. This finding indicates that the optimization principle tends to give incorrect topologies when n is small. To deal with this disturbing property of the optimization principle, we suggest that more attention should be given to testing the statistical reliability of an estimated tree rather than to finding the optimal tree with excessive efforts. When a reliability test is conducted, simplified MP, ME, and ML algorithms such as the neighbor-joining method generally give conclusions about phylogenetic inference very similar to those obtained by the more extensive tree search algorithms.

UR - http://www.scopus.com/inward/record.url?scp=0032514671&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032514671&partnerID=8YFLogxK

U2 - 10.1073/pnas.95.21.12390

DO - 10.1073/pnas.95.21.12390

M3 - Article

C2 - 9770497

AN - SCOPUS:0032514671

SN - 0027-8424

VL - 95

SP - 12390

EP - 12397

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

IS - 21

ER -
