Sebastien Roch - Research

Funding

I gratefully acknowledge current support from NSF grants DMS-2023239 (TRIPODS Phase II) and DMS-2308495, as well as a Van Vleck Research Professor Award and a Vilas Distinguished Achievement Professorship.

Past funding includes NSF grants DMS-1248176, DMS-1149312 (CAREER), DMS-1614242, CCF-1740707 (TRIPODS Phase I), DMS-1902892, and DMS-1916378, as well as an Alfred P. Sloan Research Fellowship, a Simons Fellowship, and a Vilas Associates Award.

Students and Postdocs

Current Ph.D. Students

Hongyi Huang

Current Postdocs

David Clancy

Former Ph.D. Students

Max Hill [graduated 2023; now at IMSI]
Yu Sun [graduated 2023; now at TikTok]
Shuqi Yu [graduated 2023; now at Metanotitia]
Brandon Legried [graduated 2020; now postdoc at Georgia Institute of Technology]
Kun-Chieh (Jason) Wang [graduated 2017; now at Google]

Former Postdocs

Wai Tong (Louis) Fan [2015-2018; now assistant professor at Indiana University Bloomington]

Books and Surveys

Mathematical Methods in Data Science (with Python)

To be published by Cambridge University Press.

Modern Discrete Probability: An Essential Toolkit

Cambridge University Press, 2024.

Book review of "Phylogeny: Discrete and Random Processes in Evolution" by Mike Steel

Bulletin of the AMS, 56:527-533, 2019.

Hands-on introduction to sequence-length requirements in phylogenetics

Bioinformatics and Phylogenetics. Computational Biology, vol 29. Springer, 2019.

Preprints

Estimating Graph Dimension with Cross-validated Eigenvalues

Preprint. With Fan Chen, Karl Rohe, and Shuqi Yu.

Reducing Seed Bias in Respondent-Driven Sampling by Estimating Block Transition Probabilities

Preprint. With Yilin Zhang and Karl Rohe.

Journal Papers and Refereed Proceedings

Maximum Likelihood Estimation for Unrooted 3-Leaf Trees: An Analytic Solution for the CFN Model

Bulletin of Mathematical Biology, 86:106, 2024. With Max Hill and Jose Israel Rodriguez.

Pairwise sequence alignment at arbitrarily large evolutionary distance

Annals of Applied Probability, 34(3):2714-2732, 2024. With Brandon Legried.

QR-STAR: A Polynomial-Time Statistically Consistent Method for Rooting Species Trees under the Coalescent

Journal of Computational Biology, 30(11):1146-1181, 2023. With Yasamin Tabatabaee and Tandy Warnow.

Expanding the class of global objective functions for dissimilarity-based hierarchical clustering

Journal of Classification, 40:513-526, 2023.

Statistically consistent rooting of species trees under the multi-species coalescent model

RECOMB 2023. With Yasamin Tabatabaee and Tandy Warnow.

Inconsistency of triplet-based and quartet-based species tree estimation under intralocus recombination

Journal of Computational Biology, 29(11):1173-1197, 2022. With Max Hill.

Impossibility of phylogeny reconstruction from k-mer counts

Annals of Applied Probability, 32(6):4893-4913, 2022. With Wai-Tong Louis Fan and Brandon Legried.

Species tree estimation under joint modeling of coalescence and duplication: sample complexity of quartet methods

Annals of Applied Probability, 32(6):4681-4705, 2022. With Max Hill and Brandon Legried.

On the Effect of Intralocus Recombination on Triplet-Based Species Tree Estimation

RECOMB 2022. With Max Hill.

A stochastic Farris transform for genetic data under the multispecies coalescent with applications to data requirements

Journal of Mathematical Biology, 84(5):36, 2022. With Gautam Dasarathy, Elchanan Mossel, and Robert Nowak.

Polynomial-Time Statistical Estimation of Species Trees Under Gene Duplication and Loss

Journal of Computational Biology, 28(5):452-468, 2021. With Brandon Legried, Erin Molloy, and Tandy Warnow.
Conference version in Proceedings of RECOMB 2020, 120-135.

Sufficient condition for root reconstruction by parsimony on binary trees with general weights

Electronic Communications in Probability, 26:1-13, 2021. With Jason Wang.

Impossibility of consistent distance estimation from sequence lengths under the TKF91 model

Bulletin of Mathematical Biology, 82(9):123, 2020. With Wai-Tong Louis Fan and Brandon Legried.

Asymptotic seed bias in respondent-driven sampling

Electronic Journal of Statistics, 14(1):1577-1610, 2020. With Yuling Yan, Bret Hanlon, and Karl Rohe.

Statistically consistent and computationally efficient inference of ancestral DNA sequences in the TKF91 model under dense taxon sampling

Bulletin of Mathematical Biology, 82(2):21, 2020. With Wai-Tong Louis Fan.

Long-branch attraction in species tree estimation: inconsistency of partitioned likelihood and topology-based summary methods

Systematic Biology, 68(2):281-297, 2019. With Michael Nute and Tandy Warnow.

Generalized least squares can overcome the critical threshold in respondent-driven sampling

Proceedings of the National Academy of Sciences, 115(41):10299-10304, 2018. With Karl Rohe.