
Bash scripts for analysis of Hyb-Seq data (from raw reads to species trees)


tomas-fer/HybPhyloMaker


HybPhyloMaker

Fér T. & Schmickl R. (2018): HybPhyloMaker: target enrichment data analysis from raw reads to species trees. Evolutionary Bioinformatics 14: 1-9. doi: 10.1177/1176934317742613

A set of Bash scripts for the analysis of Hyb-Seq data (from raw reads to species trees). The workflow consists of the following steps:

0: Prepare FASTQ files in a folder (optionally download them from Illumina BaseSpace storage)
1: Process raw reads (PhiX removal, adaptor removal, quality filtering, summary statistics)
2: Mapping reads to reference (using Bowtie2/BWA), create consensus sequence
3: Recognize sequences matching probes (generate PSLX files using BLAT)
4: Create alignments for all genes (+ optionally correct reading frame and/or select low heterozygosity loci)
5: Treat missing data, select best genes
6: Generate FastTree or RAxML gene trees + calculate/plot trees-alignment properties
7: Root gene trees with outgroup, combine gene trees into a single file
8: Estimate species tree and run other analyses (ASTRAL, ASTRID, MRL, BUCKy, concatenation FastTree/ExaML, NeighbourNetwork, Dsuite, SuperQ network, quartet sampling, SNaQ, PhyloNet)
9: Subselect suitable genes and repeat steps 7+8
10: Subselect trees based on sample presence, collapse unsupported branches
11: Calculate PhyParts
12: Subselect samples based on exclude list
13:DiscoVista
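
As an illustration of the second half of step 7 (combining gene trees into a single file), here is a minimal Bash sketch. It is not the code HybPhyloMaker itself uses: the function name `combine_trees`, the `.tre` extension and the one-tree-per-file layout are assumptions made only for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of HybPhyloMaker):
# combine_trees DIR OUTFILE
# Writes every *.tre file found in DIR to OUTFILE, one Newick tree per line.
# Assumes each input file holds exactly one tree.
# OUTFILE should not itself match DIR/*.tre, or it would be read as input.
combine_trees() {
    local dir=$1 out=$2 f
    : > "$out"                        # create or empty the output file
    for f in "$dir"/*.tre; do
        [ -e "$f" ] || continue       # skip the literal pattern if nothing matches
        # remove line breaks so the tree occupies a single line
        tr -d '\n\r' < "$f" >> "$out"
        printf '\n' >> "$out"
    done
}

# Usage (paths are placeholders):
# combine_trees gene_trees all_trees.newick
```

A single file with one tree per line is a common input layout for summary species-tree methods, which is why the sketch normalises line breaks inside each tree.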

HybPhyloMaker relies on many additional programs that must be installed and available in the PATH before the scripts are run (see the table in the docs folder, and consider running install_software.sh).
It also uses many scripts developed by others (located in the HybSeqSource folder). PLEASE CITE THOSE SCRIPTS APPROPRIATELY WHEN USING HybPhyloMaker!
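
One quick way to see whether required programs are reachable is to test each name with `command -v`. The sketch below is only an illustration: the function name `missing_tools` is hypothetical, and the executable names `bowtie2`, `bwa` and `blat` are assumed from the programs named in the step list. The authoritative list of required software is the table in the docs folder.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of HybPhyloMaker):
# prints the name of every given executable that is NOT found in PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example check; the executable names are assumptions for illustration.
missing=$(missing_tools bowtie2 bwa blat)
if [ -n "$missing" ]; then
    echo "Not found in PATH:"
    echo "$missing"
else
    echo "All listed tools were found."
fi
```

Any name the function prints would need to be installed, or its directory added to PATH, before running the corresponding step.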

Read the manual located in the docs folder before running HybPhyloMaker.

HybPhyloMaker workflow

Papers citing HybPhyloMaker

  1. Carlsen MM, Fér T, Schmickl R, Leong-Škorničková J, Newman M, Kress WJ. 2018. Resolving the rapid plant radiation of early diverging lineages in the tropical Zingiberales: Pushing the limits of genomic data. Molecular Phylogenetics and Evolution 128:55-68. doi: 10.1016/j.ympev.2018.07.020
  2. Herrando-Moraira S, Cardueae Radiations Group. 2018. Exploring data processing strategies in NGS target enrichment to disentangle radiations in the tribe Cardueae (Compositae). Molecular Phylogenetics and Evolution 128:69-87. doi: 10.1016/j.ympev.2018.07.012
  3. Villaverde T, Pokorny L, Olsson S, Rincón-Barrado M, Johnson MG, Gardner EM, Wickett NJ, Molero J, Riina R, Sanmartín I. 2018. Bridging the micro- and macroevolutionary levels in phylogenomics: Hyb-Seq solves relationships from populations to species and above. New Phytologist 220:636-650. doi: 10.1111/nph.15312
  4. Jones KE, Fér T, Schmickl RE, Dikow RB, Funk VA, Herrando-Moraira S, Siniscalchi CM, Susanna A, Slovák M, Thapa R, Watson LE, Mandel JR. 2019. An empirical assessment of a single family-wide hybrid capture locus set at multiple evolutionary timescales in Asteraceae. Applications in Plant Sciences 7(10):e11295. doi: 10.1002/aps3.11295
  5. Karimi N, Grover CE, Gallagher JP, Wendel JF, Ané C, Baum DA. 2020. Reticulate evolution helps explain apparent homoplasy in floral biology and pollination in baobabs (Adansonia; Bombacoideae; Malvaceae). Systematic Biology 69:462-478. doi: 10.1093/sysbio/syz073
  6. Mao Y, Hou S, Shi J, Economo EP. 2020. TREEasy: an automated workflow to infer gene trees, species trees, and phylogenetic networks from multilocus data. Molecular Ecology Resources 20:832-840. doi: 10.1111/1755-0998.13149
  7. Tomasello S, Karbstein K, Hodač L, Paetzold C, Hörandl E. 2020. Phylogenomics unravels Quaternary vicariance and allopatric speciation patterns in temperate‐montane plant species: a case study on the Ranunculus auricomus species complex. Molecular Ecology 29:2031-2049. doi: 10.1111/mec.15458
  8. Karbstein K, Tomasello S, Hodač L, Dunkel FG, Daubert M, Hörandl E. 2020. Phylogenomics supported by geometric morphometrics reveals delimitation of sexual species within the polyploid apomictic Ranunculus auricomus complex (Ranunculaceae). Taxon 69:1191-1220. doi: 10.1002/tax.12365
  9. Knyshov A, Gordon ERL, Weirauch C. 2021. New alignment-based sequence extraction software (ALiBaSeq) and its utility for deep level phylogenetics. PeerJ 9:e11019. doi: 10.7717/peerj.11019
  10. Rejlová L, Böhmová A, Chumová Z, Hořčicová Š, Josefiová J, Schmidt PA, Trávníček P, Urfus T, Vít P, Chrtek J. 2021. Disparity between morphology and genetics in Urtica dioica (Urticaceae). Botanical Journal of the Linnean Society 195:606-621. doi: 10.1093/botlinnean/boaa076
  11. McLay TGB, Birch JL, Gunn BF, Ning W, Tate JA, Nauheimer L, Joyce EM, Simpson L, Schmidt-Lebuhn AN, Baker WJ, Forest F, Jackson CJ. 2021. New targets acquired: Improving locus recovery from the Angiosperms353 probe set. Applications in Plant Sciences 9(7):e11420. doi: 10.1002/aps3.11420
  12. Nauheimer L, Weigner N, Joyce E, Crayn D, Clarke C, Nargar K. 2021. HybPhaser: A workflow for the detection and phasing of hybrids in target capture data sets. Applications in Plant Sciences 9(7):e11441. doi: 10.1002/aps3.11441
  13. Ufimov R, Zeisek V, Píšová S, Baker WJ, Fér T, van Loo M, Dobeš Ch, Schmickl R. 2021. Relative performance of customized and universal probe sets in target enrichment: A case study in subtribe Malinae. Applications in Plant Sciences 9(7):e11442. doi: 10.1002/aps3.11442
  14. Montero-Mendieta S, De la Riva I, Irisarri I, Leonard JA, Webster MT, Vilà C. 2021. Phylogenomics and evolutionary history of Oreobates (Anura: Craugastoridae) Neotropical frogs along elevational gradients. Molecular Phylogenetics and Evolution 161:107167. doi: 10.1016/j.ympev.2021.107167
  15. Chumová Z, Záveská E, Hloušková P, Ponert J, Schmidt PA, Čertner M, Mandáková T, Trávníček P. 2021. Repeat proliferation and partial endoreplication jointly shape the patterns of genome size evolution in orchids. The Plant Journal 107:511-524. doi: 10.1111/tpj.15306
  16. Reichelt N, Wen J, Pätzold C, Appelhans MS. 2021. Target enrichment improves phylogenetic resolution in the genus Zanthoxylum (Rutaceae) and indicates both incomplete lineage sorting and hybridization events. Annals of Botany 128:497-510. doi: 10.1093/aob/mcab092
  17. Nesi N, Tsagkogeorga G, Tsang SM, Nicolas V, Lalis A, Scanlon AT, Riesle-Sbarbaro SA, Wiantoro S, Hitch AT, Juste J, Pinzari CA, Bonaccorso FJ, Todd CM, Lim BK, Simmons NB, McGowen MR, Rossiter SJ. 2021. Interrogating phylogenetic discordance resolves deep splits in the rapid radiation of Old World fruit bats (Chiroptera: Pteropodidae). Systematic Biology 70:1077-1089. doi: 10.1093/sysbio/syab013
  18. Lara-Cabrera SI, Perez-Garcia ML, Maya-Lastra CA, Montero-Castro JC, Godden GT, Cibrian-Jaramillo A, Fisher AE, Porter JM. 2021. Phylogenomics of Salvia L. subgenus Calosphace (Lamiaceae). Frontiers in Plant Science 12:725900. doi: 10.3389/fpls.2021.725900
  19. Sangvirotjanapat S, Fér T, Denduangboripant J, Newman MF. 2022. Phylogeny of Globba section Nudae and taxonomic revision of the new Globba subsection Pelecantherae. Plant Systematics and Evolution 308:5. doi: 10.1007/s00606-021-01789-6
  20. Kandziora M, Sklenář P, Kolář F, Schmickl R. 2022. How to tackle phylogenetic discordance in recent and rapidly radiating groups? Developing a workflow using Loricaria (Asteraceae) as an example. Frontiers in Plant Science 12:765719. doi: 10.3389/fpls.2021.765719
  21. Gizaw A, Gorospe JM, Kandziora M, Chala D, Gustafsson L, Zinaw A, Salomón L, Eilu G, Brochmann C, Kolář F, Schmickl R. 2022. Afro-alpine flagships revisited II: elucidating the evolutionary relationships and species boundaries in the giant senecios (Dendrosenecio, Asteraceae). Alpine Botany 132:89-105. doi: 10.1007/s00035-021-00268-5
  22. Hatami E, Jones KE, Kilian N. 2022. New insights into the relationships within subtribe Scorzonerinae (Cichorieae, Asteraceae) using hybrid capture phylogenomics (Hyb-Seq). Frontiers in Plant Science 13:851716. doi: 10.3389/fpls.2022.851716
  23. Karbstein K, Tomasello S, Hodač L, Wagner N, Marinček P, Barke BH, Pätzold C, Hörandl E. 2022. Untying Gordian knots: Unraveling reticulate polyploid plant evolution by genomic data using the large Ranunculus auricomus species complex. New Phytologist 235:2081-2098. doi: 10.1111/nph.18284
  24. Michel T, Tseng YH, Wilson H, Chung KF, Kidner C. 2022. A hybrid capture bait set for Begonia. Edinburgh Journal of Botany 79:409. doi: 10.24823/ejb.2022.409
  25. Ufimov R, Gorospe JM, Fér T, Kandziora M, Salomon L, van Loo M, Schmickl R. 2022. Utilizing paralogs for phylogenetic reconstruction has the potential to increase species tree support and reduce gene tree discordance in target enrichment data. Molecular Ecology Resources 22:3018-3034. doi: 10.1111/1755-0998.13684
  26. Méndez-Urbano F, Sierra-Giraldo JA, Carlsen MM, Rodríguez-Rey GT, Castaño-Rubiano N. 2022. Anthurium caldasii: a new species of Araceae from Colombia and its phylogenetic relationships with other black-spathed Anthurium species. Brittonia 74:419-435. doi: 10.1007/s12228-022-09722-y
  27. Woudstra Y, Quatela A-S, Kidner C, Viruel J, Zuntini A, Martin MD, Michel T, Grace OM. 2022. Chapter 14. Target capture. In: de Boer H, Rydmark MO, Verstraete B, Gravendeel B (Eds) Molecular identification of plants: from sequence to species. Advanced Books. doi: 10.3897/ab.e98875
  28. Hlavatá K, Leong-Škorničková J, Záveská E, Šída O, Newman M, Mandáková T, Lysak MA, Marhold K, Fér T. 2023. Phylogenomics and genome size evolution in Amomum s. s. (Zingiberaceae): comparison of traditional and modern sequencing methods. Molecular Phylogenetics and Evolution 178:107666. doi: 10.1016/j.ympev.2022.107666
  29. Böhmová A, Leong-Škorničková J, Šída O, Poulsen AD, Newman MF, Fér T. 2023. Next-generation sequencing data show rapid radiation and several long-distance dispersal events in early Costaceae. Molecular Phylogenetics and Evolution 179:107664. doi: 10.1016/j.ympev.2022.107664
  30. Blanco-Gavaldà C, Galbany-Casals M, Susanna A, Andrés-Sánchez S, Bayer RJ, Brochmann C, Cron GV, Bergh NG, Garcia-Jacas N, Gizaw A, Kandziora M, Kolář F, López-Alvarado J, Leliaert F, Letsara R, Moreyra LD, Razafimandimbison SG, Schmickl R, Roquet C. 2023. Repeatedly northwards and upwards: southern African grasslands fuel the colonization of the African sky islands in Helichrysum (Compositae). Plants 12:2213. doi: 10.3390/plants12112213
  31. Scheunert A, Lautenschlager U, Ott T, Oberprieler C. 2023. Nano-Strainer: A workflow for the identification of single-copy nuclear loci for plant systematic studies, using target capture kits and Oxford Nanopore long reads. Ecology and Evolution 13:e10190. doi: 10.1002/ece3.10190
  32. Pezzini FF, Ferrari G, Forrest LL, Hart ML, Nishii K, Kidner CA. 2023. Target capture and genome skimming for plant diversity studies. Applications in Plant Sciences 11(4):e11537. doi: 10.1002/aps3.11537
  33. Moreyra LD, Garcia-Jacas N, Roquet C, Ackerfield JR, Arabacı T, Blanco-Gavaldà C, Brochmann C, Calleja JA, Dirmenci T, Fujikawa K, Galbany-Casals M, Gao T, Gizaw A, López-Alvarado J, Mehregan I, Vilatersana R, Yıldız B, Leliaert F, Seregin AP, Susanna A. 2023. African mountain thistles: three new genera in the Carduus-Cirsium Group. Plants 12:3083. doi: 10.3390/plants12173083
  34. Bradican JP, Tomasello S, Boscutti F, Karbstein K, Hörandl E. 2023. Phylogenomics of southern European taxa in the Ranunculus auricomus species complex: the apple doesn't fall far from the tree. Plants 12:3664. doi: 10.3390/plants12213664
  35. Xie P, Guo Y, Teng Y, Zhou W, Yu Y. 2024. GeneMiner: A tool for extracting phylogenetic markers from next-generation sequencing data. Molecular Ecology Resources 24:e13924. doi: 10.1111/1755-0998.13924
  36. Skopalíková J, Leong-Škorničková J, Šída O, Newman M, Chumová Z, Fér T, Záveská E. 2024. Ancient hybridization in Curcuma (Zingiberaceae) – Accelerator or brake in lineage diversification? The Plant Journal 116:773-785. doi: 10.1111/tpj.16408
  37. Hlavatá K, Záveská E, Leong-Škorničková J, Pouch M, Poulsen AD, Šída O, Khadka B, Mandáková T, Fér T. 2024. Ancient hybridization and repetitive element proliferation in the evolutionary history of the monocot genus Amomum (Zingiberaceae). Frontiers in Plant Science 15:1324358. doi: 10.3389/fpls.2024.1324358
  38. Ning W, Meudt HM, Tate JA. 2024. A roadmap of phylogenomic methods for studying polyploid plant genera. Applications in Plant Sciences 12:e11580. doi: 10.1002/aps3.11580
  39. Bradican JP, Tomasello S, Vollmer J, Hörandl E. 2024. Converging forms: an examination of sub-Arctic, circumarctic, and Central Asian Ranunculus auricomus agg. populations. Frontiers in Plant Science 15:1415059. doi: 10.3389/fpls.2024.1415059
  40. Poulsen AD, Fér T, Kumarage Marasinghe LD, Sabu M, Hughes M, Valderrama E, Leong-Škorničková J. 2024. The cardamom conundrum resolved: Recircumscription and placement of Elettaria in the only pantropically distributed ginger lineage. Taxon. doi: 10.1002/tax.13242