pontushojer/awesome-linked-reads

Awesome Linked Reads

This is a collection of tools and resources for analysis and processing of linked-reads.

Tools

Name Category Description Last commit
Aldy structural variants, variant calling Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes GitHub last commit
Ambigram structural variants Detection of complex breakage-fusion-bridge genome rearrangements that supports linked-reads GitHub last commit
Aquila assembly, pipeline Diploid personal genome assembly and comprehensive variant detection based on linked-reads GitHub last commit
Aquila_stLFR assembly, pipeline Human haplotype-resolved assembly and variant detection for stLFR, hybrid assembly for linked-reads GitHub last commit
AquilaDeepFilter structural variants Deep learning based filtering of genome-wide false positive large deletions GitHub last commit
AquilaSV structural variants, variant calling Structural variant detection from region-based phased diploid assemblies for 10X and stLFR linked-reads GitHub last commit
ARBitR scaffolding ARBitR is an overlap aware genome assembly scaffolder for linked sequencing reads. GitHub last commit
Ariadne assembly, metagenomics de Bruijn graph-based program for barcoded read deconvolution GitHub last commit
arcs assembly Scaffold genome sequence assemblies. GitHub last commit
Athena assembly, metagenomics Read cloud assembler for metagenomes GitHub last commit
BarCrawler qc QC package for 10X genomics barcoded reads. GitHub last commit
bcctools toolkit Correcting barcodes in 10X linked-read sequencing data GitHub last commit
bcmap mapping, toolkit Fast tool to map approximate genome locations for barcoded molecules GitHub last commit
BLR pipeline An end-to-end Snakemake workflow for whole genome haplotyping and structural variant calling from FASTQs from multiple linked-read technologies. GitHub last commit
bxtools toolkit Tools for analyzing mapped 10x data GitHub last commit
cloudSPAdes assembly Assembly of synthetic long reads using de Bruijn graphs GitHub last commit (branch)
ChromeQC qc Summarize sequencing library quality of 10x Genomics Chromium linked reads GitHub last commit
Cue structural variants Deep learning framework for SV calling and genotyping GitHub last commit
DrLink structural variants Detecting recombination breakpoints using Linked read sequencing GitHub last commit
EMerAld (EMA) mapping Performs barcode-aware alignment of linked reads. Also does preprocessing of 10x Genomics data. GitHub last commit
Gemtools toolkit Tools for working with linked-read sequencing (10X Genomics) data GitHub last commit
grocsvs structural variants Genome-wide reconstruction of complex structural variants GitHub last commit
HapCUT2 phasing Phasing of barcode linked reads GitHub last commit
HapTree-X phasing Haplotype phaser for next-generation sequencing data GitHub last commit
HARPY pipeline Process raw haplotagging data, from raw sequences to phased haplotypes, batteries included. GitHub last commit
HAST assembly Haplotype-Resolved Assembly for Synthetic Long Reads Using A Trio-Binning Strategy GitHub last commit
Lancet variant calling Microassembly based somatic variant caller for linked-read data GitHub last commit
Lariat mapping Linked-Read Alignment Tool GitHub last commit
LEVIATHAN structural variants Linked-reads based structural variant caller with barcode indexing GitHub last commit
Link_STR toolkit Analysis scripts developed for genotyping STRs in linked-read data GitHub last commit
LinkedSV structural variants Structural variant caller for linked-read sequencing data GitHub last commit
Linker toolkit Tools for analyzing long and linked read sequencing GitHub last commit
LongRanger pipeline Pipeline for alignment, variant calling, phasing, and structural variant calling GitHub last commit
LRez toolkit Standalone tool and library allowing to work with barcoded linked-reads GitHub last commit
LRTK pipeline, toolkit A unified and versatile toolkit for analyzing Linked-Read sequencing data GitHub last commit
LRTK-SIM simulation A program to simulate linked reads sequencing from 10X Chromium System GitHub last commit
LRSIM simulation A simulator for linked reads GitHub last commit
MetaTrass assembly Taxonomic Reads Assembly For a Single Species to Metagenomics GitHub last commit
Minerva assembly Sort Linked Read DNA Into Fragment Specific Clusters GitHub last commit
mLinker (alt) phasing, toolkit Tools for Determining Haplotype Phase from Long/Linked Read Sequencing GitHub last commit
MTG-Link assembly Novel gap-filling tool for draft genome assemblies, dedicated to linked read data GitHub last commit
NAIBR (original), NAIBR (fork) structural variants Identifies novel adjacencies created by structural variation events such as deletions, duplications, inversions, and complex rearrangements GitHub last commit
Novel-X structural variants Novel insertion detection with 10X reads GitHub last commit
NPGREAT assembly A hybrid assembly method that utilizes Nanopore and Linked-Reads datasets for the assembly of the human subtelomere regions. GitHub last commit
Pangaea assembly, metagenomics A metagenome assembler for the linked-reads with high-barcode specificity GitHub last commit
proc10xG toolkit Collection of scripts for processing 10x genomics reads GitHub last commit
Pseudoseq simulation Fake genomes, fake sequencing, real insights. GitHub last commit
Pyslr assembly Construct a Physical Map from Linked Reads GitHub last commit
QuickDeconvolution assembly Quick and scalable software to deconvolve read clouds from linked-reads experiments without a reference genome GitHub last commit
Samovar variant calling Somatic (mosaic) SNV caller for 10X Genomics data using random forest classification and feature-based filters GitHub last commit
samplot structural variants Plot structural variant signals from many BAMs and CRAMs GitHub last commit
Scaff10x (v5), Scaff10x (≤v4.1) assembly Pipeline for scaffolding and breaking a genome assembly GitHub last commit
SpecHLA phasing Reconstructs entire diploid sequences of HLA genes and infers LOH events GitHub last commit
SpLitteR (alt) assembly Repeat resolution in assembly graph using synthetic long reads
stLFRdenovo assembly De novo assembly pipeline for barcoded reads. It is based on Supernova, with a FASTQ parsing and sorting module customized for stLFR data. GitHub last commit
stLFRsv structural variants Structural variation (SV) pipeline for stLFR co-barcode reads GitHub last commit
SuperNova assembly 10x Genomics Linked-Read Diploid De Novo Assembler GitHub last commit
SVenX structural variants Pipeline for SV detection using 10X genomics data GitHub last commit
tenx_utils toolkit Utility functions for 10x data GitHub last commit
Tigmint assembly Correct misassemblies using Linked Reads GitHub last commit
TitanCNA_10x pipeline, structural variants, cancer Snakemake workflow for 10X Genomics WGS analysis using TitanCNA GitHub last commit
Topsorter structural variants, qc Graphic assessment of structural variants GitHub last commit
VISOR simulation VarIant SimulatOR for short, long and linked reads GitHub last commit
Valor structural variants Variation discovery using long range information in linked-reads GitHub last commit
WhatsHap phasing, qc, toolkit Read-based phasing of genomic variants, also called haplotype assembly. Implements several tools which work with linked reads GitHub last commit
Wrath structural variants, qc Visualisation and identification of candidate structural variants (SVs) from linked read data GitHub last commit
xTea structural variants Comprehensive TE insertion identification GitHub last commit
ZoomX structural variants Single Molecule Based Rearrangement Analysis with Linked Read Sequencing

Linked Read Platforms

10x Genomics Chromium / GemCode

10x Genomics linked-read technology comes in two versions: the older GemCode (v1) and the more recent Chromium Genome (v2). Long DNA fragments are combined in droplets with barcode-containing gel beads to create GEMs (Gel Bead-In EMulsions). The fragments are amplified and barcoded using a combination of free random hexamers and barcode-linked random hexamers from the gel beads. The barcoded fragments are then recovered and fragmented before ligation of the 3' sequencing adaptor. Libraries are sequenced using Illumina sequencing. The commercial version of the technology has been discontinued.
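Because the barcode comes from the gel-bead primer, it ends up at the start of the read rather than in a separate index read. Below is a minimal Python sketch of how a preprocessing step might separate the barcode from the genomic insert. The lengths used (a 16-base barcode followed by 7 bases of primer-derived sequence that are trimmed) are assumptions based on common Chromium Genome conventions, not something stated in this README, and `split_barcode` is a hypothetical helper rather than part of any tool listed above.

```python
from typing import NamedTuple


class SplitRead(NamedTuple):
    barcode: str
    barcode_quality: str
    sequence: str
    quality: str


def split_barcode(seq: str, qual: str, barcode_len: int = 16,
                  spacer_len: int = 7) -> SplitRead:
    """Split read 1 into barcode and insert.

    barcode_len and spacer_len are assumed defaults; check them
    against the chemistry actually used before relying on them.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    insert_start = barcode_len + spacer_len
    if len(seq) < insert_start:
        raise ValueError("read is shorter than barcode plus spacer")
    return SplitRead(
        barcode=seq[:barcode_len],
        barcode_quality=qual[:barcode_len],
        sequence=seq[insert_start:],
        quality=qual[insert_start:],
    )


# Toy read: 16-base barcode, 7 spacer bases, 5 insert bases.
read = split_barcode("ACGTACGTACGTACGT" + "NNNNNNN" + "TTGCA", "I" * 28)
print(read.barcode, read.sequence)
```

Real pipelines also correct barcodes against a whitelist, which this sketch leaves out.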

TELL-Seq

TELL-Seq is based on the technology from Chen et al. 2020 and is commercially available from the company Universal Sequencing. The method uses clonally barcoded beads with attached tagmentases to cut and barcode individual long DNA fragments in solution. A second tagmentation is also performed in solution to introduce a second adaptor. The library is sequenced using Illumina sequencing with a special setup to sequence the barcode as index 1.

stLFR

stLFR (single-tube long fragment read) is based on the technology described in Wang et al. 2019 and is commercially available from MGI. The technology uses tagmentation to individually cut-and-hold long DNA fragments in solution. Each tagmentase-DNA complex is then hybridized to, and individually wrapped around, a barcoded bead through the adaptor introduced by the tagmentation. The barcode is then ligated to each subfragment before recovery and final library preparation. Sequencing is performed on the DNBSEQ platforms.

DBS

Droplet Barcode Sequencing (DBS) is based on the technology described in Redin et al. 2019. Long DNA fragments are subjected to tagmentation using Tn5-covered beads to cut, tag and wrap the fragment around the beads. The DNA-wrapped beads are then used in emulsion PCR along with barcoded oligos. Within each emulsion droplet the barcode and tagged fragments are amplified independently and then linked using overlap-extension. Barcode-linked fragments are recovered and indexed for Illumina sequencing.

CPT-seq

CPT-seq is based on Amini et al. 2014, with the follow-up CPTv2-seq described in Zhang et al. 2017. These technologies were developed by Illumina but are not commercially available.

Haplotagging

Haplotagging is based on the technology presented in Meier et al. 2021. The technology uses barcoded beads covered with Tn5 tagmentase to cut and barcode individual long DNA fragments in solution. The beads are coated in a combination of two barcodes AB and CB that become inserted at the 5' and 3' of each cut fragment. Barcodes are combinatorially generated with about 85 million possible combinations in total.
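The figure of about 85 million combinations can be checked with a short Python sketch. It assumes four barcode segments with 96 variants each, which is not stated in this README. The `A01C02B03D04`-style tag layout and the `parse_tag` helper are likewise hypothetical illustrations, not the format of any specific tool.

```python
import re

# Assumed design: 4 barcode segments, 96 variants per segment.
SEGMENTS = 4
VARIANTS_PER_SEGMENT = 96

# Assumed tag layout: segment letter followed by a two-digit index.
TAG_PATTERN = re.compile(r"A(\d{2})C(\d{2})B(\d{2})D(\d{2})")


def total_combinations(segments: int = SEGMENTS,
                       variants: int = VARIANTS_PER_SEGMENT) -> int:
    """Number of distinct barcodes from independent segments."""
    return variants ** segments


def parse_tag(tag: str):
    """Return the four segment indices, or None if the tag is invalid."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None
    indices = tuple(int(group) for group in match.groups())
    # Indices outside 1..96 are treated as invalid in this sketch.
    if any(i < 1 or i > VARIANTS_PER_SEGMENT for i in indices):
        return None
    return indices


print(total_combinations())  # 96 ** 4 = 84934656, i.e. about 85 million
print(parse_tag("A01C02B03D96"))
```

Under these assumptions the combinatorial design needs only 4 × 96 = 384 distinct segment sequences to produce the full barcode space.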

Contributions

Is a linked-read tool missing from this resource? Either create a new issue with information about the tool you want to add, or submit a pull request with the addition directly.

Credits

Inspired by the collection in Awesome-10x-genomics.