Alignment-Free Rapid Sequence Census Quantification (Kallisto)

Tech ID: 25009 / UC Case 2015-156-0

Brief Description

Sequence census experiments use next-generation sequencing data to estimate the relative abundance of target sequences. Because the reads are often short DNA fragments, each must first be assigned to the transcript and gene that produced it, and this alignment (mapping) step currently consumes the majority of computing power and time in most expression analyses.

To avoid this costly step, UC Berkeley researchers have developed a software program (kallisto) that quantifies the abundance of transcripts from RNA-Seq data or, more generally, of target sequences from high-throughput sequencing reads. The program is based on pseudo-alignment, which rapidly determines the compatibility of reads with targets without the need for alignment, while preserving the key information needed for quantification.
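The idea of checking read-target compatibility without alignment can be illustrated with a toy sketch. The code below is not kallisto's implementation or its data structures. It assumes a plain k-mer lookup table, and the function names (build_kmer_index, pseudoalign), the target names, and the sequences are all hypothetical. A read is treated as compatible with a target only if every k-mer of the read occurs somewhere in that target.

```python
from collections import defaultdict


def build_kmer_index(targets, k):
    """Map every k-mer to the set of target names that contain it."""
    index = defaultdict(set)
    for name, seq in targets.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the targets that contain every k-mer of the read.

    No base-level alignment is computed; only set membership is checked.
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()  # some k-mer matches no remaining target
    return compatible or set()  # reads shorter than k yield an empty set


# Hypothetical toy targets that share a prefix and differ at the 3' end.
targets = {
    "tx_A": "ACGTACGTTAGC",
    "tx_B": "ACGTACGTCCAT",
}
index = build_kmer_index(targets, k=5)

shared = pseudoalign("ACGTACGT", index, k=5)  # read from the shared prefix
unique = pseudoalign("CGTTAGC", index, k=5)   # read from the tx_A-specific end
print(shared, unique)
```

In this sketch the set of compatible targets is the only thing kept per read, which is the kind of information a downstream quantification step would work from.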

Suggested uses

RNA-Seq analysis and quantification
Sequence census experiments for copy number variation analysis from DNA sequencing data
High-throughput sequencing

Advantages

Hundreds of times faster than standard tools (can quantify 30 million human reads in less than 3 minutes)
Increased accuracy, especially compared to programs that shred reads into k-mers
Fast enough to make the bootstrap tractable for determining the uncertainty of estimates; includes infrastructure for managing the large amount of data associated with bootstrapped samples
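The bootstrap mentioned in the last advantage can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's own procedure. It assumes reads have already been summarized as counts per set of compatible targets, uses a basic EM that ignores target lengths, and uses hypothetical names (em_abundances, bootstrap_abundances) and made-up counts. Each bootstrap round resamples the reads with replacement and re-estimates abundances, and the spread across rounds indicates the uncertainty of each estimate.

```python
import random
import statistics
from collections import Counter


def em_abundances(class_counts, targets, iterations=200):
    """Estimate relative abundances from compatibility-class counts with a basic EM."""
    theta = {t: 1.0 / len(targets) for t in targets}
    total = sum(class_counts.values())
    for _ in range(iterations):
        assigned = {t: 0.0 for t in targets}
        for members, count in class_counts.items():
            weight = sum(theta[t] for t in members)
            if weight == 0:
                continue  # no member currently has any abundance
            for t in members:
                # Split the class count in proportion to current abundances.
                assigned[t] += count * theta[t] / weight
        theta = {t: assigned[t] / total for t in targets}
    return theta


def bootstrap_abundances(class_counts, targets, n_boot=50, seed=0):
    """Resample reads with replacement and re-run the EM on each resample."""
    rng = random.Random(seed)
    classes = list(class_counts)
    weights = [class_counts[c] for c in classes]
    n_reads = sum(weights)
    samples = {t: [] for t in targets}
    for _ in range(n_boot):
        resampled = Counter(rng.choices(classes, weights=weights, k=n_reads))
        theta = em_abundances(resampled, targets)
        for t in targets:
            samples[t].append(theta[t])
    return samples


# Hypothetical counts: reads compatible with tx_A only, tx_B only, or both.
targets = ["tx_A", "tx_B"]
class_counts = {
    frozenset({"tx_A"}): 30,
    frozenset({"tx_B"}): 10,
    frozenset({"tx_A", "tx_B"}): 60,
}

point = em_abundances(class_counts, targets)
samples = bootstrap_abundances(class_counts, targets, n_boot=20)
for t in targets:
    print(t, round(point[t], 3), round(statistics.stdev(samples[t]), 3))
```

Because every round repeats the whole estimation, the approach is only practical when a single quantification run is fast, and the per-round results accumulate quickly, which is why storage infrastructure for bootstrapped samples matters.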

Inventors

Bray, Nicolas L.
Pachter, Lior S.
Pimentel, Harold J.

Other Information

Keywords

RNA-Seq; Next-generation sequencing; High-throughput; sequence census

Categorized As

Biotechnology
- Bioinformatics
- Genomics
Medical
- Research Tools
- Software
Research Tools
- Bioinformatics
- Nucleic Acids/DNA/RNA