Sequence census experiments use next-generation sequencing data to estimate the relative abundance of target sequences. Because the sequenced reads are typically short DNA fragments, each must first be assigned to the transcript or gene that produced it. This alignment (or mapping) step currently consumes most of the computing power and time in most expression analyses.
To avoid this costly step, UC Berkeley researchers have developed a software program, kallisto, for quantifying the abundances of transcripts from RNA-Seq data or, more generally, of target sequences from high-throughput sequencing reads. The program is based on pseudo-alignment, which rapidly determines which targets each read is compatible with, without the need for alignment. Pseudo-alignment of reads preserves the key information needed for quantification.
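The idea of determining read-target compatibility without alignment can be illustrated with a toy sketch. The code below is not kallisto's implementation: it assumes a plain hash index from k-mers to target names, ignores reverse complements and sequencing errors, and uses hypothetical names throughout (`build_kmer_index`, `pseudoalign`, `tx1`, `tx2`). It only shows that intersecting the target sets of a read's k-mers yields the set of compatible targets without computing where or how the read aligns.

```python
from collections import defaultdict


def build_kmer_index(targets, k):
    """Map every k-mer to the set of target names that contain it."""
    index = defaultdict(set)
    for name, seq in targets.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the targets that contain every k-mer of the read.

    No base-level alignment is computed; only set membership is used.
    A read shorter than k, or one with an unseen k-mer, yields an empty set.
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()
    return compatible or set()


# Hypothetical toy targets that share a common prefix.
targets = {
    "tx1": "ACGTACGTTAGC",
    "tx2": "ACGTACGGGCAT",
}
index = build_kmer_index(targets, k=5)

shared = pseudoalign("ACGTACG", index, k=5)    # read from the shared prefix
unique = pseudoalign("TACGTTAG", index, k=5)   # read spanning a tx1-only region
unknown = pseudoalign("GGGGGGG", index, k=5)   # read matching no target
```

In this sketch a read from the shared prefix is compatible with both targets, while a read covering a region specific to `tx1` is compatible with `tx1` alone. A quantification step would then work from these compatibility sets rather than from full alignments.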
RNA-Seq; Next-generation sequencing; High-throughput; Sequence census