Recent progress in next-generation sequencing has greatly facilitated our study of genomic structural variation. Unlike single nucleotide variants and small indels, many structural variants have not been completely characterized at the nucleotide resolution. Deriving the complete sequences underlying such breakpoints is crucial for not only accurate discovery, but also the functional characterization of altered alleles. However, our current ability to determine such breakpoint sequences is limited because of challenges in aligning and assembling short reads. To address this issue, we developed a targeted iterative graph routing assembler, TIGRA, which implements a set of novel data analysis routines to achieve effective breakpoint assembly from next-generation sequencing data. In our assessment using data from the 1000 Genomes Project, TIGRA was able to accurately assemble the majority of deletion and mobile element insertion breakpoints, with a substantively better success rate and accuracy than other algorithms. TIGRA has been applied in the 1000 Genomes Project and other projects, and is freely available for academic use.
From the abstract, it seems like an interesting paper. Wish we could say more, but the paper is locked up right now. We do not understand what pleasure these researchers get in doing all these hard work and locking up their papers in inaccessible journals.
A little more from the supplement -
TIGRA is a computer program that performs targeted local assembly of structural variant (SV) breakpoints from next generation sequencing short-read data. It takes as input a list of putative SV calls and a set of bam files that contain reads mapped to a reference genome such as NCBI build36. For each SV call, it assembles the set of reads that were mapped or partially mapped to the region of interest (ROI) in the corresponding bam files. Instead of outputing a single consensus sequence, tigra attempts to construct all the alternative alleles in the ROI as long as they received sufficient sequence coverage (usually >= 2x). It also utilizes the variant type information in the input files to select reads for assembly. Tigra is effective at improving the SV prediction accuracy and resolution in short reads analysis and can produce accurate breakpoint sequences that are useful to understand the origin, mechanism and pathology underlying the SVs.