Title: transrate: v1.0.0 alpha 1
Type: Software
Richard Smith-Unna, cboursnell (2014): transrate: v1.0.0 alpha 1. Zenodo. Software. https://zenodo.org/record/12280
Summary
transrate v1.0.0 alpha 1
This is the first alpha release of transrate v1.
To install this pre-release, use the following command:
$ gem uninstall transrate
$ gem install --pre transrate --version v1.0.0.alpha1
New features
The Transrate score
The Transrate score is an estimate of the probability that the assembly is correct. A score is produced for the whole assembly, and for each contig. The scoring process uses the reads that were used to generate the assembly as evidence, so if you want to get a Transrate score, you need to run transrate in read-metrics mode (by passing in the reads with --left and --right).
The assembly score
The assembly score allows you to compare two or more assemblies made with the same reads. The score is designed so that an increased score is very likely to correspond to an assembly that is more biologically accurate.
The score is calculated as the geometric mean of all contig scores multiplied by the proportion of input reads that provide positive support for the assembly.
Thus, the score captures how confident you can be in what was assembled, as well as how complete the assembly is.
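The calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not transrate's own code: the function name, its arguments, and the example numbers are all assumptions made for this sketch.

```python
import math


def assembly_score(contig_scores, supporting_reads, total_reads):
    """Geometric mean of the contig scores multiplied by the proportion
    of input reads that support the assembly.

    All names here are hypothetical; they are not part of transrate.
    """
    if not contig_scores:
        raise ValueError("at least one contig score is required")
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # A single zero-scoring contig drives the geometric mean to zero.
    if any(s <= 0 for s in contig_scores):
        return 0.0
    # Average the logs rather than multiplying directly, so that very
    # large assemblies do not underflow to zero.
    mean_log = sum(math.log(s) for s in contig_scores) / len(contig_scores)
    return math.exp(mean_log) * (supporting_reads / total_reads)


# Made-up example: three contigs, 800 of 1000 reads supporting the assembly.
example_score = assembly_score([0.9, 0.4, 0.1], supporting_reads=800, total_reads=1000)
print(round(example_score, 4))
```

Because the geometric mean is multiplied by the read proportion, an assembly made only of high-scoring contigs still scores poorly if it leaves most of the reads unexplained.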
The contig score
Contig scores can be used to filter out bad contigs from an assembly, leaving you with only the well-assembled ones. Examining the distribution of contig scores can also give more detailed insight into the differences between assemblies.
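As a minimal sketch of score-based filtering, assume the contig scores have already been loaded into a dictionary keyed by contig name. The function name, the contig names, and the cutoff of 0.3 are hypothetical; these release notes do not prescribe a threshold.

```python
def filter_contigs(contig_scores, cutoff):
    """Keep only the contigs whose score is at least `cutoff`.

    `contig_scores` maps contig name -> score. Both the mapping and the
    cutoff are illustrative assumptions, not transrate output formats.
    """
    return {name: score for name, score in contig_scores.items() if score >= cutoff}


# Made-up scores for three contigs.
contigs = {"contig_1": 0.82, "contig_2": 0.05, "contig_3": 0.41}
kept = filter_contigs(contigs, cutoff=0.3)
print(sorted(kept))
```

In practice, a sensible cutoff would be chosen by looking at the distribution of contig scores for the assembly in question.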
Each contig is assigned a score by measuring how well it is supported by read evidence. The contig score can be thought of as an estimate of the probability that the contig is an accurate, non-redundant representation of a transcript that was present in the sequenced sample.
There are five components to the contig score:
- The probability that each base has been called correctly. This is estimated using the mean per-base edit distance, i.e. how many changes would have to be made to a read covering a base before the sequence of the read and the covered region of the contig agreed perfectly.
- The probability that each base is truly part of the transcript. This is estimated by determining whether any reads provide agreeing coverage for a base.
- The probability that each base is not contained in another contig. This is estimated by considering the root-mean-squared MAPQ score of the reads covering each base.
- The probability that the contig is derived from a single transcript (rather than pieces of two or more transcripts). This is estimated by assuming that fragments from different transcripts are likely to be generated at different rates, and that this difference is detectable as a difference in coverage distribution. The probability is then calculated using a Bayesian sequence segmentation algorithm which models the coverage distribution as a Dirichlet distribution over a reduced set of finite coverage states.
- The probability that the contig is structurally complete and correct. This is estimated as the proportion of mapped read pairs that agree with the structure and composition of the contig, which in turn is calculated by classifying the read pair alignments.
The score is the product of the components.
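To make the final step concrete, the sketch below multiplies five component probabilities into a contig score. It does not show how transrate estimates each component; the class name, field names, and example values are hypothetical and chosen only to mirror the five components described here.

```python
import math
from dataclasses import dataclass, astuple


@dataclass
class ContigComponents:
    """Five component probabilities for one contig (hypothetical field names)."""
    p_bases_correct: float          # bases called correctly
    p_bases_in_transcript: float    # bases truly part of the transcript
    p_not_in_other_contig: float    # bases not contained in another contig
    p_single_transcript: float      # contig derived from a single transcript
    p_structurally_complete: float  # contig structurally complete and correct

    def score(self):
        """Contig score as the product of the five components."""
        components = astuple(self)
        if any(not 0.0 <= p <= 1.0 for p in components):
            raise ValueError("each component must be a probability in [0, 1]")
        return math.prod(components)


# Made-up values: a contig that looks like a possible chimera.
example = ContigComponents(0.9, 1.0, 0.8, 0.5, 1.0)
print(round(example.score(), 2))
```

Keeping the components separate, rather than storing only the product, makes it possible to see which kind of evidence is pulling a given contig's score down.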
The score components are useful independently of the contig score, as they can identify contigs that can be treated in different ways to improve the quality of an assembly.
Faster processing
We identified all the major bottlenecks in our code and rewrote large parts of the codebase in C++ to provide an ~20x speedup.
Faster alignment
We have moved to using the SNAP aligner for an ~20x speedup in read alignment.
Probabilistic assignment of multi-mapping reads
We have moved to using eXpress to select the most likely assignment for each multi-mapping read. This has led to a considerable increase in the usefulness of read-mapping metrics.
More information
- DOI: 10.5281/zenodo.12280
Dates
- Publication date: 2014
- Issued: October 17, 2014
Rights
- info:eu-repo/semantics/openAccess Open Access
Format
electronic resource
Related items
Relationship | Uri |
---|---|
IsSupplementTo | https://github.com/Blahah/transrate/tree/v1.0.0.alpha.1 |
IsVersionOf | https://doi.org/10.5281/zenodo.591478 |
IsPartOf | https://zenodo.org/communities/zenodo |