This is a limited proof of concept to search for research data, not a production system.

Title: Discovery of tandem and interspersed segmental duplications using high throughput sequencing

Type: Dataset

Citation: Soylev, Arda; Le, Thong; Amini, Hajar; Alkan, Can; Hormozdiari, Fereydoun (2019): Discovery of tandem and interspersed segmental duplications using high throughput sequencing. Zenodo. Dataset. https://zenodo.org/record/2611109

Authors: Soylev, Arda (Bilkent University); Le, Thong (University of California, Davis); Amini, Hajar (University of California, Davis); Alkan, Can (Bilkent University); Hormozdiari, Fereydoun (University of California, Davis)

Summary

We developed novel algorithms to accurately characterize tandem, direct and inverted interspersed segmental duplications using short read whole genome sequencing data sets. We integrated these methods into our TARDIS tool, which is now capable of detecting various types of SVs using multiple sequence signatures such as read pair, read depth and split read. We evaluated the prediction performance of our algorithms through several experiments using both simulated and real data sets. In the simulation experiments, at 30x coverage, TARDIS achieved 96% sensitivity with only a 4% false discovery rate. For experiments that involve real data, we used two haploid genomes (CHM1 and CHM13) and one human genome (NA12878) from the Illumina Platinum Genomes set. Comparison of our results with orthogonal PacBio call sets from the same genomes revealed higher accuracy for TARDIS than for state-of-the-art methods. Furthermore, we showed a surprisingly low false discovery rate for our approach in predicting tandem, direct and inverted interspersed segmental duplications on CHM1 (less than 5% for the top 50 predictions).
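
For readers unfamiliar with the two metrics quoted in the summary, the following is a minimal sketch of how sensitivity and false discovery rate are conventionally computed from counts of true positives, false positives and false negatives. The function names and the example counts are hypothetical, chosen only to reproduce the 96% and 4% figures; they are not taken from the deposited truth sets or from the TARDIS evaluation code.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real events that were detected: TP / (TP + FN)."""
    total_real = true_positives + false_negatives
    if total_real == 0:
        raise ValueError("no real events to evaluate against")
    return true_positives / total_real


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of predictions that were wrong: FP / (TP + FP)."""
    total_predicted = true_positives + false_positives
    if total_predicted == 0:
        raise ValueError("no predictions to evaluate")
    return false_positives / total_predicted


# Hypothetical counts (assumption, for illustration only):
# 100 simulated events, 96 recovered, 4 missed, 4 spurious predictions.
tp, fp, fn = 96, 4, 4
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"FDR         = {false_discovery_rate(tp, fp):.2%}")
```

Note that the two metrics use different denominators: sensitivity is measured against the truth set, while the false discovery rate is measured against the set of predictions.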

More information

  • DOI: 10.5281/zenodo.2611109
  • Language: en

Subjects

  • genomics, structural variation, TARDIS, simulation

Dates

  • Publication date: 2019
  • Issued: March 27, 2019

Notes

Other: Here we deposit current versions of TARDIS (1.0.2) and CNVSim, and all predictions, truth sets, and the CRAM files for the simulation data.

Format

electronic resource

Related items

  • IsVersionOf: https://doi.org/10.5281/zenodo.2611108
  • IsPartOf: https://zenodo.org/communities/zenodo