Title: Filtered and annotated SNV and indel variants in the PC3 and LNCaP human prostate cancer cell lines

Type: Dataset

Citation: Seim, Inge (2017): Filtered and annotated SNV and indel variants in the PC3 and LNCaP human prostate cancer cell lines. Zenodo. Dataset.

Author: Seim, Inge (Queensland University of Technology)



150 bp paired-end reads (insert size 350 bp) were obtained using the Illumina HiSeqX sequencer. Samtools v1.3.1 mpileup and bcftools were used to interrogate indexed BAM files of whole-genome reads aligned to human reference genome GRCh38 build 82, and to generate a VCF (Variant Call Format) file of single nucleotide variants (SNVs) and short indel variants. Variants private (unique) to a particular cell line, or shared by both, were then identified. Variants (likely to be common germline variants) present in HapMap, 1000 Genomes phase 3 (2,504 human genomes), and the National Heart, Lung, and Blood Institute’s Exome Sequencing Project (ESP) (bundled variant data file available at) were excluded. Variant files (VCF) were filtered using SnpSift with the following parameters: 'QUAL >= 200 && DP >= 30', where QUAL denotes the minimum variant confidence and DP the minimum total depth. Filtered variants were annotated using SnpEff v4.3g. Please see for associated scripts.
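To make the filtering and private/shared steps concrete, here is a minimal Python sketch of the same logic. It is not the SnpSift implementation and not the author's scripts. It assumes that DP is read from the VCF INFO column, that each cell line's variants are available as separate VCF lines, and that a variant is identified by chromosome, position, REF and ALT. The function names (filter_vcf, variant_keys, split_private_shared) are hypothetical and exist only for this illustration.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

# Thresholds taken from the filter expression 'QUAL >= 200 && DP >= 30'.
MIN_QUAL = 200.0
MIN_DP = 30

# Assumption: a variant is identified by (CHROM, POS, REF, ALT).
VariantKey = Tuple[str, int, str, str]


def info_depth(info: str) -> Optional[int]:
    """Return the DP value from a VCF INFO column, or None if absent or invalid."""
    for field in info.split(";"):
        if field.startswith("DP="):
            try:
                return int(field[3:])
            except ValueError:
                return None
    return None


def passes(line: str) -> bool:
    """True if a VCF data line meets both the QUAL and DP thresholds."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:  # a VCF data line has at least 8 fixed columns
        return False
    try:
        qual = float(cols[5])
    except ValueError:  # QUAL is "." when missing
        return False
    depth = info_depth(cols[7])
    return depth is not None and qual >= MIN_QUAL and depth >= MIN_DP


def filter_vcf(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged and only those records that pass."""
    for line in lines:
        if line.startswith("#") or passes(line):
            yield line


def variant_keys(lines: Iterable[str]) -> Set[VariantKey]:
    """Collect (CHROM, POS, REF, ALT) keys from the data lines of a VCF."""
    keys: Set[VariantKey] = set()
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Multi-allelic ALT values (e.g. "G,T") are kept as one string here.
        keys.add((cols[0], int(cols[1]), cols[3], cols[4]))
    return keys


def split_private_shared(
    a: Set[VariantKey], b: Set[VariantKey]
) -> Tuple[Set[VariantKey], Set[VariantKey], Set[VariantKey]]:
    """Return (private to a, private to b, shared by both)."""
    return a - b, b - a, a & b
```

For example, calling variant_keys(filter_vcf(open(path))) once per cell line and passing the two resulting sets to split_private_shared gives the PC3-private, LNCaP-private and shared variant sets. Here path stands for a hypothetical per-cell-line VCF file.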


More information

  • DOI: 10.5281/zenodo.245431


  • Keywords: whole-genome sequencing, genome, VCF, SNV, indel, PC3, LNCaP, prostate cancer


  • Publication date: 2017
  • Issued: January 15, 2017



electronic resource

