

Title: Curation and ISA representation of a SARS-Cov2/Covid-19 Proteomics Dataset - PXD107710 - ISA representation

Type: Dataset

Philippe Rocca-Serra, Susanna Assunta Sansone (2020): Curation and ISA representation of a SARS-Cov2/Covid-19 Proteomics Dataset - PXD107710 - ISA representation. Zenodo. Dataset. https://zenodo.org/record/3742219

Authors: Philippe Rocca-Serra (University of Oxford); Susanna Assunta Sansone (University of Oxford); Steffen Neumann (IPB Halle)


Summary

Curation and ISA representation of a SARS-Cov2/Covid-19 Proteomics Dataset deposited in the PRIDE database with accession number PXD017710.

ISA-Tab annotation for the "SARS-CoV-2 infected host cell proteomics reveal potential therapy targets" publication.

Github repository: https://github.com/ISA-tools/PXD017710

This is part of an effort to (re-)annotate: https://dx.doi.org/10.21203/rs.3.rs-17218/v1

Additional work done as part of:

 https://github.com/virtual-biohackathons/covid-19-bh20  https://github.com/virtual-biohackathons/covid-19-bh20/wiki/FairData

Proteomics data

Available from PRIDE at https://www.ebi.ac.uk/pride/archive/projects/PXD017710 and [MassIVE/CCMS Maestro+MSstats reanalysis of MSV000085096 / PXD017710]

ISA-Tab representation:

Rationale: Demonstrate the suitability of the ISA format for representing an MS-based protein profiling experiment with more granularity and detail, thus providing a better representation of the experiment design. The formatting and re-annotation are based on information extracted from:

- the original publication
- the supplementary tables available from the publisher's site
- the 'filtered-results.csv' helper file as supplied to @sneumann during the HUPO-PSI meeting March 2020

Viewing the ISA-tab formatted and re-annotated PXD017710 with ISATab-Viewer

To view the ISA-Tab formatted and re-annotated PXD017710 locally, do the following:

```bash
python -m http.server 8000
```

Then point your browser to `http://0.0.0.0:8000/isaviewer-demo.html`

Curation tasks performed:

* initial structure of the study design in ISA format:

* linkage of Proteome and Translatome data (supplementary material) to ISA assay tables (via Derived Data File)

* processing the Proteome and Translatome data (supplementary material) with the Python pandas library to generate the following csv files:

    - proteome_intensities_long_table_ggplot2.txt
    - proteome_diffanal_ratio_pvalue_long_table_ggplot2.txt
    - translatome_intensities_long_table_ggplot2.txt
    - translatome_diffanal_ratio_pvalue_long_table_ggplot2

    The files are `long table`s corresponding to a `melt` on the Excel file originally generated by the users and can be readily loaded in the R ggplot2 library for graphical representation.
    The statistically relevant elements have been annotated with the STATO ontology and the tables comply with a Frictionless.io Data Package.
    The jupyter notebook for the transformation is available.

* conversion of raw data to mzML format: detailed in https://github.com/ISA-tools/PXD017710

install docker:

```bash
brew update
brew install docker
```

sign in to docker (the Docker daemon must already be running):

```bash
docker login
```

pull docker container for ProteoWizard:

```bash
docker pull chambm/pwiz-skyline-i-agree-to-the-vendor-licenses
```

:warning: be sure to sign up and log in to https://hub.docker.com/ in order to be able to reach https://hub.docker.com/r/chambm/pwiz-skyline-i-agree-to-the-vendor-licenses

run the pwiz tool from the container over the raw data:

```bash
docker run -it --rm -e WINEDEBUG=-all -v /Users/Downloads/PXD017710/raw/:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/*.raw --mzML
```
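The host path in the command above is specific to one machine. As a minimal sketch, the same invocation can be assembled for any raw-data directory; the function name `msconvert_command` and the example path `/tmp/PXD017710/raw` are hypothetical, and the sketch only prints the command rather than running it. Quoting the `/data/*.raw` pattern keeps the host shell from trying to expand it, so that the pattern reaches `msconvert` inside the container unchanged.

```python
import shlex


def msconvert_command(raw_dir: str) -> list:
    """Build the docker invocation for a host directory containing .raw files."""
    return [
        "docker", "run", "-it", "--rm",
        "-e", "WINEDEBUG=-all",
        "-v", f"{raw_dir}:/data",  # mount the host directory as /data
        "chambm/pwiz-skyline-i-agree-to-the-vendor-licenses",
        "wine", "msconvert", "/data/*.raw", "--mzML",
    ]


# Hypothetical host directory, used for illustration only.
cmd = msconvert_command("/tmp/PXD017710/raw")

# shlex.join quotes the glob pattern so it can be pasted into a shell safely.
shell_line = shlex.join(cmd)
print(shell_line)
```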

* ontology markup for:
    * declaration of independent variables as ISA Study Factors: {biological agent, dose, time point, replicate} -> OBI
    * Taxonomic information (host cells and virus) -> NCBITaxonomy
    * Cell line: CaCo-2 cells -> Cell Line Ontology
    * Disease: Colon Cancer -> Human Phenotype Ontology
    * MS specific aspect (TMT reagent, instrument ... ) -> PSI-MS
    * Statistical Tests -> STATO
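The `melt` step mentioned among the curation tasks above reshapes a wide table (one column per sample) into a long table (one row per measurement), which is the layout ggplot2 expects. The following is only a sketch of that kind of reshaping: the column names (`protein_id`, `control_2h`, `virus_2h`) and the values are invented for illustration and are not taken from the supplementary tables or from the project's notebook.

```python
import pandas as pd

# Hypothetical wide table: one row per protein, one column per sample.
wide_df = pd.DataFrame(
    {
        "protein_id": ["P1", "P2"],
        "control_2h": [10.0, 20.0],
        "virus_2h": [12.5, 18.0],
    }
)

# Reshape to long format: one row per (protein, sample) measurement.
long_df = wide_df.melt(
    id_vars=["protein_id"],
    var_name="sample",
    value_name="intensity",
)

# Serialise as delimited text that R can read back for plotting.
long_text = long_df.to_csv(index=False)
print(long_text)
```

Each sample column becomes a value of the `sample` variable, so a plot can map `sample` to an aesthetic directly instead of referring to individual columns.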

Unresolved curatorial issues:

 1. ambiguities related to the Tandem Mass Tag labelling protocol
    - the publication mentions TMT11 (see Figure 2 in https://www.researchsquare.com/article/rs-17218/v1)
    - the information available from PRIDE mentions TMT6 (https://www.ebi.ac.uk/pride/archive/projects/PXD017710)

    This may require another round of annotation on the TMT agents and fractions in the ISA a_assay representation

 2. SARS-Cov2 isolate: no clear NCBI Taxonomic anchoring and unclear origin: -> the markup is made to the parent class (as of 06.04.2020)

Release and packaging as a BDBAG:

The tgz file associated with this upload has been produced using https://github.com/fair-research/bdbag. It contains several manifest files detailing metadata and data files, providing md5 and sha256 checksums.
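To illustrate what the checksum manifests are for, here is a minimal sketch that checks files against a BagIt-style manifest, where each line holds a checksum followed by a relative path. It assumes manifests named `manifest-md5.txt` and `manifest-sha256.txt` at the top of the unpacked bag; the helper names and the demo file are hypothetical. bdbag provides its own validation, so this is only meant to show the idea, not to replace it.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path, algorithm: str) -> str:
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(bag_dir: Path, algorithm: str) -> dict:
    """Map each path listed in manifest-<algorithm>.txt to True/False."""
    results = {}
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        actual = file_checksum(bag_dir / rel_path, algorithm)
        results[rel_path] = actual == expected
    return results


# Self-contained demo on a throwaway bag with one hypothetical payload file.
with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp)
    (bag / "data").mkdir()
    payload = bag / "data" / "example.txt"
    payload.write_bytes(b"hello bag\n")
    for algo in ("md5", "sha256"):
        entry = f"{file_checksum(payload, algo)}  data/example.txt\n"
        (bag / f"manifest-{algo}.txt").write_text(entry)
    md5_ok = verify_manifest(bag, "md5")
    sha256_ok = verify_manifest(bag, "sha256")

print(md5_ok, sha256_ok)
```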

Github repository: https://github.com/ISA-tools/PXD017710

More information

  • DOI: 10.5281/zenodo.3742219
  • Language: en

Subjects

  • FAIR data, Proteomics, mass spectrometry, SARS-Cov2, Covid-19, Caco2 cell line, treated versus control intervention design, ISA format, STATO ontology, bdbag, FAIRsharing

Dates

  • Publication date: 2020
  • Issued: April 06, 2020




Funding Information

  • Award number: 802750
  • Award URI: info:eu-repo/grantAgreement/EC/H2020/802750/
  • Funder identifier: 10.13039/100010661
  • Funder identifier type: Crossref Funder ID
  • Funder name: European Commission

Format

electronic resource

Related items

  • References: https://doi.org/10.21203/rs.3.rs-17218/v1
  • Cites: https://www.ebi.ac.uk/pride/archive/projects/PXD017710
  • IsVersionOf: https://doi.org/10.5281/zenodo.3742218
  • IsPartOf: https://zenodo.org/communities/covid-19
  • IsPartOf: https://zenodo.org/communities/zenodo