Documentation on how to return proteomics data to MoBa genetics #1

Open
EvenBirkeland opened this issue Dec 7, 2022 · 0 comments

How to return proteomics data to MoBa genetics.
Background information:
Over the last 15 years, proteomics has become a standard method in medical and natural sciences research. Proteomics is defined as the study of all proteins in a biological system, such as a cell line, a tumor, or blood plasma.
Proteomics data are typically generated in one of two ways: by mass spectrometry or by a protein array technology. There are, however, several different protein arrays and several different mass spectrometry techniques.
The normalization and quality control pipeline applied to the resulting data also differs, depending on the data type, the software used, and the purpose of the experiment.
To return proteomics data to MoBa, we therefore need detailed information about the samples used in the experiment, the experimental design, the technique used to generate the data, and how normalization and quality control were done.
Together with the documents explaining how the data were generated, we need the raw files for each sample and any intermediate files that were produced. We also require a fully normalized and quality-controlled dataset, merged with the metadata used in the experiment (data from MoBa questionnaires or health registries).
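
As an illustration of the merged dataset described above, here is a minimal Python sketch that joins a normalized protein table with sample metadata and reports any samples that could not be matched. The column names (`sample_id`, `sex`, `age`) and the example values are assumptions made for this example only; they are not a MoBa requirement.

```python
def merge_with_metadata(protein_rows, metadata_rows, key="sample_id"):
    """Join normalized protein rows with metadata rows on a shared sample key.

    Returns (merged_rows, unmatched_ids). Samples without metadata are
    reported instead of being silently dropped.
    """
    meta_by_id = {row[key]: row for row in metadata_rows}
    merged, unmatched = [], []
    for row in protein_rows:
        meta = meta_by_id.get(row[key])
        if meta is None:
            unmatched.append(row[key])
            continue
        # Metadata columns first, then the protein measurements.
        merged.append({**meta, **row})
    return merged, unmatched


# Hypothetical example data: one normalized intensity column per protein.
protein_rows = [
    {"sample_id": "S1", "P02768": 21.4},
    {"sample_id": "S2", "P02768": 20.9},
    {"sample_id": "S3", "P02768": 22.1},
]
metadata_rows = [
    {"sample_id": "S1", "sex": "F", "age": 31},
    {"sample_id": "S2", "sex": "F", "age": 28},
]

merged, unmatched = merge_with_metadata(protein_rows, metadata_rows)
print(len(merged), "merged rows; samples without metadata:", unmatched)
```

Listing the unmatched sample IDs makes it easy to spot samples that were measured but are missing from the metadata file before the dataset is returned.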

Mass spectrometry-based proteomics.

A standards initiative for mass spectrometry-based proteomics already exists: MIAPE (The Minimum Information About a Proteomics Experiment, https://www.psidev.info/miape). We require researchers to follow these guidelines.
To keep things simple, however, please follow the submission setup used by PRIDE (https://www.ebi.ac.uk/pride/markdownpage/pridesubmissiontool#submission_details). PRIDE is also a very good example of how we want MoBa digital bio data to be organized.

In short:

  1. Sample information.
  2. Experimental setup.
  3. Materials and methods: Add information about normalization, database search, library generation, etc.
  4. Add files and assign file types: Make sure that the raw files contain both the mass spectrometer setup and the HPLC setup.
  5. Assign the relationship to submitted files.
  6. Additional submitted metadata.
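
The checklist above can be turned into a simple pre-submission check. The Python sketch below assumes a hypothetical submission description (a dictionary with section texts, a list of sample IDs, and a list of files); the field names and the file type label `raw` are invented for this example and are not taken from PRIDE or MoBa.

```python
# Sections every submission is expected to describe (steps 1-3 above).
REQUIRED_SECTIONS = (
    "sample_information",
    "experimental_setup",
    "materials_and_methods",
)


def check_submission(submission):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if not submission.get(section):
            problems.append(f"missing section: {section}")

    files = submission.get("files", [])
    for entry in files:
        # Steps 4 and 5: every file needs a type and a link to a sample.
        if not entry.get("file_type"):
            problems.append(f"no file type assigned: {entry.get('name')}")
        if not entry.get("sample_id"):
            problems.append(f"no sample assigned: {entry.get('name')}")

    samples_with_raw = {
        entry.get("sample_id") for entry in files if entry.get("file_type") == "raw"
    }
    for sample_id in submission.get("sample_ids", []):
        if sample_id not in samples_with_raw:
            problems.append(f"no raw file for sample: {sample_id}")
    return problems


# Hypothetical example: S2 has no raw file and the methods section is empty.
example = {
    "sample_information": "Plasma samples, see samples.tsv",
    "experimental_setup": "Label-free quantification, two batches",
    "materials_and_methods": "",
    "sample_ids": ["S1", "S2"],
    "files": [
        {"name": "S1.raw", "file_type": "raw", "sample_id": "S1"},
        {"name": "S2_processed.tsv", "file_type": "processed", "sample_id": "S2"},
    ],
}

for problem in check_submission(example):
    print(problem)
```

A check like this only confirms that the pieces are present; whether the descriptions are detailed enough still has to be judged by a person.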

Protein array-based proteomics.
For protein arrays, the MIAME (Minimum Information About a Microarray Experiment) concept is used (https://www.ncbi.nlm.nih.gov/geo/info/MIAME.html, https://www.nature.com/articles/ng1201-365). Although this concept was developed for RNA microarrays, the principles also apply to protein arrays.

The six most critical elements contributing towards MIAME are:

  1. Raw data for each assay.
  2. Final processed (normalized) data for the set of assays in the study (e.g., the gene expression data count matrix used to draw the conclusions in the study).
  3. Essential sample annotation (e.g., tissue, sex and age) and the experimental factors and their values (e.g., compound and dose in a dose response study).
  4. Experimental design including sample data relationships (e.g., which raw data file relates to which sample, which assays are technical replicates, which are biological replicates).
  5. Sufficient annotation of the array or sequence features examined (e.g., gene identifiers, genomic coordinates, protein accession keys (UniProt)).
  6. Essential laboratory and data processing protocols (e.g., what normalization method has been used to obtain the final processed data).

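
For the feature annotation element, a quick format check on protein accession keys can catch missing or mistyped identifiers early. The Python sketch below uses the accession pattern published in the UniProt help pages; the field names `feature_id` and `uniprot` and the example records are assumptions for illustration. It checks the format only, not whether an accession actually exists in UniProt, and it does not accept isoform suffixes.

```python
import re

# Accession format as documented in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def features_without_accession(features):
    """Return the IDs of features whose 'uniprot' field is missing or malformed."""
    flagged = []
    for feature in features:
        accession = (feature.get("uniprot") or "").strip()
        if not UNIPROT_ACCESSION.fullmatch(accession):
            flagged.append(feature["feature_id"])
    return flagged


# Hypothetical array annotation records.
features = [
    {"feature_id": "probe_001", "uniprot": "P02768"},
    {"feature_id": "probe_002", "uniprot": ""},
    {"feature_id": "probe_003", "uniprot": "A0A024R161"},
    {"feature_id": "probe_004", "uniprot": "IL6"},  # gene symbol, not an accession
]

print(features_without_accession(features))
```
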
If you have any questions or suggestions for improvements, please post them in our Slack channel #mobagenetics.
