Added PPI section with MHC subsection #638

zietzm · 2017-08-10T15:55:47Z

References #575 and includes MHC-peptide papers

Added a section on Protein-Protein Interactions (PPI) with a subsection on MHC-peptide binding prediction.

@agitter mentioned PPI networks as a possible area of interest in #575, but this has been neglected here for the sake of not adding too much to an already long section. If a PPI network subsection is still desired, I would be more than happy to add one, but I understand the necessity to minimize additional length being added to this paper.

agapow

Looks good - only minor suggestions. My one fear is that this is fairly technically detailed, perhaps more so than a lot of the rest of the ms.

agapow · 2017-11-08T12:56:19Z

sections/04_study.md

+However, because many PPIs are transient or dependent on biological context, high-throughput methods can fail to capture a number of interactions.
+Additionally, common types of high-throughput screens for PPIs, such as the yeast two-hybrid, can have issues with high rates of false positive results [@doi:10.1186/s12964-015-0116-8 @doi:10.1002/pmic.200800150].
+
+This section will focus on advances in *de novo* PPI prediction.


Might benefit from a more explicit linking statement of the need for PPI prediction and thus DL.

agapow · 2017-11-08T12:57:48Z

sections/04_study.md

+Beyond predicting whether or not two proteins interact, Du et al. [@doi:10.1016/j.ymeth.2016.06.001] showed that a tandem stacked-autoencoder/deep-neural-network method could be used to predict residue contacts for the interfacial regions of interacting proteins.
+A combination of a hidden Markov model with Fisher scores yielded uniform-length features for each residue. Their method significantly exceeded classical machine learning accuracy.
+
+Because many studies used predefined higher-level features, one of the benefits of deep learning— automatic feature extraction— is not fully leveraged.


space before emdash

agapow · 2017-11-08T13:01:12Z

sections/04_study.md

+Because MHCnuggets had to be trained for every MHC allele, performance was far better for alleles with abundant, balanced training data.
+
+In a comparison of several current methods, Bhattacharya et al. found that the top methods— NetMHC, NetMHCpan, MHCflurry, and MHCnuggets— showed comparable performance, but large differences in speed.
+In the authors analysis, convolutional neural networks (in this case, HLA-CNN) showed comparatively poor performance, while shallow and recurrent neural networks performed the best.


Delete "in the authors analysis" as unnecessary

cgreene · 2017-12-18T13:13:59Z

@zietzm were you planning to make modifications as discussed in #689? Wondering if we should wait for that before reviewing this PR.

zietzm · 2017-12-18T21:50:24Z

@cgreene I had planned to incorporate those changes here. Sorry for the delay in getting those updates ready. I want to get the section finished ASAP, and I hope to push some changes by the beginning of next week.

cgreene · 2017-12-18T23:37:11Z

@zietzm 👍 will wait to review further until then

agitter

Thanks for the great contributions. I have several suggestions, and my main comment is to think about how to summarize the many MHC methods. In some places I tried trimming text that isn't critical.

@cgreene do you think we need to further shorten this section? I think that if we can condense some of the MHC paragraphs we'll be okay.

agitter · 2017-12-30T13:02:46Z

content/04.study.md

@@ -424,6 +424,92 @@ summarized above also apply to interfacial contact prediction for protein
 complexes but may be less effective since on average protein complexes have
 fewer sequence homologs.

+### Protein-Protein Interactions
+
+Protein-protein interactions (PPIs) are highly specific and non-accidental physical contacts between proteins which occur for purposes other than generic protein production or degradation [@doi:10.1371/journal.pcbi.1000807].


Comma before which

agitter · 2017-12-30T13:03:27Z

content/04.study.md

@@ -424,6 +424,92 @@ summarized above also apply to interfacial contact prediction for protein
 complexes but may be less effective since on average protein complexes have
 fewer sequence homologs.

+### Protein-Protein Interactions


Capitalize only the first Protein

agitter · 2017-12-30T13:05:02Z

content/04.study.md

+### Protein-Protein Interactions
+
+Protein-protein interactions (PPIs) are highly specific and non-accidental physical contacts between proteins which occur for purposes other than generic protein production or degradation [@doi:10.1371/journal.pcbi.1000807].
+PPIs are key to many cellular processes like metabolism and immune responses.


PPIs are involved in almost all cellular processes. Perhaps we could cut this line? I'm looking for places to shorten the text.

agitter · 2017-12-30T13:08:04Z

content/04.study.md

+PPIs are key to many cellular processes like metabolism and immune responses.
+Abundant interaction data have been generated in-part thanks to advances in high-throughput screening methods, such as yeast two-hybrid and affinity-purification with mass spectrometry.
+However, because many PPIs are transient or dependent on biological context, high-throughput methods can fail to capture a number of interactions.
+Additionally, common types of high-throughput screens for PPIs, such as the yeast two-hybrid, can have issues with high rates of false positive results [@doi:10.1186/s12964-015-0116-8 @doi:10.1002/pmic.200800150].


If we keep this line, the new manubot style requires ; between references.
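
If other citation groups need the same fix, the separator change can be applied mechanically. Below is a minimal sketch, assuming citation groups are bracketed runs of `@`-prefixed keys separated by whitespace, as in the line above. The function name `semicolon_separate` is hypothetical and is not part of manubot.

```python
import re

# A bracketed group holding only citation keys, e.g. [@doi:... @doi:...]
CITATION_GROUP = re.compile(r"\[(@[^\]\s;]+(?:[\s;]+@[^\]\s;]+)*)\]")


def semicolon_separate(text):
    """Rewrite whitespace-separated citation keys as '; '-separated keys."""
    def fix(match):
        # Split on any mix of whitespace and semicolons so the fix is idempotent
        keys = re.split(r"[\s;]+", match.group(1))
        return "[" + "; ".join(keys) + "]"
    return CITATION_GROUP.sub(fix, text)


line = "false positive results [@doi:10.1186/s12964-015-0116-8 @doi:10.1002/pmic.200800150]."
print(semicolon_separate(line))
```

Groups that already use `;` and single-key groups pass through unchanged, so the function should be safe to run over a whole file more than once.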

agitter · 2017-12-30T13:09:05Z

content/04.study.md

+Protein-protein interactions (PPIs) are highly specific and non-accidental physical contacts between proteins which occur for purposes other than generic protein production or degradation [@doi:10.1371/journal.pcbi.1000807].
+PPIs are key to many cellular processes like metabolism and immune responses.
+Abundant interaction data have been generated in-part thanks to advances in high-throughput screening methods, such as yeast two-hybrid and affinity-purification with mass spectrometry.
+However, because many PPIs are transient or dependent on biological context, high-throughput methods can fail to capture a number of interactions.


This sentence alone might be enough to motivate the need for PPI prediction. Then you could cut the line about false positive rates, because a reader might wonder whether computational predictions really have lower false positive rates than Y2H.

agitter · 2017-12-30T13:56:16Z

content/04.study.md

+A way of working with different network types was shown by Gligorijevic et al., [@doi:10.1101/223339] who developed a multimodal deep autoencoder, deepNF, to find a feature representation common among several different PPI networks.
+This common lower-level representation allows for the combination of various PPI data sources towards a single predictive task.
+An SVM classifier trained on the compressed features from the middle layer of the autoencoder outperformed previous methods in predicting protein function.
+The key advancement of this method is the use of deep learning to incorporate higher-order network information for protein function prediction.


This might already be clear enough from the rest of the paragraph. You could cut it.

agitter · 2017-12-30T13:57:03Z

content/04.study.md

+The key advancement of this method is the use of deep learning to incorporate higher-order network information for protein function prediction.
+
+Hamilton et al. addressed the issue of large, heterogeneous, and changing networks with an inductive approach called GraphSAGE [@arxiv:1706.02216v2].
+By finding node embeddings through learned aggregator functions which describe the node and its neighbors in the network, the GraphSAGE approach allows for the generalization of the model to unknown nodes.


Change which to that.

agitter · 2017-12-30T13:58:13Z

content/04.study.md

+
+Hamilton et al. addressed the issue of large, heterogeneous, and changing networks with an inductive approach called GraphSAGE [@arxiv:1706.02216v2].
+By finding node embeddings through learned aggregator functions which describe the node and its neighbors in the network, the GraphSAGE approach allows for the generalization of the model to unknown nodes.
+Generalization to unseen nodes is especially useful for PPI networks, as these networks represent various types of interactions between proteins in a variety of species, and they can be updated frequently.


I don't think I'm following this. Isn't an unseen node a new protein in a PPI network? Do we encounter new proteins? Or is the idea that a trained model generalizes to new graphs?
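
One possible reading of "unseen nodes" is sketched below: because the layer is a function of a node's features and its neighbors' features, weights fit on one graph can be applied to a node added later, or to a different graph, without retraining. This is a simplified illustration and not the GraphSAGE implementation. It shows a single layer with a mean aggregator, the weights are fixed by hand as a stand-in for trained values, and all names (`mean_aggregate`, `sage_layer`, the toy proteins A to D) are hypothetical.

```python
def mean_aggregate(neighbor_feats):
    """Element-wise mean of the neighbors' feature vectors."""
    n = len(neighbor_feats)
    return [sum(col) / n for col in zip(*neighbor_feats)]


def sage_layer(node, graph, feats, weights):
    """One simplified layer: ReLU(W . [h_v ; mean(h_u for u in N(v))])."""
    neighbors = graph[node]
    if neighbors:
        agg = mean_aggregate([feats[u] for u in neighbors])
    else:
        agg = [0.0] * len(feats[node])
    combined = feats[node] + agg  # concatenate self and neighborhood vectors
    return [max(0.0, sum(w * x for w, x in zip(row, combined))) for row in weights]


# Stand-in for weights learned on the original network (2 outputs, 4 inputs)
weights = [[1.0, 0.0, 0.5, 0.0],
           [0.0, 1.0, 0.0, 0.5]]

# Toy network available at training time
feats = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
emb_a = sage_layer("A", graph, feats, weights)

# A node added after training reuses the same weights
feats["D"] = [0.0, 2.0]
graph["D"] = ["B"]
graph["B"].append("D")
emb_d = sage_layer("D", graph, feats, weights)
print(emb_a, emb_d)
```

Under this reading, both a newly added protein and an entirely new network (for example, from another species) are covered, since only node features and local neighborhoods are needed at prediction time.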

agitter · 2017-12-30T13:59:11Z

content/04.study.md

+Hamilton et al. addressed the issue of large, heterogeneous, and changing networks with an inductive approach called GraphSAGE [@arxiv:1706.02216v2].
+By finding node embeddings through learned aggregator functions which describe the node and its neighbors in the network, the GraphSAGE approach allows for the generalization of the model to unknown nodes.
+Generalization to unseen nodes is especially useful for PPI networks, as these networks represent various types of interactions between proteins in a variety of species, and they can be updated frequently.
+In a classification task for the prediction of protein function, Chen and Zhu [@arxiv:1710.10568v1] optimized this approach and enhanced the graph convolutional network with a preprocessing step to improve significantly both training time and prediction accuracy.


What is the preprocessing step?

agitter · 2017-12-30T14:02:23Z

content/04.study.md

+They found that MHCnuggets — the recurrent neural network — was by far the fastest training among the top performing methods.
+In predicting interactions between proteins, deep learning has achieved state-of-the-art results and shows promise to overcome previous challenges in the field.
+
+### PPI networks and graph analysis


We also discuss graph convolutions in the drug discovery section and could link those topics. We have a sentence "Modern neural networks can operate directly on the molecular graph as input." that could be changed to "Modern neural networks, such as those discussed previously for PPI networks, can operate directly on the molecular graph as input."

agitter

Thanks, these are excellent revisions and address all of my initial comments. The only remaining items to resolve before merging are:

- two minor commas noted here
- decide what you'd like to do with the #### header
- resolve conflicts with master

agitter · 2018-01-02T21:43:37Z

content/04.study.md

+
+Shallow, feed-forward neural networks are competitive methods and have made progress toward pan-allele and pan-length peptide representations.
+Sequence alignment techniques are useful for representing variable-length peptides as uniform-length features [@doi:10.1110/ps.0239403; @doi:10.1093/bioinformatics/btv639].
+For pan-allelic prediction, NetMHCpan [@doi:10.1007/s00251-008-0341-z; @doi:10.1186/s13073-016-0288-x] used a pseudo-sequence representation of the MHC class I molecule which included only polymorphic peptide contact residues.


Comma before which

agitter · 2018-01-02T21:46:25Z

content/04.study.md

 MHCflurry's imputation method increases its performance on poorly characterized alleles, making it competitive with NetMHCpan for this task.
+Kuksa et al. [@doi:10.1093/bioinformatics/btv371] developed a shallow, higher-order neural network (HONN) comprised of both mean and covariance hidden units to capture some of the higher-order dependencies between amino acid locations.


Nice improvement, the HONN makes sense now.

agitter · 2018-01-02T21:50:53Z

content/04.study.md


 An important challenge in PPI network prediction is the task of combining different networks and types of networks.
-A way of working with different network types was shown by Gligorijevic et al., [@doi:10.1101/223339] who developed a multimodal deep autoencoder, deepNF, to find a feature representation common among several different PPI networks.
+Gligorijevic et al., [@doi:10.1101/223339] developed a multimodal deep autoencoder, deepNF, to find a feature representation common among several different PPI networks.


Can remove the comma after et al.

", such as those discussed previously for PPI networks,"

zietzm · 2018-01-04T04:01:24Z

@agitter thanks for your help on these sections! I think my last few commits should now have the PR ready.

This build is based on 75f0dc2.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/deep-review/builds/325010006
https://travis-ci.org/greenelab/deep-review/jobs/325010007

[ci skip]

The full commit message that triggered this build is copied below:

Added PPI section with MHC subsection (#638)

* Added PPI and MHC sections
* Updates to PPI/MHC subsection
* PPI network section
* Updates to all PPI sections
* Commas and header
* Remove accidental newline
* Re-add PPI section reference ", such as those discussed previously for PPI networks,"

cgreene mentioned this pull request Nov 3, 2017

Reviews are in! #678

Closed

17 tasks

agapow approved these changes Nov 8, 2017

View reviewed changes

zietzm added 2 commits November 8, 2017 16:10

Added PPI and MHC sections

22ebebb

Updates to PPI/MHC subsection

aca3eb8

agitter mentioned this pull request Nov 9, 2017

Referee 2.2 #689

Closed

agitter added this to the journal-revisions milestone Nov 17, 2017

agitter added the study label Dec 19, 2017

PPI network section

dfce531

zietzm mentioned this pull request Dec 27, 2017

DeepMHC: Deep Convolutional Neural Networks for High-performance peptide-MHC Binding Affinity Prediction #737

Open

agitter requested changes Dec 30, 2017

View reviewed changes

Updates to all PPI sections

48f56d4

agitter requested changes Jan 2, 2018

View reviewed changes

zietzm added 5 commits January 3, 2018 21:38

Commas and header

78aaa0a

Merge branch 'master' into ppi_section

95ed5f3

Merge branch 'master' into ppi_section

32970e4

Remove accidental newline

9a658bc

Re-add PPI section reference

12f0ed8

", such as those discussed previously for PPI networks,"

agitter approved these changes Jan 4, 2018

View reviewed changes

agitter merged commit 75f0dc2 into greenelab:master Jan 4, 2018

agitter mentioned this pull request Jan 4, 2018

Authors since initial submission #561

Closed

7 tasks

This was referenced Jan 13, 2018

Drug discovery and chemical representation #774

Merged

Protein-Protein Interaction subsection #575

Closed

agitter mentioned this pull request Jun 14, 2018

Identify specific deep review example greenelab/meta-review#41

Closed
