Choices of upper_rank and tf_type #16
Hi! Thanks for this new question, happy to see that you are getting familiar with our tool.
Regarding the elbow analysis, theoretically you can increase the
For selecting the number of factors, try to use a realistic upper limit. We put just 25 since that's large enough for us. However, if you are willing to handle even more (considering that the more factors you use, the more factors you have to interpret), you can increase the
About the
I hope this is clear enough, otherwise let me know.
Erick
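To illustrate why the upper limit matters in an elbow analysis, here is a minimal sketch of one common elbow heuristic: pick the point that lies farthest below the chord joining the first and last points of the error curve. This is a generic heuristic, not necessarily what cell2cell does internally; the helper `find_elbow` and the synthetic 1/rank error curve are assumptions made only for illustration.

```python
def find_elbow(ranks, errors):
    """Return the rank farthest below the chord joining the curve's endpoints.

    Hypothetical helper (not part of cell2cell). Assumes at least two ranks
    and a decreasing curve whose first and last errors differ.
    """
    x0, x1 = ranks[0], ranks[-1]
    y0, y1 = errors[0], errors[-1]
    best_rank, best_gap = ranks[0], float("-inf")
    for r, e in zip(ranks, errors):
        # Rescale both axes to [0, 1] so neither dominates the comparison.
        x = (r - x0) / (x1 - x0)
        y = (e - y1) / (y0 - y1)
        # The chord runs from (0, 1) to (1, 0); the distance of a point
        # below it is proportional to 1 - x - y.
        gap = 1.0 - x - y
        if gap > best_gap:
            best_rank, best_gap = r, gap
    return best_rank


# Synthetic error curve (an assumption): error decays as 1 / rank.
for upper_limit in (10, 25, 50):
    ranks = list(range(1, upper_limit + 1))
    errors = [1.0 / r for r in ranks]
    print(upper_limit, find_elbow(ranks, errors))
```

On this synthetic curve the selected rank grows with the upper limit (3, 5, and 7 for limits of 10, 25, and 50), because the chord is redrawn every time the last point moves. Real error curves will behave differently, but the same dependence on the upper limit can show up.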
Many thanks, Erick. Your reply is very helpful, as always. The idea of elbow analysis based on the similarity of decompositions sounds promising. Actually I also tried to increase the rank a bit manually to compare the decompositions with those from auto rank. Looking forward to testing your new elbow analysis method in the future.
Hi @deepcompbio The elbow analysis based on similarity is now available in v0.6.2, in PR #17. The way to use it is to add the parameter
Also, if the curve looks odd, you can smooth it by passing the parameter
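As a rough picture of what smoothing a jagged curve can look like, here is a sketch using a centered moving average. It is a generic technique and not necessarily the smoothing that cell2cell applies; `moving_average`, `window`, and the sample values are hypothetical names and data chosen for illustration.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges of the curve."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def count_upticks(curve):
    """Count how many times the curve goes up from one point to the next."""
    return sum(b > a for a, b in zip(curve, curve[1:]))


# Synthetic jagged curve (an assumption): mostly decreasing, with small bumps.
noisy = [1.00, 0.60, 0.70, 0.40, 0.45, 0.30, 0.33, 0.25]
smooth = moving_average(noisy, window=3)

print("upticks before smoothing:", count_upticks(noisy))
print("upticks after smoothing: ", count_upticks(smooth))
```

A wider window removes more of the bumps but also flattens the curve, which can shift where an elbow appears.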
Hi @earmingol This is cool and fast. Many thanks for adding this functionality. I will give it a try. On a related topic, I have recently been experimenting with different ranks in factorization, from a dozen to a few hundred (until my GPU memory runs out). It seems that a higher number of factors yields a higher number of interesting ligand-receptor interactions (some of the interactions are known in the literature for the particular disease I'm studying). Thus the question is: how could one determine which rank is sufficient for factorization from the biological perspective? Thanks.
That's to be expected: the more factors you use, the better the resolution of your results. As we explained in this post, tensor decomposition approximates the original data while capturing the most prominent patterns. In that same post, if you think of the decomposition of a picture (Fig. 1d-e), you should get a reconstructed picture that looks like the original, and the resolution should improve as you increase the number of factors (in the example we used only 3 factors, but if we had used 100 factors instead, the reconstructed image would look much more similar to the original). That said, there is a trade-off that you have to deal with: how many factors you are willing to interpret vs. how much resolution you want. The elbow analysis helps to strike a decent balance between both, but it is not necessarily the most adequate way, and it's always up to you what criteria to use for selecting the number of factors. Hope this helps!
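The picture analogy can be sketched with a matrix instead of a tensor. The example below uses a truncated SVD, which is the matrix counterpart of a low-rank decomposition and not the tensor factorization the tool itself runs; the random 40 x 30 "image" and the chosen ranks are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "picture" (an assumption): a 40 x 30 matrix of random values.
image = rng.random((40, 30))

# Full SVD once; a rank-k reconstruction keeps only the first k components.
U, s, Vt = np.linalg.svd(image, full_matrices=False)


def relative_error(k):
    """Relative Frobenius error of the rank-k reconstruction of `image`."""
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return np.linalg.norm(image - approx) / np.linalg.norm(image)


ranks = [1, 3, 10, 30]
errors = [relative_error(k) for k in ranks]
for k, err in zip(ranks, errors):
    print(f"rank {k:>2}: relative error {err:.3f}")
```

The error shrinks as more components are kept and reaches numerical zero at full rank, which is the "resolution" side of the trade-off; the other side is that each extra component is one more pattern to interpret.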
@earmingol Thanks for your helpful answer.
Hi Erick,
The choice of upper_rank has a significant impact on the automatically determined rank: the two move in the same direction. How would one choose upper_rank for the elbow analysis and, if needed, a manual rank for the eventual factorization?
As for tf_type, there are four types. Apart from the fact that 'non_negative_cp_hals' does not allow a mask, how would one choose among the remaining three factorization methods?
Thanks.