How to run these batch scripts? #2
Hi Ray, Sorry about the delay in responding, and for the confusion around getting HYPHY to run correctly. The issue seems to be with setting the paths to your input files so that HYPHY can find them. Here I describe a simple example setup that you can follow to familiarize yourself with HYPHY's path structure. First, make sure you are in the hyphy directory. This is the directory where the HYPHYMP executable resides if hyphy was installed successfully (i.e., through make install). Let's call this directory /path/hyphy (please change /path to the specific one on your system). Second, make a directory called test_run in /path/hyphy that holds the batch file you want to run and a test alignment file (e.g., the BranchSites_delta_null.bf batch file and the knownGene.uc001hmo.1.1.testHyPhyBS alignment file from GitHub). Third, go to the test_run directory and run HYPHYMP from there. Make sure the BranchSites_delta_null.bf file has the correct paths to the input alignment file and to chooseGeneticCode.def. If you followed this example, those paths should look like: filepath = "knownGene.uc001hmo.1.1.testHyPhyBS"; Note that the path in LoadFunctionLibrary is relative to where you are running HYPHYMP from. So in our example setup, from the test_run directory you need to go two levels up to get to the hyphy directory ("../../hyphy"). I hope this is clear and that you can get some examples to work correctly! Let me know if you have any additional issues with this. |
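The path setup described above can be checked before launching HYPHYMP. The following is a minimal Python sketch and is not part of the repository: `find_missing_inputs` is a hypothetical helper, and it assumes the batch file assigns its alignment with a line of the form `filepath = "...";` as quoted in this comment. It resolves each such path against the directory HYPHYMP will be run from and reports the ones that do not exist.

```python
import re
from pathlib import Path

# Assumption: the batch file sets its input with a line like
#   filepath = "knownGene.uc001hmo.1.1.testHyPhyBS";
FILEPATH_RE = re.compile(r'filepath\s*=\s*"([^"]+)"\s*;')


def find_missing_inputs(batch_text, run_dir):
    """Return the filepath values in batch_text that do not exist under run_dir."""
    run_dir = Path(run_dir)
    missing = []
    for rel_path in FILEPATH_RE.findall(batch_text):
        # Relative paths are resolved against the directory HYPHYMP is run from.
        if not (run_dir / rel_path).exists():
            missing.append(rel_path)
    return missing
```

For the example layout, calling `find_missing_inputs(Path("BranchSites_delta_null.bf").read_text(), ".")` from inside test_run should return an empty list if the alignment file sits next to the batch file.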
Also, regarding your question on specifying foreground and background branches: There's a line that specifies the foreground branch in BranchSites_delta.alt.bf that needs to be edited to specify a foreground branch of interest: ExecuteCommands ("givenTree."+"hg18"+".nonSynRate:=omega_FG*givenTree."+"hg18"+".synRate;"); refers to the hg18, or human branch in the example alignment file "knownGene.uc001hmo.1.1.testHyPhyBS". Please change "hg18" to whatever foreground branch you are interested in to estimate omega_2. You would also make this change in the BranchSites_delta.null.bf. Note that foreground omega (or omega_2) is only defined for the selection model. |
Dear Aarti, Congratulations on your paper and thank you for the detailed example! I was confused because I thought the script accepted parameters from the command line. I am trying to run it on ~13000 genes, so it's a bit difficult to change the source file every time. About specifying the foreground branch: what do I need to do if I want to specify a whole clade (including multiple tips and internal branches) instead of a single species as the foreground? In CodeML you can mark the tree directly using the "#1" and "$1" notations. Is it possible to implement similar functionality in your scripts? Alternatively, do I need to name all the branches in the tree and refer to them by name in the "ExecuteCommands" statement? Either way, is it possible to provide a working example? Thanks for your help! Ray |
Hi Ray, Sorry, right now we don't have the functionality for accepting arguments from the command line. The workaround is to write a script that changes the name of the input file and the foreground branch for each gene you are interested in. Also, we have only tested our batch file on a single-species foreground branch; the batch file we provide is a modification of the YangNielsenBranchSite.bf that is available with the hyphy package, designed for this purpose. Aarti |
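One way to script this is sketched below in Python. It is an illustration under stated assumptions, not a tested part of the repository: `customize_batch` and `write_batches` are hypothetical names, and the code assumes the template batch file contains a `filepath = "...";` assignment and the quoted branch name `"hg18"`, as shown earlier in this thread. It writes one modified copy of the batch file per alignment.

```python
import re
from pathlib import Path

# Assumption: the template contains these two patterns, as quoted in this thread.
FILEPATH_RE = re.compile(r'filepath\s*=\s*"[^"]*"\s*;')
TEMPLATE_BRANCH = "hg18"  # foreground branch used in the example batch file


def customize_batch(template_text, alignment, foreground):
    """Return a copy of the batch file text pointing at one gene and branch."""
    # A function replacement avoids backslash escapes being interpreted by re.
    text, n = FILEPATH_RE.subn(
        lambda m: f'filepath = "{alignment}";', template_text
    )
    if n == 0:
        raise ValueError("no filepath assignment found in template")
    # Only the quoted branch name is replaced, so other text is left alone.
    return text.replace(f'"{TEMPLATE_BRANCH}"', f'"{foreground}"')


def write_batches(template_path, alignments, foreground, out_dir):
    """Write one batch file per alignment into out_dir and return their paths."""
    template_text = Path(template_path).read_text()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for alignment in alignments:
        out_path = out_dir / (Path(alignment).name + ".bf")
        out_path.write_text(customize_batch(template_text, alignment, foreground))
        written.append(out_path)
    return written
```

The alignment paths written into each copy are still interpreted relative to the directory HYPHYMP is run from, so the generated batch files would need to be run from a directory where those paths resolve.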
Dear Aarti,
thank you for the clarification. While the input file is an easy
fix, it would be nice if you can develop a version for multiple foreground
branches.
Previous CodeML simulations have shown better performance with
more taxa; it would be interesting to know if this is also true for MNM.
Best Regards,
Ray
|
Dear Aarti, I read your paper "MNM cause false inferences of lineage-specific positive selection" and I wanted to apply the BS+MNM test of positive selection to my data using the batch file you provide in I am having problems running the data using a new version of Hyphy. My question is whether the available batch file can only be run using Hyphy version 2.2.6, or whether the version does not matter and I am probably not using the right path. Unfortunately, I cannot install this version on our computers due to an incompatibility. Do you have any suggestions? Best regards, Tibisay |
Hi Tibisay, |
Hi Aarti, Thanks for your reply. I will try to follow what you suggest. Hopefully it will work, and I will let you know. If it does not work, are you by chance going to test your code in a newer version of Hyphy? Thanks |
Hi Aarti, Sorry for the long message. We managed to run the code in a newer version of Hyphy, and also in an older version. We used your data to test whether it works, and we noticed that the parameter values do change slightly depending on the Hyphy version; however, it is difficult to know which version provides better values. We are not clear whether the batch file runs only once, or whether we need to run it 50 times to get the ML distribution and the median shown in Supplementary Figure 3. Or does a single run already represent the result of the 50 replicates, with the Lnl value representing the best fit among them? Please clarify. We were also not sure about the meaning of certain values in the results, as we could not find any explanation of the output file obtained after running the batch file. Could you please clarify where the branch lengths, the Lnl, and np are reported, for example after running the BS+MNM null test once? The results then showed global delta =value; after the line that says: Just to clarify if we are following the correct steps
Can this code be run on a concatenated data set, or can it only be run for individual genes? Thanks |
Hello,
I copied all bf files to the "TemplateBatchFiles" folder under the hyphy 2.2.6 installation folder. I modified the relative path to ./TemplateModels/chooseGeneticCode.def in the bf scripts.
However when I ran the script, it just shows an error:
Error:
Could not find source dataset file:filepath Path stack: {/beegfs/group_dv/software/source/hyphy-2.2.6/installed/lib/hyphy/,/beegfs/group_dv/software/source/hyphy-2.2.6/installed/lib/hyphy/TemplateBatchFiles/}
Function call stack
1 : Read Data Set ds from file filepath
Segmentation fault
What is the correct way to run these files?
Best Regards,
Ray