I am interested in exploring alignments and gene trees of the HOG orthogroups produced by OrthoFinder to evaluate / spot check OrthoFinder output. I am also interested in using HOG alignments and gene trees for species tree generation using other tools. It seems that HOG orthogroup alignments and trees are not available in the OrthoFinder output, whereas OG orthogroup alignments and trees are. Is this correct?
As further background:
My understanding of OrthoFinder OG vs HOG orthogroups is that OGs are initially produced in the pipeline, and that a given OG may include two paralogous gene families due to over-clustering by OrthoFinder. HOGs are produced at a later stage, wherein OrthoFinder goes back, identifies over-clustered OG orthogroups, and splits them into separate, fully orthologous orthogroups. This updated set of orthogroups (HOGs) consists of 1) previously correctly clustered OG orthogroups plus 2) previously over-clustered and then split OG orthogroups.
HOG orthogroups are found in the N0.tsv file and, as I understand it, they are recommended for phylogenetic analysis, as they are the highest-quality OrthoFinder orthogroups, containing gene families of strictly orthologous genes.
Going through the output of OrthoFinder2 and now the new OrthoFinder3, I can locate OG orthogroup alignments in the Working Directory and OG orthogroup trees in the Resolved_Gene_Trees directory. I am not locating any alignments or gene trees for the HOG orthogroups. Is it correct that no HOG orthogroup alignments or trees are produced by OrthoFinder, due to how the pipeline works? If they are not produced, would it be possible to include an option to have them produced in the future?
Thank you very much :) Eric
This will be changing in the full release of OrthoFinder3 (out in the next few months!): we will be reporting gene trees, alignments, sequence files, etc. for the N0 hierarchical orthogroups.
In the meantime, these can be made quite easily if you need a specific one: the N0.tsv file tells you which node of the gene tree each HOG comes from, and you can use this information to trim the tree. You can also use the identities of the genes to trim the alignment and sequence files.
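As a rough illustration of the alignment-trimming step, here is a minimal sketch in plain Python. It is not OrthoFinder code. It assumes that N0.tsv is tab-separated with the columns `HOG`, `OG`, `Gene Tree Parent Clade`, followed by one column per species holding comma-separated gene names, and that alignment headers are either the bare gene name or `Species_gene`; check both against your own files. The function names are made up for this example.

```python
import csv


def hog_genes(n0_path, hog_id):
    """Return (parent_clade, names) for one HOG row of N0.tsv.

    Assumed layout: HOG, OG, Gene Tree Parent Clade, then species columns.
    """
    with open(n0_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        for row in reader:
            if row and row[0] == hog_id:
                names = set()
                for species, cell in zip(header[3:], row[3:]):
                    for gene in (g.strip() for g in cell.split(",")):
                        if gene:
                            # Accept both header styles (an assumption).
                            names.add(gene)
                            names.add(f"{species}_{gene}")
                return row[2], names
    raise KeyError(hog_id)


def read_fasta(path):
    """Read a FASTA file into {header_first_word: sequence}."""
    records, name = {}, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]
                records[name] = []
            elif name is not None and line:
                records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def subset_alignment(records, names):
    """Keep only the wanted sequences, then drop columns that are all gaps."""
    kept = {key: seq for key, seq in records.items() if key in names}
    if not kept:
        return kept
    length = min(len(seq) for seq in kept.values())
    columns = [i for i in range(length)
               if any(seq[i] != "-" for seq in kept.values())]
    return {key: "".join(seq[i] for i in columns) for key, seq in kept.items()}
```

The parent clade returned by `hog_genes` is the node label you would prune the OG gene tree at, using whichever tree library you prefer. Note that the subset alignment keeps the column homology of the original OG alignment; it is not a realignment.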
That will be great to have included in the output! Leveraging the gene tree node indicated in N0.tsv is a good idea. However, alignments, and therefore the trees built from them, can be sensitive to which sequences are included, especially for orthologous but divergent sequences like the ones I am working with in deep evolution, so it could be better to do fresh alignments of any HOGs that result from an OG being split.
For now I am just doing all HOGs fresh to make sure the same settings are used in MAFFT and FastTree for all sequences. Or is there a way to know the specific settings used to build the OG alignments and trees within OrthoFinder? Then I could potentially run fewer HOG tree pipelines and just do the subset that results from an OG split.
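For anyone wanting to restrict the re-run to split OGs, one possible sketch is below. It assumes the same N0.tsv layout as above (`HOG` in the first column, `OG` in the second) and treats an OG as "split" when it appears in more than one N0 row. That is only a heuristic: an OG with a single HOG may still have lost genes. The MAFFT and FastTree invocations are generic defaults (`mafft --auto`, plain `FastTree` for protein input), not the settings OrthoFinder itself uses, which are not stated in this thread. The executable names and all function names are assumptions.

```python
import csv
import subprocess
from collections import defaultdict


def hogs_from_split_ogs(n0_path):
    """Map each OG that has more than one N0 row to its list of HOG IDs."""
    by_og = defaultdict(list)
    with open(n0_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)  # skip the header row
        for row in reader:
            if len(row) >= 2:
                by_og[row[1]].append(row[0])
    return {og: hogs for og, hogs in by_og.items() if len(hogs) > 1}


def pipeline_commands(fasta_path, aln_path, tree_path):
    """Return (argv, stdout_path) pairs; both tools write results to stdout."""
    return [
        (["mafft", "--auto", fasta_path], aln_path),
        (["FastTree", aln_path], tree_path),
    ]


def run_pipeline(fasta_path, aln_path, tree_path):
    """Run the two steps in order, redirecting stdout to the output files."""
    for argv, out_path in pipeline_commands(fasta_path, aln_path, tree_path):
        with open(out_path, "w") as out:
            subprocess.run(argv, stdout=out, check=True)
```

Using one `pipeline_commands` function for every HOG keeps the settings identical across all runs, which is the point of redoing them fresh.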