calculate the LTR insertion time by EDTA #271

Closed
tinyfallen opened this issue May 30, 2022 · 11 comments
Labels
enhancement New feature or request

Comments

@tinyfallen

Hi dear developer,
LTR_retriever accepts the -u parameter for calculating LTR insertion time, but EDTA does not, so users have to recalculate it themselves. Personally, I'm afraid I may make mistakes in the calculation steps. Could you please add the -u parameter to EDTA in the next update?
Thanks a lot !
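
For anyone who needs to recalculate by hand in the meantime, the arithmetic is short. The following is a minimal sketch, assuming the insertion time is estimated as T = K / (2u), where K is the Jukes-Cantor-corrected divergence between the two LTRs of an element and u is the neutral mutation rate per bp per year. The function names are hypothetical and are not part of EDTA or LTR_retriever. Under that assumption, an age computed with one rate can be converted to another rate by multiplying by old_rate / new_rate.

```python
import math


def insertion_time(ltr_identity, rate):
    """Estimate insertion time in years from LTR identity (a 0-1 fraction).

    Assumes T = K / (2 * rate), with K the Jukes-Cantor-corrected divergence.
    """
    if not 0.25 < ltr_identity <= 1.0:
        raise ValueError("identity must be in (0.25, 1] for the Jukes-Cantor correction")
    if rate <= 0:
        raise ValueError("mutation rate must be positive")
    d = 1.0 - ltr_identity                      # raw divergence between the two LTRs
    k = -0.75 * math.log(1.0 - 4.0 * d / 3.0)   # Jukes-Cantor correction
    return k / (2.0 * rate)


def rescale_insertion_time(old_time, old_rate, new_rate):
    """Convert an age computed with old_rate into the age implied by new_rate."""
    if old_rate <= 0 or new_rate <= 0:
        raise ValueError("mutation rates must be positive")
    return old_time * old_rate / new_rate
```

For example, `rescale_insertion_time(1_000_000, 1.3e-8, 2.6e-8)` halves the age, because doubling the assumed mutation rate halves the time needed to accumulate the same divergence. The rates here are example values only.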

@oushujun oushujun added the enhancement New feature or request label Jun 6, 2022
@oushujun
Owner

oushujun commented Jun 6, 2022

Hello @tinyfallen,

This is a good suggestion. I will add -u in the next release.

Shujun

@Chriswinefield

Hi Shujun,
Related to this, I was wondering whether the EDTA output could be fed back into LTR_retriever with the -u flag to calculate insertion times, or whether there is a suitable intermediate output file that we could use.

Like @tinyfallen, I think having this flag in EDTA would be fantastic. Can't wait for the next release :)

Many thanks, Chris

@oushujun
Owner

Hi Chris,

Thank you for reminding me. Yes, this is on my to-do list and I will try to get it done soon.

Best,
Shujun

@Chriswinefield

Chriswinefield commented Jun 16, 2022 via email

@oushujun
Owner

Hi Chris,

You can run EDTA_raw.pl with the --type ltr parameter; then you will essentially just be running LTR_retriever. You may also use the *scn files located in the *raw/LTR/ directory to run LTR_retriever with -inharvest.

Shujun
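
The second route above can be sketched as a small helper that locates the *scn files and assembles an LTR_retriever command line. This is only an illustration under stated assumptions: the helper names, the genome file name, and the example rate are hypothetical, and the flag spellings (-genome, -inharvest, -u, -threads) should be checked against `LTR_retriever -h` for the installed version. The sketch only builds and prints the command; it does not run anything.

```python
import glob
import os
import shlex


def find_scn_files(workdir="."):
    """Return *scn files under any *raw/LTR/ directory inside workdir."""
    pattern = os.path.join(workdir, "*raw", "LTR", "*scn")
    return sorted(glob.glob(pattern))


def build_ltr_retriever_command(genome, scn_file, mutation_rate, threads=4):
    """Assemble an LTR_retriever argument list (assumed flags, see lead-in)."""
    return [
        "LTR_retriever",
        "-genome", genome,
        "-inharvest", scn_file,
        "-u", str(mutation_rate),   # neutral mutation rate, per bp per year
        "-threads", str(threads),
    ]


if __name__ == "__main__":
    # "genome.fa" and 7e-9 are placeholder values for illustration only.
    for scn in find_scn_files("."):
        cmd = build_ltr_retriever_command("genome.fa", scn, 7e-9)
        print(shlex.join(cmd))
```

Printing the command first makes it easy to inspect the paths and the rate before launching a long run.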

@oushujun
Owner

This parameter (--u) has been added to the latest repo and release. Please allow a couple of days for it to be reflected in the conda recipe. Please let me know if you have more questions.

@Chriswinefield

Chriswinefield commented Jun 23, 2022 via email

@tinyfallen
Author

Dear teacher,

That's so fantastic!

Actually, I have run into some problems with TE classification recently.

I found that almost half of the final library is labelled Unknown, which seems to come from the RepeatModeler step. So I fed the library separately to TEsorter and deepTE for classification, and some of the resulting classifications contradict each other or EDTA's original result.

Here are my questions:

  1. Is it necessary to further classify the unknowns?

  2. The RepeatMasker .out file contains many overlaps between repeats of different types. In EDTA's -anno result, however, the total is consistent with the sum of the percentages of all types. Could you please tell me how EDTA deals with the overlaps, or what criteria determine the classification of an overlapped genomic region?
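
To illustrate why overlaps matter for such summaries, here is a minimal sketch on toy coordinates. It is not EDTA's method and does not read any EDTA or RepeatMasker file; the intervals, type names, and function names are hypothetical. It shows that when annotations of different types overlap, the per-type base pairs sum to more than the merged total, so the two numbers only agree once overlaps have been resolved.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_bp(intervals):
    """Count base pairs covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


# Toy annotations (start, end, type); LTR and DNA overlap on 80-100.
annotations = [(0, 100, "LTR"), (80, 150, "DNA"), (200, 260, "LTR")]

per_type = {}
for start, end, te_type in annotations:
    per_type.setdefault(te_type, []).append((start, end))

# Summing each type separately counts the overlapped bases twice.
sum_of_types = sum(covered_bp(spans) for spans in per_type.values())
# Merging everything first counts each base once.
total = covered_bp([(start, end) for start, end, _ in annotations])

print(sum_of_types, total)
```

In this toy case the per-type sum exceeds the merged total by exactly the 20 bp shared between the LTR and DNA annotations.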

Maybe I should open a new issue for these questions, or you could just give me some brief suggestions. Thanks!

Best~

@tinyfallen
Author

tinyfallen commented Jun 23, 2022

Dear teacher,

I saw the discussion in issue #98 and found genome.mod.EDTA.TEanno.split.gff3, so I can use this file for analyses that exclude overlaps. Detailed descriptions of the outputs could be added to the wiki or README if you would like.

I would still like to know how this is accomplished and whether it is necessary to further classify the unknowns.

I look forward to a more powerful and comprehensive EDTA!

Thanks for your excellent work!

@oushujun
Owner

Hi @tinyfallen,

Thank you for your suggestions. I have added the following Q&As to the wiki (https://github.com/oushujun/EDTA/wiki/Making-sense-of-EDTA-usage-and-outputs---Q&A):

  1. What's the difference between different GFF files?

  2. How to summarize TE annotation in my genome?

For unknown TEs, it's always recommended to classify them as completely and accurately as possible, but this is challenging. You may use TEsorter, deepTE, or other tools to do so, and then use --curatedlib to provide the updated library. Always keep in mind that there will be some misclassification.

Best,
Shujun

@tinyfallen
Author

Many thanks dear teacher!

Best~
