Split genome into several sequences #142
Comments
Good idea. You can check the EDTA_raw code to find the checkpoints and
mock the raw results before each checkpoint; the program will then pick them up as
if they were generated in a single run. Let me know if it works!
Best,
Shujun
On Mon, Dec 28, 2020 at 8:24 PM Zea1nfO wrote:
Hi Shujun,
For various reasons, I would like to do TE annotation this way:
a) split the large genome into several sequences
b) run EDTA_raw.pl on each sequence for each type (ltr, helitron, tir)
c) combine the raw results of all sequences to produce the whole-genome
raw results for all three types
d) then run EDTA.pl to finish the whole-genome TE annotation.
Would this work?
Thanks a lot.
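Step a) of the plan above can be sketched in Python. This is only a minimal illustration, not part of EDTA: the function names (`read_fasta`, `batch_records`) and the `max_bases` threshold are assumptions made up for this example. It groups whole sequences into batches without reordering them, which matters because later steps may depend on sequence order.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def batch_records(records: Iterable[Record], max_bases: int) -> List[List[Record]]:
    """Group whole sequences into batches of at most max_bases bases.

    Sequences are never cut or reordered; a sequence longer than
    max_bases simply gets a batch of its own.
    """
    batches: List[List[Record]] = []
    current: List[Record] = []
    size = 0
    for header, seq in records:
        if current and size + len(seq) > max_bases:
            batches.append(current)
            current, size = [], 0
        current.append((header, seq))
        size += len(seq)
    if current:
        batches.append(current)
    return batches
```

Each batch can then be written to its own FASTA file, for example by passing `open("genome.fa")` (a hypothetical file name) to `read_fasta` and writing one file per batch.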
Hi Shujun, In addition, I previously ran EDTA_raw.pl (ltr) on a 10 Mb sequence on an SGE system without any error message. This time I tried it on a sequence of about 200 Mb, and the error message appeared. Do you have any idea how to solve it?
There seems to be a fork issue. Try lowering the CPU number to avoid
exhausting system resources. Also, I checked the LTR_retriever code: it
requires the order of the input sequences (-genome) to match the order of
the candidate sequences (-inharvest), so providing separate runs of
LTRharvest/LTR_FINDER to LTR_retriever may confuse the program and cause
errors. You may run LTR_retriever separately on these batches and
concatenate their results to mock EDTA_raw.
Best,
Shujun
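The ordering constraint and the concatenation step described above can be sketched as follows. This is a generic illustration only: it assumes you have already extracted sequence IDs from the genome and from the candidate list, and it does not parse any LTR_retriever or EDTA file format. The function names are hypothetical.

```python
from typing import Iterable, Sequence


def is_in_genome_order(genome_ids: Sequence[str], candidate_ids: Iterable[str]) -> bool:
    """Return True if candidate IDs appear in the same relative order as the genome.

    Repeated IDs are allowed (several candidates on one sequence);
    an ID missing from the genome counts as a mismatch.
    """
    rank = {seq_id: i for i, seq_id in enumerate(genome_ids)}
    last = -1
    for seq_id in candidate_ids:
        if seq_id not in rank or rank[seq_id] < last:
            return False
        last = rank[seq_id]
    return True


def concat_files(paths: Sequence[str], out_path: str) -> None:
    """Concatenate per-batch result files in the given order.

    The caller is responsible for listing the batch files in genome order.
    """
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as fh:
                text = fh.read()
            out.write(text)
            # Keep records from different batches on separate lines.
            if text and not text.endswith("\n"):
                out.write("\n")
```

Running the order check on the merged candidates before handing them to a downstream step is a cheap way to catch a batch that was concatenated out of order.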
On Thu, Dec 31, 2020 at 4:38 PM Zea1nfO wrote:
Hi Shujun,
I have tested the method mentioned above on an SGE system, but I ran into
an error.
After I submitted the job, EDTA_raw.pl (ltr) ran fine at first.
However, when it reached the LTR_retriever step, an error appeared:
sh: fork: retry: No child processes
Can't fork, trying again in 5 seconds at ${path to conda}/anaconda3/envs/edta/share/LTR_retriever/bin/align_flanking.pl line 76.
sh: fork: Resource temporarily unavailable
Do you have any idea how to solve it?
Thanks a lot.
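A "fork: Resource temporarily unavailable" message is commonly associated with hitting a per-user process limit, so one thing worth checking on the compute node is that limit. The sketch below reads it with Python's standard `resource` module (Unix only) and derives a conservative thread count. The heuristic in `capped_threads`, including the `procs_per_thread` and `reserve` values, is purely an assumption for illustration and is not taken from EDTA or LTR_retriever.

```python
import resource
from typing import Optional


def nproc_soft_limit() -> Optional[int]:
    """Return the soft per-user process limit, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return None if soft == resource.RLIM_INFINITY else soft


def capped_threads(requested: int, soft_limit: Optional[int],
                   procs_per_thread: int = 4, reserve: int = 64) -> int:
    """Reduce a requested thread count so that child processes fit the limit.

    procs_per_thread and reserve are illustrative guesses: each worker is
    assumed to spawn a few child processes, and some slots are left for
    the shell and the scheduler.
    """
    if soft_limit is None:
        return requested
    budget = max(soft_limit - reserve, procs_per_thread)
    return max(1, min(requested, budget // procs_per_thread))


if __name__ == "__main__":
    limit = nproc_soft_limit()
    print("soft process limit:", limit)
    print("suggested threads for a request of 36:", capped_threads(36, limit))
```

The resulting number is only a starting point for the CPU setting; the real number of processes spawned per thread depends on the tools being run.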
OK, I will try it as you advised.
Did you successfully run a normal LTR_retriever job without splitting the genome before? You need to confirm that the program runs properly before trying other experiments. Also, I would suggest splitting a small genome (e.g., Arabidopsis) into two and testing on those files first, before using your large genome files. Best,
Hello, Thanks for reporting the errors. They seem to be random and quite rare, and as you mentioned, results are still generated despite them. Because these errors occur on a per-sequence basis, they will not have a large impact on the overall annotation unless they occur in all sequences. I will leave them for the moment unless more reports show up. For your split-genome experiments, can you describe your process in more detail? Also, testing locally on an interactive node may help to find the cause. Best,
Please find more discussions in #175.
Hi putao,
There aren't big differences in the later updates, or at least you won't
see big differences most of the time if the improvements do not apply to
your case. Please check the release notes for more details. If you
can reproduce an error with the latest version, I can take a look at
your case.
Best,
Shujun
On Fri, May 21, 2021 at 11:39 AM C-grapes wrote:
Hi Shujun,
I tried EDTA_raw.pl with a 10 Mb genome, and there was no error.
But when I tried it with a larger genome (such as 100 Mb), the error appeared.
I am sure that I gave it enough memory when I submitted the job on
the SGE system.
Besides, I encountered two strange errors recently:
a) "Use of uninitialized value $lLTR_length in string ne at ${path to
EDTA}/EDTA-1.9.6/util/rename_LTR_skim.pl line 29, line 20330."
I see this error in v1.9.4 and v1.9.6, but it does not seem to
prevent the whole process from generating results.
b) "Thread 27 terminated abnormally: substr outside of string at ${path
to EDTA}/EDTA-1.9.6/util/cleanup_nested.pl line 190." (new in v1.9.6)
It seems that the new cleanup_nested.pl has some flaws.
Sorry for so many questions; I hope they help make EDTA better.
Thanks a lot.
Hi Shujun,
I ran into the same problem as mentioned here. I installed
noarch/edta-1.9.6-0.tar.bz2 and edta-1.9.6-hdfd78af_2.tar.bz2 through
conda. When I first ran the package
edta-1.9.6-hdfd78af_2.tar.bz2, it went smoothly without any errors. Later,
I switched to the 1.9.6.0 conda environment once and then switched back to
the 1.9.6.2 conda environment. Now, when running, I always get this
error:
Thread 16 terminated abnormally: substr outside of string at
${path}/test2_edta-1.9.6-hdfd78af_2/share/EDTA/util/cleanup_nested.pl
line 190.
Use of uninitialized value $seq_new in substr at
${path}/test2_edta-1.9.6-hdfd78af_2/share/EDTA/util/cleanup_nested.pl
line 190.
I don't know where the problem is. Also, is there a big difference between
these two packages? Their results did not seem to be very different.
Best wishes!
Putao
Thank you for your reply. I'll try it as you said.