Hi Shujun,
Thanks for creating EDTA, and keep up the great work. I wanted to share some observations from my analysis. I have been running EDTA on several red raspberry assemblies and have noticed an unusually high percentage of LTR/unknown repeats. Interestingly, most of these repeats seem to be concentrated in the centromeric regions. This trend is consistent across more than 100 genome assemblies.
Could you please share your insights on why I am seeing such a high percentage of LTR/unknown repeats? I have attached the .sum output and the TE density plot to this issue for your reference.
Attachment: EDTA.TEanno.density_plots.pdf
Repeat Classes
==============
Total Sequences: 11
Total Length: 287904378 bp

Class                  Count        bpMasked     %masked
=====                  =====        ========     =======
LINE                   --           --           --
    I                  483          427336       0.15%
    L1                 5010         2376953      0.83%
LTR                    --           --           --
    Copia              11758        11801425     4.10%
    Gypsy              14344        20919798     7.27%
    unknown            86279        67771835     23.54%
SINE                   --           --           --
    tRNA               382          40106        0.01%
TIR                    --           --           --
    CACTA              9395         4295459      1.49%
    Mutator            19391        5551703      1.93%
    PIF_Harbinger      14354        4224540      1.47%
    Tc1_Mariner        343          118358       0.04%
    hAT                12807        4413702      1.53%
nonLTR                 --           --           --
    pararetrovirus     27           24109        0.01%
nonTIR                 --           --           --
    helitron           34547        14855621     5.16%
repeat_fragment        19072        5125036      1.78%
total interspersed     228192       141945981    49.30%
snRNA                  78           8331         0.00%
Total                  228270       141954312    49.31%
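To put the LTR/unknown figure in context, here is a small sketch that recomputes its share from the numbers in the summary above. The values are copied from the table; the variable and function names are only illustrative and are not part of EDTA.

```python
# Values copied from the .sum table in this issue.
TOTAL_LENGTH_BP = 287_904_378      # "Total Length"
TOTAL_MASKED_BP = 141_954_312      # "Total" row, bpMasked
LTR_MASKED_BP = {
    "Copia": 11_801_425,
    "Gypsy": 20_919_798,
    "unknown": 67_771_835,
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


ltr_total_bp = sum(LTR_MASKED_BP.values())
unknown_bp = LTR_MASKED_BP["unknown"]

share_of_genome = pct(unknown_bp, TOTAL_LENGTH_BP)   # matches the %masked column
share_of_ltr = pct(unknown_bp, ltr_total_bp)         # share of all LTR bp
share_of_masked = pct(unknown_bp, TOTAL_MASKED_BP)   # share of all masked bp

print(f"LTR/unknown: {share_of_genome}% of genome, "
      f"{share_of_ltr}% of LTR bp, {share_of_masked}% of masked bp")
```

By these numbers, LTR/unknown accounts for about two thirds of all LTR-annotated bp and just under half of everything masked, which is why it stands out.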
Thanks for sharing your results. One possible reason is that these LTR/unknown annotations are really centromeric repeats that have been misannotated as LTR elements. You will need to curate them to be sure. I highly recommend TEtrimmer for this purpose; its paper has more information on tool usage and interpretation. The other possibility is that they are real LTR elements with unknown classification. Again, curation will tell you more.
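Before curating, it may help to quantify how strongly the LTR/unknown annotations cluster along each chromosome. The sketch below is only a rough illustration, not part of EDTA or TEtrimmer. It assumes the annotation is a standard 9-column GFF3 (for example the `*.EDTA.TEanno.gff3` file) whose attribute column carries a `Classification=LTR/unknown` tag; check the key name against your own file. The function name and the window size are hypothetical choices.

```python
from collections import defaultdict


def unknown_ltr_bp_per_window(gff_lines, window=1_000_000, label="ltr/unknown"):
    """Sum annotated bp of one classification per (seqid, window index).

    Assumptions: 9-column GFF3; the classification sits in an attribute
    named "Classification" (matched case-insensitively). Each feature is
    assigned whole to the window containing its start, and overlapping
    features are counted twice, so the totals are approximate.
    """
    totals = defaultdict(int)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip malformed rows
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        tags = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        cls = next((v for k, v in tags.items() if k.lower() == "classification"), "")
        if cls.lower() != label:
            continue
        totals[(seqid, (start - 1) // window)] += end - start + 1
    return dict(totals)


# Made-up example rows (not real data), just to show the expected input shape.
example = [
    "##gff-version 3",
    "chr1\tEDTA\tLTR_retrotransposon\t101\t600\t.\t+\t.\tID=a;Classification=LTR/unknown",
    "chr1\tEDTA\tLTR_retrotransposon\t1000001\t1000100\t.\t-\t.\tID=b;Classification=LTR/unknown",
    "chr1\tEDTA\tLTR_retrotransposon\t200\t300\t.\t+\t.\tID=c;Classification=LTR/Gypsy",
]
print(unknown_ltr_bp_per_window(example))
```

For a real file, pass an open file handle as `gff_lines`. Windows with very high LTR/unknown bp that coincide with the centromeric peaks in the density plot would be the first candidates to pull out for curation.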