I am using the latest version of the pipeline (v1.0.3) and testing a large group of genomes from NCBI (highly variable in quality) with the profile module, using both PRO and BUSCO profiling.
For some genomes I got a Java exception as follows:
In this case, the genome GCA_017580835.1 had no errors with PRO profiling but the BUSCO profiling did not work.
A few genomes failed (with the same error) with the PRO profiling:
GCA_900068945.1
GCA_900068915.1
GCA_900069095.1
GCA_900068965.1
GCA_900068985.1
GCA_018221805.1
GCA_900068975.1
GCA_900068955.1
What does the error mean and do you have any idea about what is causing it?
Thanks a lot for your help and the amazing pipeline!
Best,
Brigida
Thank you for reaching out!
Based on the error message you provided, it appears that the result file of the fastBlockSearch run is corrupted or improperly formatted.
You can run the pipeline single-threaded with the --dev option to identify the problematic sub-command.
I attempted to reproduce the error using the assemblies associated with the accession numbers you provided.
However, I was able to successfully run both BUSCO profiling of GCA_017580835.1 and PRO profiling of GCA_900068945.1 without any errors on my system.
If possible, please provide the assembly files that caused the issue for further investigation.
One common feature I could find among these assemblies is that they contain a large number of extremely short DNA contigs.
This caused a significant slowdown on my system, and may be the cause of the error you reported.
My hypothesis is that rejecting FASTA entries shorter than a given length threshold (e.g., 1,000 bp) may resolve this issue.
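To test this hypothesis before re-running the profile module, the short contigs can be filtered out of an assembly first. The following is only a minimal sketch in plain Python and is not part of the pipeline; the function names `read_fasta` and `filter_short_contigs` and the default 1,000 bp cutoff are illustrative assumptions.

```python
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_short_contigs(src: TextIO, dst: TextIO,
                         min_length: int = 1000,
                         width: int = 60) -> Tuple[int, int]:
    """Copy contigs of at least min_length bases from src to dst.

    Returns (kept, dropped) record counts. The 1,000 bp default mirrors
    the example threshold above and is an assumption, not a tested value.
    """
    kept = dropped = 0
    for header, seq in read_fasta(src):
        if len(seq) < min_length:
            dropped += 1
            continue
        kept += 1
        dst.write(f">{header}\n")
        # Re-wrap the sequence at a fixed line width.
        for i in range(0, len(seq), width):
            dst.write(seq[i:i + width] + "\n")
    return kept, dropped
```

To use it, open the original assembly for reading and a new file for writing, pass both to `filter_short_contigs`, and run the profile module on the filtered file. If the exception disappears, the short contigs are the likely trigger.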