No entry in lineage_report.csv for very short inputs #409
Comments
Hi, Thank you
It's unexpected, thanks for flagging! I'll follow up and see why this is happening.
Looking at the logs, sequences with an N content of 1.0 are dropped fairly early, before sequences are hashed. In a test run with 7 samples, two of which have an N content of 1.0, pangolin reports
So I think I've fixed this now; just making sure tests pass on the branch!
Resolved in pangolin v4.0.3!
Thank you @aineniamh, it works now.
Works perfectly! Thanks @aineniamh.
Hello!
Thanks for all your work developing this great tool! I've been testing V4 and noticed that when I give pangolin V4 a short sequence, it gets filtered out at the preprocessing align_to_reference step, resulting in a lineage_report.csv that contains only the header line. Would it be feasible to add a warning when this happens?
Below is a walkthrough for replicating the issue. Thanks for your help!
My versions:
Example short fasta (the first 6 lines of pangolin/tests/test-data/sequence1.fasta):
Running Pangolin
Gives a lineage_report.csv that looks like:
Here's the mapped.sam contents:
And the file sizes: