Parser breaks if FASTA file does not contain > #144
Comments
Well, it's not a FASTA file without the description line. Are we talking about a file that starts with a semicolon (like this example)? In that case I could see adding support for FASTA comments. If we're talking about files that contain only sequence, with no comments or identifiers, I doubt there's an indexing strategy for them, since a multi-FASTA file would have no record separator between entries. If you can provide a bit more detail about how you'd like this supported, we can go from there. Thanks!
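To make the record-separator point concrete, here is a minimal sketch of splitting FASTA text into records. It is not pyfaidx's code; the function name `split_records` is hypothetical, and it assumes `>` starts a description line and `;` starts a comment line.

```python
def split_records(text):
    """Split FASTA text into (name, sequence) pairs, using '>' lines as separators."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and FASTA comment lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
        # Sequence lines seen before any '>' belong to no record and are dropped.
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

With no `>` line anywhere in the input, this sketch returns an empty list: nothing marks where one entry ends and the next begins.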
Ideally, it would parse it as normal. That being said, I understand if you don't think that supporting improperly formatted FASTA files is within the scope of this project, or even advisable (we recently realized that Biopython doesn't support it either and fails silently). If so, could we maybe add a specific warning if no
Adding better exceptions would definitely be great. Can I have an example of the file format in question?
Sure! The exact file in question can be viewed here. Basically instead of:
it was getting passed:
I've added a case for handling files with no valid description lines and pushed a new release (https://github.com/mdshw5/pyfaidx/releases/tag/v0.5.5.1) that should be on PyPI in a few minutes.
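For readers who want a similar guard in their own scripts, a check like the following fails loudly before a file ever reaches a parser. This is only a sketch under assumptions, not what the release above actually does; `FastaFormatError` and `check_has_header` are hypothetical names.

```python
class FastaFormatError(ValueError):
    """Raised when a file has no FASTA description line (hypothetical name)."""


def check_has_header(path):
    """Raise FastaFormatError unless at least one line in the file starts with '>'."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                return  # found a description line; nothing more to check
    raise FastaFormatError(
        f"{path!r} contains no description line starting with '>'; "
        "it does not look like a FASTA file"
    )
```

Calling `check_has_header("genome.fa")` before indexing turns a silent failure into an explicit error message.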
Thanks a ton!
Ok this is a weird one: I'm at a hackathon and they handed us "mystery genomes" that were FASTA files with the comment line removed. I tried to use pyfaidx (through squiggle) and got this error: