initial version of spliced sequence retrieval #127
Thanks for the contribution. I like this idea very much, and it actually fits with some of my recent day-job work :).
I think that an iterable of intervals makes sense. The spliced sequence should always be on the same contig, so I like your current interface.
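To make the "iterable of intervals on one contig" idea concrete, here is a minimal sketch. It is not the code from this PR: `spliced_seq` is a hypothetical helper that works on a plain string, and it assumes 0-based, half-open `(start, end)` intervals that are sorted by start before joining.

```python
# Hypothetical helper for illustration only; not part of the pyfaidx API.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def spliced_seq(contig, intervals, rc=False):
    """Join 0-based, half-open (start, end) intervals taken from one contig string."""
    # Assumption: intervals are joined in genomic order, whatever order they arrive in.
    parts = [contig[start:end] for start, end in sorted(intervals)]
    seq = "".join(parts)
    if rc:
        # Reverse-complement the spliced result as a whole (minus-strand transcripts).
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq


contig = "AAACCCGGGTTT"
exons = [(0, 3), (6, 9)]
print(spliced_seq(contig, exons))           # AAAGGG
print(spliced_seq(contig, exons, rc=True))  # CCCTTT
```

Taking the contig once and the intervals as a separate iterable keeps all exons on the same sequence by construction, which is the property the interface relies on.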
I'm okay with some inconsistency. We can add
Certainly. I think we would need BED12 parsing to make this useful. I've also considered how to better integrate with gffutils. Currently there is a Let's get the tests passing and go from there!
I see that there is one test failing related to my work; I'll fix that. However, this is not something that I did, right?
I would be happy to add BED12 parsing. I'm not sure whether there's a good (lightweight?) module that we could use for that; otherwise I'll add it based on what I have for myself so far. It would indeed be great to have GFF/GTF support and to profit from @daler's work there.
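For reference, a lightweight BED12 parser needs very little code. The sketch below is an illustration under the standard BED conventions (0-based, half-open coordinates; `blockStarts` relative to `chromStart`), not the parser that ended up in any library. `bed12_to_intervals` is a hypothetical name.

```python
# Hypothetical helper for illustration only; not taken from pyfaidx or gffutils.
def bed12_to_intervals(line):
    """Return (chrom, strand, [(start, end), ...]) for one tab-separated BED12 line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected 12 tab-separated BED fields")
    chrom, chrom_start, strand = fields[0], int(fields[1]), fields[5]
    block_count = int(fields[9])
    # blockSizes and blockStarts are comma-separated and may end with a trailing comma.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError("blockCount does not match blockSizes/blockStarts")
    # Block starts are offsets from chromStart; convert them to contig coordinates.
    intervals = [(chrom_start + s, chrom_start + s + size)
                 for s, size in zip(starts, sizes)]
    return chrom, strand, intervals


record = "chr1\t100\t200\ttx1\t0\t-\t100\t200\t0\t2\t10,20,\t0,80,"
print(bed12_to_intervals(record))
# ('chr1', '-', [(100, 110), (180, 200)])
```

The returned intervals are already in the shape a spliced-sequence lookup would take, and the strand tells the caller whether to reverse-complement the result.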
I think master was broken before your contribution. I was working on figuring out how samtools faidx works with BGZF compressed files, see #126. I abandoned the work and never came back :). I'll fix up master so the other tests pass.
Alright, @simonvh. Thanks for the work on your end. After far too much messing around on my side, it looks like master is passing tests as well. If you're okay with it, I'll cut a new release with this PR merged. Just give a thumbs up if you're okay; otherwise I'll wait in case you want to add anything else.
@simonvh, @mdshw5 I'd love to have spliced sequence retrieval in gffutils. Like BED12 conversion (http://daler.github.io/gffutils/autodocs/gffutils.FeatureDB.bed12.html) it should actually be a method on
Hi Matt, thanks for an extremely useful module. One feature that would be useful (at least to me) is the ability to retrieve spliced sequences from a FASTA file. A use case would be converting BED12 records to sequences, for instance. This PR is an initial attempt at doing this. I would be happy to further discuss and adapt the code, naming, and style.
Some points:
get_seq()
doesn't have this. So at the moment this is not consistent.