How can I use a GTF file to run QAPA? #4
Hi @Trandamere, Thank you for your interest in QAPA. Unfortunately, QAPA doesn't support GTF input at this time, but I may add a parser for it in the future. For the time being, you can get around this by converting your GTF file to genePred format using the Then, download and install the QAPA version from this branch. Be sure to add the option Also, you will need to convert your poly(A) site file into a standard BED format. Hope this helps,
@kcha
The result looks like this:
And the identifiers file:
I got an error when running qapa:
What's wrong with this? Thank you
Hi @Trandamere, Can you try two things:
1. QAPA filters for protein-coding transcripts. For your purposes, I think you can get around this by manually replacing "processed_transcript" with "protein_coding", like this:
2. QAPA requires chromosomes to begin with "chr" and skips random chromosomes. Can you modify your genePred file to include this and try again?
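Both workarounds in this reply can be scripted. The following is a minimal Python sketch and is not part of QAPA: the function names `relabel_biotype` and `add_chr_prefix` are hypothetical, and it assumes both files are tab-separated, with the chromosome name in the second column of the genePred file (`chrom_col=1`). Adjust the column index if your genePred layout differs.

```python
def relabel_biotype(lines, old="processed_transcript", new="protein_coding"):
    """Replace a biotype label wherever it appears as a whole tab-separated field."""
    out = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        fields = [new if field == old else field for field in fields]
        out.append("\t".join(fields))
    return out


def add_chr_prefix(lines, chrom_col=1, prefix="chr"):
    """Add a prefix to the chromosome column unless it is already present.

    chrom_col is an assumption (second column, zero-based index 1).
    """
    out = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > chrom_col and not fields[chrom_col].startswith(prefix):
            fields[chrom_col] = prefix + fields[chrom_col]
        out.append("\t".join(fields))
    return out


# Illustrative rows only; these are made-up values, not data from this thread.
identifier_rows = ["gene1\ttx1\tprocessed_transcript"]
genepred_rows = ["tx1\t1\t+\t100\t200"]

print(relabel_biotype(identifier_rows))
print(add_chr_prefix(genepred_rows))
```

To apply this to real files, read the lines with `open(path)`, pass them through the matching function, and write the result to a new file so the originals stay untouched.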
@kcha
But I get another error from qapa fasta:
The genome file looks like this:
But I get a FASTA error when I run bedtools:
Hi @Trandamere, This appears to be a bedtools warning and not one from QAPA. My guess is that, after adding the "chr" prefix to the chromosome names, you will have to do the same for your FASTA file. Sorry for the inconvenience. In the future, I will look into removing this restriction. The
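Making the FASTA headers match the prefixed chromosome names can be done with a short script. This is a minimal Python sketch under stated assumptions, not something QAPA provides: the function name `prefix_fasta_headers` is hypothetical, and it assumes the sequence name starts immediately after `>` on each header line.

```python
def prefix_fasta_headers(lines, prefix="chr"):
    """Add a prefix to FASTA header names that do not already start with it."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        # Header lines start with ">"; sequence lines are passed through unchanged.
        if line.startswith(">") and not line[1:].startswith(prefix):
            line = ">" + prefix + line[1:]
        out.append(line)
    return out


# Illustrative records only; these are made-up sequences, not data from this thread.
records = [">1 example header", "ACGT", ">chr2", "GGCC"]
print(prefix_fasta_headers(records))
```

Headers that already begin with the prefix are left alone, so running the function twice does not produce names like `chrchr1`. Any FASTA index built for the old file would need to be regenerated after renaming.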
@kcha
It works when the -fullHeader parameter is removed.
Yes, you may be right that -fullHeader is unnecessary, although I'm not sure why it wouldn't work if you included it.
@kcha
@Trandamere, as this is a bedtools error, I don't have a good sense of what's causing the problem. Can you send me your genome.fa via e-mail or a public URL so that I can try to replicate the problem on my end? The chr1 sequence you provided above seems too short, so I assume you posted a fake sequence?
@kcha
I was able to replicate the issue. Thanks for your help. I have removed the
@kcha
identifier file
quant.sf file1
quant.sf file2
- Update regex for matching Ensembl IDs and species
- Fix bug caused by input data coming from only one strand
@Trandamere, thank you for sharing your result. The issue is due to QAPA not recognizing the format of the gene IDs in your data. I have now fixed this to be more flexible. Please pull the latest changes and try again.
* Add support for custom genePred files and bypassing annotation step
* Add option to stop filtering of random chromosomes
  - chromosomes also don't need to begin with 'chr'
* Disable fullHeader
* More changes to support other species (#4)
  - Update regex for matching Ensembl IDs and species
  - Fix bug caused by input data coming from only one strand
* Update README and add helper script for installing R packages
* Restore indents
* Fix regex matching of multi-transcript cases
* Add optional --species option
* Minor checks
Hi:
I'm working on the transcriptome of a fungus. I have the transcriptome GTF files, like this:
And a polyAsite file:
Is this enough data to run QAPA?