build library for hg38 #22
Comments
Does your For the new PolyASite BED file, try using the option |
Thank you very much for your reply.
Weiyan |
Yes you can skip the polyA annotation file step. I don't have a hg38 version at this time. For deltaPPAU, it was based on the median of PAU. |
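As a rough illustration of the median-based calculation mentioned above, the sketch below takes proximal PAU values across replicates in two groups and reports the difference of the group medians. The function name `delta_ppau`, the sign convention (treatment minus control), and the sample values are assumptions made for illustration; this is not QAPA's own code.

```python
from statistics import median


def delta_ppau(control_pau, treatment_pau):
    """Return median(treatment) - median(control) for proximal PAU values.

    Both arguments are lists of PAU percentages, one value per replicate.
    The sign convention is an assumption made for this sketch.
    """
    if not control_pau or not treatment_pau:
        raise ValueError("each group needs at least one PAU value")
    return median(treatment_pau) - median(control_pau)


# Hypothetical replicate values for a single gene.
control = [62.0, 58.5, 60.1]
treatment = [41.2, 39.8, 45.0]
print(delta_ppau(control, treatment))
```

A negative value here would mean lower median usage of the proximal site in the treatment group.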
@kcha Thank you very much for all your help. |
Re-opening this issue as I will try to look into adding support for the hg38 version of polyAsite in a future release. |
I am having a different issue when trying to build an hg38 polyA site annotation. I used the mysql commands as provided to get the Ensembl gene metadata table from Biomart, the GENCODE gene prediction annotation table, and the GENCODE polyA sites track, along with the polyA site annotation from PolyASite. However, when I run the build command to create the 3' UTR library, the output file (output_utrs.bed) is empty. I tried replacing some of the int() calls with float() in the annotate.py script, but that did not resolve the issue. |
Hi @imcoleman, there is an issue with using PolyASite version 2 as the format changed. This will be addressed in an upcoming release that I'm slowly working on. Thanks! |
Support for PolyASite version 2 is now available. Please upgrade to the latest release (v1.3.0). Thanks! |
@kcha Thanks! I will try out v1.3.0. |
@kcha Just a quick comment that v1.3.0 has resolved the issue and I was able to process several groups of samples, human and mouse. Thanks! |
Hi there,
When I tried to build the library for hg38, I downloaded all the related data as instructed in the protocol.
Then I used the following command to build it:
qapa build -N --db ensembl_identifiers.txt gencode.basic.txt > output.utrs.bed
I found that the fourth column contains "hg19". When I used this file to extract sequences with the following command:
qapa fasta -f Homo_sapiens.GRCh38.dna.primary_assembly.fa output_utrs.bed output_sequences.fa
I got an empty output_sequences.fa file.
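One possible cause of an empty FASTA output, which is not confirmed anywhere in this thread, is a mismatch between the chromosome names in the BED file (for example `chr1`) and the sequence names in the genome FASTA (for example `1`). The sketch below is a quick check for that; the helper names `bed_chroms`, `fasta_chroms`, and `missing_chroms` are hypothetical and are not part of QAPA.

```python
def bed_chroms(bed_path):
    """Collect the chromosome names found in column 1 of a BED file."""
    chroms = set()
    with open(bed_path) as handle:
        for line in handle:
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chroms.add(line.split()[0])
    return chroms


def fasta_chroms(fasta_path):
    """Collect sequence names (first word after '>') from a FASTA file."""
    names = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names


def missing_chroms(bed_path, fasta_path):
    """Return BED chromosome names that have no matching FASTA record."""
    return bed_chroms(bed_path) - fasta_chroms(fasta_path)


# Example call with the file names used in this post:
# print(sorted(missing_chroms("output_utrs.bed",
#                             "Homo_sapiens.GRCh38.dna.primary_assembly.fa")))
```

If every BED chromosome shows up as missing, the two files most likely use different naming conventions.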
In addition, I tried to build the library using the new poly_A.bed downloaded from https://polyasite.unibas.ch/atlas#2 (hg38) and then ran
qapa build --db ensembl_identifiers.txt -o clusters.bed gencode_hg38.txt > output_utrs_2.bed
This returned an error. Error message was:
***** ERROR: Requested column 4, but database file /tmp/pybedtools.11y88d0u.tmp only has fields 1 - 0.
Please advise.
Thanks,
Weiyan
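The "only has fields 1 - 0" message reads as if bedtools was handed a file with no usable columns, such as an empty intermediate file; that reading is an assumption, and replies in this thread attribute the failure to the changed PolyASite version 2 format. A minimal sketch for inspecting a BED file before passing it to the build step follows; `bed_field_counts` is a hypothetical helper, not part of QAPA or pybedtools.

```python
from collections import Counter


def bed_field_counts(path):
    """Count how many data lines have each number of tab-separated fields.

    An empty Counter means the file has no data lines at all.
    """
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith(("#", "track", "browser")):
                continue
            counts[len(line.split("\t"))] += 1
    return counts


# Example call with the file name used in this post:
# print(bed_field_counts("clusters.bed"))
```

A result with more than one key means the file mixes line layouts, and an empty result means there is nothing for a downstream tool to read.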