the exon coordinate calculated by Process_pbsim_data is not the same to GTF file? #1
Hello, and thanks for reporting this.
Can you tell me which dataset you tested this on, and which read this is for?
One possible explanation is that the simulated read doesn't contain the last nucleotide. The other is an error in the script.
I will have a look at the script to check if it is indeed an error.
Best regards,
Krešimir Križanović Ph.D.
University of Zagreb
Faculty of Electrical Engineering and Computing
Croatia
From: ydLiu-HIT [mailto:[email protected]]
Sent: Thursday, March 29, 2018 4:27 AM
To: kkrizanovic/RNAseqEval <[email protected]>
Cc: Subscribed <[email protected]>
Subject: [kkrizanovic/RNAseqEval] the exon coordinate calculated by Process_pbsim_data is not the sam to GTF file? (#1)
When I use Process_pbsim_data to evaluate the SAM file, I print the coordinates of the expected exons (Items), but I found that they are not the same as the coordinates in the GTF file.
Coordinates printed by Process_pbsim_data:
[image]<https://user-images.githubusercontent.com/27715065/38066632-78b59f66-333b-11e8-99ae-c93c51d9c759.png>
Coordinates in the GTF file:
[image]<https://user-images.githubusercontent.com/27715065/38066650-92adec66-333b-11e8-847c-f94ced6a11f2.png>
Hello again,
Sorry for taking so long to respond. I've reviewed our Python code, and the difference in exon end coordinates between the GTF file and our code is simply a difference in design. Our end coordinate denotes the position after the last nucleotide belonging to the exon, while the end coordinate in a GTF file denotes exactly the last base belonging to the exon.
However, your remark helped me uncover another error in calculating expected partial alignments, which I have now fixed and pushed to GitHub. I appreciate your interest in our evaluator and your taking the time to comment on it.
Hope this clarifies things for you.
Best regards,
Krešimir Križanović Ph.D.
University of Zagreb
Faculty of Electrical Engineering and Computing
Croatia
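To illustrate the two end-coordinate conventions described in the reply above, here is a minimal sketch in Python. It is not taken from RNAseqEval: the function names and the example coordinates are hypothetical, and it assumes that only the end coordinate differs between the two conventions, since the reply only discusses the end.

```python
# Hypothetical helpers, not part of RNAseqEval.

def gtf_end_to_exclusive(gtf_end):
    """Convert a GTF end (the last base of the exon) to an end that
    points one position past the last base."""
    return gtf_end + 1

def exclusive_end_to_gtf(end_exclusive):
    """Convert an end that points past the last base back to a GTF end."""
    return end_exclusive - 1

def exon_length_gtf(start, gtf_end):
    """Exon length when the end is the last base of the exon."""
    return gtf_end - start + 1

def exon_length_exclusive(start, end_exclusive):
    """Exon length when the end is one past the last base.
    Assumption: start uses the same convention as in the GTF file."""
    return end_exclusive - start

# Made-up exon: GTF says start=1000, end=1200.
start, gtf_end = 1000, 1200
end_exclusive = gtf_end_to_exclusive(gtf_end)

print(end_exclusive)                                 # 1201
print(exon_length_gtf(start, gtf_end))               # 201
print(exon_length_exclusive(start, end_exclusive))   # 201
```

Under these assumptions the printed end differs by exactly one from the GTF value, while the exon length is the same in both conventions.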
[Quoted text trimmed: the reply from Krešimir Križanović (sent Thursday, March 29, 2018 9:35 AM) and the original report from ydLiu-HIT (sent Thursday, March 29, 2018 4:27 AM) appear in full above.]
Thank you for committing your new Python code. It solves my problem very well. I have another question about the code, though I am not sure about it, so take it as a suggestion. In Process_pbsim_data.py, line 321, the statement "while annotation.items[i].getLength() < maf_startpos:" should, I think, be "while annotation.items[i].getLength() <= maf_startpos:", because when I test with the dataset dataset4_sim_dm_g2as.fastq, the latter statement works better.
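The boundary case behind this suggestion can be sketched as follows. This is a hypothetical reconstruction, not the actual code around line 321: it assumes the loop skips whole exons by subtracting their lengths from a 0-based start offset, and the function name, the `inclusive` flag and the exon lengths are made up for illustration.

```python
# Hypothetical sketch, not the code from Process_pbsim_data.py.

def locate_start(exon_lengths, start, inclusive):
    """Return (exon_index, offset_within_exon) for a start offset into
    the concatenated exons.
    Assumption: start is 0-based, so valid offsets inside an exon of
    length n are 0..n-1."""
    i = 0
    while i < len(exon_lengths) and (
        exon_lengths[i] <= start if inclusive else exon_lengths[i] < start
    ):
        start -= exon_lengths[i]  # skip the whole exon
        i += 1
    return i, start

exon_lengths = [100, 50]  # made-up exon lengths

# Offset 100 is the first base of the second exon.
print(locate_start(exon_lengths, 100, inclusive=False))  # (0, 100)
print(locate_start(exon_lengths, 100, inclusive=True))   # (1, 0)

# Away from the boundary both comparisons agree.
print(locate_start(exon_lengths, 99, inclusive=False))   # (0, 99)
print(locate_start(exon_lengths, 99, inclusive=True))    # (0, 99)
```

With "<", a start that falls exactly on an exon boundary stays in the first exon at offset 100, which is one past its last valid offset. With "<=" it moves to offset 0 of the next exon. If the start position were 1-based instead, "<" would be the correct comparison, so the right choice depends on how maf_startpos is defined.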