BCF not yet supported #44

Closed
shibuvp opened this issue Jun 20, 2017 · 22 comments

shibuvp commented Jun 20, 2017

I'm getting the following error while running freebayes using cannoli.
My command:
ADAM_MAIN=org.bdgenomics.cannoli.Cannoli adam-submit --jars /opt/cannoli/target/cannoli_2.10-0.1-SNAPSHOT.jar -- freebayes -freebayes_reference hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa hdfs://master.hdp:8020/opt/small5.adam hdfs://master.hdp:8020/opt/sample.genotypes.adam
The error:
Exception in thread "main" java.lang.AssertionError: assertion failed: BCF not yet supported

[screenshot: cannoli assertion failed, BCF not yet supported]

shibuvp changed the title to "BCF not yet supported" Jun 20, 2017
Member

heuermh commented Jun 20, 2017

Ah, sorry, the example I gave on gitter is incorrect.

As of the current version of Cannoli, freebayes only supports writing out as VCF.
https://github.com/bigdatagenomics/cannoli/blob/master/src/main/scala/org/bdgenomics/cannoli/Freebayes.scala#L106

$ ADAM_MAIN=org.bdgenomics.cannoli.Cannoli \
  adam-submit \
  --jars /opt/cannoli/target/cannoli_2.10-0.1-SNAPSHOT.jar \
  -- \
  freebayes \
  -freebayes_reference hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa \
  hdfs://master.hdp:8020/opt/small5.adam \
  hdfs://master.hdp:8020/opt/sample.vcf
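
The only difference between the failing command and the corrected one is the output path (sample.genotypes.adam versus sample.vcf), which suggests the output format is inferred from the file extension. A minimal pre-flight sketch under that assumption follows; `is_vcf_output` is a hypothetical helper and not part of cannoli or ADAM.

```shell
# Hypothetical pre-flight helper; not part of cannoli or ADAM.
# Succeeds when the output path ends in .vcf, fails otherwise.
is_vcf_output() {
  case "$1" in
    *.vcf) return 0 ;;
    *) return 1 ;;
  esac
}

out="hdfs://master.hdp:8020/opt/sample.vcf"
if is_vcf_output "$out"; then
  echo "ok: $out"
else
  echo "refusing to run: $out does not end in .vcf" >&2
fi
```

Running a check like this before `adam-submit` would catch an unsupported output name before a job is launched.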

Author

shibuvp commented Jun 21, 2017

okay thanks again!

Author

shibuvp commented Jun 21, 2017

I used the command you posted above, but I'm getting this error:

htsjdk.tribble.TribbleException$InvalidHeader: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file
[screenshot: cannoli2]

I changed the reference genome, but the error is still the same.

Member

heuermh commented Jun 21, 2017

I believe the tribble exception may be misleading. The "could not open hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa" message is likely more relevant.

$ freebayes -f does-not-exist.fa
Please specify a BAM file or files.

$ echo $?
1
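
To illustrate why a failed freebayes run could surface as the header exception: if the piped command exits before writing any VCF, the reader downstream never sees a `#CHROM` line. The sketch below only mimics that check on an empty stream; `has_chrom_header` is a hypothetical helper and is not htsjdk's parsing code.

```shell
# Hypothetical helper for illustration; this is not htsjdk's parsing code.
# Succeeds only if stdin contains a line starting with "#CHROM".
has_chrom_header() {
  grep -q '^#CHROM'
}

# An empty stream, as produced when the piped command fails before writing VCF.
if printf '' | has_chrom_header; then
  echo "header found"
else
  echo "no #CHROM line: check the piped command's own error output first"
fi
```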

Author

shibuvp commented Jun 22, 2017

When I use -- freebayes -f hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa, it reports that "-f" is not a valid option, but I can find -f in --help (freebayes --help in cannoli). I think the -f option does not work in freebayes in cannoli.
Moreover, I don't have a BAM file; I'm looking for a way to use an ADAM file.
[screenshot: cannoli3]

Member

heuermh commented Jun 22, 2017

There is no -f in freebayes in cannoli; it should be -freebayes_reference VAL.
This maps to -f when cannoli calls freebayes externally via the pipe API.
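
The mapping described above can be pictured roughly as follows. This is a sketch of the idea only: `freebayes_args` is a hypothetical function, and cannoli's real implementation (in Scala) may build the external command differently.

```shell
# Hypothetical sketch of the option mapping; not cannoli's actual code.
# Translates "-freebayes_reference VAL" into the "-f VAL" form that the
# freebayes executable itself understands.
freebayes_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -freebayes_reference)
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
        printf '%s %s\n' "-f" "$2"
        shift 2
        ;;
      *)
        shift
        ;;
    esac
  done
}

freebayes_args -freebayes_reference /opt/artificial.fa
# prints: -f /opt/artificial.fa
```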

Author

shibuvp commented Jun 23, 2017

Hmm, okay. I tried again with the following command, but the error is the same!
ADAM_MAIN=org.bdgenomics.cannoli.Cannoli adam-submit --jars /opt/cannoli/target/cannoli_2.10-0.1-SNAPSHOT.jar -- freebayes -freebayes_reference hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa hdfs://master.hdp:8020/opt/small5.adam hdfs://master.hdp:8020/opt/sample91.vcf

The output is two files: a zero-size .vcf file and a vcf_head file.
[screenshot: cannoli4]

The header file contains header info only. How can I get rid of this issue?


a1xt06 commented Jun 29, 2017

What is the error?

Author

shibuvp commented Jun 30, 2017

I'm getting the following error:

htsjdk.tribble.TribbleException$InvalidHeader: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file
[screenshot: cannoli5]


a1xt06 commented Jun 30, 2017

Let's start from the beginning. What was the command you entered?

Author

shibuvp commented Jun 30, 2017

I used this command:
ADAM_MAIN=org.bdgenomics.cannoli.Cannoli adam-submit --jars /opt/cannoli/target/cannoli_2.10-0.1-SNAPSHOT.jar -- freebayes -freebayes_reference hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa hdfs://master.hdp:8020/sample/salim3.adam hdfs://master.hdp:8020/vcfoutput/sample206.vcf


a1xt06 commented Jun 30, 2017

I'm not sure if a reference on hdfs will work. Could you try using the test reference and see what happens?

Member

heuermh commented Jul 1, 2017

Right, the -freebayes_reference argument value hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa is passed to the freebayes executable, which likely doesn't know how to deal with hdfs:// scheme URLs.
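
Since the value is handed to the external executable unchanged, one way to catch this early is to reject any reference written as a URL before submitting the job. A minimal sketch; `is_plain_path` is a hypothetical helper, and the advice in the message assumes the piped freebayes process runs on the cluster nodes and therefore needs a local copy of the reference there.

```shell
# Hypothetical pre-flight check; not part of cannoli or freebayes.
# Succeeds for plain filesystem paths and fails for anything written as
# scheme://..., which an external executable may not know how to open.
is_plain_path() {
  case "$1" in
    *://*) return 1 ;;
    *) return 0 ;;
  esac
}

ref="hdfs://master.hdp:8020/Data/HumanBase/hg19/hg19.fa"
if is_plain_path "$ref"; then
  echo "reference looks like a local path: $ref"
else
  echo "reference uses a URL scheme; place a copy on each node instead: $ref"
fi
```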

Author

shibuvp commented Jul 3, 2017

Okay. I tried accessing the reference genome from the local filesystem on the cluster rather than from HDFS.
ADAM_MAIN=org.bdgenomics.cannoli.Cannoli adam-submit --jars /opt/cannoli/target/cannoli_2.10-0.1-SNAPSHOT.jar -- freebayes -freebayes_reference /opt/artificial.fa hdfs://master.hdp:8020/sample/salim3.adam -single hdfs://master.hdp:8020/vcfoutput/sample217.vcf

Here, artificial.fa is 1.15 KB in size.

I got this reference genome from the ADAM resources, and now I see this error:

[screenshot: cannoli6]

The job does not complete.

I got this output when I aborted the job:

[screenshot: cannoli8]
[screenshot: cannoli7]

Member

fnothaft commented Jul 3, 2017

@heuermh we should add the ability to use references stored in HDFS. I think this is a separate issue from what's been discussed here, so I've added #50 to track this.


a1xt06 commented Jul 5, 2017

Can you try without the "single" flag?

Author

shibuvp commented Jul 6, 2017

ya sure

Author

shibuvp commented Jul 6, 2017

@a1xt06 I get this output when I try without the "single" flag:
[screenshot: cannoli9]
The command I ran:

ADAM_MAIN=org.bdgenomics.cannoli.Cannoli adam-submit --jars /opt/cannoli/target/cannoli_2.10-0.1-SNAPSHOT.jar -- freebayes -freebayes_reference file:///Data/HumanBase/hg19/hg19.fa hdfs://master.hdp:8020/sample/salim3.adam hdfs://master.hdp:8020/vcfoutput/sample229.vcf

The error is still the same as the one we discussed earlier:

[screenshot: cannoli10]

Member

heuermh commented Jul 6, 2017

@shibuvp Freebayes is complaining that it cannot open your reference file: "could not open file:///Data/HumanBase/hg19/hg19.fa".

The error above looks to be due to missing or invalid RecordGroups in the original input BAM file: "NoSuchElementException: key not found: NewFile".

Perhaps it would be worthwhile to try running freebayes directly on a single node with your data.

$ freebayes -f hg19.fa salim3.bam > salim3.vcf 

Then if that succeeds, try running the same via Cannoli.
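
Before the single-node run, the RecordGroups concern can be checked by looking for `@RG` lines in the BAM header. A minimal sketch; `has_read_groups` is a hypothetical helper, and the commented usage line assumes samtools is installed and that salim3.bam is the original input.

```shell
# Hypothetical helper; reads a SAM header on stdin and succeeds only if it
# contains at least one @RG (read group) line.
has_read_groups() {
  grep -q '^@RG'
}

# Typical use, assuming samtools is installed:
#   samtools view -H salim3.bam | has_read_groups || echo "no read groups"

# Self-contained demonstration with an inline header that lacks @RG lines.
if printf '@HD\tVN:1.5\n@SQ\tSN:chr1\tLN:1000\n' | has_read_groups; then
  echo "read groups present"
else
  echo "no @RG lines in header"
fi
```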

Author

shibuvp commented Jul 6, 2017

OK @heuermh, I'll try.

Author

shibuvp commented Jul 6, 2017

The issue is that I'm trying to use an ADAM file rather than a BAM file. Is that possible in freebayes?

Member

heuermh commented Jan 9, 2018

Sorry for not following up sooner. Freebayes itself can only run on BAM/SAM files. Please reopen this issue if we can help further.

heuermh closed this as completed Jan 9, 2018
heuermh added this to the 0.1.0 milestone Jan 24, 2018