ValueError: cannot convert float NaN to integer #11

Open
gawbul opened this issue Jun 23, 2015 · 7 comments

Comments

@gawbul

gawbul commented Jun 23, 2015

I'm trying to run hagfish_extract and am getting the following error:

[smoss@biolserva pacbio_assembly]$ hagfish_extract pbreads_to_pbasm_blasr.sorted.bam
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:71: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/home/smoss/tools/hagfish/hagfish_extract", line 643, in <module>
    stats = doStats(bamBase, seqInfo, readPairs)
  File "/home/smoss/tools/hagfish/hagfish_extract", line 293, in doStats
    label='Peak top (%d)' % int(topInsert))
ValueError: cannot convert float NaN to integer
@gawbul
Author

gawbul commented Jun 23, 2015

I notice that topInsert is assigned smids[top] in the code (where top holds the indices of the maximum values in the histogram). Tracing the issue back, I printed smids and got a list of nan. Printing mids also returns a list of nan. Printing insertSizes, hist, and edges returns an empty list, a list of zeros, and a list of nan, respectively.
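
The failing expression in the traceback can be reproduced on its own, which may help confirm that the crash is only a symptom of the NaN peak position and not of the plotting code. The sketch below is a minimal illustration in standard-library Python; `peak_label` is a hypothetical helper and is not part of hagfish_extract.

```python
import math


def peak_label(top_insert):
    """Build the legend label, tolerating a NaN peak position.

    Hypothetical helper for illustration; hagfish_extract has no such function.
    """
    if math.isnan(top_insert):
        return "Peak top (n/a)"
    return "Peak top (%d)" % int(top_insert)


# The failing expression from the traceback, reduced to its core:
# int() refuses to convert NaN, whatever produced the NaN upstream.
try:
    "Peak top (%d)" % int(float("nan"))
except ValueError as err:
    error_message = str(err)

print(error_message)
print(peak_label(float("nan")))
print(peak_label(312.7))
```

A guard like this would only hide the crash. The empty insertSizes list is still the underlying problem.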

@gawbul
Author

gawbul commented Jun 24, 2015

I printed bamBase, seqInfo and readPairs and got the following:

pb_to_pb_blasr
{'scf7180000000002|quiver': {'length': 5350059}}
{'scf7180000000002|quiver': {'start2': array([], dtype=float64), 'start1': array([], dtype=float64), 'stop1': array([], dtype=float64), 'stop2': array([], dtype=float64)}} 

@gawbul
Author

gawbul commented Jun 24, 2015

Debug output here:

[smoss@biolserva pb_pbalign]$ hagfish_extract -vvv ../pb_to_pb_pbalign.bam 
HAGFISH INFO   processing bamfile pb_to_pb_pbalign
HAGFISH DEBUG  get sequence info from ../pb_to_pb_pbalign.bam
HAGFISH INFO   Reading cached seqInfo for pb_to_pb_pbalign
HAGFISH INFO   discovered 1 sequences
HAGFISH INFO   processing BAM file: ../pb_to_pb_pbalign.bam
HAGFISH INFO   Basename pb_to_pb_pbalign
HAGFISH INFO   Processing 1 sequences < 1000 nt (from a total of 1)
HAGFISH DEBUG  executing samtools
HAGFISH DEBUG     samtools view -f 67 ../pb_to_pb_pbalign.bam
HAGFISH INFO   discovered 0 readpairs (insert < 20000 nt) out of a total of 0
HAGFISH INFO   wroted data for 1 sequences with zero pairs
HAGFISH INFO   total no readpairs: 0
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:71: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
HAGFISH DEBUG  stats {'average': nan, 'nopairs': 0, 'median': nan}
HAGFISH DEBUG  creating a histogram (0, nan)
HAGFISH INFO   insert size tops at nan
HAGFISH INFO   Estimating min ok insert size as nan
HAGFISH INFO   Estimating max ok insert size as nan
HAGFISH INFO   plotting normal figure
Traceback (most recent call last):
  File "/home/smoss/tools/hagfish/hagfish_extract", line 643, in <module>
    stats = doStats(bamBase, seqInfo, readPairs)
  File "/home/smoss/tools/hagfish/hagfish_extract", line 293, in doStats
    label='Peak top (%d)' % int(topInsert))
ValueError: cannot convert float NaN to integer
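
The log above reports zero read pairs before any statistics are computed, so one possible fix is to stop early instead of letting NaN values reach the histogram and plotting steps. The following is a rough sketch of that idea, assuming the insert sizes arrive as a plain sequence; `summarize_inserts` is a made-up name, and only the dictionary keys are taken from the `stats` debug line.

```python
import statistics


def summarize_inserts(insert_sizes):
    """Return summary statistics, or None when there are no read pairs.

    Hypothetical stand-in for the statistics step; not the hagfish code.
    """
    if len(insert_sizes) == 0:
        return None
    return {
        "nopairs": len(insert_sizes),
        "average": statistics.mean(insert_sizes),
        "median": statistics.median(insert_sizes),
    }


stats = summarize_inserts([])
if stats is None:
    # Nothing to plot: report the situation rather than computing NaN values.
    print("no read pairs found; skipping histogram and plots")
```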

@gawbul
Author

gawbul commented Jun 24, 2015

It seems to work fine with Illumina short-read data that I mapped to the PacBio (PB) assembly using bwa, but it fails for PB-to-PB mapping using pbalign/blasr. Could this be down to the samtools step?

@gawbul
Author

gawbul commented Jun 24, 2015

I changed the samFlag option to --samFlag=0 and now I am getting output. I'm not entirely sure how this affects things downstream.
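
For context on what the flag value selects: the debug log shows `samtools view -f 67`, and `-f` keeps only alignments that have every listed bit set. Decoding 67 against the SAM flag bits gives "paired", "proper pair" and "first in pair", so unpaired long reads would never match, while 0 requires no bits and lets everything through. The sketch below decodes the value. It assumes `--samFlag` is passed straight through to `samtools view -f`, which the thread suggests but does not confirm, and it says nothing about how hagfish treats unpaired records afterwards.

```python
# Bit meanings as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """List the names of all bits set in a SAM flag value."""
    return [name for bit, name in sorted(SAM_FLAG_BITS.items()) if flag & bit]


def passes_required(read_flag, required):
    """Mimic `samtools view -f`: keep a read only if all required bits are set."""
    return (read_flag & required) == required


print(decode_flag(67))
print(passes_required(0, 67))   # unpaired forward-strand read against -f 67
print(passes_required(0, 0))    # the same read against -f 0
```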

@mfiers
Owner

mfiers commented Dec 22, 2015

Dear @gawbul, sorry, it appears I was not paying any attention to this page. Is this still relevant?

@gawbul
Author

gawbul commented Dec 22, 2015

@mfiers I'm not working on that project anymore, but it was still an issue as far as I remember. I'm not sure whether it was down to issues with the data, and I haven't had time to investigate since then.
