Unable to load data to local cluster using harvester #143

Open
medcelerate opened this issue Feb 24, 2020 · 0 comments

We are attempting to launch our own instance of the meta-knowledgebase and are running into problems loading data with the harvester. The Makefile ran successfully, but we are getting the errors below.

The first error is a missing source parameter in save_bulk for Elasticsearch. We modified our code to get past it, but now we are having issues with the data loader.

AttributeError: 'NoneType' object has no attribute 'find'
mutant
Traceback (most recent call last):
  File "harvester.py", line 287, in <module>
    main()
  File "harvester.py", line 267, in main
    silos[0].save_bulk(_check_dup(harvest_and_convert(args.genes)), source=h)
  File "/home/ec2-user/g2p-aggregator/harvester/silos/elastic_silo.py", line 129, in save_bulk
    request_timeout=120)
  File "/home/ec2-user/miniconda2/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 194, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/home/ec2-user/miniconda2/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 161, in streaming_bulk
    for bulk_actions in _chunk_actions(actions, chunk_size, max_chunk_bytes, client.transport.serializer):
  File "/home/ec2-user/miniconda2/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 55, in _chunk_actions
    for action, data in actions:
  File "/home/ec2-user/g2p-aggregator/harvester/silos/elastic_silo.py", line 128, in <genexpr>
    (d for d in _bulker(feature_association_generator)),
  File "/home/ec2-user/g2p-aggregator/harvester/silos/elastic_silo.py", line 118, in _bulker
    for feature_association in feature_association_generator:
  File "harvester.py", line 184, in _check_dup
    for feature_association in harvest:
  File "harvester.py", line 96, in harvest_and_convert
    for feature_association in harvester.harvest_and_convert(genes):
  File "/home/ec2-user/g2p-aggregator/harvester/cgi_biomarkers.py", line 253, in harvest_and_convert
    for feature_association in convert(evidence):
  File "/home/ec2-user/g2p-aggregator/harvester/cgi_biomarkers.py", line 194, in convert
    evidence['Association'])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 9: ordinal not in range(128)
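
For anyone hitting the same UnicodeDecodeError: byte 0xc2 is a lead byte of a two-byte UTF-8 sequence, so the field is probably UTF-8 encoded text that Python 2.7 is implicitly decoding as ASCII when a byte string is combined with a unicode string. The code at line 194 of cgi_biomarkers.py is not shown above, so the following is only a minimal sketch under that assumption. The helper name to_text and the sample values are hypothetical and are not part of g2p-aggregator.

```python
# -*- coding: utf-8 -*-
# Sketch only: decode byte strings explicitly instead of relying on the
# implicit ASCII decode that Python 2 performs when str and unicode mix.
# `to_text` is a hypothetical helper, not a function from the harvester.


def to_text(value, encoding="utf-8"):
    """Return `value` as a text (unicode) string.

    Byte strings are decoded with `encoding`; undecodable bytes are
    replaced rather than raising. None becomes an empty string.
    """
    if value is None:
        return u""
    if isinstance(value, bytes):  # `bytes` is an alias of `str` on Python 2
        return value.decode(encoding, "replace")
    return u"{}".format(value)


# Hypothetical stand-in for a row read from the CGI biomarkers file.
# The non-breaking space (0xc2 0xa0) is invented for illustration.
evidence = {
    "Gene": b"BRAF",
    "Association": b"Responsive\xc2\xa0(approved)",
}

# Normalise every field to text before building the description string.
description = u"{} {}".format(
    to_text(evidence["Gene"]),
    to_text(evidence["Association"]),
)
```

If the assumption holds, passing the values through such a helper before they are formatted in convert() should avoid the implicit decode. Opening the input file with io.open(path, encoding="utf-8") would be another option, so that the fields arrive as text in the first place.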