-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixed URLs and fixed ray download error #744
Conversation
- fixed image and colab urls to point to main repo - fixed model download error in pdf2parquet in ray version Signed-off-by: Sujee Maniyam <[email protected]>
@sujee I tested this today. The local versions of both Python and Ray ran successfully. The Python version on Colab is fine, but the Ray version on Colab has the following problem: |
@sujee Thanks for your help in debugging the problem! If I click on "Open with Colab" from the notebook in the repo, it takes the old version of the the notebook from the dev branch. In order to test the new version, I had to open Sujee's fork directly from the Colab [https://colab.research.google.com/github/sujee/data-prep-kit/blob/intro-example2/examples/notebooks/intro/dpk_intro_1_ray.ipynb] and then I was able to run the Ray version in Colab successfully. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I have tested both notebooks locally and on Google Colab.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please review comment and decide to act on/ or ignore comments . Thanks.
" !wget -O 'input/solar-system/earth.pdf' 'https://raw.githubusercontent.com/sujee/data-prep-kit/intro-example1/examples/notebooks/intro/input/solar-system/earth.pdf'\n", | ||
" !wget -O 'input/solar-system/mars.pdf' 'https://raw.githubusercontent.com/sujee/data-prep-kit/intro-example1/examples/notebooks/intro/input/solar-system/mars.pdf'\n", | ||
" !wget -O 'my_utils.py' 'https://raw.githubusercontent.com/sujee/data-prep-kit/intro-example1/examples/notebooks/intro/my_utils.py'" | ||
" !wget -O 'input/solar-system/earth.pdf' 'https://raw.githubusercontent.com/IBM/data-prep-kit/dev/examples/notebooks/intro/input/solar-system/earth.pdf'\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How were these files added ? Can anyone use this as a repo for storing/sharing documents ?
" !wget -O 'my_utils.py' 'https://raw.githubusercontent.com/sujee/data-prep-kit/intro-example1/examples/notebooks/intro/my_utils.py'" | ||
" !wget -O 'input/solar-system/earth.pdf' 'https://raw.githubusercontent.com/IBM/data-prep-kit/dev/examples/notebooks/intro/input/solar-system/earth.pdf'\n", | ||
" !wget -O 'input/solar-system/mars.pdf' 'https://raw.githubusercontent.com/IBM/data-prep-kit/dev/examples/notebooks/intro/input/solar-system/mars.pdf'\n", | ||
" !wget -O 'my_utils.py' 'https://raw.githubusercontent.com/IBM/data-prep-kit/dev/examples/notebooks/intro/my_utils.py'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this here? Why isn't that in the git repo? What am I missing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .py file is in the repo, but to run on Google Colab, Sujee puts it in raw.githubusercontent.com. @sujee Can we take this file from the repo itself, as: https://github.com/IBM/data-prep-kit/blob/dev/examples/notebooks/intro/my_utils.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, on colab, we need to download any files besides the notebook.
These are being downloaded from the official repo location
@@ -504,7 +504,8 @@ | |||
" \"data_files_to_use\": ast.literal_eval(\"['.pdf']\"),\n", | |||
" # orchestrator\n", | |||
" \"runtime_worker_options\": ParamsUtils.convert_to_ast(worker_options),\n", | |||
" \"runtime_num_workers\": MY_CONFIG.RAY_RUNTIME_WORKERS,\n", | |||
" # \"runtime_num_workers\": MY_CONFIG.RAY_RUNTIME_WORKERS,\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please decide on using 1 or MY_CONFIG.RAY_RUNTIME_WORKERS and remove commented line. Thanks
Signed-off-by: Sujee Maniyam <[email protected]>
Why are these changes needed?
Related issue number (if any).