Added ctranslate2 translation example script #83

Merged
merged 10 commits into from
Sep 27, 2024

uahmed93
Contributor

Added a ctranslate2 example that works on string tokens instead of integer tokens.

To run the example:

python3 example/custom_ct2_model.py --ct2-model-dir <your-model-dir> inp.parquet out.parquet
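
For context, CTranslate2's `Translator.translate_batch` takes each sentence as a list of string tokens rather than a list of integer ids. The sketch below illustrates that difference with a toy vocabulary. `TOY_VOCAB`, `ids_to_tokens`, and the sample ids are invented for illustration and are not taken from the example script.

```python
from typing import Dict, List

# Hypothetical toy vocabulary (id -> token); a real model ships its own.
TOY_VOCAB: Dict[int, str] = {0: "<unk>", 1: "Hello", 2: "world", 3: "</s>"}


def ids_to_tokens(batch_ids: List[List[int]], vocab: Dict[int, str]) -> List[List[str]]:
    """Map each integer id to its string token, falling back to <unk>."""
    return [[vocab.get(i, "<unk>") for i in ids] for ids in batch_ids]


# What an integer-token pipeline carries between stages.
integer_batch = [[1, 2, 3], [1, 99, 3]]

# What a string-token pipeline carries: one list of str per sentence.
string_batch = ids_to_tokens(integer_batch, TOY_VOCAB)
print(string_batch)
```

Carrying string tokens means the output column holds text-like values instead of numeric tensors, which is why the post-processing step needs to accept string output types.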

copy-pr-bot bot commented Sep 11, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@uahmed93
Contributor Author

This script fails because the ct2 model cannot be serialized when it is loaded on the worker. I ran it after initializing an NDC GPU Dask cluster on Slurm, but it does not work here.

@VibhuJawa
Member

@uahmed93, can you please post the error you saw here?

@uahmed93
Contributor Author

Here:

Deployed LocalCUDACluster(22f3d4fd, 'tcp://127.0.0.1:37693', workers=1, threads=1, memory=1.79 TiB)...
2024-09-11 10:56:33,005 - distributed.protocol.pickle - ERROR - Failed to serialize <ToPickle: HighLevelGraph with 4 layers.
<dask.highlevelgraph.HighLevelGraph object at 0x15522cf81900>
 0. read-parquet-2f062a30cf4676cd0b9a4ab5cf06fe85
 1. repartition-2-aa356a5cb4dedecfbdd625f941566718
 2. to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1
 3. store-to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1
>.
Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
TypeError: cannot pickle 'ctranslate2._ext.Translator' object
Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 353, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 76, in pickle_dumps
    frames[0] = pickle.dumps(
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
TypeError: cannot pickle 'ctranslate2._ext.Translator' object

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/ctransl/ctransl_cf.py", line 145, in <module>
    main()
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/ctransl/ctransl_cf.py", line 141, in main
    outputs.to_parquet(args.output_parquet_path)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/nvtx/nvtx.py", line 116, in inner
    result = func(*args, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask_cudf/core.py", line 264, in to_parquet
    return to_parquet(self, path, *args, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask/dataframe/io/parquet/core.py", line 1047, in to_parquet
    out = out.compute(**compute_kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask/base.py", line 379, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask/base.py", line 665, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 379, in serialize
    raise TypeError(msg, str_x) from exc
TypeError: ('Could not serialize object of type HighLevelGraph', '<ToPickle: HighLevelGraph with 4 layers.\n<dask.highlevelgraph.HighLevelGraph object at 0x15522cf81900>\n 0. read-parquet-2f062a30cf4676cd0b9a4ab5cf06fe85\n 1. repartition-2-aa356a5cb4dedecfbdd625f941566718\n 2. to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1\n 3. store-to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1\n>')
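
The root error in this traceback is `TypeError: cannot pickle 'ctranslate2._ext.Translator' object`: the task graph holds a live translator, and Dask has to pickle the graph to send it to workers. A common way around this class of error is to ship only the model path and build the translator lazily on the worker. The sketch below shows that pattern with a stand-in class; `FakeTranslator` and `LazyModel` are hypothetical names, and this is not necessarily how the PR fixed the problem.

```python
import pickle
import threading


class FakeTranslator:
    """Stand-in for an extension object that cannot be pickled."""

    def __init__(self, model_dir: str):
        self.model_dir = model_dir
        self._lock = threading.Lock()  # locks are not picklable

    def translate(self, tokens):
        return [t.upper() for t in tokens]


class LazyModel:
    """Keeps only the model path; builds the translator on first use."""

    def __init__(self, model_dir: str):
        self.model_dir = model_dir
        self._translator = None

    def _get_translator(self):
        if self._translator is None:
            self._translator = FakeTranslator(self.model_dir)
        return self._translator

    def __getstate__(self):
        # Drop the loaded translator so only picklable state is shipped.
        state = self.__dict__.copy()
        state["_translator"] = None
        return state

    def __call__(self, tokens):
        return self._get_translator().translate(tokens)


# Pickling the eager object fails, much like the Translator in the traceback.
try:
    pickle.dumps(FakeTranslator("/path/to/model"))
    eager_ok = True
except TypeError:
    eager_ok = False

model = LazyModel("/path/to/model")
model(["warm", "up"])  # the translator is now loaded in this process
restored = pickle.loads(pickle.dumps(model))  # still serializable
print(eager_ok, restored(["hello"]))
```

With this shape, each worker rebuilds its own translator the first time it is called, so nothing unpicklable has to travel with the graph.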

Member

@VibhuJawa VibhuJawa left a comment

Thanks, @uahmed93, for implementing this feature. The overall approach looks solid, but I've left some suggestions to enhance the code's readability and design.

The foundation is strong, but with a few refinements, it can be polished for better performance and maintainability.

@VibhuJawa
Member

Also, CC: @sarahyurick for suggestions.

Collaborator

@sarahyurick sarahyurick left a comment

Thanks @uahmed93 for the updates! I ran black and there are several lines which need to be reformatted, so I added them here.

@sarahyurick
Collaborator

Also @uahmed93, I was wondering if I could test this PR myself by downloading the model files at https://huggingface.co/ai4bharat/indictrans2-en-indic-1B? Or should I be using something else?

@uahmed93
Contributor Author

Thanks, @VibhuJawa and @sarahyurick, for the detailed analysis. I have now added a get_model_output function to the Model class to handle postprocessing of outputs, as suggested by @VibhuJawa.
I have also run black for formatting.
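
As a rough illustration of this design, the sketch below puts output post-processing behind a method on the model so the surrounding pipeline stays generic. Only the names `Model` and `get_model_output` come from this thread; the signature, the `TranslationModel` subclass, and `run_pipeline` are assumptions made for the example.

```python
from typing import List


class Model:
    """Hypothetical base class: by default, outputs pass through unchanged."""

    def get_model_output(self, raw_outputs: List[List[str]]) -> list:
        return raw_outputs


class TranslationModel(Model):
    """Hypothetical subclass that joins string tokens into sentences."""

    def get_model_output(self, raw_outputs: List[List[str]]) -> list:
        return [" ".join(tokens) for tokens in raw_outputs]


def run_pipeline(model: Model, raw_outputs: List[List[str]]) -> list:
    # The pipeline does not inspect the model; the model shapes its own output.
    return model.get_model_output(raw_outputs)


sentences = run_pipeline(TranslationModel(), [["namaste", "duniya"], ["shubh", "din"]])
print(sentences)
```

Keeping the hook on the model means a new output type only needs a new override, not another branch in the pipeline code.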

@uahmed93
Contributor Author

@sarahyurick, that is the HF model. The CT2 model can be downloaded from here: Base En-indic model.
After downloading, unzip the archive; it will contain a ct2_fp16 model. The path to that model needs to be provided.

Collaborator

@sarahyurick sarahyurick left a comment

Tested and it worked for me. Thanks @uahmed93 !

Member

@VibhuJawa VibhuJawa left a comment

Please add pytests; otherwise, most things look good to me.

@uahmed93
Contributor Author

Hi @VibhuJawa, @sarahyurick,
Sorry, I was working on another priority task, so I could not address the reviews immediately.
I have added a test case here and addressed some of the suggestions. Could you please review them?

Member

@VibhuJawa VibhuJawa left a comment

Final bits of review that I missed in the initial pass; we should be good to merge after these are fixed.

Umair Ahmed added 6 commits September 25, 2024 23:21
to allow string output types.

Signed-off-by: Ahmed Umair <[email protected]>
handle postprocessing of outputs there only.

Signed-off-by: Ahmed Umair <[email protected]>
Handle some review comments.

Signed-off-by: Ahmed Umair <[email protected]>
changed to type checking instead of string matching in Model.

Signed-off-by: Ahmed Umair <[email protected]>
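
The last commit message above mentions switching from string matching to type checking. The sketch below shows the general trade-off under assumed names; `CT2Model`, `CT2ModelLookalike`, `RenamedTranslator`, and both helper functions are hypothetical and do not reflect the actual check in the PR.

```python
class Model:
    """Hypothetical base model."""


class CT2Model(Model):
    """Hypothetical CTranslate2-backed model."""


class CT2ModelLookalike:
    """Unrelated class whose name happens to contain 'CT2Model'."""


class RenamedTranslator(CT2Model):
    """Subclass whose name no longer contains 'CT2Model'."""


def is_ct2_by_name(obj) -> bool:
    # Fragile: depends on the class name, so renames and lookalikes break it.
    return "CT2Model" in type(obj).__name__


def is_ct2_by_type(obj) -> bool:
    # Robust: follows the actual class hierarchy.
    return isinstance(obj, CT2Model)


for obj in (CT2Model(), CT2ModelLookalike(), RenamedTranslator()):
    print(type(obj).__name__, is_ct2_by_name(obj), is_ct2_by_type(obj))
```

The name-based check accepts the lookalike and rejects the renamed subclass, while `isinstance` classifies all three as intended.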
@uahmed93
Contributor Author

Resolved the serialization error I posted above (TypeError: cannot pickle 'ctranslate2._ext.Translator' object).

Member

@VibhuJawa VibhuJawa left a comment

Last nits around testing

@VibhuJawa VibhuJawa added the enhancement New feature or request label Sep 27, 2024
@VibhuJawa VibhuJawa linked an issue Sep 27, 2024 that may be closed by this pull request
@VibhuJawa
Member

/okay to test

@VibhuJawa
Member

@uahmed93, please fix the linting issue.

Signed-off-by: Ahmed Umair <[email protected]>
@VibhuJawa
Member

/okay to test

@VibhuJawa VibhuJawa merged commit 23af498 into rapidsai:main Sep 27, 2024
10 checks passed
Development

Successfully merging this pull request may close these issues.

[FEA] Support for ctranslate2 model translation.
3 participants