Enhance the execution speed of CPM dataloaders (#230)
* Update start_pytorch_task.py - Handle non-zero return code in process execution

feat: Handle non-zero return code in process execution

Refactor the code to check the return code of each process execution.
If a return code is non-zero, raise an exception with a descriptive
error message that includes the process ID and points to the relevant
issue for further details.
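
A minimal sketch of the return-code check described above, not the actual code in start_pytorch_task.py. The helper name `run_and_check` and the wording of the error message are hypothetical; only the idea (wait for each process, raise on a non-zero return code, include the process ID) comes from the commit message.

```python
import subprocess
import sys


def run_and_check(commands):
    """Hypothetical helper: start every command, wait, and fail on non-zero exit."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Wait for all processes first so none is left running when we raise.
    for proc in procs:
        proc.wait()
    for proc in procs:
        if proc.returncode != 0:
            raise RuntimeError(
                f"Process {proc.pid} exited with return code {proc.returncode}; "
                "check the relevant issue for details."
            )
    return [proc.returncode for proc in procs]


if __name__ == "__main__":
    # A trivial child process that exits successfully.
    print(run_and_check([[sys.executable, "-c", "pass"]]))
```

Waiting on every process before raising is a design choice of this sketch, so that a failure in one worker does not leave the others unreaped.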

* Enhance the execution speed of CPM dataloaders

Speed up the CPM dataloaders by roughly 30 seconds; the exact saving varies with the environment.

Initialize the jieba library up front by calling jieba.initialize().
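
For context, jieba loads its dictionary lazily on first use, and `jieba.initialize()` forces that load at a chosen point. The sketch below is an illustration under that assumption and is not part of this commit: the helper name `warm_up_jieba` and the timing are hypothetical, and the import is guarded so the snippet still runs where jieba is not installed.

```python
import time

try:
    import jieba  # third-party package; may be missing in some environments
except ImportError:
    jieba = None


def warm_up_jieba():
    """Hypothetical helper: load jieba's dictionary now.

    Returns the seconds spent, or None if initialization was skipped or failed.
    """
    if jieba is None:
        return None
    start = time.perf_counter()
    try:
        jieba.initialize()  # builds or loads the prefix dictionary eagerly
    except Exception:
        # Mirror the commit's behavior: a failed warm-up must not break the import.
        return None
    return time.perf_counter() - start


elapsed = warm_up_jieba()
if elapsed is not None:
    print(f"jieba initialized in {elapsed:.2f}s")
```

Measuring the call this way is one option for checking how much time the warm-up takes in a given environment.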
clemente0731 authored Sep 1, 2023
1 parent e6d4fa4 commit 709c498
Showing 1 changed file with 5 additions and 0 deletions.
@@ -21,6 +21,11 @@
import sentencepiece as spm
import jieba

try:
    jieba.initialize()
except Exception:
    pass

try:
    from functools import lru_cache
except ImportError:
