-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(Accelerator): Introduce AI runtime configuration scheme #514
Conversation
…evice from API, envvars, CLI. - Introduce the AcceleratorOptions, AcceleratorDevice and use them to set the device where the models run. - Introduce the accelerator_utils with function to decide the device and resolve the AUTO setting. - Refactor the way how the docling-ibm-models are called to match the new init signature of models. - Translate the accelerator options to the specific inputs for third-party models. - Extend the docling CLI with parameters to set the num_threads and device. - Add new unit tests. - Write new example how to use the accelerator options.
Signed-off-by: Christoph Auer <[email protected]>
input_num_threads = data.get("num_threads") | ||
|
||
# Check if to set the num_threads from the alternative envvar | ||
if input_num_threads is None: | ||
docling_num_threads = os.getenv("DOCLING_NUM_THREADS") | ||
omp_num_threads = os.getenv("OMP_NUM_THREADS") | ||
if docling_num_threads is None and omp_num_threads is not None: | ||
try: | ||
data["num_threads"] = int(omp_num_threads) | ||
except ValueError: | ||
_log.error( | ||
"Ignoring misformatted envvar OMP_NUM_THREADS '%s'", | ||
omp_num_threads, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be done in the field validator not the __init__
, which should already have the standard ENV resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my comments. Pydantic raises an exception for an "extra" field if you let it do the standard resolution (under the conditions I describe in my comments). Therefore we cannot use a field validator and any custom code must run before the resolution and have access to the full model input.
As an alternative approach I can implement the same logic inside a model validator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -78,7 +132,16 @@ class EasyOcrOptions(OcrOptions): | |||
|
|||
kind: Literal["easyocr"] = "easyocr" | |||
lang: List[str] = ["fr", "de", "es", "en"] | |||
use_gpu: bool = True # same default as easyocr.Reader | |||
use_gpu: bool = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we should use the newer Annotated[]
way of defining the fields
docling/models/easyocr_model.py
Outdated
from docling.datamodel.settings import settings | ||
from docling.models.base_ocr_model import BaseOcrModel | ||
from docling.utils import accelerator_utils as au |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are using only decide_device
, let's import that function instead of using aliases for our own modules
download_path = snapshot_download( | ||
repo_id="ds4sd/docling-models", | ||
force_download=force, | ||
local_dir=local_dir, | ||
revision="v2.0.1", | ||
revision="refs/pr/2", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: update with tag before merge
|
||
# Use envvars (regular + alternative) and default values | ||
os.environ["OMP_NUM_THREADS"] = "1" | ||
ao.__init__() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not test the __init__()
we should check the complete object initialization. (also see comment above, we should not have a custom __init__
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a check for the __init__()
but an in-place reloading that allows to re-evaluate the new values of the envvars.
Signed-off-by: Nikos Livathinos <[email protected]>
…structure for fast/accurate model Signed-off-by: Nikos Livathinos <[email protected]>
Signed-off-by: Christoph Auer <[email protected]>
Signed-off-by: Christoph Auer <[email protected]>
Signed-off-by: Nikos Livathinos <[email protected]>
Merging it to |
Signed-off-by: Nikos Livathinos <[email protected]>
Signed-off-by: Nikos Livathinos <[email protected]>
feat(Accelerator): Introduce AI runtime configuration scheme Signed-off-by: Christoph Auer <[email protected]>
feat(Accelerator): Introduce options to control the number of threads and device from API, envvars, CLI.
AcceleratorOptions
,AcceleratorDevice
and use them to set the device where the models run.accelerator_utils
with function to decide the device and resolve the AUTO setting.docling-ibm-models
are called to match the new signature of models constructor.num_threads
anddevice
.This PR implements the points from the Discussion #373
Issues addressed:
#308
#133
Checklist: