I encountered the following error while training a binary classification task with LightGBM 4.5.0 on an H100 GPU with device="cuda":
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pywrapper_utils/run_thread/full_batch_run_thread.py", line 47, in _execute_user_function
    result = self.user_main_function(**kwargs)
  File "/opt/module/source/main.py", line 31, in main
    model.perform_all_calculations()
  File "/opt/module/source/model/feature_selector.py", line 61, in perform_all_calculations
    selected_features: List[Tuple] = self.select_features(base_model, kfold)
  File "/opt/module/source/model/feature_selector.py", line 84, in select_features
    model.fit(X_train, y_train)
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/sklearn.py", line 1284, in fit
    super().fit(
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/sklearn.py", line 955, in fit
    self._Booster = train(
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/engine.py", line 307, in train
    booster.update(fobj=fobj)
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/basic.py", line 4135, in update
    _safe_call(
  File "/tmp/.local/lib/python3.9/site-packages/lightgbm/basic.py", line 296, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode("utf-8"))
lightgbm.basic.LightGBMError: [CUDA] invalid argument /tmp/pip-install-9rgzugd6/lightgbm_37941d8e64514c0e844ef71f72ef6b9c/src/boosting/goss.hpp 63
Are you able to share a minimal, reproducible example? Or at least, the exact parameters you passed to LightGBM?
The LightGBM functions you use and the configuration you pass to them change which underlying code is called. Providing those details reduces the effort required to investigate this.
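For illustration, a minimal reproducible example could look something like the sketch below. Everything in it is an assumption rather than a detail from the report: the synthetic data, the helper names `make_data` and `run`, and every parameter value. In particular, `data_sample_strategy="goss"` is only a guess based on the `goss.hpp` frame in the error message; the real parameters should be substituted.

```python
import numpy as np


def make_data(n_rows=10_000, n_features=20, seed=0):
    """Generate a small synthetic binary classification dataset."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, n_features))
    # The label depends on the first feature plus noise, so both classes occur.
    y = (X[:, 0] + 0.5 * rng.normal(size=n_rows) > 0).astype(int)
    return X, y


# Hypothetical parameters: replace with the exact ones used in the failing run.
PARAMS = {
    "objective": "binary",
    "device": "cuda",
    "data_sample_strategy": "goss",  # assumption, inferred from goss.hpp in the error
    "n_estimators": 10,
    "verbosity": 1,
}


def run(params=PARAMS):
    """Fit an LGBMClassifier on the synthetic data with the given parameters."""
    # Imported here so the data and parameters can be inspected on a machine
    # that does not have a CUDA-enabled LightGBM build.
    import lightgbm as lgb

    X, y = make_data()
    model = lgb.LGBMClassifier(**params)
    model.fit(X, y)
    return model
```

Calling `run()` on the affected machine would show whether the error reproduces with synthetic data. If it does not, the dataset shape or the remaining parameters are the next things to share.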
Environment info
Python 3.9
CUDA 12.4
scikit-learn==1.6.1
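Version details like these can also be collected programmatically. The following is a small sketch using only the standard library. The package list and the helper name `collect_versions` are assumptions chosen for illustration, and the CUDA version is not covered because it has to be read from the system (for example from the driver or toolkit).

```python
import platform
from importlib import metadata

# Assumed list of packages relevant to this report; adjust as needed.
PACKAGES = ("lightgbm", "scikit-learn", "numpy")


def collect_versions(packages=PACKAGES):
    """Return a dict mapping 'python' and each package name to a version string."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


for name, version in collect_versions().items():
    print(f"{name}=={version}")
```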
Command(s) you used to install LightGBM