[ci] [R-package] macOS clang CMake jobs failing with segfaults #6628
Comments
I just tried re-running ALL R jobs on #6625, let's see if any others fail: https://github.com/microsoft/LightGBM/actions/runs/10581950637?pr=6625 |
I strongly suspect this is related to a release of one of
|
So I think the new

On my Mac (M2, Sonoma 14.4.1), I built the latest

Rscript build_r.R --no-build-vignettes -j4

Found that, in combination with the latest

cat > test.R <<EOF
library(lightgbm)
data(agaricus.train, package = "lightgbm")
lgb.Dataset(
data = agaricus.train\$data
, label = agaricus.train\$label
)\$construct()
EOF

# fails
Rscript test.R

The error does not occur if I disable OpenMP parallelism.

# succeeds
OMP_NUM_THREADS=1 Rscript test.R

Downgrading to the prior release of

Rscript -e "remove.packages('data.table')"
Rscript --vanilla -e "install.packages(c('https://cran.r-project.org/src/contrib/Archive/data.table/data.table_1.15.4.tar.gz'), repos = NULL)"

# succeeds
Rscript test.R

# also succeeds
OMP_NUM_THREADS=1 Rscript test.R

So it does look like it's something related to the latest |
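For anyone repeating the comparison above, here is a minimal sketch of a helper that runs the same command under several `OMP_NUM_THREADS` values and prints one line per run. The function names `describe_status` and `compare_thread_counts` are made up for this illustration and are not part of LightGBM or R. It relies on the shell convention that an exit status above 128 usually means the process was terminated by signal (status - 128), so 139 would correspond to signal 11 (SIGSEGV); how a crash inside R actually surfaces in the exit status may differ.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Translate a shell exit status into a short description.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
describe_status() {
    if [ "$1" -eq 0 ]; then
        echo "ok"
    elif [ "$1" -gt 128 ]; then
        echo "killed by signal $(($1 - 128))"
    else
        echo "exit code $1"
    fi
}

# Run a command once per thread count and print one summary line per run.
# Usage: compare_thread_counts "1 2 4" Rscript test.R
compare_thread_counts() {
    counts=$1
    shift
    for n in $counts; do
        status=0
        # The assignment prefix sets OMP_NUM_THREADS only for this command.
        OMP_NUM_THREADS=$n "$@" >/dev/null 2>&1 || status=$?
        echo "OMP_NUM_THREADS=$n: $(describe_status "$status")"
    done
}
```

Calling `compare_thread_counts "1 2 4" Rscript test.R` would then show at a glance whether the failure only appears with more than one thread.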
I noticed that when I build Building 1.16.0 from source, it does. I see lines like this:
It looks like

R_LIB=/Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
otool -L "${R_LIB}/data.table/libs/data_table.so"
otool -L "${R_LIB}/lightgbm/libs/lightgbm.so"

Which will search in this order:

Lines 829 to 833 in fde0157
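To make the `otool -L` comparison easier to read, a small filter can reduce each listing to just the OpenMP runtime it links against. This is only a sketch: `omp_deps` is a hypothetical helper name, and the pattern assumes the runtime is named `libomp` (LLVM), `libgomp` (GCC), or `libiomp5` (Intel).

```shell
#!/bin/sh
# Hypothetical helper for illustration only.

# Print the OpenMP runtime entries found in `otool -L` output read from stdin.
# Only the first field (the install name or path) of each matching line is kept.
omp_deps() {
    awk '/lib(omp|gomp|iomp5)[.]/ { print $1 }'
}

# Usage on macOS, guarded so the script is a no-op where otool or R_LIB is missing.
if command -v otool >/dev/null 2>&1 && [ -n "${R_LIB:-}" ]; then
    for pkg_lib in data.table/libs/data_table.so lightgbm/libs/lightgbm.so; do
        echo "${pkg_lib}:"
        otool -L "${R_LIB}/${pkg_lib}" | omp_deps
    done
fi
```

If the two packages print different paths, that would be consistent with two copies of the OpenMP runtime being loaded into the same R session.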
So I suspect that this is our old friend, the "multiple versions of OpenMP loaded in the same session" problem. Things we could try:
|
Adding a

But I found that adding R's main library directory at the beginning of the OpenMP RPATH list did! 🎉 Opened #6629 proposing that.

Summary

Impact

Building

The Windows and Linux users are unaffected. Building with |
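Since the order of RPATH entries is what matters here (dyld tries the `LC_RPATH` entries in order when resolving an `@rpath/...` install name), it can help to print them in order for a built library. The sketch below is not the change proposed in #6629; `list_rpaths` is a hypothetical helper that only inspects `otool -l` output, and it assumes paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.

# Print LC_RPATH entries, in load-command order, from `otool -l` output on stdin.
# Each load command starts with a "cmd <NAME>" line; for LC_RPATH the entry
# itself is on the following "path <dir> (offset N)" line.
list_rpaths() {
    awk '
        $1 == "cmd" { in_rpath = ($2 == "LC_RPATH") }
        in_rpath && $1 == "path" { print $2 }
    '
}

# Usage on macOS, guarded so the script is a no-op where otool or R_LIB is missing.
if command -v otool >/dev/null 2>&1 && [ -n "${R_LIB:-}" ]; then
    otool -l "${R_LIB}/lightgbm/libs/lightgbm.so" | list_rpaths
fi
```

The first directory printed is the first place searched, so checking where R's own library directory lands in that list is a quick way to confirm whether a build picked up the intended ordering.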
Just for awareness, tagging some folks who might be interested (no action required.... this is a LightGBM problem, not a |
Messy! Glad you've found a fix. Linking our recent updates about configuring OpenMP on macOS since they're probably related & I don't see them here yet: Rdatatable/data.table#6034 #6418 is in dev only, but we'll probably put it in a patch release soon. |
Description
The r-package (macos-13, clang, R 4.3, cmake) CI jobs are failing with a segfault like this:
Reproducible example
This is happening on all PRs. For example, see this build from #6625: https://github.com/microsoft/LightGBM/actions/runs/10581950637/job/29392431659?pr=6625.
On that PR, I manually re-triggered that job 3 times over the last 24 hours.
Additional Comments
It's worth noting that:
{lightgbm} v4.5.0: https://cran.r-project.org/web/checks/check_results_lightgbm.html