Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openblas from Cygwin's package deadloops after fork #22822

Closed
embray opened this issue Apr 17, 2017 · 11 comments
Closed

openblas from Cygwin's package deadloops after fork #22822

embray opened this issue Apr 17, 2017 · 11 comments

Comments

@embray
Copy link
Contributor

embray commented Apr 17, 2017

After some pain and headscratching I see what's going on here. Normally when I build Sage for Cygwin I use Cygwin's system liblapack0 package using SAGE_ATLAS_LIB and selecting --with-blas=atlas at configure time.

The liblapack0 package provides a BLAS implementation--the Netlib reference implementation.In any case it works but I haven't done any performance comparisons.

When you install the liblapack0 package it installs the cygblas-0.dll in /usr/lib/lapack as well as an /etc/profile.d/ script that appends /usr/lib/lapack to $PATH so that any cygblas-0.dll is found there.

However, at some point I installed the Cygwin R package, which depends on Cygwin's openblas package which is actually the default BLAS on Cygwin (that is, it installs cygblas-0.dll to /bin. Since this comes earlier on the path, this is the BLAS that gets used if it's installed. That would be fine, except there is a problem with the package. It is not configured to use pthreads, and instead uses OpenBLAS's Windows threading driver. That works fine until you fork a process, at which case the entire threading state becomes invalid, but since the Windows threading driver doesn't know about fork it just carries on, and doesn't have enough error-handling to deal properly with this scenario. Any attempt to perform a multi-threaded operation after a fork goes into a dead wait for a thread that doesn't exist to become available. This is similar to the issue I described in #22694, though more severe since it just locks, rather than crashes.

This can be avoided a number of ways (I will also post something about this on the wiki):

  1. For now, definitely just don't install Cygwin's openblas package.
  2. Cygwin's openblas package should be fixed to be built with pthread support; I don't know why it isn't but I will report this...
  3. OpenBLAS could be fixed upstream to recognize building on Cygwin, and install post-fork handlers even when using the Windows threading driver (similar to what libgc does).
  4. Build either ATLAS or OpenBLAS in Sage and don't use the system package--in my experience this has generally worked too, though I haven't tested either one extensively. I need to do more work with both and see what issues arise. This option is certainly fine too, but it saves a lot of development time to use the system package.

Upstream PR for OpenBLAS: OpenMathLib/OpenBLAS#1450

Discussion on Cygwin mailing list: https://cygwin.com/ml/cygwin/2018-02/msg00057.html

Depends on #24638

Upstream: Fixed upstream, but not in a stable release.

Component: porting: Cygwin

Issue created by migration from https://trac.sagemath.org/ticket/22822

@embray

This comment has been minimized.

@embray
Copy link
Contributor Author

embray commented Feb 6, 2018

Changed upstream from Not yet reported upstream; Will do shortly. to Reported upstream. No feedback yet.

@embray

This comment has been minimized.

@embray
Copy link
Contributor Author

embray commented Feb 6, 2018

comment:3

Finally also brought this up on Cygwin. I meant to do that a long time ago, but it helps to have an actual patch to discuss.

@embray

This comment has been minimized.

@embray

This comment has been minimized.

@jdemeyer
Copy link
Contributor

jdemeyer commented Feb 6, 2018

comment:5

Beware of conflicts with #24638.

@jdemeyer
Copy link
Contributor

jdemeyer commented Feb 6, 2018

Dependencies: #24638

@embray
Copy link
Contributor Author

embray commented Feb 6, 2018

comment:6

AFIACT there is no conflict.

@embray
Copy link
Contributor Author

embray commented Apr 11, 2018

Changed upstream from Reported upstream. No feedback yet. to Fixed upstream, but not in a stable release.

@embray
Copy link
Contributor Author

embray commented Mar 10, 2021

comment:8

FWIW I think this issue is fixed. The bug in OpenBLAS has been long-since fixed, and the Cygwin package has been updated as well.

What we don't have yet is getting Sage built against the system OpenBLAS. Once that's done and confirmed working, we can definitely close this issue.

@mkoeppe mkoeppe closed this as not planned Won't fix, can't repro, duplicate, stale Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants