Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the default solib dir to the rpath for shared libs with transitions. #14011

Closed
wants to merge 1 commit into from

Conversation

quval
Copy link
Contributor

@quval quval commented Sep 19, 2021

When an executable has dynamically-linked dependencies that have transitions, shared libraries are looked up under rpaths like this:

$ORIGIN/(../)*_solib_k8/../../../k8-fastbuild-ST-[0-9a-f]+/bin/_solib_k8

Unless running under the execroot, the transitioned solib directory may not be available at all; the right files will be present under the default solib directory, so it should be on the rpath.

Adding the default solib directory to the rpath may cause the wrong version of a shared library to be loaded, e.g. if it has been compiled in the default configuration. To prevent this, we also add the transition mnemonic to the mangled name of the library (adding the default solib last to the rpath won't really help, because it can legitimately be added first).

#13819

When an executable has dynamically-linked dependencies that have transitions,
shared libraries are looked up under rpaths like this:
    $ORIGIN/(../)*_solib_k8/../../../k8-fastbuild-ST-[0-9a-f]+/bin/_solib_k8

Unless running under the execroot, the transitioned solib directory may not be
available at all; the right files will be present under the default solib
directory, so it should be on the rpath.

Adding the default solib directory to the rpath may cause the wrong version of
a shared library to be loaded, e.g. if it has been compiled in the default
configuration. To prevent this, we also add the transition mnemonic to the
mangled name of the library (adding the default solib last to the rpath won't
really help, because it can legitimately be added first).
@quval
Copy link
Contributor Author

quval commented Sep 30, 2021

Hey @oquenchil, do you think this might make it on time for Bazel 5 (#14013)? This is mostly the same patch as before, with the change you asked me to make. Thanks!

@quval quval mentioned this pull request Oct 5, 2021
9 tasks
@oquenchil
Copy link
Contributor

Sorry, I need to carefully review this change and see that it doesn't affect anything negatively internally. I haven't had time to do so yet.

@philwo philwo added this to the Bazel 5.0 Release Blockers milestone Oct 5, 2021
@oquenchil
Copy link
Contributor

Thanks for being so patient.

This looks good and internally the tests pass with minor modifications that I can take care of. But I'm still trying to wrap my head around all this, so please bear with me.

Unless running under the execroot, the transitioned solib directory may not be available at all; the right files will be present under the default solib directory, so it should be on the rpath.

When you say the execroot, you mean the repository root where you run bazel build, do you mean where actions themselves run or something else?

As I understand it, if you move the location of the executable, ORIGIN changes, but the code still works regardless of where you move the binary, doesn't it? So where is the solibdir when you have $ORIGIN/../_solib_k8/? You go one directory up from where you are running the binary and then there should be a directory called solib_k8? Do you have to put it there manually? Is it part of runfiles and you just have to copy the runfiles?

That's for the default solibdir. Why doesn't the same thing apply for the transitioned solibdir, i.e.: ../_solib_k8/../../../k8-fastbuild-ST-? If you placed the default solibdir there manually, why not this one as well? If the default solibdir was in runfiles and you copied runfiles, how come that the transitioned solibdir isn't in runfiles too?

I realize that it should be the other way around where I'm answering the questions but since you have a better understanding of what the actual problem is I thought you could help.

Thanks!

@quval
Copy link
Contributor Author

quval commented Oct 6, 2021

I should be the one to thank you – I'm sorry I've been nagging you about it, and I really appreciate your taking the time to work on this. I'm happy to help any way I can; if you'd like to have a chat over this, I can try and demo what I'm seeing – please just let me know.

There seems to be a difference between the layout in local execution (i.e. where bazel build is run: this is what I meant by 'under the execroot') and the directory structure where actions are run remotely. $ORIGIN is the directory where the executable resides, and the structure of the runfiles dir is such that there's one solib directory there, where all the correctly transitioned SOs are available. When depending on transitioned libraries, rpath entries are added that go all the way up to k8-fastbuild-ST-... directories for other configurations. Locally, they do exist and everything works well. In remote execution (and deployment by copying the executable and its runfiles), we only have the runfiles directory, so the only rpath entry that can be actually used is the one that points to the default solib directory. But we don't get an rpath entry pointing there unless we're linking a non-transitioned library. If we don't, all the libraries that Bazel put there can't be found.

Indeed, the trigger for filing the original issue was a cc_test that always failed in remote execution, but always passed locally. I found that it depended only on transitioned libraries – so the default solib dir wasn't on the rpath, whereas that's where all the SOs were in RE. I put together a gist that I hope demonstrates this: https://gist.github.com/quval/5077b2edc902e9dc0bcad2311a245ffa.

@oquenchil
Copy link
Contributor

Alright, understood, thank you very much for the great contribution. I will merge now.

@ulfjack Have you never bumped into this issue?

@bazel-io bazel-io closed this in 20061f8 Oct 7, 2021
quval added a commit to quval/bazel that referenced this pull request Nov 14, 2021
PR bazelbuild#14011 took care of cc_libraries, this fixes the same issue for cc_imports.

Work towards bazelbuild#13819.
quval added a commit to quval/bazel that referenced this pull request Nov 14, 2021
PR bazelbuild#14011 took care of cc_libraries, this fixes the same issue for cc_imports.

Work towards bazelbuild#13819.
bazel-io pushed a commit that referenced this pull request Feb 8, 2022
PR #14011 took care of cc_libraries. This fixes the same issue for cc_imports.

Work towards #13819.

Closes #14272.

PiperOrigin-RevId: 427137675
brentleyjones pushed a commit to brentleyjones/bazel that referenced this pull request Feb 8, 2022
PR bazelbuild#14011 took care of cc_libraries. This fixes the same issue for cc_imports.

Work towards bazelbuild#13819.

Closes bazelbuild#14272.

PiperOrigin-RevId: 427137675
(cherry picked from commit 8734ccf)
Wyverald pushed a commit that referenced this pull request Feb 9, 2022
#14757)

PR #14011 took care of cc_libraries. This fixes the same issue for cc_imports.

Work towards #13819.

Closes #14272.

PiperOrigin-RevId: 427137675
(cherry picked from commit 8734ccf)

Co-authored-by: Yuval K <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes team-Rules-CPP Issues for C++ rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants