CDI support #4056

Merged
merged 23 commits into from
Feb 11, 2025
crazy-max
Member

@crazy-max crazy-max commented Jul 24, 2023

related to #1436

Adds initial support for specifying fully-qualified CDI device names.
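For readers unfamiliar with the format, a fully-qualified CDI device name has the shape `vendor/class=name` (for example `nvidia.com/gpu=all`). The following is a minimal sketch of splitting such a name into its parts. The `cdiDevice` type and `parseQualifiedName` helper are hypothetical names used only for illustration, not code from this PR, and the sketch skips the character-set validation a real parser would perform.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// cdiDevice holds the three parts of a fully-qualified CDI device name.
// Hypothetical type, for illustration only.
type cdiDevice struct {
	Vendor string
	Class  string
	Name   string
}

// parseQualifiedName splits "vendor/class=name" into its parts.
// Simplified: it only checks that all three parts are present.
func parseQualifiedName(s string) (cdiDevice, error) {
	kind, name, ok := strings.Cut(s, "=")
	if !ok || name == "" {
		return cdiDevice{}, errors.New("missing device name after '='")
	}
	vendor, class, ok := strings.Cut(kind, "/")
	if !ok || vendor == "" || class == "" {
		return cdiDevice{}, errors.New("kind must look like vendor/class")
	}
	return cdiDevice{Vendor: vendor, Class: class, Name: name}, nil
}

func main() {
	// "vendor1.com/device=foo" is an assumed example name.
	for _, s := range []string{"vendor1.com/device=foo", "foo"} {
		d, err := parseQualifiedName(s)
		if err != nil {
			fmt.Printf("%q: not fully qualified: %v\n", s, err)
			continue
		}
		fmt.Printf("%q: vendor=%s class=%s name=%s\n", s, d.Vendor, d.Class, d.Name)
	}
}
```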

@crazy-max crazy-max force-pushed the cdi branch 2 times, most recently from 4a22da3 to a5c3952 Compare July 24, 2023 10:16
@crazy-max
Member Author

I was also looking at moby/moby#45134 to register CDI device drivers. I think we need to add a new attribute in our TOML config to set the CDI specification directories; otherwise CDI requests will fail.

@elezar

elezar commented Jul 25, 2023

@crazy-max I have not yet had a look at these changes in detail.

From my understanding, BuildKit is used to build containers. Am I correct in understanding that, with these changes, devices are expected to be made available at build time?

Note that from the NVIDIA side, this may have unintended consequences of producing container images that require a specific driver version to work correctly. The intent of the CDI device specifications that are used when running a container is to ensure that drivers that match the kernel mode driver on the host are injected dynamically, allowing applications that are built against the driver API to function as expected.

@crazy-max
Member Author

You are correct that BuildKit is used to build containers, and with these changes, the goal is to make devices available at build time. This would allow users to mount specific devices when building container images.

Regarding the unintended consequence of producing container images that require a specific driver version: that sounds expected, and we might put some kind of capabilities in place. The intention is to provide more flexibility and convenience to users during the container build process. However, I understand the importance of keeping images portable across environments.

If you have any further suggestions or concerns, please don't hesitate to let us know. We aim to create a seamless experience for users while ensuring compatibility and reliability across different setups.

@georgettica

Any update on this? @crazy-max @elezar

So far, everywhere I look, I need to disable BuildKit in order to build NVIDIA-related components.

I would rather have a very limited experience with tons of caveats than have to drop BuildKit to build.


@elezar elezar left a comment

I have left some comments.

It might be good to call out in the PR description how this option is used and protected by a feature flag / capabilities. Documentation changes for the option that call out the caveats would also be appreciated.

executor/oci/spec_unix.go Outdated Show resolved Hide resolved
@colinhemmings
Collaborator

Hi @georgettica, thanks for your response. I'm a product manager at Docker, and this is something we are actively exploring at the moment, with the aim of getting a timeframe onto the roadmap. Do you have details on what would be an acceptable limited experience for you? I would be happy to chat about your use case if you prefer: https://calendly.com/colin-hemmings-dock/buildkit-chat

defer c.Close()

require.NoError(t, os.WriteFile(filepath.Join(sb.CDISpecDir(), "vendor1-device.yaml"), []byte(`
cdiVersion: "0.6.0"
Member Author

@crazy-max crazy-max force-pushed the cdi branch 2 times, most recently from 10de940 to 5065ab9 Compare February 11, 2025 13:47
Comment on lines +11119 to +11122
require.Contains(t, strings.TrimSpace(string(dt)), `BAR=injected`)
require.NotContains(t, strings.TrimSpace(string(dt)), `FOO=injected`)
require.NotContains(t, strings.TrimSpace(string(dt)), `BAZ=injected`)
require.NotContains(t, strings.TrimSpace(string(dt)), `QUX=injected`)
Member Author

Takes the first one in lexicographical order.
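As a rough illustration of that selection rule, the sketch below picks the lexicographically first entry from a set of candidate names, which is consistent with the assertions above where only `BAR=injected` is expected. The `firstLexicographic` helper and the device names `foo`, `qux`, `bar`, `baz` are assumptions made for this example; the actual selection code in the PR may be structured differently.

```go
package main

import (
	"fmt"
	"sort"
)

// firstLexicographic returns the candidate that sorts first in
// lexicographical (byte-wise) order, without modifying the input slice.
// Hypothetical helper, for illustration only.
func firstLexicographic(names []string) (string, bool) {
	if len(names) == 0 {
		return "", false
	}
	sorted := append([]string(nil), names...) // copy so the caller's order is kept
	sort.Strings(sorted)
	return sorted[0], true
}

func main() {
	// Assumed device names, mirroring the FOO/BAR/BAZ/QUX variables in the test.
	name, ok := firstLexicographic([]string{"foo", "qux", "bar", "baz"})
	fmt.Println(name, ok) // prints "bar true"
}
```

Sorting a copy keeps the caller's slice untouched, which matters if the original order is still used elsewhere.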

@crazy-max crazy-max marked this pull request as ready for review February 11, 2025 14:05
Member

@tonistiigi tonistiigi left a comment

Some more updates are needed before GA, but let's get this in for testing in RC.

@@ -0,0 +1,88 @@
//go:build !windows
Member

This shouldn't need build tags as the contrib importer is not included by default and already behind a build tag.

Member

Ah, the linter runs for all package+arch combinations.

@tonistiigi tonistiigi merged commit c5d6871 into moby:master Feb 11, 2025
106 checks passed
@crazy-max crazy-max deleted the cdi branch February 12, 2025 09:34
5 participants