
cudaPackages: bump cudaPackages_11 -> cudaPackages_12 #222778

Closed
2 tasks
SomeoneSerge opened this issue Mar 23, 2023 · 16 comments
Labels
6.topic: cuda Parallel computing platform and API

Comments

@SomeoneSerge
Contributor

SomeoneSerge commented Mar 23, 2023

CC @NixOS/cuda-maintainers

Blockers:

Thanks @ConnorBaker for the pytorch link

@nviets
Contributor

nviets commented Apr 5, 2023

Will this bump happen after #220341? I am attempting to build 11_8 now on your latest branch to see if it fixes #224150. No idea if xgboost will be happy with 12, but I'll give it a shot. Thanks @SomeoneSerge for keeping me in the loop!

@SomeoneSerge
Contributor Author

Will this bump happen after #220341?

I guess the bump will happen as soon as someone opens the bump PR and runs a nixpkgs-review on it (with --extra-nixpkgs-config { cudaSupport = true; }, which is going to be very compute-intensive). Right now the concern is that pytorch doesn't yet support CUDA12, but there are probably others, which is why the safest way to see whether we can update is by trial.

There's no reason we couldn't first update to 11.8: that would likely cause fewer regressions than 12, and would enable you to fix xgboost without resorting to overrides. Still, we'd have to run the builds first.
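The review run described above might look like the following sketch (the PR number is a placeholder; `--extra-nixpkgs-config` is the flag mentioned earlier in this thread):

```shell
# Rebuild everything downstream of the bump PR with CUDA enabled.
# This is very compute-intensive: it rebuilds all CUDA-dependent packages.
nixpkgs-review pr <PR-number> --extra-nixpkgs-config '{ cudaSupport = true; }'
```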

@nviets
Contributor

nviets commented Apr 5, 2023

Thanks, xgboost 1.7.5 should be safe once I get it building with 11_8, because I added a version argument to the options.

@samuela
Member

samuela commented Apr 5, 2023

Upgrading to CUDA 12 will happen eventually, but I don't realistically see it happening for another few weeks at least.

@SomeoneSerge
Contributor Author

Let's open a bump to 11.8 then. @nviets would you like to take that on?

@nviets
Contributor

nviets commented Apr 6, 2023

Sure - what does the bump entail? An update to the default in all-packages.nix and a nixpkgs-review to see if anything downstream breaks?
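For reference, the change itself is small. A hypothetical sketch of what the default switch in all-packages.nix might look like, assuming the existing cudaPackages_* attribute naming convention (the exact attribute names should be checked against the tree):

```nix
# pkgs/top-level/all-packages.nix (hypothetical sketch)
# Before: the default package set tracks the CUDA 11 line.
# cudaPackages = recurseIntoAttrs cudaPackages_11;

# After the bump: the default follows 11.8 instead.
cudaPackages = recurseIntoAttrs cudaPackages_11_8;
```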

@nviets
Contributor

nviets commented Apr 6, 2023

I am not familiar with the derivation for pytorch, but I see it's coded to accept nixpkgs' default CUDA package set. Should that be left as-is to build with 11.8, or should it be an option? I was under the impression from my xgboost update that it's better to have a configurable CUDA version using cudaPackages ? {}. If pytorch should also have that option, should it be handled here or in a separate PR?
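The cudaPackages ? {} pattern referred to above looks roughly like this in a derivation's argument set; pname and the input list are illustrative, not taken from any real package:

```nix
# Hypothetical sketch: accept the CUDA package set as a function argument,
# defaulting to nixpkgs' default, so callers can pass a different version
# (e.g. cudaPackages_11_8) without overriding the whole derivation.
{ lib
, stdenv
, cudaSupport ? false
, cudaPackages ? { }
}:

stdenv.mkDerivation {
  pname = "example-package";
  version = "0.0.1";
  # Only pull in the toolkit when CUDA support is requested.
  buildInputs = lib.optionals cudaSupport [ cudaPackages.cudatoolkit ];
}
```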

@SomeoneSerge
Contributor Author

We should've bumped a long time ago; now we need to do it for #259068.

@samuela
Member

samuela commented Oct 9, 2023

Agree that we should bump, but why is it needed for #259068?

@SomeoneSerge
Contributor Author

Agree that we should bump, but why is it needed for #259068?

To avoid spawning extra cudaPackageses: https://github.com/NixOS/nixpkgs/pull/259068/files#diff-7c765cc9768cdae049f8712b4ef719bc94ec8ec2e8df2dd50ddf14c6acd4e580R13927

@samuela
Member

samuela commented Oct 9, 2023

ah right, my own comment all along :P

@Turakar

Turakar commented Nov 24, 2023

Hello, what is the state of this? I am currently using dnf on Fedora 37 for all CUDA stuff, but I'm encountering issues with the upgrade to 38 due to GCC incompatibilities with CUDA. This led me to consider trying CUDA with Nix, but it seems like we still don't have any CUDA 12 support, although Torch moved to CUDA 12.1 as the new default?

@SomeoneSerge
Contributor Author

This led me to maybe try CUDA with Nix, but it seems like we still do not have any CUDA 12 support, although Torch moved to CUDA 12.1 as the new default?

Hi! Note that cudaPackages is just the default we set, and you can override* it locally. That said, it's definitely time we bumped the default version; we were just caught up with other stuff.

*An example of how to override the default (do ping matrix or discourse for details if needed!):

import <nixpkgs> {
  config.allowUnfree = true;
  overlays = [
    (final: prev: { cudaPackages = final.cudaPackages_12; })
  ];
}

@Turakar

Turakar commented Nov 28, 2023

Hi! Note that cudaPackages is just the default we set, and you can override* it locally. That said, it's definitely time we bumped the default version; we were just caught up with other stuff.

Thanks, your suggestion worked! It seems like cuDNN 8.9.2 is not packaged yet, so I think it might be easier for me to go with CUDA 11.8 for now. It is in unstable. Nice 🙂

@SomeoneSerge
Contributor Author

@Turakar note the linked PR too: I had a few evaluation errors when I changed the default

@Dessix
Contributor

Dessix commented Nov 30, 2023

It is in unstable. Nice 🙂

I just made a PR to add cudnn 8.9.6.50 and get it supported in 12.2, if that helps at all: #271201

@github-project-automation github-project-automation bot moved this from 🔮 Roadmap to ✅ Done in CUDA Team Jan 18, 2024