Adding CUDA 12.8 migration #6980
Comments
Sounds good to me |
Thanks for the issue, and congrats on the quick rollout!
My understanding is the existing packages would still work fine on these new chips, obviously without yet making use of the
We did 12.0 -> 12.6 without a migration, so unless there are (big) breaking changes, we could probably do the same for 12.6 -> 12.8? Not that I have anything against an explicit migration (I was in favour last time as well). Replacing 12.6 is fine as long as we don't lose backward compatibility, but the discussion in #6630 indicates that this won't be the case.
However, the "replacing" part could only come when we close the migration (I don't think it's an option to drop 12.6 when we open it), whereas in between, we'd be getting all of
I'll note that I saw some incompatibilities when a feedstock ended up pulling in some 12.8 builds recently (logs):
This could have simply been due to a toolchain mismatch (CUDA 12.6 compilers in |
I think a migrator is only necessary if you are writing a note to maintainers that they should only merge the migration once they have enabled the new CUDA archs and if the migrator tries to automatically update any skipping logic to unskip 12.8. |
Thanks for all of the feedback so far! 🙏 Have taken an initial pass at this in PR #7005 and commented on a few sections of note. Would really appreciate it if all of you could take a look and share your suggestions 🙂 |
My opinion is that we should either:
What I want to avoid is building for 3 CUDA versions + CPU (=4 times) by default, because it blows up the CI matrices for the entire duration of the 12.8 migration (until we drop 12.6 at the end). Given how long it took to wind down the long tail of the 12.0 migration, I don't want to have several months where CUDA-enabled feedstocks have 4x the number of baseline builds. However, if we can agree to timebox the time until we close the migration (say, maximum one month, regardless of how many feedstocks have migrated by then), then I'd be OK with #7005 as proposed. Even if we finish the migration before all feedstocks have caught up, it would essentially be a softer version of what we did for 12.6 (i.e. option 2. above). |
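To put rough numbers on the CI matrix concern above, here is a minimal sketch. The platform and Python axes are hypothetical placeholders, not taken from any real feedstock; only the CUDA variants per phase follow the discussion (CPU plus 11.8 and 12.6 today, with 12.8 added while the migration is open and 12.6 dropped when it closes).

```python
from itertools import product

# Hypothetical baseline axes for one feedstock (illustrative only).
PLATFORMS = ["linux-64", "linux-aarch64", "win-64"]
PYTHONS = ["3.10", "3.11", "3.12"]

# CUDA variants per phase; "cpu" stands for the CPU-only build.
PHASES = {
    "before migration": ["cpu", "11.8", "12.6"],
    "migration open": ["cpu", "11.8", "12.6", "12.8"],
    "migration closed": ["cpu", "11.8", "12.8"],
}


def job_count(cuda_variants):
    """Number of CI jobs: one per (platform, python, cuda) combination."""
    return len(list(product(PLATFORMS, PYTHONS, cuda_variants)))


counts = {phase: job_count(variants) for phase, variants in PHASES.items()}
for phase, n in counts.items():
    print(f"{phase}: {n} jobs")
```

Under these assumptions every baseline (platform, Python) combination is built four times while the migration is open instead of three, which is why the length of that window matters.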
Will reiterate what I said in the OP
If that's what we want, let's figure out how to do that |
It's not possible to drop 12.6 when we start the migrator (then all feedstocks that the migration hasn't reached yet would not have any CUDA 12.x builds anymore upon rerendering). If we want to keep 11.8, then the options are to do an immediate switch (like we did for 12.0 -> 12.6), or draw out this process slightly through the proposed migrator (at the cost of quadruple builds and thus for a limited amount of time) to give the most active feedstocks a chance at a more orderly update. |
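The mechanics described above can be sketched with a deliberately simplified model. It assumes that a rerendered feedstock builds whatever the global pinning lists, plus whatever a migration adds once that feedstock has been migrated. The function and variable names are hypothetical, and this is not how conda-smithy is actually implemented.

```python
def effective_cuda_variants(global_pinning, migration_adds, migrated):
    """Variants a feedstock builds after rerendering (simplified model)."""
    variants = list(global_pinning)
    if migrated:
        variants += [v for v in migration_adds if v not in variants]
    return variants


def has_cuda12(variants):
    """True if at least one CUDA 12.x variant is built."""
    return any(v.startswith("12.") for v in variants)


MIGRATION_ADDS = ["12.8"]

# Scenario A: 12.6 is dropped from the global pinning when the migrator opens.
drop_at_open = ["cpu", "11.8"]
# Scenario B: 12.6 stays in the global pinning until the migrator closes.
keep_until_close = ["cpu", "11.8", "12.6"]

unmigrated_a = effective_cuda_variants(drop_at_open, MIGRATION_ADDS, migrated=False)
migrated_b = effective_cuda_variants(keep_until_close, MIGRATION_ADDS, migrated=True)

print(unmigrated_a, has_cuda12(unmigrated_a))  # no CUDA 12.x build left
print(migrated_b, len(migrated_b))             # four variants while open
```

In this model, scenario A leaves every not-yet-migrated feedstock without a CUDA 12.x build, while scenario B gives migrated feedstocks four variants until the migration is closed.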
Am proposing the migrator would replace 12.6 with 12.8 |
Yes, but that replacement can only happen when we close the migrator; in the intervening time, we would have quadruple builds. Hence my point to limit the amount of time we allow the migration to run (since in any case it is already a gentler approach than the immediate switch, which was not terribly disruptive when we did that for 12.6). Please, think through the mechanics of how this would actually play out. I'm not saying no to a migration, but if we do migrate (rather than just do the switch directly in the pinning), my condition is the timeboxing that I've laid out. |
No, I'm saying let's capture this behavior in the migrator itself |
That is not possible, to my understanding of how conda-build, smithy and our infrastructure work. |
Would reframe that as: it may not be currently possible. However, based on your feedback, it sounds desirable, so let's see if we can figure out a way to do it. This is likely not the last time we will want something like this. |
In issues #6630 and #6720, we discussed and decided to move to the latest CUDA 12 (at the time 12.6). We decided to pin the minor version to avoid accidentally picking up incomplete updates of the CUDA Toolkit, though in principle we were open to updating when a new version came out.
Recently CUDA 12.8 was released and packaged on conda-forge (conda-forge/cuda-feedstock#63). One of the new features is the addition of architectures sm_100, sm_101, and sm_120. For binaries to function correctly on these new architectures, they need to be rebuilt.
Given this, think it would make sense to create a CUDA 12.8 migrator to roll out CUDA 12.8 to feedstocks so they can be rebuilt topologically.
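As an illustration of what rebuilding "topologically" means, here is a small sketch using Python's standard `graphlib`. The feedstock names and dependency edges are hypothetical; the point is only that each feedstock is rebuilt after the feedstocks it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical feedstocks; each maps to the feedstocks it depends on.
DEPENDS_ON = {
    "lib-core": set(),
    "lib-math": {"lib-core"},
    "lib-io": {"lib-core"},
    "framework": {"lib-math", "lib-io"},
}

# static_order() yields every node after all of its dependencies.
rebuild_order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(rebuild_order)
```

The relative order of independent feedstocks (here `lib-math` and `lib-io`) is not fixed, so only the dependency constraints should be relied on.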
Should add that I would be OK with dropping CUDA 12.6 at the same time.
Would be interested in hearing others' thoughts on this.
cc @conda-forge/core