
don't depend on cuda-nvcc; use cuda-nvcc-impl to avoid pulling in GCC #38

Merged: 6 commits into conda-forge:main from nvcc_impl on Jan 30, 2025

Conversation

h-vetinari (Member)

Fixes #37

@h-vetinari added the automerge (Merge the PR when CI passes) label on Jan 28, 2025
@h-vetinari requested a review from erip as a code owner on January 28, 2025, 20:29
conda-forge-admin (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/recipe.yaml) and found it was in an excellent condition.

conda-forge-admin (Contributor)

Hi! This is the friendly conda-forge automerge bot!

I considered the following status checks when analyzing this PR:

  • linter: passed
  • azure: failed

Thus the PR was not passing and not merged.

h-vetinari (Member, Author)

Hm, it seems we've hit an apparently new incompatibility:

    $PREFIX/targets/x86_64-linux/include/generated_cuda_meta.h:754:5: error: 'CUmemcpyAttributes' does not name a type; did you mean 'cudaMemcpyAttributes'?
      754 |     CUmemcpyAttributes *attrs;
          |     ^~~~~~~~~~~~~~~~~~
          |     cudaMemcpyAttributes

@h-vetinari removed the automerge (Merge the PR when CI passes) label on Jan 28, 2025
jakirkham (Member) left a comment


Thanks Axel! 🙏

Have a couple of questions below.

Two review threads on recipe/recipe.yaml (both resolved; one outdated)
h-vetinari (Member, Author) commented on Jan 29, 2025

This is ready except for dealing with CUDA 12.8. I'm waiting for a response in triton-lang/triton#5737 (even though that's for the upcoming 3.2, the same principle would apply to 3.1 as well, if my proposed backport is deemed OK).

Otherwise, we can add a <12.8 constraint to unblock this.

jakirkham (Member)

Let's stick to CUDA 12.6 for the moment.

As great as it would be to have CUDA 12.8 here, there is some prep work needed.

Will bring this up for discussion later today.

h-vetinari (Member, Author)

> Let's stick to CUDA 12.6 for the moment.
>
> As great as it would be to have CUDA 12.8 here, there is some prep work needed.
>
> Will bring this up for discussion later today.

I'm not talking about building anything with 12.8. But triton currently does not have a runtime constraint on the CUDA version (and I'd like to keep it that way). However, that means we end up pulling cuda-cupti 12.8 into the test environment, and then triton doesn't know which PTX format to use and fails.

It's IMO triton's bug to have an == 6 instead of a >= 6 in the following:

     if major == 12:
    -    return 80 + minor
    +    if minor < 6:
    +        return 80 + minor
    +    elif minor == 6:
    +        return 85
     if major == 11:

That's the change I'd much rather make, but I wanted to allow a bit of time for feedback to arrive in triton-lang/triton#5737.
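
For illustration, here is a minimal self-contained sketch of how the helper could read with the >= 6 relaxation applied; the version-string parsing and the body of the major == 11 branch are assumptions filled in for completeness, not taken from this thread:

    def ptx_get_version(cuda_version) -> int:
        '''
        Get the highest PTX version supported by the current CUDA driver.
        '''
        # assumption: cuda_version arrives as a "major.minor" string
        major, minor = map(int, cuda_version.split('.')[:2])
        if major == 12:
            if minor < 6:
                return 80 + minor
            # relaxed from `minor == 6`: any newer 12.x toolkit can still
            # consume PTX 8.5, the highest version this code knows about
            return 85
        if major == 11:
            return 70 + minor  # assumption, mirroring the 12.x pattern
        raise RuntimeError(f"unsupported CUDA version: {cuda_version}")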

jakirkham (Member)

OK, for more context: CUDA 12.8 adds two new architectures, so I'm thinking about how we roll that out.

In any event, I'll raise an issue to discuss this and link it here.

h-vetinari (Member, Author) commented on Jan 29, 2025

> OK, for more context: CUDA 12.8 adds two new architectures.

I don't see how that matters, as long as builds don't ask to build for those architectures. The function I referenced literally says:

     def ptx_get_version(cuda_version) -> int:
         '''
         Get the highest PTX version supported by the current CUDA driver.
         '''

Note "highest". My point is that using ptx 85 should be fine regardless of whether the toolchain is 12.6 or 12.8.

@h-vetinari merged commit 8acb8c2 into conda-forge:main on Jan 30, 2025
13 checks passed
@h-vetinari deleted the nvcc_impl branch on January 30, 2025, 21:59
Development

Successfully merging this pull request may close these issues: Do not depend on cuda-nvcc (#37)

4 participants