
[CIR][CUDA] Skeleton of NVPTX target lowering info #1358

Merged 1 commit into llvm:main on Feb 22, 2025

Conversation

AdUhTkJm (Contributor)

Added a skeleton of NVPTX target lowering info.

This enables lowering of simple.cu (as it hardly tests device-side functionality), so an LLVM IR test has also been added for it.

@AdUhTkJm (Contributor, Author)

Just noticed that it should also emit the registration function when emitting LLVM IR, so it's probably not ready to be merged yet; I'll update it later.

@bcardosolopes (Member)

no problem @AdUhTkJm, let me know when it's ready!

@AdUhTkJm (Contributor, Author) commented Feb 19, 2025

> no problem @AdUhTkJm, let me know when it's ready!

It seems generating that registration function is much harder than I had thought. So I plan to block LLVM IR generation for this PR, and gradually set up a few more PRs to do lowering -- otherwise this one might become really huge.

For more details:

I'm planning to generate the registration in LoweringPrepare. The first argument of __cudaRegisterFunction is a pointer to the fat binary handle. We need to obtain it from an option in CodeGenOpts, which we don't have access to at that stage. So I think I'll add an attribute to the ModuleOp, recording the value of that option.
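
A minimal sketch of that idea, using hypothetical names (the attribute key "cir.cuda_gpu_binary", the helper functions, and the exact CodeGenOpts field are placeholders, not what the PR actually uses): record the value as a module attribute while CIRGen still has access to CodeGenOpts, then read it back in LoweringPrepare.

#include "llvm/ADT/StringRef.h"
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinOps.h"

// CIRGen side (CodeGenOpts is in scope here): stash the option's value on the
// ModuleOp under a placeholder attribute name.
void recordCUDAGpuBinaryName(mlir::ModuleOp module,
                             llvm::StringRef gpuBinaryName) {
  module->setAttr("cir.cuda_gpu_binary",
                  mlir::StringAttr::get(module.getContext(), gpuBinaryName));
}

// LoweringPrepare side (no CodeGenOpts here): read the recorded value back.
llvm::StringRef getCUDAGpuBinaryName(mlir::ModuleOp module) {
  if (auto attr =
          module->getAttrOfType<mlir::StringAttr>("cir.cuda_gpu_binary"))
    return attr.getValue();
  return {};
}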

Another problem is that we'll need to create a runtime function, which is hard without knowing which type size_t is (that information also lives in codegen). The same issue shows up in other parts of this pass as well; there is a comment in lowerArrayDtorCtorIntoLoop saying:

// TODO: instead of fixed integer size, create alias for PtrDiffTy and unify
// with CIRGen stuff.
auto ptrDiffTy =
      cir::IntType::get(builder.getContext(), 64, /*signed=*/false);

Perhaps just create an unsigned 64-bit integer for size_t for now?

@bcardosolopes (Member)

> and gradually set up a few more PRs to do lowering -- otherwise this one might become really huge.

Sounds great to me!

> I'm planning to generate the registration in LoweringPrepare. The first argument of __cudaRegisterFunction is a pointer to the fat binary handle. We need to obtain it from an option in CodeGenOpts, which we don't have access to at that stage. So I think I'll add an attribute to the ModuleOp, recording the value of that option.

Sure! Since you are touching this, would you mind migrating the content of CIRDataLayout to the module as well in a follow-up PR? See below.

> Perhaps just create an unsigned 64-bit integer for size_t for now?

Ideally we want to share the content of CIRGenTypeCache with LoweringPrepare; it should live in ModuleOp, like the one above. For this PR: you can create a new method in CIRDataLayout named getPtrDiffTy() and make it return the code you pasted above. While at it, update the existing example above to use getPtrDiffTy() as well.
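
A sketch of what that helper could look like, assuming it receives the MLIRContext as a parameter (the real signature and its exact placement inside CIRDataLayout may differ); the body is just the snippet quoted above, centralized in one place:

// In CIRDataLayout (placement and signature are assumptions):
cir::IntType getPtrDiffTy(mlir::MLIRContext *ctx) const {
  // TODO: derive width/signedness from the data layout instead of
  // hard-coding an unsigned 64-bit integer, and unify with CIRGen.
  return cir::IntType::get(ctx, 64, /*signed=*/false);
}

Call sites in LoweringPrepare would then do something like auto ptrDiffTy = dataLayout.getPtrDiffTy(builder.getContext()); instead of constructing the type inline.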

@AdUhTkJm (Contributor, Author)

> Ideally we want to share the content of CIRGenTypeCache with LoweringPrepare; it should live in ModuleOp, like the one above. For this PR: you can create a new method in CIRDataLayout named getPtrDiffTy() and make it return the code you pasted above. While at it, update the existing example above to use getPtrDiffTy() as well.

Yeah, I'll do that. By the way, this PR is ready for review now.

@@ -260,6 +260,11 @@ class CIRGenConsumer : public clang::ASTConsumer {
}
}

if (action != CIRGenAction::OutputType::EmitCIR) {
  if (C.getLangOpts().CUDA && !C.getLangOpts().CUDAIsDevice)
@bcardosolopes (Member)

I'd rather not have target-specific restrictions in this file. What are you hitting that calls for this? If it's because something cannot lower to LLVM, that's fine; that is already the status quo for CUDA.

@AdUhTkJm (Contributor, Author)

I want to prevent CIR from lowering into LLVM. Currently the lowering pass just goes smoothly all the way to LLVM, but the result is wrong: it lacks the ptx_kernel calling convention and the registration function.

I tried to put that in LoweringPrepare, but it prevents -emit-cir from generating CIR. Where else should I put it?

@bcardosolopes (Member) commented Feb 20, 2025

Because the CUDA support didn't have LLVM lowering right from the beginning, it's fine that we are in this intermediate state, which will soon be fixed by your incremental work. (For future PRs: whenever you add something in CIR, you should also add LLVM support in the same PR; this will prevent these issues.)

I'd say don't put this anywhere; instead, add LLVM output to the tests where it makes sense and annotate them with the COM directive. That will indicate the current status and give you an easy diff to patch once the callconv work lands.
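
For context, COM: is FileCheck's comment directive, so annotated lines are ignored by FileCheck while still documenting intent. A hypothetical illustration of such an annotation (the LLVM prefix, kernel pattern, and IR lines are placeholders, not taken from simple.cu):

// COM: The LLVM lines below record what we expect once the calling-convention
// COM: and registration-function work lands; FileCheck treats COM: lines as
// COM: comments, so nothing is asserted yet and the diff to enable them later
// COM: stays small.
// COM: LLVM: define ptx_kernel void @{{.*}}
// COM: LLVM: call {{.*}} @__cudaRegisterFunction(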

bcardosolopes merged commit cc67bf7 into llvm:main on Feb 22, 2025
6 checks passed