[NFC] Just make the git history straight #7

Closed
wants to merge 3 commits into from


minjang commented May 24, 2024

No functional change. This only straightens the git history by removing the merge commit.

Current:

> % git log --graph --oneline
* 6db262346 - (HEAD -> main, origin/main, origin/HEAD) Prototype of the Triton CPU (#6) 
* 26be3541f - Add a workaround for LLVM bug in...
* 72057f3f9 - Revert unreviewed changes...
*   a19ece70a - Merge pull request #2 from ...
|\  
| * 95d4d0c0f - Support basic lowering...
|/  
* 71161cbc9 - Quick patches to make it...
* 27a55e604 - Convert tt.get_program_id...

Fix it:

> git rebase -i 95d4d0c0f^

Fixed:

> % gl --max-count=10       
* 3e1fc1d5a - (HEAD -> main) Prototype of the Triton CPU backend ...
* ac06f01e5 - Add a workaround for LLVM bug in codegen for bf16 ...
* 3a5afddd8 - Revert unreviewed changes. (#5) (4 seconds ago) <shanenay>
* 95d4d0c0f - Support basic lowering through vector dialect in ...
* 71161cbc9 - Quick patches to make it work after rebasing (#3) (9 days ago) <Minjang Kim>
* 27a55e604 - [BACKEND][CPU] Convert tt.get_program_id and ...
* 68cb5dd9d - Convert tt.func and tt.return (#4) (10 days ago) <Minjang Kim>
* 176cdd21e - Update README.md (10 days ago) <Minjang Kim>

shanenay and others added 3 commits May 24, 2024 15:17
…ion flows (triton-lang#6)

* Support basic lowering through vector dialect in CPU backend.

Signed-off-by: Ilya Enkovich <[email protected]>

* Use axis info in memory op lowering.

Signed-off-by: Ilya Enkovich <[email protected]>

* Mark test_ptx_cast as enabled for CPU.

Signed-off-by: Ilya Enkovich <[email protected]>

* Support umulhi operation.

Signed-off-by: Ilya Enkovich <[email protected]>

* Support tl.clamp, tl.minimum, tl.maximum.

Signed-off-by: Ilya Enkovich <[email protected]>

* Add enable_fp_fusion opt for CPU (only affects ASM dump now).

Signed-off-by: Ilya Enkovich <[email protected]>

* Fix kernel args passing for propagated constants.

Signed-off-by: Ilya Enkovich <[email protected]>

* Add permutations support.

Signed-off-by: Ilya Enkovich <[email protected]>

* Support 2-D transfer_read/transfer_write lowering.

Signed-off-by: Ilya Enkovich <[email protected]>

* Introduce shape info analysis and use it for loads/stores by block pointers.

Delay scalar pointers lowering.

Signed-off-by: Ilya Enkovich <[email protected]>

* Support 'other' arg for loads.

Signed-off-by: Ilya Enkovich <[email protected]>

* Support tl.join.

Signed-off-by: Ilya Enkovich <[email protected]>

* Minor renaming.

Signed-off-by: Ilya Enkovich <[email protected]>

---------

Signed-off-by: Ilya Enkovich <[email protected]>
@minjang minjang requested a review from ptillet as a code owner May 24, 2024 22:52
@minjang minjang requested review from bertmaher and ienkovich May 24, 2024 22:52

minjang commented May 28, 2024

It turns out a PR can't straighten the git history; we'd need to `git push --force` instead. Closing.

@minjang minjang closed this May 28, 2024
minjang pushed a commit that referenced this pull request Jun 24, 2024
When running
[convert_blocked1d_to_slice0](https://github.com/triton-lang/triton/blob/0ba5f0c3cd029d5c3d1f01b9bf29dac32c27345e/test/Conversion/tritongpu_to_llvm.mlir#L924),
Triton ends up computing the rank of a matrix with 0 columns during linear
layout lowering. This trips up f2reduce and causes undefined behavior,
detectable through
[UBSAN](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html).

Fix this by returning the rank (0) early in these cases, without calling
f2reduce.
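
The early-return idea can be sketched as follows. This is a minimal illustration, not the actual Triton or f2reduce code: `rankGF2` is a hypothetical helper, and the bit-packed row representation (bit `j` of `rows[i]` is entry `(i, j)`, at most 64 columns) is an assumption made for the example. It shows why the guard matters: any expression of the form `1ULL << (numCols - 1)` wraps to a huge shift exponent when `numCols` is 0, which matches the kind of error UBSAN reports in the trace below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not Triton's getMatrixRank): rank over GF(2) of a
// matrix whose rows are packed into 64-bit words.
unsigned rankGF2(std::vector<uint64_t> rows, unsigned numCols) {
  // Early return: a matrix with no rows or no columns has rank 0.
  // Skipping the elimination entirely avoids shifts such as
  // (1ULL << (numCols - 1)), whose exponent wraps around when numCols == 0.
  if (rows.empty() || numCols == 0)
    return 0;
  assert(numCols <= 64 && "this sketch packs each row into one 64-bit word");

  unsigned rank = 0;
  for (unsigned col = 0; col < numCols && rank < rows.size(); ++col) {
    const uint64_t mask = uint64_t{1} << col;
    // Find a pivot row at or below `rank` with a 1 in this column.
    std::size_t pivot = rank;
    while (pivot < rows.size() && !(rows[pivot] & mask))
      ++pivot;
    if (pivot == rows.size())
      continue; // no pivot in this column
    std::swap(rows[rank], rows[pivot]);
    // Clear this column in every other row (XOR is addition in GF(2)).
    for (std::size_t r = 0; r < rows.size(); ++r)
      if (r != rank && (rows[r] & mask))
        rows[r] ^= rows[rank];
    ++rank;
  }
  return rank;
}
```

With this guard, a call such as `rankGF2({0, 0}, 0)` returns 0 without performing any shift, while non-degenerate inputs go through ordinary Gaussian elimination.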

<details><summary>Stack trace</summary>
<p>

```
third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30: runtime error: shift exponent 18446744073709551615 is too large for 64-bit type 'unsigned long long'
    #0 0x556ee2fea3be in inplace_rref_small third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30
    #1 0x556ee2fea3be in f2reduce::inplace_rref_strided(unsigned long*, unsigned long, unsigned long, unsigned long) third_party/triton/third_party/f2reduce/f2reduce.cpp:470:9
    #2 0x556ee2ea70da in getMatrixRank third_party/triton/lib/Tools/LinearLayout.cpp:125:3
    #3 0x556ee2ea70da in mlir::triton::LinearLayout::checkInvariants(bool) third_party/triton/lib/Tools/LinearLayout.cpp:299:7
    #4 0x556ee2ea656d in mlir::triton::LinearLayout::tryCreate(llvm::MapVector<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>, llvm::DenseMap<mlir::StringAttr, unsigned int, llvm::DenseMapInfo<mlir::StringAttr, void>, llvm::detail::DenseMapPair<mlir::StringAttr, unsigned int>>, llvm::SmallVector<std::__u::pair<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>>, 0u>>, llvm::ArrayRef<std::__u::pair<mlir::StringAttr, int>>, bool) third_party/triton/lib/Tools/LinearLayout.cpp:190:41
    #5 0x556ee2eb2150 in mlir::triton::LinearLayout::divideRight(mlir::triton::LinearLayout const&) third_party/triton/lib/Tools/LinearLayout.cpp:654:51
    #6 0x556ee2ee1c39 in mlir::cvtNeedsSharedMemory(mlir::RankedTensorType, mlir::RankedTensorType) third_party/triton/lib/Analysis/Utility.cpp:652:14
    #7 0x556ee2cf38fd in mlir::triton::getRepShapeForCvtLayout(mlir::triton::gpu::ConvertLayoutOp) third_party/triton/lib/Analysis/Allocation.cpp:66:8
    #8 0x556ee2cf3efa in mlir::triton::getScratchConfigForCvtLayout(mlir::triton::gpu::ConvertLayoutOp, unsigned int&, unsigned int&) third_party/triton/lib/Analysis/Allocation.cpp:95:19
    #9 0x556ee2cf6057 in mlir::triton::AllocationAnalysis::getScratchValueSize(mlir::Operation*) third_party/triton/lib/Analysis/Allocation.cpp:272:24
    #10 0x556ee2cf5499 in operator() third_party/triton/lib/Analysis/Allocation.cpp:343:7
    #11 0x556ee2cf5499 in void llvm::function_ref<void (mlir::Operation*)>::callback_fn<mlir::triton::AllocationAnalysis::getValuesAndSizes()::'lambda'(mlir::Operation*)>(long, mlir::Operation*) third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
    #12 0x556edeeee7a9 in operator() third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
    #13 0x556edeeee7a9 in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:174:5
    #14 0x556edeeee87c in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:182:9
    #15 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), mlir::Operation *, void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:313:10
    #16 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Operation.h:794:12
    #17 0x556ee2cf49e7 in mlir::triton::AllocationAnalysis::getValuesAndSizes() third_party/triton/lib/Analysis/Allocation.cpp:341:16
    #18 0x556ee2cf4852 in run third_party/triton/lib/Analysis/Allocation.cpp:182:5
    #19 0x556ee2cf4852 in AllocationAnalysis third_party/triton/lib/Analysis/Allocation.cpp:169:5
    #20 0x556ee2cf4852 in mlir::Allocation::run(llvm::DenseMap<mlir::FunctionOpInterface, mlir::Allocation, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>, llvm::detail::DenseMapPair<mlir::FunctionOpInterface, mlir::Allocation>>&) third_party/triton/lib/Analysis/Allocation.cpp:627:3
    #21 0x556ee1677402 in operator() third_party/triton/include/triton/Analysis/Allocation.h:227:26
    #22 0x556ee1677402 in void mlir::CallGraph<mlir::Allocation>::doWalk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)>(mlir::FunctionOpInterface, llvm::DenseSet<mlir::FunctionOpInterface, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>>&, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)) third_party/triton/include/triton/Analysis/Utility.h:350:7
    #23 0x556ee16756b3 in walk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, (lambda at third_party/triton/include/triton/Analysis/Allocation.h:222:9), (lambda at third_party/triton/include/triton/Analysis/Allocation.h:224:9)> third_party/triton/include/triton/Analysis/Utility.h:242:7
    #24 0x556ee16756b3 in mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp) third_party/triton/include/triton/Analysis/Allocation.h:220:5
    #25 0x556ee2c2bf18 in (anonymous namespace)::AllocateSharedMemory::runOnOperation() third_party/triton/lib/Conversion/TritonGPUToLLVM/AllocateSharedMemory.cpp:26:22
...
UndefinedBehaviorSanitizer: invalid-shift-exponent third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30 
```
</p>
</details>
Devjiu pushed a commit to Devjiu/triton-cpu that referenced this pull request Nov 13, 2024
…on (triton-lang#7)

* Lift -triton-raise-block-pointer pass from intel-xpu-backend-for-triton

Code was in turn taken from triton-shared (though does not use the tts
dialect).
ienkovich pushed a commit to ienkovich/triton-cpu that referenced this pull request Nov 20, 2024
…on (triton-lang#7)

* Lift -triton-raise-block-pointer pass from intel-xpu-backend-for-triton

Code was in turn taken from triton-shared (though does not use the tts
dialect).
Devjiu pushed a commit to Devjiu/triton-cpu that referenced this pull request Jan 20, 2025
…on (triton-lang#7)

* Lift -triton-raise-block-pointer pass from intel-xpu-backend-for-triton

Code was in turn taken from triton-shared (though does not use the tts
dialect).