[TRANSFORM] Pipeline (triton-lang#11)
Jokeren authored Jul 20, 2023
1 parent f5aebb5 commit 5802f75
Showing 3 changed files with 840 additions and 662 deletions.
11 changes: 10 additions & 1 deletion TODO.md
@@ -15,6 +15,13 @@ https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/lib/Dialect/Triton/IR/Traits.cpp#L10
* Don't call arith to LLVM conversion from MLIR
* The linearize/delinearize helpers have been duplicated (most likely due to layering problems); they should be merged
* Try `scf.if` in the pipeline to replace `remui` when indices are at the boundary
* Clean up `waitIdx`, `phase`, and other indices. The pipelined loop currently carries far too many variables.
https://github.com/openai/triton-hopper/blob/9453151688804ebaf8bebca38a62ada5bb343d3c/lib/Dialect/TritonGPU/Transforms/Pipeline.cpp#L166
* Get rid of the hacky `mode` variable in the pipeline
https://github.com/openai/triton-hopper/blob/9453151688804ebaf8bebca38a62ada5bb343d3c/lib/Dialect/TritonGPU/Transforms/Pipeline.cpp#L180
* The pipeline shouldn't have special handling for Hopper; ideally it is agnostic to the target architecture
https://github.com/openai/triton-hopper/blob/9453151688804ebaf8bebca38a62ada5bb343d3c/lib/Dialect/TritonGPU/Transforms/Pipeline.cpp#L226
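
The `scf.if`-instead-of-`remui` idea above can be illustrated outside of MLIR. A minimal Python sketch (all names and the stage count are hypothetical, not from the codebase) of replacing a per-iteration integer remainder with a boundary check, which also gives a natural place to flip a barrier phase when the buffer index wraps:

```python
NUM_STAGES = 3  # hypothetical pipeline depth

def advance_remui(idx):
    # Current style: an integer remainder (remui) on every iteration.
    return (idx + 1) % NUM_STAGES

def advance_select(idx, phase):
    # Proposed style: a compare at the boundary (what scf.if / select
    # would express), avoiding the remainder and flipping the phase
    # exactly when the index wraps around.
    nxt = idx + 1
    if nxt == NUM_STAGES:
        return 0, phase ^ 1
    return nxt, phase

# Both formulations visit the buffers in the same order.
idx_a = idx_b = phase = 0
for _ in range(10):
    idx_a = advance_remui(idx_a)
    idx_b, phase = advance_select(idx_b, phase)
    assert idx_a == idx_b
```

The branchy form trades one integer division for a compare-and-select per iteration, which is generally the cheaper operation on GPUs.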

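One possible reading of the architecture-agnostic item above: keep the pipeline scheduling generic and push target-specific load emission behind a small interface. A hypothetical Python sketch of that split — none of these class or function names exist in the codebase:

```python
class LoadEmitter:
    """Target-specific part: how a pipelined load is materialized."""
    def emit_load(self, stage):
        raise NotImplementedError

class AmpereEmitter(LoadEmitter):
    def emit_load(self, stage):
        return f"cp.async for stage {stage}"

class HopperEmitter(LoadEmitter):
    def emit_load(self, stage):
        return f"TMA copy for stage {stage}"

def pipeline_prologue(num_stages, emitter):
    # Generic part: the prologue structure is the same for every
    # target; only the emitted load differs.
    return [emitter.emit_load(s) for s in range(num_stages)]
```

With this shape, the pass itself never branches on the architecture; adding a new target means adding a new emitter, not editing the scheduler.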
## bug fixes

@@ -25,4 +32,6 @@ https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1
* The IR printed for `make_tensor_ptr` is wrong: `!tt.ptr` is dropped from the output.
https://github.com/openai/triton-hopper/blob/1ada046fdaef13f94dc7e2f6e6d0966e5d1efe92/include/triton/Dialect/Triton/IR/TritonOps.td#L562
* We currently rely on the `cuda-python` package, which prevents us from building Triton on any node without CUDA installed. We should invoke TMA-related functions through our thin CUDA wrapper instead.
https://github.com/openai/triton-hopper/blob/b6a6b32b0ee79e93247d20c95f15fd75039a40b9/python/triton/compiler/utils.py#L3
* Pipeline doesn't handle block ptrs correctly
* Pipeline doesn't handle TMAs correctly
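
The thin-wrapper item above amounts to binding the CUDA driver lazily, so that importing (and building) Triton never requires CUDA to be present. A hedged sketch of that idea — the class name and structure are hypothetical, and this makes no claim about the eventual wrapper API:

```python
import ctypes
import ctypes.util

class ThinCudaDriver:
    """Lazily binds libcuda; importing this module touches no CUDA."""

    def __init__(self):
        self._lib = None  # nothing loaded at import/build time

    def _load(self):
        # Resolve the driver library only on first real use.
        if self._lib is None:
            path = ctypes.util.find_library("cuda")
            if path is None:
                raise RuntimeError(
                    "libcuda not found: CUDA is needed at run time, "
                    "not at build time")
            self._lib = ctypes.CDLL(path)
        return self._lib

    def cu_init(self):
        # cuInit is a real driver entry point; it is only resolved
        # and called here, never during import.
        return self._load().cuInit(0)

driver = ThinCudaDriver()  # safe on machines without CUDA installed
```

Because construction defers everything, a CUDA-less build box can import the module; only calling a driver function at run time requires `libcuda`.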
