[Lang] [ir] [cuda] Add clz instruction (#8276)
Issue: #8212

### Brief Summary

### <samp>🤖 Generated by Copilot at 5d312ab</samp>

This pull request implements a new `clz` function in Taichi, which counts the number of leading zeros for a 32-bit integer. The function is available as a Python function decorator, a unary operation in the Taichi expression system, and a backend-specific intrinsic in the code generation. The pull request modifies the relevant files in the `python`, `taichi`, and `codegen` directories.

### Walkthrough

### <samp>🤖 Generated by Copilot at 5d312ab</samp>

* Add a new function `clz` to count the number of leading zeros for a 32-bit integer ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-059028cb0798284bed05638becbc32d256736846de19746e196fe5f5ee7fd061R1118-R1126), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-5b3923516b48467202850afb384ef9901ecefae0173f03bcc9055adffe96d738R814-R816), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-f95015864ea3251da5d376f2a11e8f5a0045d7aaf4602370471686f56561dafdR22), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-b0b26408cd63f0a7edc6e9a6936ec09df7dc5f37c2ab65d72b3f9125f1385ba1R90), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-af631a0c71978fe591e17005f01f7c06bc30ae36c65df306bbb3b08ade770167R941))
  * Define the function `clz` in the `ops` module in `python/taichi/lang/ops.py` using a unary operation wrapper ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-059028cb0798284bed05638becbc32d256736846de19746e196fe5f5ee7fd061R1118-R1126))
  * Add a wrapper function `clz` in the `mathimpl` module in `python/taichi/math/mathimpl.py` to allow using `clz` as a Taichi function decorator ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-5b3923516b48467202850afb384ef9901ecefae0173f03bcc9055adffe96d738R814-R816))
  * Add a new macro for the `clz` unary operation in `taichi/inc/unary_op.inc.h` and `taichi/ir/expression_ops.h` to expand to the corresponding enum value and expression class ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-f95015864ea3251da5d376f2a11e8f5a0045d7aaf4602370471686f56561dafdR22), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-b0b26408cd63f0a7edc6e9a6936ec09df7dc5f37c2ab65d72b3f9125f1385ba1R90))
  * Add a new macro for the `clz` unary operation in `taichi/python/export_lang.cpp` to bind the operation to the Python interface ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-af631a0c71978fe591e17005f01f7c06bc30ae36c65df306bbb3b08ade770167R941))
* Implement the `clz` unary operation for different backends ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-50537ad5ea3b900c0d55a088f3cc285986340ad68c9b96fea481187c4dce49eaL289-R296), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-3c663c78745adcd3f6a7ac81fe99e628decc3040f292ea1e20ecd4b85a7f4313R210-R213), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1620f2a387fc8acc55e2b2cfced07bb9cba59702609aae6e9489e703cbab5000R900-R904))
  * Add a new case for the `clz` unary operation in the CUDA backend code generation in `taichi/codegen/cuda/codegen_cuda.cpp`, which calls the CUDA intrinsic function `__clz` and checks the input type ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-50537ad5ea3b900c0d55a088f3cc285986340ad68c9b96fea481187c4dce49eaL289-R296))
  * Add a new case for the `clz` unary operation in the LLVM backend code generation in `taichi/codegen/llvm/codegen_llvm.cpp`, which calls the LLVM intrinsic function `ctlz` and assigns the result to the statement value ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-3c663c78745adcd3f6a7ac81fe99e628decc3040f292ea1e20ecd4b85a7f4313R210-R213))
  * Add a new case for the `clz` unary operation in the SPIRV backend code generation in `taichi/codegen/spirv/spirv_codegen.cpp`, which calls the GLSL 450 extended instruction `FindMSB` and subtracts the result from 32 ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1620f2a387fc8acc55e2b2cfced07bb9cba59702609aae6e9489e703cbab5000R900-R904))
* Add a new method for the `clz` unary operation in the IR builder class, which is a helper class for constructing IR statements ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-bdb4f85a29d6478a4482d81ca072237534fb641b52f3c529aca93e872ade6fecR278-R281), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1894085b261e833e3e66924fc5b1cf63b9dd8b8aa0b3e78ec64366396131470dR177))
  * Add a declaration for the `clz` unary operation method in the IR builder class header file in `taichi/ir/ir_builder.h` ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1894085b261e833e3e66924fc5b1cf63b9dd8b8aa0b3e78ec64366396131470dR177))
  * Add a definition for the `clz` unary operation method in the IR builder class source file in `taichi/ir/ir_builder.cpp`, which creates and inserts a new unary operation statement with the `clz` type and the input value ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-bdb4f85a29d6478a4482d81ca072237534fb641b52f3c529aca93e872ade6fecR278-R281))

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bob Cao <[email protected]>
Co-authored-by: Lin Jiang <[email protected]>
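
### Reference semantics (illustrative)

For readers checking a backend against the summary's "number of leading zeros for a 32-bit integer", the operation can be sketched in plain Python. This is not code from the pull request: `clz32_reference` is a hypothetical helper name, and returning 32 for a zero input is an assumption of this sketch rather than a behavior the description states.

```python
def clz32_reference(x: int) -> int:
    """Count leading zero bits of x viewed as a 32-bit unsigned value.

    Hypothetical reference helper, not part of Taichi.
    """
    # Mask to 32 bits so that negative Python ints map to their
    # two's-complement bit pattern.
    bits = x & 0xFFFFFFFF
    # int.bit_length() is the index of the highest set bit plus one
    # (and 0 for zero), so the leading zeros are what remains of 32.
    # Assumption: a zero input yields 32.
    return 32 - bits.bit_length()


if __name__ == "__main__":
    for value in (0, 1, 0x0000FFFF, 0x80000000, -1):
        print(f"{value:#x}: {clz32_reference(value)}")
```

A helper like this is only useful as an oracle in tests; the backends described above rely on their own intrinsics.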