call for new feature similar to function __clz() in cuda #8212
Comments
We welcome contributions for this feature and are happy to help anyone who is interested. This PR, which adds popcnt to Taichi, may be a helpful example of how to add intrinsics to Taichi.
Hi! I'm willing to work on this issue.
Thank you! Please let us know if you need any assistance.
Thanks! To be clear, I am writing a manual implementation along these lines, right?

    for i in range(32):
        if 2**i > n:
            return 32 - i
Not exactly. We should add an intrinsic to the IR and use the built-in intrinsics in the LLVM- and SPIRV-based backends.
Noted, thanks!
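For reference, the loop from the question above can be completed into a small pure-Python function. This is only a sketch of the expected semantics, handy for checking results on the host; it is not the intrinsic-based approach the maintainers asked for. The names `clz32_reference` and `clz32_bit_length` are hypothetical, and treating the input as an unsigned 32-bit value is an assumption.

```python
def clz32_reference(n: int) -> int:
    """Count leading zero bits of n, viewed as an unsigned 32-bit integer."""
    n &= 0xFFFFFFFF  # assumption: only the low 32 bits matter
    for i in range(32):
        # The first power of two that exceeds n tells us how many bits n uses.
        if (1 << i) > n:
            return 32 - i
    # Bit 31 is set, so no power of two below 2**32 exceeds n.
    return 0


def clz32_bit_length(n: int) -> int:
    """Same result using int.bit_length() instead of a loop."""
    return 32 - (n & 0xFFFFFFFF).bit_length()
```

Note the final `return 0`: without it, the original loop falls through and returns nothing for inputs with the top bit set.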
Issue: #8212

### Brief Summary

### <samp>🤖 Generated by Copilot at 5d312ab</samp>

This pull request implements a new `clz` function in Taichi, which counts the number of leading zeros for a 32-bit integer. The function is available as a Python function decorator, a unary operation in the Taichi expression system, and a backend-specific intrinsic in the code generation. The pull request modifies the relevant files in the `python`, `taichi`, and `codegen` directories.

### Walkthrough

### <samp>🤖 Generated by Copilot at 5d312ab</samp>

* Add a new function `clz` to count the number of leading zeros for a 32-bit integer ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-059028cb0798284bed05638becbc32d256736846de19746e196fe5f5ee7fd061R1118-R1126), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-5b3923516b48467202850afb384ef9901ecefae0173f03bcc9055adffe96d738R814-R816), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-f95015864ea3251da5d376f2a11e8f5a0045d7aaf4602370471686f56561dafdR22), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-b0b26408cd63f0a7edc6e9a6936ec09df7dc5f37c2ab65d72b3f9125f1385ba1R90), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-af631a0c71978fe591e17005f01f7c06bc30ae36c65df306bbb3b08ade770167R941))
  * Define the function `clz` in the `ops` module in `python/taichi/lang/ops.py` using a unary operation wrapper ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-059028cb0798284bed05638becbc32d256736846de19746e196fe5f5ee7fd061R1118-R1126))
  * Add a wrapper function `clz` in the `mathimpl` module in `python/taichi/math/mathimpl.py` to allow using `clz` as a Taichi function decorator ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-5b3923516b48467202850afb384ef9901ecefae0173f03bcc9055adffe96d738R814-R816))
  * Add a new macro for the `clz` unary operation in `taichi/inc/unary_op.inc.h` and `taichi/ir/expression_ops.h` to expand to the corresponding enum value and expression class ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-f95015864ea3251da5d376f2a11e8f5a0045d7aaf4602370471686f56561dafdR22), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-b0b26408cd63f0a7edc6e9a6936ec09df7dc5f37c2ab65d72b3f9125f1385ba1R90))
  * Add a new macro for the `clz` unary operation in `taichi/python/export_lang.cpp` to bind the operation to the Python interface ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-af631a0c71978fe591e17005f01f7c06bc30ae36c65df306bbb3b08ade770167R941))
* Implement the `clz` unary operation for different backends ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-50537ad5ea3b900c0d55a088f3cc285986340ad68c9b96fea481187c4dce49eaL289-R296), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-3c663c78745adcd3f6a7ac81fe99e628decc3040f292ea1e20ecd4b85a7f4313R210-R213), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1620f2a387fc8acc55e2b2cfced07bb9cba59702609aae6e9489e703cbab5000R900-R904))
  * Add a new case for the `clz` unary operation in the CUDA backend code generation in `taichi/codegen/cuda/codegen_cuda.cpp`, which calls the CUDA intrinsic function `__clz` and checks the input type ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-50537ad5ea3b900c0d55a088f3cc285986340ad68c9b96fea481187c4dce49eaL289-R296))
  * Add a new case for the `clz` unary operation in the LLVM backend code generation in `taichi/codegen/llvm/codegen_llvm.cpp`, which calls the LLVM intrinsic function `ctlz` and assigns the result to the statement value ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-3c663c78745adcd3f6a7ac81fe99e628decc3040f292ea1e20ecd4b85a7f4313R210-R213))
  * Add a new case for the `clz` unary operation in the SPIRV backend code generation in `taichi/codegen/spirv/spirv_codegen.cpp`, which calls the GLSL 450 extended instruction `FindMSB` and subtracts the result from 32 ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1620f2a387fc8acc55e2b2cfced07bb9cba59702609aae6e9489e703cbab5000R900-R904))
* Add a new method for the `clz` unary operation in the IR builder class, which is a helper class for constructing IR statements ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-bdb4f85a29d6478a4482d81ca072237534fb641b52f3c529aca93e872ade6fecR278-R281), [link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1894085b261e833e3e66924fc5b1cf63b9dd8b8aa0b3e78ec64366396131470dR177))
  * Add a declaration for the `clz` unary operation method in the IR builder class header file in `taichi/ir/ir_builder.h` ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-1894085b261e833e3e66924fc5b1cf63b9dd8b8aa0b3e78ec64366396131470dR177))
  * Add a definition for the `clz` unary operation method in the IR builder class source file in `taichi/ir/ir_builder.cpp`, which creates and inserts a new unary operation statement with the `clz` type and the input value ([link](https://github.com/taichi-dev/taichi/pull/8276/files?diff=unified&w=0#diff-bdb4f85a29d6478a4482d81ca072237534fb641b52f3c529aca93e872ade6fecR278-R281))

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bob Cao <[email protected]>
Co-authored-by: Lin Jiang <[email protected]>
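The relationship between a "find most significant bit" instruction and a leading-zero count, as used in the SPIRV item above, can be modeled in plain Python. This is a sketch under stated assumptions, not the code from the pull request: `find_umsb` is a hypothetical stand-in that follows the documented GLSL `findMSB` behavior for unsigned inputs (index of the highest set bit, or -1 for zero). Under that definition the offset that matches CUDA's `__clz` is 31 rather than 32; the sketch makes no claim about what the merged code actually emits.

```python
def find_umsb(x: int) -> int:
    """Model of an unsigned find-MSB: index of the highest set bit, -1 if x == 0."""
    x &= 0xFFFFFFFF  # assumption: 32-bit unsigned input
    return x.bit_length() - 1


def clz_from_find_umsb(x: int) -> int:
    """Leading-zero count derived from the find-MSB result."""
    # Highest set bit at index k leaves 31 - k zero bits above it;
    # for x == 0 the index is -1, which yields 32.
    return 31 - find_umsb(x)
```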
When building a linear BVH on the GPU, a useful observation is that dividing the objects by the highest differing bit in their Morton codes corresponds to classifying them on either side of an axis-aligned plane in 3D. To find that bit, CUDA provides the intrinsic function __clz(), which counts the number of leading zero bits in a 32-bit integer. However, Taichi does not currently offer this feature, even though it is important for building a linear BVH.
I am requesting a feature similar to __clz().
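To illustrate why a leading-zero count matters here: the highest differing bit of two Morton codes is the highest set bit of their XOR, so a count of leading zeros on the XOR gives the length of their common prefix. The sketch below uses plain Python in place of the requested intrinsic; `clz32`, `common_prefix_length`, and `highest_differing_bit` are hypothetical names, and 32-bit Morton codes are assumed.

```python
def clz32(x: int) -> int:
    """Host-side stand-in for __clz(): leading zeros of a 32-bit value."""
    return 32 - (x & 0xFFFFFFFF).bit_length()


def common_prefix_length(code_a: int, code_b: int) -> int:
    """Number of leading bits shared by two 32-bit Morton codes."""
    return clz32(code_a ^ code_b)


def highest_differing_bit(code_a: int, code_b: int) -> int:
    """Index (0 = least significant) of the top bit where the codes differ.

    Returns -1 when the codes are identical.
    """
    diff = (code_a ^ code_b) & 0xFFFFFFFF
    if diff == 0:
        return -1
    return 31 - clz32(diff)
```

For example, `0b1000` and `0b1100` first differ at bit 2, so they share a 29-bit prefix. Objects whose codes share a longer prefix lie in the same, smaller spatial cell.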