forked from iree-org/iree
feat(iree): fuse elementwise generic ops into iree_linalg_ext.scan ops #39

Draft
wants to merge 1 commit into
base: artem/cpu-stack/scan-op-semantics



@AGindinson commented Mar 13, 2025

Enhance iree_linalg_ext.scan support in the pipeline by introducing a basic fusion pattern for generic ops that are identified as unary elementwise operations. Provided that the full shapes of the GenericOp producer and the ScanOp consumer match, we can simply apply the unary operation directly when the ScanOp reduction routine indexes its input data.

This essentially serves as a workaround for the LLVMCPU pipeline, where stack-bound allocations tend to exceed the size limit when a ScanOp receives only the pipeline's default tiling configuration.
Further enhancements should include fine-tuning the tiling config for LinalgExt operations and deliberately restricting workgroup-level tiling based on the predicted size of stack-bound allocas within a dispatch. Even so, the fusion itself is beneficial, and we should look to support more producer cases and extend the logic to other predefined LinalgExt ops.
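
To illustrate the idea behind the fusion, here is a minimal one-dimensional Python model. It is not IREE code and does not reflect the actual pattern implementation: the function names `scan_unfused` and `scan_fused` are hypothetical, and an inclusive scan is assumed. The point it tries to show is that applying the unary op at load time inside the scan loop gives the same result as materializing the producer's output first, without the intermediate buffer.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def scan_unfused(xs: List[T], unary: Callable[[T], T],
                 combine: Callable[[T, T], T]) -> List[T]:
    """Producer and consumer run separately (hypothetical model)."""
    # The elementwise producer materializes a full intermediate buffer.
    tmp = [unary(x) for x in xs]
    out: List[T] = []
    for i, v in enumerate(tmp):
        # Inclusive scan: the first element seeds the accumulator.
        acc = v if i == 0 else combine(acc, v)
        out.append(acc)
    return out


def scan_fused(xs: List[T], unary: Callable[[T], T],
               combine: Callable[[T, T], T]) -> List[T]:
    """Unary op folded into the scan's input indexing (hypothetical model)."""
    out: List[T] = []
    for i, x in enumerate(xs):
        # The unary op is applied as each input element is read,
        # so no intermediate buffer is needed.
        v = unary(x)
        acc = v if i == 0 else combine(acc, v)
        out.append(acc)
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    square = lambda x: x * x
    add = lambda a, b: a + b
    print(scan_unfused(data, square, add))
    print(scan_fused(data, square, add))
```

Both functions compute a running sum of squares here. The equivalence relies on the producer being a pure, unary, elementwise map over an input of the same shape as the scan's input, which mirrors the preconditions stated in the description.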

Author

AGindinson commented Mar 13, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

@AGindinson force-pushed the artem/cpu-stack/fuse-scan branch from 92dccd2 to 7dd47ee on March 14, 2025 18:01
@AGindinson force-pushed the artem/cpu-stack/scan-op-semantics branch from 81c38ad to 03fbc4f on March 14, 2025 18:01
@AGindinson force-pushed the artem/cpu-stack/fuse-scan branch from 7dd47ee to fb955a8 on March 17, 2025 09:26