Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove (X, Y) coordinates from NpuDmaMemcpyNdOp #1971

Draft
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

pvasireddy-amd
Copy link
Collaborator

@pvasireddy-amd pvasireddy-amd commented Dec 12, 2024

Issue #1964
Requires #1997

This PR fixes an issue where the X and Y input attributes of the NpuDmaMemcpyNdOp given through the python bindings were always (0, 0), which may not be the coordinates of the Shim tile that the operation will be mapped to (and were not intended to represent that, by design). Instead, the lowering uses the metadata input of the NpuDmaMemcpyNdOp to find the corresponding ShimDMAAllocationOp. As this operation has the column coordinate (and the row coordinate is implicit for Shim tiles) this PR removes the X and Y coordinates from the NpuDmaMemcpyNdOp altogether and fully relies on the metadata to find the information.

Copy link
Contributor

github-actions bot commented Dec 13, 2024

Coverage Report

Created: 2025-02-12 19:24

Click here for information about interpreting this report.

FilenameFunction CoverageLine CoverageRegion CoverageBranch Coverage
IR/AIEXDialect.cpp 100.00% 87.97% 89.35% 78.37%
Transforms/AIECtrlPacketToDma.cpp 100.00% 97.32% 80.77% 68.18%
Transforms/AIEDmaToNpu.cpp 96.15% 89.32% 78.24% 61.11%
Totals 98.44% 89.87% 84.88% 71.63%
Generated by llvm-cov -- llvm version 18.1.3

@pvasireddy-amd pvasireddy-amd changed the title npu1 vs npu1_4col issue [TEST] Throws runtime error when using tiles in columns other than column 0 without defining a tile for column 0. Dec 13, 2024
@pvasireddy-amd pvasireddy-amd changed the title [TEST] Throws runtime error when using tiles in columns other than column 0 without defining a tile for column 0. [TEST] Throws error when using tiles in columns other than column 0 without defining a tile for column 0. Dec 13, 2024
@pvasireddy-amd pvasireddy-amd changed the title [TEST] Throws error when using tiles in columns other than column 0 without defining a tile for column 0. [TEST] Issue #1964 Dec 13, 2024
@pvasireddy-amd
Copy link
Collaborator Author

pvasireddy-amd commented Dec 13, 2024

Debug status:

  1. Error occurs when changing the device name from npu1_1col to npu1 while trying to perform a passthrough from ShimTile (0,0) to ComputeTile (0,2). This results in the following error: aiex.npu.dma_memcpy_nd' op Unsupported tile type at (0,0). Must be ShimNOC, Mem, or Core. This happens because getX() and getY() are used instead of the column and row information, which should be obtained from the metadata of dma_memcpy_nd in this line. Tile (0,0) is considered a ShimPLTile when the entire NPU is used, but this check needs to be moved since the row and column information is not available at the NpuDmaMemcpyNd stage.
  2. The second case involves changing both the ShimTile and ComputeTile to columns other than the zeroth column and using the appropriate device name for the same passthrough example. This results in a runtime error. One temporary workaround is to declare at least one tile (which can be either a ShimTile, MemTile, or ComputeTile with an empty body) in the zeroth column, and the test passes. The reason for this dependency on the zeroth column is unclear.

abisca and others added 7 commits December 16, 2024 09:42
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
abisca and others added 9 commits December 19, 2024 03:58
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@AndraBisca AndraBisca changed the title [TEST] Issue #1964 Remove X, Y inputs from NpuDmaMemcpyNdOp Dec 22, 2024
@AndraBisca AndraBisca changed the title Remove X, Y inputs from NpuDmaMemcpyNdOp Remove (X, Y) coordinates from NpuDmaMemcpyNdOp Dec 22, 2024
@@ -9,26 +9,29 @@
//
//===----------------------------------------------------------------------===//

// RUN: aie-opt --split-input-file --verify-diagnostics %s
// RUN: aie-opt --split-input-file --aie-dma-to-npu --verify-diagnostics %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the pass added here just to run verification? Why the move of the verifier from the dialect to the pass?

Copy link
Collaborator Author

@pvasireddy-amd pvasireddy-amd Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the verifyStridesWraps function was part of NpuDmaMemcpy::verify(), but it relied on getX() and getY() to retrieve row and column values. As a result, the function has been moved into the pass, where it now obtains row and column values from ShimDmaAllocOp. This change means that some tests in bad_npu_nd.mlir related to sizes and strides, which were previously verifiable directly from the dialect, can no longer be verified this way and now require the pass.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the verifyStridesWraps function was part of NpuDmaMemcpy::verify(), but it relied on getX() and getY() to retrieve row and column values. As a result, the function has been moved into the pass, where it now obtains row and column values from ShimDmaAllocOp. This change means that some tests in bad_npu_nd.mlir related to sizes and strides, which were previously verifiable directly from the dialect, can no longer be verified this way and now require the pass.

But all we've done by moving the verification is hard code the row to zero and get the column from the info op. I don't understand how that is dependent on being inside the pass? Can't we do the same hardcode + info op lookup in the previous code location?

@pvasireddy-amd pvasireddy-amd marked this pull request as draft February 13, 2025 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants