Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ultrascale DMA via HPC #925

Merged
merged 2 commits into from
May 10, 2022
Merged

Ultrascale DMA via HPC #925

merged 2 commits into from
May 10, 2022

Conversation

tausen
Copy link
Contributor

@tausen tausen commented Apr 27, 2022

  • Support connecting HPC ports on PS8
  • Add register to axi_dmac controlling AxCACHE

I also have a patch for the dma-axi-dmac Linux driver, setting the register if the DMA node in the devicetree has the dma-coherent property. Will make a PR if this one is accepted.

The purpose here is to speed up transfers when the CPU should manipulate the samples immediately after transferring them from PL to main memory:

  • Without the dma-coherent devicetree property, it is assumed that it is unsafe to read addresses from cache that the DMA may have written to (and rightfully so). This means all reads will be from main memory.
  • Using the Ultrascale HPC port with AxCACHE set accordingly, DMA writes will invalidate the cache at those addresses. This means we can safely set the dma-coherent property and enjoy cache hits and speedup.

I've only tested transfers from PL to memory, not the other way around. In my testcase I use iio_buffer_refill and memcpy from the iio buffer to elsewhere in main memory. Without this patch, I'm limited to about 900 Mbps with one CPU core at 100%. After these changes, the limit is close to 6 Gbps.

Inspired by Xilinx' wiki page here:
https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842098/Zynq+UltraScale+MPSoC+Cache+Coherency

The wiki page suggests AxPROT[1] should be set on Linux, marking transactions as non-secure. In what little I have read about AxPROT, I haven't seen anything that suggests this is related to cache coherency, and during testing it seems to make no difference.

Thoughts? Anything I've missed?

Support connecting HPC0 and HPC1 on PS8.

Co-authored-by: Mathias Tausen <[email protected]>
@ronagyl
Copy link
Contributor

ronagyl commented Apr 27, 2022

Thank you for your contribution, this looks very useful.

If DMAC cache coherency in Linux is a device tree attribute (static setting) why should we have a control register for it ? Can't we just set the value of AxCACHE through a synthesis parameter and reflect that in the device tree? Is there any usecase when you want to set the new control register to zero on a HPC port ?

Device tree should describe hardware and not store configuration data.

Can you please share also a link to the patch for the Linux driver so we can have a better understanding?

Thank you
Laszlo

@tausen
Copy link
Contributor Author

tausen commented Apr 27, 2022

Is there any usecase when you want to set the new control register to zero on a HPC port ?

Probably not. It should always be set if dma-coherent is set in the devicetree, which is why I figured a register set by the Linux driver made sense -- but I agree a synthesis parameter probably would be cleaner.

Can you please share also a link to the patch for the Linux driver so we can have a better understanding?

Sure: tausen/linux@8ed3a5f

@ronagyl
Copy link
Contributor

ronagyl commented May 3, 2022

but I agree a synthesis parameter probably would be cleaner.

Are you planning to do this change ? I would prefer this path since it would not affect the register map that might conflict with upcoming changes.

This new synthesis parameter can be reflected in a bit of a newly created read only register at address 0x14 (0x05)

Thank you,
Laszlo

@tausen
Copy link
Contributor Author

tausen commented May 3, 2022

Sure, will do. Thanks for the input 👍

On architectures with ports that support cache coherency, the AWCACHE
signal must be set to indicate that transactions are cached. This patch
adds a parameter allowing AWCACHE to be set on an AXI4 destination port.
@CLAassistant
Copy link

CLAassistant commented May 5, 2022

CLA assistant check
All committers have signed the CLA.

@tausen
Copy link
Contributor Author

tausen commented May 5, 2022

Branch updated with parameter instead of register.

This new synthesis parameter can be reflected in a bit of a newly created read only register at address 0x14 (0x05)

Did I capture this the way you intended?

Replacement driver patch: tausen/linux@0fce760

Copy link
Contributor

@ronagyl ronagyl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants