-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ultrascale DMA via HPC #925
Conversation
Support connecting HPC0 and HPC1 on PS8. Co-authored-by: Mathias Tausen <[email protected]>
Thank you for your contribution, this looks very useful. If DMAC cache coherency in Linux is a device tree attribute (static setting) why should we have a control register for it ? Can't we just set the value of AxCACHE through a synthesis parameter and reflect that in the device tree? Is there any usecase when you want to set the new control register to zero on a HPC port ? Device tree should describe hardware and not store configuration data. Can you please share also a link to the patch for the Linux driver so we can have a better understanding? Thank you |
Probably not. It should always be set if
Sure: tausen/linux@8ed3a5f |
Are you planning to do this change ? I would prefer this path since it would not affect the register map that might conflict with upcoming changes. This new synthesis parameter can be reflected in a bit of a newly created read only register at address 0x14 (0x05) Thank you, |
Sure, will do. Thanks for the input 👍 |
On architectures with ports that support cache coherency, the AWCACHE signal must be set to indicate that transactions are cached. This patch adds a parameter allowing AWCACHE to be set on an AXI4 destination port.
Branch updated with parameter instead of register.
Did I capture this the way you intended? Replacement driver patch: tausen/linux@0fce760 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thank you.
axi_dmac
controllingAxCACHE
I also have a patch for the
dma-axi-dmac
Linux driver, setting the register if the DMA node in the devicetree has thedma-coherent
property. Will make a PR if this one is accepted.The purpose here is to speed up transfers when the CPU should manipulate the samples immediately after transferring them from PL to main memory:
dma-coherent
devicetree property, it is assumed that it is unsafe to read addresses from cache that the DMA may have written to (and rightfully so). This means all reads will be from main memory.AxCACHE
set accordingly, DMA writes will invalidate the cache at those addresses. This means we can safely set thedma-coherent
property and enjoy cache hits and speedup.I've only tested transfers from PL to memory, not the other way around. In my testcase I use
iio_buffer_refill
andmemcpy
from the iio buffer to elsewhere in main memory. Without this patch, I'm limited to about 900 Mbps with one CPU core at 100%. After these changes, the limit is close to 6 Gbps.Inspired by Xilinx' wiki page here:
https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842098/Zynq+UltraScale+MPSoC+Cache+Coherency
The wiki page suggests
AxPROT[1]
should be set on Linux, marking transactions as non-secure. In what little I have read aboutAxPROT
, I haven't seen anything that suggests this is related to cache coherency, and during testing it seems to make no difference.Thoughts? Anything I've missed?