-
Notifications
You must be signed in to change notification settings - Fork 485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CR-1201062: Fixing issue where enabling both AIE trace and PL trace resulted in PL trace not being generated #8241
Conversation
Signed-off-by: Jason Villarreal <[email protected]>
…call inside the static info database. Now it is no longer the responsibility of each individual plugin to create this interface. Instead, they must create a concrete xdp::Device to describe how to communicate with the physical device and pass that in to updateDevice. Signed-off-by: Jason Villarreal <[email protected]>
…so remving unused functions. Signed-off-by: Jason Villarreal <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the PR renames DeviceIntf to PLDeviceIntf, we can drop "device" and just call it PLIntf. It can be done in a later PR too.
@@ -2111,6 +2105,8 @@ namespace xdp { | |||
|
|||
if (!initializeStructure(currentXclbin, xrtXclbin)) { | |||
delete currentXclbin; | |||
if (xdpDevice != nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can skip the nullptr check before delete.
XclbinInfo* xclbin = deviceInfo[deviceId]->currentXclbin(); | ||
if (xclbin == nullptr) | ||
return nullptr; | ||
return; | ||
if (xclbin->deviceIntf != nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can skip nullptr check before delete
Signed-off-by: Jason Villarreal <[email protected]>
…eIntf so client doesn't use the PLDeviceIntf Signed-off-by: Jason Villarreal <[email protected]>
Problem solved by the commit
During application shutdown, if trace is active then all of the trace information from the device needs a final flush from memory to disk. The order in which AIE trace and PL trace is offloaded is not defined, so either order has to be supported. The AIE trace was removing the connection to the PL hardware after it had finished flushing AIE trace, which resulted in the PL trace to be lost.
This pull request fixes that issue by moving the ownership of the DeviceIntf object into a common location so each individual profiling/trace plugin doesn't have to be responsible for managing the shared memory.
A future planned pull request will change this implementation from raw pointers to shared_ptrs to further clarify ownership and make sure we catch all possible memory issues.
Additionally, this pull request renames several objects in profiling to make it clear which ones are associated with PL constructs as opposed to AIE constructs, as the generic names of "device" and "trace" were ambiguous.
Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered
The bug was due to the aie_trace plugin removing the object used to communicate with the PL portion of the device after AIE trace had been flushed. It may have been introduced in recent refactorings, but was discovered with regression testing on multiple combinations of profiling options.
How problem was solved, alternative solutions (if any) and why they were rejected
The problem was solved by no longer deleting the PL device interface at the end of AIE trace flushing. This necessitated moving code to a common location and altering all of the profiling plugins that communicate with the PL side of the device.
Risks (if any) associated the changes in the commit
Low risk of affecting profiling execution in some configuration. Low risk of uncaught memory leaks to be addressed in future pull request.
What has been tested and how, request additional testing if necessary
The original failing test case has been tested on the board. Full regression with all combinations of profiling options enabled needs to be run to fully verify.
Documentation impact (if any)
No documentation impact.