Fix lbt07 pre-processing #385
🧪 Code Coverage Summary
Results for commit: c992b6eece98bb2aef3a29c8dd10ebca1317cbd4 Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Unit Test Performance Difference
Results for commit c992b6eece98bb2aef3a29c8dd10ebca1317cbd4 ♻️ This comment has been updated with latest results. |
I observed that the ASCII output includes a horizontal line between header and body, but Viewer() does not display the line. I was thinking that the hline might also be missing in teal output, since Viewer() is also HTML. This is admittedly trivial. ;-)
I'll bring this up to the SME team! I've just added functionality to include an "All Patients" column in lbt07, done by setting the |
@edelarua |
Signed-off-by: Emily de la Rua <[email protected]>
Hi @edelarua , |
I don't think selection of the worst grade is happening?
both output layouts as expected. thanks! |
Hi @barnett11, I looked into the spec. for the template and found that the necessary pre-processing was not originally implemented in this template. I have added in filtering for worst grade low/high flag and post-baseline records. As per the discussion here I have also added in the pruning option. |
Thanks @edelarua ,
I believe the filtering of specific AVISIT is not necessary - ONTRTFL selects the records we need.
I think the set-up still allows for double-counting - I have a scenario where WGRHIFL=WGRLOFL (same abnormal grade applied throughout) and I think your template counts them twice in this case? Perhaps we need to limit HIGH to only look at grades 1-4, and LOW to grades -1 to -4?
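The restriction suggested in this comment can be sketched as follows. This is only an illustration in plain Python, not the template's actual pre-processing: the records are made up, and the use of ATOXGR as the grade variable is an assumption (the thread names only the WGRLOFL and WGRHIFL flags). A row flagged in both directions is counted under LOW only if its grade is -1 to -4 and under HIGH only if its grade is 1 to 4.

```python
# Hypothetical ADLB-style records; ATOXGR as the grade column is an assumption.
records = [
    # Same row flagged as worst in both directions, with a low grade.
    {"USUBJID": "id-1", "ATOXGR": -2, "WGRLOFL": "Y", "WGRHIFL": "Y"},
    # Flagged high only, with a high grade.
    {"USUBJID": "id-2", "ATOXGR": 3, "WGRLOFL": "", "WGRHIFL": "Y"},
    # Flagged both ways but grade 0, so not abnormal in either direction.
    {"USUBJID": "id-3", "ATOXGR": 0, "WGRLOFL": "Y", "WGRHIFL": "Y"},
]


def count_by_direction(rows):
    """Count distinct patients per direction, restricting each flag to its grade range."""
    low = {
        r["USUBJID"]
        for r in rows
        if r["WGRLOFL"] == "Y" and -4 <= r["ATOXGR"] <= -1
    }
    high = {
        r["USUBJID"]
        for r in rows
        if r["WGRHIFL"] == "Y" and 1 <= r["ATOXGR"] <= 4
    }
    return {"LOW": len(low), "HIGH": len(high)}


counts = count_by_direction(records)
print(counts)
```

With this restriction, patient id-1 contributes to LOW only, even though both flags are set on the same row.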
In the case of
There should be no double counting in these scenarios if the same abnormal grade is applied throughout. In the pre-processing step |
Hi @edelarua ,
I still believe the addition of filtering on specific visits should be avoided - just because synthetic data is not set up correctly doesn't mean we should have this in standard pre-processing. I believe ontrtfl should be a sufficient indicator of post-baseline.
I need to investigate further why numbers are inflated in my example study data, but one issue I observed is that if you subset syn_data to one patient, e.g. AB12345-BRA-1-id-105, the function errors. Could this be checked? Thank you.
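A minimal sketch of the flag-based filtering argued for here, assuming the on-treatment flag takes the value "Y" for post-baseline records. It is written in plain Python for illustration and is not the template's code; the rows and the helper name `post_baseline_worst` are hypothetical. No visit names are referenced, and the filter also behaves sensibly for a single patient or an empty input.

```python
# Hypothetical rows for one patient; values are invented for illustration.
rows = [
    {"USUBJID": "AB12345-BRA-1-id-105", "AVISIT": "BASELINE",
     "ONTRTFL": "", "WGRLOFL": "", "WGRHIFL": ""},
    {"USUBJID": "AB12345-BRA-1-id-105", "AVISIT": "WEEK 1 DAY 8",
     "ONTRTFL": "Y", "WGRLOFL": "Y", "WGRHIFL": ""},
    {"USUBJID": "AB12345-BRA-1-id-105", "AVISIT": "WEEK 2 DAY 15",
     "ONTRTFL": "Y", "WGRLOFL": "", "WGRHIFL": ""},
]


def post_baseline_worst(records):
    """Keep on-treatment rows that carry a worst-grade flag in either direction."""
    return [
        r for r in records
        if r.get("ONTRTFL") == "Y"
        and (r.get("WGRLOFL") == "Y" or r.get("WGRHIFL") == "Y")
    ]


kept = post_baseline_worst(rows)
print([r["AVISIT"] for r in kept])
```

Only the flagged on-treatment row survives; the baseline row is dropped by ONTRTFL alone, without naming any visit.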
Hi @barnett11, I have corrected the derivation of Let me know if anything else needs fixing! |
Hi @edelarua , |
To test against real study data, I ran the template against the Lupus Data Mart (LDM), which includes 3 studies and 6 treatment groups. From a benchmarking exercise a while back, I noted that what slows tern down is the number of treatment columns, not the number of data records. LDM adlb has 271088 records. After adding pre-processing steps, the template produced a table both with and without all patients, so that was encouraging. It was also pretty fast. I did not verify the accuracy of the descriptive summary stats, since I had to kludge variables into the data that we don't have, and when speaking with Joe he wanted me to focus on the format and functionality against scda and real study data. I'm assuming tern is accurately summarizing the descriptive summary stats.
Awesome, thanks for the update Nick! We're mostly just working out some special cases for this template now, but glad to know the output is correctly formatted. |
Hi @barnett11, I've tested with the case you mentioned above (consistent abnormal grade, same row for hi/lo flag) and didn't have any issues. Could you provide me with some data to help determine the issue? The Note that I have pushed a fix for the warnings you're seeing in the above example. |
Hi @edelarua , |
Hi @barnett11, Is this case something that would occur in real data? From the specifications, it seems that If this is something that would be possible in real data, would it be preferred to filter out these cases where |
Thanks @edelarua - yes, this is from applying standard ADLB processing on a real study. I think the specifications actually state direction "L" or "B" / "H" or "B" (this is from metadata, which is one of the things I want to discuss further in terms of how we apply it, but that is a separate discussion), so HGB has a direction of "B", which causes this high flagging for all low values.
@barnett11 I have implemented the CENSORED processing you mentioned above. Let me know if that fixes the issue! |
Beautiful - that's a 100% match, thanks a lot @edelarua. I know this was tricky but we got there, thanks!
Also added option for overall column as per Nick's request (see below).
Closes #382