Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lbt07 pre-processing #385

Merged
merged 14 commits into from
Feb 16, 2023
Merged

Fix lbt07 pre-processing #385

merged 14 commits into from
Feb 16, 2023

Conversation

edelarua
Copy link
Contributor

@edelarua edelarua commented Feb 6, 2023

Also added option for overall column as per Nick's request (see below).

Closes #382

@edelarua edelarua added the sme label Feb 6, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Feb 6, 2023

🧪 $Test coverage: 89.20%$

Code Coverage Summary

Filename                     Stmts    Miss  Cover    Missing
-------------------------  -------  ------  -------  ------------------------------------------------------------------------
R/aet01_aesi.R                 188       8  95.74%   39, 41-45, 272, 285
R/aet01.R                      330       6  98.18%   38, 240, 253, 316, 322, 554
R/aet02.R                      222       2  99.10%   136, 489
R/aet03.R                       84       0  100.00%
R/aet04.R                       93       0  100.00%
R/assertions.R                  36       0  100.00%
R/checks.R                      20       0  100.00%
R/chevron_tlg-S4class.R         21       0  100.00%
R/chevron_tlg-S4methods.R      123      17  86.18%   35, 94-102, 137-145, 396-468
R/cmt01a.R                     191       0  100.00%
R/cmt02_pt.R                    52       0  100.00%
R/dmt01.R                       40       0  100.00%
R/dst01.R                      287       0  100.00%
R/dtht01.R                     104       0  100.00%
R/egt01.R                       46       0  100.00%
R/egt02.R                       60       0  100.00%
R/egt03.R                      132      91  31.06%   32-52, 120-243, 290-326
R/ext01.R                       81       8  90.12%   238-241, 245-248
R/gen_args.R                     1       1  0.00%    30
R/lbt01.R                       95       0  100.00%
R/lbt04.R                       52       1  98.08%   127
R/lbt05.R                       76      10  86.84%   33-41, 152
R/lbt07.R                       87       1  98.85%   171
R/lbt14.R                      202      34  83.17%   57, 59, 107-110, 112-120, 150, 181, 268, 270, 318-321, 323-331, 361, 392
R/mht01.R                       72       2  97.22%   33-34
R/mng01.R                       93      12  87.10%   112, 116-119, 128-136, 181
R/pdt01.R                       61      38  37.70%   35-52, 108-150
R/pdt02.R                       68       0  100.00%
R/utils.R                      174     116  33.33%   70, 83-219, 380
R/vst01.R                       48       0  100.00%
R/vst02.R                      102       3  97.06%   42, 120, 257
TOTAL                         3241     350  89.20%

Results for commit: c992b6eece98bb2aef3a29c8dd10ebca1317cbd4

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

@github-actions
Copy link
Contributor

github-actions bot commented Feb 6, 2023

Unit Tests Summary

    1 files    25 suites   1m 28s ⏱️
119 tests   84 ✔️ 35 💤 0
262 runs  176 ✔️ 86 💤 0

Results for commit cab6e86.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 6, 2023

Unit Test Performance Difference

Test Suite $Status$ Time on main $±Time$ $±Tests$ $±Skipped$ $±Failures$ $±Errors$
default_tlg 💚 $40.47$ $-2.97$ $0$ $0$ $0$ $0$

Results for commit c992b6eece98bb2aef3a29c8dd10ebca1317cbd4

♻️ This comment has been updated with latest results.

@npaszty
Copy link

npaszty commented Feb 6, 2023

@edelarua

Observed that the ASCII output includes horizontal line displayed between header and body but using Viewer() does not display the line.

ASCII
image

Viewer()
image

Was thinking that the hline might also be missing in teal output since Viewer() is also html. this is admittedly trivial. ;-)

How does one add "All Patients" column? from the specs.
image

@edelarua
Copy link
Contributor Author

edelarua commented Feb 7, 2023

@npaszty

Observed that the ASCII output includes horizontal line displayed between header and body but using Viewer() does not display the line.

Was thinking that the hline might also be missing in teal output since Viewer() is also html. this is admittedly trivial. ;-)

I'll bring this up to the SME team!

How does one add "All Patients" column? from the specs. image

I've just added functionality to include an "All Patients" column in lbt07, done by setting the lbl_overall argument to "All Patients".

@npaszty
Copy link

npaszty commented Feb 7, 2023

@edelarua
thanks for the lbl_overall argument. I might need to pull and rebuild.

@edelarua edelarua requested a review from barnett11 February 7, 2023 17:04
@barnett11
Copy link
Contributor

Hi @edelarua ,
The numbers look over inflated - I don't think we're selecting the highest grade per patient for summary, but rather counting any occurrences? If a patient has grades 1 and 2, then only 2 should be counted in the display.
Thanks,
Tim

Copy link
Contributor

@barnett11 barnett11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Selection of worst grade not happening I don't think?

@npaszty
Copy link

npaszty commented Feb 8, 2023

@edelarua

both output layouts as expected. thanks!
run(lbt07_1, db, lbl_overall = "All Patients")
run(lbt07_1, db)

@edelarua
Copy link
Contributor Author

edelarua commented Feb 8, 2023

Selection of worst grade not happening I don't think?

Hi @barnett11, I looked into the spec. for the template and found that the necessary pre-processing was not originally implemented in this template. I have added in filtering for worst grade low/high flag and post-baseline records.

As per the discussion here I have also added in the pruning option.

@edelarua edelarua requested a review from barnett11 February 8, 2023 20:28
Copy link
Contributor

@barnett11 barnett11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @edelarua ,
I believe the filtering of specific AVISIT is not necessary - ONTRTFL selects the records we need.
I think the set-up still allows for double-counting - I have a scenario where WGRHIFL=WGRLOFL (same abnormal grade applied throughout) and I think your template counts them twice in this case? Perhaps we need to limit HIGH to only look at grades 1-4, and LOW to grades -1 to -4?

@edelarua
Copy link
Contributor Author

edelarua commented Feb 9, 2023

@barnett11

I believe the filtering of specific AVISIT is not necessary - ONTRTFL selects the records we need.

In the case of syn_data$adlb, there are no records with ONTRTFL == "N", but ~2400 with AVISIT set to SCREENING or BASELINE. Adding the filter for AVISIT removes these visits as necessary.

I think the set-up still allows for double-counting - I have a scenario where WGRHIFL=WGRLOFL (same abnormal grade applied throughout) and I think your template counts them twice in this case? Perhaps we need to limit HIGH to only look at grades 1-4, and LOW to grades -1 to -4?

There should be no double counting in these scenarios if the same abnormal grade is applied throughout. In the pre-processing step GRADE_DIR is created so that HIGH only looks at grades 1 to 4, and LOW only looks at grades -1 to -4. GRADE_DIR is then used to create a map so that values are displayed according to their combination of PARAM, GRADE_DIR, and GRADE_ANL (which prevents double counting).

@edelarua edelarua requested a review from barnett11 February 9, 2023 19:18
Copy link
Contributor

@barnett11 barnett11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @edelarua ,
I still believe the addition of filtering on specific visits should be avoided - just because synthetic data is not set-up correctly doesn't mean we should have this in standard pre-processing. I beleive ontrtfl should be a sufficient indicator of post-baseline.
I need to investigate more why numbers are over-inflating in my example study data - but one issue I observed is if you subset syn_data to one patient, eg. AB12345-BRA-1-id-105, the function errors - could this be checked thank you?

@edelarua
Copy link
Contributor Author

edelarua commented Feb 9, 2023

Hi @barnett11,

I have corrected the derivation of ONTRTFL in syn_data$adlb and removed the AVISIT filter accordingly. I also found that the mapping was incomplete which was causing the error when subsetting to one patient id and corrected that as well.

Let me know if anything else needs fixing!

@edelarua edelarua requested a review from barnett11 February 9, 2023 22:33
@barnett11
Copy link
Contributor

Hi @edelarua ,
I'm having problems with a specific patient having a consistent abnormal grade - I think it's because WGRLOFL and WGRHIFL can be on the same abnormal value.
image

@npaszty
Copy link

npaszty commented Feb 10, 2023

@edelarua

to test against real study data...

ran the template against the Lupus Data Mart (LDM) which includes 3 studies and 6 treatment groups. from a benchmarking exercise a while back I noted that what slows tern down is number of treatment columns not number of data records. LDM adlb has 271088 records.

after adding preprocessing steps the template produced a table for both with and without all patients. so that was encouraging. also pretty fast.

I didn't not verify the accuracy of the descriptive summary stats since I had to kludge variables into the data that we don't have and when speaking with Joe he wanted me to focus on the format and functionality against scda and real study data. assuming tern is accurately summarizing the descriptive summary stats.
👍

@edelarua
Copy link
Contributor Author

edelarua commented Feb 10, 2023

@npaszty

Awesome, thanks for the update Nick! We're mostly just working out some special cases for this template now, but glad to know the output is correctly formatted.

@edelarua
Copy link
Contributor Author

edelarua commented Feb 11, 2023

Hi @barnett11,

I've tested with the case you mentioned above (consistent abnormal grade, same row for hi/lo flag) and didn't have any issues. Could you provide me with some data to help determine the issue?

The WGRLOFL and WGRHIFL variables are not used in any calculations/derivations other than the initial filtering for the template - LOW is defined as ATOXGR < 0 and HIGH is defined as ATOXGR > 0, so in the above example the patient would only be counted once under the Direction of Abnormality = LOW, Highest NCI CTCAE Grade = 1 row.

Note that I have pushed a fix for the warnings you're seeing in the above example.

@barnett11
Copy link
Contributor

Hi @edelarua ,
Thanks for the warnings fix, that now produces the correct output. I think I'm now closer to what the problem is.
For the pre-processing .data$WGRLOFL == "Y" | .data$WGRHIFL == "Y", I believe this causes problems if a patient has all abnormal grading in the same direction. An example is if patient had on-treatment grading of -1 and -2. The -1 would be assigned WGRHIFL="Y" and -2 WGRLOFL="Y". The problem is that both would be counted in the table currently as LOW.
image

@edelarua
Copy link
Contributor Author

edelarua commented Feb 15, 2023

Hi @edelarua , Thanks for the warnings fix, that now produces the correct output. I think I'm now closer to what the problem is. For the pre-processing .data$WGRLOFL == "Y" | .data$WGRHIFL == "Y", I believe this causes problems if a patient has all abnormal grading in the same direction. An example is if patient had on-treatment grading of -1 and -2. The -1 would be assigned WGRHIFL="Y" and -2 WGRLOFL="Y". The problem is that both would be counted in the table currently as LOW. image

Hi @barnett11,

Is this case something that would occur in real data? From the specifications, it seems that WGRLOFL is only applied to the first lowest post-baseline grade where direction is "L" and similarly WGRHIFL is only applied to the first highest post-baseline grade where direction is "H". From my understanding, this means that WGRLOFL would only ever be "Y" for records with negative ATOXGR (i.e. direction is L) and WGRHIFL would only ever be "Y" for records with positive ATOXGR (i.e. direction is H), so this case should not occur naturally.

If this is something that would be possible in real data, would it be preferred to filter out these cases where WGRHIFL = "Y" on negative ATOXGR records and WGRLOFL = "Y" on positive ATOXGR records during pre-processing? Or how else would you like these cases handled?

@barnett11
Copy link
Contributor

barnett11 commented Feb 15, 2023

Thanks @edelarua - yes this is from applying standard ADLB processing on a real study. I think the specifications actually state direction "L" or "B"/"H" or "B" (this is from metadata which is one of the things I want to dsicuss further how we apply, but that is separate discussion), so HGB has direction of "B" that causes this high flagging for all low values.
In CENSORED I see they get around it at the processing of Low by looking at both WGRLOFL="Y" and ATOXGR<0.

@edelarua
Copy link
Contributor Author

edelarua commented Feb 15, 2023

@barnett11 I have implemented the CENSORED processing you mentioned above. Let me know if that fixes the issue!

@edelarua edelarua dismissed barnett11’s stale review February 15, 2023 20:00

Ready for another review.

Copy link
Contributor

@barnett11 barnett11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beuatiful - that's a 100% match, thanks a lot @edelarua , I know this was tricky but we got there thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lbt07 preprocessing is not working as expected
3 participants