-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CreateVariantIngestFiles robust to partially / fully loaded samples [VS-262] #7843
Conversation
Codecov Report
@@ Coverage Diff @@
## ah_var_store #7843 +/- ##
================================================
Coverage ? 86.304%
Complexity ? 35190
================================================
Files ? 2170
Lines ? 164844
Branches ? 17783
================================================
Hits ? 142267
Misses ? 16254
Partials ? 6323 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
src/main/java/org/broadinstitute/hellbender/tools/gvs/ingest/CreateVariantIngestFiles.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/gvs/ingest/CreateVariantIngestFiles.java
Outdated
Show resolved
Hide resolved
@@ -37,6 +37,10 @@ public final class RefCreator { | |||
private static final String PREFIX_SEPARATOR = "_"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yours, but it distresses me to see 'private static final', followed by 'private final static'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol given how nitpicky IntelliJ is in general it seems strange it doesn't complain about this
@@ -37,6 +37,10 @@ public final class RefCreator { | |||
private static final String PREFIX_SEPARATOR = "_"; | |||
private final static String REF_RANGES_FILETYPE_PREFIX = "ref_ranges_"; | |||
|
|||
public static boolean doRowsExistFor(CommonCode.OutputType outputType, String projectId, String datasetName, String tableNumber, String sampleId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to test this? Is there even a test frameworkf for BigQuery?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did test this manually in this run where shards 2 and 3 exercised all of the injected failure conditions on their failed attempts, during which they would have called this code. There are some BQ "unit tests" in BigQueryUtilsUnitTest
but I'm not sure how to create an automated test for these conditions.
@@ -37,6 +37,10 @@ public final class RefCreator { | |||
private static final String PREFIX_SEPARATOR = "_"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol given how nitpicky IntelliJ is in general it seems strange it doesn't complain about this
src/main/java/org/broadinstitute/hellbender/tools/gvs/ingest/CreateVariantIngestFiles.java
Outdated
Show resolved
Hide resolved
@@ -37,6 +37,10 @@ public final class RefCreator { | |||
private static final String PREFIX_SEPARATOR = "_"; | |||
private final static String REF_RANGES_FILETYPE_PREFIX = "ref_ranges_"; | |||
|
|||
public static boolean doRowsExistFor(CommonCode.OutputType outputType, String projectId, String datasetName, String tableNumber, String sampleId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did test this manually in this run where shards 2 and 3 exercised all of the injected failure conditions on their failed attempts, during which they would have called this code. There are some BQ "unit tests" in BigQueryUtilsUnitTest
but I'm not sure how to create an automated test for these conditions.
@@ -154,49 +158,6 @@ private void setCoveredInterval(String variantChr, int start, int end) { | |||
previousInterval = new SimpleInterval(possiblyMergedGenomeLoc); | |||
} | |||
|
|||
public List<List<String>> createRows(final long start, final long end, final VariantContext variant, final String sampleId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drive by deletion of code IntelliJ flagged as unused
@@ -217,9 +205,51 @@ public void onTraversalStart() { | |||
logger.info("Sample id " + sampleId + " was detected as already loaded, exiting successfully."); | |||
System.exit(0); | |||
} else if (state == LoadStatus.LoadState.PARTIAL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want this to be PARTIAL || NONE? What happens if we've never attempted to load this sample before (which is the most common case by far)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the sample load state is NONE
there shouldn't be any ref ranges or vet rows, so no point running the expensive-ish queries below checking for rows. For load state NONE
the refRangesRowsExist
and vetRowsExist
variables on lines 193-194 would remain at their initialized values of false
which cause the ref ranges and vet *Creators to be constructed on lines 248 and 252.
loadStatus.writeLoadStatusStarted(Long.parseLong(sampleId)); | ||
} | ||
|
||
if (refCreator != null) { | ||
if (enableReferenceRanges && refCreator != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The control logic feels a little complex here between flags and null checks combined with the check for rows (which represents it's decision by the null for refCreator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can remove these specific "enable" flag reads since they're redundant to the creation of the *Creators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one style/clarity thing, and one major question about Load Status
de67320
to
0096b7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great -- assuming the testing code will be removed (ie the mod stuff)
0096b7c
to
42bfa0b
Compare
Makes CreateVariantIngestFiles robust to partially or fully loaded samples.
Commit 21828af is what I actually propose to merge, while commit de67320 randomly injects failures covering all the known failure modes. I tested these changes using both commits and was able to verify that partially loaded samples were handled correctly on subsequent attempts to load the sample (unfortunately we can't actually prevent these partial loadings from happening in the first place because preemptions, among other possible reasons).