Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace TwoPassReadWalker with more general MultiplePassReadWalker. #6154

Merged
merged 3 commits into from
Oct 25, 2019

Conversation

cmnbroad
Copy link
Collaborator

Replace TwoPassReadWalker with the more general MultiplePassReadWalker to support SV team's request, and also addresses an issue found by @tedsharpe where TwoPassReadWalker never closed the ReadsDataSource from the previous iteration.

@cmnbroad cmnbroad force-pushed the cn_multipass_readwalker branch from 70f1839 to 633d1aa Compare September 11, 2019 12:47
Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly some documentation and user logging changes. The tests look pretty good. I agree its gross to create so much infrastructure for what is a glorified "is sam file" check.

void consume( GATKRead read, ReferenceContext reference, FeatureContext features );
}

public abstract void traverseReads();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a more descriptive comment to this method explaining how to implement this method here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also perhaps add a comment mentioning what we discussed about resetting the random generator if necessary.

public final void traverse() {
countedFilter = makeReadFilter();
traverseReads();
logger.info(countedFilter.getSummaryLine());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon thinking about this, I actually believe it will be considerably less confusing to the user to print the countedFilter summary line for only one pass of the tool. (imagine gatk telling my bam has 2x the number of reads I know it to have? That could easily cause enough confusion to be be asked by a forum user some day) I would advocate reconstructing the filter every pass and just returning the most current one (which should be equivalent)

private boolean firstPass = true;

@FunctionalInterface
public interface GATKReadConsumer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have similar documentation to the apply() method of the normal read walker (perhaps just link to that methods documentation).

read,
new ReferenceContext(reference, readInterval), // will be empty if reference or readInterval is null
new FeatureContext(features, readInterval)); // will be empty if features or readInterval is null
progressMeter.update(readInterval);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering that the progress meter is going to print the position with the traversal perhaps it would be bettter to print a log statement each time just stating, "re-traversing intervals for traversal n" or some such.

Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good and have addressed my comments @cmnbroad. Feel free to merge.

@cmnbroad cmnbroad merged commit da7d4b8 into master Oct 25, 2019
@cmnbroad cmnbroad deleted the cn_multipass_readwalker branch October 25, 2019 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants