Rewrite complex SV functional annotation in SVAnnotate #8516

Merged
merged 4 commits into from
Jan 23, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,7 @@ public enum ComplexVariantSubtype {
public static final String NONCODING_BREAKPOINT = "PREDICTED_NONCODING_BREAKPOINT";
public static final String NEAREST_TSS = "PREDICTED_NEAREST_TSS";
public static final String TSS_DUP = "PREDICTED_TSS_DUP";
public static final String PARTIAL_DISPERSED_DUP = "PREDICTED_PARTIAL_DISPERSED_DUP";

// SVTYPE classes
public enum StructuralVariantAnnotationType {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,10 @@
* duplicated. The partial duplication occurs when a duplication has one breakpoint within the transcript and one
* breakpoint after the end of the transcript. When the duplication is in tandem, the result is that there is one
* intact copy of the full endogenous gene.</p></li>
* <li><p><i>PREDICTED_PARTIAL_DISPERSED_DUP</i><br />
* Gene(s) which are partially overlapped by an SV's dispersed duplication. This annotation is applied to a
Member

Just to clarify this a little more ("an SV's dispersed duplication" could potentially be the insert interval in the mind of some readers I think), I might suggest the wording, "Gene(s) which are partially overlapped by the duplicated segment involved in an SV's dispersed duplication."

* dispersed (non-tandem) duplication segment that is part of a complex SV if the duplicated segment overlaps part
* of a transcript but not the entire transcript (which would be a PREDICTED_COPY_GAIN event).</p></li>
* <li><p><i>PREDICTED_INV_SPAN</i><br />
* Gene(s) which are entirely spanned by an SV's inversion. A whole-gene inversion occurs when an inversion spans
* the entire transcript, from the first base of the 5' UTR to the last base of the 3' UTR. </p></li>
Expand Down Expand Up @@ -354,6 +358,7 @@ private void addAnnotationInfoKeysToHeader(final VCFHeader header) {
header.addMetaDataLine(new VCFInfoHeaderLine(GATKSVVCFConstants.NONCODING_SPAN, VCFHeaderLineCount.UNBOUNDED, VCFHeaderLineType.String, "Class(es) of noncoding elements spanned by SV."));
header.addMetaDataLine(new VCFInfoHeaderLine(GATKSVVCFConstants.NONCODING_BREAKPOINT, VCFHeaderLineCount.UNBOUNDED, VCFHeaderLineType.String, "Class(es) of noncoding elements disrupted by SV breakpoint."));
header.addMetaDataLine(new VCFInfoHeaderLine(GATKSVVCFConstants.NEAREST_TSS, VCFHeaderLineCount.UNBOUNDED, VCFHeaderLineType.String, "Nearest transcription start site to an intergenic variant."));
header.addMetaDataLine(new VCFInfoHeaderLine(GATKSVVCFConstants.PARTIAL_DISPERSED_DUP, VCFHeaderLineCount.UNBOUNDED, VCFHeaderLineType.String, "Gene(s) overlapped partially by a dispersed duplication in a complex SV."));
Member

Similarly, I might suggest, "Gene(s) overlapped partially by the duplicated interval involved in a dispersed duplication event in a complex SV"


}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -211,14 +211,20 @@ protected static String annotateDeletion(final SimpleInterval variantInterval,
* Get consequence of duplication variant on transcript
* @param variantInterval - SimpleInterval representing structural variant
* @param gtfTranscript - protein-coding GTF transcript
* @param isComplex - boolean: true if SV type is CPX, false otherwise
* @return - consequence of duplication variant on transcript
*/
@VisibleForTesting
protected static String annotateDuplication(final SimpleInterval variantInterval,
final GencodeGtfTranscriptFeature gtfTranscript) {
final GencodeGtfTranscriptFeature gtfTranscript,
boolean isComplex) {
final SimpleInterval transcriptInterval = new SimpleInterval(gtfTranscript);
if (variantSpansFeature(variantInterval, transcriptInterval)) {
return GATKSVVCFConstants.COPY_GAIN;
return GATKSVVCFConstants.COPY_GAIN; // return CG immediately because same regardless of isDispersed
}
if (isComplex) {
Contributor

Suggested change
}
if (isComplex) {
} else if (isComplex) {

Just a little clearer

// if not CG, overlaps part of gene --> if complex, immediate PARTIAL_DISPERSED_DUP
return GATKSVVCFConstants.PARTIAL_DISPERSED_DUP;
} else if (variantOverlapsTranscriptionStartSite(variantInterval, gtfTranscript)) {
return GATKSVVCFConstants.TSS_DUP;
} else if (!transcriptInterval.contains(variantInterval)) {
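The branch order in this hunk can be looked at in isolation. What follows is a minimal, self-contained sketch, not the GATK implementation: `Interval`, `classifyDuplication`, and the single `tss` coordinate are hypothetical stand-ins for `SimpleInterval`, `annotateDuplication`, and `variantOverlapsTranscriptionStartSite`, and the branches hidden behind the collapsed part of the diff are lumped into one placeholder string.

```java
/** Illustrative sketch only; not the GATK SVAnnotate implementation. */
final class DuplicationBranchSketch {

    /** Hypothetical closed, 1-based interval standing in for SimpleInterval. */
    static final class Interval {
        final String contig;
        final int start;
        final int end;

        Interval(final String contig, final int start, final int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }

        /** True if this interval covers every base of {@code other}. */
        boolean contains(final Interval other) {
            return contig.equals(other.contig) && start <= other.start && end >= other.end;
        }

        /** True if the single position {@code pos} lies inside this interval. */
        boolean containsPosition(final int pos) {
            return start <= pos && pos <= end;
        }
    }

    /**
     * Mirrors the branch order shown in the diff: whole-transcript overlap first,
     * then the complex-SV shortcut, then the tandem-duplication cases.
     */
    static String classifyDuplication(final Interval variant, final Interval transcript,
                                      final int tss, final boolean isComplex) {
        if (variant.contains(transcript)) {
            return "PREDICTED_COPY_GAIN";              // same result for tandem and dispersed
        } else if (isComplex) {
            return "PREDICTED_PARTIAL_DISPERSED_DUP";  // partial overlap inside a CPX event
        } else if (variant.containsPosition(tss)) {
            return "PREDICTED_TSS_DUP";
        } else {
            return "OTHER_TANDEM_DUP_CONSEQUENCE";     // placeholder for branches not shown in the diff
        }
    }

    public static void main(final String[] args) {
        final Interval transcript = new Interval("chr1", 1000, 2000);
        final Interval partial = new Interval("chr1", 1500, 2500);
        System.out.println(classifyDuplication(partial, transcript, 1000, true));
        System.out.println(classifyDuplication(partial, transcript, 1000, false));
    }
}
```

Writing the chain as a single `if / else if` sequence, as suggested in the review, makes it visible that a partially overlapping segment of a complex SV never reaches the tandem-duplication branches.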
Expand Down Expand Up @@ -276,7 +282,7 @@ protected static String annotateDuplication(final SimpleInterval variantInterval
protected static String annotateCopyNumberVariant(final SimpleInterval variantInterval,
final GencodeGtfTranscriptFeature gtfTranscript,
final Set<String> MSVExonOverlapClassifications) {
final String consequence = annotateDuplication(variantInterval, gtfTranscript);
final String consequence = annotateDuplication(variantInterval, gtfTranscript, false);
if (MSVExonOverlapClassifications.contains(consequence)) {
return GATKSVVCFConstants.MSV_EXON_OVERLAP;
} else {
Expand Down Expand Up @@ -338,12 +344,14 @@ protected static String annotateBreakend(final SimpleInterval variantInterval,
* Add consequence of structural variant on an overlapping transcript to consequence dictionary for variant
* @param variantInterval - SimpleInterval representing structural variant
* @param svType - SV type
* @param isComplex - boolean: true if SV type is CPX, false if not
* @param transcript - protein-coding GTF transcript
* @param variantConsequenceDict - running map of consequence -> feature name for variant to update
*/
@VisibleForTesting
protected void annotateTranscript(final SimpleInterval variantInterval,
final GATKSVVCFConstants.StructuralVariantAnnotationType svType,
final boolean isComplex,
Contributor

So I am realizing that there is a slight semantic issue with StructuralVariantAnnotationType since it now contains the CPX type. It is then somewhat confusing to have the isComplex parameter.

I think the issue stems from the fact that this method is intended for use on an SVSegment rather than a full variant. One option is to create, say, a SVSegmentAnnotationType that is the same as StructuralVariantAnnotationType but omits CPX, but that's probably unnecessary.

I think at least some of the variable names should change here: variantInterval -> segmentInterval, svType -> segmentType, variantConsequenceDict -> segmentConsequenceDict, and perhaps even annotateTranscript -> annotateSegmentTranscript. Updating the documentation to make this clear would also help.

It's not hugely a concern since this is intended to be a private method, but this would improve the readability I think.

Contributor Author

I've been thinking about this and I'm not sure of the optimal names here.

  • I agree that StructuralVariantAnnotationType is being overused here, and I also agree that it may not be worthwhile to create two different types with and without CPX. Maybe it could just be StructuralVariantType, a more neutral category, especially since it is getting used in other tools now?
  • I think variantInterval is not inaccurate, since it is an interval of an SV / an interval that varies compared to the reference, and it is more descriptive than segmentInterval when distinguishing between a feature interval and a variant interval. But it is true that only one segment is considered at a time - which is clear because the variantInterval is a single interval rather than a list. Maybe best would be variantSegmentInterval? Or some documentation changes?
  • variantConsequenceDict is the correct name because it accumulates consequences across SV segments
  • True, annotateGeneOverlaps and annotateTranscript are only applied to one segment at a time. But the method names focus on the features from the GTF rather than saying they are applied to a variant vs. a segment. Perhaps just a docs change here?

final GencodeGtfTranscriptFeature transcript,
final Map<String, Set<String>> variantConsequenceDict) {
final String consequence;
Expand All @@ -355,7 +363,7 @@ protected void annotateTranscript(final SimpleInterval variantInterval,
consequence = annotateInsertion(variantInterval, transcript);
break;
case DUP:
consequence = annotateDuplication(variantInterval, transcript);
consequence = annotateDuplication(variantInterval, transcript, isComplex);
break;
case CNV:
consequence = annotateCopyNumberVariant(variantInterval,transcript, MSV_EXON_OVERLAP_CLASSIFICATIONS);
Expand Down Expand Up @@ -477,36 +485,88 @@ protected static GATKSVVCFConstants.StructuralVariantAnnotationType getSVType(fi
* Add protein-coding annotations for any transcripts overlapping the variant to the variant consequence dictionary
* @param variantInterval - SimpleInterval representing structural variant
* @param svType - SV type
* @param isComplex - boolean: true if SV type is CPX, false otherwise
* @param variantConsequenceDict - running map of consequence -> feature name for variant to update
*/
@VisibleForTesting
protected void annotateGeneOverlaps(final SimpleInterval variantInterval,
final GATKSVVCFConstants.StructuralVariantAnnotationType svType,
final boolean isComplex,
final Map<String, Set<String>> variantConsequenceDict) {
final Iterator<SVIntervalTree.Entry<GencodeGtfTranscriptFeature>> gtfTranscriptsForVariant =
gtfIntervalTrees.getTranscriptIntervalTree().overlappers(
SVUtils.locatableToSVInterval(variantInterval, sequenceDictionary)
);
for (Iterator<SVIntervalTree.Entry<GencodeGtfTranscriptFeature>> it = gtfTranscriptsForVariant; it.hasNext(); ) {
SVIntervalTree.Entry<GencodeGtfTranscriptFeature> transcriptEntry = it.next();
annotateTranscript(variantInterval, svType, transcriptEntry.getValue(), variantConsequenceDict);
annotateTranscript(variantInterval, svType, isComplex, transcriptEntry.getValue(), variantConsequenceDict);
}
}

/**
* Get section of one interval (primaryInterval) that is not overlapped by the other (secondaryInterval)
* @param primaryInterval - SimpleInterval
* @param secondaryInterval - SimpleInterval overlapping (but not fully containing) primaryInterval
* @return - SimpleInterval representing the portion of primaryInterval not overlapped by secondaryInterval
*/
@VisibleForTesting
protected static SimpleInterval getNonOverlappingInterval(final SimpleInterval primaryInterval,
Contributor

Why not add this to the SimpleInterval class? Make it similar to the spanWith() method. Let's call it extendWithSpan().

Contributor Author

I started doing this originally but then realized that it's not super generalizable because I made a lot of assumptions about the intervals. I could still add it but would need to enforce the secondary interval overlapping but not containing the primary interval

Member

Consider also the name subtractInterval (a la bedtools).

final SimpleInterval secondaryInterval) {
if (primaryInterval.getStart() < secondaryInterval.getStart()) {
Contributor

Should check that the contigs are the same.

return new SimpleInterval(primaryInterval.getContig(), primaryInterval.getStart(), secondaryInterval.getStart());
}
else {
return new SimpleInterval(primaryInterval.getContig(), secondaryInterval.getEnd(), primaryInterval.getEnd());
}
}
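
The review thread on `getNonOverlappingInterval` asks for a contig check, and the author notes that a general-purpose version would have to enforce that the secondary interval overlaps but does not contain the primary one. A minimal sketch of what those guards might look like follows. `Interval` and `subtract` are hypothetical names (the latter borrowed from the `subtractInterval` suggestion), the class stands in for `SimpleInterval`, and the returned endpoints deliberately mirror the ones in the diff rather than proposing different coordinates.

```java
/** Illustrative sketch only; not part of GATK. */
final class IntervalSubtractionSketch {

    /** Hypothetical closed, 1-based interval standing in for SimpleInterval. */
    static final class Interval {
        final String contig;
        final int start;
        final int end;

        Interval(final String contig, final int start, final int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }

        boolean overlaps(final Interval other) {
            return contig.equals(other.contig) && start <= other.end && other.start <= end;
        }

        boolean contains(final Interval other) {
            return contig.equals(other.contig) && start <= other.start && end >= other.end;
        }
    }

    /**
     * Returns the part of {@code primary} on the side not covered by {@code secondary}.
     * Preconditions are checked explicitly instead of being assumed.
     */
    static Interval subtract(final Interval primary, final Interval secondary) {
        if (!primary.contig.equals(secondary.contig)) {
            throw new IllegalArgumentException("Intervals must be on the same contig");
        }
        if (!primary.overlaps(secondary) || secondary.contains(primary)) {
            throw new IllegalArgumentException("Secondary must overlap but not contain primary");
        }
        if (primary.start < secondary.start) {
            // Same endpoints as the diff: the left flank of primary, up to secondary's start.
            // If secondary lies strictly inside primary, only this left flank is returned.
            return new Interval(primary.contig, primary.start, secondary.start);
        } else {
            // Same endpoints as the diff: the right flank of primary, from secondary's end.
            return new Interval(primary.contig, secondary.end, primary.end);
        }
    }

    public static void main(final String[] args) {
        final Interval inv = new Interval("chr1", 100, 300);
        final Interval dup = new Interval("chr1", 200, 400);
        final Interval rest = subtract(inv, dup);
        System.out.println(rest.contig + ":" + rest.start + "-" + rest.end);
    }
}
```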

/**
* Parse one interval string from CPX_INTERVALS INFO field into an SVSegment representing the SV type and
* interval of one of the components of the complex event
* @param cpxInterval - one element from CPX_INTERVALS list, a string representing one component of complex SV
* @return - SVSegment representing one component of the complex SV (type and interval)
* Parse CPX_INTERVALS field into a list of SV segments for annotation of protein-coding consequences.
* Ignore or adjust INV intervals as required by the CPX event type
* @param cpxIntervals - list of elements from CPX_INTERVALS field, each describing one segment of a complex SV
* @param complexType - Complex SV event type category, from CPX_TYPE field
* @return - List of SVSegments representing component of the complex SV (type and interval) to annotate for
* protein-coding consequences
*/
@VisibleForTesting
protected static SVSegment parseCPXIntervalString(final String cpxInterval) {
final String[] parsed = cpxInterval.split("_");
final GATKSVVCFConstants.StructuralVariantAnnotationType svTypeForInterval = GATKSVVCFConstants.StructuralVariantAnnotationType.valueOf(parsed[0]);
final SimpleInterval interval = new SimpleInterval(parsed[1]);
return new SVSegment(svTypeForInterval, interval);
protected static List<SVSegment> getComplexAnnotationIntervals(final List<String> cpxIntervals,
final String complexType) {
Contributor

I think it would be more maintainable to have separate parsing and processing methods for this. Define inner classes ComplexInterval and ComplexType that are created in the parsing method(s) and then pass them in here. Ideally you would make use of the GATKSVVCFConstants.ComplexVariantSubtype type for this.

Doing so is a bit more "brittle," but it is safer to be explicitly checking and representing the input this way. Also it avoids the string matching, which can be prone to bugs.

final List<SVSegment> segments = new ArrayList<>(cpxIntervals.size() + 1);
final List<SimpleInterval> dupIntervals = new ArrayList<>(cpxIntervals.size());
SimpleInterval inversionIntervalToAdjust = null;
for (final String cpxInterval : cpxIntervals) {
final String[] parsed = cpxInterval.split("_");
final GATKSVVCFConstants.StructuralVariantAnnotationType svTypeForInterval = GATKSVVCFConstants.StructuralVariantAnnotationType.valueOf(parsed[0]);
final SimpleInterval interval = new SimpleInterval(parsed[1]);
if (svTypeForInterval == GATKSVVCFConstants.StructuralVariantAnnotationType.INV) {
// ignore INV segment for dDUP_iDEL or INS_iDEL
Contributor

Add a quick explanation of why

if (complexType.contains("iDEL") || complexType.contains("dDUP")) {
continue;
Member

This may just be a stylistic preference but personally I find a loop that has multiple if/else clauses in it, some of which have a continue while others don't, a bit tricky to read and follow. I guess I'd prefer to have something like an addSegment flag set in the conditional clauses and then always check it before doing segments.add(originalSegment). But not a huge deal.

}
// save INV interval to adjust later for dupINV / INVdup / dupINVdup / dupINVdel / delINVdup
Contributor

Also here

else if (complexType.contains("INV") && complexType.contains("dup")) {
inversionIntervalToAdjust = new SimpleInterval(interval);
Contributor

Do you need to call new here?

continue;
}
}
if (svTypeForInterval == GATKSVVCFConstants.StructuralVariantAnnotationType.DUP) {
dupIntervals.add(interval);
}
segments.add(new SVSegment(svTypeForInterval, interval));
}
// adjust INV interval for dupINV / INVdup / dupINVdup / dupINVdel / delINVdup
Contributor

And here

if (inversionIntervalToAdjust != null) {
SimpleInterval adjustedInversionInterval = inversionIntervalToAdjust;
for (final SimpleInterval dupInterval : dupIntervals) {
adjustedInversionInterval = getNonOverlappingInterval(adjustedInversionInterval, dupInterval);
}
segments.add(new SVSegment(GATKSVVCFConstants.StructuralVariantAnnotationType.INV, adjustedInversionInterval));
}

return segments;
}
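
One way to read the review suggestion to separate parsing from processing is sketched below. It covers only the parsing half, under the assumption (taken from the `split("_")` and `new SimpleInterval(parsed[1])` calls in the diff) that each `CPX_INTERVALS` element looks like `TYPE_contig:start-end`. `SegmentType`, `ParsedSegment`, and `parse` are hypothetical names, and splitting on the first underscore only is a choice made here so that contig names containing underscores survive; it is not something the PR does.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the GATK SVAnnotate implementation. */
final class CpxIntervalParsingSketch {

    /** Hypothetical segment types; the real code uses StructuralVariantAnnotationType. */
    enum SegmentType { DEL, DUP, INV, INS }

    /** Hypothetical typed result of parsing one CPX_INTERVALS element. */
    static final class ParsedSegment {
        final SegmentType type;
        final String contig;
        final int start;
        final int end;

        ParsedSegment(final SegmentType type, final String contig, final int start, final int end) {
            this.type = type;
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /** Parses a string assumed to have the form TYPE_contig:start-end. */
    static ParsedSegment parse(final String cpxInterval) {
        final int sep = cpxInterval.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("Expected TYPE_contig:start-end but found: " + cpxInterval);
        }
        // valueOf throws IllegalArgumentException for an unknown type, so bad input fails early
        final SegmentType type = SegmentType.valueOf(cpxInterval.substring(0, sep));
        final String locus = cpxInterval.substring(sep + 1);
        final int colon = locus.lastIndexOf(':');
        final int dash = locus.indexOf('-', colon + 1);
        if (colon < 1 || dash < 0) {
            throw new IllegalArgumentException("Expected contig:start-end but found: " + locus);
        }
        final int start = Integer.parseInt(locus.substring(colon + 1, dash));
        final int end = Integer.parseInt(locus.substring(dash + 1));
        return new ParsedSegment(type, locus.substring(0, colon), start, end);
    }

    /** Parses every element up front so later processing works on typed values only. */
    static List<ParsedSegment> parseAll(final List<String> cpxIntervals) {
        final List<ParsedSegment> parsed = new ArrayList<>(cpxIntervals.size());
        for (final String cpxInterval : cpxIntervals) {
            parsed.add(parse(cpxInterval));
        }
        return parsed;
    }
}
```

With the strings converted once at the boundary, the INV-skipping and INV-adjusting rules could then compare enum values instead of calling `contains` on the complex type string, which is the maintainability point raised in the review.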


/**
* Get SV type to use for annotation for a breakend VCF record
* Breakend may represent BND, CTX, or DEL / DUP if the user specifies {@code SVAnnotate.MAX_BND_LEN_NAME}
Expand Down Expand Up @@ -562,17 +622,15 @@ protected static List<SVSegment> getSVSegments(final VariantContext variant,
final String chr2 = variant.getAttributeAsString(GATKSVVCFConstants.CONTIG2_ATTRIBUTE, null);
final int end2 = variant.getAttributeAsInt(GATKSVVCFConstants.END2_ATTRIBUTE, pos);
if (overallSVType.equals(GATKSVVCFConstants.StructuralVariantAnnotationType.CPX)) {
final List<String> cpxIntervalsString = variant.getAttributeAsStringList(GATKSVVCFConstants.CPX_INTERVALS, null);
if (cpxIntervalsString == null) {
final List<String> cpxIntervals = variant.getAttributeAsStringList(GATKSVVCFConstants.CPX_INTERVALS, null);
if (cpxIntervals == null) {
Contributor

Recently I learned that the defaultValue parameter in getAttributeAsStringList() is used as the default for list items that are null (i.e. .), but the method always returns a List. I'm not aware of a case where we'd expect an empty list or one that contains a null entry, so I think what you want to check is whether cpxIntervals.isEmpty() or whether it contains null.

throw new UserException("Complex (CPX) variant must contain CPX_INTERVALS INFO field");
}
if (complexType == null) {
throw new UserException("Complex (CPX) variant must contain CPX_TYPE INFO field");
}
intervals = new ArrayList<>(cpxIntervalsString.size() + 1);
for (final String cpxInterval : cpxIntervalsString) {
intervals.add(parseCPXIntervalString(cpxInterval));
}
intervals = getComplexAnnotationIntervals(cpxIntervals, complexType);
// no need to add sink site INS for INS_iDEL because DEL coordinates contain sink site
if (complexType.contains("dDUP")) {
intervals.add(new SVSegment(GATKSVVCFConstants.StructuralVariantAnnotationType.INS,
new SimpleInterval(chrom, pos, pos + 1)));
Expand Down Expand Up @@ -620,7 +678,46 @@ protected static List<SVSegment> getSVSegments(final VariantContext variant,
return intervals;
}
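
The review comment on the `cpxIntervals == null` check suggests testing for an empty list or a null entry instead. A small sketch of such a check on a plain `List<String>` is shown here; `isMissing` is a hypothetical helper name, and nothing in it depends on htsjdk. The null guard is kept only as a defensive extra.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Illustrative sketch only; not the GATK SVAnnotate implementation. */
final class CpxIntervalsCheckSketch {

    /**
     * Treats the attribute as missing if the list is null, empty, or has a null entry.
     * A stream is used instead of List.contains(null) because some immutable list
     * implementations (for example those returned by List.of) throw on that call.
     */
    static boolean isMissing(final List<String> values) {
        return values == null
                || values.isEmpty()
                || values.stream().anyMatch(Objects::isNull);
    }

    public static void main(final String[] args) {
        System.out.println(isMissing(Collections.emptyList()));
        System.out.println(isMissing(Arrays.asList("DUP_chr1:100-200", null)));
        System.out.println(isMissing(Arrays.asList("DUP_chr1:100-200")));
    }
}
```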

/**
* Update list of SVSegments to use for promoter & noncoding annotations for complex SVs. Removes DUP segments
Contributor

Suggested change
* Update list of SVSegments to use for promoter & noncoding annotations for complex SVs. Removes DUP segments
* Returns a new subsetted list of SVSegments to use for promoter & noncoding annotations for complex SVs. Removes DUP segments

IMO, "Updates" implies in-place modification of the input list

* which are never tandem in CPX events
* @param svSegments - List of SVSegments used for gene overlap annotations
* @return - Updated list of SVSegments to use for promoter & noncoding annotations for CPX SVs
*/
@VisibleForTesting
protected static List<SVSegment> getSegmentsForNonCodingAnnotations(final List<SVSegment> svSegments) {
Member

I'd modify the name of this to reinforce that it's meant for CPX events only. Alternatively, you might consider making a little helper class (CPXEventAnnotationSegmenter?) that groups some of these methods that are specific to CPX events (getSegmentsForNonCodingAnnotations, getSegmentForNearestTSS, etc.).

Contributor Author

I updated these functions to run on all SV types instead of just complex, to go with the other change you suggested below to create new lists of segments for each type of annotation

final List<SVSegment> updatedSegments = new ArrayList<>(svSegments.size());
for (final SVSegment svSegment : svSegments) {
if (svSegment.getIntervalSVType() != GATKSVVCFConstants.StructuralVariantAnnotationType.DUP) {
updatedSegments.add(svSegment);
}
}
return updatedSegments;
Contributor

Could be done in one line with a stream:

return svSegments.stream().filter(seg -> seg.getIntervalSVType() != GATKSVVCFConstants.StructuralVariantAnnotationType.DUP).collect(Collectors.toList());

}
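
For readers who want to run the stream variant from the review on its own, here is a self-contained sketch. `Segment` and `SegmentType` are hypothetical stand-ins for `SVSegment` and `StructuralVariantAnnotationType`; the point it illustrates is that the filter returns a new list and leaves the input untouched, which is why the Javadoc wording "Returns a new subsetted list" was suggested over "Update".

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative sketch only; not the GATK SVAnnotate implementation. */
final class NonCodingSegmentFilterSketch {

    /** Hypothetical segment types; the real code uses StructuralVariantAnnotationType. */
    enum SegmentType { DEL, DUP, INV, INS }

    /** Hypothetical stand-in for SVSegment, reduced to the field the filter needs. */
    static final class Segment {
        final SegmentType type;
        final String locus;

        Segment(final SegmentType type, final String locus) {
            this.type = type;
            this.locus = locus;
        }
    }

    /** Returns a new list without DUP segments; the input list is not modified. */
    static List<Segment> withoutDuplications(final List<Segment> segments) {
        return segments.stream()
                .filter(segment -> segment.type != SegmentType.DUP)
                .collect(Collectors.toList());
    }

    public static void main(final String[] args) {
        final List<Segment> all = Arrays.asList(
                new Segment(SegmentType.DUP, "chr1:100-200"),
                new Segment(SegmentType.INV, "chr1:200-300"),
                new Segment(SegmentType.DEL, "chr1:300-400"));
        final List<Segment> kept = withoutDuplications(all);
        System.out.println(all.size() + " segments in, " + kept.size() + " segments kept");
    }
}
```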

/**
* Update list of SVSegments to use for nearest TSS annotations for complex SVs. DUP segments are already removed.
Contributor

Suggested change
* Update list of SVSegments to use for nearest TSS annotations for complex SVs. DUP segments are already removed.
* Returns a new subsetted list of SVSegments to use for nearest TSS annotations for complex SVs. DUP segments are already removed.

* Merges remaining intervals (DEL, INV) for deletion-containing CPX events.
* @param svSegments - List of SVSegments used for gene overlap annotations
* @return - Updated list of SVSegments to use for nearest TSS annotations for CPX SVs
*/
@VisibleForTesting
protected static List<SVSegment> getSegmentForNearestTSS(final List<SVSegment> svSegments,
final String complexType) {
// for dDUP_iDEL, INS_iDEL, delINV, INVdel, dupINVdel, delINVdup, delINVdel --> merge all remaining SV segments
// which will be INS, DEL, INV types (DUPs already removed)
if (complexType.contains("del") || complexType.contains("DEL")) {
SimpleInterval spanningSegment = svSegments.get(0).getInterval();
for (int i = 1; i < svSegments.size(); i++) {
spanningSegment = spanningSegment.spanWith(svSegments.get(i).getInterval());
}
return Collections.singletonList(new SVSegment(GATKSVVCFConstants.StructuralVariantAnnotationType.DEL,
spanningSegment));
} else {
// for dDUP, dupINV, INVdup, dupINVdup --> no further modifications (already adjusted INV, removed DUPs)
return svSegments;
}
}
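
The merging step in `getSegmentForNearestTSS` amounts to folding the remaining intervals into one spanning interval when the complex type name contains a deletion. A minimal sketch of that fold is below. `Interval`, `spanAll`, and `containsDeletion` are hypothetical names; `spanWith` here is a simplified re-implementation of the idea behind `SimpleInterval.spanWith`, and the explicit empty-list guard is an addition for the sketch, not something present in the diff.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not the GATK SVAnnotate implementation. */
final class NearestTssSpanSketch {

    /** Hypothetical closed, 1-based interval standing in for SimpleInterval. */
    static final class Interval {
        final String contig;
        final int start;
        final int end;

        Interval(final String contig, final int start, final int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }

        /** Smallest interval covering both this interval and {@code other}. */
        Interval spanWith(final Interval other) {
            if (!contig.equals(other.contig)) {
                throw new IllegalArgumentException("Cannot span intervals on different contigs");
            }
            return new Interval(contig, Math.min(start, other.start), Math.max(end, other.end));
        }
    }

    /** Same string test as the diff: matches both "del" and "DEL" in the complex type name. */
    static boolean containsDeletion(final String complexType) {
        return complexType.contains("del") || complexType.contains("DEL");
    }

    /** Folds all intervals into one spanning interval. */
    static Interval spanAll(final List<Interval> intervals) {
        if (intervals.isEmpty()) {
            throw new IllegalArgumentException("At least one interval is required");
        }
        Interval spanning = intervals.get(0);
        for (int i = 1; i < intervals.size(); i++) {
            spanning = spanning.spanWith(intervals.get(i));
        }
        return spanning;
    }

    public static void main(final String[] args) {
        final List<Interval> remaining = Arrays.asList(
                new Interval("chr1", 100, 200),
                new Interval("chr1", 500, 600));
        if (containsDeletion("delINV")) {
            final Interval merged = spanAll(remaining);
            System.out.println(merged.contig + ":" + merged.start + "-" + merged.end);
        }
    }
}
```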

/**
* Create a copy of the variant consequence dictionary in which the feature names for each consequence are sorted
Expand Down Expand Up @@ -649,16 +746,25 @@ protected static Map<String, Object> sortVariantConsequenceDict(final Map<String
protected Map<String, Object> annotateStructuralVariant(final VariantContext variant) {
final Map<String, Set<String>> variantConsequenceDict = new HashMap<>();
final GATKSVVCFConstants.StructuralVariantAnnotationType overallSVType = getSVType(variant);
final List<SVSegment> svSegments = getSVSegments(variant, overallSVType, maxBreakendLen);
final boolean isComplex = overallSVType == GATKSVVCFConstants.StructuralVariantAnnotationType.CPX;
final String complexType = variant.getAttributeAsString(GATKSVVCFConstants.CPX_TYPE, null);
List<SVSegment> svSegments = getSVSegments(variant, overallSVType, maxBreakendLen);

// annotate gene overlaps
if (gtfIntervalTrees != null && gtfIntervalTrees.getTranscriptIntervalTree() != null) {
for (SVSegment svSegment : svSegments) {
annotateGeneOverlaps(svSegment.getInterval(), svSegment.getIntervalSVType(), variantConsequenceDict);
annotateGeneOverlaps(svSegment.getInterval(), svSegment.getIntervalSVType(), isComplex, variantConsequenceDict);
}
}

// if variant consequence dictionary is empty (no protein-coding annotations), apply INTERGENIC flag
final boolean noCodingAnnotations = variantConsequenceDict.isEmpty();

// for CPX events, update SV segments to annotate promoter & noncoding consequences
if (overallSVType == GATKSVVCFConstants.StructuralVariantAnnotationType.CPX) {
svSegments = getSegmentsForNonCodingAnnotations(svSegments);
Member

Rather than changing the value of svSegments I think it would be better to be explicit and make (final) svSegmentsForGeneOverlaps, svSegmentsForNonCodingAnnotations, and svSegmentsForNearestTSS variables that you set appropriately before each call. Overriding svSegments before each call seems like too much dependence on mutable state to me.

}

// then annotate promoter overlaps and non-coding feature overlaps
if (gtfIntervalTrees != null && gtfIntervalTrees.getPromoterIntervalTree() != null) {
for (final SVSegment svSegment : svSegments) {
Expand All @@ -672,6 +778,11 @@ protected Map<String, Object> annotateStructuralVariant(final VariantContext var
}
}

// for CPX events, update SV segments to annotate nearest TSS
if (overallSVType == GATKSVVCFConstants.StructuralVariantAnnotationType.CPX) {
svSegments = getSegmentForNearestTSS(svSegments, complexType);
}

// annotate nearest TSS for intergenic variants with no promoter overlaps
if (gtfIntervalTrees != null && gtfIntervalTrees.getTranscriptionStartSiteTree() != null &&
!variantConsequenceDict.containsKey(GATKSVVCFConstants.PROMOTER) && noCodingAnnotations) {
Expand Down