Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#1329 closed defect (fixed)

HaplotypeCaller is "inventing" new variants

Reported by: Nicklas Nordborg Owned by: Nicklas Nordborg
Priority: major Milestone: Reggie v4.32.1
Component: net.sf.basedb.reggie Keywords:


The output from HaplotypeCaller sometimes contains results for more variants than we submitted in the reference VCF file. There is probably some logic to it, but we would like to keep only the results for the variants that we asked for. For example, for ESR1 we test two variants that are located next to each other:

chr6:152098787 T>A
chr6:152098788 A>C

HaplotypeCaller may also output results for the combined variant:

chr6:152098787 TA>AC

At the protein level this results in three different variants (Y539N, Y539T and Y539S), so the combined variant could very well be important, but it is not what we asked for.
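To illustrate why two adjacent substitutions and their combination give three distinct amino acid changes, here is a minimal sketch. It assumes the two positions are the first and second base of a tyrosine codon read on the forward strand, and it takes TAT as the reference codon. The third base is not stated in this ticket and is an assumption (TAC would give the same amino acids). The helper names are hypothetical and only the codons needed for the example are included.

```python
# Subset of the standard genetic code, limited to the codons used below.
CODON_TO_AA = {"TAT": "Y", "AAT": "N", "TCT": "S", "ACT": "T"}

# Assumption: the third base of the codon is not given in the ticket.
REFERENCE_CODON = "TAT"


def apply_substitutions(codon, substitutions):
    """Return the codon after applying (offset, ref, alt) substitutions."""
    bases = list(codon)
    for offset, ref, alt in substitutions:
        if bases[offset] != ref:
            raise ValueError(
                f"expected {ref} at offset {offset}, found {bases[offset]}"
            )
        bases[offset] = alt
    return "".join(bases)


def protein_change(substitutions):
    """Describe the amino acid change as '<ref aa>><alt aa>'."""
    mutated = apply_substitutions(REFERENCE_CODON, substitutions)
    return CODON_TO_AA[REFERENCE_CODON] + ">" + CODON_TO_AA[mutated]


# First variant alone, second variant alone, and both on the same haplotype.
first_only = protein_change([(0, "T", "A")])
second_only = protein_change([(1, "A", "C")])
combined = protein_change([(0, "T", "A"), (1, "A", "C")])

print(first_only, second_only, combined)
```

Under these assumptions the combined change is a third amino acid that neither single substitution produces, which is presumably why HaplotypeCaller reports it as a separate record when both changes occur on the same haplotype.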

Since we are annotating the results with the TYPE annotation from the reference VCF it should be relatively easy to remove entries that have no TYPE annotation from the final result.

Change History (3)

comment:1 by Nicklas Nordborg, 3 years ago

I think it should be possible to use grep "\(^#\|TYPE\=\)" to filter out the lines without a TYPE annotation. We need to keep all header lines (starting with #) and all other lines containing TYPE=.
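For comparison, the same filtering idea can be sketched in Python. This is not the code committed for this ticket, just an illustration. It assumes a tab-separated VCF with INFO in the eighth column, and the function name and the TYPE value used in the sample data are hypothetical. Unlike the grep pattern, which matches TYPE= anywhere on the line, this variant looks only for an INFO key named exactly TYPE.

```python
def keep_requested_variants(lines):
    """Yield header lines and records whose INFO column has a TYPE key."""
    for line in lines:
        if line.startswith("#"):
            # Header and meta-information lines are always kept.
            yield line
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 8:
            # No INFO column, so there cannot be a TYPE annotation.
            continue
        info_keys = {entry.split("=", 1)[0] for entry in columns[7].split(";")}
        if "TYPE" in info_keys:
            yield line


# Hypothetical sample data: two requested variants and one combined variant
# that carries no TYPE annotation.
sample = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr6\t152098787\t.\tT\tA\t50\t.\tDP=100;TYPE=hotspot\n",
    "chr6\t152098788\t.\tA\tC\t50\t.\tDP=90;TYPE=hotspot\n",
    "chr6\t152098787\t.\tTA\tAC\t50\t.\tDP=95\n",
]

filtered = list(keep_requested_variants(sample))
print("".join(filtered), end="")
```

Checking the INFO column by key avoids accidental matches on other keys that merely end in TYPE, at the cost of being more code than the one-line grep.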

comment:2 by Nicklas Nordborg, 3 years ago

Resolution: fixed
Status: new → closed

In 6405:

Fixes #1329: HaplotypeCaller is "inventing" new variants

comment:3 by Nicklas Nordborg, 3 years ago

Milestone: Reggie v4.33 → Reggie v4.32.1

Milestone renamed
