Opened 2 months ago

Closed 5 weeks ago

#1323 closed task (fixed)

Genotype targeted variants

Reported by: Nicklas Nordborg
Owned by: Nicklas Nordborg
Priority: major
Milestone: Reggie v4.32
Component: net.sf.basedb.reggie
Keywords:
Cc:

Description

We should implement an analysis step for genotyping specific variants. For example, there are known variants in the ESR1, DPYD and PIK3CA genes that can affect treatment. The current variant calling analysis may catch some of these variants, but the filtering in that analysis is sometimes too strict and we may miss variants that are actually present. A miss may also be due to low coverage at a specific location.

The analysis step should take one or more VCF files that specify the target variants as input. It should be possible to add more VCF files in the future. Genotyping can probably be done with HaplotypeCaller, as we already do in the QC step. The results should be saved in VCF files attached to the Variant call raw bioassay. It is probably most practical to have one result VCF file for each target VCF file.

The result VCF files should be annotated in the same manner as the annotated VCF files from the existing variant calling analysis. This would make it possible to index them and make them searchable by the Variant search extension.

Change History (28)

comment:1 Changed 2 months ago by Nicklas Nordborg

In 6356:

References #1323: Genotype targeted variants

Added an item list for storing variant call raw bioassays that should be genotyped. The index page has been updated with a new wizard and counter, and the variant call auto-confirmation has been prepared to start the genotyping once it has been implemented.

comment:2 Changed 8 weeks ago by Nicklas Nordborg

In 6358:

References #1323: Genotype targeted variants

Started to implement a wizard for selecting variant calls that should be genotyped and submitting scripts to the cluster. Script generation is not yet implemented.

comment:3 Changed 8 weeks ago by Nicklas Nordborg

In 6359:

References #1323: Genotype targeted variants

Started implementing a script for targeted genotyping. It creates the HaplotypeCaller step, but post-processing and annotation are also needed.

comment:4 Changed 8 weeks ago by Nicklas Nordborg

In 6360:

References #1323: Genotype targeted variants

Added a new <targeted-genotyping> section in reggie-config.xml for configuring the VCF files that define the variants that should be genotyped.

comment:5 Changed 7 weeks ago by Nicklas Nordborg

In 6361:

References #1323: Genotype targeted variants

Result VCF files are now linked with the VariantCall item in BASE.

comment:6 Changed 7 weeks ago by Nicklas Nordborg

In 6362:

References #1323: Genotype targeted variants

Auto-confirmation after variant calling should now start the targeted genotyping.

comment:7 Changed 7 weeks ago by Nicklas Nordborg

In 6363:

References #1323: Genotype targeted variants

Added a LeftAlignAndTrimVariants step to split genomic locations with multiple alternate alleles into multiple rows. This is needed for the annotation processing to work.
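
To illustrate what splitting a location with multiple alternate alleles into multiple rows means, here is a minimal Python sketch. It is not the GATK implementation: the function name split_multiallelic and the example record are made up for illustration, and the sketch does no left-alignment or trimming and copies the INFO and sample columns unchanged.

```python
def split_multiallelic(vcf_line):
    """Split one VCF data line with several ALT alleles into one line per allele.

    Simplified illustration: only the ALT column (index 4) is rewritten.
    A real tool must also adjust per-allele INFO and sample fields.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    rows = []
    for alt in fields[4].split(","):
        row = list(fields)
        row[4] = alt
        rows.append("\t".join(row))
    return rows


# Made-up record with two alternate alleles at the same position.
line = "chr1\t100\t.\tA\tC,T\t.\t.\t."
for row in split_multiallelic(line):
    print(row)
```

After the split, each row carries exactly one alternate allele, which is the shape the annotation steps expect.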

comment:8 Changed 7 weeks ago by Nicklas Nordborg

In 6364:

References #1323: Genotype targeted variants

Implemented steps for annotating the variants from the databases and with snpEff. The procedure is more or less the same as for the main variant calling.

comment:9 Changed 7 weeks ago by Nicklas Nordborg

In 6365:

References #1323: Genotype targeted variants

Added TargetedGenotype annotation that can be used on VCF files to store the target definition that produced the VCF file.

comment:10 Changed 7 weeks ago by Nicklas Nordborg

In 6366:

References #1323: Genotype targeted variants

Updated the dialog for viewing information about variants with more information:

  • Genotype
  • Allelic depth
  • HGVS.c
  • HGVS.p
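
For reference, the four values listed above can be read from a single annotated VCF line roughly as follows. This is a sketch and not the Reggie code: the function name variant_summary and the example record are made up, a single sample column is assumed, and only the first ANN entry is used. The positions of HGVS.c and HGVS.p (fields 10 and 11 of an ANN entry) follow the standard snpEff ANN layout.

```python
def variant_summary(vcf_line):
    """Extract genotype, allelic depth, HGVS.c and HGVS.p from one VCF line.

    Assumes a single sample column and uses only the first ANN entry.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    # INFO column: KEY=VALUE pairs (or bare flags) separated by ';'
    info = {}
    for item in cols[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    # FORMAT keys (column 9) paired with the sample values (column 10)
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
    first_ann = info.get("ANN", "").split(",")[0].split("|")
    has_hgvs = len(first_ann) > 10
    return {
        "GT": sample.get("GT", "./."),
        "AD": sample.get("AD", ""),
        "HGVS.c": first_ann[9] if has_hgvs else "",
        "HGVS.p": first_ann[10] if has_hgvs else "",
    }


# Made-up record; gene, transcript and coordinates are placeholders.
line = (
    "chr1\t100\t.\tA\tT\t50\t.\t"
    "DP=30;ANN=T|missense_variant|MODERATE|GENE1|GENE1|transcript|TX1|"
    "protein_coding|2/5|c.10A>T|p.Thr4Ser|10/500|10/300|4/99||\t"
    "GT:AD\t0/1:18,12"
)
print(variant_summary(line))
```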

comment:11 Changed 6 weeks ago by Nicklas Nordborg

In 6367:

References #1323: Genotype targeted variants

Changed to GATK version 4.x (4.1.8.1 is currently installed) since version 3.8 was consistently unable to call some of the variants, while 4.x can. The --max-mnp-distance parameter is important since otherwise HaplotypeCaller will split some multi-base variants into several SNVs. For example, ESR1: chr6:152098785:TC>AG is split into chr6:152098785:T>A and chr6:152098786:C>G, which doesn't capture the correct protein change.
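
The effect of the parameter on the ESR1 example can be sketched in Python. This is only an illustration of the idea and not how GATK implements it: merge_adjacent_snvs is a hypothetical helper, the substitutions are assumed to be on the same haplotype, and only directly adjacent bases are merged (merging across a gap would require the reference sequence).

```python
def merge_adjacent_snvs(snvs, max_mnp_distance):
    """Merge directly adjacent substitutions into one multi-base variant.

    snvs: list of (pos, ref, alt) tuples sorted by position, all assumed
    to be on the same haplotype. With max_mnp_distance = 0 nothing is merged.
    """
    merged = []
    for pos, ref, alt in snvs:
        if merged and max_mnp_distance >= 1:
            prev_pos, prev_ref, prev_alt = merged[-1]
            # Directly adjacent: starts right after the previous variant ends
            if pos == prev_pos + len(prev_ref):
                merged[-1] = (prev_pos, prev_ref + ref, prev_alt + alt)
                continue
        merged.append((pos, ref, alt))
    return merged


# The two SNVs from the ESR1 example above
snvs = [(152098785, "T", "A"), (152098786, "C", "G")]
print(merge_adjacent_snvs(snvs, 0))  # kept as two separate SNVs
print(merge_adjacent_snvs(snvs, 1))  # merged into a single TC>AG variant
```

Annotating the merged TC>AG variant gives a single codon change, whereas annotating the two SNVs separately gives two different, and individually incorrect, protein changes.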

comment:12 Changed 6 weeks ago by Nicklas Nordborg

In 6368:

References #1323: Genotype targeted variants

The "View variant" dialog crashed if a variant could not be called (e.g. the GT field was set to ./. by HaplotypeCaller). Counting also needed some adjustments.
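
A defensive way to handle the no-call case when counting is sketched below. This is not the actual fix in the changeset: count_alt_alleles is a hypothetical helper that returns None for a missing genotype instead of failing on it.

```python
def count_alt_alleles(gt):
    """Return the number of non-reference alleles in a GT value.

    Returns None when the genotype could not be called (for example './.'),
    so callers can tell 'no data' apart from 'homozygous reference'.
    """
    # Phased ('|') and unphased ('/') genotypes are treated the same way
    alleles = gt.replace("|", "/").split("/")
    if any(allele == "." for allele in alleles):
        return None
    return sum(1 for allele in alleles if allele != "0")


for gt in ("0/0", "0/1", "1|1", "./."):
    print(gt, count_alt_alleles(gt))
```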

comment:13 Changed 6 weeks ago by Nicklas Nordborg

In 6369:

References #1323: Genotype targeted variants

Added auto-confirm support for targeted genotyping and linking to the Variant Indexing service via item lists.

comment:14 Changed 6 weeks ago by Nicklas Nordborg

In 6370:

References #1323: Genotype targeted variants

Added auto-confirm support for targeted genotyping and linking to the Variant Indexing service via item lists.

comment:15 Changed 5 weeks ago by Nicklas Nordborg

In 6382:

References #1323: Genotype targeted variants

Skip 'p.?' when displaying information and use data from Cosmic if no information is available from SnpEff.
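
One possible reading of this rule, sketched in Python. The function name protein_change and the example values are made up, and treating 'p.?' from Cosmic as missing as well is an assumption, not something stated in the changeset.

```python
def protein_change(snpeff_value, cosmic_value):
    """Pick the protein change to display.

    Prefers the SnpEff value, falls back to the Cosmic value, and treats
    empty values and 'p.?' as 'no information' (returns None if both are).
    """
    for value in (snpeff_value, cosmic_value):
        if value and value != "p.?":
            return value
    return None


# Placeholder values for illustration only
print(protein_change("p.Thr4Ser", "p.T4S"))
print(protein_change("p.?", "p.T4S"))
print(protein_change("", None))
```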

comment:16 Changed 5 weeks ago by Nicklas Nordborg

In 6384:

References #1323: Genotype targeted variants

Ignore some effects from the ANN annotation. (see #1326)

comment:17 Changed 5 weeks ago by Nicklas Nordborg

In 6386:

References #1323: Genotype targeted variants

Genotype VCF files should be included in the release.

comment:18 Changed 5 weeks ago by Nicklas Nordborg

In 6387:

References #1323: Genotype targeted variants

Merge all changes to the trunk.

comment:19 Changed 5 weeks ago by Nicklas Nordborg

In 6388:

References #1323: Genotype targeted variants

Added NumTargetedGenotypes and NumTargetedVariants annotations to be used on file items.

Also copy the description from the genotyping target definition to the file items.

comment:20 Changed 5 weeks ago by Nicklas Nordborg

Milestone: Reggie v4.x → Reggie v4.32

comment:21 Changed 5 weeks ago by Nicklas Nordborg

In 6389:

References #1323: Genotype targeted variants

Updating @since annotation to correct Reggie version 4.32

comment:22 Changed 5 weeks ago by Nicklas Nordborg

In 6390:

References #1323: Genotype targeted variants

The link to the dialog for viewing the genotypes didn't work in all cases since it was trying to find a DERIVEDBIOASSAY parent instead of RAWBIOASSAY for the targeted genotype files.

comment:23 Changed 5 weeks ago by Nicklas Nordborg

In 6391:

References #1323: Genotype targeted variants

Add link to a dialog in the Variant Search extension for displaying more information about a variant/genotype.

comment:24 Changed 5 weeks ago by Nicklas Nordborg

In 6392:

References #1323: Genotype targeted variants

Default configuration for targeted genotyping.

comment:25 Changed 5 weeks ago by Nicklas Nordborg

In 6393:

References #1323: Genotype targeted variants

Display information from the targeted genotyping in the case summary.

comment:26 Changed 5 weeks ago by Nicklas Nordborg

In 6394:

References #1323: Genotype targeted variants

Variants with no data are displayed with gray text.

comment:27 Changed 5 weeks ago by Nicklas Nordborg

In 6396:

References #1323: Genotype targeted variants

Remove debug code.

comment:28 Changed 5 weeks ago by Nicklas Nordborg

Resolution: fixed
Status: new → closed