#1009 assigned task

Genotype quality control wizard

Since #1001 we are creating a VCF file with genotypes for 51 SNPs. A wizard should be implemented that uses this information for quality control. There are basically two things that we can check by comparing two VCF files:

  • If the samples are from different patients the genotypes should be different.
  • If the samples are from the same patient the genotypes should be similar.
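As a rough illustration of the two checks above, the sketch below reads the GT field from two single-sample VCF files and reports the fraction of shared SNP sites with identical genotypes. The function names and example data are hypothetical, and the measure itself (simple concordance over sites called in both files) is an assumption; the actual comparison method and warning thresholds are not yet decided.

```python
# Minimal sketch, not the wizard's actual implementation.
# Assumes single-sample VCF text with the sample in column 10.

def read_genotypes(vcf_text):
    """Map (chrom, pos) -> unordered genotype tuple, skipping no-calls."""
    genotypes = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # header or blank line
        cols = line.split("\t")
        if len(cols) < 10:
            continue  # no sample column
        fmt_keys = cols[8].split(":")
        if "GT" not in fmt_keys:
            continue
        gt = cols[9].split(":")[fmt_keys.index("GT")]
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            continue  # no-call, cannot be compared
        # Sort so that 0/1 and 1/0 count as the same genotype
        genotypes[(cols[0], int(cols[1]))] = tuple(sorted(alleles))
    return genotypes


def concordance(a, b):
    """Fraction of sites called in both samples with the same genotype."""
    shared = a.keys() & b.keys()
    if not shared:
        return None  # nothing to compare
    same = sum(1 for site in shared if a[site] == b[site])
    return same / len(shared)


# Hypothetical example data
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n"
vcf_a = (HEADER
         + "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30\n"
         + "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t1/1:25\n"
         + "chr2\t300\t.\tG\tA\t50\tPASS\t.\tGT:DP\t0/0:40\n")
vcf_b = (HEADER
         + "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t1|0:28\n"
         + "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t0/1:31\n"
         + "chr2\t300\t.\tG\tA\t50\tPASS\t.\tGT:DP\t./.:0\n")

score = concordance(read_genotypes(vcf_a), read_genotypes(vcf_b))
print(score)
```

A high score between samples from different patients, or a low score between samples from the same patient, would be a candidate for a warning.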

The wizard is intended to be run manually at regular intervals, typically once for each library plate. When run, it should compare the VCF files from the selected samples with each other AND with the VCF files for all other samples that have already been checked. The wizard should NOT compare against samples that are neither selected nor already checked.
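The comparison rule above can be sketched as a function that enumerates the pairs to compare. The function name and the use of plain sample names are assumptions made for illustration only.

```python
from itertools import combinations


def comparison_pairs(selected, checked):
    """Return the sample pairs the wizard would compare (hypothetical helper).

    selected: samples chosen in the current run
    checked:  samples that have already been checked in earlier runs
    Samples that are in neither list are never compared.
    """
    selected = list(dict.fromkeys(selected))  # drop duplicates, keep order
    in_selection = set(selected)
    already = [s for s in dict.fromkeys(checked) if s not in in_selection]

    # Selected samples against each other
    pairs = list(combinations(selected, 2))
    # Each selected sample against every already-checked sample
    pairs += [(s, c) for s in selected for c in already]
    return pairs


pairs = comparison_pairs(["A", "B"], ["C", "D"])
print(pairs)
```

Already-checked samples are not compared with each other again, since those pairs were covered when the samples were first checked.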

Since we are starting out with lots of existing and unchecked samples, the wizard should sort them in library plate order to make it easy to go through all of them in a controlled way.

The exact details of how to compare the VCF files, and the parameters for situations that should generate warnings, are not yet settled. We also need to think about how to store the warnings, since most will be of a nature that can't be resolved immediately (for example, a sample may need to be re-processed).

Change History (2)

(In [4642]) References #1009: Genotype quality control wizard

Started with the "Genotype quality control" wizard. It has been added to the index page under the "Hisat" section.

The annotation type QC_GenotypeStatus was added to keep track of alignments that have already been checked (or disabled).

The first step of the wizard displays alignments waiting to be checked (they have no QC_GenotypeStatus annotation). The alignments are sorted by library plate, and at most 250 are displayed at a time.
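The selection in the first step can be sketched as follows. This is not the servlet code: alignments are represented as plain dictionaries, and the "plate" and "name" keys are hypothetical stand-ins for however the library plate and alignment name are actually stored.

```python
def next_batch(alignments, limit=250):
    """Return unchecked alignments in library plate order, at most `limit`."""
    # An alignment is waiting if it has no QC_GenotypeStatus annotation
    pending = [a for a in alignments if a.get("QC_GenotypeStatus") is None]
    # Sort by plate first, then by name for a stable order within a plate
    pending.sort(key=lambda a: (a["plate"], a["name"]))
    return pending[:limit]


# Hypothetical example data
alignments = [
    {"name": "s3", "plate": "Plate002", "QC_GenotypeStatus": None},
    {"name": "s1", "plate": "Plate001", "QC_GenotypeStatus": "Checked"},
    {"name": "s2", "plate": "Plate001"},
    {"name": "s4", "plate": "Plate001", "QC_GenotypeStatus": "Disabled"},
]

batch = next_batch(alignments)
print([a["name"] for a in batch])
```

Both checked and disabled alignments carry the annotation, so neither shows up in the list again.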

The VCF statistics have also been moved from the HisatServlet to the GenotypeServlet.

