Changes between Version 1 and Version 2 of Ticket #1225


Timestamp:
Jan 28, 2020, 8:37:09 AM (5 years ago)
Author:
Nicklas Nordborg
  • Ticket #1225 – Description

    v1 v2  
    11The aim is to update all databases that have been updated since the original variant calling pipeline. Most of the work is done outside of Reggie. Some of the changes and additional information can be found at http://onk-wiki.bmc.lu.se/trac/scanbprim/browser/scanbprim/support-files/variant-calling
    22
    3 == Summary of changes ==
     3----
    44
    55**dbSNP updated to version 153**\\
     
    88**COSMIC updated to version 90**\\
    99COSMIC has made major changes to ID assignment and to how samples are reported, which affected the custom scripts for calculating mutation frequencies. This was solved by matching ID+GENE from the VCF to ID+GENE in the sample mutation table. The end result should be compatible with older versions of COSMIC.
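
The ID+GENE matching described above can be sketched as follows. This is a minimal illustration, not the actual custom script: the row layout `(mutation_id, gene, sample_id)`, the function names, and the choice to count distinct samples per key are all assumptions made for the example. The real scripts are in the repository linked in the description.

```python
from collections import defaultdict


def count_samples_by_id_gene(sample_rows):
    """Count distinct samples per (mutation ID, gene) pair.

    sample_rows is a hypothetical iterable of (mutation_id, gene, sample_id)
    tuples taken from the sample mutation table.
    """
    samples = defaultdict(set)
    for mutation_id, gene, sample_id in sample_rows:
        # Using the pair as the key means the same ID reported under
        # two different genes is counted separately.
        samples[(mutation_id, gene)].add(sample_id)
    return {key: len(value) for key, value in samples.items()}


def lookup_count(counts, vcf_id, vcf_gene):
    """Return the sample count for an ID+GENE pair read from the VCF."""
    return counts.get((vcf_id, vcf_gene), 0)
```

A VCF record whose ID+GENE pair is missing from the table gets a count of 0 in this sketch. How such records should be handled in practice is not stated in the ticket.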
     10
     11**gnomAD updated to version 2.1.1**\\
     12The major change is a large increase in file size, because many more annotations have been added to the VCF files. Most of the annotations are related to variant frequencies in different populations. The large files are impractical, so we create smaller files by removing all annotations that we don't need. The annotations we keep are:
     13
     14 * Exomes: AF, popmax, AF_popmax, AF_female, AF_nfe
     15 * Genomes: AF, AF_female, AF_male, AF_nfe
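
Removing the unneeded annotations could look roughly like the sketch below, which filters the INFO column (the eighth tab-separated column of a VCF data line) down to the keys listed above. The function names are hypothetical and this is not the script actually used. It also leaves the `##INFO` header lines of the removed annotations untouched, which a complete solution would probably want to clean up as well.

```python
# Annotation keys to keep, copied from the lists above.
EXOME_KEEP = ("AF", "popmax", "AF_popmax", "AF_female", "AF_nfe")
GENOME_KEEP = ("AF", "AF_female", "AF_male", "AF_nfe")


def filter_info(info, keep):
    """Keep only the INFO entries whose key is in `keep`."""
    if info == ".":
        return info
    kept = [
        entry for entry in info.split(";")
        if entry.split("=", 1)[0] in keep
    ]
    # An empty INFO column is written as "." in VCF.
    return ";".join(kept) if kept else "."


def filter_vcf_line(line, keep):
    """Filter one VCF line; header lines are returned unchanged."""
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    fields[7] = filter_info(fields[7], keep)
    return "\t".join(fields) + "\n"
```

For example, applying `filter_vcf_line(line, EXOME_KEEP)` to every line of an exome file would keep only the five exome annotations and drop the rest of the INFO column.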
     16
     17**Swegen updated to version 20180409**\\
     18We decided to use the newer hg38 version (`swegen_frequencies_fixploidy_GRCh38_20190204.vcf.gz`) instead of the hg19 version (`swegen_frequencies_hg19_20180409.tar`).