Changes between Version 7 and Version 8 of net.sf.basedb.varsearch/using


Timestamp:
May 6, 2024, 11:36:35 AM (2 weeks ago)
Author:
Nicklas Nordborg
Comment:

Update documentation

  • net.sf.basedb.varsearch/using

    v7 v8  
    22
    33== Introduction ==
    4 The current search functionality is integrated directly in BASE. Go to the raw bioassays list page and add the **Variant (filtered)**, **Variants (all)** and/or **Variants (targeted)** columns to the table using the **Columns...** dialog, or right-click on the title row and enable them from the popup menu.
     4The current search functionality is integrated directly in BASE. Go to the raw bioassays list page and add one or more of the columns mentioned below to the table using the **Columns...** dialog, or right-click on the title row and enable them from the popup menu.
    55
    66By default, the columns display **Yes** or **No** depending on whether the VCF file has been indexed or not. If **Yes**, the number of variants is also displayed.
     7
     8==== Variants for the RNAseq pipeline ====
     9
     10The following indexes are for the RNAseq pipeline. Set a filter on the **Pipeline** annotation to **RNAseq/Hisat/VariantCall** as a baseline for selecting bioassays that are included in the variant indexes.
    711
    812 * **Variants (all)**: This index contains all variants found in the variant calling pipeline. There are typically several thousand for each raw bioassay.
     
    1216  - PIK3CA (11 variants related to Alpelisib treatment)
    1317  - DPYD (6 variants related to fluoropyrimidine-associated toxicity)
     18
     19==== Variants for the WGS pipeline ====
     20
     21The following indexes are for the WGS pipeline. Set a filter on the **Pipeline** annotation to **DNA/Paired/VariantCall** as a baseline for selecting bioassays that are included in the variant indexes.
     22
     23 * **Variants (WGS)**: This index contains the variants that are considered to be somatic by the WGS variant calling pipeline.
     24
     25==== Variants for the genotyping pipeline ====
     26
     27The following indexes are for the !OncoArray genotyping pipeline. Set a filter on the **Pipeline** annotation to **DNA/Genotyping** as a baseline for selecting bioassays that are included in the variant indexes.
     28
     29 * **Genotyping (OncoArray500K)**: This index contains genotypes measured on the !OncoArray 500K chip.
     30 * **Genotyping (Imputed)**: This index contains genotypes that have been imputed from the !OncoArray data. There is information about ~84 million variants.
     31
    1432
    1533== Searching ==
     
    99117
    100118
    101 == Notes about the !OncoArray genotyping index ==
     119== Notes about the !OncoArray and imputed genotyping indexes ==
    102120
    103 This index also behaves differently from the other indexes. Since almost 500 thousand variants have been genotyped for each raw bioassay, it is not possible to create a single index with all information. It has to be split into one index for the "design" that contains information and gene annotations about the 500 thousand variants, and one index that enumerates the variants for each of the three possible genotypes. The index doesn't contain any information about allele frequency or other information that is per raw bioassay and variant.
     121These two indexes also behave differently from the other indexes. Since hundreds of thousands to millions of variants have been genotyped for each raw bioassay, it is not possible to create a single index with all information. It has to be split into one index for the "design" that contains information and gene annotations about the variants, and one index with sample-specific information. The indexes don't contain any information that is per raw bioassay and variant.
    104122
    105123Thus, `gt` is the only genotype-related field that is searchable and it is only possible to use it once in the query.
     124
     125The imputed index has some extra fields that are searchable:
     126
     127 * `dr2:`: Dosage R-Squared, a value between 0 and 1 that indicates how good the imputation results are for a variant across all samples.
     128 * `svtype:`: Used for structural variants.
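
The two restrictions described above (`gt` may appear only once, and `dr2` is a value between 0 and 1) can be checked before a query is submitted. The following is only a minimal sketch: the function `check_query` is hypothetical and not part of BASE or the variant search extension, it assumes that terms are written as `field:value`, and the values `A` and `B` in the usage example are made-up placeholders rather than real genotype values.

```python
import re

# Hypothetical helper, not part of BASE or the varsearch extension.
# Assumption: query terms are written as `field:value`.
FIELD_RE = re.compile(r"\b(\w+):")


def check_query(query):
    """Return a list of problems found in a genotyping query string."""
    problems = []

    # The text states that `gt` can only be used once in a query.
    fields = FIELD_RE.findall(query)
    if fields.count("gt") > 1:
        problems.append("'gt' may be used only once per query")

    # The text states that dr2 is a value between 0 and 1.
    # Only plain numeric values are checked; anything else is skipped.
    for value in re.findall(r"\bdr2:([0-9.]+)", query):
        try:
            number = float(value)
        except ValueError:
            continue
        if not 0.0 <= number <= 1.0:
            problems.append(f"dr2 value {value} is outside the range 0 to 1")

    return problems


if __name__ == "__main__":
    # Placeholder values only; the real value syntax may differ.
    for q in ("gt:A dr2:0.8", "gt:A gt:B", "dr2:1.5"):
        print(q, "->", check_query(q) or "ok")
```

A check like this only catches the two restrictions named in this section. It does not validate the rest of the query syntax.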
    106129
    107130== Timeouts ==