Changes between Version 1 and Version 2 of net.sf.basedb.varsearch/install

Jun 11, 2021, 9:13:15 AM (3 years ago)
Nicklas Nordborg

Added information about indexing VCF files and the Index Manager


  • net.sf.basedb.varsearch/install

    v1 v2  
    1313 7. Go to the '''Extensions -> Variant Search (admin)''' menu and then continue to the '''Installation wizard'''. It should display some warnings and error messages. Click on the `Create missing items` button to fix them.
    1414 8. The installation is now complete, but before we can start searching, the VCF files need to be indexed.
     16== Indexing VCF files ==
     18Before it is possible to search for variants, the VCF files need to be indexed. Not every VCF file can be indexed; the items must be linked in a specific structure:
     20 * We need a raw bioassay with a raw data type that includes the VCF file type
     21 * The raw bioassay must have a VCF file linked via the '''VCF file type'''. This VCF is assumed to be the filtered VCF file.
     22 * The raw bioassay may also have a VCF linked via an any-to-any link named '''variants-annotated.vcf.gz'''. This VCF is assumed to be the unfiltered (raw) VCF.
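The requirements above can be read as a small decision rule. The sketch below is only an illustration: the `RawBioAssay` class, its fields, and `vcf_for_index` are hypothetical names and not part of the BASE or Variant Search API. Only the link name `variants-annotated.vcf.gz` is taken from the text, and the file type key `"VCF"` is an assumed placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

ANNOTATED_LINK = "variants-annotated.vcf.gz"  # any-to-any link name from the text
VCF_TYPE = "VCF"  # assumed placeholder for the VCF file type


@dataclass
class RawBioAssay:
    """Hypothetical stand-in for a BASE raw bioassay (not the real API)."""
    name: str
    raw_data_file_types: Set[str] = field(default_factory=set)
    typed_files: Dict[str, str] = field(default_factory=dict)  # file type -> path
    any_to_any: Dict[str, str] = field(default_factory=dict)   # link name -> path


def vcf_for_index(rba: RawBioAssay, index: str) -> Optional[str]:
    """Return the VCF path that would be indexed, or None if a link is missing."""
    if index not in ("filtered", "all"):
        raise ValueError(f"unknown index: {index}")
    # The raw data type must include the VCF file type, and a filtered VCF
    # must be linked via that file type.
    if VCF_TYPE not in rba.raw_data_file_types or VCF_TYPE not in rba.typed_files:
        return None
    if index == "filtered":
        return rba.typed_files[VCF_TYPE]
    # The unfiltered VCF is optional, so this may be None.
    return rba.any_to_any.get(ANNOTATED_LINK)
```

A raw bioassay that has only the filtered VCF linked would yield a path for the filtered index and `None` for the full index in this model.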
     24Indexing is controlled via item lists. The installation procedure created two item lists:
     26 * Variant index (filtered)
     27 * Variant index (all)
     29To index VCF files, add raw bioassays to the item lists. Raw bioassays that are added to the ''Variant index (filtered)'' list will get the filtered VCF indexed, and raw bioassays that are added to the ''Variant index (all)'' list will get the full VCF indexed. Indexing usually starts automatically once the Index Manager detects that there are items to index, but this may take 10-15 minutes. The raw bioassays are removed from the lists after they have been indexed.
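The item list therefore behaves like a work queue: items are added, indexed, and then removed. A minimal model of that behaviour is sketched below. The function name `process_item_list` and the `index_one` callback are hypothetical, and how the real Index Manager polls the lists or handles failures is not described here; the sketch simply assumes that an item leaves the list only after it has been indexed successfully.

```python
from typing import Callable, List


def process_item_list(item_list: List[str], index_one: Callable[[str], None]) -> List[str]:
    """Index every queued raw bioassay and remove it from the list afterwards."""
    indexed: List[str] = []
    while item_list:
        name = item_list[0]
        index_one(name)   # if this raises, the item stays in the list
        item_list.pop(0)  # remove only after successful indexing
        indexed.append(name)
    return indexed
```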
     31== The Index Manager ==
     33As an administrator, it is possible to get information about and manage the index via the Index Manager. Go to the '''Extensions -> Variant Search (admin)''' menu and continue to the '''Index Manager'''.
     35It should display two tables: one for the filtered index and one for the full index.
     37|| '''Path''' || This is the path on the disk (relative to the BASE userfiles directory) where the index is stored. It is possible to '''Delete''' the index. ||
     38|| '''Size on disk''' || Hard-disk space that the index is using. ||
     39|| '''Total variants''' || Total number of variants in the index. ||
     40|| '''Indexed raw bioassays''' || Number of raw bioassays that have been indexed. The '''Rebuild index''' action will re-index all VCF files. Since this may take a long time, the existing index is kept until the new index is complete. ||
     41|| '''Deleted raw bioassays''' || Number of raw bioassays in the index that no longer exist in BASE. The '''Remove from index''' action will remove them from the index. ||
     42|| '''Cached query results''' || Searches for variants that take a long time are cached for up to an hour. The cache is automatically cleared when the index is modified. Use the '''Clean cache''' action to manually clear the cache. ||
     43|| '''Item list''' || The item list that controls which raw bioassays are added to the index. ||
     44|| '''Auto-processing''' || If auto-processing is '''enabled''', the Index Manager will automatically index the VCF files for raw bioassays that are added to the list. Auto-processing can be '''disabled'''. ||
     45|| '''Items in queue''' || Number of raw bioassays in the item list that are waiting to be indexed. The '''Add to index''' action can be used to start indexing immediately, regardless of the auto-processing setting. ||
     46|| '''Status''' || Typically '''IDLE''' if the Index Manager is not working at the moment, otherwise it will display a progress bar indicating the progress of the current action. ||
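The caching behaviour described in the '''Cached query results''' row (entries kept for up to an hour, everything dropped when the index changes) can be modelled as a simple time-limited cache. This is a sketch under stated assumptions, not the extension's implementation: the `QueryCache` class and its methods are hypothetical, and `clear()` stands in for both the automatic clearing on index modification and the manual '''Clean cache''' action.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class QueryCache:
    """Hypothetical cache: entries expire after ttl_seconds (one hour by default)."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, query: str, result: Any) -> None:
        """Store a result together with the time it was cached."""
        self._entries[query] = (self._clock(), result)

    def get(self, query: str) -> Optional[Any]:
        """Return the cached result, or None if it is missing or expired."""
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[query]  # expired, drop it
            return None
        return result

    def clear(self) -> None:
        """Drop all entries, e.g. after the index has been modified."""
        self._entries.clear()
```

Passing the clock in as a parameter is a design choice that makes the expiry logic testable without waiting for real time to pass.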