Changes between Version 34 and Version 37 of Ticket #887

May 19, 2016, 8:00:48 AM
Nicklas Nordborg


  • Ticket #887 – Description

    v34 v37
    4   4    - `tidmatrix.features.txt`: Array design features with some annotations. The first line is a header line:
    5   5     * `id`, `geneSymbol`, `refSeq`, `protAcc`, `description`, `chr`, `entrez`.
    6         * Rows are sorted by external ID (or internal ID?? see [comment:33 comment below] for more information)
        6     * Rows are sorted by internal ID (see [comment:33 comment below] for more information)
    7   7     * All raw bioassays in the input list must use the same array design.
    8   8    - `tidmatrix_data.txt`: FPKM values for all raw bioassays. Each row represents a feature and each column a raw bioassay.
    15  15   - `genematrix_data.txt`: Sum of FPKM values per gene symbol.
    16  16    * The first line is a header line with raw bioassay names.
    17        * The first column is the gene symbol (in ~~no particular~~ '''alphabetical''' order).
        17    * The first column is the gene symbol (in no particular order).
    18  18   - `is.NM.gene.txt`: TRUE/FALSE flag for each gene indicating if the refSeq ID starts with `NM_` or not.
    19  19    * No header line.
        20    * Rows must be in the same order as in `genematrix_data.txt`.
    20  21    * First column is the line number (in this file, add +1 for getting the line number in `genematrix_data.txt`).
    21  22    * Second column is `TRUE` or `FALSE`.
        23    * Third column is the gene symbol.
    23  25  * Cohort data (in folder `cohortTables`): A set of tab-separated files with data for each raw bioassay and the parent items it is derived from. Each file starts with a header line. Each row contains data for one raw bioassay. The first column (`rba`) is always the name of the raw bioassay. Columns ending with `.A.` are annotation columns. Date values are formatted as `YYYY-MM-DD` unless otherwise noted.
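
The v37 rules for `is.NM.gene.txt` (same row order as `genematrix_data.txt`, line number in the first column, gene symbol in the third) can be checked with a short script. The following is only a sketch and not part of the ticket: the function name `read_nm_flags` is hypothetical, and it assumes that both files are tab-separated and that line numbers start at 1, neither of which the description states for these two files.

```python
import io


def read_nm_flags(genematrix_file, nm_file):
    """Return {gene symbol: bool} after checking that both files list genes in the same order.

    Assumptions (not confirmed by the ticket): tab-separated columns,
    1-based line numbers in the first column of is.NM.gene.txt.
    """
    # genematrix_data.txt: skip the header line, then keep the first column.
    genematrix_file.readline()
    gene_symbols = [
        line.rstrip("\r\n").split("\t")[0]
        for line in genematrix_file
        if line.strip()
    ]

    # is.NM.gene.txt: no header line, so every non-empty line is a data row.
    rows = [line.rstrip("\r\n").split("\t") for line in nm_file if line.strip()]
    if len(rows) != len(gene_symbols):
        raise ValueError("row count differs between the two files")

    flags = {}
    for row_no, (fields, expected) in enumerate(zip(rows, gene_symbols), start=1):
        line_no, flag, symbol = fields[:3]
        # Row n here corresponds to line n + 1 in genematrix_data.txt
        # because only that file has a header line.
        if int(line_no) != row_no:
            raise ValueError(f"row {row_no}: unexpected line number {line_no}")
        if flag not in ("TRUE", "FALSE"):
            raise ValueError(f"row {row_no}: flag must be TRUE or FALSE")
        if symbol != expected:
            raise ValueError(f"row {row_no}: {symbol} does not match {expected}")
        flags[symbol] = flag == "TRUE"
    return flags


# Made-up sample data, for illustration only.
genematrix = io.StringIO("S1\tS2\nTP53\t1.5\t2.0\nBRCA1\t0.0\t3.25\n")
nm_flags = io.StringIO("1\tTRUE\tTP53\n2\tFALSE\tBRCA1\n")
result = read_nm_flags(genematrix, nm_flags)
```

Since the gene symbols are in no particular order, the check compares the two files row by row instead of sorting either one.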