Changes between Version 34 and Version 37 of Ticket #887

Timestamp: May 19, 2016, 8:00:48 AM
Author: Nicklas Nordborg

Ticket #887 – Description

-              v34
+              v37
   - `tidmatrix.features.txt`: Array design features with some annotations. The first line is a header line:
    * `id`, `geneSymbol`, `refSeq`, `protAcc`, `description`, `chr`, `entrez`.
    * Rows are sorted by external ID (or internal ID?? see [comment:33 comment below] for more information)
+   * Rows are sorted by internal ID (see [comment:33 comment below] for more information)
    * All raw bioassays in the input list must use the same array design.
   - `tidmatrix_data.txt`: FPKM values for all raw bioassays. Each row represents a feature and each column a raw bioassay.
 …
   - `genematrix_data.txt`: Sum of FPKM values per gene symbol.
    * The first line is a header line with raw bioassay names.
    * The first column is the gene symbol (in ~~no particular~~ '''alphabetical''' order).
+   * The first column is the gene symbol (in no particular order).
   - `is.NM.gene.txt`: TRUE/FALSE flag for each gene indicating if the refSeq ID starts with `NM_` or not.
    * No header line.
+   * Rows must be in the same order as in `genematrix_data.txt`.
    * First column is the line number within this file (add 1 to get the corresponding line number in `genematrix_data.txt`, which has a header line).
    * Second column is `TRUE` or `FALSE`.
+   * Third column is the gene symbol.
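
The relationship between `is.NM.gene.txt` and `genematrix_data.txt` described above can be checked with a short script. This is only a sketch: it assumes both files are tab-separated (the description does not say so for these two files), and `check_nm_flags` is a hypothetical helper name, not part of any existing tool.

```python
import csv
import io


def check_nm_flags(nm_text, matrix_text):
    """Return a list of problems; an empty list means the two files agree.

    Assumption: both inputs are tab-separated text.
    """
    matrix_rows = list(csv.reader(io.StringIO(matrix_text), delimiter="\t"))
    # genematrix_data.txt starts with a header line, so skip it.
    gene_symbols = [row[0] for row in matrix_rows[1:]]

    nm_rows = list(csv.reader(io.StringIO(nm_text), delimiter="\t"))
    problems = []
    if len(nm_rows) != len(gene_symbols):
        problems.append("row count differs between the two files")

    # is.NM.gene.txt has no header, so its first row is line 1.
    for index, row in enumerate(nm_rows, start=1):
        if len(row) != 3:
            problems.append(f"line {index}: expected 3 columns")
            continue
        line_no, flag, symbol = row
        if line_no != str(index):
            problems.append(f"line {index}: line number column is {line_no}")
        if flag not in ("TRUE", "FALSE"):
            problems.append(f"line {index}: flag is {flag}")
        # Line `index` here corresponds to line `index + 1` in
        # genematrix_data.txt, i.e. gene_symbols[index - 1].
        if index <= len(gene_symbols) and gene_symbols[index - 1] != symbol:
            problems.append(f"line {index}: gene symbol order differs")
    return problems
```

Because the third column repeats the gene symbol, a mismatch in row order can be detected directly instead of relying on the line number alone.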
  * Cohort data (in folder `cohortTables`): A set of tab-separated files with data for each raw bioassay and the parent items it is derived from. Each file starts with a header line. Each row contains data for one raw bioassay. The first column (`rba`) is always the name of the raw bioassay. Columns ending with `.A.` are annotation columns. Date values are formatted as `YYYY-MM-DD` unless otherwise noted.
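
As an illustration of the cohort table conventions just described, the following sketch reads one such file. The function names and the column names used in the usage note (`sampledate`, `grade.A.`) are hypothetical and chosen only for the example; the actual files in `cohortTables` may use different columns.

```python
import csv
import io
from datetime import datetime


def read_cohort_table(text):
    """Parse one tab-separated cohort table.

    Returns (annotation_columns, rows), where rows maps each raw
    bioassay name to a dict of its column values.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    fieldnames = reader.fieldnames or []
    # The first column is always the raw bioassay name.
    if not fieldnames or fieldnames[0] != "rba":
        raise ValueError("first column must be 'rba'")
    # Columns ending with '.A.' are annotation columns.
    annotation_columns = [name for name in fieldnames if name.endswith(".A.")]
    rows = {row["rba"]: row for row in reader}
    return annotation_columns, rows


def parse_cohort_date(value):
    """Parse a date in the default YYYY-MM-DD format."""
    return datetime.strptime(value, "%Y-%m-%d").date()
```

For example, a file with the header `rba`, `sampledate`, `grade.A.` would yield `["grade.A."]` as its annotation columns, and `parse_cohort_date` could be applied to the `sampledate` values. Columns documented with a different date format would need their own parser.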