Opened 17 years ago

Closed 11 years ago

Last modified 11 years ago

#79 closed task (fixed)

Center on assay (se.lu.onk.Center)

Reported by: Jari Häkkinen
Owned by: olle
Priority: major
Milestone: ZZ Other tickets
Component: se.lu.onk series plug-ins
Keywords:
Cc:

Description

Add function on median/mean center. Center on assay. Group assays to find series.

Change History (14)

comment:1 by Jari Häkkinen, 16 years ago

Component: onk.lu.se → se.lu.thep series plugins
Milestone: ZZ Other tickets
Owner: changed from Johan Enell to Jari Häkkinen

comment:2 by Jari Häkkinen, 16 years ago

Summary: Center on assay → Center on assay (se.lu.onk.Center)

comment:3 by Jari Häkkinen, 16 years ago

Component: se.lu.thep series plugins → se.lu.onk series plug-ins

comment:4 by olle, 11 years ago

Note: Current median calculation in Center will only give the correct result in very special cases, and should be replaced by a more general algorithm, see Ticket #497 (Center (se.lu.onk.Center) median calculation is flawed and should be replaced).

comment:5 by olle, 11 years ago

(In [2012]) Fixes #497. Refs #79. Class/file center/Center.java in se.lu.onk.Center updated in median calculation:

  1. Private method median(List<Float> vec) is updated by replacing the current algorithm with a call to the new private method float calculatePercentile(List<Float> vec, float fraction), with the fraction value set to 0.5.
  2. New private method float calculatePercentile(List<Float> vec, float fraction) added. It first sorts the values in ascending order (the original input list is not changed), and then returns a weighted value of the list values of the two index values nearest to the desired percentile. If the percentile corresponds exactly to an integer index, the list value for that index is returned.
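The percentile calculation described in item 2 can be sketched as follows. This is a minimal illustration, not the code committed in [2012]: the comment does not state how the percentile position is mapped to list indices, so the mapping fraction * (n - 1) used here is an assumption, as are the class name PercentileSketch and the argument checks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PercentileSketch {

    // Returns the value at the given fraction (0.0 to 1.0) of the sorted data.
    // ASSUMPTION: the position in the sorted list is fraction * (n - 1).
    static float calculatePercentile(List<Float> vec, float fraction) {
        if (vec == null || vec.isEmpty()) {
            throw new IllegalArgumentException("vec must contain at least one value");
        }
        if (fraction < 0f || fraction > 1f) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        // Sort a copy so that the caller's list is left unchanged.
        List<Float> sorted = new ArrayList<>(vec);
        Collections.sort(sorted);

        float position = fraction * (sorted.size() - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        if (lower == upper) {
            // The percentile falls exactly on an index.
            return sorted.get(lower);
        }
        // Otherwise weight the two nearest values by their distance to the position.
        float weight = position - lower;
        return (1f - weight) * sorted.get(lower) + weight * sorted.get(upper);
    }

    // The median is the percentile at fraction 0.5.
    static float median(List<Float> vec) {
        return calculatePercentile(vec, 0.5f);
    }
}
```

Under these assumptions, the values 4, 1, 3, 2 give a median of 2.5 (halfway between the sorted neighbours 2 and 3), and the values 3, 1, 2 give 2.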

comment:6 by olle, 11 years ago

Owner: changed from Jari Häkkinen to olle

comment:7 by olle, 11 years ago

Status: new → assigned

Ticket accepted.

comment:8 by olle, 11 years ago

Background:

The centering plug-in is intended to work with data tables where the columns represent assays (relatively few, ~1-10), and the rows represent reporter genes (many, ~30000 and up). The values are fluorescent responses indicative of the presence of a reporter gene in an assay. These responses are obtained by use of different stains, and the intensity of the response often depends on the stain used. To eliminate the influence of the stain, it is customary to swap stains between assays and normalize the data. This plug-in was created to perform normalization in different ways.

The original version of the Center plug-in allowed normalization over assays, genes, or both. Normalization over both assays and genes was performed by repeating cycles of separate normalization over genes and assays, where the user could choose the number of cycles to perform (default is 5). The user could also select whether the median or the mean value should be used for the normalization.
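The repeated cycles of separate normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the data is held in a plain float[][] (rows are genes, columns are assays) rather than in the plug-in's own data classes, the mean is used, genes are centered before assays within each cycle (the order is not stated in this ticket), and the class and method names are hypothetical.

```java
final class CenterCyclesSketch {

    // Subtracts each row's (gene's) mean from all values in that row.
    static void centerRows(float[][] data) {
        for (float[] row : data) {
            float sum = 0f;
            for (float value : row) {
                sum += value;
            }
            float mean = sum / row.length;
            for (int c = 0; c < row.length; c++) {
                row[c] -= mean;
            }
        }
    }

    // Subtracts each column's (assay's) mean from all values in that column.
    static void centerColumns(float[][] data) {
        if (data.length == 0) {
            return;
        }
        int columns = data[0].length;
        for (int c = 0; c < columns; c++) {
            float sum = 0f;
            for (float[] row : data) {
                sum += row[c];
            }
            float mean = sum / data.length;
            for (float[] row : data) {
                row[c] -= mean;
            }
        }
    }

    // Repeats the two separate normalizations for the requested number of cycles.
    // ASSUMPTION: rows are centered before columns in each cycle.
    static void centerBoth(float[][] data, int cycles) {
        for (int i = 0; i < cycles; i++) {
            centerRows(data);
            centerColumns(data);
        }
    }
}
```

A median variant would replace the two mean computations with a median of the same values; the cycle structure stays the same.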

comment:9 by olle, 11 years ago

Design update:

In order to make the normalization more flexible, the following functionality should be added:

  1. The normalization may be restricted to a group or groups of assays:

    a. Default - Arrays: each assay in its own group, genes: all assays in one group
    b. Single assay group - Center all data based on values in single assay group
    c. Assay groups - Center each assay group separately

  2. Center group(s) assay names. Only needed if something other than "Default" has been chosen in step 1. Names of assays in a group are separated by commas, and groups are separated by a pipe "|" character.

  3. In order to better check the workings of the normalization procedure, it should be possible to request that extra files with debug information be stored with the data (default is not to store extra debug information). If chosen, a number of text files named after the Java method creating them (after a prefix "debug") will be stored in a new directory "data". The files will contain the parameter values used and the fitting values used in the normalization step. When centering over genes, the large number of fitting values is stored in a separate file, "debugCenterRowsRFitList.txt" or "debugCenterRowsForGroupsSeparately.txt".
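The group syntax in item 2 (assay names separated by commas, groups separated by "|") could be parsed along these lines. This is a hedged sketch and not the plug-in's parser: the class and method names are hypothetical, and trimming surrounding whitespace and skipping empty names are assumptions that the ticket does not specify.

```java
import java.util.ArrayList;
import java.util.List;

final class AssayGroupParserSketch {

    // Splits "A,B|C" into [[A, B], [C]].
    // ASSUMPTION: whitespace around names is trimmed and empty names are skipped.
    static List<List<String>> parseGroups(String text) {
        List<List<String>> groups = new ArrayList<>();
        if (text == null || text.trim().isEmpty()) {
            return groups;
        }
        // "|" is a regex metacharacter, so it must be escaped for split().
        for (String groupText : text.split("\\|")) {
            List<String> names = new ArrayList<>();
            for (String name : groupText.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
            if (!names.isEmpty()) {
                groups.add(names);
            }
        }
        return groups;
    }
}
```

Note that with this format an assay name cannot itself contain a comma or a pipe character; the ticket does not describe any escaping.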

Notes on the new functionality:

  • The default choice in step 1 (option a.) corresponds to the current functionality.
  • Even without creation of extra debug files, results of some combinations of assay groups can be easily checked by inspection of the correction factor plots in the overview plots (correction values to be subtracted are shown in blue, and original values in red). If centering over genes where one group contains a single assay, the correction values will totally overlap the original values for that assay, leading to a flat line at zero after normalization! Of more practical interest, grouping simultaneous-looking assays together will lead to normalized values more closely adhering to the zero line for those assays.
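The flat-line effect for a single-assay group can be reproduced with a small sketch of gene (row) centering based on a single assay group. It assumes a plain float[][] with rows as genes and columns as assays, uses the mean, and uses hypothetical class and method names; it is not the plug-in's code.

```java
import java.util.List;

final class SingleGroupRowCenterSketch {

    // For each gene (row), the correction is the mean over the assays in the
    // group only, but it is subtracted from the values of all assays.
    static void centerRows(float[][] data, List<Integer> groupIndices) {
        if (groupIndices == null || groupIndices.isEmpty()) {
            throw new IllegalArgumentException("group must contain at least one assay index");
        }
        for (float[] row : data) {
            float sum = 0f;
            for (int index : groupIndices) {
                sum += row[index];
            }
            float correction = sum / groupIndices.size();
            for (int c = 0; c < row.length; c++) {
                row[c] -= correction;
            }
        }
    }
}
```

If groupIndices contains only index 1, the correction for every gene equals that gene's value in assay 1, so assay 1 becomes 0 for every gene, which is the flat line mentioned above.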
Last edited 11 years ago by olle

comment:10 by olle, 11 years ago

Implementation details:

  1. BASE1 plug-in configuration file plugin_Transformation_Center.base in Center/misc/ will be updated with new options and description text. New parameter variables:

    a. centerAssayGroups (enumerated) - Grouping alternatives for assays.
    b. centerGroupsAssayNames (text) - Names of assays in a group, if grouping is used.
    c. createDebugFiles (enumerated) - Boolean flag indicating if extra debug files should be stored (default is "no").
  2. Java file Center.java in Center/src/center/ will be updated to implement the new functionality:

    a. New private instance variable List<Integer> singleCenterGroupAssayIndexList added. It is used to store index values for assays to use for centering. If its value is null, a single assay group is not used for centering.
    b. New private instance variable HashMap<Integer,List<Integer>> centerGroupsAssayIndexHashMap added. It is used to store index values for assays in different groups. If its value is null, separate centering of assays in different groups is not used.
    c. New private instance variable boolean debug is added. It indicates if extra files with debug information should be stored with the data.
    d. Public method void extractSettings(BASEFileSection section) updated to read new input parameter values, and transfer them to instance variables.
    e. Public method void extractAssays(BASEFileSection section, BASEFileReader reader) updated to find index values for assays in various groups, if special grouping is used.
    f. Public method void extractSpots(BASEFileSection section, BASEFileReader reader) updated to set index values for assays in various groups, if default grouping is used.
    g. Public method void center() updated to call new private methods void centerRowsForGroupsSeparately(List<AssayRow> data_arr) or void centerColumnsForGroupsSeparately(List<AssayRow> data_arr), if separate centering of assays in different groups should be performed, otherwise private methods void centerRows(List<AssayRow> data_arr) or void centerColumns(List<AssayRow> data_arr) are called.
    h. Private method void centerRows(List<AssayRow> data_arr) updated to restrict the data used for normalization to assays with indices in singleCenterGroupAssayIndexList, if the latter differs from null.
    i. New private method void centerRowsForGroupsSeparately(List<AssayRow> data_arr) added. It centers rows (genes) for groups separately, based on the grouping in HashMap<Integer,List<Integer>> centerGroupsAssayIndexHashMap.
    j. Private method centerColumns(List<AssayRow> data_arr) updated to restrict the data used for normalization to assays with indices in singleCenterGroupAssayIndexList, if the latter differs from null.
    k. New private method void centerColumnsForGroupsSeparately(List<AssayRow> data_arr) added. It centers columns (assays) for groups separately, based on the grouping in HashMap<Integer,List<Integer>> centerGroupsAssayIndexHashMap.
    l. New private inner class enum CenteringGroups added. It is used to manage the alternatives for assay groups, divided into DEFAULT(1), ASSAYGROUPSINGLE(2), and ASSAYGROUPS(3).
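Items i and l can be illustrated with a compact sketch. It is written under several assumptions and is not the committed code: the AssayRow class is not described in this ticket, so a plain float[][] (rows are genes, columns are assays) is used instead of List<AssayRow>; the mean is used; the outer class name and the getValue() accessor are hypothetical. Only the enum constants and their numbers are taken from the description above.

```java
import java.util.List;
import java.util.Map;

final class GroupedRowCenterSketch {

    // Alternatives for assay grouping, numbered as in item l.
    enum CenteringGroups {
        DEFAULT(1), ASSAYGROUPSINGLE(2), ASSAYGROUPS(3);

        private final int value;

        CenteringGroups(int value) {
            this.value = value;
        }

        // Hypothetical accessor, added for illustration only.
        int getValue() {
            return value;
        }
    }

    // Centers rows (genes) for each assay group separately: the correction for
    // a group is computed from, and subtracted from, that group's assays only.
    static void centerRowsForGroupsSeparately(float[][] data,
                                              Map<Integer, List<Integer>> groups) {
        for (float[] row : data) {
            for (List<Integer> indices : groups.values()) {
                if (indices.isEmpty()) {
                    continue; // nothing to center for an empty group
                }
                float sum = 0f;
                for (int index : indices) {
                    sum += row[index];
                }
                float correction = sum / indices.size();
                for (int index : indices) {
                    row[index] -= correction;
                }
            }
        }
    }
}
```

With groups {0: [0, 1], 1: [2, 3]}, a gene with values 1, 3, 10, 20 becomes -1, 1, -5, 5: each group is centered around its own mean (2 and 15).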
Last edited 11 years ago by olle

comment:11 by olle, 11 years ago

Note on finding the extra debug files, if these should be created.

Prerequisites:

  1. Assume that "Experiment A" has been selected under View -> Experiments.
  2. Select tab "Bioassay sets" to show available bioassay sets for the experiment.
  3. Assume that "Filtered bioassay set (guest)" has been selected by clicking on the name. Properties for that bioassay set are then shown in the "Properties" tab.
  4. Click on tool bar button "Run analysis..." in the "Properties" tab to run an analysis plug-in for the bioassay set. The plug-in configuration window will open, to allow selection and configuration of a plug-in.
  5. Select the "Center" plug-in. Assume that the following settings are chosen:
    Child name: Filtered bioassay set (guest) - centering all genes after group [0,2] mean
    Center on genes/arrays: Genes (rows)
    Assay groups for centering: Single assay group - Center all data based on values in single assay group
    Center group(s) assay names: Raw bioassay A.00h,Raw bioassay A.24h
    Number of centering cycles: 1
    Centering using median or mean: Mean
    Create debug files: yes
  6. Start the plug-in and wait until it finishes.

After successful execution of the plug-in:

  1. Close the plug-in configuration window and again select tab "Bioassay sets" for "Experiment A" (steps 1 and 2 above). Click on the "Refresh" button in the BASE menu bar to update the web page.
  2. Search the item tree for an entry named "Filtered bioassay set (guest) - centering all genes after group [0,2] mean", whose parent "Center" leaf has a date and time corresponding to that of the plug-in's execution (the latter check is only necessary if more analyses with the same name have been performed before).
  3. Click on the result leaf's parent "Center" leaf in the item tree, to display data related to the plug-in execution of interest. In table "Items related to this transformation", an entry "data" should now exist. In column "Item", click item "data" to open a window listing files found in directory "data". The debug files can now be inspected (if not too large), or downloaded for later inspection.
Version 0, edited 11 years ago by olle

comment:12 by olle, 11 years ago

(In [2026]) Refs #79. Center plug-in updated to allow centering using different groupings of assays:

  1. BASE1 plug-in configuration file plugin_Transformation_Center.base in Center/misc/ updated with new options and description text.
  2. Java file Center.java in Center/src/center/ updated to implement the new functionality.

comment:13 by olle, 11 years ago

Resolution: fixed
Status: assigned → closed

Ticket closed since the desired functionality has been added.

comment:14 by Jari Häkkinen, 11 years ago

(In [2118]) Addresses #79. Updated README to reflect changes made in ticket:79
