#79 closed task (fixed)
Center on assay (se.lu.onk.Center)
Reported by: | Jari Häkkinen | Owned by: | olle |
---|---|---|---|
Priority: | major | Milestone: | ZZ Other tickets |
Component: | se.lu.onk series plug-ins | Keywords: | |
Cc: |
Description
Add function on median/mean center. Center on assay. Group assays to find series.
Change History (14)
comment:1 by , 16 years ago
Component: | onk.lu.se → se.lu.thep series plugins |
---|---|
Milestone: | → ZZ Other tickets |
Owner: | changed from | to
comment:2 by , 16 years ago
Summary: | Center on assay → Center on assay (se.lu.onk.Center) |
---|
comment:3 by , 16 years ago
Component: | se.lu.thep series plugins → se.lu.onk series plug-ins |
---|
comment:4 by , 11 years ago
comment:5 by , 11 years ago
(In [2012]) Fixes #497. Refs #79. Class/file center/Center.java in se.lu.onk.Center updated in median calculation:
- Private method median(List<Float> vec) is updated by replacing the current algorithm with a call to the new private method float calculatePercentile(List<Float> vec, float fraction), with the fraction value set to 0.5.
- New private method float calculatePercentile(List<Float> vec, float fraction) added. It first sorts the values in ascending order (the original input list is not changed), and then returns a weighted value of the list values at the two index positions nearest to the desired percentile. If the percentile corresponds exactly to an integer index, the list value for that index is returned.
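The percentile calculation described in this comment can be sketched as follows. This is a minimal illustration, not the committed code from [2012]: the changeset does not state how the fractional position is computed, so the sketch assumes position = fraction * (n - 1) with linear interpolation between the two neighbouring values. The class name PercentileSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PercentileSketch {

    // Returns the value at the given fraction (0.0 to 1.0) of the sorted data.
    // Assumption: the position is fraction * (n - 1); the ticket does not give the formula.
    static float calculatePercentile(List<Float> vec, float fraction) {
        if (vec == null || vec.isEmpty()) {
            throw new IllegalArgumentException("vec must contain at least one value");
        }
        if (fraction < 0.0f || fraction > 1.0f) {
            throw new IllegalArgumentException("fraction must be between 0 and 1");
        }
        // Sort a copy so that the caller's list is left unchanged.
        List<Float> sorted = new ArrayList<>(vec);
        Collections.sort(sorted);

        double position = (double) fraction * (sorted.size() - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        if (lower == upper) {
            // The percentile falls exactly on an integer index.
            return sorted.get(lower);
        }
        // Weight the two neighbouring values by their distance to the position.
        double weight = position - lower;
        return (float) ((1.0 - weight) * sorted.get(lower) + weight * sorted.get(upper));
    }

    // The median is the percentile for fraction 0.5.
    static float median(List<Float> vec) {
        return calculatePercentile(vec, 0.5f);
    }

    public static void main(String[] args) {
        List<Float> values = Arrays.asList(4.0f, 1.0f, 3.0f, 2.0f);
        System.out.println(median(values)); // halfway between the two middle values
    }
}
```

With an even number of values the result lies halfway between the two middle values, and with an odd number it is the middle value itself.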
comment:6 by , 11 years ago
Owner: | changed from | to
---|
comment:8 by , 11 years ago
Background:
The centering plug-in is intended to work with data tables where the columns represent assays (relatively few, ~1-10), and the rows represent reporter genes (many, ~30000 and up). The values are fluorescent responses indicative of the presence of a reporter gene in an assay. These responses are obtained by use of different stains, and the intensity of the response often depends on the stain used. To eliminate the influence of the stain, it is customary to swap stains between assays and normalize the data. This plug-in was created to perform normalization in different ways.
The original version of the Center plug-in allowed normalization over assays, genes, or both. Normalization over both assays and genes was performed by repeating cycles of separate normalization over genes and assays, where the user could choose the number of cycles to perform (default is 5). The user could also select whether the median or the mean value should be used for the normalization.
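The cycle scheme described above can be illustrated with a small sketch. It is not taken from Center.java: it assumes a complete table stored as double[][] with genes as rows and assays as columns, that centering means subtracting the row or column mean/median, and that each cycle centers genes first and assays second (the ticket does not state the order). All names are hypothetical.

```java
import java.util.Arrays;

final class CenterCyclesSketch {

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) sum += v;
        return sum / values.length;
    }

    static double median(double[] values) {
        double[] sorted = values.clone(); // leave the input untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    static double centerValue(double[] values, boolean useMedian) {
        return useMedian ? median(values) : mean(values);
    }

    // Center over genes: subtract each row's own center value.
    static void centerRows(double[][] data, boolean useMedian) {
        for (double[] row : data) {
            double correction = centerValue(row, useMedian);
            for (int j = 0; j < row.length; j++) row[j] -= correction;
        }
    }

    // Center over assays: subtract each column's own center value.
    static void centerColumns(double[][] data, boolean useMedian) {
        if (data.length == 0) return;
        for (int j = 0; j < data[0].length; j++) {
            double[] column = new double[data.length];
            for (int i = 0; i < data.length; i++) column[i] = data[i][j];
            double correction = centerValue(column, useMedian);
            for (int i = 0; i < data.length; i++) data[i][j] -= correction;
        }
    }

    // Assumed order within a cycle: genes first, then assays.
    static void centerBoth(double[][] data, int cycles, boolean useMedian) {
        for (int cycle = 0; cycle < cycles; cycle++) {
            centerRows(data, useMedian);
            centerColumns(data, useMedian);
        }
    }

    public static void main(String[] args) {
        double[][] data = { {1.0, 2.0, 3.0}, {4.0, 6.0, 8.0} };
        centerBoth(data, 5, false); // 5 cycles, mean centering
        System.out.println(Arrays.deepToString(data));
    }
}
```

With mean centering on a complete table, one cycle already brings both the row means and the column means to zero. Repeating the cycle matters mainly for median centering, where a column step can shift the row medians again.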
comment:9 by , 11 years ago
Design update:
In order to make the normalization more flexible, the following functionality should be added:
1. The normalization may be restricted to a group or groups of assays:
a. Default - Arrays: each assay in its own group, genes: all assays in one group
b. Single assay group - Center all data based on values in single assay group
c. Assay groups - Center each assay group separately
2. Center group(s) assay names. Only needed if something other than "Default" has been chosen in step 1. Names of assays in a group are separated by commas, and groups are separated by a pipe "|" character.
3. In order to better check the workings of the normalization procedure, it should be possible to request that extra files with debug information be stored with the data (default is not to store extra debug information). If chosen, a number of text files named after the Java method creating them (after a prefix "debug") will be stored in a new directory "data". The files will contain the parameter values used and the fitting values used in the normalization step. When centering over genes, the large number of fitting values is stored in a separate file, "debugCenterRowsRFitList.txt" or "debugCenterRowsForGroupsSeparately.txt".
Notes on the new functionality:
- The default choice in step 1 (option a.) corresponds to the current functionality.
- Even without creation of extra debug files, results of some combinations of assay groups can be easily checked by inspection of the correction factor plots in the overview plots (correction values to be subtracted are shown in blue, and original values in red). When centering over genes where one group contains a single assay, the correction values will totally overlap the original values for that assay, leading to a flat line at zero after normalization! Of more practical interest, grouping simultaneous-looking assays together in a group will lead to normalized values more closely adhering to the zero line for those assays.
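The group syntax and the single-assay effect noted above can be tried out with a short sketch. It is an illustration under assumptions, not the plug-in code: names are matched exactly after trimming whitespace, groups are numbered from 0 in the order given, centering uses the mean, and the data is a plain double[][] rather than the plug-in's own row type. The class, method and assay names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class AssayGroupSketch {

    // Parses "nameA,nameB|nameC" into group number -> assay column indices.
    // Assumption: names are trimmed and must match an entry in assayNames exactly.
    static Map<Integer, List<Integer>> parseGroups(String groupSpec, List<String> assayNames) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        String[] groupTexts = groupSpec.split(Pattern.quote("|"));
        for (int g = 0; g < groupTexts.length; g++) {
            List<Integer> indices = new ArrayList<>();
            for (String name : groupTexts[g].split(",")) {
                int index = assayNames.indexOf(name.trim());
                if (index < 0) {
                    throw new IllegalArgumentException("Unknown assay name: " + name.trim());
                }
                indices.add(index);
            }
            groups.put(g, indices);
        }
        return groups;
    }

    // Centers each row (gene) within each group by subtracting the group's row mean.
    static void centerRowsPerGroup(double[][] data, Map<Integer, List<Integer>> groups) {
        for (double[] row : data) {
            for (List<Integer> indices : groups.values()) {
                double sum = 0.0;
                for (int index : indices) sum += row[index];
                double correction = sum / indices.size();
                for (int index : indices) row[index] -= correction;
            }
        }
    }

    public static void main(String[] args) {
        List<String> assayNames = Arrays.asList("a0", "a1", "a2"); // hypothetical assay names
        Map<Integer, List<Integer>> groups = parseGroups("a0,a2|a1", assayNames);
        double[][] data = { {1.0, 5.0, 3.0}, {2.0, 8.0, 6.0} };
        centerRowsPerGroup(data, groups);
        System.out.println(groups);                    // {0=[0, 2], 1=[1]}
        System.out.println(Arrays.deepToString(data)); // [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]]
    }
}
```

The column for "a1" forms a group of its own, so its correction equals its own value and every entry becomes zero, which is the flat line mentioned in the note.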
comment:10 by , 11 years ago
Implementation details:
- BASE1 plug-in configuration file plugin_Transformation_Center.base in Center/misc/ will be updated with new options and description text. New parameter variables:
a. centerAssayGroups (enumerated) - Grouping alternatives for assays.
b. centerGroupsAssayNames (text) - Names of assays in a group, if grouping is used.
c. createDebugFiles (enumerated) - Boolean flag indicating if extra debug files should be stored (default is "no").
- Java file Center.java in Center/src/center/ will be updated to implement the new functionality:
a. New private instance variable List<Integer> singleCenterGroupAssayIndexList added. It is used to store index values for assays to use for centering. If its value is null, a single assay group is not used for centering.
b. New private instance variable HashMap<Integer,List<Integer>> centerGroupsAssayIndexHashMap added. It is used to store index values for assays in different groups. If its value is null, separate centering of assays in different groups is not used.
c. New private instance variable boolean debug added. It indicates if extra files with debug information should be stored with the data.
d. Public method void extractSettings(BASEFileSection section) updated to read new input parameter values, and transfer them to instance variables.
e. Public method void extractAssays(BASEFileSection section, BASEFileReader reader) updated to find index values for assays in various groups, if special grouping is used.
f. Public method void extractSpots(BASEFileSection section, BASEFileReader reader) updated to set index values for assays in various groups, if default grouping is used.
g. Public method void center() updated to call new private methods void centerRowsForGroupsSeparately(List<AssayRow> data_arr) or void centerColumnsForGroupsSeparately(List<AssayRow> data_arr), if separate centering of assays in different groups should be performed; otherwise private methods void centerRows(List<AssayRow> data_arr) or void centerColumns(List<AssayRow> data_arr) are called.
h. Private method void centerRows(List<AssayRow> data_arr) updated to restrict the data used for normalization to assays with indices in singleCenterGroupAssayIndexList, if the latter differs from null.
i. New private method void centerRowsForGroupsSeparately(List<AssayRow> data_arr) added. It centers rows (genes) for groups separately, based on the grouping in HashMap<Integer,List<Integer>> centerGroupsAssayIndexHashMap.
j. Private method void centerColumns(List<AssayRow> data_arr) updated to restrict the data used for normalization to assays with indices in singleCenterGroupAssayIndexList, if the latter differs from null.
k. New private method void centerColumnsForGroupsSeparately(List<AssayRow> data_arr) added. It centers columns (assays) for groups separately, based on the grouping in HashMap<Integer,List<Integer>> centerGroupsAssayIndexHashMap.
l. New private inner class enum CenteringGroups added. It is used to manage the alternatives for assay groups, divided into DEFAULT(1), ASSAYGROUPSINGLE(2), and ASSAYGROUPS(3).
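An enum like the one described in item l might look as follows. This is only a sketch: the constant names and numbers come from the ticket, but the field name code, the lookup helper fromCode, and the idea that the number is what the enumerated parameter centerAssayGroups delivers are assumptions. The wrapper class CenteringGroupsSketch is hypothetical.

```java
final class CenteringGroupsSketch {

    // Alternatives for assay grouping, with the numbers given in the ticket.
    enum CenteringGroups {
        DEFAULT(1),
        ASSAYGROUPSINGLE(2),
        ASSAYGROUPS(3);

        private final int code;

        CenteringGroups(int code) {
            this.code = code;
        }

        int getCode() {
            return code;
        }

        // Hypothetical helper: maps a numeric setting back to the enum constant.
        static CenteringGroups fromCode(int code) {
            for (CenteringGroups option : values()) {
                if (option.code == code) {
                    return option;
                }
            }
            throw new IllegalArgumentException("Unknown centering group option: " + code);
        }
    }

    // Describes the three alternatives using the wording of the plug-in options.
    static String describe(CenteringGroups option) {
        switch (option) {
            case ASSAYGROUPSINGLE:
                return "Center all data based on values in single assay group";
            case ASSAYGROUPS:
                return "Center each assay group separately";
            default:
                return "Default grouping";
        }
    }

    public static void main(String[] args) {
        CenteringGroups option = CenteringGroups.fromCode(2);
        System.out.println(option + ": " + describe(option));
    }
}
```

Keeping the number inside the enum means the rest of the code can switch on named constants instead of comparing raw integers.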
comment:11 by , 11 years ago
Note on finding the extra debug files, if these should be created.
Prerequisites:
1. Assume that "Experiment A" has been selected under View -> Experiments.
2. Select tab "Bioassay sets" to show available bioassay sets for the experiment.
3. Assume that "Filtered bioassay set (guest)" has been selected by clicking on the name. Properties for that bioassay set are then shown in the "Properties" tab.
4. Click on tool bar button "Run analysis..." in the "Properties" tab to run an analysis plug-in for the bioassay set. The plug-in configuration window will open, to allow selection and configuration of a plug-in.
5. Select the "Center" plug-in. Assume that the following settings are chosen:
Child name: Filtered bioassay set (guest) - centering all genes after group [0,2] mean
Center on genes/arrays: Genes (rows)
Assay groups for centering: Single assay group - Center all data based on values in single assay group
Center group(s) assay names: Raw bioassay A.00h,Raw bioassay A.24h
Number of centering cycles: 1
Centering using median or mean: Mean
Create debug files: yes
Start the plug-in and wait until it finishes.
After successful execution of the plug-in:
1. Close the plug-in configuration window and again select tab "Bioassay sets" for "Experiment A" (steps 1 and 2 above). Click on the "Refresh" button in the BASE menu bar to update the web page.
2. Search the item tree for an entry named "Filtered bioassay set (guest) - centering all genes after group [0,2] mean", whose parent "Center" leaf has a date and time corresponding to that of the plug-in's execution (the latter step is only necessary if more analyses with the same name have been performed before).
3. Click on the result leaf's parent "Center" leaf in the item tree, to display data related to the plug-in execution of interest. In table "Items related to this transformation", an entry "data" should now exist. In column "Item", click item "data" to open a window listing files found in directory "data". The debug files can now be inspected (if not too large), or downloaded for later inspection.
comment:12 by , 11 years ago
(In [2026]) Refs #79. Center plug-in updated to allow centering using different groupings of assays:
- BASE1 plug-in configuration file plugin_Transformation_Center.base in Center/misc/ updated with new options and description text.
- Java file Center.java in Center/src/center/ updated to implement the new functionality.
comment:13 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Ticket closed since the desired functionality has been added.
Note: Current median calculation in Center will only give the correct result in very special cases, and should be replaced by a more general algorithm, see Ticket #497 (Center (se.lu.onk.Center) median calculation is flawed and should be replaced).