$Id: README_PluginDetails 1218 2010-03-29 08:48:15Z jari $

== Introduction ==

There are many plug-ins in the Illumina plug-in package for BASE. This
file gives detailed information on some of the contributed plug-ins.


== Illumina expression background correction plug-in ==

This plug-in removes a per-slide global background from all
spots. The background is calculated from the set of negative control
spots on the array. See ''Implementation details'' below.

=== Parameters ===

There is one parameter to set that specifies how background
intensities should be calculated. Allowed values are median or mean,
i.e. the background is either the median or the mean of the negative
control spots on the array.

The expression of the background probes is optionally saved to a
file. The default is not to save the expression matrix but this can be
changed during job configuration.

=== Implementation details ===

Each assay is treated separately, i.e., no samples are combined
together. All calculations are made on the current bioassay data,
so this plug-in should be used early in the analysis, before
background spots are removed.

A spot is considered to be a negative control spot if it has an
''Control group name'' exactly matching the string ''negative''.
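
The background correction described above can be sketched as follows.
This is a minimal illustration and not the plug-in code: the function
name `subtract_background` and the representation of one assay as a
list of `(intensity, control_group_name)` tuples are assumptions made
for the example.

```python
from statistics import mean, median


def subtract_background(spots, method="median"):
    """Subtract a global background from every spot of one assay.

    spots  -- list of (intensity, control_group_name) tuples
              (hypothetical layout; control_group_name may be None)
    method -- "median" or "mean", the two allowed parameter values
    """
    if method not in ("median", "mean"):
        raise ValueError("method must be 'median' or 'mean'")
    # Negative controls: control group name exactly matching 'negative'.
    negatives = [i for i, group in spots if group == "negative"]
    if not negatives:
        raise ValueError("no negative control spots in this assay")
    background = median(negatives) if method == "median" else mean(negatives)
    # The same background value is removed from all spots in the assay.
    return [(i - background, group) for i, group in spots]
```

Since the background is computed from the spots passed in, the negative
control spots must still be present in the data, which is why the
plug-in should run before they are filtered out.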


== Illumina detection P-value calculation ==

This plug-in implements !BeadStudio-like detection P-value calculations
for Illumina expression data (see
http://www.genomecenter.ucdavis.edu/expression_analysis/documents/illumina_normalization_081201.pdf
for background on detection P-values).

The plug-in will ''always'' base the detection P-value calculation on
raw data values, ''i.e.'', the mean raw intensity for the different
signals. By default the calculations are based on negative controls
available in the root bioassay set for the current analysis
branch. The user may change this to only use negative controls in the
current bioassay set.

The detection P-value plug-in does not filter the assays; it only
provides detection P-values that can be used in a filter step after
running this plug-in.

=== Parameters ===

The plug-in requires the array type as input since detection P-values
are calculated differently depending on array type.

Users may select to use negative controls in the current bioassay set
only. The default behaviour is to use all negative controls in the
root bioassay set for the current bioassay set.

A cutoff parameter is available to exclude outliers among
the negative controls. The `cutoff` defines the acceptable negative
control signal range
{{{
median-MAD*cutoff < I < median+MAD*cutoff
}}}
where `MAD` is the median absolute deviation.
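
As an illustration, the outlier exclusion above could look like the
sketch below. The function name `filter_negative_controls` is
hypothetical, and the sketch assumes that `MAD` is the plain (unscaled)
median absolute deviation from the median; whether the plug-in applies
a scale factor is not stated here.

```python
from statistics import median


def filter_negative_controls(intensities, cutoff):
    """Keep negative controls with median-MAD*cutoff < I < median+MAD*cutoff."""
    med = median(intensities)
    # Unscaled median absolute deviation (an assumption, see lead-in).
    mad = median([abs(i - med) for i in intensities])
    low = med - mad * cutoff
    high = med + mad * cutoff
    # Strict inequalities, as in the range definition above.
    return [i for i in intensities if low < i < high]
```

For example, with intensities 10, 11, 12, 13 and 100 and `cutoff` 3,
the median is 12 and the MAD is 1, so only values strictly between 9
and 15 are kept and the value 100 is excluded.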

=== Implementation details ===

Each assay is treated separately, i.e., no samples are combined
together. All calculations are made on the raw bead-type level data,
i.e., on the average expression value for each bead type. Raw data
is always used, irrespective of when in the analysis the detection
P-value is calculated.

''Pvalue calculation for whole genome arrays:''

For all signals `i` calculate the detection P-value as
`Pvalue = 1-R/N` where `R` is the rank of the signal `i` relative to the
negative controls and `N` is the number of negative controls.
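
A minimal sketch of the rank-based calculation is given below. The
function name `detection_pvalue_rank` is made up for the example, and
the sketch assumes that the rank `R` is the number of negative controls
with an intensity strictly below the signal; the exact rank convention
(for instance the handling of ties) is not specified in this document.

```python
from bisect import bisect_left


def detection_pvalue_rank(signal, negatives):
    """Return 1 - R/N for one signal against the negative controls."""
    ordered = sorted(negatives)
    # R: number of negative controls strictly below the signal (assumed).
    rank = bisect_left(ordered, signal)
    return 1.0 - rank / len(ordered)
```

Under this assumption a signal above all negative controls gets a
P-value of 0, and a signal below all of them gets a P-value of 1.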

''Pvalue calculation for other array types (DASL, miRNA, !VeraCode
DASL, and Focused Arrays):''

For all signals `i` calculate the detection P-value as
`Pvalue = 1/2 - 1/2 * erf( [i-AvgControl]/StdControl/sqrt(2) )` where
`AvgControl` is the average intensity of the negative controls,
`StdControl` is the standard deviation of the negative controls,
and `erf` is the error function
(http://mathworld.wolfram.com/Erf.html). The error function is only
evaluated for arguments within the range (-4,4). To save CPU cycles, the
function value is set to -1 for arguments below this range and to 1
for arguments above it.
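
The calculation can be sketched as follows. The names `clipped_erf` and
`detection_pvalue_erf` are hypothetical, and the use of the sample
standard deviation is an assumption; this document does not say which
estimator the plug-in uses.

```python
import math
from statistics import mean, stdev


def clipped_erf(x):
    """Error function, short-circuited outside the range (-4, 4)."""
    if x <= -4.0:
        return -1.0
    if x >= 4.0:
        return 1.0
    return math.erf(x)


def detection_pvalue_erf(signal, negatives):
    """Return 1/2 - 1/2 * erf((signal - AvgControl)/StdControl/sqrt(2))."""
    avg_control = mean(negatives)
    std_control = stdev(negatives)  # sample standard deviation (assumed)
    arg = (signal - avg_control) / std_control / math.sqrt(2)
    return 0.5 - 0.5 * clipped_erf(arg)
```

A signal equal to the negative control average gives a P-value of 0.5,
and the clipping makes signals far above or far below the controls
evaluate to exactly 0 and 1, respectively.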

A spot is considered to be a negative control spot if it has an
''Control group name'' exactly matching the string ''negative''.


== Control Summary plots ==

This extension provides overview plots for Illumina expression data. The
'Overview plots' tab becomes available in the experiment analysis tree
when the user selects a bioassay set. The plots are automatically
generated when the user clicks on the 'Overview plots' tab. The display
of the plots cannot be changed by the user and the same plot is shown
irrespective of which bioassay set is selected.

Currently two control summary curves are generated in one plot: the
average intensity of perfect match beads in each assay, and the
average intensity of housekeeping beads in each assay.

The average intensity I_avg is calculated as

		I_avg = sum[Iraw_i] / N

where N is the number of bead types in the sum and Iraw_i is the raw
mean intensity for bead type i.

A bead type is considered to belong to the perfect match group if it
is annotated with ':pm' in the reporter annotation column '[Rep]
Control group id', and a bead type is grouped as housekeeping if it is
annotated with 'housekeeping' in the reporter annotation column '[Rep]
Control group id'.
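
The grouping and averaging for one assay can be illustrated with the
sketch below. The function name `control_group_averages`, the input
layout, and the use of substring matching on the control group id are
assumptions made for the example; the extension may match the
annotation differently.

```python
def control_group_averages(bead_types):
    """Average raw intensity per control group for one assay.

    bead_types -- list of (raw_mean_intensity, control_group_id) tuples
                  (hypothetical layout; control_group_id may be None)
    """
    groups = {"perfect match": [], "housekeeping": []}
    for intensity, group_id in bead_types:
        if group_id is None:
            continue
        # Substring matching is an assumption, see the lead-in.
        if ":pm" in group_id:
            groups["perfect match"].append(intensity)
        elif "housekeeping" in group_id:
            groups["housekeeping"].append(intensity)
    # I_avg = sum[Iraw_i] / N, or None when a group has no bead types.
    return {
        name: (sum(values) / len(values) if values else None)
        for name, values in groups.items()
    }
```

Calling this once per assay gives one point on each of the two curves
in the plot.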

----------------------------------------------------------------------
{{{
Copyright (C) 2009, 2010 Jari Häkkinen

This file is part of Illumina plug-in package for BASE.
Available at http://baseplugins.thep.lu.se/
BASE main site: http://base.thep.lu.se/

This is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 3
of the License, or (at your option) any later version.

The software is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with BASE. If not, see <http://www.gnu.org/licenses/>.
}}}
----------------------------------------------------------------------