= Illumina SNP Normalization =

tQN is a strategy using quantile normalization to improve the quality of data from Illumina Infinium Whole-Genome Genotyping SNP Beadchips, described in

''Normalization of Illumina Infinium whole-genome SNP data improves copy number estimates and allelic intensity ratios'' [[BR]]
J. Staaf, J. Vallon-Christersson, D. Lindgren, G. Juliusson, R. Rosenquist, M. Höglund, Å. Borg, M. Ringnér [[BR]]
''submitted'' [[BR]]

=== License ===

The tQN software is available as a stand-alone software package, and will become available as a plug-in to BASE as the handling of SNP arrays in BASE is developed. Both versions are available under the [http://www.gnu.org/copyleft/gpl.html GNU General Public License].

=== Download tQN ===

The software will be made available when the manuscript describing the method is accepted for publication.

=== How to use tQN ===

''Requirements''

tQN is written in R with a Perl wrapper, so both R and Perl are required. The required Perl modules are File::Spec, Getopt::Long, IO::File and Pod::Usage (http://www.cpan.org). The required R package is limma (http://www.bioconductor.org).

''Installation''

Download and unzip the file available under the section ''Download tQN'' on this page.

OS X or Linux: The programs should run as they are. You need R and perl in your path.

Windows: Depending on how you have installed R and Perl on your system, you may have to edit the variable ''$R_command'' at the beginning of the file ''tQN_normalize_samples.pl'' so that it contains the full path to your R executable. For example, we have successfully run tQN with !ActivePerl on a Windows system with the following ''$R_command'':

{{{
# Mac OS X and Linux
# my $R_command="R --vanilla --no-save --slave < tQN.R";
# Windows
my $R_windows=File::Spec->canonpath('C:/"Program Files"/R/R-2.7.0/bin/Rscript');
my $R_command="$R_windows --vanilla tQN.R";
}}}

''Input data format''

tQN is applied to data exported from !BeadStudio.
For a set of samples, the file exported from !BeadStudio should be tab-delimited in the following format:

||Name||Chr||Position||sample1.X||sample1.Y||sample2.X||sample2.Y||sample3.X||sample3.Y||
||rs12354060||1||10004||0.04424883||1.818238||0.03157751||1.632767||0.04973672||1.770216||
||rs2691310||1||46844||0.7046126||1.305445||0.8322142||1.271329||0.8042333||1.151523||
||...||...||...||...||...||...||...||...||...||

The data exported from !BeadStudio needs to be split into a separate file for each sample using the script ''split_beadstudio_samples.pl'':

{{{
split_beadstudio_samples.pl --beadstudio_file=example/example_beadstudio_data.txt
}}}

where ''example_beadstudio_data.txt'' is a file exported from !BeadStudio in the format described above. This script generates one file per sample, together with a file ''sample_names.txt'', in the tQN subdirectory ''extracted''. These files are used when tQN is run and can be deleted once the samples are normalized.

''Performing tQN''

Run tQN with the following command:

{{{
tQN_normalize_samples.pl --beadchip=humancnv370-duo
}}}

This command performs tQN on the samples in the tQN subdirectory ''extracted'' that are listed in the file ''sample_names.txt''. To perform tQN on a subset of samples, edit ''sample_names.txt'' accordingly.

The normalized data is stored in the tQN subdirectory ''normalized''. For each sample, there is a file with tQN-normalized data. A file ''tQN_beadstudio.txt'' is also generated, containing tQN BAF and Log R Ratios for all samples in a format suitable for import into !BeadStudio using its import column process.

tQN can also generate data for further analysis with PennCNV and QuantiSNP. Running tQN with the following command:

{{{
tQN_normalize_samples.pl --beadchip=humancnv370-duo --output_format=PennCNV
}}}

generates one data file per sample in the tQN subdirectory ''normalized'' for further analysis using PennCNV.
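
Before running the steps above on a new export, it can help to confirm that the file has the layout described under ''Input data format''. The following is a minimal sketch, not part of tQN: the function name ''check_beadstudio_header'' is hypothetical, and the check assumes that sample columns are named exactly as in the table above (''<sample>.X'' followed by ''<sample>.Y'').

```shell
#!/bin/sh
# Hypothetical helper (not shipped with tQN): verify that the first line of a
# BeadStudio export matches the header layout shown under "Input data format".
check_beadstudio_header() {
    # $1: path to the tab-delimited file exported from BeadStudio
    [ -s "$1" ] || { echo "missing or empty file: $1"; return 1; }
    head -n 1 "$1" | awk -F '\t' '
        {
            sub(/\r$/, "", $NF)                  # tolerate Windows line endings
            if (NF < 5 || (NF - 3) % 2 != 0) {
                print "unexpected number of columns: " NF
                exit 1
            }
            if ($1 != "Name" || $2 != "Chr" || $3 != "Position") {
                print "first three columns must be Name, Chr, Position"
                exit 1
            }
            # Assumption: each sample contributes an adjacent .X/.Y column pair.
            for (i = 4; i < NF; i += 2) {
                x = $i
                y = $(i + 1)
                if (x !~ /\.X$/ || y !~ /\.Y$/) {
                    print "columns " i " and " i + 1 " are not an .X/.Y pair"
                    exit 1
                }
                sub(/\.X$/, "", x)
                sub(/\.Y$/, "", y)
                if (x != y) {
                    print "sample name mismatch: " x " vs " y
                    exit 1
                }
            }
            n = (NF - 3) / 2
            print n " sample(s) found"
        }'
}
```

Calling ''check_beadstudio_header example/example_beadstudio_data.txt'' before ''split_beadstudio_samples.pl'' reports the number of samples found, or returns a non-zero status with a short message if the header does not match the assumed layout.
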
Besides ''PennCNV'', ''--output_format'' accepts ''QuantiSNP'', which generates one data file per sample for further analysis with QuantiSNP, and ''!BeadStudio'', the default, which generates the file ''tQN_beadstudio.txt'' with data for all samples.

tQN supports every beadchip type for which there is a cluster file in the tQN subdirectory ''lib''.

For PennCNV and QuantiSNP, SNPs with missing values in either B allele frequencies or log R ratios after normalization are excluded from the respective output files.

=== Contact ===

If you have suggestions, comments or bug reports, please send an email to johan.staaf@med.lu.se