
Version 3 (modified by Jari Häkkinen, 15 years ago)


CGH Data Dumper

The CGH Data Dumper plug-in exports typical BAC data in several simple formats that are readily usable in, e.g., Excel, CGHExplorer, and the MeV CGH viewer. The plug-in works on a normal BASEfile (matrix mode), so you may want to join array designs with the VirtualArray plug-in before running it.

This plug-in is currently unsupported and supplied as is. The plug-in was written by Johan Staaf at the Dept. of Oncology at Lund University.

Exported fields are

ReporterId = BAC clone Id

GeneSymbol = Gene symbols of genes mapped to the BAC clone. Could be bundled using:

Chromosome

Cytoband

StartPosition = Start position in BP for the BAC clone

EndPosition = End position in BP for the BAC clone

M = Log2(ratio)

Note that the CGH formats require every exported probe to have a genomic mapping.
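The exported fields can be pictured as one record per BAC clone. The following is a minimal sketch only: the class and function names are hypothetical and not part of the plug-in, and it assumes that "genomic mapping" means that the chromosome, start, and end position are all known.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class BacRecord:
    """Hypothetical container mirroring the exported fields."""
    reporter_id: str                # BAC clone Id
    gene_symbol: str                # gene symbols mapped to the clone
    chromosome: Optional[str]
    cytoband: Optional[str]
    start_position: Optional[int]   # in bp
    end_position: Optional[int]     # in bp


def m_value(ratio: float) -> float:
    """M = log2(ratio), as in the field list above."""
    return math.log2(ratio)


def has_genomic_mapping(rec: BacRecord) -> bool:
    """Assumption: a mapping needs chromosome, start, and end."""
    return (rec.chromosome is not None
            and rec.start_position is not None
            and rec.end_position is not None)
```

A ratio of 2 gives M = 1 and a ratio of 0.5 gives M = -1, so gains and losses of equal magnitude are symmetric around 0.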

Formats

Standard Format

The header line consists of the common columns listed above, followed by one column per assay, labelled with the assay name.

Lite Format

The header line consists of reporterId, chromosome, start, and stop, followed by one column per assay, labelled with the assay name. NOTE: This format is compatible with the MeV CGH viewer.
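A Lite file could be read back along these lines. This is a sketch under assumptions: the column delimiter is not stated on this page, so tab separation is assumed, as is the use of an empty cell or NA for missing values. The function name, the reporter Id, and the second assay name are made up for illustration.

```python
import csv
import io


def read_lite(handle):
    """Read a Lite-format export (assumed tab-separated)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # The first four columns are reporterId, chromosome, start, stop;
    # every remaining column is one assay.
    assay_names = header[4:]
    rows = []
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        reporter_id, chromosome, start, stop = fields[:4]
        # Assumption: missing values appear as an empty cell or "NA".
        values = [None if v in ("", "NA") else float(v) for v in fields[4:]]
        rows.append((reporter_id, chromosome, int(start), int(stop), values))
    return assay_names, rows


example = (
    "reporterId\tchromosome\tstart\tstop\tCa13928\tCa13929\n"
    "RP11-1A1\t1\t1000\t2000\t0.25\tNA\n"
)
assay_names, rows = read_lite(io.StringIO(example))
```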

Sample Name export

Only the sample names are exported into a file. No spot data.

Sample Name and annotation export

The sample name and a chosen annotation are exported into a file. No spot data.

Annotation statistics

By specifying annotation types in the appropriate text field, in a format such as |ER status|brca_family_status|..|, you can get a summary of how many assays have each annotation value, for each of the specified annotation types.
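The summary described here amounts to counting annotation values per annotation type. The sketch below is not the plug-in's code: the function names, assay data, and annotation values are hypothetical, and it assumes that assays lacking a value are simply left out of the count.

```python
from collections import Counter


def parse_annotation_types(spec):
    """Split '|ER status|brca_family_status|' into a list of names."""
    return [name.strip() for name in spec.split("|") if name.strip()]


def annotation_statistics(assays, annotation_types):
    """Count how many assays carry each value, per annotation type.

    `assays` maps assay name -> {annotation type: value}.
    """
    stats = {}
    for ann_type in annotation_types:
        counts = Counter(
            ann[ann_type]
            for ann in assays.values()
            if ann.get(ann_type) is not None
        )
        stats[ann_type] = dict(counts)
    return stats


types = parse_annotation_types("|ER status|brca_family_status|")
assays = {
    "Ca13928": {"ER status": "positive"},
    "Ca13929": {"ER status": "positive", "brca_family_status": "BRCA1"},
    "Ca13930": {"ER status": "negative"},
}
stats = annotation_statistics(assays, types)
```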

BED format

This creates a format similar to the BED format as defined by UCSC and Ensembl. It contains four columns: chr, start, stop, and reporterId. You can use BED files to create your own tracks in the UCSC genome browser.
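One line of such a four-column file could be produced as follows. This is an illustrative sketch: the function name is hypothetical, tab separation is assumed, and the chromosome value is written as given, since this page does not say whether the plug-in adds a "chr" prefix.

```python
def to_bed_line(chromosome, start, stop, reporter_id):
    """Format one BED-like line: chr, start, stop, reporterId."""
    return f"{chromosome}\t{start}\t{stop}\t{reporter_id}"


line = to_bed_line("chr1", 1000, 2000, "RP11-1A1")
```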

Single file Lite

This creates one file for each assay. The file name specified as a parameter is appended as a suffix to the assay name. E.g., specifying myfile.txt will create the file Ca13928_myfile.txt for assay Ca13928. The individual files are in the Lite format (see above).
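The naming rule can be expressed in one line. The function name is hypothetical; the underscore separator is taken from the Ca13928 example above.

```python
def single_file_name(assay_name, file_name):
    """Join the assay name and the file name parameter with '_'."""
    return f"{assay_name}_{file_name}"


name = single_file_name("Ca13928", "myfile.txt")
```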

Complete annotations to file

The annotations for all assays are printed to a file. Blanks are filled with NA.

Selected annotations to file

Only selected annotations and their assays are printed to file. Blanks are filled with NA.
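Filling blanks with NA, as both annotation exports do, might look like this. The layout is an assumption, since this page does not describe the file's orientation or header: one row per assay, one column per annotation, and a header cell called "Assay". The function name and sample data are likewise hypothetical.

```python
def annotation_rows(assays, annotation_types):
    """Build a table of annotations with blanks replaced by 'NA'.

    `assays` maps assay name -> {annotation type: value}.
    """
    rows = [["Assay"] + list(annotation_types)]
    for name, ann in assays.items():
        cells = []
        for ann_type in annotation_types:
            value = ann.get(ann_type)
            # Missing or empty annotation values become "NA".
            cells.append("NA" if value in (None, "") else str(value))
        rows.append([name] + cells)
    return rows


table = annotation_rows(
    {"Ca13928": {"ER status": "positive"}, "Ca13929": {}},
    ["ER status", "brca_family_status"],
)
```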

MeV + annotations

An MeV-compatible file containing annotations is printed. This format is experimental and needs to be completed.

Annotation Display dump for selected annotations

Used for the annotation display option. Enter the selected annotations in the format |ER Status|p_brca_family_status|etc..|. The annotations are split apart and a matrix is created with one row per sample (N rows for N samples). For each annotation value type, a column is added in which each row entry is 1 if that annotation value exists for the assay and 0 if it does not.
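The description of this matrix leaves some room for interpretation. The sketch below follows one reading, namely one column per distinct (annotation, value) pair, with 1 where the assay carries that value and 0 otherwise. The function name, the sorted column order, and the sample data are assumptions and are not taken from the plug-in.

```python
def indicator_matrix(samples, annotation_types):
    """Build a 0/1 matrix with one row per sample.

    `samples` maps sample name -> {annotation type: value}.
    Returns the column labels and a dict of row vectors.
    """
    columns = []
    for ann_type in annotation_types:
        values = sorted(
            {ann[ann_type] for ann in samples.values()
             if ann.get(ann_type) is not None}
        )
        columns.extend((ann_type, value) for value in values)
    matrix = {
        name: [1 if ann.get(t) == v else 0 for (t, v) in columns]
        for name, ann in samples.items()
    }
    return columns, matrix


samples = {
    "Ca13928": {"ER Status": "positive"},
    "Ca13929": {"ER Status": "negative", "p_brca_family_status": "BRCA1"},
}
columns, matrix = indicator_matrix(
    samples, ["ER Status", "p_brca_family_status"]
)
```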

Parameters

  1. [Optional] Give a valid full file name for the created file.
  2. [Optional] Sort the data as well. NOTE! Using the sort option may require you to increase the RAM usage considerably if working with large data sets.
  3. Select the export mode: either standard or standard_lite as described above, or a format suitable for the Agilent CGH-analytics software.
  4. Select the annotation for the sample name and annotation export.
  5. Define the annotation types for annotation statistics in the form |ERstatus|brca_family_status|...|

Download and installation

There are currently no detailed notes on how to install the plug-in in BASE 1; please fall back on the BASE 1 documentation on how to install plug-ins. Download the files from the repository.

BASE 2

In BASE 2 this plug-in can be run using the Base1PluginExecuter. Simply create a configuration for the CGH Data Dumper following these steps:

  1. To be written.

License

This plug-in is released under the GNU General Public License.
