Version 6 (modified by Nicklas Nordborg, 4 months ago) (diff)

Updating to Reggie 4.27

A. Before updating Reggie

1. Install updated pipeline scripts

There are new pipeline scripts: http://baseplugins.thep.lu.se/browser/other/pipeline/trunk

Download and install them at a suitable location on the prime cluster:

  • novaseq_status.sh: Used by auto-confirmation to check status of NovaSeq sequencing.

2. Install R-script for the Single Sample Predictor (SSP)

A zip file with all the required R scripts and model .RData files can be found at casa4:/home/thep-nni/ssp/SSP.zip.

Extract this zip file to a suitable location. Make sure that the files are accessible to the Tomcat user account.

3. Import annotation types that are required for storing SSP results

The ssp_annotationtypes.xlsx file contains definitions for the annotation types required by the SSP models. Upload this file to BASE and use the batch annotation type importer to create the annotation types. It should create 16 new SSP_* annotation types (the eight models configured in section B each reference two).

B. Update Reggie

Update Reggie to version 4.27. Some changes are required for reggie-config.xml:

The <rscript>/<ssp>/<path> entry should be modified to point to the directory where the R scripts are installed (see step 2).

Add the following models to the <rscript>/<ssp>/<models> section (remove the <model> entry that is already there).

<model name="Subtype" annotation-type="SSP_Subtype" annotation-type-scores="SSP_Subtype_Scores">
	Training_Run19081Genes_noNorm_SSP.PAM50subtype4Most.Fcc15_5x5foldCV.num.rules.50_21.selRules.AIMS.GS.RData
</model>
<model name="PAM50 subtype" annotation-type="SSP_PAM50subtype" annotation-type-scores="SSP_PAM50subtype_Scores">
	Training_Run19081Genes_noNorm_SSP.subtypeMost.Fcc15_5x5foldCV.num.rules.50_24.selRules.AIMS.GS.Rdata
</model>
<model name="ROR asT0" annotation-type="SSP_RORasT0" annotation-type-scores="SSP_RORasT0_Scores">
	Training_Run19081Genes_noNorm_SSP.scaled.ROR.tot.asT0.c005.Fcc15_5x5foldCV.num.rules.50_21.selRules.AIMS.GS.RData
</model>
<model name="ER" annotation-type="SSP_ER" annotation-type-scores="SSP_ER_Scores">
	Training_Run19081Genes_noNorm_SSP.ER_v2.Fcc15_5x5foldCV.num.rules.50_19.selRules.AIMS.GS.RData
</model>
<model name="PR" annotation-type="SSP_PR" annotation-type-scores="SSP_PR_Scores">
	Training_Run19081Genes_noNorm_SSP.PR_v2.Fcc15_5x5foldCV.num.rules.50_3.selRules.AIMS.GS.Rdata
</model>
<model name="ERBB2" annotation-type="SSP_ERBB2" annotation-type-scores="SSP_ERBB2_Scores">
	Training_Run19081Genes_noNorm_SSP.HER2.Fcc15_5x5foldCV.num.rules.50_8.selRules.AIMS.GS.RData
</model>
<model name="CC15" annotation-type="SSP_CC15" annotation-type-scores="SSP_CC15_Scores">
	Training_Run19081Genes_noNorm_SSP.CClusterK20_15_5x5foldCV.num.rules.50_27.selRules.AIMS.GS.Rdata
</model>
<model name="IntClust10" annotation-type="SSP_IntClust10" annotation-type-scores="SSP_IntClust10_Scores">
	Training_Run19081Genes_noNorm_SSP.iC10.mimic_5x5foldCV.num.rules.50_41.selRules.AIMS.GS.RData
</model>
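
As a rough orientation, the two changes above could nest as in the sketch below. This is only an illustration of the element paths named in this section: the directory /path/to/ssp is a placeholder, any other elements or attributes that the existing reggie-config.xml has under <rscript> or <ssp> are omitted, and the actual layout should be checked against the installed file.

```xml
<rscript>
	<ssp>
		<!-- Placeholder (assumption): replace with the directory where SSP.zip was extracted in step A.2 -->
		<path>/path/to/ssp</path>
		<models>
			<model name="Subtype" annotation-type="SSP_Subtype" annotation-type-scores="SSP_Subtype_Scores">
				Training_Run19081Genes_noNorm_SSP.PAM50subtype4Most.Fcc15_5x5foldCV.num.rules.50_21.selRules.AIMS.GS.RData
			</model>
			<!-- The remaining seven <model> entries from the list above go here, unchanged -->
		</models>
	</ssp>
</rscript>
```

Each <model> body is the file name of a model file from the zip file, so the names (including the .RData or .Rdata ending) should match the extracted files exactly.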

Do not forget to run the Installation wizard.

C. Run SSP for existing StringTie raw bioassays

The configured SSP models run automatically for new sequencing runs if auto-confirmation is selected.

To run SSP for existing data, add StringTie raw bioassays to the Single Sample Predictor analysis item list.

Then go to the Start Single Sample Predictor analysis wizard in the Secondary analysis/Hisat and StringTie pipeline section and start the jobs.

It is possible to run 500 raw bioassays at the same time.
