Version 4 (modified by Nicklas Nordborg, 9 days ago)

Added instructions for running SSP models for existing raw bioassays

Updating to Reggie 4.27

Note that BASE 3.16 is required for running Reggie 4.26.

A. Before updating Reggie

1. Install new/updated pipeline scripts

There are new pipeline scripts at http://baseplugins.thep.lu.se/browser/other/pipeline/trunk. Download them and install them at a suitable location on the prime cluster:

  • novaseq_status.sh: Used by auto-confirmation to check status of NovaSeq sequencing.

2. Install R-script for the Single Sample Predictor

TODO

3. Import annotation types that are required for storing SSP results

The ssp_annotationtypes.xlsx file contains definitions for annotation types required for the SSP models. Upload this file to BASE and use the batch annotation type importer to create the annotation types. It should create 16 new SSP_* annotation types.

B. Update Reggie

Update Reggie to version 4.27. Add the following models to the <rscript>/<ssp>/<models> section in reggie-config.xml (remove the <model> entry that is already there). Do not forget to run the Installation wizard.

<model name="Subtype" annotation-type="SSP_Subtype" annotation-type-scores="SSP_Subtype_Scores">
	Training_Run19081Genes_noNorm_SSP.PAM50subtype4Most.Fcc15_5x5foldCV.num.rules.50_21.selRules.AIMS.GS.RData
</model>
<model name="PAM50 subtype" annotation-type="SSP_PAM50subtype" annotation-type-scores="SSP_PAM50subtype_Scores">
	Training_Run19081Genes_noNorm_SSP.subtypeMost.Fcc15_5x5foldCV.num.rules.50_24.selRules.AIMS.GS.Rdata
</model>
<model name="ROR asT0" annotation-type="SSP_RORasT0" annotation-type-scores="SSP_RORasT0_Scores">
	Training_Run19081Genes_noNorm_SSP.scaled.ROR.tot.asT0.c005.Fcc15_5x5foldCV.num.rules.50_21.selRules.AIMS.GS.RData
</model>
<model name="ER" annotation-type="SSP_ER" annotation-type-scores="SSP_ER_Scores">
	Training_Run19081Genes_noNorm_SSP.ER_v2.Fcc15_5x5foldCV.num.rules.50_19.selRules.AIMS.GS.RData
</model>
<model name="PR" annotation-type="SSP_PR" annotation-type-scores="SSP_PR_Scores">
	Training_Run19081Genes_noNorm_SSP.PR_v2.Fcc15_5x5foldCV.num.rules.50_3.selRules.AIMS.GS.Rdata
</model>
<model name="ERBB2" annotation-type="SSP_ERBB2" annotation-type-scores="SSP_ERBB2_Scores">
	Training_Run19081Genes_noNorm_SSP.HER2.Fcc15_5x5foldCV.num.rules.50_8.selRules.AIMS.GS.RData
</model>
<model name="CC15" annotation-type="SSP_CC15" annotation-type-scores="SSP_CC15_Scores">
	Training_Run19081Genes_noNorm_SSP.CClusterK20_15_5x5foldCV.num.rules.50_27.selRules.AIMS.GS.Rdata
</model>
<model name="IntClust10" annotation-type="SSP_IntClust10" annotation-type-scores="SSP_IntClust10_Scores">
	Training_Run19081Genes_noNorm_SSP.iC10.mimic_5x5foldCV.num.rules.50_41.selRules.AIMS.GS.RData
</model>
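
As a sanity check after editing the configuration, the model entries can be parsed and the annotation types they refer to listed. The following is a minimal sketch, not part of Reggie: the `<config>` root element and both function names are assumptions made for illustration, and only two of the eight models are included to keep the fragment short. The `<rscript>/<ssp>/<models>` nesting and the attribute names follow the entries above.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only. The <config> root element is an assumption;
# the <rscript>/<ssp>/<models> nesting and the two <model> entries are
# copied from the notes above.
SAMPLE_CONFIG = """
<config>
  <rscript>
    <ssp>
      <models>
        <model name="ER" annotation-type="SSP_ER" annotation-type-scores="SSP_ER_Scores">
          Training_Run19081Genes_noNorm_SSP.ER_v2.Fcc15_5x5foldCV.num.rules.50_19.selRules.AIMS.GS.RData
        </model>
        <model name="PR" annotation-type="SSP_PR" annotation-type-scores="SSP_PR_Scores">
          Training_Run19081Genes_noNorm_SSP.PR_v2.Fcc15_5x5foldCV.num.rules.50_3.selRules.AIMS.GS.Rdata
        </model>
      </models>
    </ssp>
  </rscript>
</config>
"""


def list_ssp_models(xml_text):
    """Return (name, annotation type, scores annotation type, model file) per <model>."""
    root = ET.fromstring(xml_text.strip())
    models = []
    for model in root.findall("./rscript/ssp/models/model"):
        models.append((
            model.get("name"),
            model.get("annotation-type"),
            model.get("annotation-type-scores"),
            (model.text or "").strip(),  # file name is the element text
        ))
    return models


def required_annotation_types(models):
    """Collect the distinct annotation type names that the models refer to."""
    names = set()
    for _name, ann_type, ann_scores, _file in models:
        names.update(n for n in (ann_type, ann_scores) if n)
    return sorted(names)


if __name__ == "__main__":
    for name in required_annotation_types(list_ssp_models(SAMPLE_CONFIG)):
        print(name)
```

With all eight models in place, the list should contain 16 names, which is the number of SSP_* annotation types created by the import in step A.3. The model file names differ in the case of their extension (.RData and .Rdata), so they are best copied exactly as written.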

C. Run SSP for existing StringTie raw bioassays

The configured SSP models run automatically for new sequencing runs if auto-confirmation is selected.

To run SSP for existing data, add StringTie raw bioassays to the Single Sample Predictor analysis item list.

Then open the Start Single Sample Predictor analysis wizard in the Secondary analysis/Hisat and StringTie pipeline section and start the jobs.

Up to 500 raw bioassays can be run at the same time.
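
When there are more existing raw bioassays than fit in one run, it can help to plan the batches beforehand. The sketch below assumes that 500 is the per-run limit and that the names of the raw bioassays are available as a plain list. The function name is hypothetical and nothing here calls BASE or Reggie; each batch would still be added to the item list and started through the wizard by hand.

```python
def split_into_batches(items, batch_size=500):
    """Split a list into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


if __name__ == "__main__":
    # Hypothetical names, used only to show the batch sizes.
    raw_bioassays = ["rawbioassay_%04d" % n for n in range(1, 1202)]
    for number, batch in enumerate(split_into_batches(raw_bioassays), start=1):
        print("Batch %d: %d raw bioassays" % (number, len(batch)))
```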
