Add secondary analysis section to Reggie
— at Version 6
This is the master ticket for adding secondary analysis registration functionality to Reggie. Secondary analysis covers the steps from completed sequencing until expression values have been generated, including demultiplexing (demux) and alignment against a reference genome.
Note! Primary analysis is the base calling performed by the Illumina software during the sequencing.
The pipeline will be something like this. See the other tickets (to be created) for more information about each step:
- (#545) Register sequencing as ended. Part of the "Library preparation wizards" section and done by someone in the lab.
- (#546) Confirm sequencing as completed. First wizard in the "Secondary analysis wizards" section. Used to decide whether the sequenced data is OK. If it is, continue with demuxing; otherwise, flag the pools for re-sequencing.
- (#547) Start demux and merge. This wizard starts the demux and merge operations.
- (#548) Register demux and merge as ended. At the end we have one "MergedSequences" item for each "Library" from the flow cells that were sequenced. The number of reads for each library must be recorded and is used to determine whether the library needs to be re-sequenced. FASTQ files for each library are stored on the server.
- Start filtering and alignment. Bowtie and TopHat are used to first filter and then align against a pre-defined set of transcripts.
- Register filtering and alignment as ended. At the end we have one "AlignedSequences" item for each "Library" from the flow cells that were sequenced. BAM files for each library are stored on the server.
- Start feature extraction. Cufflinks is used to calculate expression values.
- Register feature extraction. At the end we have one "RawBioAssay" item for each "Library" from the flow cells that were sequenced. FPKM files are uploaded to BASE and imported into the database.
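The steps above can be summarised as an ordered list of wizards together with the item type each one registers. The following is only a rough sketch to make the item chain explicit; it is not Reggie code. The names `Step`, `PIPELINE`, `item_chain` and `needs_resequencing` are hypothetical, and the `min_reads` threshold rule is an assumption, since this ticket only says that the read count is used to decide on re-sequencing, not how.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One wizard in the planned pipeline (hypothetical model, not Reggie code)."""
    name: str
    ticket: Optional[int]    # ticket number if already created, else None
    produces: Optional[str]  # item type registered when the step ends, if any


# Steps in the order listed in this ticket.
PIPELINE: List[Step] = [
    Step("Register sequencing as ended", 545, None),
    Step("Confirm sequencing as completed", 546, None),
    Step("Start demux and merge", 547, None),
    Step("Register demux and merge as ended", 548, "MergedSequences"),
    Step("Start filtering and alignment", None, None),
    Step("Register filtering and alignment as ended", None, "AlignedSequences"),
    Step("Start feature extraction", None, None),
    Step("Register feature extraction", None, "RawBioAssay"),
]


def item_chain() -> List[str]:
    """Item types created for each library, in pipeline order."""
    return ["Library"] + [s.produces for s in PIPELINE if s.produces]


def needs_resequencing(read_count: int, min_reads: int) -> bool:
    """Assumed rule: flag a library whose read count is below min_reads."""
    return read_count < min_reads


if __name__ == "__main__":
    print(" -> ".join(item_chain()))
```

Each "Register ... as ended" step creates exactly one new item per library, so the chain for a single library is Library, MergedSequences, AlignedSequences, RawBioAssay.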
Overview PDF v1