Opened 7 years ago
Closed 7 years ago
#1022 closed enhancement (fixed)
INCA import/statistics wizards changes due to INCA 2.0
Reported by: | Nicklas Nordborg | Owned by: | Nicklas Nordborg |
---|---|---|---|
Priority: | major | Milestone: | Reggie v4.16 |
Component: | net.sf.basedb.reggie | Keywords: | |
Cc: |
Description
INCA has been updated to version 2.0. Most variables have been renamed, which affects the INCA import and INCA statistics wizards. A quick check reveals that at least the following variables are referenced by name:
- A030DiaDat
- A080OpDat
- A090InvCa_Värde
- PATID
- A030Sida_Beskrivning
- A100ER_Värde
- A100HER2_Värde
- A100NHG_Värde
- A100PR_Värde
- A000Alder
- A090HistoTumStl
We need to update the code to be able to work with the new variable names.
Change History (14)
comment:2 by , 7 years ago
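The A030DiaDat annotation was defined as: Tidigaste datum då diagnos fastställdes kliniskt och/eller genom morfologisk undersökning. (Earliest date on which the diagnosis was established clinically and/or by morphological examination.)
The a_diag_dat annotation in the new version is defined as: F.o.m 2.0.0 provtagningsdatum tidigare diagnosdatum. Ange datum för första punktion/biopsi. (As of 2.0.0 sampling date, previously diagnosis date. Enter the date of the first puncture/biopsy.)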
This may affect the meaning of the ReferenceDate and ReferenceDateSource annotations we use in Reggie for calculating relative dates. ReferenceDateSource is an enum with the following options:
- IncaDiagnosisDate
- SamplingDate
- ConsentDate
- RegistrationDate
The first option is used whenever we get a date from INCA. Should we rename that option to something else? IncaSamplingDate? Or would that be too easily confused with the second option?
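For reference, a minimal sketch of how the options listed above could look as a Java enum. Only the four constant names are taken from this ticket; the enum layout and the isFromInca() helper are assumptions for illustration and not the actual Reggie code.

```java
// Hypothetical sketch; only the four constant names come from this ticket.
enum ReferenceDateSource
{
  IncaDiagnosisDate, // used whenever the date comes from INCA
  SamplingDate,
  ConsentDate,
  RegistrationDate;

  // Assumed helper: true for the option that marks an INCA-derived date
  boolean isFromInca()
  {
    return this == IncaDiagnosisDate;
  }
}
```

If the option is stored by name on existing items, a rename would also have to deal with values already saved under the old name.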
comment:3 by , 7 years ago
Status: | new → assigned |
---|
comment:4 by , 7 years ago
(In [4702]) References #1022: INCA import/statistics wizards changes due to INCA 2.0
The import wizard now uses INCA2_ as the expected prefix for INCA annotations.
a_diag_dat is used instead of A030DiaDat for setting the reference date.
a_pat_sida_Beskrivning is used instead of A030Sida_Beskrivning for matching the case with laterality.
The import wizard seems to be working, but has only been tested with a simple file (3 entries).
The statistics wizard crashes with an ArrayIndexOutOfBoundsException, which is not surprising since it still uses the old variable names.
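A crash like this is what one would expect if a column is looked up by an old name and the "not found" index is then used directly. That cause is an assumption, not something verified against the wizard code. A sketch of a lookup that fails with a readable message instead; the class and method names are hypothetical:

```java
import java.util.List;

// Hypothetical helper, not part of Reggie.
class IncaColumns
{
  // Returns the index of the named column, or throws with a message
  // that names the missing variable instead of failing later on index -1.
  static int indexOf(List<String> headers, String name)
  {
    int index = headers.indexOf(name);
    if (index < 0)
    {
      throw new IllegalArgumentException("Column not found in INCA file: " + name);
    }
    return index;
  }
}
```

With headers from an INCA 2.0 file, IncaColumns.indexOf(headers, "A030DiaDat") would then report the missing variable by name.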
comment:5 by , 7 years ago
(In [4703]) References #1022: INCA import/statistics wizards changes due to INCA 2.0
Updated the statistics wizard to use the new variables as specified earlier in this ticket. It doesn't crash and seems to produce a result. A more detailed check is required since some of the variables now have a list of options that is different from the old list.
comment:6 by , 7 years ago
(In [4704]) References #1022: INCA import/statistics wizards changes due to INCA 2.0
The "Cancer type" filter used to have three options:
- 1 = Only invasive cancer
- 2 = Only in situ cancer
- 3 = Both
It now has only two options:
- 1 = Invasive cancer with or without in situ cancer
- 2 = Only in situ cancer
The filter has been updated to reflect the changes. The options for the other variables seem to be the same as before (at least the options that we care about).
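The new option list can be sketched as an enum keyed by the INCA code. The codes and labels are the ones listed above; the enum name, constant names and fromCode() are assumptions for illustration.

```java
// Hypothetical sketch of the INCA 2.0 "Cancer type" options.
enum CancerTypeOption
{
  INVASIVE(1, "Invasive cancer with or without in situ cancer"),
  IN_SITU_ONLY(2, "Only in situ cancer");

  final int code;
  final String label;

  CancerTypeOption(int code, String label)
  {
    this.code = code;
    this.label = label;
  }

  // Maps an INCA code to an option; the old code 3 (Both) is rejected
  static CancerTypeOption fromCode(int code)
  {
    for (CancerTypeOption option : values())
    {
      if (option.code == code) return option;
    }
    throw new IllegalArgumentException("Unknown cancer type code: " + code);
  }
}
```

Rejecting unknown codes makes a file that still contains the old value 3 fail visibly instead of being counted under the wrong option.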
comment:7 by , 7 years ago
(In [4707]) References #1022: INCA import/statistics wizards changes due to INCA 2.0
Added support for importing annotations from the "uppföljning" (follow-up) file. Basically there are two things that are different from the regular file:
- Laterality is taken from the u_pat_sida_Beskrivning variable.
- If there are multiple lines with the same personal number and laterality, only the line with the latest value in u_dat is kept.
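The second rule can be sketched as a single pass that keeps one line per personal number and laterality. This is an illustration only: the class names are hypothetical, and it assumes that every line has a u_dat value that has already been parsed into a date.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for the three values the rule depends on.
class FollowUpLine
{
  final String personalNumber;
  final String laterality; // value of u_pat_sida_Beskrivning
  final LocalDate date;    // value of u_dat, assumed to be present and parsed

  FollowUpLine(String personalNumber, String laterality, LocalDate date)
  {
    this.personalNumber = personalNumber;
    this.laterality = laterality;
    this.date = date;
  }
}

class FollowUpFilter
{
  // Keeps only the line with the latest date for each
  // personal number + laterality combination.
  static List<FollowUpLine> keepLatest(List<FollowUpLine> lines)
  {
    Map<String, FollowUpLine> latest = new LinkedHashMap<>();
    for (FollowUpLine line : lines)
    {
      String key = line.personalNumber + "|" + line.laterality;
      FollowUpLine current = latest.get(key);
      if (current == null || line.date.isAfter(current.date))
      {
        latest.put(key, line);
      }
    }
    return new ArrayList<>(latest.values());
  }
}
```

The LinkedHashMap keeps the result in the order in which each combination first appears in the file. If two lines have the same date, the first one wins.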
comment:8 by , 7 years ago
comment:9 by , 7 years ago
comment:10 by , 7 years ago
(In [4719]) References #1022: INCA import/statistics wizards changes due to INCA 2.0
Ignore lines from the "uppföljning" file without a u_dat value. The filtering happens at an early stage in the parsing process in order not to disturb downstream functionality. In principle it is as if the line was not in the file to begin with. A possible side effect is that line numbers may not be reported correctly.
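One way the line-number side effect could be avoided is to attach the original line number to each line before filtering. The sketch below shows that idea only; it is not what the changeset does, and the class names, the column index parameter and the firstLineNo parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder that remembers where a data line came from.
class NumberedLine
{
  final int lineNo;      // line number in the original file
  final String[] fields; // the split column values

  NumberedLine(int lineNo, String[] fields)
  {
    this.lineNo = lineNo;
    this.fields = fields;
  }
}

class MissingDateFilter
{
  // Drops lines where the u_dat column is missing or blank. The kept
  // lines carry their original line numbers for later error reporting.
  static List<NumberedLine> dropMissingDate(List<String[]> rows, int dateColumn, int firstLineNo)
  {
    List<NumberedLine> kept = new ArrayList<>();
    for (int i = 0; i < rows.size(); i++)
    {
      String[] fields = rows.get(i);
      String date = dateColumn < fields.length ? fields[dateColumn] : null;
      if (date != null && !date.trim().isEmpty())
      {
        kept.add(new NumberedLine(firstLineNo + i, fields));
      }
    }
    return kept;
  }
}
```

Here firstLineNo is the file line number of the first data row, for example 2 when the file has a single header line.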
comment:11 by , 7 years ago
comment:12 by , 7 years ago
(In [4746]) References #1022: INCA import/statistics wizards changes due to INCA 2.0
The code for filtering the follow-up file on date didn't work as expected due to interference with the filter that removed all but two data lines for the same patient. The filtering code has been re-organized and the ">2 lines" filter has been replaced with a "missing laterality" filter.
There was also a problem with the output CSV file which included duplicate entries of all imported lines (but without the PAT* value).
comment:13 by , 7 years ago
comment:14 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
The following table should map the old names to the new names.
Some of the A100 variables also have a variant with a_pad_ prefix, but from the example data it seems like the information has been migrated to the op_pad_ variables.