Opened 2 months ago

Closed 2 months ago

#1615 closed enhancement (fixed)

Implement SBS96 for RNAseq variant calling

Reported by: Nicklas Nordborg
Owned by: Nicklas Nordborg
Priority: major
Milestone: Reggie v5.3.1
Component: net.sf.basedb.reggie
Keywords:
Cc:

Description

The same Python script that is used in the WGS variant calling pipeline should probably work with some minor modifications (e.g. there is no normal sample in the VCF).
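For readers unfamiliar with SBS96, the classification itself can be sketched as below. This is only an illustration of the general technique, not the code in sbs96.py: the function names are hypothetical, and it assumes each variant has already been reduced to a 3-base reference context (centered on the variant) plus the alternate base. Because only these two values are needed, nothing in the classification depends on a normal sample being present in the VCF.

```python
# Illustrative sketch only; names are hypothetical and not taken from sbs96.py.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sbs96_channel(context, alt):
    """Map a 3-base reference context and an alternate base to an SBS96 channel.

    Returns a label such as 'A[C>T]G', or None if the input is not a
    single-base substitution over A/C/G/T.
    """
    context = context.upper()
    alt = alt.upper()
    if len(context) != 3 or len(alt) != 1 or set(context + alt) - set("ACGT"):
        return None
    ref = context[1]
    if ref == alt:
        return None
    # By convention the mutated base is reported as a pyrimidine (C or T),
    # so purine references are flipped to the opposite strand.
    if ref in "AG":
        context = revcomp(context)
        alt = revcomp(alt)
        ref = context[1]
    return f"{context[0]}[{ref}>{alt}]{context[2]}"


def all_channels():
    """List all 96 channels: 6 substitution types x 4 5' bases x 4 3' bases."""
    subs = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
    return [f"{five}[{sub}]{three}"
            for sub in subs for five in "ACGT" for three in "ACGT"]


def count_sbs96(variants):
    """Count (context, alt) pairs per channel; unclassifiable records are skipped."""
    counts = dict.fromkeys(all_channels(), 0)
    for context, alt in variants:
        channel = sbs96_channel(context, alt)
        if channel is not None:
            counts[channel] += 1
    return counts
```

How the 3-base context is looked up in the reference genome, and which variants pass filtering, is left out here since the ticket does not describe it.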

The container needs to be updated to include the SigProfilerPlotting package.

Change History (4)

comment:1 by Nicklas Nordborg, 2 months ago

In 7869:

References #1615: Implement SBS96 for RNAseq variant calling

Updated the container definition.

  • Changed base image to Rocky 9
  • Switched to mamba instead of miniconda
  • Added SigProfilerPlotting python package.
  • An explicit version of numpy must be specified due to a binary incompatibility between 'pandas' and 'numpy' (see the traceback below and https://github.com/numpy/numpy/issues/26710).


Traceback (most recent call last):
  File "/home/thep-nni/sbs96.py", line 22, in <module>
    import sigProfilerPlotting as sig_plot
  File "/conda/lib/python3.10/site-packages/sigProfilerPlotting/__init__.py", line 1, in <module>
    from .sigProfilerPlotting import *
  File "/conda/lib/python3.10/site-packages/sigProfilerPlotting/sigProfilerPlotting.py", line 30, in <module>
    import pandas as pd
  File "/conda/lib/python3.10/site-packages/pandas/__init__.py", line 22, in <module>
    from pandas.compat import is_numpy_dev as _is_numpy_dev  # pyright: ignore # noqa:F401
  File "/conda/lib/python3.10/site-packages/pandas/compat/__init__.py", line 18, in <module>
    from pandas.compat.numpy import (
  File "/conda/lib/python3.10/site-packages/pandas/compat/numpy/__init__.py", line 4, in <module>
    from pandas.util.version import Version
  File "/conda/lib/python3.10/site-packages/pandas/util/__init__.py", line 2, in <module>
    from pandas.util._decorators import (  # noqa:F401
  File "/conda/lib/python3.10/site-packages/pandas/util/_decorators.py", line 14, in <module>
    from pandas._libs.properties import cache_readonly
  File "/conda/lib/python3.10/site-packages/pandas/_libs/__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
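A failure like this only shows up when pandas is imported, so a small guard before the import can give a clearer message. The sketch below is an assumption-laden illustration, not part of the container definition: it assumes the pin keeps numpy below 2.0 (the ticket does not state the exact version), and the function names are hypothetical.

```python
# Illustrative sketch only; the "< 2" threshold is an assumption, since the
# ticket does not say which numpy version was pinned.
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def numpy_pin_satisfied(version, max_major=2):
    """True if the given numpy version is below the assumed major-version cap."""
    return major_version(version) < max_major


def check_installed_numpy():
    """Return a short status message for the numpy found in this environment."""
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return "numpy is not installed"
    if numpy_pin_satisfied(installed):
        return f"numpy {installed} satisfies the assumed pin"
    return f"numpy {installed} is newer than the assumed pin; pandas may fail to import"
```

Calling check_installed_numpy() at the top of a script, before importing sigProfilerPlotting, would report the mismatch without the long traceback.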

comment:2 by Nicklas Nordborg, 2 months ago

In 7870:

References #1615: Implement SBS96 for RNAseq variant calling

The container needed more updates:

  • Updated to R 4.3
  • Updated to mosdepth 0.3.8
  • Updated to vcfanno 0.3.5
  • Updated to bedtools 2.31
  • Updated to MutationalPatterns 3.12

The main reason is that the R 3.6 installation didn't work properly. The mutation_signature.R script failed with:

Error: package or namespace load failed for ‘MutationalPatterns’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/conda/lib/R/library/openssl/libs/openssl.so':
  libssl.so.1.1: cannot open shared object file: No such file or directory
Execution halted

The simplest solution was to use a newer R version (4.3) instead. The other packages also needed to be updated in order to avoid conflicts.

comment:3 by Nicklas Nordborg, 2 months ago

In 7871:

References #1615: Implement SBS96 for RNAseq variant calling

Added SBS96 calculations to RNAseq variant calling script.

comment:4 by Nicklas Nordborg, 2 months ago

Resolution: fixed
Status: new → closed