Opened 11 months ago

Closed 11 months ago

Last modified 11 months ago

#1508 closed enhancement (fixed)

Use Apache Commons Compress instead of builtin GZIPInputStream

Reported by: Nicklas Nordborg
Owned by: Nicklas Nordborg
Priority: major
Milestone: Variant Search v1.9
Component: net.sf.basedb.varsearch
Keywords:
Cc:

Description

One of our servers runs Java 11 (OpenJDK Runtime Environment 11.0.17+8-LTS). On that server, parsing fails for VCF files from the variant calling (variants-annotated.vcf.gz) that have been compressed with bgzip. The stack trace:

net.sf.basedb.core.InvalidDataException: Could not find line #41612: variants-annotated.vcf.gz
        at net.sf.basedb.varsearch.servlet.HitServlet.doGet(HitServlet.java:106)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:634)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
...

The same file can be downloaded and decompressed with other tools, and it does have more than the specified number of lines. We don't see the problem on our other server, which runs a newer Java version.

The problem seems to be related to the multi-block format that bgzip uses, in which the file is a series of concatenated gzip blocks (this is required for indexing and quick access to random locations in the file).

Apache Commons Compress provides an alternate implementation that seems to work: https://commons.apache.org/proper/commons-compress/
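To make the multi-block point concrete, here is a minimal sketch that builds a small multi-member gzip stream in memory and counts the lines a decoder reads back. It uses only the JDK; the class name `MultiMemberGzipCheck`, its helper methods, and the sample records are hypothetical and not taken from the Variant Search code. It is a self-check under those assumptions, not a reproduction of the failure.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class MultiMemberGzipCheck {

    // Compress one chunk of text as a complete, standalone gzip member.
    static byte[] gzipMember(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Concatenate several members back to back, similar in spirit to
    // how a bgzip file is a sequence of independent gzip blocks.
    static byte[] multiMember(String... chunks) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String chunk : chunks) {
            out.write(gzipMember(chunk));
        }
        return out.toByteArray();
    }

    // Decompress the stream and count newline characters.
    static int countLines(InputStream compressed) throws IOException {
        int lines = 0;
        try (InputStream in = new GZIPInputStream(compressed)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] == '\n') {
                        lines++;
                    }
                }
            }
        }
        return lines;
    }

    // Convenience wrapper: build the stream and read it back.
    static int linesReadBack(String... chunks) {
        try {
            return countLines(new ByteArrayInputStream(multiMember(chunks)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three members with one line each (made-up sample records).
        int lines = linesReadBack("chr1\t100\n", "chr1\t200\n", "chr2\t300\n");
        System.out.println("lines read: " + lines);
    }
}
```

A decoder that stops after the first member would report fewer lines than were written. This in-memory check is not expected to trigger the problem by itself; running `countLines` against the real file stream on the affected server would be the meaningful comparison. If Commons Compress is used instead, note that its `GzipCompressorInputStream` has a two-argument constructor whose `decompressConcatenated` flag must be `true` for decoding to continue past the first member.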

Change History (3)

comment:1 by Nicklas Nordborg, 11 months ago

In 7302:

References #1508: Use Apache Commons Compress instead of builtin GZIPInputStream

Added Apache Commons Compress to the repository.

comment:2 by Nicklas Nordborg, 11 months ago

Resolution: fixed
Status: new → closed

In 7303:

Fixes #1508: Use Apache Commons Compress instead of builtin GZIPInputStream

The GzipCompressorInputStream implementation from Apache Commons Compress is now used instead of the GZIPInputStream.

comment:3 by Nicklas Nordborg, 11 months ago

Milestone: Variant Search 1.9 → Variant Search v1.9

Milestone renamed
