Opened 8 years ago

Closed 8 years ago

#822 closed enhancement (fixed)

Fix character encoding problems with R scripts

Reported by: Nicklas Nordborg
Owned by: Nicklas Nordborg
Priority: critical
Milestone: Reggie v3.7.1
Component: net.sf.basedb.reggie
Keywords:
Cc:

Description

When testing the pilot report R script on the production server, it failed with this error message:

java.lang.RuntimeException: Warning message: 
In source("D:/Data/R/pilotReport_v3/pilotReport_wrapper_v3.0.R", : 
invalid input found on input connection 'D:/Data/R/pilotReport_v3/pilotReport_wrapper_v3.0.R' 
Error: could not find function "pilotReport" Execution halted 

It looked like a file encoding problem (the file contains åäö), since the same error can be forced by saving the R script with ANSI encoding instead of UTF-8. After double- and triple-checking the R file, we concluded that the file was encoded in UTF-8 and that R was trying to read it as UTF-8.

The script worked when running it manually, but not when starting it via BASE.

http://stackoverflow.com/questions/5031630/how-to-source-r-file-saved-using-utf-8-encoding gave us another hint: the problem could be that R cannot represent the åäö characters internally. Investigation showed that when running R manually the locale was set to 'en_US.UTF-8', but when running via BASE it was set to 'POSIX'. The POSIX locale doesn't include åäö, which is probably why the script doesn't work.

A possible solution would be to switch the locale in R before loading the pilot R script. See https://stat.ethz.ch/R-manual/R-devel/library/base/html/locales.html

Reggie should include a call to Sys.setlocale() before calling source() in the short R script that is generated in the RScriptDefinition class:

Sys.setlocale(...); source(...); pilotReport(...);
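
A minimal sketch of how the generated script could be assembled, to make the ordering concrete. This is not the actual RScriptDefinition code: the class RScriptSketch and the helpers rString() and buildScript() are hypothetical names, and "pilotReport()" is only a placeholder since the real arguments are not listed in this ticket.

```java
/** Hypothetical sketch; not the actual Reggie RScriptDefinition code. */
final class RScriptSketch {

    /** Quote a value as an R string literal, escaping backslashes and double quotes. */
    static String rString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * Build the short wrapper script. The locale is switched first, so that
     * source() reads the UTF-8 file under a locale that can represent åäö.
     */
    static String buildScript(String locale, String scriptPath, String functionCall) {
        StringBuilder r = new StringBuilder();
        r.append("Sys.setlocale(locale=").append(rString(locale)).append("); ");
        r.append("source(").append(rString(scriptPath)).append("); ");
        r.append(functionCall).append(";");
        return r.toString();
    }

    public static void main(String[] args) {
        // Placeholder call: the real pilotReport(...) arguments are not part of this sketch.
        System.out.println(buildScript(
            "en_US.UTF-8",
            "D:/Data/R/pilotReport_v3/pilotReport_wrapper_v3.0.R",
            "pilotReport()"));
    }
}
```

Passing the locale as a parameter instead of hard-coding it would allow it to be changed per server; whether Reggie needs that is an open question.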

Change History (1)

comment:1 by Nicklas Nordborg, 8 years ago

Resolution: fixed
Status: new → closed

(In [3578]) Fixes #822: Fix character encoding problems with R scripts

Calling Sys.setlocale(locale='en_US.UTF-8'); solved the problem.
