Opened 9 years ago
Closed 9 years ago
#822 closed enhancement (fixed)
Fix character encoding problems with R scripts
Reported by: | Nicklas Nordborg | Owned by: | Nicklas Nordborg |
---|---|---|---|
Priority: | critical | Milestone: | Reggie v3.7.1 |
Component: | net.sf.basedb.reggie | Keywords: | |
Cc: |
Description
When testing the pilot report R script on the production server it failed with an error message:
java.lang.RuntimeException: Warning message:
In source("D:/Data/R/pilotReport_v3/pilotReport_wrapper_v3.0.R", :
  invalid input found on input connection 'D:/Data/R/pilotReport_v3/pilotReport_wrapper_v3.0.R'
Error: could not find function "pilotReport"
Execution halted
It looked like a file encoding problem (due to åäö inside the file), since the same error can be forced by saving the R script with ANSI encoding instead of UTF-8. After double- and triple-checking the R file, we concluded that the file was encoded in UTF-8 and that R was trying to read it as UTF-8.
The script worked when running it manually, but not when starting it via BASE.
http://stackoverflow.com/questions/5031630/how-to-source-r-file-saved-using-utf-8-encoding gave us another hint: the problem could be that R was unable to represent the åäö characters internally in memory. Investigation showed that when running R manually the locale was set to 'en_US.UTF-8', but when running via BASE it was set to 'POSIX'. The POSIX locale does not include åäö, and this is probably why the script fails.
A possible solution would be to switch locale in R before loading the pilot R script. See https://stat.ethz.ch/R-manual/R-devel/library/base/html/locales.html
Reggie should include a call to Sys.setlocale() before calling source() in the short R script that is generated in the RScriptDefinition class:
Sys.setlocale(...); source(...); pilotReport(...);
(In [3578]) Fixes #822: Fix character encoding problems with R scripts
Calling Sys.setlocale(locale='en_US.UTF-8'); solved the problem.
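For illustration, here is a minimal sketch of how the generated R script could be assembled so that the locale is set before the wrapper is sourced. This is not the actual RScriptDefinition code: the class RScriptSketch and the helpers rString() and buildScript() are hypothetical names, and the pilotReport() call is shown without arguments because the ticket does not list them.

```java
// Hypothetical sketch; not the actual Reggie RScriptDefinition implementation.
class RScriptSketch {

    // Quote a Java string as an R string literal, escaping backslashes and double quotes.
    static String rString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Build the short R script: set the locale first, then source the
    // wrapper file, then call the report function.
    static String buildScript(String locale, String scriptPath, String functionCall) {
        StringBuilder r = new StringBuilder();
        r.append("Sys.setlocale(locale=").append(rString(locale)).append(");\n");
        r.append("source(").append(rString(scriptPath)).append(");\n");
        r.append(functionCall).append(";\n");
        return r.toString();
    }

    public static void main(String[] args) {
        // Path taken from the error message in this ticket; the function
        // arguments are omitted because the ticket does not specify them.
        System.out.print(buildScript(
            "en_US.UTF-8",
            "D:/Data/R/pilotReport_v3/pilotReport_wrapper_v3.0.R",
            "pilotReport()"));
    }
}
```

The ordering is the point of the sketch: Sys.setlocale() has to appear before source(), since the locale must already be UTF-8 capable when R reads the file containing åäö.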