Java

Pentaho runs inside a Java virtual machine, and hence is bound by the properties of that VM.

These optimisations can apply to just about any Java application, including the Pentaho BI Server and GUI tools.

Java Runtime Environment (JRE)

Version

There are many different Java virtual machines available, but Pentaho is developed and tested primarily on the Sun Java Virtual Machine.

Note that merely having Sun Java installed doesn't mean that it is set as the default JRE. To check, run the following from a terminal:

$ java -version

It should print something like:

java version "1.5.0_14"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_14-b03)
Java HotSpot(TM) Server VM (build 1.5.0_14-b03, mixed mode)
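
If you want to check the version from a script rather than by eye, something along these lines may help. It is only a sketch: java_version_of is a hypothetical helper name, and it assumes the first line of the output has the java version "x.y.z" form shown above.

```shell
# Hypothetical helper: reads `java -version` output on stdin and prints
# the "major.minor" part of the version string (e.g. 1.5).
java_version_of() {
    sed -n 's/.*version "\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (java -version writes to stderr, hence the redirect):
#   java -version 2>&1 | java_version_of
```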

Warning: Pentaho officially supports the JVM versions supported by the major servlet containers (Tomcat, JBoss, etc.). At present the latest such version is 1.5.

We have some instructions for installing Java on Red Hat/CentOS and Fedora systems. If you're using Ubuntu, refer to this page.

Many Linux distributions ship with GCJ by default. You will need to switch to Sun Java for best results. OpenJDK is quickly gaining in popularity and will likely become a standard, but for the moment we recommend Sun's proprietary JRE.

Environment variables

Tomcat uses the JAVA_HOME environment variable to locate the installation of Java to use.

You can set it in /etc/profile (system-wide) or ~/.profile (per user):

JAVA_HOME="/usr/java/jdk1.5.0_15"
export JAVA_HOME
(alter the path according to your installation)

Log out and log back in, then test the variable:

$ echo $JAVA_HOME
/usr/java/jdk1.5.0_15
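
Echoing the variable only shows that it is set, not that it points at a usable Java installation. A minimal sketch of a stricter check follows; check_java_home is a hypothetical helper name, and the check assumes the usual JDK/JRE layout with the launcher at bin/java.

```shell
# Hypothetical helper: succeeds only if the given directory contains an
# executable bin/java (the usual location of the Java launcher).
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Warn if JAVA_HOME is unset or does not look like a Java installation.
check_java_home "${JAVA_HOME:-}" || echo "JAVA_HOME is unset or has no bin/java" >&2
```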

Memory

Assuming that the machine is dedicated to Pentaho, you can maximise the amount of RAM allocated to the Java VM.

-Xms: set initial Java heap size
-Xmx: set maximum Java heap size

The JVM starts with -Xms amount of memory for the heap (which stores objects, etc.) and can grow to a maximum of -Xmx amount of memory. -Xmx must be equal to or larger than -Xms.

Increase the memory options to what your system will allow. Remember that you'll need to leave RAM for your operating system, database, etc. (and you'll likely get a performance degradation if you starve important services of memory). For example, if you've got 12GB of RAM you might want to give 8GB to the Java VM for Pentaho:

-Xms8g -Xmx8g -XX:MaxPermSize=512m

Note that the limit to how much can be allocated to a JVM can be considerably less than the amount of memory you have installed in total. See this page for details.
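
The arithmetic in the 12GB example can be sketched as a small shell function. This is an illustration only: heap_opts is a hypothetical helper, both arguments are in megabytes, and the amount to reserve for the operating system and database is something you have to judge for your own machine.

```shell
# Hypothetical helper: given total RAM and the amount to leave for the
# OS, database, etc. (both in MB), print matching -Xms/-Xmx options.
heap_opts() {
    total_mb=$1
    reserve_mb=$2
    heap_mb=$((total_mb - reserve_mb))
    if [ "$heap_mb" -le 0 ]; then
        echo "reserve exceeds total memory" >&2
        return 1
    fi
    # Initial and maximum heap are set to the same value, as in the
    # -Xms8g -Xmx8g example above.
    printf '%s\n' "-Xms${heap_mb}m -Xmx${heap_mb}m"
}

heap_opts 12288 4096   # 12GB machine, 4GB kept for OS and database
```

For the figures above this prints -Xms8192m -Xmx8192m, which is the same 8GB heap expressed in megabytes.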

Pentaho server (JBoss)

  1. Open the file {pentaho}/jboss/bin/run.conf in a text editor
  2. Look for the Java options section:
    #
    # Specify options to pass to the Java VM.
    #
    if [ "x$JAVA_OPTS" = "x" ]; then
       JAVA_OPTS="-Xms128m -Xmx512m -XX:MaxPermSize=256m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.awt.headless=true -Djava.io.tmpdir=$MYTEMP"
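    fi
  3. Change the -Xms and -Xmx values (and -XX:MaxPermSize if needed) in JAVA_OPTS to suit your system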

Pentaho server (Tomcat)

  1. Open the file {pentaho}/start-pentaho.sh (or start-pentaho.bat) in a text editor
  2. Adjust the settings for the CATALINA_OPTS variable
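
As an illustration, the adjusted setting might look like the fragment below. The values are simply the ones from the 12GB example in the Memory section, and the existing contents of start-pentaho.sh are an assumption here, since they differ between releases; keep any other options already present in the variable.

```shell
# Example values only (taken from the 12GB example above); adjust them
# to your machine and keep any other options your script already sets.
CATALINA_OPTS="-Xms8g -Xmx8g -XX:MaxPermSize=512m"
export CATALINA_OPTS
```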

Pentaho Data Integration

  1. Open the file spoon.sh or spoon.bat in a text editor
  2. Look for a section that looks like this:
    # ******************************************************************
    # ** Set java runtime options                                     **
    # ** Change 256m to higher values in case you run out of memory.  **
    # ******************************************************************
    
    OPT="-Xmx256m -cp $CLASSPATH -Djava.library.path=$LIBPATH -DKETTLE_HOME=$KETTLE_HOME -DKETTLE_REPOSITORY=$KETTLE_REPOSITORY -DKETTLE_USER=$KETTLE_USER -DKETTLE_PASSWORD=$KETTLE_PASSWORD -DKETTLE_PLUGIN_PACKAGES=$KETTLE_PLUGIN_PACKAGES"
  3. Change the -Xmx parameter to alter the maximum heap size, e.g. -Xmx1024m
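
If you would rather script the change than edit the file by hand, a sed substitution along these lines could do it. This is a sketch, not part of Pentaho: set_xmx is a hypothetical helper, and it assumes the script contains a single -Xmx option written as a number followed by a k, m or g suffix.

```shell
# Hypothetical helper: rewrites the -Xmx value in the text read from
# stdin to the size given as the first argument (e.g. 1024m).
set_xmx() {
    sed "s/-Xmx[0-9][0-9]*[kKmMgG]/-Xmx$1/"
}

# Usage (write to a new file and review it before replacing spoon.sh):
#   set_xmx 1024m < spoon.sh > spoon.sh.new
```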

Garbage Collection

If more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, the JVM throws an OutOfMemoryError (reported as "GC overhead limit exceeded").

This feature is designed to prevent applications from running for an extended period of time while making little or no progress because the heap is too small. If necessary, this feature can be disabled by adding the following option to the command line.

-XX:-UseGCOverheadLimit
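
For the server, "the command line" in practice means the options variable discussed in the Memory section (JAVA_OPTS or CATALINA_OPTS). One way to append the flag without duplicating it is sketched below; add_jvm_opt is a hypothetical helper, and it assumes the options are separated by single spaces.

```shell
# Hypothetical helper: prints the option string in $1 with the option
# in $2 appended, unless $2 is already present as a whole word.
add_jvm_opt() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "${1:+$1 }$2" ;;
    esac
}

JAVA_OPTS=$(add_jvm_opt "${JAVA_OPTS:-}" "-XX:-UseGCOverheadLimit")
```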

Creator: sd on 2008/08/22 09:29