Building R on old versions of Linux

If you’re a user of R and would like to build a recent version for yourself, but you’re working on a fairly old Linux operating system, you may encounter some issues regarding various libraries that don’t meet R’s minimum version requirements. This is my guide to getting everything you need installed.

As of R-3.3.0, the previously bundled versions of several system libraries were removed, meaning that you need to have them available somewhere else on the system. Here’s the relevant entry from R’s NEWS file:

  • The previously included versions of zlib, bzip2, xz and PCRE have been removed, so suitable external (usually system) versions are required (see the ‘R Installation and Administration’ manual).

One such operating system is CentOS 5, which is pretty damn old now, but it’s not uncommon to find it running on some centralised computing resource, like a work compute cluster, where you don’t have admin rights. Sometimes stability is to be lauded, but unfortunately many of the system libraries available by default for CentOS 5 are similarly ancient, and don’t meet the (fairly generous) minimum requirements to build R. In my experience the first place this will fail is in the configure step, where the available version of zlib won’t meet the minimum requirements.

Here are the steps I took to compile R-3.3.2 in an (almost) default CentOS 5.11 install. I’m going to assume that system-wide versions of gcc, g++ and gfortran are available. If you’ve got to build those too then I’m not sure what the system has ever been used for!

Installation strategy and file locations

Once you start compiling packages from source, things can get a little confusing regarding where to keep things.  First you have to download and unpack things, then run unfamiliar compilation commands that may copy things to new locations, and then you need to let other programs know that’s where you put the first program.  It’s easy to lose track of things and miss something, or to end up with environment variables like $LD_LIBRARY_PATH with really long lists of locations attached.

In this guide we’re going to focus on using two folders, both in our home directory. The first of these is called src and is the place where we’re going to download and compile the various tools we need. Once the installation is complete, you can delete this folder if you want. The second location will be called usr, and is where everything gets installed. That’s a bit of a weird name, but usually system libraries are installed in /usr (which we don’t have permission to write to), so I like to keep things familiar. You can obviously use whatever names you like for these two folders, just remember to change the appropriate entries in the commands you run.

Compiling the dependencies

First off, we’re going to create a directory called src in our home folder, and download all the various libraries we need, plus the source of R.
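The download step might look something like the sketch below. The only URL I’m certain of is the CRAN one for R-3.3.2. The names SRC_DIR, SOURCE_URLS and fetch_sources are just illustrative, and you’ll need to add the tarball URLs for zlib, bzip2, xz, PCRE and curl yourself, picking releases that meet R’s minimum versions.

```shell
# Working directory for downloads and builds (the name is this guide's convention)
SRC_DIR="$HOME/src"
mkdir -p "$SRC_DIR"

# One URL per line. Only the R tarball is filled in here; add the release
# tarballs for zlib, bzip2, xz, PCRE and curl from their project pages.
SOURCE_URLS="
https://cran.r-project.org/src/base/R-3/R-3.3.2.tar.gz
"

# fetch_sources is a hypothetical helper: it moves into SRC_DIR and
# downloads every URL in the list, stopping at the first failure.
fetch_sources() {
    cd "$SRC_DIR" || return 1
    for url in $SOURCE_URLS; do
        wget "$url" || return 1
    done
}
```

Running fetch_sources then leaves you inside ~/src with the tarballs sitting next to each other, ready for unpacking.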

Next up we will unzip all the files we just downloaded. Note: if you’ve got any other .tar.gz files in the src folder this will extract them too!
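A minimal sketch of the unpacking step, assuming every archive is a gzip-compressed tarball. The helper name unpack_all is my own invention for illustration.

```shell
# unpack_all is a hypothetical helper: extract every .tar.gz found in the
# directory given as the first argument. The body runs in a subshell
# (note the parentheses), so your own working directory is left alone.
unpack_all() (
    cd "$1" || exit 1
    for archive in *.tar.gz; do
        # If nothing matches, the pattern is left as-is; skip it
        [ -e "$archive" ] || continue
        tar -xzf "$archive" || exit 1
    done
)
```

Calling unpack_all "$HOME/src" should leave one folder per tarball, each named after the package and its version.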

Now we’re going to compile and install each of the dependencies. For the most part this involves entering each folder and running the configure script with the information about where we want to install things. That’s what the --prefix argument does. Once this is done we run make, which compiles the source code, and then make install, which copies everything into the place we specified.

The first three libraries follow exactly this pattern. For PCRE we have to specify one more option regarding UTF8 support (otherwise R will complain later) and for bzip2 the process is a bit more complex.
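Here is one way the whole sequence could be written down. It is a sketch rather than a tested recipe: build_with_configure and build_dependencies are hypothetical helper names, I’m assuming the three "standard" libraries are zlib, xz and curl, and the globs assume you have exactly one unpacked version of each package in ~/src. The --enable-utf8 flag is PCRE’s configure option for UTF-8 support. bzip2 ships a plain Makefile with no configure script, so the install location is passed to make install instead.

```shell
# Install location used throughout this guide
PREFIX="$HOME/usr"

# Hypothetical helper for the usual configure / make / make install
# sequence. The first argument is the source directory; any further
# arguments are passed straight on to ./configure.
build_with_configure() (
    cd "$1" || exit 1
    shift
    ./configure --prefix="$PREFIX" "$@" && make && make install
)

# Hypothetical wrapper that builds everything in turn and stops at the
# first failure. Directory names depend on the versions downloaded.
build_dependencies() {
    build_with_configure "$HOME/src"/zlib-*/ || return 1
    build_with_configure "$HOME/src"/xz-*/ || return 1
    build_with_configure "$HOME/src"/curl-*/ || return 1

    # PCRE: same pattern, plus UTF-8 support switched on for R
    build_with_configure "$HOME/src"/pcre-*/ --enable-utf8 || return 1

    # bzip2: no configure script, so PREFIX is given to make install
    ( cd "$HOME/src"/bzip2-*/ && make && make install PREFIX="$PREFIX" )
}
```

Running build_dependencies should populate ~/usr with bin, include and lib folders. If one of the builds fails, it’s worth running that package’s steps by hand so you can read the error output properly.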

Compiling R

Hopefully everything has installed correctly up to this point. Before we can compile R itself, we now have to do something a bit different: namely, tell our system where we’ve been installing these tools, so that it knows where to find them when we try to build R. We do this by setting some environment variables, which the configure script and then the compiler will use to find the various things we’ve just built.
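Something along these lines should do it. The PATH line is the one used throughout this guide; the exact choice of the other variables is my assumption, based on the usual convention that CPPFLAGS carries header locations and LDFLAGS carries library locations.

```shell
# Executables we just built (e.g. curl) are found before the system ones
export PATH="$HOME/usr/bin:$PATH"

# Shared libraries are found at run time; only add the trailing
# separator if LD_LIBRARY_PATH already had something in it
export LD_LIBRARY_PATH="$HOME/usr/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Headers and libraries are found at compile and link time
export CPPFLAGS="-I$HOME/usr/include"
export LDFLAGS="-L$HOME/usr/lib"
```

These settings only last for the current terminal session, which is all we need while building.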

Note: I wasn’t sure that adding things to $PATH was strictly necessary, but I think R checks the capabilities of curl using the executable, and so it will fail if this can’t be found.

Now we’re going to build R itself. This is essentially the same process as we carried out for all the other tools. We enter the R directory, run the configure script providing our install location, and then run make. There are many other options that can be supplied here, but for now I’m only interested in getting a minimal version of R working. Compiling R takes a fair bit longer than what we’ve done before, so now is the time to get a cup of coffee.
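As a sketch, the R build might look like this. The helper name build_r is hypothetical, the folder name R-3.3.2 comes from the tarball for the version built in this guide, and I’ve included make install on the assumption that you want the R executable to end up in ~/usr/bin alongside everything else.

```shell
# build_r is a hypothetical helper: configure, compile and install R
# into ~/usr. It runs in a subshell, so your working directory is
# unchanged, and it stops at the first step that fails.
build_r() (
    cd "$HOME/src/R-3.3.2" || exit 1
    ./configure --prefix="$HOME/usr" && make && make install
)
```

Run build_r in the same terminal where the environment variables were set. If configure stops because X11 or readline headers are missing, R’s configure script accepts --with-x=no and --with-readline=no to skip those features, which may be good enough for a minimal build.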

Final thoughts

Hopefully the compilation completed successfully, and you can now just type R in your terminal and it will launch. Unfortunately it probably won’t as soon as you open a new terminal. That’s because earlier we set the $PATH environment variable, but it only affected the terminal you were working in then. You can of course run export PATH=$HOME/usr/bin:$PATH every time, but it’s probably preferable to set it somewhere permanent. There’s an excellent post on StackExchange about how best to set $PATH.
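One possible way to make the change permanent is to append the export line to a shell startup file. Which file is the right one depends on your shell and how you log in, so treat ~/.bash_profile below as an assumption rather than a recommendation. The check before appending just stops the line being added twice if you run this more than once.

```shell
# Startup file to edit; ~/.bash_profile is an assumption, adjust to taste
PROFILE_FILE="$HOME/.bash_profile"

# Single quotes keep $HOME and $PATH unexpanded, so they are evaluated
# each time a new shell starts rather than frozen to today's values
PATH_LINE='export PATH="$HOME/usr/bin:$PATH"'

# Append the line only if an identical line is not already present
grep -qxF "$PATH_LINE" "$PROFILE_FILE" 2>/dev/null || echo "$PATH_LINE" >> "$PROFILE_FILE"
```

Open a new terminal (or log in again) afterwards and typing R should find the version we just built.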
