Overview

This page describes how to install software necessary for participating in the course "Statistical analysis of gene expression data with R and Bioconductor". You should arrive at the course with this software pre-installed.

The installation consists of three steps:

  1. Installing R.
  2. Installing add-on packages for R, including Bioconductor.
  3. Other issues.

1. Installing R

We will be using R version 2.9.1. Binaries for the major operating systems (Linux, Mac OS X and Windows) may be found at CRAN. Installing R should be straightforward on most platforms.
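Once R is installed, you can confirm from within R that you are running the expected version. The following is a minimal sketch; the variable names required and running are only illustrative.

```r
# Compare the running R version with the one the course expects.
required <- "2.9.1"                      # version named in the text above
running  <- as.character(getRversion())  # version of the R session you started

if (running == required) {
  message("R ", running, " found: OK for the course.")
} else {
  message("R ", running, " found, but the course uses R ", required, ".")
}
```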

Users running Windows Vista may have problems with permissions; please check item 2.24 in the R for Windows FAQ.

On some Linux distributions you may need to compile R from source. This requires a number of tools (X.org sources, GCC, gfortran, and others). We advise against compiling R from source on Mac OS X; use the binary instead.

2. Installing Bioconductor and additional packages

We need a large number of packages from Bioconductor and CRAN. To make these packages easy to install, we have provided a small script. For this to work, it is important that you are running R-2.9.1. Please read all of this section before starting. You have the best chance of success if you make a clean installation of R-2.9.1 and proceed as described below.

You will need internet access from your laptop, and the download is fairly big, so we advise doing it somewhere with a decent internet connection. Start R and type or copy/paste the following at the command prompt:

R> source("http://www.math.ku.dk/~richard/courses/bioconductor2009/copenhagen.R")

(Most likely you will see > and not R> as your prompt.)

If you have problems connecting R to the internet, consider the following:

What you do next depends on whether or not you already have parts of Bioconductor installed. We suggest that you first run

R> copenhagen("install.all")

This may take some (a lot of) time. This is a "better safe than sorry" solution where all the different packages including all their dependencies are installed. After this has completed, run

R> copenhagen("check")

to see that you have everything you need. If something is wrong, you may retry the complete installation step above or see below for how to proceed.

If you have a clean R-2.9.1 installation, you should also get the correct installation by just running

R> copenhagen("install")

which takes some steps to avoid reinstalling packages. Finally, if everything went well, load the required packages with

R> copenhagen("load")

On Thursday, Vince will use a couple of packages not included above. They are very large (around 800 MB and 500 MB) and are therefore not part of the standard installation. If possible, install them by running:

R> biocLite("BSgenome.Hsapiens.UCSC.hg18")

R> biocLite("pd.genomewidesnp.6")

What if something went wrong - and how do I install a few packages?

If some packages failed to install (possibly just because of download problems), you can install the remaining packages, without reinstalling everything that is already installed, by running

R> copenhagen("install")

This call checks which packages are installed and installs only those that are missing. Alternatively, individual packages can be installed by hand by first running

R> copenhagen("check")

to get a list of these packages and then do the following:

R> source("http://www.bioconductor.org/biocLite.R")

followed by

R> biocLite("PACKAGE-NAME")
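The retry logic described above amounts to comparing a list of wanted packages with what is already installed. The following is a minimal sketch of that idea, not the actual contents of the course script; the vector wanted is a hypothetical list that uses package names mentioned on this page, and notInstalled is an illustrative variable name.

```r
# Hypothetical list of packages to check; the course script's real list
# is not reproduced here.
wanted <- c("Rgraphviz", "classGraph")

# Names of all packages R can currently find in its library paths.
installed <- rownames(installed.packages())

# Packages that are wanted but not yet installed.
notInstalled <- setdiff(wanted, installed)

if (length(notInstalled) == 0) {
  message("All listed packages are installed.")
} else {
  message("Still missing: ", paste(notInstalled, collapse = ", "))
  # To install them, use the two commands shown above, for example:
  # source("http://www.bioconductor.org/biocLite.R")
  # biocLite(notInstalled)
}
```

The installation commands are left as comments so that running the sketch only reports what is missing and does not start a download.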

Installing packages in a non-standard location

Perhaps you want to install your packages in a non-standard location, either because of permission problems or because you want to be able to delete some of these packages after the course. The easiest way is to create a directory and then execute

R> .libPaths("myDir")

with myDir being the directory (note that R uses / as a directory separator even on Windows). You can type .libPaths() to see what the setting is; your custom directory should be at the top of the list.
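The steps above can be sketched as follows. The directory name is a hypothetical example, not a location the course requires; any directory you can write to will do.

```r
# Hypothetical location for the course packages; pick any directory
# you have write access to.
myDir <- file.path(path.expand("~"), "R", "course-library")

# Create the directory (and any missing parent directories) if needed.
dir.create(myDir, recursive = TRUE, showWarnings = FALSE)

# Put the directory at the top of the library search path.
.libPaths(myDir)

# The first entry printed should now be the custom directory.
print(.libPaths())
```

This setting applies to the running R session only, so it has to be repeated after restarting R and before installing or loading the course packages.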

Windows issues

Most packages are a breeze to install on Windows. A few R/Bioconductor packages do not work under Windows, but we will not be using those in this course.

Mac OS X issues

The easiest thing for Mac OS X users is to install binary versions of the various packages. Check whether R is set up for this by running

R> options("pkgType")

The answer should be mac.binary. If it is not, you can set it by running

R> options(pkgType = "mac.binary")
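The check and the setting can be combined so that the option is changed only when needed. This is a minimal sketch that assumes Sys.info() reports the system name "Darwin" on Mac OS X; on other systems it does nothing.

```r
# Only touch the setting on Mac OS X, where Sys.info() reports "Darwin".
if (Sys.info()[["sysname"]] == "Darwin") {
  # Switch to binary packages if R is not already set up for them.
  if (!identical(getOption("pkgType"), "mac.binary")) {
    options(pkgType = "mac.binary")
  }
  # Show the value now in effect.
  print(getOption("pkgType"))
}
```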

You will also need to install the binary version of Graphviz matching Rgraphviz from Bioconductor. You will find this program at R for Mac OS X (scroll to the bottom). It is a DMG image which is straightforward to install.

Note: If you really want to be able to compile packages from source, you will need the newest version of Xcode from Apple. This requires creating an account with their developer network. In addition, you might need a Fortran compiler, which you can get from the DMG image of R from CRAN (click "customize" during the installation session; it is easy to miss). You will also need the sources for Graphviz from graphviz.org (use an even-numbered version, preferably version 2.12 or 2.14).

Linux issues

None in particular.

3. Other issues

The package classGraph may be demonstrated but is not required. The interested might consider bowtie, including the yeast genome index, but again this is not required.

During the course you will enter a lot of R code. It will be useful to have a reasonable editor available. What you use depends on your past experience, and operating system. It can be as simple as Notepad for Windows. Popular editors include

but please make sure that you do not start using a completely new editor at the course: the aim is to learn R, and learning a new editor at the same time would be counterproductive.