Distributions in R

All standard distributions are available in R, and the associated functions follow a systematic naming convention. Each distribution has an R name; for example, the normal distribution has the R name norm. To compute the value of the density for the normal distribution, we use the R function dnorm. For other distributions we likewise put a d in front of the R name of the distribution, as in dgamma, dpois, dbeta, etc.

dnorm(0.5)
## [1] 0.3521
exp(-0.5^2/2)/sqrt(2 * pi)
## [1] 0.3521
curve(dnorm, -4, 4, ylab = "Density", main = "The density for the normal distribution")
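
The same d-prefix pattern can be checked for the other distributions mentioned above. The following is a minimal sketch: the evaluation points and parameter values are arbitrary choices made for illustration, and each result is compared with the textbook density formula written out by hand.

```r
# Gamma density with shape 2 (rate defaults to 1): f(x) = x * exp(-x)
g_builtin <- dgamma(1, shape = 2)
g_by_hand <- 1 * exp(-1)

# Poisson point probability with mean 2: f(k) = exp(-2) * 2^k / k!
p_builtin <- dpois(3, lambda = 2)
p_by_hand <- exp(-2) * 2^3 / factorial(3)

# Beta density with both shape parameters equal to 2: f(x) = 6 * x * (1 - x)
b_builtin <- dbeta(0.5, shape1 = 2, shape2 = 2)
b_by_hand <- 6 * 0.5 * (1 - 0.5)

# Each comparison should report TRUE up to numerical tolerance
all.equal(g_builtin, g_by_hand)
all.equal(p_builtin, p_by_hand)
all.equal(b_builtin, b_by_hand)
```

Note that dpois gives a point probability rather than a density in the continuous sense, since the Poisson distribution is discrete.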

We can also get the computer to generate samples from a given distribution. For the normal distribution this is done using the rnorm function. We generate 100 independent samples from the standard normal distribution and compare the data to the theoretical density using a so-called kernel density estimate. This can be thought of as a smoothed histogram, and is often preferable to a histogram.

x <- rnorm(100)
plot(density(x), xlim = c(-4, 4), ylim = c(0, 0.4), main = "Kernel density estimate")
rug(x)
curve(dnorm, -4, 4, add = TRUE, col = "red")
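
The comparison between the kernel density estimate and the theoretical density can also be made numerically. The sketch below is one possible way to do it, not part of the original example: the seed is an arbitrary value used only to make the random sample reproducible, and the variable names est, area and max_gap are introduced here for illustration.

```r
set.seed(1)            # arbitrary seed, only for reproducibility
x <- rnorm(100)        # 100 samples from the standard normal distribution

# density() returns the estimate on an equally spaced grid (est$x, est$y)
est <- density(x)

# Riemann-sum approximation of the area under the estimate;
# a density estimate should integrate to approximately 1
area <- sum(est$y) * diff(est$x[1:2])

# Largest absolute difference between the estimate and dnorm on the grid
max_gap <- max(abs(est$y - dnorm(est$x)))

area
max_gap
```

With only 100 samples the estimate will not match the theoretical density exactly, so max_gap is expected to be small but clearly nonzero.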

The R functions for the normal distribution take additional arguments, mean (the location parameter) and sd (the scale parameter), whose default values are 0 and 1, respectively.
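
As a small illustration of these arguments, the sketch below checks that dnorm with a given mean and sd agrees with the standard normal density after shifting and rescaling, and that rnorm accepts the same arguments. The values 1 and 2 for the parameters, the evaluation points, the sample size and the seed are arbitrary choices for this example.

```r
x <- c(-1, 0, 2.5)     # arbitrary evaluation points
m <- 1                 # location parameter
s <- 2                 # scale parameter

# Density with explicit location and scale
direct <- dnorm(x, mean = m, sd = s)

# Same density obtained from the standard normal by shifting and rescaling
rescaled <- dnorm((x - m) / s) / s

all.equal(direct, rescaled)

# rnorm takes the same arguments; the sample mean and standard deviation
# should be close to 10 and 2, respectively
set.seed(1)            # arbitrary seed, only for reproducibility
y <- rnorm(1000, mean = 10, sd = 2)
c(mean(y), sd(y))
```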

The Weibull distribution has the R name weibull, and we can generate Weibull-distributed samples and compare them to the density just as for the normal distribution. Note, however, that the Weibull distribution has a shape parameter, which is not given any default value, so it has to be specified.

x <- rweibull(100, shape = 10)
plot(density(x, from = 0), xlim = c(0, 1.5), ylim = c(0, 4), main = "Kernel density estimate")
rug(x)
curve(dweibull(x, shape = 10), 0, 1.5, add = TRUE, col = "red")
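
To make the role of the shape parameter concrete, the sketch below compares dweibull with the Weibull density formula written out by hand, assuming the scale parameter is left at its default value of 1, and shows what happens when shape is omitted. The evaluation points and the names by_hand, builtin and res are illustrative choices.

```r
x0 <- c(0.5, 0.9, 1.1) # arbitrary evaluation points
k <- 10                # shape parameter, as in the example above

# Weibull density with scale 1: f(x) = k * x^(k - 1) * exp(-x^k)
by_hand <- k * x0^(k - 1) * exp(-x0^k)
builtin <- dweibull(x0, shape = k)

all.equal(builtin, by_hand)

# Omitting shape is an error because the argument has no default value;
# tryCatch captures the error message instead of stopping the script
res <- tryCatch(rweibull(5), error = function(e) conditionMessage(e))
res
```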
