Distribution functions for a range of standard distributions in R are available by putting a p in front of the R-name. Thus pnorm is the distribution function for the normal distribution. First we compare the distribution function for the \( \Gamma \)-distribution with shape parameter \( \lambda \) (and the default rate parameter 1) with the distribution function for the normal distribution with the same mean (\( \lambda \)) and variance (also \( \lambda \)).
lambda <- 3
### Normal distribution function with mean and variance lambda
curve(pnorm(x, lambda, sqrt(lambda)), 0, 10,
      ylim = c(0, 1))
### Gamma distribution function with shape parameter lambda (and
### thus mean and variance lambda)
curve(pgamma(x, lambda), add = TRUE, col = "red")
It is not obvious from the plot of the distribution functions how the two distributions differ. The difference is clearer from their densities.
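Densities follow the same naming scheme with a d in front of the R-name (dnorm, dgamma). A minimal sketch of the density comparison could look as follows; the plotting range and the ylim value are choices made here for illustration only.

```r
lambda <- 3
### Normal density with mean and variance lambda
### (ylim is chosen large enough to hold the peak of the Gamma density)
curve(dnorm(x, lambda, sqrt(lambda)), -2, 10,
      ylim = c(0, 0.3), ylab = "density")
### Gamma density with shape parameter lambda (rate 1 by default)
curve(dgamma(x, lambda), add = TRUE, col = "red")
```

The Gamma density is zero on the negative half-axis, whereas the normal density puts some mass there.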
Quantiles are available using e.g. qnorm and qgamma.
### Normal quantile function with mean and variance lambda
curve(qnorm(x, lambda, sqrt(lambda)), 0, 1)
### Gamma quantile function with shape parameter lambda (and
### thus mean and variance lambda)
curve(qgamma(x, lambda), add = TRUE, col = "red")
What is usually more informative is to plot the quantiles against each other, which gives a QQ-plot.
q <- seq(0.01, 0.99, 0.01) ## Sequence of probabilities
plot(qnorm(q, lambda, sqrt(lambda)),
     qgamma(q, lambda), type = "l",
     xlim = c(-1, 9),
     ylim = c(-1, 9))
abline(0, 1, lty = 2) ## adding the line y = x
The large \( \Gamma \)-quantiles are larger than the corresponding large normal quantiles, meaning that the \( \Gamma \)-distribution has a heavier right tail than the normal distribution, that is, it is skewed to the right. The small \( \Gamma \)-quantiles are also larger than the small normal quantiles, meaning that the left tail of the \( \Gamma \)-distribution is more condensed than the left tail of the normal distribution.
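The same comparison can be made numerically. The following is a small sketch that tabulates a few quantiles of the two distributions side by side; the probabilities in p and the name quantiles are chosen here for illustration.

```r
lambda <- 3
### A few probabilities in the left tail, the middle and the right tail
p <- c(0.01, 0.05, 0.5, 0.95, 0.99)
quantiles <- cbind(p = p,
                   normal = qnorm(p, lambda, sqrt(lambda)),
                   gamma = qgamma(p, lambda))
print(round(quantiles, 2))
```

For these probabilities the \( \Gamma \)-quantiles exceed the normal quantiles in both tails, while at the median the ordering is reversed.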
The QQ-plot is mostly used with one distribution (whose quantiles are plotted on the second axis) being the empirical distribution. Then you can visually check whether the empirical distribution fits the theoretical distribution (whose quantiles are on the first axis).
x <- rnorm(30)
### Creating the empirical (cumulative) distribution function.
edf <- ecdf(x)
plot(edf, xlim = c(-3, 3))
curve(pnorm(x), add = TRUE, col = "red")
qqnorm(x)
abline(0, 1, lty = 2) ## This is an OK QQ-plot
Try increasing the number of simulated observations from 30 to 300 or 3000 to see what happens.
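One way to carry out this experiment is to draw the three QQ-plots side by side. This is only a sketch: the seed is arbitrary and is set merely to make the plots reproducible, and the panel layout is a choice made here.

```r
set.seed(1)  ## arbitrary seed, only for reproducibility
op <- par(mfrow = c(1, 3))  ## three plots in one row
for (n in c(30, 300, 3000)) {
  x <- rnorm(n)
  qqnorm(x, main = paste("n =", n))
  abline(0, 1, lty = 2)  ## the line y = x
}
par(op)  ## restore the previous graphical parameters
```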
If we simulate from the \( \Gamma \)-distribution instead, we get a QQ-plot against the normal distribution that reveals that something is wrong. With only 30 observations this is sometimes difficult to see, but with 100 it is clear.
x <- rgamma(100, lambda)
qqnorm(x)
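### Reference line with intercept lambda (the mean) and slope
### sqrt(lambda) (the standard deviation)
abline(lambda, sqrt(lambda), lty = 2)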
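To see what the QQ-plot against the normal distribution consists of, the plot can be built by hand: the sorted observations are plotted against standard normal quantiles evaluated at the probabilities returned by ppoints. The sketch below assumes an arbitrary seed and uses the illustrative names theoretical and empirical. If the data were normal with mean \( \lambda \) and variance \( \lambda \), the points would lie close to the line with intercept \( \lambda \) and slope \( \sqrt{\lambda} \).

```r
lambda <- 3
set.seed(1)  ## arbitrary seed, only for reproducibility
x <- rgamma(100, lambda)
n <- length(x)
### Standard normal quantiles at the plotting positions
theoretical <- qnorm(ppoints(n))
### Empirical quantiles are the sorted observations
empirical <- sort(x)
plot(theoretical, empirical,
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
abline(lambda, sqrt(lambda), lty = 2)
```

The curvature of the points relative to the dashed line reflects the skewness of the \( \Gamma \)-distribution.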