Week 40 - comments
Probability Theory 1 and Measure and Integration
Theory
Quantiles
I was not very satisfied with the lecture on the distribution function, and in
particular with the results on quantiles. The chapter contains a number of
technical results related to quantiles, transformations and choices of quantile
functions. The technical difficulties boil down to the fact that the
distribution function has, in general, no inverse. I tried to partially solve
this by first focusing on the generalized inverse and then on the possibility
of choosing other quantile functions. This did not work as well as I had
expected -- though we did manage to prove the Lebesgue-Stieltjes Theorem (17.4).
One thing that did not come out very clearly was why we need to care about the
lack of an inverse in general. Is that not just an exotic mathematical
problem? No! The empirical distribution function is a very important
distribution function which is neither surjective nor injective, and we need to
deal with quantiles for this distribution function too.
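To make the generalized inverse concrete, here is a minimal sketch (with a small illustrative sample of my own choosing) of F^-(p) = inf{x : F(x) >= p} applied to an empirical distribution function, which is a step function and hence neither injective nor surjective:

```python
import numpy as np

def empirical_quantile(sample, p):
    """Generalized inverse of the empirical distribution function at p."""
    x = np.sort(np.asarray(sample))
    n = len(x)
    # F(x_(k)) = k/n at the k'th order statistic, so F^-(p) is the
    # smallest order statistic x_(k) with k/n >= p.
    k = int(np.ceil(p * n))
    return x[k - 1]

sample = [3.1, 0.5, 2.2, 1.7]
print(empirical_quantile(sample, 0.5))   # 1.7
print(empirical_quantile(sample, 0.75))  # 2.2
```

Even though no x solves F(x) = 0.6 exactly for this sample, the generalized inverse still returns a well-defined quantile, which is precisely why it is the right tool here.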
The take-home message is that despite some technical issues related to the
definition of what a quantile is, the practical use of quantiles presents no
problems, and they are used routinely to compare distributions, for instance
via QQ-plots, where we check whether the points lie on a straight line.
In the lecture I sketched a QQ-plot with an exaggerated deviation from a
straight line. Instead of listening to the complaints from the audience I
insisted that a real QQ-plot could look like that. But of course I overdid it.
There is no way that the curve can become decreasing, since both coordinates of
the plotted points are quantiles and hence sorted in increasing order.
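A minimal sketch of why the QQ-plot can never turn downwards (the sample size and the use of simulated standard normal data are my own choices): both coordinate sequences of the plotted points are sorted, so the curve is non-decreasing for any sample whatsoever.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n = 50
sample = rng.normal(size=n)

# Empirical quantiles: the order statistics of the sample.
emp_q = np.sort(sample)
# Theoretical quantiles of the standard normal at the plotting positions.
theo_q = np.array([NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)])

# Both coordinate sequences are non-decreasing by construction, so the
# plotted point sequence can never be decreasing.
assert np.all(np.diff(emp_q) >= 0)
assert np.all(np.diff(theo_q) >= 0)
```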
Multivariate distributions
The central point here is that when considering more than one stochastic
variable we need to know the joint distribution of the variables, which is a
probability measure on a product space. Formally the measure is obtained from
the transformation of the background measure, but when specifying models in
practice we write for instance "Let X1,...,Xn be n real
valued random variables with distribution mu", which means that we specify the
joint distribution as mu.
One way to make the specification is to specify the marginal distributions of
each of the random variables and then say that they are independent. This means
that the joint distribution is the product measure given by the marginal
distributions. This reveals one of the fundamental links between probability
theory and abstract measure theory. Without the development of the product
measure machinery it would be impossible to deal with the notion of
independence in a satisfying way.
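A small simulation sketch of what the product measure says in practice (the marginal distributions, sample size and thresholds below are my own illustrative choices): for independent X and Y, the joint probability P(X <= a, Y <= b) factorizes as P(X <= a) * P(Y <= b).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Independent draws: X exponential, Y standard normal.
x = rng.exponential(size=n)
y = rng.normal(size=n)

a, b = 1.0, 0.0
joint = np.mean((x <= a) & (y <= b))          # empirical P(X <= a, Y <= b)
product = np.mean(x <= a) * np.mean(y <= b)   # product of the marginals

# The two agree up to Monte Carlo error, reflecting that the joint
# distribution is the product of the marginal distributions.
print(joint, product)
```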
The regular multivariate normal distribution is the best example of, and the
most important model for, multivariate observations where we can specify
independence as well as dependence. The normal distribution has a very well developed mathematical
theory, and you will meet it again and again in future courses in statistics as
well as in a range of applications. In this course it is worth noting that it
has nice transformation properties. Transforming a regular normal distribution by an
affine, surjective map gives a regular normal distribution. There are explicit
formulas for how the parameters transform in Lemma 18.26 and Corollary
18.29. An interesting consequence, that you should notice, is that if the joint
distribution of (X1,X2) is a regular normal distribution
then the distribution of the sum X1 + X2 is a regular
normal distribution. The parameters are given by (18.24) but note, there is a
misprint. Erase either the "2" or the "Sigma21".