Poisson GLM and existence of MLE

The exitence of the MLE for the Poisson likelihood is determined by how and if the data set contains zero-counts. If there are no zero-counts, the MLE exists.

If there are zero-counts, the existence of the MLE is characterized as follows. The \( n \) predictors \( \tilde{X}_1, \ldots, \tilde{X}_n \) span a convex cone in \( \mathbb{R}^{p} \) with
vertex in 0. The MLE exists if and only if \[ \tilde{t} = \sum_{i=1}^n Y_i \tilde{X}_i \] is in the interior of the cone, where the \( Y_i \)'s are the responses. If there is an intercept in the model (which there usually is) and \( \tilde{X}_n = (1, X_n)^T \in \mathbb{R}^{p+1} \) for \( X_n \in \mathbb{R}^p \) it can be shown that \( \tilde{t} \) is in the interior of the cone if and only if \[ t = \frac{1}{\sum_{i=1}^n Y_i} \sum_{i=1}^n Y_i X_i \] is in the interior of the convex hull of \( X_1, \ldots, X_n \) (the smallest convex set spanned by the predictors). This is easy to visualize for \( p = 1, 2 \).

We generate a small toy example with \( p = 2 \) and \( n = 4 \). First we construct the predictors and visualize their convex hull.

library(ggplot2)
X <- data.frame(x1 = c(-2, -1, 2, 0), x2 = c(1, -1, 0, 2))
p <- qplot(x1, x2, data = X, geom = "polygon", alpha = I(0.5)) + geom_point(size = 5, 
    alpha = 1)
p

plot of chunk unnamed-chunk-1

Then we consider an example where the average \( t \) ends up in the interior of the convex hull, and we fit the Poisson model using glm.

y <- c(1, 2, 1, 0)
Xy <- cbind(y, X)
t <- c(y %*% X$x1, y %*% X$x2)/sum(y)
p + geom_point(aes(t[1], t[2]), size = 5, color = "blue")

plot of chunk unnamed-chunk-2

testGlm <- glm(y ~ x1 + x2, family = poisson, data = Xy)
summary(testGlm)
## 
## Call:
## glm(formula = y ~ x1 + x2, family = poisson, data = Xy)
## 
## Deviance Residuals: 
##      1       2       3       4  
##  0.358  -0.162   0.213  -0.745  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.0267     0.5461    0.05     0.96
## x1           -0.1237     0.3742   -0.33     0.74
## x2           -0.6550     0.5157   -1.27     0.20
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 2.77259  on 3  degrees of freedom
## Residual deviance: 0.75402  on 1  degrees of freedom
## AIC: 13.37
## 
## Number of Fisher Scoring iterations: 5

Next we consider an example where the average \( t \) ends up on the boundary of the convex hull. We can fit a Poisson model using glm, and we won't get any warnings that the MLE does not exist.

y <- c(1, 2, 0, 0)
Xy <- cbind(y, X)
t <- c(y %*% X$x1, y %*% X$x2)/sum(y)
p + geom_point(aes(t[1], t[2]), size = 5, color = "blue")

plot of chunk unnamed-chunk-3

testGlm <- glm(y ~ x1 + x2, family = poisson, data = Xy)
summary(testGlm)
## 
## Call:
## glm(formula = y ~ x1 + x2, family = poisson, data = Xy)
## 
## Deviance Residuals: 
##         1          2          3          4  
##  0.00e+00   0.00e+00  -3.52e-07  -2.01e-05  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   -12.84   25534.58       0        1
## x1             -8.79   17023.05       0        1
## x2             -4.74    8511.53       0        1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4.4987e+00  on 3  degrees of freedom
## Residual deviance: 4.0610e-10  on 1  degrees of freedom
## AIC: 10.61
## 
## Number of Fisher Scoring iterations: 21

There are several signs in the summary output that the MLE does not exist. First, the residual deviance is almost driven to 0, which is a warning sign. Second, the estimated standard errors are huge. Moreover, the 21 iterations is actually a lot. Typically, the IWLS algorithm will converge in very few iterations.