--- title: Bias reduction using Firth's penalized likelihood technique output: rmarkdown::html_vignette vignette: > % \VignetteIndexEntry{Bias reduction using Firth's penalized likelihood technique} % \VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} bibliography: mclogit.bib --- Firth's penalized likelihood technique [@Firth.1993:Biasreduction] has two advantages in comparison to the conventional maximum likelihood estimator: 1. It has a smaller asymptotic bias than the MLE, i.e. it converges "faster" to its limit as the sample size approaches infinity: In somewhat imprecise terms, its bias is $O(n^{-2})$ instead of $O(n^{-1})$, where $n$ is the sample size. 2. In case of logistic regression and other models for categorical responses, it tends to be less afflicted by the problem of complete separation or quasi-complete separation. That is, Firth's penalized likelihood technique gives finite coefficient estimates in cases where a (finite) MLE for the coefficients does not exist. # Application of the technique to conditional and baseline logit models In the In case of the models supported by the *mclogit* package, the difference between Firth's PML technique and conventional ML estimation is that it solves for each coefficient $\alpha_q$ the modified gradient equation $$ 0 = U^*(\alpha_q) = \frac{\partial\ell}{\partial\alpha_q} + \frac12 \mathrm{tr}\left( (\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X})^{-1} \boldsymbol{X}' \frac{\partial\boldsymbol{W}}{\partial\alpha_q} \boldsymbol{X} \right) $$ instead of $$ 0 = U(\alpha_q) = \frac{\partial\ell}{\partial\alpha_q} $$ for the MLE, where $\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}$ is the information matrix for the full coefficient vector. This technique is equivalent to maximizing the penalized (log-)likelihood $$ \mathcal{L}^*(\boldsymbol{\alpha}) = \mathcal{L}(\boldsymbol{\alpha}) \frac12 \det(\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}) \quad\text{or}\quad \ell^*(\boldsymbol{\alpha}) = \ell(\boldsymbol{\alpha}) + \frac12 \ln\det(\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}). $$ Like the conventional MLE, the *mclogit* package uses an IWLS-algorithm, however, using a modified working response vector with elements $$ y_{ij}^* = \boldsymbol{x}_{ij}'\boldsymbol{\alpha} + \frac{y_{ij}-\pi_{ij}}{\pi_{ij}} + \frac12 \sum_r \sum_s \zeta_{ij,rs} I^{(r,s)} $$ where $I^{(r,s)}$ is the $r,s$ element of $(\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X})^{-1}$ and $$ \begin{aligned} \zeta_{ij,rs} &= (x_{ijr}-\sum_kx_{ikr}\pi_{ik})(x_{ijs}-\sum_l\pi_{il}x_{ils}) - \sum_k\pi_{ik}(x_{ikr}-\sum_lx_{ilr}\pi_{il})(x_{iks}-\sum_l\pi_{il}x_{ils}) \\ &= \Delta_{i,j,\boldsymbol{\pi}_i} (\Delta_{i,j,\boldsymbol{\pi}_i}x_{ijr} \Delta_{i,j,\boldsymbol{\pi}_i}x_{ijs}) \end{aligned} $$ where $\Delta_{i,j,\boldsymbol{\pi}_i}a_{ij}=a_{ij} - \sum_k\pi_{ik}a_{ik}.$ # A comparison of results obtained with other packages For a multinomial baseline logit model for a two-category response, `mblogit(..., Firth = TRUE)` gives an identical result with function `brglm()` from Ioannis Kosmidis' package *brglm* [@kosmidis:brglm]: ```{r, echo = TRUE} if(require("brglm", quietly = TRUE)) { suppressMessages(library(mclogit)) data(lizards) lizards.brglm <- brglm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data=lizards, method = "brglm.fit") print(summary(lizards.brglm)) lizards.mblogit_F <- mblogit(cbind(opalinus, grahami) ~ height + diameter + light + time, data = lizards, Firth = TRUE) print(summary(lizards.mblogit_F)) } ``` With multicategorical responses `mblogit(..., Firth = TRUE)` gives an identical result with function `brmultinom()` from Ioannis Kosmidis' package *brglm2* [@kosmidis:brglm2] (called with `type = "AS_mean"`): ```{r, echo = TRUE} library(MASS) #For the housing data print(house.mblogit <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, Firth = TRUE)) if(require("brglm2", quietly = TRUE)){ print(brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, ref = 1, type = "AS_mean")) } ``` It should be noted, however, that the *mclogit* package makes Firth's bias correction also available for multinomial conditional logit models, in contrast to *brglm2*. For conditional logit models with Firth's bias correction, one can use a function call like `mclogit(..., Firth = TRUE)`. # References