% -*- mode:context -*-
% Time-stamp: <2008-04-15 22:05:14 chl>
% \usemodule[bib]
% \usemodule[bibltx]
% \setupbibtex[database=001]
\setuppapersize[A4][A4]
\setupindenting[none]
\setupwhitespace[medium]
\setupbodyfont[modern,11pt]
\setuptyping[bodyfont=small]
\setupcolors[state=start]
\setupinteraction[state=start,color=green,style=bold]
\defineframedtext
[admonition]
[width=fit,frame=on,align=middle,
location=middle,corner=round,background=screen]
\useURL[article-src][http://www.aliquote.org/memos/2008/04/15/interaction-terms-in-non-linear-models][][aliquote.org/memos/interaction-terms-in-non-linear-models]
\useURL[article-cit][http://www.unc.edu/~enorton/AiNorton.pdf][][pdf version]
\starttext
\midaligned{\bfb Interaction terms in non-linear models}
\blank[2*big]
\startadmonition
This article was first published on {\tt www.aliquote.org} (April, 2008).
\stopadmonition
\blank[2*big]
This discussion is primarily based on the following article
(\from[article-cit], but see also [1] and [2]):
\startquotation
Ai, C. and Norton, E.C. (2003). Interaction terms in logit and probit
models. {\em Economics Letters}, {\bf 80}: 123--129.

The magnitude of the interaction effect in nonlinear models does not equal the
marginal effect of the interaction term, can be of opposite sign, and its
statistical significance is not calculated by standard software. We present
the correct way to estimate the magnitude and standard errors of the
interaction effect in nonlinear models.
\stopquotation
The main idea of this article is that interaction terms in a GLM are often
both tested and interpreted in the wrong way. Instead of interpreting the true
interaction effect, discussion often relies on the marginal effect of the
interaction term. What is a marginal effect? In this context, it means that an
interaction between two factors should be considered together with the main
effects marginal to that interaction, e.g. [6,8]. Using the notation of the
authors, the interaction effect is the cross derivative of the expected value
of $y$, where $\phi(\cdot)$ denotes this expected value as a function of the
linear predictor $\beta_1x_1+\beta_2x_2+\beta_{12}x_1x_2+\dots$:
\startformula
\frac{\partial^2 \phi(\cdot)}{\partial x_1 \partial x_2} = \underbrace{\beta_{12}\phi'(\cdot)}_{\text{marginal effect}}\kern-1em+(\beta_1+\beta_{12}x_2)(\beta_2+\beta_{12}x_1)\phi''(\cdot)
\stopformula
If $\beta_{12}=0$, we have
\startformula
\left.\frac{\partial^2 \phi(\cdot)}{\partial x_1 \partial x_2}\right|_{\beta_{12}=0}=\beta_1\beta_2\phi''(\cdot)
\stopformula
In addition, the authors highlight the fact that the interaction effect may be
non-zero even if $\beta_{12}=0$. Further, the interaction effect is conditional
on the independent variables. Finally, the sign of $\beta_{12}$ does not
necessarily indicate the sign of the interaction effect, which may have
different signs for different values of the covariates. Altogether, these
remarks should encourage the practitioner to examine the model fit carefully
before drawing conclusions about, or interpreting, the interaction effect.
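
For the logit link, the expressions above can be made explicit. Writing
$\Lambda(u)=1/(1+e^{-u})$ for the logistic function (a symbol introduced here
for illustration only, playing the role of $\phi$ above), its first two
derivatives are
\startformula
\Lambda'(u)=\Lambda(u)\bigl(1-\Lambda(u)\bigr),\qquad
\Lambda''(u)=\Lambda(u)\bigl(1-\Lambda(u)\bigr)\bigl(1-2\Lambda(u)\bigr)
\stopformula
so that, when $\beta_{12}=0$,
\startformula
\left.\frac{\partial^2 \Lambda(\cdot)}{\partial x_1 \partial x_2}\right|_{\beta_{12}=0}=\beta_1\beta_2\,\Lambda(\cdot)\bigl(1-\Lambda(\cdot)\bigr)\bigl(1-2\Lambda(\cdot)\bigr)
\stopformula
Provided $\beta_1\beta_2\neq 0$, this quantity vanishes only where the
predicted probability equals $1/2$ and takes opposite signs on either side of
that value. The interaction effect can thus be non-zero without any product
term in the model, and its sign depends on the values of the covariates.
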
What about the computation of the odds-ratio associated with the main effects
when the interaction is significant? For instance, consider a logistic
framework with two main effects, say $\beta_1$ and $\beta_2$, and a third
coefficient $\beta_{12}$ that represents the interaction. For the purpose of
illustration, we can easily imagine two such factors, e.g. smoking status
(smoker or not) and drinking status (occasional vs. regular drinker). The
response variable might be any medical outcome of potential interest (cancer,
malignant affection, etc.), coded as a binary (0/1) variable. If the
interaction is significant, then talking about {\em the} odds-ratio (OR)
associated with the smoking variable makes no sense. Instead, one must
describe two ORs: the OR for smoking among occasional drinkers,
$\exp(\beta_1)$, and the OR for smoking among regular drinkers,
$\exp(\beta_1+\beta_{12})$.
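
As a minimal sketch of this computation, the following R code simulates a
hypothetical data set, fits the logistic model with interaction, and derives
the two ORs for smoking from the estimated coefficients. The variable names
\type{smoker}, \type{regular} and \type{outcome}, the sample size and the true
coefficients are all made up for the illustration.

\starttyping
set.seed(101)                     # simulated data, for illustration only
n <- 2000
smoker  <- rbinom(n, 1, 0.4)      # 1 = smoker
regular <- rbinom(n, 1, 0.3)      # 1 = regular drinker, 0 = occasional
# hypothetical true model on the logit scale
eta <- -2 + 0.5*smoker + 0.3*regular + 0.8*smoker*regular
outcome <- rbinom(n, 1, plogis(eta))
fit <- glm(outcome ~ smoker*regular, family=binomial)
b <- coef(fit)
# OR for smoking among occasional drinkers: exp(beta_1)
or.occasional <- exp(b[["smoker"]])
# OR for smoking among regular drinkers: exp(beta_1 + beta_12)
or.regular <- exp(b[["smoker"]] + b[["smoker:regular"]])
c(occasional=or.occasional, regular=or.regular)
\stoptyping

With two binary factors this model is saturated, so the two ORs should agree
with those computed from the 2 $\times$ 2 tables of smoking by outcome within
each drinking category.
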
\useURL[Stata][http://www.stata.com/][][Stata]
\useURL[R][http://www.cran.r-project.org][][R]
The two authors provide the necessary \from[Stata] code on their
homepage. However, I would like to illustrate the issues raised by the
interpretation of interaction terms in non-linear models using \from[R].

The \type{effects} package makes it easier to display effects graphically, and
we will use it in the short application proposed in the next few paragraphs.
\subject{Application}
\useURL[Arrests]
[http://goliath.ecnext.com/coms2/summary_0199-3319989_ITM]
[]
[brief report]
Let's consider the \type{Arrests} data, which is also used in [5] (see
also this \from[Arrests], and [9], Chap. II.2, p. 57). The aim is to study the
probability of release of individuals arrested in Toronto for simple
possession of small quantities of marijuana. Characteristics of interest are
the subjects' race, age, employment status, citizenship, and previous records
in police databases.
\starttyping
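library(effects)
data(Arrests)
# one stacked barplot of release status per categorical predictor
opar <- par(mfrow=c(2,2),las=1)
quali.var <- c(2,5,6,7)  # colour, sex, employed, citizen
for (i in quali.var)
  barplot(table(Arrests$released,Arrests[,i]),col=c(2,4),
          ylim=c(0,4000),xlab=colnames(Arrests)[i])
# bars are stacked in the order of levels(Arrests$released), so the
# legend labels are taken from the factor levels to match the colours
legend("topleft",levels(Arrests$released),pch=15,col=c(2,4),bty="n",
       title="Released")
par(opar)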
\placefigure
[here,force]
[fig:arrests1]
{none}
{\externalfigure[arrests][width=.5\textwidth]}
The above figure can hardly be interpreted as it stands because we need to
consider both marginal (not shown) and conditional (these plots) distributions
at the same time. However, we can fit a model that is reduced compared to the
one used in [5], including colour, age and sex, as well as the colour
$\times$ age interaction. This is done as follows:
\starttyping
arrests.glm <- glm(released ~ colour + age + sex + colour:age,
family=binomial,data=Arrests)
summary(arrests.glm)
\stoptyping
and here is the resulting output:
\placetable[here][tab:1]{none}{
\starttable[|r|r|r|r|r|r|]
\HL
\NC Effects \NC Estimate \NC Std. Error \NC z value \NC Pr(>|z|) \NC \SR
\HL
\NC (Intercept) \NC 0.853219 \NC 0.241020 \NC 3.540 \NC 0.0004 \NC *** \MR
\NC colourWhite \NC 1.645338 \NC 0.241690 \NC 6.808 \NC 9.92e-12 \NC *** \MR
\NC age \NC 0.014389 \NC 0.007822 \NC 1.839 \NC 0.0658 \NC . \MR
\NC sexMale \NC -0.161870 \NC 0.142805 \NC -1.133 \NC 0.2570 \NC \MR
\NC colourWhite:age \NC -0.037299 \NC 0.009362 \NC -3.984 \NC 6.78e-05 \NC *** \SR
\HL
\stoptable}
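
To read the interaction off this table, the age slope can be computed
separately for each colour group on the logit scale. The sketch below copies
the estimates from the table by hand; the vector \type{b} and its element
names are introduced here only for illustration, and \type{coef(arrests.glm)}
would supply the same estimates directly.

\starttyping
# estimates copied from the table above
b <- c(intercept=0.853219, white=1.645338, age=0.014389,
       male=-0.161870, white.age=-0.037299)
# age slope (log-odds per year) in each colour group
slope.black <- b[["age"]]
slope.white <- b[["age"]] + b[["white.age"]]
# corresponding odds-ratios for a one-year increase in age
exp(c(Black=slope.black, White=slope.white))
# White vs. Black difference in log-odds at a given age
diff.logit <- function(age) b[["white"]] + b[["white.age"]]*age
# age at which this difference changes sign
-b[["white"]]/b[["white.age"]]
\stoptyping

Under this fit, the odds of release increase slightly with age for Black
arrestees (OR of about 1.01 per year) and decrease for White arrestees (OR of
about 0.98 per year), and the White vs. Black difference in log-odds changes
sign around age 44.
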
At first glance, the model seems quite satisfactory and no departure from the
standard assumptions is apparent (see the next figure).
\placefigure
[here,force]
[fig:arrests2]
{none}
{\externalfigure[Arrests_glm][width=.5\textwidth]}
Now, as proposed in [5], we can get an ANOVA-like summary by issuing
\starttyping
library(car)
Anova(arrests.glm)
\stoptyping
The results are shown below:
\starttyping
Anova Table (Type II tests)
Response: released
LR Chisq Df Pr(>Chisq)
colour 81.854 1 < 2.2e-16 ***
age 5.947 1 0.01474 *
sex 1.325 1 0.24962
colour:age 16.479 1 4.918e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
\stoptyping
Next, we can propose the following effect display for the colour $\times$ age
interaction.
\placefigure
[here,force]
[fig:arrests3]
{none}
{\externalfigure[Arrests_glm2][width=.5\textwidth]}
Note that the vertical axis is labelled on the probability scale (i.e. the
response scale) while the estimated effects are plotted on the scale of the
linear predictor. The 95\% (pointwise) confidence interval is wider at the
extreme values of the age variable.
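
A display of this kind might be obtained along the following lines. This is
only a sketch: it assumes a version of the \type{effects} package in which
\type{effect()} takes the quoted term name as its first argument and the
fitted model as its second, and the object name \type{arrests.eff} is my own.

\starttyping
library(effects)
data(Arrests)
arrests.glm <- glm(released ~ colour + age + sex + colour:age,
                   family=binomial,data=Arrests)
# fitted values for the colour x age term, the remaining predictor
# (sex) being held at a typical value
arrests.eff <- effect("colour:age", arrests.glm)
plot(arrests.eff)
\stoptyping
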
Now, we can proceed to the estimation of the interaction effect using the
method proposed by Ai \& Norton.
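
As a rough sketch of that computation for the present model, where one
variable (age) is continuous and the other (colour) is a dummy, the
interaction effect can be taken as the difference between the two colour
groups of the derivative of the predicted probability with respect to age,
following [2]. The code below uses the estimates from the table above, copied
by hand into a vector \type{b} whose element names are my own, and evaluates
the effect for male arrestees over an arbitrary grid of ages. Standard errors,
which would require the delta method and the full covariance matrix of the
estimates, are not computed.

\starttyping
# estimates copied from the table above
b <- c(intercept=0.853219, white=1.645338, age=0.014389,
       male=-0.161870, white.age=-0.037299)
age <- 15:65                       # arbitrary grid of ages
# linear predictors for male arrestees, by colour
u.black <- b[["intercept"]] + b[["male"]] + b[["age"]]*age
u.white <- u.black + b[["white"]] + b[["white.age"]]*age
# derivative of the predicted probability with respect to age;
# dlogis(u) equals plogis(u)*(1-plogis(u))
d.black <- b[["age"]]*dlogis(u.black)
d.white <- (b[["age"]] + b[["white.age"]])*dlogis(u.white)
# interaction effect: difference of the two derivatives
ie <- d.white - d.black
# product-term coefficient times the logistic density, evaluated here
# at the White linear predictor, for comparison
naive <- b[["white.age"]]*dlogis(u.white)
range(ie); range(naive)
\stoptyping

With these estimates the interaction effect is negative over the whole grid,
since the age slope is positive for Black arrestees and negative for White
arrestees, but its magnitude varies with age.
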
% \completepublications
\subject{References}
\startcolumns[n=2,rule=on,tolerance=verytolerant]
\startitemize[m]
\useURL[1][http://citeseer.ist.psu.edu/ai00interaction.html][][www]
\item Ai, C. and Norton, E.C. (2000). {\em Interaction terms in nonlinear
models}. Unpublished draft manuscript. [\from[1]]
\useURL[2][http://www.unc.edu/~enorton/NortonWangAi.pdf][][pdf]
\item Norton, E.C., Wang, H., and Ai, C. (2004). Computing interaction effects
and standard errors in logit and probit models. {\em The Stata Journal}, {\bf
4(2)}: 154-167. [\from[2]]
\useURL[3][http://www.business.uiuc.edu/Working_Papers/papers/03-0100.pdf][][pdf]
\item Hoetker, G. (2003). Confounded coefficients: Accurately comparing logit
and probit coefficients across groups. {\em College of Business Working
Papers}, University of Illinois. [\from[3]]
\useURL[4][http://polmeth.wustl.edu/retrieve.php?id=692][][pdf]
\item Berry, W.D., Esarey, J., and Rubin, J.H. (2007). Testing for interaction
in binary logit and probit models: Is a product term essential? {\em Working
Papers of the Society for Political Methodology}. [\from[4]]
\useURL[5][http://www.jstatsoft.org/v08/i15/paper][][pdf]
\item Fox, J. (2003). Effect displays in R for generalized linear models. {\em
Journal of Statistical Software}, {\bf 8(15)}, 18 pp. [\from[5]]
\item Fox, J. (1987). Effect displays for generalized linear models. In Clogg,
C.C. (Ed.), {\em Sociological Methodology 1987}, pp. 347-361. American
Sociological Association, Washington DC.
\useURL[6][http://www.gllamm.org/gllamerr.pdf][][pdf]
\item Rabe-Hesketh, S., Skrondal, A., and Pickles, A. (2001). Maximum likelihood
estimation of generalized linear models with covariate measurement
errors. {\em The Stata Journal}, {\bf 1(1)}, 26 pp. [\from[6]]
\item Bouyer, J., Hémon, D., Cordier, S., Derriennic, F., Stücker, I., Stengel,
B., and Clavel, J. (1995). {\em Épidémiologie, Principes et méthodes
quantitatives}. Éditions INSERM.
\useURL[7][http://www.springer.com/statistics/computational/book/978-3-540-33036-3][][www]
\item Chen, C.-H., Härdle, W., and Unwin, A. (Eds.) (2008). {\em Handbook of
Data Visualization}. Springer Verlag. [\from[7]]
\useURL[8][http://www.jstatsoft.org/v08/i01/paper][][pdf]
\item Tomz, M., Wittenberg, J., and King, G. (2003). Clarify: Software for
interpreting and presenting statistical results. {\em Journal of Statistical
Software}, {\bf 8(1)}, 29 pp. [\from[8]]
\stopitemize
\stopcolumns
\stoptext