Here are a few papers that I read over the past months, in the CS and Stat category as usual.

For everything related to marginal effects, I usually trust Stata's margins command. By the way, there is a book by Michael N. Mitchell dedicated to this very useful command. The present article deals with estimating marginal effects in non-linear models, which usually involves derivatives, as discussed in an old post. Here the authors argue in favor of forward differences, hence the name “forward marginal effects.”
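To make the contrast concrete, here is a minimal sketch for a one-predictor logistic model. It is not the authors' implementation: the coefficients `b0` and `b1`, the step size `h` and the function names are all hypothetical, chosen only to compare the derivative-based effect with a forward difference in predicted probability.

```python
import math


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def derivative_me(x, b0, b1):
    """Derivative-based marginal effect: dP/dx = b1 * p * (1 - p)."""
    p = logistic(b0 + b1 * x)
    return b1 * p * (1.0 - p)


def forward_me(x, b0, b1, h=1.0):
    """Forward difference: change in predicted probability from x to x + h."""
    return logistic(b0 + b1 * (x + h)) - logistic(b0 + b1 * x)


# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.8
for x in (0.0, 1.25, 4.0):
    print(x, round(derivative_me(x, b0, b1), 4), round(forward_me(x, b0, b1), 4))
```

The two quantities agree as `h` shrinks, but for a unit step they can differ noticeably wherever the response curve bends, and only the forward difference reads directly as "the change in predicted probability for a one-unit increase."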

I once gave a brief overview of recommended techniques for the cross tabulation of two categorical variables. This article goes a lot deeper by discussing agreement measures in 2x2 tables, including the cases of extreme tables where there’s a high level of agreement or a large imbalance between cells. The authors suggest that Holley and Guilford’s G and Gwet’s AC1 are the best candidates, among many other agreement measures (Cohen’s kappa, Pearson’s r, Yule’s Q and Y, Scott’s $\pi$, Shankar and Bangdiwala’s B, Dice’s F1 and McNemar’s $\chi^2$).
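As a rough illustration of why extreme tables matter, the sketch below computes three of these measures from the textbook formulas for a 2x2 table with cells `a`, `b` (first row) and `c`, `d` (second row). The function name and the example counts are mine, not taken from the article.

```python
def agreement_2x2(a, b, c, d):
    """Return (G, kappa, AC1) for a 2x2 agreement table [[a, b], [c, d]]."""
    n = a + b + c + d
    po = (a + d) / n  # observed agreement

    # Holley and Guilford's G: chance agreement fixed at 1/2.
    g = 2.0 * po - 1.0

    # Cohen's kappa: chance agreement from the product of the margins.
    pe_kappa = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (po - pe_kappa) / (1.0 - pe_kappa)

    # Gwet's AC1: chance agreement from the average marginal proportion.
    pi = ((a + b) / n + (a + c) / n) / 2.0
    pe_ac1 = 2.0 * pi * (1.0 - pi)
    ac1 = (po - pe_ac1) / (1.0 - pe_ac1)

    return g, kappa, ac1


# Made-up, highly imbalanced table: 90% raw agreement.
g, kappa, ac1 = agreement_2x2(90, 5, 5, 0)
print(round(g, 3), round(kappa, 3), round(ac1, 3))
```

With these made-up counts, G is 0.8 and AC1 is about 0.89, while Cohen's kappa comes out slightly negative despite 90% raw agreement, which is the kind of behavior that makes kappa hard to interpret in imbalanced tables.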

Unsupervised learning does not lead naturally to statistical inference, except maybe in the case of assessing clustering stability or choosing the optimal number of clusters. I discussed resampling-based approaches in another post. However, with added constraints like in Mclust, we can work out the likelihood and optimize the number of clusters, for instance. Another inferential issue is that of testing whether cluster members differ on average on some attributes. Usually, we don’t test attributes that were used to build the partitioning: it is often meaningless since we are usually trying to maximize the separation between clusters based on those very specific attributes, and in any case this would lead to an inflated Type I error rate. Most of the time, we establish and possibly test clustering profiles based on held-out variables. In this paper, the authors propose a finite-sample p-value that controls the selective Type I error for a test of the difference in means between a pair of clusters obtained using k-means clustering.

This article is in fact mostly a tutorial on moderation analysis using propensity score (PS) weighting, and illustrations in Stata (inline code) and R (external package) are provided.

Usually, latent trait models apply to categorical response variables (dichotomous or polytomous items, or rating scales). The objective is to build a continuous score reflecting the position of an individual on the latent trait being measured. For a more detailed overview, see my answer on Cross Validated. This article deals with restricted continuous responses (e.g. positive only or interval-based responses), and extends response time models.

The authors use restricted maximum likelihood (REML) estimators to separate the estimation of the regression coefficients from that of the covariance parameters, in order to estimate fixed effects with L1 penalization in Gaussian linear mixed models.

The authors use the TOST technique to analyze replication equivalence studies. A “success interval” for the relative effect size is used to compare the replicate to the original study effect.
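For readers unfamiliar with TOST (two one-sided tests), here is a bare-bones sketch based on a normal approximation. It is a generic equivalence test on an estimate with a known standard error, not the authors' procedure for relative effect sizes; the function names, the bounds `low` and `high`, and the numbers in the example are assumptions made for illustration.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_z(estimate, se, low, high):
    """TOST p-value (normal approximation) for H0: theta <= low or theta >= high.

    Equivalence is claimed when the returned p-value is below alpha,
    i.e. when both one-sided tests reject.
    """
    p_lower = 1.0 - norm_cdf((estimate - low) / se)  # tests H0: theta <= low
    p_upper = norm_cdf((estimate - high) / se)       # tests H0: theta >= high
    return max(p_lower, p_upper)


# Hypothetical replication estimate and interval bounds.
print(tost_z(estimate=0.05, se=0.10, low=-0.5, high=0.5))
```

Taking the larger of the two one-sided p-values is what makes the procedure conservative: equivalence is declared only if the estimate is significantly above the lower bound and significantly below the upper bound.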

An interesting paper to complement a companion textbook I worked on a long time ago.

♪ Benny Golson • *Little Karin*