# aliquot

## < a quantity that can be divided into another a whole number of times />

Some (old) random notes found by chance on my iPhone.

The use of IRT modeling has also been discussed in the context of gene-environment interaction; see Waller, N.G. and Reise, S.P. (1992). Genetic and environmental influences on item response pattern scalability. Behavior Genetics, 22(2): 135-152.

• Mastery testing refers to assigning ordered grades to examinees; see the Connecticut Mastery Test (CMT). “In a sequential mastery test, the decision is to classify a subject as a master, a nonmaster, or to continue sampling and administering another random item.” (Vos, Applying the Minimax Principle to Sequential Mastery Testing).
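Vos casts this as a minimax (Bayesian) decision problem; as a rough illustration of what such a three-way sequential rule looks like, here is a minimal sketch of the classical Wald SPRT variant, assuming dichotomous items and purely illustrative cut-offs on the proportion correct (this is not the minimax solution of the paper):

```python
import math

def sprt_mastery(responses, p0=0.5, p1=0.8, alpha=0.05, beta=0.05):
    """Classify an examinee as 'master', 'nonmaster', or 'continue'
    from a stream of dichotomous (0/1) responses, using Wald's SPRT.
    p0/p1 are the proportions correct under the nonmastery and mastery
    hypotheses; alpha/beta are nominal misclassification rates.
    (Illustrative values, not Vos's minimax rule.)"""
    upper = math.log((1 - beta) / alpha)  # decide 'master' above this
    lower = math.log(beta / (1 - alpha))  # decide 'nonmaster' below this
    llr = 0.0
    for x in responses:
        # log-likelihood ratio contribution of one Bernoulli response
        llr += x * math.log(p1 / p0) + (1 - x) * math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "master"
        if llr <= lower:
            return "nonmaster"
    return "continue"  # undecided: administer another random item

print(sprt_mastery([1, 1, 1, 1, 1, 1, 1]))  # 'master' after 7 correct answers
```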

• Niels Smits discussed the “measurement-prediction” paradox in a nice paper: The Measurement Versus Prediction Paradox in the Application of Planned Missingness to Psychological and Educational Tests. The key point is that there is a trade-off between selecting items with high inter-item correlations and selecting items with high correlations with the criterion but low inter-item correlations: items with high inter-item correlations usually explain the same part of the criterion’s variance, and hence contribute little to predictive ability. Quoting the authors, p. 3:

> So, it seems to be impossible to maximize both measurement precision and predictive validity at the same time. If measurement is maximized, reliability is high, inter-item correlations are high, and, as a consequence predictive validity tends to be lower. If predictive validity is maximized, inter-item correlations are low, and as a consequence, measurement precision tends to be lower. This paradox was considered in three classical handbooks for psychological testing (Cronbach & Gleser, 1965, pp. 136-137; Gulliksen, 1950, pp. 380-381; Lord & Novick, 1968, p. 332). In practice, researchers have to make a choice between either measuring precisely, or predicting accurately. Maximize either predictive validity at the expense of measurement precision, or maximize measurement precision, at the expense of predictive validity.


Cronbach, L. J., & Gleser, G. C. (1965). Psychological tests and personnel decisions. Urbana: University of Illinois Press.

Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
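A quick simulation makes the trade-off concrete. This is a toy sketch, not the setup of the Smits paper: the criterion is built from four independent facets, a “homogeneous” test measures only the first facet, and a “heterogeneous” test spreads its items over all four:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 5000, 4                            # examinees, criterion facets

F = rng.normal(size=(n, k))               # independent facets
Y = F.sum(axis=1) + rng.normal(size=n)    # criterion depends on all facets

def make_item(signal):
    return signal + rng.normal(size=n)    # item = facet + unique error

# Homogeneous test: four items all tapping facet 1 -> high inter-item r
homog = np.column_stack([make_item(F[:, 0]) for _ in range(k)])
# Heterogeneous test: one item per facet -> near-zero inter-item r
heter = np.column_stack([make_item(F[:, j]) for j in range(k)])

def cronbach_alpha(X):
    """Cronbach's alpha for an examinees-by-items matrix."""
    m = X.shape[1]
    return m / (m - 1) * (1 - X.var(axis=0, ddof=1).sum()
                          / X.sum(axis=1).var(ddof=1))

for name, X in [("homogeneous", homog), ("heterogeneous", heter)]:
    r = np.corrcoef(X.sum(axis=1), Y)[0, 1]
    print(f"{name:13s}  alpha = {cronbach_alpha(X):.2f}  validity r = {r:.2f}")
```

With these settings the homogeneous test should come out around alpha ≈ .80 with validity ≈ .40, and the heterogeneous one around alpha ≈ 0 with validity ≈ .63, which is the paradox in miniature.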

• I have often heard about Reckase's criterion for assessing the unidimensionality of a scale (the first principal component should account for 20% or more of the total variance). Here is the complete reference: Reckase, M. D. (1979). Unifactor latent trait models applied to multi-factor tests: Results and implications. Journal of Educational Statistics, 4: 207-230.

(See here for what abstracts looked like in plain typewriter font.)
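The check itself is a one-liner once the data sit in an examinees-by-items matrix; here is a minimal numpy sketch, with the 20% threshold being Reckase's rule of thumb applied to the item correlation matrix:

```python
import numpy as np

def first_component_share(X):
    """Proportion of total variance carried by the first principal
    component of the item correlation matrix (items in columns)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(R)       # ascending order
    return eigvals[-1] / eigvals.sum()

# Toy data: 10 items driven by a single latent trait plus noise
rng = np.random.default_rng(1)
theta = rng.normal(size=(500, 1))
X = theta + rng.normal(size=(500, 10))

share = first_component_share(X)
print(f"first component explains {share:.0%} of the variance")
print("meets the 20% criterion" if share >= 0.20 else "below the 20% criterion")
```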

• Stark et al. (2006) argued that assuming continuity is clearly inadequate for testing differential item functioning (DIF) with dichotomous items: Stark, S., Chernyshenko, O. S., and Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91(6): 1292-1306.

But see the LISREL MG-CFA procedure. More papers (a minimal DIF sketch follows the list):

- Hernández, A. and González-Romá, V. (2003). [Evaluating the multiple-group mean and covariance structure analysis model for the detection of differential item functioning in polytomous ordered items](http://www.psicothema.com/psicothema.asp?id=1064). Psicothema, 15(2): 315-321.
- Wu, A.D., Li, Z., and Zumbo, B.D. (2007). [Decoding the Meaning of Factorial Invariance and Updating the Practice of Multi-group Confirmatory Factor Analysis: A Demonstration With TIMSS Data](http://pareonline.net/pdf/v12n3.pdf). PARE Online, 12(3).
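Stark et al. work at the level of the IRT/CFA model, but the classical nonparametric check for DIF on dichotomous items is the Mantel-Haenszel procedure. Here is a minimal sketch of it (not Stark et al.'s method), matching examinees on the rest score:

```python
import numpy as np

def mantel_haenszel_dif(item, group, match):
    """Mantel-Haenszel common odds ratio for one dichotomous item.
    item  : 0/1 responses to the studied item
    group : 0 = reference group, 1 = focal group
    match : matching variable (e.g. rest score)
    Values far from 1 suggest DIF against one of the groups."""
    num = den = 0.0
    for s in np.unique(match):
        m = match == s
        a = np.sum((group[m] == 0) & (item[m] == 1))  # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))  # reference, wrong
        c = np.sum((group[m] == 1) & (item[m] == 1))  # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))  # focal, wrong
        t = a + b + c + d
        num += a * d / t
        den += b * c / t
    return num / den

# Toy usage with DIF-free simulated data: the ratio should be close to 1
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)
theta = rng.normal(size=n)
resp = (rng.random((n, 10)) < 1 / (1 + np.exp(-theta[:, None]))).astype(int)
rest = resp.sum(axis=1) - resp[:, 0]      # rest score as matching variable
print(mantel_haenszel_dif(resp[:, 0], group, rest))
```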