I won't detail the conference program itself; I'll just say a few words about the packages that were presented, together with their applications (in various fields: epidemiology, social sciences, teaching, high-dimensional data, chemometrics).
Multivariate data analysis
Stéphanie Bougeard talked about two new functions in the
aiming at the analysis of K+1 tables (several blocks of explanatory
variables and a block of response variables). I can't find those functions,
mbpcaiv, but they look interesting. I wonder how they compare
to RGCCA or PLS
path modeling (e.g.,
Bougeard, S, Qannari, EM, Rose, N (2011). Multiblock redundancy analysis: interpretation tools and application in epidemiology. Journal of Chemometrics, 25(9): 467–475.
Other (related) papers of interest:
- Bougeard, S, Qannari, EM, Lupo, C, and Hanafi, M (2011). From Multiblock Partial Least Squares to Multiblock Redundancy Analysis. A Continuum Approach. Informatica, 22(1): 11-26.
- Bougeard, S, Qannari, EM, Lupo, C, and Chauvin, C (2011). Multiblock redundancy analysis from a user's perspective. Application in veterinary epidemiology. Electronic Journal of Applied Statistical Analysis, 4(2): 203-214.
I've also learned that
capabilities will be rebased on the lattice package, allowing for complex
layouts on the graphics device (Alice Julien-Laferriere's talk). This was done
using S4 classes on top of the existing functions visible to the user.
Aurélie Thébault presented her work on locally-weighted PLS regression, with applications in infrared spectral analysis. The idea is to introduce a local calibration stage before computing the PLS components: new observations are predicted from a subset of the original samples that resemble them, selected through a weighting process. This seems to be quite specific to near-infrared spectroscopy, but it might be interesting for signal processing (?).
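To make the local calibration idea concrete, here is a minimal NumPy sketch. It is not the method presented in the talk: it assumes the simplest possible weighting (weight 1 for the k nearest calibration samples in Euclidean distance, 0 for the rest) and a single response fitted with a NIPALS-style PLS1. The function names `pls1_fit` and `local_pls_predict`, and the toy data, are made up for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """PLS1 (single response) by successive deflation; returns centring terms and coefficients."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr                      # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                   # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                         # scores
        tt = t @ t
        p = Xr.T @ t / tt                  # X loadings
        c = (yr @ t) / tt                  # y loading
        Xr = Xr - np.outer(t, p)           # deflate X and y
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression coefficients in X space
    return x_mean, y_mean, coef

def local_pls_predict(X_cal, y_cal, X_new, k=30, n_comp=2):
    """Predict each new sample from a PLS model fitted on its k nearest calibration samples."""
    preds = np.empty(len(X_new))
    for i, x in enumerate(X_new):
        dist = np.linalg.norm(X_cal - x, axis=1)
        idx = np.argsort(dist)[:k]          # local calibration set (0/1 weights, an assumption)
        x_mean, y_mean, coef = pls1_fit(X_cal[idx], y_cal[idx], n_comp)
        preds[i] = y_mean + (x - x_mean) @ coef
    return preds

# Toy, made-up data with a nonlinear response, where a local model is a natural choice
rng = np.random.default_rng(1)
X_cal = rng.uniform(-3, 3, size=(300, 4))
y_cal = np.sin(X_cal[:, 0]) + 0.5 * X_cal[:, 1] ** 2
X_new = rng.uniform(-2, 2, size=(5, 4))
y_hat = local_pls_predict(X_cal, y_cal, X_new, k=40, n_comp=2)
```

A smoother weighting scheme (weights decreasing with distance) would replace the hard neighbour selection; the structure of the loop stays the same.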
The PCAmixdata package was discussed by Vanessa Kuentz-Simonet. This package deals with VARIMAX rotation in factor analysis.
Chavent, M, Kuentz-Simonet, V, and Saracco, J (2011). Orthogonal rotation in PCAMIX. arXiv:1112.0301
At UseR! 2011, there was a related talk on the selection of variables by the same authors: ClustOfVar: an R package for the clustering of variables. Other interesting papers I have to read or reread:
- Kiers, HAL and Krijnen, W (1991). An efficient algorithm for PARAFAC of three-way data with large numbers of observation units, Psychometrika, 56(1): 147-152.
- Takane, Y and Shibayama, T (1991). Principal component analysis with external information on both subjects and variables, Psychometrika, 56(1): 97-120.
- Takane, Y, Kiers, HAL, and de Leeuw, J (1995). Component analysis with different sets of constraints on different dimensions. Psychometrika, 60(2): 259-280.
- Kiers, HAL (1991). Simple structure in component analysis techniques for mixtures of qualitative and quantitative variables. Psychometrika, 56: 197-212.
- Lorenzo-Seva, U, van de Velden, M, and Kiers, HAL (2009). CAR: A MATLAB Package to Compute Correspondence Analysis with Rotations. Journal of Statistical Software, 31(8).
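As a reminder of what a VARIMAX rotation does, here is a minimal NumPy sketch of the classical pairwise (Kaiser-style) algorithm on a plain numeric loadings matrix. It is not the PCAMIX rotation of the paper above, which handles mixtures of qualitative and quantitative variables, and it skips Kaiser normalization. The function names and the toy loadings are made up for illustration.

```python
import numpy as np

def varimax_criterion(L):
    """Raw varimax criterion: summed (unnormalised) variance of squared loadings per column."""
    sq = L ** 2
    return float(np.sum(sq ** 2) - np.sum(sq.sum(axis=0) ** 2) / L.shape[0])

def varimax(loadings, max_sweeps=100, tol=1e-10):
    """Rotate columns pairwise; each planar rotation maximises the criterion in that plane."""
    L = np.array(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)                            # accumulated rotation, so that L == loadings @ R
    for _ in range(max_sweeps):
        max_angle = 0.0
        for j in range(k - 1):
            for m in range(j + 1, k):
                x, y = L[:, j].copy(), L[:, m].copy()
                u, v = x ** 2 - y ** 2, 2 * x * y
                A, B = u.sum(), v.sum()
                C = np.sum(u ** 2 - v ** 2)
                D = 2 * np.sum(u * v)
                # Closed-form optimal angle for this pair of columns
                phi = 0.25 * np.arctan2(D - 2 * A * B / p, C - (A ** 2 - B ** 2) / p)
                c, s = np.cos(phi), np.sin(phi)
                L[:, j], L[:, m] = c * x + s * y, -s * x + c * y
                rj, rm = R[:, j].copy(), R[:, m].copy()
                R[:, j], R[:, m] = c * rj + s * rm, -s * rj + c * rm
                max_angle = max(max_angle, abs(phi))
        if max_angle < tol:                  # no pair needed a noticeable rotation
            break
    return L, R

# Toy example: a simple-structure loadings matrix, deliberately mixed by a 30 degree rotation
L0 = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
               [0.0, 0.6], [0.0, 0.8], [0.0, 0.5]])
theta = np.pi / 6
rot = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
L_mixed = L0 @ rot
L_rot, R = varimax(L_mixed)
```

After rotation, each row of `L_rot` should again load on a single column (up to column order and sign), which is the "simple structure" these papers are after.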
The mixOmics package has been updated with new functions, including Independent Principal Component Analysis. It now has an official website where more information is available, and there is also a mixOmics wizard where users can see online illustrations and get explanations of the techniques used therein (good point for reproducible research!).
Charles Bouveyron provided a general overview of the
package (but see the JSS paper,
HDclassif: An R Package for Model-Based Clustering and Discriminant Analysis of High-Dimensional Data),
which is for supervised and unsupervised classification. There was a nice
demo of clustering with the
crabs dataset, which can be found in
demo_hddc(). Below is a screenshot from running model-based clustering
with the EM algorithm, k-means initialization for cluster centres, and
AkBkQkDk model for the general variance-covariance structure (see section
2.1 of the JSS paper for more explanation).
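For readers unfamiliar with this kind of procedure, here is a minimal NumPy sketch of model-based clustering with EM and a k-means initialization. It is not what HDclassif does internally: it assumes an unconstrained full covariance matrix per cluster rather than the AkBkQkDk subspace parameterization, and a deterministic farthest-first start for k-means. All function names and the toy data are made up for illustration.

```python
import numpy as np

def kmeans_init(X, k, n_iter=20):
    """Lloyd's k-means with farthest-first starting centres (deterministic)."""
    centres = [X[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d2)])
    centres = np.array(centres)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return labels

def log_gauss(X, mean, cov):
    """Log-density of a multivariate normal, computed via a Cholesky factor."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (X - mean).T)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def em_gmm(X, k, n_iter=100, tol=1e-8, ridge=1e-6):
    """EM for a Gaussian mixture; assumes no cluster becomes empty."""
    n, d = X.shape
    resp = np.eye(k)[kmeans_init(X, k)]       # hard responsibilities from k-means
    prev = -np.inf
    for _ in range(n_iter):
        # M-step: proportions, means and covariances from the responsibilities
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        covs = []
        for j in range(k):
            diff = X - means[j]
            covs.append((resp[:, j, None] * diff).T @ diff / nk[j] + ridge * np.eye(d))
        # E-step: posterior membership probabilities (log-sum-exp for stability)
        log_p = np.column_stack([np.log(weights[j]) + log_gauss(X, means[j], covs[j])
                                 for j in range(k)])
        m = log_p.max(axis=1, keepdims=True)
        log_norm = m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))
        resp = np.exp(log_p - log_norm[:, None])
        loglik = log_norm.sum()
        if loglik - prev < tol:               # stop when the log-likelihood stalls
            break
        prev = loglik
    return weights, means, covs, resp.argmax(axis=1), loglik

# Toy, made-up data: two well-separated Gaussian clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(10, 1, (100, 2))])
weights, means, covs, labels, loglik = em_gmm(X, k=2)
```

The parsimonious models in the JSS paper constrain the covariance matrices computed in the M-step; the E-step and the overall loop are unchanged in spirit.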
Florent Langrognet presented the Rmixmod package; this is a port of the mixmod project for high-performance model-based cluster and discriminant analysis, which comes as a C++ library with command-line utilities and a MATLAB frontend. Interestingly, this package also works with semi-supervised problems, and it allows for case weighting.