I have been quite busy during the last two months. This was in part due to writing teaching material, including an 11-week statistical computing course with R and Stata for a French university. As a matter of fact, it took me something like 70 hours to write 300 pages of exercises with solutions, and I am currently writing a 110-page textbook which aims at introducing students to R and Stata for medical statistics. I can’t release this work in the public domain, though. However, I expect to develop a similar open access course with other languages, probably Python, Mathematica and Lisp. What I found difficult for this course was writing self-assessment exercises with multiple-choice response options (to ease automatic grading on an online companion website). That was, however, a pleasant experience. I am secretly jealous of O’Reilly’s Try R and I am wondering whether such an online toolkit wouldn’t be the solution for ensuring that students work regularly at home and keep typing code between two sessions.
My reading activity has dropped significantly during this period, even though I was sick for almost 3 weeks. I was still looking at Twitter feeds from time to time, and I noticed that Prismatic often came to my rescue when nothing interesting popped up in my Google feed reader. However, I started reading several books, and finished some.
Theus and Urbanek, Interactive Graphics for Data Analysis. Chapman & Hall, 2009. A must-have for people interested in dynamic graphics and exploratory data analysis with Mondrian and R. The companion website includes a set of slides and datasets for use in teaching classes: http://www.interactivegraphics.org.
Malley, Malley and Pajevic, Statistical Learning for Biomedical Data. Cambridge, 2011. A light but efficient introduction to machine learning and statistical modeling in the life sciences, with good tips and tricks on some widely used ensemble methods, like Random Forests.
Murphy, Machine Learning. MIT Press, 2012. The new big book on machine learning that I bought right away but haven’t had time to read so far.
Jenkinson, Measuring Health and Medical Outcomes. UCL Press, 1994. Quite an old book on patient-reported outcomes and (subjective) health measurement in clinical trials. It doesn’t replace Streiner and Norman’s Health Measurement Scales, but it does a good job of describing the state of affairs in the early 1990s.
Kane, Understanding Health Care Outcomes Research. Jones and Bartlett, 2006. Yet another book on health measurement, which I would describe as a modern replacement for the preceding one.
Indrayan, Medical Biostatistics. Chapman & Hall, 2013. An essential reference for all those interested in medical statistics. Unlike most applied textbooks, this one provides in-depth discussion of quantitative assessment, evidence-based medicine, and common caveats in data analysis and interpretation of classical medical studies. That’s a big book, but probably worth buying in addition to Rothman and Greenland’s Modern Epidemiology.
Friedman, Furberg and DeMets, Fundamentals of Clinical Trials. Springer, 2010. Actually, I don’t remember how I came across this textbook, but it is one of those rare books on RCTs that include two specific chapters on the measurement of quality of life and patient adherence. The rest of the book is pretty classic and one may be happy with just Chow and Liu’s Design and Analysis of Clinical Trials.
Husson, Lê and Pagès, Exploratory Multivariate Analysis by Example Using R. Chapman & Hall, 2011. When buying this book, I was interested in learning more about the FactoMineR package. Although there’s a large chapter dedicated to multiple correspondence analysis, there’s nothing on “advanced methods” such as simple and hierarchical multiple factor analysis or Procrustes methods.
Boslaugh, Secondary Data Sources for Public Health. Cambridge, 2007. A concise overview of auxiliary health-related databases that can be used to extend the analysis of primary endpoints. It consists of brief descriptions of common health surveys on health services utilization, health behavior and risk factors, fertility and mortality, etc. Interesting to get an idea of why and how such surveys were developed. On a related point, I started following Anthony Damico’s weblog, analyze survey data for free, which includes R code to fetch and process available data from those US surveys.
There are many more books for 2012 than there is room for in this post, so I will probably talk about them later.