Weaving scientific documents

2011-04-25

Some notes about installing and testing StatWeave with R and Stata.

StatWeave is yet another way to weave code chunks and text into a single document. The idea of interlacing code and documentation is borrowed from the famous WEB/CWEB system developed by Donald Knuth, who also coined the term literate programming. There is a draft article on Lightweight Literate Programming, which has evolved as an extended discussion of the following paper:

Allan Stavely, Lynda Walsh, and John Shipman. “Lightweight Literate Programming: a Documentation Practice”. Technical Communication. Vol. 55. No. 1. 23-37. February 2008.

For R, we already have Sweave and its variants (see the CRAN Task View on Reproducible Research). I used Sweave a lot for preparing my courses (handouts and slides with beamer). In the meantime, however, I rediscovered noweb, which is closer to the original WEB software and more general in some sense. I know Luke Tierney is using it for documenting his R projects, and I once found a very pretty document by Jason Catena, Study Haskell, which uses the Tufte LaTeX class together with noweb.

I grabbed the StatWeave jar file a long time ago, but never tried it for real. Now, following a question on tex.stackexchange.com, I reinstalled it and tried to process some R and Stata documents. I don't think I have pushed it to its limits, but at first sight it is quite straightforward to get a working PDF in a few commands. What I like is that it works with Stata on my Mac.(a)

Here is a sample R document:

[image: rdoc]

and here is one with Stata commands:

[image: statadoc]

The idea is as simple as that of Sweave: you include code chunks in a dedicated environment (here, it starts with \begin{XXcode} and stops with \end{XXcode}, where XX stands for the foreign language). The Stata example reads

\documentclass{article}
\usepackage{Statweave}
\begin{document}
\StataweaveOpts{prompt=". "}

\section{StatWeave example using Stata}

Here are some fake data:

\begin{Statacode}{label=summary,saveout}
sysuse auto, clear
summarize mpg
\end{Statacode}

Here is the result:

\recallout{summary}
\end{document}
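
For comparison, here is a minimal sketch of what the R counterpart might look like. It assumes that the R environment follows the same XXcode pattern described above (that is, Rcode) and that it accepts the same label and saveout options as the Stata environment; the label name rsummary and the use of the built-in mtcars data set are my own choices for illustration.

```latex
\documentclass{article}
\usepackage{Statweave}
\begin{document}

\section{StatWeave example using R}

Here are some built-in data:

% Assumption: Rcode takes the same options as Statacode above
\begin{Rcode}{label=rsummary,saveout}
data(mtcars)
summary(mtcars$mpg)
\end{Rcode}

Here is the result:

% Recall the saved output by its (hypothetical) label
\recallout{rsummary}
\end{document}
```

Only the environment name and the code inside it change; the surrounding LaTeX stays the same as in the Stata example.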

Nothing really difficult in terms of syntax. The configuration of StatWeave is also easy. I just changed the default setting for Stata, before noticing that everything was already in the online help (RTFM!). So, only the first item in the configuration dialog has to be updated to reflect the location of your Stata executable:

[image: stataopt]

It also seems possible to use Matlab and Maple. I will try to configure StatWeave to accept Octave code instead of Matlab, which I haven't used since the end of my PhD.

Notes

(a) To run Stata directly from the command line, you just have to symlink the stata-se executable and add the Stata directory to your path, e.g.

$ sudo ln -s /Applications/Stata10/StataSE.app/Contents/MacOS/stata-se \
  /usr/local/bin/stata-se
$ export PATH="/Applications/Stata10:$PATH"