Quick recap of February on the Micro blog.
2019-02-02: Cleaning up some old stuff on my HD and just found this nice Beamer template: My preferred one is the “fi” variant.
2019-02-02: Tropical geometry of statistical models. At least, the conclusion is very understandable:
> The algebraic representation for graphical models with hidden variables leads naturally to an interpretation of a parameterized model as a point on an algebraic variety. Marginal probabilities are coordinates of points on the variety. Varieties can be tropicalized, and the statistical meaning is that the MAP probabilities (calculated with logarithms of the parameters) can be interpreted as coordinates of points on the positive part of the tropical variety.
2019-02-03: 📖 Christine Angot, Une semaine de vacances (Flammarion, 2012)
2019-02-03: Nick Drake, Five Leaves Left.
2019-02-03: After Tamás K. Papp’s CL libraries, I discovered this new library for numerical computing in the Lisp world: MAGICL.
2019-02-03: Home alone again…
2019-02-03: Just doing my little technical care and weekly backup. Homebrew 2 is out.
    ~ brew --version
    Homebrew 2.0.0
    Homebrew/homebrew-core (git revision 175af; last commit 2019-02-02)
    Homebrew/homebrew-cask (git revision 05a81; last commit 2019-02-02)
2019-02-04: Colossal Youth, Young Marble Giants.
2019-02-04: Just threw out more than 30k messages from my Gmail account. I have a local copy, so no worries, but the Google team will have a harder time analyzing it. Incidentally, I just came across a new testimony from people tired of Google. Last round shown below: BTW, did you know that Google actually stores everything you buy based on payment or shipping receipts?
2019-02-04: Updating my global dist for the newly released v1.1 of Julia. Installing packages is much easier (e.g. Gadfly) and smoother compared to the preceding versions (prior to v1). The only caveat is that rendering plots via Gadfly is kind of slow, especially compared to other graphing engines (R, Gnuplot, Mathematica, or even Stata).
2019-02-04: Magithub (soon, forge) is now part of Spacemacs/magit. No need to add further configuration to your init.el. Today I was trying to file an issue on one of my repos and I figured out that there’s some trouble at the moment.
2019-02-05: 📖 Zoé Valdés, Une habanera à Paris (Gallimard, 2005)
2019-02-05: PJ Harvey, Stories from the City, Stories from the Sea.
2019-02-05: A recent tweet reminded me of gtools, a Stata package that aims to speed up built-in commands for data wrangling. I should give it a go.
2019-02-05: Death Cab for Cutie, Thank You for Today. Just streaming the Apple alt’ radio:
2019-02-05: I am not very lucky with Spacemacs these days. Now, SPC-/ to search project for text (aka, spacemacs/search-project-auto) is no longer working. Not funny, trust me.
2019-02-05: Apache Arrow and Feather are two interesting projects that I think should be available in data science-related PLs. Recently, Rust joined the list, at least regarding Arrow: DataFusion: A Rust-native Query Engine for Apache Arrow.
2019-02-05: Hacker Tools: A user-friendly introduction to various command line utilities, editors and VCS. (via @newsycombinator)
2019-02-06: Back to a fully functional Spacemacs, after a complete reinstall. Some minor annoyances with MELPA actually, but nothing serious; fixed a weird bug with the ocaml layer, since I learned that the syntax-version layer should come before ocaml, but otherwise everything is fine. Also, I’m trying to go all Helm instead of Ivy.
2019-02-07: Several months ago, we had months of Twitter discussion about which software, R or Python, is best for data science; now is the time of the R vs. Stata debate, here and there. Arguably, Stata is paid software and does not offer the same scripting facilities as R for some tasks, mainly non-statistical ones. However, what’s the point? Did anyone ever mention the fact that Stata has a GUI which completely mimics the command-line operations, so that people afraid of typing commands or just interested in running a logistic regression on a well-formed dataset can just do it in under a minute? It is slow with some estimators or optimization approaches (e.g., gllamm), and we had to wait a bit long to get full support for unicode and XLS, better graphical rendering, etc. But the versioning system allows one to reproduce any result obtained with a version prior to the current version of Stata. And it does interact very well with Stan and R, too. The question is not which software is better; the real question is: who’s the end user?
2019-02-07: Found a new playlist on Apple Music.
2019-02-07: Fun fact: I saved a database from Stata 15 in old format (i.e., compatible with Stata 13). I cannot view unicode characters in Stata GUI, but it works perfectly fine when run through Emacs/ESS!
2019-02-07: Today I was reading Jack Baty’s latest posts and I noticed an interesting micro-post about keyboard versus mouse usage.
> The stopwatch consistently proves mousing is faster than keyboarding.

I think this deserves two additional remarks. First, it depends on the task at hand: For instance, even if I prefer reading email with Apple Mail, I use mu4e under Emacs because I find it more convenient for bulk actions like archiving or deleting a bunch of messages. Think about it a little: You just have to use your preferred movement keys or the arrow keys and strike a key, and it’s all done! Likewise, for text editing or interacting with a REPL, I found Emacs keybindings much more powerful than any combination of custom Services or even TextExpander, together with using a mouse. I believe Vim users would agree as well. Second, this does not account for people not using a mouse at all. I for one have always been very happy with the Macbook trackpad, and I become a lot slower when I have to use a mouse, notwithstanding the fact that it is very bad practice for the elbow and wrist. For most movement, I use the trackpad and I do not worry much about Emacs or Vim keybindings, because there I am faster with the trackpad. Hence, we should clearly state which actions are better performed using a mouse before claiming that the mouse wins over the keyboard.
2019-02-08: Just found what I think is one of the best concise tutorials on “How to GitHub” if you are looking to collaborate on a common repository. As always, it works best when you read the Magit manual and check what’s available there.
2019-02-08: How to blog. Nice take by Tom MacWright. I don’t have a very strict schedule. However, I’ve been trying to post more or less regularly in recent years (sometimes even just links of Twitter bookmarks), specifically to avoid letting my blog die.
2019-02-09: A few days ago, I noticed someone citing A Computational Approach to Statistical Learning on Twitter. I no longer buy statistical books so I can’t tell if it is worth a read, but I note that the author of the R package bigmemory is one of the co-authors.
2019-02-09: I still read Mastering Emacs from time to time. Recently, I was just checking an article on regular expressions. I have been using Emacs for about 15 years, and I am afraid that I am now far more comfortable with most key chords after two or three years of Spacemacs. It is not that I really like modal editing–I don’t like it at all in fact–but the consistent key bindings conveyed via which-key and the configuration layers for most packages make it a really pleasant tool to use on a daily basis. I’ve come to have only Emacs on my desktop. No more iTerm2 or Marked2 or even Desktop icons.
2019-02-09: Let’s start Season 4 of The 100 in a few minutes.
2019-02-09: The more I use Org for authoring simple or more complex text documents, the more I like it. I like to think of it as Markdown with better markup for links, code blocks, tables, and references, and of course there’s Emacs inline preview. Except for collaborating with colleagues or drafting short RMarkdown documents, I mostly stopped using Markdown these days. Maybe I should just revisit some old Md files and just convert them to Org.
    (defun markdown-convert-buffer-to-org ()
      "Convert the current buffer's content from markdown to orgmode format."
      (interactive)
      (shell-command-on-region (point-min) (point-max)
                               (format "pandoc -f markdown -t org -o %s"
                                       (concat (file-name-sans-extension (buffer-file-name)) ".org"))))
See also: Org-Mode Is One of the Most Reasonable Markup Languages to Use for Text.
2019-02-09: GNU Coreutils Cheat Sheet. (via @UnixToolTip)
2019-02-10: Kate Bush, Never for Ever.
2019-02-10: After Jupyter notebook, we now get Jupyter book. Looks like a serious alternative to RMarkdown/Gitbook (aka bookdown).
2019-02-10: Just found Racket Machine Learning – Core.
2019-02-10: Machine Learning Refined, with nice blog posts by Jeremy Watt & Reza Borhani.
2019-02-11: Jack DeJohnette, Ravi Coltrane & Matt Garrison, In Movement.
2019-02-11: Gary Peacock, Jack DeJohnette & Keith Jarrett, My Foolish Heart (Live at Montreux).
2019-02-11:
> Portacle is a complete IDE for Common Lisp that you can take with you on a USB stick.

If you are looking for a quick solution, here it is. Otherwise, learn Emacs for good.
2019-02-11: Another nice article about GTD by BSAG. I enjoy reading her blog posts, and I really love her website design. Funny thing: I was just reading some old posts written by Bastien Guerry on Org mode.
2019-02-11: Interesting read. (via Daniel Lemire)
> Though we age, it is unclear how our bodies keep track of the time (assuming they do). Researchers claim that our blood cells could act as time keepers. When you transplant organs from a donor, they typically behave according to the age of the recipient. However, blood cells are an exception: they keep the same age as the donor. What would happen if we were to replace all blood cells in your body with younger or older ones?
2019-02-11: While I usually run Slime for little Lisp hacking, I noticed that serious people are looking at SLY, the Sylvester the Cat’s Common Lisp IDE for Emacs. It looks like there is even a Spacemacs layer.
2019-02-11: Yet another mind-mapping tool if you are not used to Emacs Org mode: Hook. (via Jack Baty)
2019-02-11: Staying with Common Lisp. Safe no move perhaps? On a related note, here is an enlightening discussion about Racket vs. Lisp: Why I haven’t jumped ship from Common Lisp to Racket (just yet).
2019-02-12: I’ve been following Greg Stein on Caches to caches for a long time now, because the site has such a beautiful design and useful material on Emacs and Org mode. Recently they published a series of posts on AI and ML.
2019-02-12: disk.frame is a new (dplyr-compliant) R package to manipulate structured tabular data that doesn’t fit into RAM, in the spirit of Dask for Python.
2019-02-13: Not sure how we can think of GTD when we spend about one hour cleaning up defunct stuff on our HD, but sure we are close…
2019-02-13: One of the first hits when looking for “Lisp and bioinformatics” on the internet: How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications.
2019-02-13: Why the 3? Earlier in the morning I was reading one of the latest posts published by John D. Cook about dose finding studies. I am well aware of the 3+3 design. Incidentally, I attended a meeting yesterday where a PhD student was presenting his work in microbiology, and they used triplicates. It is interesting that the same 3 seems like a magic number here, but it is not the same. Maybe I should drop a note in a few days.
2019-02-14: Joy Division, Closer.
2019-02-14: I’m halfway through my new TV show (Occupied), but I’m struggling to motivate myself to move forward right now, even to watch TV. Besides that, I’m finally getting a job back. Let’s just hope I don’t go back to the hospital too soon.
2019-02-15: Causal Inference Book, Python code hosted on GitHub (by the author of the Stata kernel). (via @kaz_yos)
2019-02-15: How to stay as private as possible on Apple’s iPad and iPhone. (via Irreal)
2019-02-16: An analysis of lossless data compression programs: Large Text Compression Benchmark. (via SO–it looks like it is the very first question on the beta site)
> The amount of genomic sequence data being generated and made available through public databases continues to increase at an ever-expanding rate. Downloading, copying, sharing and manipulating these large datasets are becoming difficult and time consuming for researchers. We need to consider using advanced compression techniques as part of a standard data format for genomic data. The inherent structure of genome data allows for more efficient lossless compression than can be obtained through the use of generic compression programs. We apply a series of techniques to James Watson’s genome that in combination reduce it to a mere 4MB, small enough to be sent as an email attachment.
> – Human genomes as email attachments
2019-02-16: I haven’t yet embraced the full power of Julia for data munging, but surely this article is a gem to understand the language at a deeper level.
2019-02-16: Useful tips to build and manage R packages: rOpenSci Packages: Development, Maintenance, and Peer Review.
2019-02-16: Algorithms in Bioinformatics: A Practical Introduction. (via SO)
2019-02-16: Probability and Statistics: a simulation-based introduction, by Bob Carpenter. I like it when there are instructions for those like me who do not want to install RStudio to build the book.
2019-02-17: Nick Cave & The Bad Seeds, Push the Sky Away.
2019-02-17: Nick Cave & The Bad Seeds, The Boatman’s Call.
2019-02-17: Again, I’m slowly updating stata-sk. It took me a while to reset the publishing system to use Stata 13 MP instead of Stata 15 since I no longer get a free license for it. This will probably be my last textbook on Stata.
2019-02-17: I am about to exceed 150 micro-posts in my Org file. (Other posts are published from the terminal directly.) I added a little cookie to keep track of the number of entries, although a slightly harder path would be to write some elisp code.
2019-02-17: I don’t have any big needs in terms of image processing, and I am generally happy with ImageMagick. However, Acorn and Retrobatch (h/t Brett Terpstra) look pretty nice.
2019-02-17: Just cleaned up my Dropbox a little bit more (6 GB of data, reports and papers accumulated over 8 years!).
2019-02-17: Look. Even Racket has some support for statistical data structures like data frames. In addition, here is an essential read if you want to get started with common data structures: An Overview of Common Racket Data Structures.
2019-02-17: Machine learning in Clojure with XGBoost. Note that there are bindings for the awesome xgboost in various other languages (Python, Julia, R), not just the JVM. #clojure
> Python didn’t become the leader in the field because it’s inherently better or more performant, but because of scikit-learn, pandas and so on. While as Clojurists we don’t really need pandas (dataframes) or similar stuff (everything is just a map, or if you care more about memory and performance a record) we don’t have something like scikit-learn that makes really easy to train many kind of machine learning models and somewhat easier to deploy them.
2019-02-17: merlin - a unified framework for data-analysis, and many other interesting packages by the same author or other coworkers.
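Short of writing elisp, the number of entries can also be counted from the terminal. The following is only a rough sketch: it assumes that each micro-post is stored as a level-two Org headline, and both the function name `count_org_entries` and the demo content are made up for the illustration.

```shell
# Count level-two Org headlines ("** ...") in the file given as $1.
# Assumption: one micro-post = one level-two headline.
count_org_entries() {
  grep -c '^\*\* ' "$1"
}

# Demo on a throwaway file with hypothetical content.
demo=$(mktemp)
printf '* Micro\n** TODO First post\nSome text\n** TODO Second post\n*** Not an entry\n' > "$demo"
count_org_entries "$demo"   # counts the two "** " headlines only
rm -f "$demo"
```

Deeper headlines (`***`) and the parent headline (`*`) are left out because the pattern requires exactly two stars followed by a space at the start of the line.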
2019-02-18: Nick Cave & The Bad Seeds, Nocturama.
2019-02-18: New Order, Power, Corruption & Lies.
2019-02-18: I am reading the Racket guide again, this time using Dash only. It’s amazing how convenient this application is, especially for navigating between text and function definitions, which by default are all hyperlinked thanks to the Scribble documentation system.
2019-02-18: Today was my first day at my new lab. Everything went fine, despite a very bad night. At least I have been able to go back home without too much dizziness or paresthesia in the legs (I don’t know where this one comes from). Guess what: For the first time in 10 years, I am able to connect my Macbook to the network!
2019-02-18: Pretty Magit - Integrating commit leaders. I have been using Git leaders for almost two years, but now I realize that I completely forgot about them.
2019-02-19: Timber Timbre, Timber Timbre.
2019-02-19: Diving into computational molecular biology. It’s a fun world after all, especially compared to medical statistics. I am trying to devise a reliable workflow for taking notes and using a live notebook, mostly inspired from my old setup, but basically it’s all about Org files with tags and “TODO items”, including a diary and helm-bibtex for managing my bibliography. Nothing fancy, but it just has to do the job right after all.
2019-02-19: Today’s lunch:
2019-02-20: Nick Cave & The Bad Seeds, Nocturama. I’m often lazy when it comes to changing a CD.
2019-02-20: Morcheeba, Who Can You Trust?.
2019-02-20: After jupyter-book, there is now jupytext (via @marcwouts). Looks like we now have a serious competitor to RStudio.
2019-02-20: I disabled Dropbox syncing on my Mac a long time ago, but I realized yesterday that Transmit now makes it very easy to connect to Dropbox. Even if I no longer use Dropbox these days, that may be a very good option for the future.
2019-02-20: Two handy org commands: org-journal-new-scheduled-entry can be used to schedule future entries in org-journal (see discussion here); org-tree-to-indirect-buffer is a good alternative to
2019-02-20: Discrete Stochastic Processes. It’s amazing how many excellent tutorials can be found on the MIT OpenCourseWare.
2019-02-21: Despite the useful utility under the “File” menu, my attempt at installing a Mathematica package properly failed miserably this morning. I ended up copying/pasting the whole archive into ~/Library/Mathematica/Applications. Anyway, this worked and I am now able to plot phylogenetic trees!
2019-02-21: Didn’t know there was such a thing: MacJournal (via Jack Baty). Whether you are interested in this app or not, the author provides a nice discussion of the pros and cons of keeping a diary vs. a journal, and on the importance of meta data.
2019-02-21: Merlin Mann et Marie Kondō sont dans une d’emails, by Bastien Guerry. Nice summary of the situation regarding emails. I already deleted 30k+ mails in one pass so I know what batch processing is.
2019-02-22: Didn’t know either: Beware that wc counts newlines, and not lines. (via Irreal)
2019-02-22: Emacs build-status: a nice package that allows monitoring builds on Travis or CircleCI.
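A quick way to see the difference is to feed `wc -l` some text whose last line has no trailing newline. This is just a small sketch; the helper names `count_newlines` and `count_lines` are made up for the illustration, and `awk` is used here as one possible way to count lines regardless of the final newline.

```shell
# Count newline characters on stdin (what `wc -l` actually reports).
# `tr -d ' '` strips the padding that some wc implementations add.
count_newlines() { wc -l | tr -d ' '; }

# Count lines on stdin, including a last line without a trailing newline.
count_lines() { awk 'END { print NR }'; }

# Two lines of text, but only one newline character.
printf 'first\nsecond' | count_newlines   # prints 1
printf 'first\nsecond' | count_lines      # prints 2
```

When the file ends with a newline, as most text files do, both counts agree.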
2019-02-22: Stephen Wolfram reflecting on his “productive” and digital life. What a man!
2019-02-22: The first edition of Interpretable Machine Learning is out. (via @ChristophMolnar)
2019-02-22: Yet another org-powered website. This makes me think that I added a little org-capture template to write those micro-posts without having to open my
    ("b" "Blog post" entry (file+headline "~/org/micro.org" "Micro")
     "** TODO %?\n:PROPERTIES:\n:EXPORT_FILE_NAME:\n:END:\n%^g\n"
     :empty-lines 1)
2019-02-23: And we are finally done with The 100. Looking forward to watching The Expanse during the winter holidays.
2019-02-23: It’s been a while since I last ran any ML model using caret, especially since Max Kuhn joined the RStudio team to develop a brand new ML pipeline in the name of the tidy new wave: tidymodels, then parsnip (slides near here). Anyway, here is a good tutorial if you want to get started with
2019-02-23: When you insist on your CLI-based workflow (reproducibility, text-based, etc. you know…) and you realize that Stata 13 does not recognize graph export with a PDF backend (while Stata 15 does) from a Terminal. Back to Encapsulated PostScript then, like in the 90s!
2019-02-23: While I appreciate that there are so many useful Docker images available, I think I will need to build a more lightweight one if I want to stay on the CircleCI free plan. Fortunately, it looks like someone already had the same idea.
2019-02-23: Ecological causes of uneven diversification and richness in the mammal tree of life. (via @rlmcelreath)
2019-02-23: Statistical Thinking for the 21st Century.
2019-02-25: Peter Erskine Trio, As It Was.
2019-02-25: I am still unsure how best to use org-journal. I already use a “diary” file where I bookmark important stages of my working day. This way, I get a nice summary with org-agenda. Obviously, I could do exactly the same using org-journal, but I was thinking that it could also be used to record my posts on the main site: (1) I would be writing using Org mode directly, (2) I would get a searchable archive from Emacs directly (and more convenient than deft), and (3) that would be just cool.
2019-02-25: I just added permalinks in this section (here, a small hash symbol near the date). I was missing a way to link to previous micro-posts.
2019-02-25: I’m almost done with Occupied. I initially thought I would be able to finish the last two episodes of the first season this evening, but I’m so tired (I’ve been up since 4am) that I’m afraid I won’t be able to stand up for long.
2019-02-25: This moment when you realize that you are stuck with Java 8 on your OS… Two options: use Homebrew (brew cask install java) or proceed manually. I think I will love bioinformatics tools.
2019-02-25: How to delete empty lines in a file with Emacs? Useful to clean up an HTML page with a lot of extra blank lines.
M-x flush-lines RET ^[[:space:]]*$ RET
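Outside Emacs, the same cleanup can be sketched in the shell with `sed` and the same regular expression. This is only an illustration: the function name `strip_blank_lines` and the sample HTML are made up, and the filter reads from standard input rather than editing a buffer in place.

```shell
# Delete lines that are empty or contain only whitespace,
# mirroring the flush-lines regexp ^[[:space:]]*$.
strip_blank_lines() {
  sed '/^[[:space:]]*$/d'
}

# Hypothetical HTML fragment with extra blank lines.
printf '<p>one</p>\n\n   \n<p>two</p>\n' | strip_blank_lines
```

Only the two `<p>` lines remain in the output; the empty line and the whitespace-only line are dropped.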
2019-02-26: A few days ago, I read a thread on Biostars (which I haven’t consulted in a while) on the use of Wolfram Mathematica in bioinformatics, and I wondered why people are so critical of this software. The same applies to Stata (if you see the recent flame on Twitter, you know what I mean), albeit in this case there’s not even this man behind it.
2019-02-26: I think this is the first time this site is referenced in Sacha Chua’s excellent Emacs newsletter.
2019-02-26: Long time no see. I have been compiling several pieces of bioinformatics software lately. No issues whatsoever, except for a few glitches with Boost libraries.
2019-02-26: Mathematica implementations of machine learning algorithms used for prediction and personalization.
> This open source project is for Mathematica implementations of statistical and machine learning algorithms that can be used for data analysis, prediction, and recommendation systems.

Note that the Github repository also includes Lua, Java and R code. The companion website is Mathematica for prediction algorithms.
2019-02-27: I’m finally done with Occupied.
2019-02-27: Look, I read two of the latest newsletters by Sacha Chua and I already learned about two new Org features: org-reverse-datetree and org-bib-template. Moreover, I didn’t know that there was such a thing as a meta repository for ESS users.
2019-02-27: Trying out Travis CI for a Bookdown project. I’m already at the third failure and it starts to be painful.
2019-02-27: Linear algebra in Emacs using MKL and dynamic modules.
2019-02-28: Okay, so it looks like we started with season 2 of The Expanse instead of season 1. Great! That may well explain why we didn’t understand anything during the first episodes.
2019-02-28: Prompted by a recent Twitter question, I was about to benchmark some R packages to process large files. However, there already seems to be a very nice post about this: Working with pretty big data in R.
2019-02-28: This Elisp cheatsheet (PDF) is really great.
2019-02-28: Today was my first attempt at building a Flask site, using a boilerplate Bootstrap theme, and a PostgreSQL backend. Done. On reflection, I wonder why I continue to maintain PHP websites.
2019-02-28: Anatomy of a logistic growth curve, by Tristan Mahr. Nice looking visualization and clearly a non-mathy but well put explanation of the logistic curve. I wish I had read this earlier, when I started teaching psychometrics.
2019-02-28: Immersive Linear Algebra.
2019-02-28: Principles and Techniques of Data Science. Nice resource to have! It’s been written using Jupyter book, btw.