A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.



How Well Did I Follow Pedagogy Guidelines at R Bootcamp 2017?

7 minute read


This week I had the privilege of participating in two workshops: I was a participant at a train-the-trainer workshop to become a Software Carpentry instructor and an instructor at the R Bootcamp put on by the Statistics Department and D-Lab. It was a unique opportunity to spend two days learning how to teach one of these bootcamps, and then to put my skills to the test a few days later.

Embedding Python plotly figures in markup

2 minute read


A lightweight markup language is a simple, human-readable language for formatting text. It’s easy to read and compatible with most text editors. Documents written in lightweight markup are usually then converted to things that are harder for people, but easier for computers, to read, like HTML. The most common ones that I’ve heard of people using are Markdown, R Markdown, and reStructuredText. I imagine that most people who do data analysis/exploratory visualization/data science use a markup language more often than they write in raw HTML.

Which logistic regression method in Python should I use?

6 minute read


This question is related to my last blog post about what people consider when choosing which Python package to use. Say I want to use some statistical method. I have a few options. I could code it up from scratch myself, knowing that this might have undetected bugs and be pretty slow. I could Google what I’m looking for and use the first thing I find; similarly, there are no guarantees. Or, I could do my research, find all the packages that seem to offer what I’m looking for, and decide which looks best based on how thoroughly they’ve documented and tested their code.

What’s important when vetting open source packages?

6 minute read


I’m in the early stages of creating several Python packages right now (shameless self-plug – see permute, cryptorandom, and pscore_match). I want people to actually use them when they’re ready. They have potential for wide use, but they have narrow functionality compared to big packages like numpy or scipy. I could imagine that somebody looking to do a particular task in Python, like propensity score matching, would do a Google search and stumble upon my package.



Simple Random Sampling: Not So Simple


We propose several best practices for researchers using PRNGs for simulations, including the wide adoption of hash-function-based PRNGs.


Talk 1 on Relevant Topic in Your Field

Published in UC San Francisco, Department of Testing, 2012

This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!