I was working on a project yesterday where I needed to amortize out a bunch of loans to calculate the total interest a borrower would pay if he or she paid the minimum monthly payment for the full term of the loan. I couldn't find any package in R that already contained the necessary math, so I looked around and found this post as well as this one. They both presented the R code to do the basic math involved in amortization, but each function was built to handle only one loan at a time. I had well over 100,000 loans I needed to go through, and loops aren't all that efficiently implemented in R.

So I revised the code to perform that math on all of the loans at once by organizing everything into matrices that could then be added, subtracted, etc. It only took a little over three seconds to amortize 110,335 loans. I don't know how long it would have taken to amortize each loan individually - I killed the process after I got tired of waiting for it to finish.
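For a rough idea of what "organizing everything into matrices" can look like, here is a minimal sketch. It is not the code from the post: the function name `amortize_loans`, its arguments, and the assumptions of fixed monthly payments and strictly positive interest rates are illustrative choices of mine.

```r
# Hypothetical sketch: amortize many fixed-rate loans at once, without a loop.
# Assumes monthly payments and annual_rate > 0 for every loan.
amortize_loans <- function(principal, annual_rate, term_months) {
  i <- annual_rate / 12                                  # monthly rate per loan
  payment <- principal * i / (1 - (1 + i)^(-term_months))

  months <- seq_len(max(term_months))
  growth <- outer(1 + i, months, `^`)                    # loans x months: (1 + i)^k
  active <- outer(term_months, months, `>=`)             # TRUE while a loan is still open

  # Closed-form balance after k payments, one row per loan.
  balance <- principal * growth - payment * (growth - 1) / i
  balance[!active] <- 0

  # Interest charged in month k is the previous month's balance times the rate.
  previous <- unname(cbind(principal, balance[, -ncol(balance), drop = FALSE]))
  interest <- previous * i
  interest[!active] <- 0

  list(payment = payment,
       balance = balance,
       total_interest = rowSums(interest))
}

# Two made-up loans with different terms.
result <- amortize_loans(principal   = c(1000, 250000),
                         annual_rate = c(0.12, 0.045),
                         term_months = c(12, 360))
result$total_interest
```

Each row of the matrices is a loan and each column is a month, so the arithmetic for every loan happens in a handful of vectorized operations. Loans with shorter terms are simply padded with zeros out to the longest term.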

(More)

I have a post up at the Data Science for Social Good blog, making the ethical and scientific case for openness and accessibility in data science.

See it here.

(More)

I have mixed feelings about my Ph.D. in anthropology. I often suspect I wasted a lot of time and money getting that degree. I studiously avoid using most of the theory I picked up in my formal education, I use methods that many (of course not all) anthropologists seem to view as quite un-anthropological, and anthropology as an academic discipline sometimes seems bent on making a poor name for itself among the general population.

(More)

I wrote a couple of days ago about importing Excel files into R. There are lots of ways to do this, but all the ways that use only R have drawbacks (as I outlined in my last post), and all the other ways require installation of programs other than R. I'm not opposed to using programs other than R - it's easy enough to weave, for example, Python and R code into each other. But I'd become curious about the possibility of solving this problem without the need for added programs, so I did some more searching. Turns out you can import an .xlsx document into pretty much anything that can parse XML, because that's all an .xlsx document is.
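As a rough illustration of the idea (not the code from the full post), an .xlsx file is a zip archive of XML parts, so base R's `unzip()` plus an XML parser can get at the cells. The sketch below assumes the `xml2` package is installed and that the first worksheet lives at `xl/worksheets/sheet1.xml`, which is typical but not guaranteed. The function names are hypothetical.

```r
library(xml2)  # assumption: xml2 is installed

# Pull one XML part out of the .xlsx archive into a fresh temporary directory
# and return the path of the extracted file.
extract_xlsx_part <- function(xlsx_path, part = "xl/worksheets/sheet1.xml") {
  unzip(xlsx_path, files = part, exdir = tempfile("xlsx_"))
}

# Read the raw cells from an extracted worksheet XML file.
# Cells with type "s" hold an index into xl/sharedStrings.xml, not the text
# itself; resolving those is left out of this sketch.
read_sheet_cells <- function(sheet_xml_path) {
  doc <- read_xml(sheet_xml_path)
  xml_ns_strip(doc)                        # drop the default namespace so ".//c" matches
  cells <- xml_find_all(doc, ".//c")
  data.frame(
    ref   = xml_attr(cells, "r"),          # cell reference, e.g. "A1"
    type  = xml_attr(cells, "t"),          # NA for plain numeric cells
    value = xml_text(xml_find_first(cells, "./v")),
    stringsAsFactors = FALSE
  )
}

# Usage, with a hypothetical file name:
# sheet_path <- extract_xlsx_part("my_workbook.xlsx")
# read_sheet_cells(sheet_path)
```

Running `unzip("my_workbook.xlsx", list = TRUE)` first is a quick way to see which parts a given workbook actually contains.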

(More)

I’ve realized recently how the last few years have changed the way I think about my work. This post is an attempt to put that thinking into writing.

I left grad school feeling I wanted to do more “applied” work than what academia usually offers, but I still assumed that application was a matter of doing an analysis and then letting people who make decisions consume and implement the lessons of that analysis. I created a lot of those for-application sorts of analyses for the U.S. Department of the Army, but left feeling I wanted to be part of the decision making process rather than just producing fodder for it. My current employer gave me the opportunity to work interactively with decision makers to clarify their goals and adapt my analyses to their needs, and also to be somewhat involved in the implementation side of things. So my career has followed a path of closer and closer integration of my analytic work with the decisions and implementation that my work is supposed to facilitate. I think that process has helped me better define how I think about “application.”

(More)