Tuesday, January 29, 2013

What is this statistical analysis you speak of?

I've spent the last month or so reading everything I can about data science. It's been fun and interesting, but I've come to the conclusion that I don't know a damn thing about statistical analysis. I feel that's going to be a problem going forward.

Code I understand. Heck, I've been writing code for most of my life, much of it code that needed to be far more robust than most of what I've seen in the field. Being able to hack an algorithm together is going to be my strong point.

Vocabulary is one of the most important things you can learn about a new field. Being able to communicate effectively about statistics is not a skill I have. I took a statistics class when I was in college, but my retention after fifteen years of not using it is pretty poor. I have a vague recollection of a few probability concepts, but that's about it. I'm almost certain that I didn't even learn anything about analysis.

I found this blog post from Andy Mueller that describes his standard approach for new data. I don't have context for half the words in it, but it gives me a great place to start looking for things to learn.

I've also grabbed myself a couple of statistics books, and want to try implementing a few of these algorithms. I've found that I understand concepts much better when I've implemented them in code.

Anyone have any other ideas for learning statistics?

