Wednesday, November 7, 2012

Data Science

Data science just sounds really corny as a name - maybe it's just that "science" is overused as a term these days. As much as I like Mythbusters, I cringe every time they claim they're doing "science". They have a really cool program, and it's neat to see things in slow motion (like a rocket-propelled car) but Mythbusters has only a passing relationship with the scientific method. I also cringe when they have a sample size of two. TWO IS NOT A SAMPLE.

But I digress. Data science.

Yes, data science sounds corny. However, it's what I've been doing for a long time now. When I was in the Army, my job was to use data to find terrorists - if that wasn't data science, I don't know what is. Before that, my second job out of college involved running Fourier transforms over massive (at the time) data sets. And now I make software that allows people to do intelligence analysis, which requires that I crunch data sets into a palatable form. Data science!

I think I may be a data scientist.

Probably not a very good one yet. Apparently one needs to know how to use R in order to be a proper data scientist. And having used Ruby for the last few months, R looks like a bit of a different beast.


This all started when I read one of Michael O. Church's posts about programming: he argued that most programmers need to acquire another talent to go along with programming. I don't necessarily agree with the ratios (I like being a programmer who has other skills) but I do agree with the necessity - unless you're a Charles Nutter, Jeff Atwood, or Jon Skeet, you're eventually going to hit a ceiling with your skill level.

I love writing code and I think I'm fairly good at it, but I am no Headius. The focus and effort that those people put into being the best is impressive, and I think my priorities are simply different from theirs.

So instead of beating my head against the limits of my own time and focus, I've been considering alternate career focuses. I made an attempt at user interfaces, since that's part of what I seem to enjoy, but it didn't feel quite right. I know that systems engineering is not my strong suit either - I get frustrated when setting up software.

When I finally read a description of data science (almost by accident) I realized I had found what I'd been looking for. You developers can keep your LRU caches and your NP-complete problems - I have data! Of course, processing that data still depends on LRU caches and on tackling NP-complete problems - but that no longer has to be the focal point. The important thing is to get the answer out, not to get it out in the shortest possible time.


This is probably boring as hell to read if you're not me. Which I believe is the very definition of a blog.

Anyway, I want to be a data scientist and I'm REALLY excited about it. If you can read this without getting excited, then I think data science isn't for you. But I love it. Next stop: Kaggle!
