Data Scientist’s Toolbox Review

By Lucas | August 11, 2014

The Data Scientist’s Toolbox is the first course in the nine-course sequence (plus capstone) that Johns Hopkins offers via Coursera toward its Data Science Specialization. It was not only my first course in that sequence but also my first class on Coursera. In fact, it was my first MOOC (massive open online course).

I was under the impression that the general public is now pretty well informed about MOOCs, but conversations with my college-educated peers have made it obvious that this is not the case. I’ve had to do a lot of explaining to my friends over the summer about the classes I’ve been taking: that anyone can enroll, that the instructor’s videos are available on demand, that quizzes and projects are submitted online, and that for a small fee, Coursera will use a few identity verification features to issue you a special “verified” certificate at the end of the course that can even be posted to LinkedIn automatically.

I took the Data Scientist’s Toolbox concurrently with course 2, “R Programming,” and spent most of my study time in May on R Programming rather than on this course. Professor Jeff Leek did a good job of motivating the Data Science Specialization, both giving interesting examples of data science in the news and explaining how it is growing as a career path. Leek’s ability to keep me engaged with relevant examples and stories has been one of his greatest teaching strengths throughout the three courses he teaches in the specialization.

The bulk of the course is spent setting up the software used throughout the remainder of the specialization. That is, anyone taking this class will spend most of their time installing RStudio and Git Bash on their computer and setting up a GitHub account. If you are already “good with Git,” you’ll find this class extremely easy. If you are not, expect an hour or two of struggling with Git as you complete the course project, which is built around it. Version control with Git is not the most exciting way to begin your data science studies, but having completed most of the rest of the sequence, I would urge anyone who plans to finish it not to overlook this project, since Git is a tool the professors use in most of the remaining classes.

There’s not much else to say. This class is much, much easier than any of the other classes in the Data Science Specialization. The course dependency chart lists this course as a “soft dependency” for R Programming, and I agree that the two classes can certainly be taken at the same time.

Click here to register for the Johns Hopkins Data Science Specialization on Coursera. (Affiliate link, thanks for your support!)
