Consider this a #tbt.
I’ve made a lot of decisions (or, in failing to make a decision, reinforced the path I was on) re: education, career, and life, and for the most part I regret none of it. All my cumulative life experiences exist to put me exactly where I need to be today. And here is a good place! So in the hypothetical “what would you say to your college self if you could go back” I’d probably say — this is a nice gig so enjoy it, be kind to your friends/family and yourself. Everything is going to take longer than you think now, but you’ll get there.
You know: heavy on validation, easy on specifics.
If I could, though, I’d be like JUST ONE THING: will you please take at least one statistics class and intro to economics. At some point in my college tenure I decided that I didn’t want to study those things, so I didn’t. I, in fact, *rejected a major in journalism* because it required those electives.
And there’s precedent for me taking classes not because I had to but because it seemed like the right thing to do: throughout my K-12 schooling, “calculus” always seemed like the pinnacle of stuff smart people learn, so I took calculus. I bet if my older, wiser, more attractive self came to visit college Mary, I could have convinced myself to take some stats classes. Skipping them has always been my Achilles’ heel as I’ve explored career paths that involve any type of number crunching (and most do).
So let’s talk data science.
Last night I went to a General Assembly (GA) “Intro to Data Science” workshop to get a clearer picture of the skills, tools, and job titles behind “data science.” I know a bit about data science in academia, but have only a vague notion of how this all works in the business world (and, let’s not kid ourselves, there’s zero consensus, but this perspective helped!). I also picked up some good resources from the instructor, a biz intelligence manager for a company that makes your phone.
Beyond being open-minded and a good problem solver and communicator, here are a few technical skills you’ll need if you’re thinking of exploring this road:
- Working knowledge of statistical techniques (Bayesian and otherwise), and when and how to apply them:
  - Regression: forecasting and prediction from numeric values
  - K-means: segmentation and cluster analysis
  - Naïve Bayes: “is this spam?” (distinguishing between 2 options)
  - Nearest neighbor: matching similar product users, for example
- Software development skills to obtain, clean, combine, and manipulate data and implement models:
  - If you’re going to learn *one* programming language, make it Python: tons of resources are widely available for it
  - R is also free and good for building models
  - Learn some SQL and how to build queries (MySQL, Postgres, SQLite, etc.)
- Data visualization (Tableau, Python and R libraries, SPSS):
  - Tableau is $$$ but has a free public version! Who knew? Great for learning but you have to make your data public.
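
To make a couple of the items above concrete, here is a minimal sketch in Python (standard library only) that pulls per-user totals out of a small SQLite table with a SQL query, then does a nearest-neighbor lookup to match similar product users. Everything in it is an assumption for illustration: the table, the user names, the usage hours, and the helper names `load_usage` and `nearest_neighbor` are hypothetical, and none of it comes from the workshop.

```python
import math
import sqlite3

# Hypothetical usage log: (user, product, hours). Made-up numbers.
ROWS = [
    ("ana", "maps", 5.0), ("ana", "music", 1.0),
    ("ben", "maps", 4.0), ("ben", "music", 2.0),
    ("cy", "maps", 0.5), ("cy", "music", 6.0),
]


def load_usage(rows):
    """Load rows into an in-memory SQLite table and return per-user profiles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE usage (user TEXT, product TEXT, hours REAL)")
    conn.executemany("INSERT INTO usage VALUES (?, ?, ?)", rows)
    # One row per user, one column per product (a tiny hand-written pivot).
    query = """
        SELECT user,
               SUM(CASE WHEN product = 'maps' THEN hours ELSE 0 END) AS maps,
               SUM(CASE WHEN product = 'music' THEN hours ELSE 0 END) AS music
        FROM usage
        GROUP BY user
        ORDER BY user
    """
    profiles = {user: (maps, music) for user, maps, music in conn.execute(query)}
    conn.close()
    return profiles


def nearest_neighbor(profiles, target):
    """Return the other user whose profile is closest by Euclidean distance."""
    others = (user for user in profiles if user != target)
    return min(others, key=lambda user: math.dist(profiles[user], profiles[target]))


profiles = load_usage(ROWS)
print(nearest_neighbor(profiles, "ana"))
```

With this made-up data, the closest match for `ana` is `ben`, since both spend most of their time in the maps product. Real work would more likely lean on a library such as scikit-learn than on hand-rolled distance math, but the idea is the same.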
1. The work of data science, from this perspective at least, is a lot more “business-y” and less “program-y” than I had anticipated. The upside: I have some existing and developing skills to support this work. The downside: the path I’m taking to become a web developer doesn’t really intersect with this world. So I left thinking “worth exploring more a bit further down the road…”
2. ^^ This doesn’t necessarily apply to data science in academia, but looking at that as a *career path* requires layers upon layers of content knowledge and research skills/knowledge. A whole other bag. If you just want to dabble or see “how could this apply to my work as an [X]?”, I wholeheartedly recommend Mako’s Community Data Science Workshop.
3. I downloaded and plan to learn a little bit about R and Tableau, both good tools for data analysis and visualization, but I won’t go deep into these unless there’s a need to.
4. I also plan to continue learning about SQL (it’s an important part of Ruby on Rails) and NoSQL databases.
5. NOT MENTIONED: Ethics. Granted, I arrived late, so let’s assume I just missed it. I won’t go deeply into this now, I’ll just direct you to the UW’s Tech Policy Lab for some good research on complex policy issues emerging from 21st century technology, including online privacy, big data, public records access, wearable tech, etc.
6. Related to #5, I was looking for an article about the whole “what exactly is data science in business” debate and I was only finding white dudes, so I tried adding “feminism” to my search and here’s what I got:
Well that’s a bummer. pic.twitter.com/bh3fqDMOvV
— Mary Dickson Diaz (@marythought) August 13, 2015
GA has some cool resources for data science worth investigating, starting with a series of workshops on Excel, SQL, and Tableau. These classes are usually $35 a pop, run 2.5 to 3 hours, and are in-person only. That’s enough to get started and pick up some resources for how to learn more.
They also have an 11-week data science course, two nights a week (3 hours each class), starting in late October, for $4,000. I would consider this (and it’s actually good timing for the bootcamp I’m doing this fall), but it’s a bit far removed from the skills I need to learn to become a Rails developer and I think I’ll need to direct my time and resources elsewhere. But maybe it would work for you? Get your company to pay for it!
Across town, Galvanize has a full-time 12-week data science bootcamp for $16,000 (see if you can qualify for scholarship or financial aid). I found this self-paced data science primer in their FAQ. They accept students from a range of backgrounds but look for “background in a quantitative discipline including foundational statistics, probability, linear algebra, and mathematics” (attn college Mary, do you see now what I’m talking about?).
Here’s a bunch of other options outside Seattle: SkilledUp List of Data Science Bootcamps & Fellowships (from February of this year, this field changes fast).
August 13, 2015 at 1:17 pm
Lassana wrote a good blog post on R: http://www.lassanamagassa.com/2013/09/ready-for-r/