The sexiest job of the 22nd century
Three questions you should ask in a data science interview

Cassie Kozyrkov
Mar 15

Data science has been called “the sexiest job of the 21st century”, a sentiment I’d believe if I saw more business leaders hiring data scientists into environments where we can be effective.
Instead, many of us feel misunderstood and invisible.
The world isn’t ready for us (yet).
Who we are

We are the people who help inspire new directions for your business, reduce your risk of setting important decisions on fire, and automate the ineffable through machine learning and AI.
We make your data useful, yet you make us live in resource squalor.
You ask us to make our peace with:

Unskilled leadership: If you don’t have personnel skilled at leading and managing the data science function, we’ll have a miserable time.
Until you have decision-makers skilled at assigning work appropriately and making data-driven decisions, employers will keep calling their data scientists useless.

No data: If you hire data scientists before data engineers, it usually means we have no data to work with and we must first build the data engineering function for you or we’re forced to tuck and roll.
If we stay, we end up doing a job other than the one you claim you’re hiring us for.
I’ve said it and I’ll keep saying it: you need high-quality data for data science to be effective.
We’re not magical leprechauns, so we can’t make something out of nothing for you.

Nasty tools: Data science developer tools are a misery.
The ecosystem is fragmented, especially when it comes to AI, and even the best options are far from perfect.
There’s always something that makes the ride bumpy.
If you’re interviewing for a data science gig, make sure you grill your potential employer about their plan for all three of these points so you don’t end up in a sad spot.
Don’t forget to ask about people: whose job is it to make sure you have data? Who gets fired if all your insights aren’t used for anything? Who picks the tools you use and makes sure they play nice with all the other infrastructure?

I’ve written plenty about leadership and data, so it’s high time I mentioned tools.
Applied data scientists (including those working on ML/AI) don’t want to build our tools from scratch (that’s a different job – if we wanted it, we’d be in it already).
For example, we’d much rather use an existing package to make a histogram than write the code that displays rectangles to a screen.
Asking us to roll our own is like asking you to build your own microwave if you’re opening a restaurant.
We’ll build them if we have to, but we’d prefer to jump straight into the cooking.
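
To make the histogram point concrete, here is a minimal sketch, assuming NumPy is installed. The helper name `summarize` and the toy data are made up for illustration; the only library call is `np.histogram`, which handles the binning so we never touch the rectangle-drawing layer ourselves.

```python
import numpy as np  # assumption: NumPy is available in your environment


def summarize(values, bins=4):
    """Bin the values with NumPy's existing histogram routine.

    Hypothetical helper for illustration: returns plain Python lists
    of per-bin counts and the bin edges.
    """
    counts, edges = np.histogram(values, bins=bins)
    return counts.tolist(), edges.tolist()


# Toy data, invented for this example.
values = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
counts, edges = summarize(values, bins=4)

# A quick text rendering: one row per bin, one '#' per observation.
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:5.2f} to {right:5.2f} | {'#' * count}")
```

The binning logic is one line because someone else already wrote and tested it, which is exactly the kind of head start an applied data scientist wants.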
Working with Satan

Sometimes the proprietary tools that management foists on data scientists are even worse than the ones they could cobble together themselves.
I remember one that my friends had nicknamed “Satan”, as in, “Yeah, I know that takes one line in R but you should probably budget all day to get it working in Satan.”
It’s hard to go through the day with a song in your heart when the tools at your disposal are horrible.
Take a designer’s perspective

Sometimes the trouble with available tools is in the eye of the beholder: perhaps the root of your frustration is that you’ve picked up a tool that wasn’t made for you.
Let’s take a look at two tools of Google origin.
Keras is not only a beautiful API, but it was built with the data scientist in mind.
For example, Keras’s error messages are designed to guide the data scientist’s next move, so they’re concise and friendly-looking, while an equivalent mistake in TensorFlow spits out a text jumble of Dickensian proportions.
This shouldn’t surprise you if you put your design thinking hat on; as the industrial lathe of AI, TensorFlow was not originally designed with data science users in mind.
It was made for researchers breaking new ground at Google scale… and it’s good at what it’s built for.
The great news for us data science types is that even TensorFlow is getting more cuddly.
The new 2.0 release is moving in our direction and it shows.
Instead of grumbling about what’s still missing, I’m cheering them on! I can’t wait to have a laptop sticker that says “I love TensorFlow” (as opposed to “I tolerate TensorFlow because it’s the only thing that handles my data at this scale”).
I’m delighted to be part of TensorFlow initiatives that explicitly identify data scientists as primary users.
One example I’m excited to tell you all about in a future post is the What-If Tool, which makes model understanding, bias detection, and ML data exploration easy.
The team included a user experience designer tasked with making data scientists happy from Day 1! You can sneak a look at the results here.
Curious to know what you’re looking at? Find out more about the What-If Tool here.
If it’s not made for you, it probably won’t fit you

Before you invest in learning a tool, take a moment to think about where it came from and which communities its builders are courting as they steer the development of new versions.
Try before you buy

While we’re at it, if you’re calling the tooling shots for your organization, don’t commit to a tool before your data scientists have playtested it.
You’d think this would go without saying, but Satan suggests otherwise.
Thinking about picking up a tool built for analysts in the retail space and plugging it right into your healthcare company? Oh dear.
You might want to consider some tooling support from dedicated engineers so your data scientists aren’t miserable.
Your current analysts probably didn’t sign up for dealing with what will feel like a piece of junk to them, and they might not have been around the block enough times to know to ask about tools (and engineering support for these tools!) during their interviews.
Punchline

If you’re a data scientist looking for a new gig, don’t forget to check that whoever you’re about to trust with your career understands your needs.
Ask potential employers pointed questions about data, decision-makers, and tools.
Make sure they have what our kind needs to be happy and effective.
If you love the work, I’d hate to see you become yet another Director of Data Science in a company with no data!