Hiring data scientists
2022 Update: I believe I wrote this post around 2014 (!) and, while many of my views remain the same, some of them have been updated as I have interviewed and hired many data scientists since. I will be updating this post in due course to reflect what I believe are contemporary best practices.
These are my thoughts about the process of hiring data scientists. By process I mean the actual process of conducting interviews and asking candidates questions. I don't mean the topic of the questions or the skills you need to look for in data scientists (I covered that earlier here). Many of these thoughts also extend to hiring software developers and beyond, but they are written specifically about data scientists. This piece is the result of my interviewing candidates and being interviewed as a candidate at mid-size pre-IPO companies, large public companies, and small startups. Full disclosure -- these are not data-driven, at least not with any large samples. However, I believe they are internally consistent and will result in a better, more equitable hiring process for both those doing the hiring and those seeking a position, and that they will help fight employee churn caused by mismatched hires.
Obligatory note that these are my own views and not the views of any of my employers, past or present.
My ideal process looks something like this. I'll detail each of these steps below.
That's it. There aren't multiple on-site interviews unless the candidate requests them (which is a totally good thing). The goal is to hire people who are interested in the job, who would be interesting to work with, and have the potential to grow in the position while having a positive impact on the company. That's it. You're not looking for ninjas, rockstars, pirates, unicorns, 10x data scientists, or Stanford degrees. You're looking for people -- of all genders and all ethnicities and all backgrounds.
These four steps should start with a fairly wide net and get narrower as the process is completed. Err on the side of interesting, not on the side of exacting resume requirements. People with diverse backgrounds bring a lot to the table -- things you may not even realize you're missing in your organization. When you select for highly specific resume requirements (e.g., a specific degree level, a degree in a specific field, a certain programming language), you are unnecessarily limiting the search. Not only are you missing out on extremely intelligent and motivated people, you're most likely thinking about solving the very specific problems you're facing right now rather than the problems you'll be facing in six or twelve months.
Don't overfit to your organization's immediate situation or some template for what a "data science resume" should look like. Data science is a new field and there are many different kinds of data scientists. If you only hire people with PhDs in computer science from Stanford or Berkeley, you're likely to miss out on great candidates from other fields and schools. If you value diverse knowledge, viewpoints, and expertise, you need to branch out. Drew Conway has a great talk on this. Obviously, if the Stanford CS PhD is the best candidate for the job, you should hire them! But don't make this your primary selection characteristic.
1) Treat candidates like they're intelligent human beings, with honest intentions, and with something to add. That doesn't mean the something they can add is the thing you need -- but it doesn't mean they're stupid, or incompetent, or any of the other pejoratives that are often thrown around about candidates. Looking for a job is stressful for everyone.
2) Give candidates the benefit of the doubt. Drop the "prove it" mentality. It's not clear to me why so many people in tech are convinced that the world is just full of idiots masquerading as qualified candidates and it's up to YOU to find them out and expose them for the fraudsters they are. In many other fields, your resume and your references speak to your ability to do your job. You're not required to prove from the ground up that you possess the basic skills necessary.
Depending on the seniority of the position, a technical screen may be in order. These should be pretty short and open-ended, devoid of "trivia" or specific coding questions. Try to standardize these and make the entire call take less than 30 minutes. Standardizing it and keeping it short means you can do more of these, which reduces the likelihood that you'll filter on arbitrary criteria such as a specific degree or years of experience prior to speaking to the actual candidate.
I tend to ask a single question or two that spans most of the 30 minutes and allows the candidate to be as specific or technical as they like. I usually spend five minutes or so at the beginning of the call discussing the specifics of the position, such as daily duties, and why we're hiring for that position. I also always leave at least five minutes at the end of the call for the candidate to ask questions.
If the candidate is obviously more senior and their technical skills are well-known to you, or if they have shared public code (e.g., on GitHub) with you, this conversation can be more informal and more about the specific position. The only goal of the technical screen is to establish that the candidate has the minimum ability and qualifications to advance to the next round, homework.
I'm a fan of homework assignments. Ideally for me, they take place after an initial phone conversation that signals that the candidate is interested in the work and that they seem minimally qualified. Homework shouldn't be onerous or take lots of time. Shoot for a task that can be completed in one to two hours, because let's be realistic -- many candidates want the job and some will spend as long as they can on the task. The task should approximate a real task that a qualified candidate should know how to do. It shouldn't contain "gotchas" and doesn't have to contain a single "right" answer that the candidate must discover in order to advance.
The entire point of this exercise is to make the work as representative of a "real" work environment as possible. The candidate can use their own equipment, in a setting of their choosing, with their favorite development environment. It removes the artificiality of coding on a whiteboard, or of coding up an algorithm from scratch that the candidate probably wouldn't have to do in a typical day's work.
For data scientists, this means providing a dataset that requires some cleaning, but not a ton. Then ask the candidate to build a model to answer some question, explain why they made the modeling choices they made, and describe how they evaluated the performance of their model. That's it. Seriously. The parameters are broad enough that candidates can be as sophisticated or not as they like, and can demonstrate substantive knowledge about a problem domain (e.g., class imbalance in some prediction cases, or the in/appropriateness of linear models for certain tasks).
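To make the class imbalance example concrete, here is a minimal sketch of the kind of reasoning a homework write-up might show. The labels are made up purely for illustration, and the helper functions `accuracy` and `recall` are my own hypothetical names rather than any library's API. It shows how a model that always predicts the majority class can look accurate while finding none of the rare cases.

```python
# Hypothetical toy labels (not real data): 1 = rare positive class, 0 = negative class.
y_true = [1] * 5 + [0] * 95

# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * 100


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model correctly identified."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

A candidate who notices this gap, and explains why they chose a metric other than raw accuracy, is demonstrating exactly the kind of judgment the assignment is meant to surface.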
Many people tend to make both technical screens and homework assignments too hard and overly specific to the job. Do a survey of your colleagues and try to gauge where they were at, skill-wise, when they were hired for a similar job to the one listed. Figure out what they learned on the job. If a skill / method was primarily learned on the job, it doesn't make for a good screen / homework question. It might make a good speculative "how would you handle" on-site question.
Two common arguments against assigning homework are a) it means you won't be able to attract the very top candidates who don't have to do homework for other prospective jobs, and b) it places an undue burden on those candidates who have kids or other obligations at night / on the weekends.
The first criticism makes little sense to me. Candidates who are excited about the work and want to make sure they will like the kind of work they will be doing are the kinds of candidates you want to hire. Candidates who see themselves as invaluable commodities that will only go to the employer that showers them with money and asks nothing in return are probably not. I'm more afraid of missing out on that person who may not have the typical profile but will be an awesome colleague than I am of missing out on someone with a picture-perfect but uninteresting resume.
The second criticism can be a fair one -- that's why it's so important to make sure you make the homework task a reasonable one, that can be completed to satisfaction in a short amount of time. This takes practice! Make sure you solicit feedback from all of the candidates who complete the homework assignment about how long it took to complete the task. Remember that candidates may be asked to do homework assignments for several different companies. There are also options to use a time-boxed environment.
If the candidate really can't do the homework assignment or you really want to be strict about how long the homework assignment takes, you can make this part of the on-site interview. In this case, the first two hours of the day will be devoted to the assignment and the candidate can bring his or her own laptop to work on. Naturally, this increases the number of on-site interviews you will likely do.
On-site interviews should be conversational, not adversarial. They should be technical but not trivia sessions. One good idea for a session is to have the candidate present their homework to one or more of the interviewers, who can ask probing questions, hear explanations about why the candidate made certain choices, etc. This isn't a defense, but a discussion. Other good sessions include extended conversations about projects that the candidate has worked on (within the range of discussion allowed by NDAs, of course) where implementation details, architectures, and technologies can be discussed.
Ask the same questions (within reason) of all candidates. This provides standardization and a baseline that you can use to evaluate candidates. It also (somewhat) removes your own personal biases from the process where you tailor certain kinds of questions to certain kinds of candidates.
No. Whiteboard. Coding. That merits repeating. No. Whiteboard. Coding. I said it. Whiteboard coding is a bad idea for a number of reasons.
First, it's not a natural environment for writing code; even if it is "unnatural for everyone," as its proponents like to argue, it's not actually true that the degree of unnaturalness is uniformly distributed. No one actually writes code on a board, without tools like tab-complete or an IDE, and without Google/Stack Overflow/help files.
Second, whiteboard coding problems naturally skew towards trivial programming exercises that reflect "Algorithms 101" style problems. If you've been taking my advice and recruiting across a variety of backgrounds, many very qualified candidates won't have a formal CS education and won't be able to rattle off various sort algorithms from memory. Building on the first point, it's rare that people have to implement these things from scratch anyway -- and when they do, they have tools and books at their disposal. Making a whiteboard coding exercise look like an Algorithms 101 quiz question will tend to favor young candidates who just graduated with a CS degree and candidates who have spent a lot of time prepping for coding interviews. That's about it.
If the homework assignment was "iffy" in quality, ask the candidate to email in some code samples, or post them on GitHub, so they can be discussed during the interview as well. This eliminates the need for whiteboard coding.
Interviewing is extremely stressful. Make sure you build in plenty of small breaks for the candidate to use the restroom, have water/coffee/etc., and collect their thoughts. If the candidate wants to whiteboard things, let them, but don't force them. Treat them like humans that might be your co-worker one day soon!
Always, always, always leave time for the candidate to ask questions about the company and the position -- never omit this portion of an interview session merely because you want to ask more questions. As a candidate, I treat an interviewer skipping it as a huge red flag. I interviewed at a very large tech firm where an interviewer told me he was forgoing my Q&A opportunity because he wanted to ask me one more question. This was a data point I used when I declined their offer.
Hiring and job-seeking is a matching process. You want everyone involved to have as much information as possible to avoid the possibility of a mismatch -- which is costly for everyone. Candidates need to know the truth about what it's like to work at your company, and you need to see their work and discuss it with them. This is why conversations, not inquisitions, are important.
I am advocating for an empathetic but potentially more time-consuming process here. I realize a lot of this advice runs contrary to widespread hiring practices, and I'm comfortable with that. I'm interested in promoting a workplace that welcomes diversity in all forms and that matches candidates with jobs that help them grow as people and as data scientists. Similarly, you should be hiring people who will push your organization forward and who encourage learning and teaching between peers.
Copyright ©2022 Trey Causey