By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData Collective
  • Analytics
    AnalyticsShow More
    data analytics in sports industry
    Here’s How Data Analytics In Sports Is Changing The Game
    6 Min Read
    data analytics on nursing career
    Advances in Data Analytics Are Rapidly Transforming Nursing
    8 Min Read
    data analytics reveals the benefits of MBA
    Data Analytics Technology Proves Benefits of an MBA
    9 Min Read
    data-driven image seo
    Data Analytics Helps Marketers Substantially Boost Image SEO
    8 Min Read
    construction analytics
    5 Benefits of Analytics to Manage Commercial Construction
    5 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-23 SmartData Collective. All Rights Reserved.
Reading: Interviewing Software Engineers: Retiring a Great Interview Problem
Share
Notification Show More
Latest News
data analytics in sports industry
Here’s How Data Analytics In Sports Is Changing The Game
Big Data
data analytics on nursing career
Advances in Data Analytics Are Rapidly Transforming Nursing
Analytics
data analytics reveals the benefits of MBA
Data Analytics Technology Proves Benefits of an MBA
Analytics
anti-spoofing tips
Anti-Spoofing is Crucial for Data-Driven Businesses
Security
ai in software development
3 AI-Based Strategies to Develop Software in Uncertain Times
Software
Aa
SmartData Collective
Aa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Data Management > Best Practices > Interviewing Software Engineers: Retiring a Great Interview Problem
Best PracticesCommentaryCulture/Leadership

Interviewing Software Engineers: Retiring a Great Interview Problem

Daniel Tunkelang
Last updated: 2011/08/08 at 11:41 AM
Daniel Tunkelang
12 Min Read
SHARE

Interviewing software engineers is hard. Jeff Atwood bemoans how difficult it is to find candidates who can write code. The tech press sporadically publishes “best” interview questions that make me cringe — though I love the IKEA question. Startups like Codility and Interview Street see this challenge as an opportunity, offering hiring managers the prospect of outsourcing their coding interviews. Meanwhile, Diego Basch and others are urging us to stop subjecting candidates to whiteboard coding exercises.

More Read

database compliance guide

Four Strategies For Effective Database Compliance

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency
What Are the Most Serious Privacy Concerns Regarding Big Data?
7 Consequences of a Data Intrusion: Insights From Asiaciti Trust & MGM International
How To Improve Incident Response Time for Data Breaches

I don’t have a silver bullet to offer. I agree that IQ tests and gotcha questions are a terrible way to assess software engineering candidates. At best, they test only one desirable attribute; at worst, they are a crapshoot as to whether a candidate has seen a similar problem or stumbles into the key insight. Coding questions are a much better tool for assessing people whose day job will be coding, but conventional interviews — whether by phone or in person — are a suboptimal way to test coding strength. Also, it’s not clear whether a coding question should assess problem-solving, pure translation of a solution into working code, or both.

In the face of all of these challenges, I came up with an interview problem that has served me and others well for a few years at Endeca, Google, and LinkedIn. It is with a heavy heart that I retire it, for reasons I’ll discuss at the end of the post. But first let me describe the problem and explain why it has been so effective.

The Problem

I call it the “word break” problem and describe it as follows:

Given an input string and a dictionary of words,
segment the input string into a space-separated
sequence of dictionary words if possible. For
example, if the input string is "applepie" and
dictionary contains a standard set of English words,
then we would return the string "apple pie" as output.

Note that I’ve deliberately left some aspects of this problem vague or underspecified, giving the candidate an opportunity to flesh them out. Here are examples of questions a candidate might ask, and how I would answer them:

Q: What if the input string is already a word in the
dictionary?
A: A single word is a special case of a space-separated
sequence of words.

Q: Should I only consider segmentations into two words?
A: No, but start with that case if it's easier.

Q: What if the input string cannot be segmented into a
sequence of words in the dictionary?
A: Then return null or something equivalent.

Q: What about stemming, spelling correction, etc.?
A: Just segment the exact input string into a sequence
of exact words in the dictionary.

Q: What if there are multiple valid segmentations?
A: Just return any valid segmentation if there is one.

Q: I'm thinking of implementing the dictionary as a
trie, suffix tree, Fibonacci heap, ...
A: You don't need to implement the dictionary. Just
assume access to a reasonable implementation.

Q: What operations does the dictionary support?
A: Exact string lookup. That's all you need.

Q: How big is the dictionary?
A: Assume it's much bigger than the input string,
but that it fits in memory.

Seeing how a candidate negotiates these details is instructive: it offers you a sense of the candidate’s communication skills and attention to detail, not to mention the candidate’s basic understanding of data structures and algorithms.

A FizzBuzz Solution

Enough with the problem specification and on to the solution. Some candidates start with the simplified version of the problem that only considers segmentations into two words. I consider this a FizzBuzz problem, and I expect any competent software engineer to produce the equivalent of the following in their programming language of choice. I’ll use Java in my example solutions.

String SegmentString(String input, Set dict) {
int len = input.length();
for (int i = 1; i < len - 1; i++) {
String prefix = input.substring(0, i);
if (dict.contains(prefix)) {
String suffix = input.substring(i+1, len);
if (dict.contains(suffix)) {
return prefix + " " + suffix;
}
}
}
return null;
}

I have interviewed candidates who could not produce the above — including candidates who had passed a technical phone screen at Google. As Jeff Atwood says, FizzBuzz problems are a great way to keep interviewers from wasting their time interviewing programmers who can’t program.

A General Solution

Of course, the more interesting problem is the general case, where the input string may be segmented into any number of dictionary words. There are a number of ways to approach this problem, but the most straightforward is recursive backtracking. Here is a typical solution that builds on the previous one:

String SegmentString(String input, Set dict) {
if (dict.contains(input) return input;
int len = input.length();
for (int i = 1; i < len - 1; i++) {
String prefix = input.substring(0, i);
if (dict.contains(prefix)) {
String suffix = input.substring(i+1, len);
String segSuffix = SegmentString(suffix, dict);
if (segSuffix != null) {
return prefix + " " + segSuffix;
}
}
}
return null;
}

Many candidates for software engineering positions cannot come up with the above or an equivalent (e.g., a solution that uses an explicit stack) in half an hour. I’m sure that many of them are competent and productive. But I would not hire them to work on information retrieval or machine learning problems, especially at a company that delivers search functionality on a massive scale.

Analyzing the Running Time

But wait, there’s more! When a candidate does arrive at a solution like the above, I ask for an big O analysis of its worst-case running time as a function of n, the length of the input string. I’ve heard candidates respond with everything from O(n) to O(n!).

I typically offer the following hint:

Consider a pathological dictionary containing the words
"a", "aa", "aaa", ..., i.e., words composed solely of
the letter 'a'. What happens when the input string is a
sequence of n-1 'a's followed by a 'b'?

Hopefully the candidate can figure out that the recursive backtracking solution will explore every possible segmentation of this input string, which reduces the analysis to determine the number of possible segmentations. I leave it as an exercise to the reader (with this hint) to determine that this number is O(2n).

An Efficient Solution

If a candidate gets this far, I ask if it is possible to do better than O(2n). Most candidates realize this is a loaded question, and strong ones recognize the opportunity to apply dynamic programming or memoization. Here is a solution using memoization:

Map memoized;

String SegmentString(String input, Set dict) {
if (dict.contains(input) return input;
if (memoized.containsKey(input) {
return memoized.get(input);
}
int len = input.length();
for (int i = 1; i < len - 1; i++) {
String prefix = input.substring(0, i);
if (dict.contains(prefix)) {
String suffix = input.substring(i+1, len);
String segSuffix = SegmentString(suffix, dict);
if (segSuffix != null) {
memoized.put(input, prefix + " " + segSuffix);
return prefix + " " + segSuffix;
}
}
memoized.put(input, null);
return null;
}

Again the candidate should be able to perform the worst-case analysis. The key insight is that SegmentString is only called on suffixes of the original input string, and that there are only O(n) suffixes. I leave as an exercise to the reader to determine that the worst-case running time of the memoized solution above is O(n2), assuming that the substring operation only requires constant time (a discussion which itself makes for an interesting tangent).

Why I Love This Problem

There are lots of reasons I love this problem. I’ll enumerate a few:

  • It is a real problem that came up in the course of developing production software. I developed Endeca’s original implementation for rewriting search queries, and this problem came up in the context of spelling correction and thesaurus expansion.
  • It does not require any specialized knowledge — just strings, sets, maps, recursion, and a simple application of dynamic programming / memoization. Basics that are covered in a first- or second-year undergraduate course in computer science.
  • The code is non-trivial but compact enough to use under the tight conditions of a 45-minute interview, whether in person or over the phone using a tool like Collabedit.
  • The problem is challenging, but it isn’t a gotcha problem. Rather, it requires a methodical analysis of the problem and the application of basic computer science tools.
  • The candidate’s performance on the problem isn’t binary. The worst candidates don’t even manage to implement the fizzbuzz solution in 45 minutes. The best implement a memoized solution in 10 minutes, allowing you to make the problem even more interesting, e.g., asking how they would handle a dictionary too large to fit in main memory. Most candidates perform somewhere in the middle.

Happy Retirement

Unfortunately, all good things come to an end. I recently discovered that a candidate posted this problem on Glassdoor, a site that does not seem to mind when interview candidates violate NDAs under the veil of anonymity. The solution posted there hardly goes into the level of detail I’ve provided in this post, but I decided that a problem this good deserved to retire in style.

It’s hard to come up with good interview problems, and it’s also hard to keep secrets. The secret may be to keep fewer secrets. An ideal interview question is one for which advance knowledge has limited value. I’m working with my colleagues on such an approach. Naturally, I’ll share more if and when we deploy it.

In the mean time, I hope that everyone who experienced the word break problem appreciated it as a worthy test of their skills. No problem is perfect, nor can performance on a single interview question ever be a perfect predictor of how well a candidate will perform as an engineer. Still, this one was pretty good, and I know that a bunch of us will miss it.

Daniel Tunkelang August 8, 2011
Share this Article
Facebook Twitter Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

data analytics in sports industry
Here’s How Data Analytics In Sports Is Changing The Game
Big Data
data analytics on nursing career
Advances in Data Analytics Are Rapidly Transforming Nursing
Analytics
data analytics reveals the benefits of MBA
Data Analytics Technology Proves Benefits of an MBA
Analytics
anti-spoofing tips
Anti-Spoofing is Crucial for Data-Driven Businesses
Security

Stay Connected

1.2k Followers Like
33.7k Followers Follow
222 Followers Pin

You Might also Like

database compliance guide
Data Management

Four Strategies For Effective Database Compliance

8 Min Read
data access for engineers
Big Data

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency

11 Min Read
big data privacy concerns
Privacy

What Are the Most Serious Privacy Concerns Regarding Big Data?

8 Min Read
painful lessons from major data breaches
Security

7 Consequences of a Data Intrusion: Insights From Asiaciti Trust & MGM International

6 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI and chatbots
Chatbots and SEO: How Can Chatbots Improve Your SEO Ranking?
Artificial Intelligence Chatbots Exclusive
giveaway chatbots
How To Get An Award Winning Giveaway Bot
Big Data Chatbots Exclusive

Quick Link

  • About
  • Contact
  • Privacy
Follow US

© 2008-23 SmartData Collective. All Rights Reserved.

Removed from reading list

Undo
Go to mobile version
Welcome Back!

Sign in to your account

Lost your password?