Metric-Driven Agile for Big Data
Working in Bing Local Search brings together a number of interesting challenges.
Firstly, we are in a moderately sized organization, which means that our org chart has some rough similarities to our high-level system architecture. As a result, we have back-end teams who worry mostly about data - getting it, improving it and shipping it. These teams are not sitting in the end-users' laps and our customers, to some extent, are internal.
Secondly, we are dealing with 'big data'. I don't consider local search, as it is traditionally implemented, to be a big data problem per se; however, once one starts processing user behaviour and web-scale data to improve the product, it does turn into a big data problem.
Agile (or eXtreme programming) brings certain key concepts. These include a limited time horizon for planning (allowing issues to be addressed in a short time frame and limiting the impact of taking a wrong turn) and the 'on-site customer.'
The product of a data team in the context of a product like local search is somewhat specialized within the broader scope of big data. Data is our product (we create a model of a specific part of the real world - those places where you can perform on-site transactions), and we leverage large-scale data assets to make that data product better.
The agile framework uses the limited time horizon (the 'sprint' or 'iteration') to ensure that unknowns are reduced appropriately and that real work is done in a manner aligned with what the customer wants. The unknowns are usually related to the customer (who generally doesn't really know what they want), the technologies (candidate solutions need to be tested for feasibility) or the team (how much work can they actually get done in a sprint). Having attended a variety of scrum / agile / eXtreme training events, I am now of the opinion that the key unknown of big data - what is actually in the data itself - is generally not considered in the framework (quite possibly because this approach to engineering took off long before large-scale data was a thing).
In a number of projects where we are being agile, we have modified the framework with a couple of new elements.
Metrics, not Customers: We develop key metrics that guide our decision-making process, rather than relying on a customer. Developing metrics is actually challenging. Firstly, they need to be a proxy for some customer. As our downstream customers are also challenged by the big data fog (they aren't quite sure what they will find in the data they want us to produce for them), we have to work with them to come up with proxy metrics which will guide our work without incurring the cost of doing end-to-end experimentation at every step. In addition, metrics are expensive - rigorously executing and delivering measurements is a skill required of second-generation big data scientists.
The Data Wallow: While I'm not yet happy with this name, the basic concept is that in addition to the standard meetings and behaviours of agile engineering, we have the teams spend scheduled time together wallowing in the data. The purpose of this is twofold. Firstly, it is vital that a data team be intimate with the data they are working with and the data products they are producing - the wallow provides shared data accountability. Secondly, you simply don't know what you will find in the data and how it will impact your design and planning decisions. The wallow provides a team experience which will directly impact sprint / iteration planning.
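To give a feel for what "rigorously executing and delivering measurements" might look like for a proxy metric, here is a minimal sketch in Python. It is an illustration only, not a description of what we run: it assumes the proxy metric is the fraction of sampled listings that a human judge marked as correct, and it reports a Wilson score interval so the team can see how much of a sprint-over-sprint change is just sampling noise. The function names and the sample numbers are made up for the example.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("need at least one judged sample")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


def listing_precision(judgments):
    """Hypothetical proxy metric: share of sampled listings judged correct.

    `judgments` is a list of booleans, one per sampled listing.
    Returns (point_estimate, (lower, upper)).
    """
    correct = sum(1 for judged_ok in judgments if judged_ok)
    n = len(judgments)
    return correct / n, wilson_interval(correct, n)


if __name__ == "__main__":
    # Made-up sample: 90 of 100 judged listings were correct.
    sample = [True] * 90 + [False] * 10
    estimate, (low, high) = listing_precision(sample)
    print(f"precision = {estimate:.3f}, 95% interval = [{low:.3f}, {high:.3f}]")
```

With a sample of only 100 judgments the interval is several points wide on either side, which is one reason measurements like this are expensive: a metric that is meant to steer a sprint needs enough judged data to separate a real improvement from noise.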
(image: agile framework / shutterstock)