Build vs. Buy? Things to Consider Before Building an In-House Analytics System

Many analytics projects involving log files focus on operational event data. IT use cases around such projects typically focus on locating errors, warnings and critical event information within mountains of data. However, software applications and technology devices produce much more machine data than just log files, and that data can be incredibly complex.

Organizations need to search and mine the information contained in machine data, and they have two ways to meet this challenge: 1) obtain a purpose-built analytics system or 2) build one in-house. While the latter option may sound appealing at first, here are some things to consider before building an in-house machine data analytics system:

1. Different users have different requirements

A machine data analytics solution must satisfy the requirements of a wide range of internal consumers. A support engineer working a case needs to see patterns of events and statistics over time, grouped by specific system components. An account manager, professional services engineer or sales rep has very different data analysis needs, which involve quickly spotting reliability, performance and other issues affecting an account, while a marketing or program manager needs to spot feature adoption and trends in product demand. These differences between jobs and teams highlight how complex a machine data analytics system has to be.

2. Every product has a complex and unique representation for machine data

For a set of data to be useful, it needs to communicate detailed information about the configuration, events, and statistics of each product’s unique architecture. Applications, appliances and software products log events, counters and information about internal component states, down to the level of vendor-specific abstractions. Much of this data will be product-specific and will not conform to a common information model for interoperability or end use.

3. A useful data archive will be “Big Data”

To be of maximal benefit to all consumers, a good set of machine data needs to contain “everything.” Trending and field analysis require detailed parsing of data from all systems continuously, not just when problems are detected. A successful product may have thousands or even millions of devices and systems reporting data back regularly. The volume of data received and retained in such a case is likely to be in the range of hundreds of terabytes or petabytes over the course of a year.
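To see how quickly the numbers add up, here is a rough back-of-the-envelope sketch. The device counts and the 10 MB per device per day figure are illustrative assumptions, not measurements from any real product, and the function name is made up for this example.

```python
def yearly_volume_tb(devices: int, mb_per_device_per_day: float, days: int = 365) -> float:
    """Estimate raw data volume in decimal terabytes (1 TB = 1,000,000 MB).

    All inputs are hypothetical planning numbers; compression, replication
    and indexing overhead are deliberately ignored.
    """
    total_mb = devices * mb_per_device_per_day * days
    return total_mb / 1_000_000


if __name__ == "__main__":
    # Assumed fleet sizes, each device reporting an assumed 10 MB per day.
    for devices in (10_000, 100_000, 1_000_000):
        tb = yearly_volume_tb(devices, 10)
        print(f"{devices:>9,} devices -> {tb:,.1f} TB per year")
```

Under these assumptions, a fleet of 100,000 devices already lands in the hundreds of terabytes per year, and a million devices reaches petabyte scale, before any indexing or replication overhead is counted.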

4. Data formatting and semantics will change quickly and without notice

Machine data analytics requires quick adaptation, edge-case coverage and continuous business leverage. A machine data analytics solution cannot expect schematized or specially formatted data; it needs to adapt quickly to changes in the format of information from machine logs while maintaining semantic continuity with existing tools.
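As a small illustration of what “adapting to format changes while maintaining semantic continuity” can mean in practice, the sketch below maps two log line layouts onto one common record. Both layouts, the field names and the `parse_line` helper are hypothetical, invented for this example; a real system would face many more variants and edge cases.

```python
import re
from typing import Optional

# Two hypothetical layouts for the same event, e.g. before and after a
# firmware update changed the log format without notice.
PATTERNS = [
    # Old layout: "2014-03-01 12:00:05 ERROR disk0 read failure"
    re.compile(
        r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
        r"(?P<severity>[A-Z]+) (?P<component>\S+) (?P<message>.+)$"
    ),
    # New layout: "ts=2014-03-01T12:00:05 lvl=error comp=disk0 msg=read failure"
    re.compile(
        r"^ts=(?P<ts>\S+) lvl=(?P<severity>\S+) "
        r"comp=(?P<component>\S+) msg=(?P<message>.+)$"
    ),
]


def parse_line(line: str) -> Optional[dict]:
    """Return a normalized record, or None if no known layout matches."""
    for pattern in PATTERNS:
        match = pattern.match(line.strip())
        if match:
            record = match.groupdict()
            # Normalize so downstream tools see one consistent shape.
            record["severity"] = record["severity"].upper()
            record["ts"] = record["ts"].replace("T", " ")
            return record
    return None


if __name__ == "__main__":
    old = "2014-03-01 12:00:05 ERROR disk0 read failure"
    new = "ts=2014-03-01T12:00:05 lvl=error comp=disk0 msg=read failure"
    print(parse_line(old) == parse_line(new))
```

Supporting a new layout here means adding one pattern, while reports built on the normalized fields keep working. The hard part in a real deployment is noticing the change and keeping hundreds of such mappings current.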

In-house solution: The bottom line

While companies can choose to build such an analytical solution in-house, it’s not worth the time and effort to do so. An in-house machine data analytics solution is a complex, high-performance big data project, with associated BI tools, that requires a variety of committed resources for an extended period of time. It’s inherently time-consuming, and risky if not planned properly: appropriate resources are needed not only to design and implement the system, but also to maintain it and manage its life cycle on a continuing basis.