Enterprises today want to tap into the wealth of information hidden in the data around them to improve competitiveness, efficiency and profitability. Huge volumes of data are created every day from a variety of sources, including sensors, smart devices, social media and billions of Internet and smartphone users worldwide. The challenge is storing, managing and deriving just-in-time insights from this data, while preserving and using existing information management investments. The Big Data challenge is pervasive across the majority of industries, including finance, government, telecommunications, retail, healthcare, energy and utilities.
Successfully gaining insights from Big Data is driving the evolution of business roles, corporate culture and IT. Successful implementations of new data solutions require business analysts, application developers, data architects and administrators to work together to understand and extract value from structured, unstructured and real-time streaming data. New roles are emerging, including the data scientist, and new models are changing the way people work together and interact with their customers. As a result, IT vendors are being driven to simplify their Big Data solutions.
High Performance Environments and Big Data
Big Data is characterized by an explosion in the variety, volume and velocity of information created. Personal interactions, including social media, our use of smart devices and our use of the Internet, all create huge volumes of data. Behind the scenes, machine-to-machine interactions, sensors, recommendation engines and APIs drive a proliferation of information. From a volume standpoint, we are talking about terabytes, petabytes and beyond. Much of this data (in some studies as much as 80 percent) has little to no structure, and much of it is generated at machine speed, that is, at high velocity. The challenge is to remove the noise from this high-velocity, highly variable data and discover key insights while they are still relevant.
The evolution of the relational database has been driven by three major requirements: management of larger data volumes, improved performance and reduced query response time. These requirements led to the large distributed systems developed by relational database vendors, including IBM. Over the last two decades, these systems have become the backbone of transactional and warehousing applications. Increasing data volumes led to a new generation of massively parallel, high-performance databases. These databases ingest and process large amounts of structured data very efficiently. Unfortunately, not all of today’s data can be represented as structured data. Examples of semi-structured or unstructured data include text messages, video, voice, system logs, financial reports, email, legal documents, journals and weather data. These types of data led to the need for a new platform to perform real-time, historical and predictive analytics over high volumes of variable data, both structured and unstructured.
In certain applications, Big Data needs to be pre-processed and analyzed before it is inserted into relational databases. In these instances, Big Data augments and enhances existing applications and business processes. This hybrid approach to data processing pairs the traditional relational database with the new Big Data platform. It enables enterprises to gain the most value from highly variable data with the same high performance and reliability delivered by relational databases.
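The hybrid approach described above can be illustrated with a minimal sketch. It is not a description of any vendor's platform. It assumes a hypothetical log format ("<timestamp> <LEVEL> <free text>"), treats unparseable lines and INFO entries as noise, and uses Python's built-in SQLite module as a stand-in for the relational database. The function and table names (preprocess, load, count_by_level, events) are illustrative assumptions.

```python
import re
import sqlite3

# Hypothetical log format, assumed for illustration only:
# "<timestamp> <LEVEL> <free text>"
LOG_PATTERN = re.compile(r"^(\S+)\s+(INFO|WARN|ERROR)\s+(.*)$")


def preprocess(lines):
    """Yield (timestamp, level, message) for lines that parse and are not noise."""
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # unparseable line: treated as noise in this sketch
        timestamp, level, message = match.groups()
        if level == "INFO":
            continue  # assumption: routine INFO entries are not worth loading
        yield timestamp, level, message


def load(rows, conn):
    """Insert the pre-processed rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (ts TEXT, level TEXT, message TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    conn.commit()


def count_by_level(conn):
    """Run an ordinary SQL query over the structured result."""
    cursor = conn.execute(
        "SELECT level, COUNT(*) FROM events GROUP BY level ORDER BY level"
    )
    return cursor.fetchall()
```

In use, raw lines would be passed through preprocess and the result handed to load with a connection such as sqlite3.connect(":memory:"), after which existing SQL-based applications can query the events table as usual. In a real deployment the pre-processing step would run on the Big Data platform rather than in a single process.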
The Emergence of the Data Scientist
The data scientist has emerged as a key new role in enterprises that embrace Big Data. On any given day, there may be 2,000 job openings for that role. While traditional database administrators work with database queries, performance and storage efficiencies, data scientists establish data strategies and formulate the broader set of data to be analyzed and questions to be answered using the new data. The data scientist works across business lines to help the enterprise understand actionable trends and patterns in the data. On any given day, the data scientist discovers, secures, cleans, explores, formats, structures, models and analyzes data. The position is a hybrid, requiring the application of data mining, statistics and machine learning to answer pre-defined questions while also exploring new and existing data to discover new information and insights.
The data scientist generally works with the CIO or CTO, advising them how to derive maximum business value from Big Data and how best to integrate new information with their existing systems, infrastructure and processes. This direct communication between data scientist and executives helps establish and manage expectations at a time when executive expectations of the IT staff are accelerating. The data scientist also advises administrators and application developers as new information is brought online and new information tools are acquired. People who are particularly successful as data scientists can handle the complexities of information management and expand the position to pursue their own research, learning and growing every day. Other roles emerging in this area include professionals who will help manage and navigate the new information. These positions include individuals with functional expertise in administration, performance monitoring and data management. Professionals who understand Hadoop as an underlying technology can assist with the intake and consumption of Big Data and new information.
A company that recognizes the need for a data scientist is also likely to possess another core organizational element for success with Big Data: a culture that recognizes the importance of asking new questions. This type of culture not only looks at the questions you know you need to ask, but also searches for new questions and new ways to look at and analyze the business. Business leadership in such a company consistently examines data generated from operations, makes strategic decisions and plans for growth based on that data, and then evaluates the results in a closed-loop process. A company that connects IT to the broader business challenges can convert information into business insights and competitive advantage.
With the combined help of a data scientist and a progressive culture, executives and IT managers can use information to look across business lines, departments and divisions to solve mission-critical issues. Consuming and integrating information from across the organization is essential to gaining a better view of customers, the market, internal operations and strategic opportunities. This level of integration and insight can only be developed when the IT department views the Big Data IT challenges as business challenges. Executives, in turn, need to prioritize and encourage collaboration and integration.
Industry Demands for IT Vendors
For IT vendors, these trends are driving the enhancement of current applications to embrace Big Data and the evolution of the Big Data platform itself, including infrastructure, analytics and tools. Companies will achieve the most value from Big Data when IT providers simplify access and analysis tools for trusted data, thus making the role of the database administrator and internal IT staff easier. To this end, interfaces need to be standardized and processes automated. Also required are out-of-the-box applications, robust processing platforms and APIs that are consumable by application developers. With this broad arsenal of resources, companies and their IT departments can fully exploit the massive volumes of available data. The IT vendor that provides all these elements will win.
Big Data is an important opportunity for businesses today. It can provide a key competitive advantage to those that strive to better understand their operations and their customers. Integrating Big Data and applying context, patterns and intelligence will drive new business efficiencies and deliver an improved view of the customer. When embraced by an organization with a high performance data environment and staff empowered to work across functional boundaries, Big Data will quickly become an important tool to help deliver the insights that drive business improvements.