© 2008-23 SmartData Collective. All Rights Reserved.
Best Practices | Big Data | Data Management | Policy and Governance | Software | Unstructured Data

Why a Mere 300 Exabytes Will Give Us a Headache [VIDEO]

Datafloq
Last updated: 2013/05/29 at 12:18 PM
9 Min Read
Although 90% of the world's available data was created in the last two years, that still leaves a lot of 'old data'. In 2010 and 2011 combined we created 3 zettabytes of data. A very simplified calculation then puts the amount of 'old data' at roughly 0.3 zettabytes, or 300 exabytes. Compared with the 2.5 exabytes of data that we currently create every day, that sounds like nothing to worry about. Unfortunately, it is not: those 300 exabytes of data will give us headaches and sleepless nights, and will cost a lot of energy and money.
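
The simplified calculation above can be sketched in a few lines. If the 3 zettabytes from 2010-2011 represent 90% of all data, the remaining 10% is the pre-2010 'old data' (the exact result is about 333 exabytes, which the article rounds to 300):

```python
# Back-of-the-envelope check of the 'old data' estimate.
new_data_zb = 3.0   # zettabytes created in 2010 and 2011
share_new = 0.9     # fraction of all data created in those two years

total_zb = new_data_zb / share_new        # ~3.33 ZB in total
old_data_zb = total_zb * (1 - share_new)  # ~0.33 ZB of 'old data'

print(f"old data: ~{old_data_zb * 1000:.0f} exabytes")  # prints "old data: ~333 exabytes"
```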
 
Why? Because a large percentage of those 300 exabytes resides in legacy systems that are incompatible with modern technology. We cannot switch those systems off, and we cannot simply import their data into modern Hadoop platforms. Banks and insurance companies in particular run many legacy systems, some of which have been in place for decades. Due to the many mergers and acquisitions in the finance world, banks sometimes operate dozens of separate legacy systems; as Karl Flinders writes in his article, one bank even had 40 of them. These ageing, cobbled-together legacy systems are typically found in payment and credit card systems, ATMs, and branch or channel solutions. That they cause companies headaches is illustrated by Deutsche Bank, whose big data plans are being held back by its legacy systems.

Banks are not the only ones who have to deal with legacy systems. The car industry does too: Ford Motor Company runs data centres on software that is 30 or 40 years old. The pharmaceutical industry, the travel industry and the public sector face the same problem. Replacing these legacy systems is almost impossible; Flinders compares it to "changing the engines on a Boeing 747 while in flight".

However hard it may seem, though, it is not impossible, as the Commonwealth Bank of Australia has shown. Over the past five years it has replaced the bank's entire core system, moved most of its services into the cloud, and developed many apps and innovations that put the bank at the forefront of innovation.

Legacy systems typically consist of traditional relational database management systems running on old, slow machines that cannot handle large volumes of data at once. Hence, most of these legacy systems process their data at night, and querying the data can take considerable time. Real-time processing and analysis of data in legacy systems is impossible, so we have to look for solutions that let us keep using that old data.


One way to deal with legacy systems and big data is to replace a company's entire legacy estate. Apart from the massive risks involved in such an operation, there are also substantial costs, so it is unlikely that many organisations will adopt this strategy.

It is therefore important to find ways for new, innovative technologies that allow real-time analysis of various data sets to co-exist with the legacy systems. These systems from the terabyte era still contain valuable (historical) information. There are several ways to keep using the historical data in the data warehouses:

  1. Macro-batching the data into the new big data solutions on a periodic schedule, for example every night. This data can then be used together with the 'new' data.
  2. Sending periodic summaries of the data in the data warehouse, so that the warehouse data can be used without continuous querying. Only when specific information is required is the data warehouse queried and the data retrieved.

These solutions enable analysing both unstructured and structured legacy data within a single integrated architectural framework. Such a platform allows the legacy data to remain within the existing data warehouses while enabling near real-time analysis.

Using middleware to enhance systems and replace the hardware that supports them is, however, not ideal. Another problem for legacy systems is that, with the general acceptance of big data, a larger share of the IT budget will go to big data projects, leaving less money for the legacy systems. Meanwhile, the employees able to work with those legacy systems become scarce, and thus expensive.

If such a trend continues for too long, there is a danger that a legacy system will one day fail, putting the company in serious trouble. The longer organisations wait to replace these legacy systems, or at least to make them compatible with big data technologies, the more expensive and difficult it will be.

A less risky but still expensive solution is to develop a specific algorithm that can transfer millions of lines of old data into modern distributed file systems. Until all data works correctly in the new distributed file systems, the two can co-exist. A paper by Mariano Ceccato and others explains how they developed an algorithm to infer a structured data model from a legacy data system, in an attempt to restructure that legacy system into an up-to-date and usable data model.
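
The idea of inferring a structured model from legacy data can be illustrated with a toy sketch (this is not Ceccato et al.'s algorithm): it scans flat, fixed-width legacy records and guesses a column type for each field. The field widths and sample records below are invented:

```python
def infer_field_types(records, field_widths):
    """Guess a simple type (int, float or str) for each fixed-width
    field by inspecting every record in the legacy extract."""
    def classify(value):
        value = value.strip()
        try:
            int(value)
            return "int"
        except ValueError:
            pass
        try:
            float(value)
            return "float"
        except ValueError:
            return "str"

    # A field is only as specific as its least specific value,
    # so rank types from most to least specific.
    rank = {"int": 0, "float": 1, "str": 2}
    inferred = ["int"] * len(field_widths)
    for record in records:
        pos = 0
        for i, width in enumerate(field_widths):
            t = classify(record[pos:pos + width])
            if rank[t] > rank[inferred[i]]:
                inferred[i] = t
            pos += width
    return inferred

# Hypothetical records: 6-char id, 8-char amount, 4-char product code
records = ["000123  199.95AB01", "000124   20.00AB02"]
print(infer_field_types(records, [6, 8, 4]))  # ['int', 'float', 'str']
```

A real migration would of course need far more than type guessing (keys, relationships, encodings), but the inferred model gives the new distributed file system a schema to load the old records into.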

Truly transformative insights can only come when all data is used, including the data locked in legacy systems with incompatible data formats. Eventually, therefore, the data in legacy systems will need to be transferred to massively scalable storage systems, replacing the critical search, calculation and reporting functions of those legacy systems.

In the end, the ambition of any organisation with legacy systems should be to truly retire them, as companies will not be able to support them forever. If, in the meantime, they integrate that legacy data into one platform for aggregation, they can already reap the benefits of historical data and create truly valuable insights.

Finally, I came across the video below from EMC, which gives a great explanation of how to deal with legacy systems:

Copyright Big Data Startups 2013. You may share using our article tools. Please don’t cut articles from BigData-Startups.com and redistribute by email or post to the web.
