SmartData Collective

Data Integration Roadmap to Support Big Data and Analytics

Raju Bodapati
Last updated: 2012/12/11 at 7:07 PM

Traditional extract, transform and load (ETL) has existed since data warehousing first evolved to help move data out of legacy mainframe applications. The focus of ETL has therefore been moving data from files into relational or dimensional databases for consumption by reporting engines. Even today, when most attention in the data world goes to data visualization, analytics or business intelligence, data professionals recognize effective ETL engines as the backbone. However, with changes such as widespread data access points, diverse data sources and unstructured data, expectations of data connection interfaces have moved towards data integration rather than traditional data movement.

It is inspiring to read the new TDWI publication by David Loshin, “Satisfying New Requirements for Data Integration”, a checklist report that briefly highlights the changing demands on data integration. The report lists seven demands: a) increase performance and efficiency, b) integrate the cloud, c) protect information in the integration layer, d) embed master data services, e) process big data and enterprise data, f) satisfy real-time demands and g) develop data quality and data governance policies and practices.

While Loshin identifies the changing demands on legacy ETL platforms very well, the publication presents a wished-for future state rather than the path organizations can take to build a data integration framework that can sustain evolving needs. The following is a five-step roadmap with specific measures organizations can take as they move towards that future state.

Step 1: get the foundation strong

Establishing a strong data quality and governance organization is perhaps the first foundation needed for data organizations aspiring to move from being a mere data-moving and data-storing entity to being an information enablement engine. The data integration platform should enforce the policies and practices the organization establishes. ETL infrastructure gets the first look at the nature and volume of the data quality problems in source systems as those systems are integrated with the rest of the organization. The traditional ETL approach has been to find workarounds that push the data through by making some tradeoffs. However, these chokepoints, also known as fault tolerance gates, should be re-examined so that the data quality and integrity problems they reveal are fed to the data governance organization. This does not mean the organization must resolve every data quality issue before moving to the next step; it means establishing visibility and proper governance for handling the data integrity issues of the organization.
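
One way to picture such a gate is the minimal Python sketch below. It is an illustration under assumptions, not a description of any particular ETL product: the field names (customer_id, amount), the two rules and the QualityIssue record are all hypothetical. Records that fail a rule are held back from the load, and every failure is logged so it can be handed to the data governance organization instead of being silently worked around.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# Hypothetical rule set: each rule name maps to a check on a single record.
Rule = Callable[[dict], bool]

RULES: dict[str, Rule] = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "amount_non_negative": lambda r: (
        isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ),
}


@dataclass
class QualityIssue:
    """One failed rule on one source record, kept for governance review."""
    source: str
    rule: str
    record: dict


def quality_gate(records: list[dict], source: str) -> tuple[list[dict], list[QualityIssue]]:
    """Split a feed into records that pass every rule and a log of failures."""
    clean: list[dict] = []
    issues: list[QualityIssue] = []
    for record in records:
        failed = [name for name, rule in RULES.items() if not rule(record)]
        if failed:
            # Hold the record back and log one issue per failed rule.
            issues.extend(QualityIssue(source, name, record) for name in failed)
        else:
            clean.append(record)
    return clean, issues


def summarize(issues: list[QualityIssue]) -> dict[tuple[str, str], int]:
    """Count issues per (source, rule) so recurring problems become visible."""
    counts: dict[tuple[str, str], int] = {}
    for issue in issues:
        key = (issue.source, issue.rule)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

In this sketch the load proceeds with the clean records, while the summary gives governance a per-source view of which rules fail most often. Whether failed records are held back, loaded with a flag, or repaired is a policy decision the sketch does not try to settle.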

Step 2: get serious about information security 

Traditionally, ETL engines land sensitive data and do not always purge it from logs and temporary staging areas after use. Access, authorization and authentication controls are compromised when multiple people are able to use the same service accounts. Also, when production data is refreshed into test or development environments, scrubbing the data to de-sensitize it is often skipped. Information security within the ETL world in particular needs very thorough audits and controls to ensure that security policies are enforced. Without them, enabling a widespread data integration infrastructure can multiply these vulnerabilities and could conceivably be fatal to the organization.
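
As a rough illustration of the scrubbing step, the Python sketch below masks sensitive fields before a production extract is copied into a test environment. The field names (ssn, email) and the idea of a masking key kept outside the test environment are assumptions made for the example; it is not a complete de-identification scheme. It uses a keyed hash from the Python standard library so that the same input always yields the same masked value, which keeps joins between tables working in test.

```python
import hashlib
import hmac

# Hypothetical list of sensitive columns; a real list would come from the
# security policy of the organization.
SENSITIVE_FIELDS = {"ssn", "email"}


def mask_value(value: str, key: bytes) -> str:
    """Replace a value with a truncated keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def scrub_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with sensitive, non-empty fields masked."""
    scrubbed = {}
    for field, value in record.items():
        if field in SENSITIVE_FIELDS and value is not None:
            scrubbed[field] = mask_value(str(value), key)
        else:
            scrubbed[field] = value
    return scrubbed


def scrub_extract(records: list, key: bytes) -> list:
    """Scrub every record in an extract before it leaves production."""
    return [scrub_record(record, key) for record in records]
```

The same care would need to extend to the logs and staging areas mentioned above, since masking the refreshed copy does nothing for sensitive values left behind elsewhere.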

Step 3: smarter master data and graceful validation services

Most master data management (MDM) implementations remain static and user-managed. However, when used well, ETL infrastructure can implement an active and evolving master data management system. One of the first steps organizations are taking is therefore to integrate MDM tools and methods with their ETL engines. ETL engines are also increasingly integrating with geospatial validation software or data mapping / translation engines to enforce data integrity. This makes the interfaces more graceful, so that they do not become chokepoints when dealing with bad data. There are always strong arguments about what ETL should or should not do to data. However, ETL’s traditional role of moving data without touching it is being replaced by one of integrating data into the organizational information web. These steps can lay down the path for ETL engines as they form the organizational data integration architecture.
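
The idea of graceful validation against evolving master data can be sketched in a few lines of Python. Everything named here is hypothetical: the country lookup stands in for a real MDM service, and the needs_review flag and review queue are just one possible convention. The point is only that an unknown value is flagged and queued for a data steward rather than failing the whole feed, and that an approved value then becomes part of the master data.

```python
# Hypothetical master data: canonical country codes keyed by known variants.
COUNTRY_MASTER = {
    "us": "US",
    "usa": "US",
    "united states": "US",
    "de": "DE",
    "germany": "DE",
}


def resolve_country(raw):
    """Return the canonical code for a raw value, or None if it is unknown."""
    if raw is None:
        return None
    return COUNTRY_MASTER.get(raw.strip().lower())


def integrate(records):
    """Load every record; flag unresolved values instead of rejecting the feed."""
    loaded, review_queue = [], []
    for record in records:
        raw = record.get("country")
        code = resolve_country(raw)
        loaded.append({**record, "country": code, "needs_review": code is None})
        if code is None and raw is not None:
            # Unknown variant: hand the raw value to a data steward.
            review_queue.append(raw)
    return loaded, review_queue


def approve(raw, code):
    """Record a steward decision so the variant resolves on later loads."""
    COUNTRY_MASTER[raw.strip().lower()] = code
```

For example, a feed containing "Deutschland" would load with the country left empty and flagged for review; once a steward calls approve("Deutschland", "DE"), later feeds resolve that variant automatically.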

Step 4: upgrade the data integration infrastructure with the future in mind

When budgeting for ETL infrastructure, most organizations use feedback mechanisms (what went wrong in the past) rather than feed-forward mechanisms (what needs to go right in the future). As a result, businesses often find themselves looking for shortcuts to meet their changing demands with unstructured infrastructure. Traditional ETL environments are always behind, trying to catch up with the damage to data and process integrity caused by such short-sighted temporary investments. Therefore, a major part of transforming an ETL organization into a data integration organization involves strategic investment decisions about the fundamental infrastructure needs of the future establishment. For example, when integration with the cloud or real-time active data warehousing is on the horizon, the infrastructure investment decisions have to be taken now rather than at the last hour. This calls for program management thinking, not an infrastructure support mindset, while budgeting.

Step 5: enable expanded data integration

Organizations that have made progress in the previous steps can then consider how their data integration infrastructure can meet the needs of cloud integration, big data analytics, and mobile / self-service business intelligence. As Loshin explained in his article, mounds of structured data, unstructured data and big data, and advancements in cloud technology, coupled with end-user-driven needs such as mobile BI, self-service BI, real-time reporting and advanced visualization techniques, are rapidly expanding the need for data integration competence well beyond what traditional data movement ETL engines have to offer. At this stage, the data integration architecture has the necessary security framework, graceful validation to handle unexpected behavior in data feeds, the ability to integrate with and build organizational master data, and the strategic programs required to support the organizational enablement demanded of it.

Summary

Traditional ETL infrastructure and processes need a clear roadmap that considers expected future demands rather than reacting to issues and challenges faced in the past. Building a data integration infrastructure that can support future business needs should be managed as a program with step-by-step evolution. The infrastructure should support new data sources from the cloud, unstructured data and big data. It should also support real-time data needs, mobile business intelligence, information access and performance demands, information security needs, and analytics. The steps described in this article can provide a vision for that roadmap as traditional ETL infrastructures transition into data integration service providers.

Raju Bodapati December 11, 2012