Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    New Data Analytics Breakthroughs Give eCommerce Startups a Fighting Chance
    New Data Analytics Breakthroughs Give eCommerce Startups a Fighting Chance
    6 Min Read
    How Data Analytics Is Reshaping Patient Financing Decisions
    How Data Analytics Is Reshaping Patient Financing Decisions
    13 Min Read
    business using business intelligence
    How to Use a Competitive Intelligence Dashboard to Turn Market Data Into Smarter Marketing Decisions 
    9 Min Read
    unusual trading activity
    Signal Or Noise? A Decision Tree For Evaluating Unusual Trading Activity
    3 Min Read
    software developer using ai
    How Data Analytics Helps Developers Deliver Better Tech Services
    8 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: ETL Checkpoints
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Business Intelligence > ETL Checkpoints
Business Intelligence

ETL Checkpoints

MIKE20
MIKE20
4 Min Read
SHARE

Contents
  • The Case for Checkpoints
  • Enter the Checkpoint
  • Feedback

ETL tools can be extremely involved, especially with complex data sets. At one time or another, many data management professionals have built tools that have done the following:

More Read

The Energy Collective New “blogpod” on the…
The Surprising Benefits Of AI-Driven Video Conferencing In Education
5 Step Process For Insightful Data Driven Business Decision Making
Question: Why Are You In Social Channels?
Craving value: sparks for a new economic engine

ETL tools can be extremely involved, especially with complex data sets. At one time or another, many data management professionals have built tools that have done the following:

  • Taken data from multiple places.
  • Transformed into (often significantly) into formats that other systems can accept.
  • Loaded said data into new systems.

In this post, I discuss how to add some basic checkpoints into tools to prevent things from breaking bad.

The Case for Checkpoints

Often, consultants like me are brought into organizations in need of solving urgent data-related problems. Rather than gather requirements and figure everything out, the client usually wants to just start building. Rarely will people listen when consultants advocate the need to take a step back before beginning our development efforts in earnest. While this is a bit of a generalization, few non-technical folks understand:

  • the building blocks required to create effective ETL tools
  • the need to know what you need to do–before you actually have to do it
  • the amount of rework required should developers have an incomplete or inaccurate understanding of what the client needs done

Clients with a need to get something done immediately don’t want to wade through requirements; they want action–and now. The consultant who doth protests too much runs the risk of irritating his/her clients, not to mention being replaced. While you’ll never hear me argue against understanding as much as possible before creating an ETL tool, I have ways to placate demanding clients while concurrently minimizing rework.

Enter the Checkpoint

Checkpoints are simply invaluable tools for preventing things from getting particularly messy. Even simple SQL SELECT statements identifying potentially errant records can be enormously useful. For example, on my current assignment, I need to manipulate a large number of financial transactions from disparate systems. Ultimately, these transactions need to precisely balance against each other. Should one transaction be missing or inaccurate, things can go awry. I might need to review the thirty or so queries that transform the data, looking for an error on my end. This can be time-consuming and even futile.

Enter the checkpoint. Before the client or I even run the balancing routine, my ETL tool spits out a number of audits that identify major issues before anything else happens. These include:

  • Missing currencies
  • Missing customer accounts
  • Null values
  • Duplicate records

While the absence of results on these audits guarantees nothing, both the client and I know not to proceed if we’re not ready. Consider starting a round of golf only two realize on the third whole that you forgot extra balls, your pitching wedge, and drinking water. You’re probably not going to have a great round.

Sure, agile methods are valuable. However, one of the chief limitations of iterative development is that you may well be building something incorrectly or sub-optimally. While checkpoints offer no guarantee, at least they can stop the bleeding before wasting a great deal of time analyzing problems that don’t exist. Use them liberally; should the produce no errors, you can always ignore them, armed with increased confidence that you’re on the right track.

Feedback

What say you?

Read more at MIKE2.0: The Open Source Standard for Information Management

TAGGED:etl
Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

New Data Analytics Breakthroughs Give eCommerce Startups a Fighting Chance
New Data Analytics Breakthroughs Give eCommerce Startups a Fighting Chance
Analytics Big Data Exclusive
data driven businesses
How Data-Driven Businesses Choose Storage That Reduces Risk and Drag
Big Data Exclusive
Operational Data Becomes Business Value in the Age of AIoT
Operational Data Becomes Business Value in the Age of AIoT
Big Data Exclusive Internet of Things
growth guide
Growing Smarter: The Role Of Strategic Partnerships From Startup To Scale
Infographic News

Stay Connected

1.2KFollowersLike
33.7KFollowersFollow
222FollowersPin

You Might also Like

etl for data-driven businesses
Big Data

Understanding ETL Tools as a Data-Centric Organization

8 Min Read

Estimating Extract, Transform, and Load (ETL) Projects

20 Min Read

5 Simple Tidbits to Include in Your Data Error Report

5 Min Read

Look, Ma. No ETL

4 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots
ai is improving the safety of cars
From Bolts to Bots: How AI Is Fortifying the Automotive Industry
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?