By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data Analytics instagram stories
    Data Analytics Helps Marketers Make the Most of Instagram Stories
    15 Min Read
    analyst,women,looking,at,kpi,data,on,computer,screen
    What to Know Before Recruiting an Analyst to Handle Company Data
    6 Min Read
    AI analytics
    AI-Based Analytics Are Changing the Future of Credit Cards
    6 Min Read
    data overload showing data analytics
    How Does Next-Gen SIEM Prevent Data Overload For Security Analysts?
    8 Min Read
    hire a marketing agency with a background in data analytics
    5 Reasons to Hire a Marketing Agency that Knows Data Analytics
    7 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-23 SmartData Collective. All Rights Reserved.
Reading: 5 Non-Quality Items to Consider in Data Profiling
Share
Notification Show More
Aa
SmartData CollectiveSmartData Collective
Aa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Data Mining > 5 Non-Quality Items to Consider in Data Profiling
Data Mining

5 Non-Quality Items to Consider in Data Profiling

DataQualityEdge
Last updated: 2009/03/13 at 2:24 AM
DataQualityEdge
7 Min Read
SHARE

Data Profiling is all about identifying and quantifying the accuracy of the data in a database. How complete is it? How accurate is it? These are must haves in any and all data profiling activities.

However, for any data profiling project to be complete we must look at the metadata. Not only do we need to know how many widgets were sold by analyzing the data, but we need to know when the load job runs, who the business owner is and a few other items that would make your data profiling project a masterpiece.

1. Table Details: Remember each table will have a purpose for existing, what is that purpose. While you’re checking the data within a table don’t forget the metadata on the table. Table physical and logical names and descriptions are essential elements to track. Performing a data profile on data within a large table is great, but if the table was last accessed 4 years ago. Your efforts may yield little ROI. So you will need to note this last statement and go back to the business and ask them – do they still need this data?

Some table details to consider:
Table Name
Table Description
Table Relationships
Table Usage Details (last user, frequency used)
Touch Points
Table Metadata (attribut…

More Read

data mining

Data Mining Technology Helps Online Brands Optimize Their Branding

Can Data Mining Aid with Off-Page SEO Strategies?
3 Data Mining Tips for Companies Trying to Understand their Customers
5 Data Mining Tips to Leverage the Benefits of Surveys
Perform Data Mining With Web Scrapers to Track Prices


Data Profiling is all about identifying and quantifying the accuracy of the data in a database. How complete is it? How accurate is it? These are must haves in any and all data profiling activities.

However, for any data profiling project to be complete we must look at the metadata. Not only do we need to know how many widgets were sold by analyzing the data, but we need to know when the load job runs, who the business owner is and a few other items that would make your data profiling project a masterpiece.

1. Table Details: Remember each table will have a purpose for existing, what is that purpose. While you’re checking the data within a table don’t forget the metadata on the table. Table physical and logical names and descriptions are essential elements to track. Performing a data profile on data within a large table is great, but if the table was last accessed 4 years ago. Your efforts may yield little ROI. So you will need to note this last statement and go back to the business and ask them – do they still need this data?

Some table details to consider:
Table Name
Table Description
Table Relationships
Table Usage Details (last user, frequency used)
Touch Points
Table Metadata (attribute definitions)

Known Table Issues

2. Load Details: Whether your data is being loaded through Datastage, mainframe jobs, or other data loader tools, you will need to record every job that touches a table, and how the data is distributed within that same table. For large tables that hold data from multiple business units this can be a monumental task.

Some load details to consider:
Job Name/Number
Job Frequency
Last Load
File Used
File Layouts
Known Failures
Known Corrective Actions

3. Report Details: Reports may not be on everyone’s list for data profiling projects, and not everyone uses reports if you perform ad-hoc analysis. However, they are an excellent way to determine the value of your data. You can analyze who is using the data; how often it is being used; the type of user; the type of decisions being made; how much data is being used and more. Data usage through reporting will be able to identify some ROI.

Some report details to consider:
Report Name
Report Purpose
Report Owner
Report–Table Touch Points
Report (last user, frequency used)

4. Owner Details: Some would argue that ownership has nothing to do with a data profile. I would say you are probably right. However, I would turn around and say, if you are a support analyst and you are asked, “Why the data is wrong or missing?”, and you don’t have the documentation or other support material. Knowing who the owner of the data is and having that in your data profile means you have just discovered speed dialing. You now have someone to discuss the issue with and someone that is responsible and accountable for the data. This information is virtually priceless to second-line support.

Some owner details to consider:
Data Owner
Process Owner
Contact Information

5. Lineage Details: Lineage details offer a more unique perspective about the data. In large organizations, who have legacy systems and silo environments this activity will become very cumbersome. In some situations the data may even pass through in individual PC for modifications, (a cold chill just went down my spine), before it reaches you the data analyst. It would be very common to see in such organizations that a single piece of data when first entered, runs through not 1, not 2, not 3, but 4 legacy systems and subsystems before coming to rest in an Enterprise Data Warehouse. Having this information in your pocket allows you to better communicate and understand different support teams, front-line users, business partners and more. Please note I use the term database below, but it can refer to any decision points in the data process flow/workflow.

Some Lineage Details:
Database/data warehouse name
Database Owner
Database Owner Contact Information
Database table names
Database attribute names (in many cases the attribute your looking at will not have been called the same in it’s lineage)
Attribute details (size, type, definitions)
First Entry Points

Once you have completed your accuracy and completeness verifications and gathered all of this intimate metadata that is often overlooked, then you can say your data profiling project is a 5-star effort. Remember, you will need to ensure that your gathered details contain accurate information.
Having all this information will not only help you the data quality analyst, but business analysts, and even support analysts. Your organization will be the better for it in the long-run.

DataQualityEdge March 13, 2009
Share This Article
Facebook Twitter Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

smart home data
7 Mind-Blowing Ways Smart Homes Use Data to Save Your Money
Big Data
ai low code frameworks
AI Can Help Accelerate Development with Low-Code Frameworks
Artificial Intelligence
data Analytics instagram stories
Data Analytics Helps Marketers Make the Most of Instagram Stories
Analytics
data breaches
How Hospital Security Breaches Devastate Local Communities
Policy and Governance

Stay Connected

1.2k Followers Like
33.7k Followers Follow
222 Followers Pin

You Might also Like

data mining
Data Mining

Data Mining Technology Helps Online Brands Optimize Their Branding

7 Min Read
data mining helps with offsite SEO
Data Mining

Can Data Mining Aid with Off-Page SEO Strategies?

10 Min Read
using data mining to learn more about customers
Big Data

3 Data Mining Tips for Companies Trying to Understand their Customers

6 Min Read
surveys data
Data Mining

5 Data Mining Tips to Leverage the Benefits of Surveys

11 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots
ai in ecommerce
Artificial Intelligence for eCommerce: A Closer Look
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Lost your password?