5 Non-Quality Items to Consider in Data Profiling

DataQualityEdge

Data profiling is all about identifying and quantifying the accuracy of the data in a database. How complete is it? How accurate is it? These questions are must-haves in any data profiling activity.
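As a minimal sketch of what "quantifying" completeness can look like, the snippet below computes the share of rows in which a column is filled in. The function name `completeness`, the sample rows, and the rule that `None` and empty strings count as missing are all assumptions made for illustration; a real profile would apply whatever definition of "missing" the business agrees on.

```python
from typing import Any


def completeness(rows: list[dict[str, Any]], column: str) -> float:
    """Return the fraction of rows where `column` is present and non-empty."""
    if not rows:
        return 0.0
    # Assumption: None and "" both count as missing values.
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


# Hypothetical sample data, invented for this example.
sales = [
    {"widget_id": 1, "units_sold": 10},
    {"widget_id": 2, "units_sold": None},
    {"widget_id": 3, "units_sold": 7},
    {"widget_id": 4, "units_sold": ""},
]

units_completeness = completeness(sales, "units_sold")
id_completeness = completeness(sales, "widget_id")
```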

However, for any data profiling project to be complete, we must also look at the metadata. Not only do we need to know how many widgets were sold by analyzing the data, but we also need to know when the load job runs, who the business owner is, and a few other items that would make your data profiling project a masterpiece.

1. Table Details: Remember that each table exists for a purpose, so ask what that purpose is. While you’re checking the data within a table, don’t forget the metadata on the table. Table physical and logical names and descriptions are essential elements to track. Performing a data profile on data within a large table is great, but if the table was last accessed 4 years ago, your efforts may yield little ROI. Note when a table has gone unused like this, then go back to the business and ask whether they still need the data.

Some table details to consider:
Table Name
Table Description
Table Relationships
Table Usage Details (last user, frequency used)
Touch Points
Table Metadata (attribute definitions)
Known Table Issues
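One way to keep the table details listed above together is a small record per table, with a check that flags tables nobody has touched in a long time. This is only a sketch: the `TableProfile` class, its field names, the `is_stale` helper, the one-year threshold, and the sample table are all hypothetical and not taken from any particular profiling tool.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TableProfile:
    """Metadata captured for one table during a profiling project."""
    name: str
    description: str
    last_accessed: date
    last_user: str = ""
    related_tables: list[str] = field(default_factory=list)
    known_issues: list[str] = field(default_factory=list)


def is_stale(profile: TableProfile, today: date, max_idle_days: int = 365) -> bool:
    """True when the table has not been accessed for more than max_idle_days."""
    return (today - profile.last_accessed).days > max_idle_days


# Hypothetical table, last accessed four years before the check date.
widget_sales = TableProfile(
    name="WIDGET_SALES",
    description="Daily widget sales by store",
    last_accessed=date(2020, 1, 15),
)

stale = is_stale(widget_sales, today=date(2024, 1, 15))
```

A table that comes back as stale is a candidate for the "do you still need this data?" conversation before any deeper profiling is done.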

2. Load Details: Whether your data is loaded through DataStage, mainframe jobs, or other data loader tools, you will need to record every job that touches a table and how the data is distributed within that table. For large tables that hold data from multiple business units, this can be a monumental task.

Some load details to consider:
Job Name/Number
Job Frequency
Last Load
File Used
File Layouts
Known Failures
Known Corrective Actions
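The load details above lend themselves to a simple freshness check: given each job's expected frequency and last load time, list the jobs that are overdue. The sketch below assumes a hypothetical `LoadJob` record and invented job names, tables, and timestamps; in practice these values would come from your scheduler or load logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LoadJob:
    """One load job that touches a table."""
    job_name: str
    target_table: str
    frequency: timedelta   # how often the job is expected to run
    last_load: datetime    # when it last completed
    source_file: str = ""


def overdue_jobs(jobs: list[LoadJob], now: datetime) -> list[str]:
    """Return names of jobs whose last load is older than their frequency."""
    return [job.job_name for job in jobs if now - job.last_load > job.frequency]


# Hypothetical jobs feeding the same table.
jobs = [
    LoadJob("JOB_001", "WIDGET_SALES", timedelta(days=1), datetime(2024, 1, 14, 2, 0)),
    LoadJob("JOB_002", "WIDGET_SALES", timedelta(days=7), datetime(2024, 1, 1, 2, 0)),
]

late = overdue_jobs(jobs, now=datetime(2024, 1, 15, 1, 0))
```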

3. Report Details: Reports may not be on everyone’s list for data profiling projects, and not everyone uses reports; some perform ad-hoc analysis instead. However, reports are an excellent way to determine the value of your data. You can analyze who is using the data, how often it is used, the type of user, the type of decisions being made, how much data is used, and more. Data usage through reporting can help identify some ROI.

Some report details to consider:
Report Name
Report Purpose
Report Owner
Report–Table Touch Points
Report Usage Details (last user, frequency used)
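To illustrate how report-to-table touch points can hint at the value of data, the sketch below adds up report runs per table and lists tables that no report has used recently. The report catalogue, the `runs_last_90_days` field, and the 90-day window are assumptions invented for the example; the point is only the shape of the calculation.

```python
from collections import Counter

# Hypothetical report catalogue: each report lists the tables it touches
# and how many times it ran in an assumed 90-day window.
reports = [
    {"name": "Weekly Sales", "tables": ["WIDGET_SALES", "STORE"], "runs_last_90_days": 12},
    {"name": "Store Audit", "tables": ["STORE"], "runs_last_90_days": 3},
    {"name": "Legacy Export", "tables": ["WIDGET_ARCHIVE"], "runs_last_90_days": 0},
]


def table_usage(report_list: list[dict]) -> Counter:
    """Sum report runs per table across all reports that touch it."""
    usage: Counter = Counter()
    for report in report_list:
        for table in report["tables"]:
            usage[table] += report["runs_last_90_days"]
    return usage


usage = table_usage(reports)
# Tables touched only by reports that never ran in the window.
unused = sorted(table for table, runs in usage.items() if runs == 0)
```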

4. Owner Details: Some would argue that ownership has nothing to do with a data profile, and I would say they are probably right. However, suppose you are a support analyst and you are asked, “Why is the data wrong or missing?”, and you don’t have the documentation or other support material. Knowing who the owner of the data is, and having that in your data profile, means you have just discovered speed dialing. You now have someone to discuss the issue with, someone who is responsible and accountable for the data. This information is virtually priceless to second-line support.

Some owner details to consider:
Data Owner
Process Owner
Contact Information

5. Lineage Details: Lineage details offer a different perspective on the data. In large organizations that have legacy systems and siloed environments, gathering them can become very cumbersome. In some situations the data may even pass through an individual PC for modifications (a cold chill just went down my spine) before it reaches you, the data analyst. In such organizations it is common for a single piece of data, once entered, to run through not 1, not 2, not 3, but 4 legacy systems and subsystems before coming to rest in an Enterprise Data Warehouse. Having this information in your pocket allows you to communicate with and understand the different support teams, front-line users, business partners and more. Please note that I use the term database below, but it can refer to any decision point in the data process flow/workflow.

Some Lineage Details:
Database/data warehouse name
Database Owner
Database Owner Contact Information
Database table names
Database attribute names (in many cases the attribute you’re looking at will not have had the same name throughout its lineage)
Attribute details (size, type, definitions)
First Entry Points
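The lineage details above can be recorded as an ordered list of hops from the first entry point to the warehouse, which also makes attribute renames easy to spot. This is a rough sketch: the `LineageHop` class, the helper functions, and every system, table, attribute, and owner name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageHop:
    """One stop a piece of data makes on its way to the warehouse."""
    system: str
    table: str
    attribute: str
    owner: str


def describe_lineage(hops: list[LineageHop]) -> str:
    """Render the path from first entry point to final resting place."""
    return " -> ".join(f"{h.system}.{h.table}.{h.attribute}" for h in hops)


def renamed_between(hops: list[LineageHop]) -> list[tuple[str, str]]:
    """List (old, new) attribute names wherever the name changes between hops."""
    return [
        (a.attribute, b.attribute)
        for a, b in zip(hops, hops[1:])
        if a.attribute != b.attribute
    ]


# Hypothetical lineage for a customer number.
hops = [
    LineageHop("ORDER_ENTRY", "ORD", "CUST_NO", "owner-a"),
    LineageHop("BILLING", "INVOICE", "CUSTOMER_NUM", "owner-b"),
    LineageHop("EDW", "FACT_SALES", "CUSTOMER_NUM", "owner-c"),
]

path = describe_lineage(hops)
renames = renamed_between(hops)
```

Keeping the owner on each hop means the same record answers both "where did this value come from?" and "who do I call about it?".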

Once you have completed your accuracy and completeness verifications and gathered all of this often-overlooked metadata, you can say your data profiling project is a 5-star effort. Remember, you will need to ensure that the details you gather are themselves accurate.
Having all this information will help not only you, the data quality analyst, but also business analysts and even support analysts. Your organization will be the better for it in the long run.
