Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    image fx (67)
    Improving LinkedIn Ad Strategies with Data Analytics
    9 Min Read
    big data and remote work
    Data Helps Speech-Language Pathologists Deliver Better Results
    6 Min Read
    data driven insights
    How Data-Driven Insights Are Addressing Gaps in Patient Communication and Equity
    8 Min Read
    pexels pavel danilyuk 8112119
    Data Analytics Is Revolutionizing Medical Credentialing
    8 Min Read
    data and seo
    Maximize SEO Success with Powerful Data Analytics Insights
    8 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: 5 Non-Quality Items to Consider in Data Profiling
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Data Mining > 5 Non-Quality Items to Consider in Data Profiling
Data Mining

5 Non-Quality Items to Consider in Data Profiling

DataQualityEdge
DataQualityEdge
7 Min Read
SHARE

Data Profiling is all about identifying and quantifying the accuracy of the data in a database. How complete is it? How accurate is it? These are must haves in any and all data profiling activities.

However, for any data profiling project to be complete we must look at the metadata. Not only do we need to know how many widgets were sold by analyzing the data, but we need to know when the load job runs, who the business owner is and a few other items that would make your data profiling project a masterpiece.

1. Table Details: Remember each table will have a purpose for existing, what is that purpose. While you’re checking the data within a table don’t forget the metadata on the table. Table physical and logical names and descriptions are essential elements to track. Performing a data profile on data within a large table is great, but if the table was last accessed 4 years ago. Your efforts may yield little ROI. So you will need to note this last statement and go back to the business and ask them – do they still need this data?

Some table details to consider:
Table Name
Table Description
Table Relationships
Table Usage Details (last user, frequency used)
Touch Points
Table Metadata (attribut…

More Read

IBM’s Indian research division has developed a new…
Taking the question out of questionable claims
Mission: To convert the power in high altitude winds into clean…
Visual Complexity is a unified resource space for anyone…
A video introduction to R for Excel users


Data Profiling is all about identifying and quantifying the accuracy of the data in a database. How complete is it? How accurate is it? These are must haves in any and all data profiling activities.

However, for any data profiling project to be complete we must look at the metadata. Not only do we need to know how many widgets were sold by analyzing the data, but we need to know when the load job runs, who the business owner is and a few other items that would make your data profiling project a masterpiece.

1. Table Details: Remember each table will have a purpose for existing, what is that purpose. While you’re checking the data within a table don’t forget the metadata on the table. Table physical and logical names and descriptions are essential elements to track. Performing a data profile on data within a large table is great, but if the table was last accessed 4 years ago. Your efforts may yield little ROI. So you will need to note this last statement and go back to the business and ask them – do they still need this data?

Some table details to consider:
Table Name
Table Description
Table Relationships
Table Usage Details (last user, frequency used)
Touch Points
Table Metadata (attribute definitions)

Known Table Issues

2. Load Details: Whether your data is being loaded through Datastage, mainframe jobs, or other data loader tools, you will need to record every job that touches a table, and how the data is distributed within that same table. For large tables that hold data from multiple business units this can be a monumental task.

Some load details to consider:
Job Name/Number
Job Frequency
Last Load
File Used
File Layouts
Known Failures
Known Corrective Actions

3. Report Details: Reports may not be on everyone’s list for data profiling projects, and not everyone uses reports if you perform ad-hoc analysis. However, they are an excellent way to determine the value of your data. You can analyze who is using the data; how often it is being used; the type of user; the type of decisions being made; how much data is being used and more. Data usage through reporting will be able to identify some ROI.

Some report details to consider:
Report Name
Report Purpose
Report Owner
Report–Table Touch Points
Report (last user, frequency used)

4. Owner Details: Some would argue that ownership has nothing to do with a data profile. I would say you are probably right. However, I would turn around and say, if you are a support analyst and you are asked, “Why the data is wrong or missing?”, and you don’t have the documentation or other support material. Knowing who the owner of the data is and having that in your data profile means you have just discovered speed dialing. You now have someone to discuss the issue with and someone that is responsible and accountable for the data. This information is virtually priceless to second-line support.

Some owner details to consider:
Data Owner
Process Owner
Contact Information

5. Lineage Details: Lineage details offer a more unique perspective about the data. In large organizations, who have legacy systems and silo environments this activity will become very cumbersome. In some situations the data may even pass through in individual PC for modifications, (a cold chill just went down my spine), before it reaches you the data analyst. It would be very common to see in such organizations that a single piece of data when first entered, runs through not 1, not 2, not 3, but 4 legacy systems and subsystems before coming to rest in an Enterprise Data Warehouse. Having this information in your pocket allows you to better communicate and understand different support teams, front-line users, business partners and more. Please note I use the term database below, but it can refer to any decision points in the data process flow/workflow.

Some Lineage Details:
Database/data warehouse name
Database Owner
Database Owner Contact Information
Database table names
Database attribute names (in many cases the attribute your looking at will not have been called the same in it’s lineage)
Attribute details (size, type, definitions)
First Entry Points

Once you have completed your accuracy and completeness verifications and gathered all of this intimate metadata that is often overlooked, then you can say your data profiling project is a 5-star effort. Remember, you will need to ensure that your gathered details contain accurate information.
Having all this information will not only help you the data quality analyst, but business analysts, and even support analysts. Your organization will be the better for it in the long-run.

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

image fx (2)
Monitoring Data Without Turning into Big Brother
Big Data Exclusive
image fx (71)
The Power of AI for Personalization in Email
Artificial Intelligence Exclusive Marketing
image fx (67)
Improving LinkedIn Ad Strategies with Data Analytics
Analytics Big Data Exclusive Software
big data and remote work
Data Helps Speech-Language Pathologists Deliver Better Results
Analytics Big Data Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

Great illustration of the hierarchy of analytical BI techniques

1 Min Read

Business Rules to Programmers – Methink thou doest protest too much III

15 Min Read

Gathering Information on a Global Scale

4 Min Read

Safeguarding Patient Data in EHRs

5 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

ai chatbot
The Art of Conversation: Enhancing Chatbots with Advanced AI Prompts
Chatbots
AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?