Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    big data analytics in transporation
    Turning Data Into Decisions: How Analytics Improves Transportation Strategy
    3 Min Read
    sales and data analytics
    How Data Analytics Improves Lead Management and Sales Results
    9 Min Read
    data analytics and truck accident claims
    How Data Analytics Reduces Truck Accidents and Speeds Up Claims
    7 Min Read
    predictive analytics for interior designers
    Interior Designers Boost Profits with Predictive Analytics
    8 Min Read
    image fx (67)
    Improving LinkedIn Ad Strategies with Data Analytics
    9 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: The Data Audit Process: The Initial Step in Building Successful Predictive Analytics Solutions
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Analytics > Predictive Analytics > The Data Audit Process: The Initial Step in Building Successful Predictive Analytics Solutions
AnalyticsPredictive Analytics

The Data Audit Process: The Initial Step in Building Successful Predictive Analytics Solutions

RichardBoire
RichardBoire
5 Min Read
Image
SHARE

ImageBuilding predictive analytics solutions is very much in-vogue for most organizations today. Historically, practitioners needed to educate businesses  on the value of data mining and predictive analytics. Now, the concept and value of predictive analytics is widely accepted by most businesses.

ImageBuilding predictive analytics solutions is very much in-vogue for most organizations today. Historically, practitioners needed to educate businesses  on the value of data mining and predictive analytics. Now, the concept and value of predictive analytics is widely accepted by most businesses. Demonstrating the value of solutions thru various techniques and approaches represents the exciting component of predictive analytics. Many consultants and businesses will discuss their experience with predictive analytics and how it solved a particular business problem.

Yet, minimal attention is given to the “unglamorous” side of predictive analytics which is the “Data”. A business problem is often posed and one slide is devoted to the sources of data that were used to develop the solution. No attention is given to the rigor in examining these data sources in order to arrive at an optimum data environment that can be used to develop the predictive analytics solution. We refer to this rigor as the “Data Audit Process”. Yet, it is this process and the discipline of being a “data grunt” which provides the backbone in building predictive analytics solutions. But what does the data audit entail.

Upon commencement of any predictive analytics solution, the data requirements and data sources are defined. A data extract document is written by the practitioner and the data is then delivered to the practitioner. If twenty data files or tables are requested, a separate data audit is done on  each file/table which ultimately results in the creation of twenty data audit reports. But what does the data audit report contain?  Three different types of output are created which ultimately yield a level of detailed insight about the data and how it can be used in an analytical solution.

More Read

With over 30 shopping-related APIs and 300+ mashups tagged…
Last month, IBM CEO Samuel Palmisano advised President-elect…
CI & CNBC Using Social Media Analytics to Predict Business Outcomes
Cloud Front Group Capabilities Featured in Geospatial Intelligence Forum
How to Choose a New Inventory Software System

The first output is a report depicting a random sample of 100 records from the file. This output is to simply provide us with a picture of what the actual table or file looks like.  From this sample, the practitioner can begin to better understand the composition of certain fields  based on the values and outcomes being reported in that field.

The second output is a data diagnostics report. This output looks at each field within a given file. The report outputs the field format, number of missing values and , the number of unique values for each field in the file. Along with these diagnostics, the report also outputs the mean value and standard deviation for each numeric field in the file. This output begins to reveal the utility of a variable in any predictive analytics solution. For example, variables with more than  90% of its values reported  as missing will not be useful in any analytics exercise.  Variab les that only have 1 unique outcome will also not be useful in any analytics solution.

The third output is the frequency distribution reports which are output for each variable in the file. These reports provide a more detailed view of the field or variable by displaying how the outcomes or values distribute  within a given field. Besides  yielding  additional information regarding what information or variables will be useful in a future  predictive analytics exercise, frequency reports also provide insights on how to derive new variab les from the source variables.

Although these  actual outputs themselves are not revolutionary, this discipline of “data” investigation represents the initial process in any analytics exercise.  It is this initial process that provides the framework in creating the all-important analytical file which will be used to develop the predictive model. Without this framework, it is akin to trying to read without understanding the alphabet. In the next blog, I will discuss what we need to consider in creating a robust analytical file once this  framework is established.

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

AI role in medical industry
The Role Of AI In Transforming Medical Manufacturing
Artificial Intelligence Exclusive
b2b sales
Unseen Barriers: Identifying Bottlenecks In B2B Sales
Business Rules Exclusive Infographic
data intelligence in healthcare
How Data Is Powering Real-Time Intelligence in Health Systems
Big Data Exclusive
intersection of data
The Intersection of Data and Empathy in Modern Support Careers
Big Data Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

Data Mining Improved Company’s Revenue By 187%

5 Min Read

In this new era, as Saffo puts it, “…the central economic actor…

3 Min Read

Oracle Modernizes HR with Mobile and Wearable Computing

7 Min Read
data cleansing tips for business analytics
Big Data

How Data Cleansing Can Make or Break Your Business Analytics

9 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

data-driven web design
5 Great Tips for Using Data Analytics for Website UX
Big Data
ai in ecommerce
Artificial Intelligence for eCommerce: A Closer Look
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?