Match Mitigation: When Algorithms Aren’t Enough

SteveSarsfield

I’d like to get a little technical in this post. I try to keep my posts business-friendly, but sometimes the details matter. If none of this post makes any sense to you, I wrote a sort of primer on how matching works in many data quality tools, which you can get here.

Matching Algorithms
When you use a data quality tool, you’re often using matching algorithms and rules to decide whether records match. You might be using deterministic algorithms like Jaro, Soundex and Metaphone. You might also be using probabilistic matching algorithms.
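To make two of those names concrete, here is a minimal Python sketch of American Soundex (a phonetic code) and Jaro similarity (a 0-to-1 string score). It is a textbook-style illustration, not the implementation used by any particular data quality tool, and the function names `soundex` and `jaro` are my own.

```python
# Letter-to-digit table for American Soundex.
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(name: str) -> str:
    """Return the 4-character Soundex code for a name ('' if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate letters with the same code
            prev = digit
    return (code + "000")[:4]


def jaro(s1: str, s2: str) -> float:
    """Return the Jaro similarity of two strings, between 0.0 and 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(round(jaro("MARTHA", "MARHTA"), 4))    # 0.9444
```

Soundex answers "do these names sound alike?" with a yes or no, while Jaro returns a score that a rule can compare against a threshold.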

In many tools, you can set the rules to be tight, where the software uses tougher criteria to determine a match, or loose, where the software is not so particular. Tight and loose matches are important because you may have strict rules for putting records together, like customers of a bank, or not-so-strict rules, like when you’re putting together a customer list for marketing purposes.
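One way to picture tight versus loose is as a similarity threshold. The sketch below is an assumption-laden illustration: it uses Python's standard-library `difflib.SequenceMatcher` as a stand-in scorer, and the `TIGHT` and `LOOSE` values are hypothetical numbers I picked for the example, not settings from any product.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds, chosen only for illustration.
TIGHT = 0.95  # e.g. merging bank customer records
LOOSE = 0.80  # e.g. building a marketing list


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(a: str, b: str, threshold: float) -> bool:
    """Treat two values as a match when their similarity meets the threshold."""
    return similarity(a, b) >= threshold


pair = ("Jon Smith", "John Smith")
print(is_match(*pair, threshold=LOOSE))  # True
print(is_match(*pair, threshold=TIGHT))  # False
```

The same pair of names is joined under the loose rule and kept apart under the tight one, which is exactly the trade-off between a marketing list and a bank's customer file.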

What to do with Matches
Once data has been processed through the matcher, there are several possible outcomes. Between any two given records, the matcher may find:

  • No relationship
  • Match – the matcher found a definite match based on the criteria given
  • Suspect – the matcher thinks it found a match but is not confident. The results should be manually reviewed.
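The three outcomes above can be sketched as two cut-offs on a similarity score. This is a minimal illustration under stated assumptions: the cut-off values `match_at` and `suspect_at` are hypothetical, and `difflib.SequenceMatcher` again stands in for whatever scoring a real matcher uses.

```python
from difflib import SequenceMatcher


def classify(score: float, match_at: float = 0.92, suspect_at: float = 0.75) -> str:
    """Map a similarity score to one of the three matcher outcomes."""
    if score >= match_at:
        return "match"
    if score >= suspect_at:
        return "suspect"  # goes to manual review
    return "no relationship"


def classify_pair(a: str, b: str) -> str:
    """Score two values with a stand-in similarity measure, then classify."""
    score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return classify(score)


print(classify(0.97))  # match
print(classify(0.85))  # suspect
print(classify(0.40))  # no relationship
```

Everything that lands between the two cut-offs becomes a suspect, so the width of that band determines how much manual review is left over.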

It’s that last category that’s the tough one. Mitigating the suspect matches is the most time-consuming follow-up task after the matching is complete. Envision a million-record database where you have 20,000 suspect matches. That’s still going to take you some time to review.

Some of the newer (and cooler) tools offer strategies for dealing with suspect matches. The tools will present the suspect matches in a graphical user interface and allow users to pick which relationships are accurate and which are not. For example, Talend now offers a data stewardship console that lets you pick and choose the records and attributes that will make up a best-of-breed record.
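As a rough picture of what assembling a best-of-breed record involves, here is a small Python sketch. It is not Talend's API or any vendor's; the `best_of_breed` function, the `picks` argument, and the fallback rule (first non-empty value wins) are all assumptions made for illustration.

```python
from typing import Optional

Record = dict[str, Optional[str]]


def best_of_breed(records: list[Record],
                  picks: Optional[dict[str, int]] = None) -> Record:
    """Merge records judged to be the same entity into one record.

    picks maps an attribute name to the index of the record a reviewer
    chose for it. Attributes without a pick fall back to the first
    non-empty value found (a hypothetical default rule).
    """
    picks = picks or {}
    # Collect attribute names in first-seen order.
    fields: list[str] = []
    for rec in records:
        for name in rec:
            if name not in fields:
                fields.append(name)
    merged: Record = {}
    for name in fields:
        if name in picks:
            merged[name] = records[picks[name]].get(name)
        else:
            merged[name] = next((rec[name] for rec in records if rec.get(name)), None)
    return merged


candidates = [
    {"name": "J. Smith", "email": "", "phone": "555-0100"},
    {"name": "John Smith", "email": "john@example.com", "phone": None},
]
# The reviewer prefers the fuller name from the second record.
print(best_of_breed(candidates, picks={"name": 1}))
```

The reviewer's choices override the default rule attribute by attribute, which is the kind of decision a stewardship interface puts in front of a person.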

The goal, of course, is to not have suspect matches, so tuning the match rules to limit the suspect matches is the ultimate aim. The newest tools make this easy. Some of the legacy tools make it hard.

Match mitigation is perhaps one of the most often overlooked processes in data quality. Don’t overlook it in your planning and processes.

Covering the world of data integration, data governance, and data quality from the perspective of an industry insider.