Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data driven insights
    How Data-Driven Insights Are Addressing Gaps in Patient Communication and Equity
    8 Min Read
    pexels pavel danilyuk 8112119
    Data Analytics Is Revolutionizing Medical Credentialing
    8 Min Read
    data and seo
    Maximize SEO Success with Powerful Data Analytics Insights
    8 Min Read
    data analytics for trademark registration
    Optimizing Trademark Registration with Data Analytics
    6 Min Read
    data analytics for finding zip codes
    Unlocking Zip Code Insights with Data Analytics
    6 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: What Is Natural Language Processing?
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Analytics > Text Analytics > What Is Natural Language Processing?
AnalyticsText Analytics

What Is Natural Language Processing?

jparnitzke
jparnitzke
9 Min Read
SHARE

Before proceeding with the Building Better Systems series I thought I should write a quick post over the weekend about the underlying Natural Language Processing (NLP) and text engineering technologies proposed in the solution.

Before proceeding with the Building Better Systems series I thought I should write a quick post over the weekend about the underlying Natural Language Processing (NLP) and text engineering technologies proposed in the solution. I have received a lot of questions about this when I posted “How to build better systems – the specification” and mentioned this key element. And a few assumptions that are not quite correct about just what this technology text miningis all about.  This is completely understandable. Unless you are in voice recognition, machine learning, or artificial intelligence, NLP is not a mainstream subject many of us are familiar with. I’m not going to pretend to know the subject as well as others who are dedicated to this field. There are others who know this subject inside out and can discuss the science behind it far better than I can. Nor do I intend to go into a detailed exposition describing just how powerful and useful this technology can be. I will share with you how I was able to take advantage of the work done to date and use a couple of tools with NLP components already embedded to automate many of the labor intensive tasks usually performed when evaluating system specifications. My focus is seeking a way to automate and uncover defects in our specifications as quickly as possible.  This can be accomplished by uncovering defects in our specifications related to:

  • Structural Completeness/Alignment
  • Text Ambiguity and Context
  • Section Quantity/Size
  • Volatility
  • Lexicon Discovery
  • Plain Language; word complexity density – complex words

NLP and text engineering can do this for us and much more. Rather than elaborate any further, I think it is better to just see it in action with a few good examples. Fortunately the University of Illinois Cognitive Computation Group as already done this for all of us and created a site for a little taste of what can be done. Their demonstration page using NLP has several clear demonstrations of the following key concepts:

Natural Language Analysis

More Read

data analytics helps with bitcoin investing
How a Danish Bitcoin Trader Discovered the Wonders of Analytics
Analytics and the Next Best Activity Strategy
First Look – Opera Solutions
Intro to Pervasive Business Intelligence (via…
Use a Data Strategy to Make Your Startup Profitable
  • Coreference Resolution
  • Part of Speech Tagging
  • Semantic Role Labeling
  • Shallow Parsing
  • Text Analysis

Entities and Information Extraction

  • Named Entity Recognition
  • Named Entity Recognizer (often using extended entity type sets)
  • Number Quantization
  • Temporal Extraction and Comparison

Similarity

  • Context Sensitive Verb Paraphrasing
  • LLM (Lexical Level Matching)
  • Named Entity Similarity
  • Relation Identification
  • Word Similarity

For example say we wanted to parse and process this simple sentence:

“Time Zones are used to manage the various campaigns that are executed to ensure customers are called within the allowed campaigning window which is 8 AM through 8 PM.”

Using the text analysis demonstration at the site we can copy and paste this simple phrase into the dialog box and submit for processing. The results are returned in the following diagram.

text analysis

Click to enlarge

The parser has identified things (entities), guarantees, purpose, verbs (use, manage, call, and execute) and time within this simple sentence.  I think you can see now how powerful this technology can be when used to collect, group, and evaluate text . While this is impressive enough think of the possibilities for using this technology to process page after page of business and system requirements?

There is also a terrific demonstration illustrating what the Cognitive Computation Group calls The Wikification system. Use their examples or plug in your own text for processing. Using the Wikification example you can insert plain text to be parsed and processed to identify entities with Wikipedia articles. The result set returned includes the live links to visit the corresponding Wikipedia page and the categories associated with each entity. Here is an example from an architecture specification describing (at a high level) Grid Service Agent behavior.

“The Grid Service Agent (GSA) acts as a process manager that can spawn and manage Service Grid processes (Operating System level processes) such as Grid Service Manager and Grid Service Container. Usually, a single GSA is run per machine. The GSA can spawn Grid Service Managers, Grid Service Containers, and other processes. Once a process is spawned, the GSA assigns a unique id for it and manages its life cycle. The GSA will restart the process if it exits abnormally (exit code different than 0), or if a specific console output has been encountered (for example, an Out Of Memory Error).”

The result returned is illustrated in the following diagram. Note the parser here has categorized and selected context specific links to public Wikipedia articles (and related links) to elaborate on the objects identified where such articles exist.

natural language processing

The more ambitious among us (with an internal Wiki) can understand how powerful this can be. Armed with the source code this is potentially a truly wonderful application processor to link dynamic content (think system specifications or requirements) in context back to an entire knowledge base.

If you really want to get your hands dirty and dive right in, there are two widely known frameworks for natural language processing.

  • GATE (General Architecture for Text Engineering)
  • UIMA (Unstructured Information Management Architecture)

GATE is a Java suite of tools originally developed at the University of Sheffield and now used worldwide by a wide community of scientists, companies for all sorts of natural language processing tasks. It is readily available and works well with Protégé (semantic editor) for those of you into ontology development.

UIMA (Unstructured Information Management Architecture) originally developed by IBM but now maintained by the Apache Software Foundation. .

If you need to input plain text and identify entities, such as persons, places, organizations; or relations, (such as works-for or located-at) this may be the way to do it.  Access to both frameworks is open, and there is a large community of developers supporting both initiatives.

For myself, I’m going to use a wonderful commercial product available in the cloud (and on-premise if needed) called The Visible Thread to illustrate many of the key concepts.  Designed for just this need, it also comes with the text engineering collaboration features already built-in and a pretty handy user interface for managing the collection (the fancy word for this is corpus; look it up and impress your friends) of documents I need to evaluate.  As an architect my primary mission is reduce unnecessary effort, so it just makes sense to use this. Of course, if you want to build this yourself there are several options available using GATE or UIMA to develop your own text processor to evaluate system specification document(s).  I hope you will take the time to try some of the samples at the University of Illinois Cognitive Computation Group to get a feel for what is going on under the covers as I discuss the methods and techniques around building better systems through high quality specifications in the coming weeks.  You don’t need to be a rocket scientist (something I can’t relate to either) to appreciate the underlying technology and concepts you can take advantage of to make this meet this important and urgent need a reality in your own practice.

TAGGED:enterprise architectureNatural Language ProcessingSystems Developmenttext engineering
Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

crypto marketing
How a Crypto Marketing Agency Can Use AI to Create Powerful Native Advertising Strategies
Blockchain Exclusive Marketing
data driven insights
How Data-Driven Insights Are Addressing Gaps in Patient Communication and Equity
Analytics Big Data Exclusive
image fx (37)
Boosting SMS Marketing Efficiency with AI Automation
Exclusive
pexels pavel danilyuk 8112119
Data Analytics Is Revolutionizing Medical Credentialing
Analytics Big Data Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

2012 Outlook for Enterprise Architecture

6 Min Read

The Benefits of Business Intelligence

5 Min Read

Enterprise Architecture Still a Leap of Faith for Many Companies

3 Min Read
Natural Language Processing NLP AI
Artificial IntelligenceExclusiveNews

Natural Language Processing: An Essential Element of Artificial Intelligence

6 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

ai in ecommerce
Artificial Intelligence for eCommerce: A Closer Look
Artificial Intelligence
AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?