The Semantic Web and Complementary Technologies

October 4, 2010

In my previous post, I discussed the business case for the semantic web and how it is affecting customer service. In this third part of the series, I address the semantic web in the context of specific enterprise technologies.

IE: No, the Other One

Throw out the term IE and most people think of one of the following:

  • Internet Explorer
  • The Latin term id est (abbreviated i.e.), which means "that is." Example: Phil didn’t embarrass himself on the golf course today; i.e., he shot an 85. (Which I did, by the way, a few weeks ago. Polite applause…)

But there’s another type of IE, and it’s critical from a semantic technology perspective: Information Extraction.

In their book Semantic Web Technologies: Trends and Research in Ontology-based Systems, John Davies, Rudi Studer, and Paul Warren define this type of IE as “a technology based on analyzing natural language in order to extract snippets of information.” IE allows users to easily find five types of information:

Contrast IE with what most enterprises rely upon today: basic information retrieval (IR). IR finds relevant text and returns a simple list. While this is useful, IR forces the user to determine the most relevant piece(s). IE, on the other hand, automatically does this analysis for the end user; it returns only the most germane results and, most important, in a superior format. This may be a spreadsheet that allows for sorting, filtering, and adding fields. In other words, there is greater context.

An Example

All of this may seem a bit abstract. Let’s make it more real. Consider searching for golfers with Google. No doubt that you know what basic Google search results look like. However, what if you performed that same search using a semantic technology? Consider the output below:

Pretty neat, eh? Note how each golfer’s picture, name, and date of birth appear by default. But what if you want to see more fields, such as earnings? No problem. Just type in earnings and you can see just how much money folks like Tiger Woods have made by being able to hit a white ball (when he’s not, er, doing other things).
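For the technically inclined, here is a minimal Python sketch of the IR-versus-IE distinction. It is an illustration only, not how Google or any semantic search product actually works. The three-sentence corpus, the function names retrieve and extract, and the hand-written pattern are all my own assumptions; a real IE system would use NLP rather than a single regular expression.

```python
import re

# Hypothetical toy corpus; the sentences are illustrative only.
DOCUMENTS = [
    "Tiger Woods was born on December 30, 1975 in Cypress, California.",
    "Phil Mickelson was born on June 16, 1970 in San Diego, California.",
    "Golf courses in California are busy in June.",
]


def retrieve(query, documents):
    """IR: return every document that contains all of the query terms."""
    terms = query.lower().split()
    return [d for d in documents if all(t in d.lower() for t in terms)]


# Assumption: a hand-written pattern stands in for real NLP.
BORN_PATTERN = re.compile(
    r"(?P<name>[A-Z][a-z]+ [A-Z][a-z]+) was born on "
    r"(?P<born>[A-Z][a-z]+ \d{1,2}, \d{4}) in (?P<place>[^.]+)\."
)


def extract(documents):
    """IE: pull structured records (name, born, place) out of free text."""
    records = []
    for d in documents:
        match = BORN_PATTERN.search(d)
        if match:
            records.append(match.groupdict())
    return records


# IR hands back a list of matching text; the reader still has to dig.
hits = retrieve("california", DOCUMENTS)

# IE hands back rows, which can be extended, sorted, and filtered.
records = extract(DOCUMENTS)
for record in records:
    # Adding a field is just adding a key derived from existing data.
    record["birth_year"] = int(record["born"][-4:])
by_name = sorted(records, key=lambda r: r["name"])

for record in by_name:
    print(record["name"], "|", record["born"], "|", record["place"])
```

Note that the IR call returns all three sentences, including the irrelevant one about golf courses, while the IE call returns only two rows with named fields. That is the "greater context" in miniature.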

Semantic Technologies

Making magic like this happen obviously requires a great deal of technology behind the scenes. Beyond technology, however, it also requires accurate data, metadata, and tags.

IE is more efficient and ultimately more useful than IR. However, the technological requirements for IE far exceed those of IR. While IR can rely upon simple keywords or text, IE requires much more, often including technologies such as:

  • Natural Language Processing (NLP)
  • Artificial Intelligence
  • Machine Learning
  • Data Mining

To the layperson, all of this is irrelevant. They merely want a way to solve a problem, such as reducing the time required to find the most relevant emails. To this end, consider what one technology company, Meshin, does. It takes a semantic approach to email, using NLP and other technologies to locate relevant emails faster and more accurately than traditional methods.

Information extraction trumps information retrieval. However, let’s remember that we’re in the early stages of Web 3.0 and the economy isn’t great. For these reasons, don’t expect simple IR to go away anytime soon. To be sure, accurate IR is certainly far better than the dismal search functionality of Web 1.0 in the 1990s. As semantic technologies develop, the social web matures, and the economy improves, expect search and the semantic web to do the same.

Don’t believe me? Check out what Eric Schmidt of Google said in a recent interview with Charlie Rose. Things are going to get interesting over the next five years.

Feedback

What say you?