Wordtree for Visual Text Exploration

May 26, 2009

Analytics can be all about having the right tool for the job. When your data is text, traditional analysis tools (e.g. Excel, OLAP tools) are like peeling a mango with a chainsaw.

There are a number of visual exploration tools specifically designed for text data, including:

  • Word clouds like Wordle (fun but superficial);
  • Network diagrams like Visual Thesaurus (good for individual words, not text);
  • Trend graphs like Baby Name Voyager or Google Trends;
  • Granular presentations for interacting with and exploring individual phrases, e.g. We Feel Fine and Twistori;
  • “Word trees” that let you navigate through lines of text to understand the most frequent words, relationships between words, and common phrase and sentence structures.

It is quite difficult to find a Word Tree in the wild. The brilliant team at IBM’s Many Eyes was the first to make Word Trees generally available. The same Many Eyes team has also created an alternative approach to visual text exploration with a tool called Phrase Net.

Phrase Net

Recently, we built a slightly different take on the Word Tree in Concentrate, our tool that lets users explore huge search query lists to see how people use search keywords. For geeky entertainment, we created a special Concentrate demo account with the lyrics of songs from Rolling Stone’s 500 Greatest Songs of All Time. Click here to sign in to the demo (press Submit and then choose WordTree at the top).

Here’s how our Word Tree works:

  • The box at the center is your starting point. When you open a Word Tree, it will contain the most common word in the text data. You can edit this box to “re-center” the Word Tree (name that tune):

Wordtree image

  • Stretched out on either side are words and phrases that are tied to that center word. The size of the words represents their relative frequency.

Wordtree image

  • Rolling over a word or phrase highlights its connections to the center word and to the words on the other side. You’ll also see a pop-up box with example phrases that contain the selected words.

Wordtree image

  • You can open or close branches by clicking on a word. Words with hidden branches are highlighted in orange. We also have the ability to colorize the words based on a metric in your text data.
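
For readers who want to experiment, the idea behind the list above can be sketched in a few lines of Python. This is only a rough illustration and not how Concentrate is actually built: the function names (`tokenize`, `most_common_word`, `build_branch`, `word_tree`), the `depth` parameter, and the sample lines are all made up for the example. It picks the most common word as the default center, collects the words on either side of each occurrence, and counts how often each branch appears (the count is what would drive the word size).

```python
import re
from collections import Counter


def tokenize(line):
    """Lowercase a line and split it into simple word tokens."""
    return re.findall(r"[a-z']+", line.lower())


def most_common_word(lines):
    """Return the most frequent word, used as the default center."""
    counts = Counter(word for line in lines for word in tokenize(line))
    return counts.most_common(1)[0][0]


def build_branch(sequences):
    """Nest word sequences into {word: {"count": n, "children": {...}}}."""
    tree = {}
    for seq in sequences:
        node = tree
        for word in seq:
            entry = node.setdefault(word, {"count": 0, "children": {}})
            entry["count"] += 1
            node = entry["children"]
    return tree


def word_tree(lines, center=None, depth=3):
    """Collect up to `depth` words before and after each occurrence of `center`."""
    center = center or most_common_word(lines)
    before, after = [], []
    for line in lines:
        words = tokenize(line)
        for i, word in enumerate(words):
            if word == center:
                # Reverse the left context so the word nearest the center
                # becomes the first level of the left-hand branch.
                before.append(words[max(0, i - depth):i][::-1])
                after.append(words[i + 1:i + 1 + depth])
    return {
        "center": center,
        "before": build_branch(before),
        "after": build_branch(after),
    }


# Invented sample lines, standing in for real lyrics or search queries.
lines = [
    "i want to hold the line",
    "i want to see the sun",
    "you want to see the sun",
    "we want a little time",
]
tree = word_tree(lines)
print(tree["center"])
print(tree["after"]["to"]["count"])
```

Passing `center="see"` (or any other word) re-centers the tree, much like editing the center box. Collapsing a branch would simply mean not drawing a node's `children`.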

While these more advanced visualizations are a start, I suspect there is a lot of room for other tools and techniques to visually explore text data. I’d be curious to hear about other tools you’ve seen along these lines.
