Twitter: Rubbish, Valuable, or Both?

We reach a certain age and the music gets too loud. We believe that the world is going to hell in a hand basket. Things were just better when we were young.

This is always the case, and today is no exception. The complaints can be deafening. Young people don’t read books anymore. Look at what young folks are wearing. We decry the state of society and the future.

Of course, we’ve seen this movie before; every generation does this. Many people my age and older dismiss Twitter and the very idea of tweeting. Somehow, reading books and newspapers was more sophisticated than texting, tweeting, blogging, and friending.

The Remarkable Data Behind 140 Characters

That may be true on some abstract cultural level. (Curmudgeon Andrew Keen would wholeheartedly agree.) On a data or technological level, though, nothing could be further from the truth. In fact, the data and technology behind 140 characters are nothing short of remarkable. The BusinessWeek story “The Hidden Technology That Makes Twitter Huge” details the data and metadata behind each tweet.

From the piece:

All tweets share the same anatomy. To examine the guts of a tweet, you request an “API key” from Twitter, which is a fast, automated procedure. You then visit special Web addresses that, instead of nicely formatted Web pages for humans to read, return raw data for computers to read. That data is expressed in a computer language—a smushed-up nest of brackets and characters. It’s a simplified version of JavaScript called JSON, which stands for JavaScript Object Notation. API essentially means “speaks (and reads) JSON.” The language comes in a bundle of name/value fields, 31 of which make up a tweet. For example, if a tweet has been “favorited” 25 times, the corresponding name is “favorite_count” and “25” is the value.

Think about it: 31 data fields captured for each tweet. I’d bet my house that that number will only rise in the coming years. Types of data that we can’t even imagine will one day be tracked, analyzed, and used in unfathomable ways.

The notion of 140 characters may seem inherently limiting. What can you really know from such a small amount of text? The answer depends, but surely a great deal more can be gleaned from a tweet’s metadata. Think about things like a tweet’s location, user, device, date, time, URL, and the like. All of a sudden, tweets can become more informative, contextual, and maybe even predictive.
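To make the point concrete, here is a minimal Python sketch of how much more than the text a tweet-like JSON payload can carry. The payload is hand-written for illustration and is not real Twitter output. Only "favorite_count" comes from the BusinessWeek excerpt; every other field name, and the helper summarize_tweet, is an assumption of mine and may not match what Twitter's API actually returns.

```python
import json

# Hypothetical, hand-written payload for illustration only.
# "favorite_count" is mentioned in the article; the other field
# names are assumptions and may differ from Twitter's real API.
RAW_TWEET = """
{
  "text": "Just landed in Las Vegas.",
  "favorite_count": 25,
  "created_at": "2013-11-08T17:30:00Z",
  "source": "example-phone-app",
  "user": {"screen_name": "example_user"},
  "coordinates": {"latitude": 36.17, "longitude": -115.14}
}
"""


def summarize_tweet(payload: str) -> dict:
    """Parse a tweet-like JSON string and separate the text from its metadata."""
    tweet = json.loads(payload)
    text = tweet.get("text", "")
    return {
        # Length of the visible message, the part capped at 140 characters.
        "characters": len(text),
        # The name/value pair used as the example in the quoted article.
        "favorite_count": tweet.get("favorite_count", 0),
        # Everything that is not the message itself counts as metadata here.
        "metadata_fields": sorted(name for name in tweet if name != "text"),
    }


summary = summarize_tweet(RAW_TWEET)
print(summary)
```

Even in this toy version, the message is one field out of six. The other five describe who, when, where, and how, which is where most of the analytical value tends to sit.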

Simon Says: Learn from Twitter

The lesson here is twofold. First, recognize that you can’t predict the future. As I’ve said myriad times in my career, it’s better to have it and not need it than to need it and not have it. With data storage costs plummeting, why not track every type of data you can?

Second, ensure that your organization’s infrastructure can support new data types, emerging sources, and increased volumes. Yes, ETL is still important, but the future is all about APIs. Make sure that your organization is keeping up with the times. The costs of inaction are irrelevance and possibly extinction.