Austinites Really Love Music & Kevin Durant is Kind of a Big Deal, So Says Data Science

May 8, 2015

Let’s talk a little bit about true audience segmentation: segmentation that goes beyond naive demographics and splitting people into quartiles, and far deeper than simple queries and filters.

At Umbel, the data ecosystem populating our Digital Genome allows us to create an emergent, rich understanding of the subtle patterns spread throughout and across any audience. Let’s step through what I like to call the “data looking-glass” together. 

To begin, what do we want out of audience segmentation anyway? Getting a sense of the average age and geography of your audience is great, but we can learn so much more. What if you could take any arbitrary slice through your data based on any whim and instantly zoom in on what specifically makes that cohort unique? 

Well, with the Digital Genome and a little bit of science, you can do exactly that.

Real Data Science, Right Now

Don’t believe it? Let’s start with a few examples to prove the power of what I’m talking about. 

I spent a few minutes exploring some data that’s near and dear to everyone here at Umbel. Being an Austin-native startup, I thought it would be interesting to look at the audience of some of our beloved local music festivals and events. What I love about this data is that we can see broad characteristics of music, sports and media fans as well as the distinct local signature. This isn’t simply adding up numbers of likes; we can find what makes any segment of any audience truly special.

So here’s how it all works: the collective audience of our local music festivals has millions of affinities to brands and bands and media and anything else you can imagine, both global and local. Each affinity, each demographic data point, each location on the globe is a dimension in this enormous data space. This isn’t an ocean of data: it’s a universe. If you ever gave yourself a headache trying to imagine the 4-dimensional space-time described in Stephen Hawking’s “A Brief History of Time,” then you can imagine the mind-bender that is multimillion-dimensional spaces. In a later post, I’ll show you how you can see and interact with these data universes. But for now, let’s stick to segments.
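To make the "every affinity is a dimension" idea concrete, here is a minimal sketch in Python. The audience, the names, and the `to_vector` helper are all made up for illustration; this is not how the Digital Genome is actually stored, just one common way to picture the idea.

```python
# Hypothetical toy audience: each person is a set of affinities.
audience = {
    "fan_1": {"Willie Nelson", "Wilco", "tacos"},
    "fan_2": {"Radiohead", "Spoon"},
    "fan_3": {"Willie Nelson", "Johnny Cash", "Nirvana"},
}

# Every distinct affinity gets its own axis (dimension) in the data space.
dimensions = sorted(set().union(*audience.values()))
axis = {name: i for i, name in enumerate(dimensions)}


def to_vector(affinities):
    """Return a 0/1 vector with one coordinate per known affinity."""
    vec = [0] * len(dimensions)
    for name in affinities:
        vec[axis[name]] = 1
    return vec


vectors = {person: to_vector(aff) for person, aff in audience.items()}
print(len(dimensions), "dimensions")
print(vectors["fan_1"])
```

Three fans with seven distinct affinities already live in a 7-dimensional space. With millions of affinities, dense lists like these would presumably give way to a sparse representation, since most coordinates for any one person are zero.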

Intuitively, I have some feelings about the kinds of preferences that must exist in the crowds attending live music and events. Austin represents a diverse set of musical tastes. It shouldn’t be a surprise to anyone that Austinites love Willie Nelson and Johnny Cash. If you take a second, you can probably imagine who you think that audience is. But with the data and technologies at Umbel, I can show you explicitly who that audience is. If we segment the entirety of the events universe by Willie and The Man in Black, we can generate a list of the characteristic data that fundamentally defines that segment. 

This is really amazing. First, there are affinities here that make perfect sense. Wilco is exactly the sort of music outlaw country fans dig. And, there are clear signatures that this is an Austin audience: tacos are popular and The Highball, a trendy local bowling alley and karaoke bar, recently reopened and is trending – which should make Willie Nelson fans in Austin happy. But there are some surprises here, too. For instance, Nirvana and David Bowie rank high for this audience segment. Who knew?

More Than What Meets The Eye

I want to be clear here: what we’re looking at isn’t simply the most popular brands, filtered by an affinity to Willie and Johnny Cash. Even if you had the data, you’d never find these signatures with SQL. Instead, these are the representative affinities that define what is interesting and unique about this particular audience segment. Let’s take a look at some more segments and start to paint a picture of how deep this can go.
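One simple way to see the difference between "popular" and "characteristic" is a lift score: how much more common an affinity is inside the segment than in the audience as a whole. The sketch below uses lift as a stand-in, with invented data and hypothetical function names; this post doesn't spell out the method Umbel actually uses, and it is likely more sophisticated than this.

```python
from collections import Counter

# Hypothetical toy audience; the names and counts are invented for illustration.
audience = [
    {"Willie Nelson", "Johnny Cash", "Wilco", "Coke"},
    {"Willie Nelson", "Johnny Cash", "Wilco", "Coke"},
    {"Willie Nelson", "Johnny Cash", "Nirvana", "Coke"},
    {"Radiohead", "Spoon", "Coke"},
    {"Radiohead", "Arcade Fire", "Coke"},
    {"Spoon", "Coke"},
]


def segment_by(people, seeds):
    """Keep the people who have every seed affinity."""
    return [p for p in people if seeds <= p]


def rates(people):
    """Fraction of people holding each affinity."""
    counts = Counter(a for p in people for a in p)
    return {a: c / len(people) for a, c in counts.items()}


def lift(people, seeds):
    """Segment rate divided by overall rate, for every non-seed affinity."""
    overall = rates(people)
    inside = rates(segment_by(people, seeds))
    return {a: r / overall[a] for a, r in inside.items() if a not in seeds}


seeds = {"Willie Nelson", "Johnny Cash"}
scores = lift(audience, seeds)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

In this toy data, Coke appears in every profile, so a raw popularity ranking would put it first, yet its lift is 1.0: it says nothing special about the segment. Wilco and Nirvana come out at 2.0 because they are twice as common inside the segment as in the audience overall.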

I said before that Austin music fans have broad tastes, and now I feel the need to back that up. So I tossed Radiohead and Arcade Fire into my program and within a few seconds, this list pops out: 

Again, some of these affinities are predictable, but I was happy to see Spoon, one of our original homegrown indie bands that hit the big time, show up on the list. 

Interestingly, two bands that straddle the alt-folk genre are in both lists: Edward Sharpe and Wilco.

But the data doesn’t only give us a picture of related musical preferences. I built two more segments around major publications to see how these Austin live music audience segments break down for print media. 

The “New Yorker” segment showed us that its readers are also fans of:

JetBlue offers direct flights from Austin to New York City, so it’s immediately obvious why they’re a fundamental component of the New Yorker segment of the Austin music audience. Also, stylish reading glasses, TOMS shoes and our local art museum make an appearance in this segment.   

Compare that segment, though, to readers of Rolling Stone, and what you get is a fundamentally different segment:

All of these segments in this post are coming from the same data set, so these affinities are not merely a set of things that are broadly popular. What we’re seeing here is what makes these groups of people different, and also what makes them the same.
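To picture "different, and also the same" in code, one option is to find the affinities that are over-represented in each segment relative to the whole audience, then take set intersections and differences. The sketch below is a toy under stated assumptions: the data is invented (it is not the real result behind this post), and the `characteristic` helper and its 1.2 threshold are hypothetical.

```python
from collections import Counter

# Hypothetical toy audience drawn from a single shared data set.
audience = [
    {"The New Yorker", "JetBlue", "TOMS", "Spoon"},
    {"The New Yorker", "JetBlue", "Spoon"},
    {"Rolling Stone", "Nirvana", "Spoon"},
    {"Rolling Stone", "Nirvana", "Spoon"},
    {"Coke"},
    {"Coke", "JetBlue"},
]


def characteristic(people, seed, min_lift=1.2):
    """Affinities over-represented among people who have `seed`."""
    segment = [p for p in people if seed in p]
    overall = Counter(a for p in people for a in p)
    inside = Counter(a for p in segment for a in p)
    result = set()
    for name, count in inside.items():
        if name == seed:
            continue
        seg_rate = count / len(segment)
        all_rate = overall[name] / len(people)
        # Keep the affinity only if the segment over-indexes on it.
        if seg_rate / all_rate >= min_lift:
            result.add(name)
    return result


new_yorker = characteristic(audience, "The New Yorker")
rolling_stone = characteristic(audience, "Rolling Stone")

print("shared:", sorted(new_yorker & rolling_stone))
print("only New Yorker:", sorted(new_yorker - rolling_stone))
print("only Rolling Stone:", sorted(rolling_stone - new_yorker))
```

Because both segments are scored against the same underlying audience, an affinity that shows up in the shared set is something the two groups over-index on together, not just something everyone likes.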

Comparison Analytics: The Spurs vs. The Mavs

To wrap this up, let’s try to do some bridge-building. While the University of Texas Longhorns are the clear favorite for local sports teams, Austin is the largest city in the U.S. without a professional team in one of the four major leagues. So, our loyalties here are divided. 

Knowing this, I wanted to explore two similar but rival fan segments and see what I could learn. To do this, I compared the Dallas Cowboys/Dallas Mavericks segment against the Dallas Cowboys/San Antonio Spurs segment, still using the same underlying data. These are fans who share a love for the Cowboys but are split across a mighty gulf when it comes to the NBA. Here are the results, and I’m certain anyone familiar with these teams can tell which segment is which: 

Segment A:

Segment B:

I want to detour for a second to note that if you sort any list of people by brand affiliation, Coke and Bud Light will float to near the top. Their brand ubiquity is impressive, but with these segments, Dr. Pepper and Dos Equis outperform. What does that say about these segments? That we really are capturing fundamental characteristics in completely novel, interactive ways.

If you’re paying attention, you’ve probably gleaned that Segment A is our Cowboys/Spurs cohort and Segment B is the all-Dallas fan club. The Mavs and the Spurs take great delight in knocking each other out of the Western Conference playoffs, and apparently they can’t even agree on fast food preferences. 

But local pride still brings Austin NBA fans together. After all, Kevin Durant won the Naismith Award playing for the University of Texas, so it’s no surprise in retrospect to see Austin NBA fans broadly showing their love for the alumnus. 

And look at that, we just used data science to find the one thing Spurs and Mavs fans can agree on.