Data intimacy

February 25, 2009

Long before Scott Davies made the self-service ETL tool he calls Lyza, he tried to find out how analysts really work. He remembers in particular the woman in a focus group who said, “I want to stay close to the data.”

He didn’t understand at first. The data was right in front of her, neatly summarized. But she meant all of the data, every little bit of it. She wanted to snap open a zillion-row-long window that she could scroll down to see the figures flip by. (Yes, you can; I saw it yesterday.) She wouldn’t try to read them, she’d only see their shapes. She could say, for example, “Hmm, I see that just two thirds are under 1000.” Davies calls that visualization with browse—as legitimate a use of “visualization” as any I’ve heard of.

He also thought about how people use Excel. In fact, it helps explain Excel’s popularity. You have the data, and you have the formulas, and you can reveal either one. If a number shows up that doesn’t look right (say it’s six figures instead of five), you just look at the formula. You say, “Oh, that’s the annual figure. I forgot to divide by twelve.”

Something similar goes on at all levels of analysis: a rapid back and forth from question to answer, back to a rephrased question, and back to an adjusted answer.

Forget the flow charts. Forget the “data train,” a metaphor I admit to having used. Analysis is more like what my Labrador does when she knows there’s something good nearby. She sniffs in what looks like a random pattern until you realize she’s narrowing the range.

What drives analysts crazy about working with IT, Davies says, is that the data is taken away. The conversation goes like this: the IT guy asks what the analyst wants; the analyst describes her best guess; the IT guy goes away and does it. But that may not be what the analyst really needed, and she may not realize it until the first batch of data is tried and either proves inadequate or suggests yet another path.

I can relate, because it’s like writing. I do a lot of scribbling and writing over, and I don’t have time to explain it. If I had to tell a typist what to write, I’d write much less.

Visualize the bumper stickers: “free the analysts” but also “free IT.”

Now, Larissa T. Moss has her doubts. Perhaps she’ll sit for a demo. I’d like to hear what she says.
