National Public Radio is not the most obvious place for a deep dive on “The New World of Massive Data Mining,” so it may have surprised listeners when the April 2 edition of Diane Rehm’s popular NPR discussion program devoted an hour to just that topic. But the wide-ranging conversation covered some fascinating territory and attracted thoughtful questions from the audience.
Guests included John Villasenor, a senior fellow at the Brookings Institution and professor of electrical engineering at UCLA; Suzanne Iacono, senior science advisor for computer and information science and engineering at the National Science Foundation; and Michael Leiter, senior counselor at Palantir Technologies and former director of the National Counterterrorism Center.
You can hear the conversation or read the whole transcript here. In the meantime, though, here are a few highlights:
- Leiter: The challenge of big data is not only the volume – “it’s also the speed with which it’s coming in, and the variety of forms of the data.” The most important requirements for managing and utilizing big data are first, integrating the data to discover meaningful correlations, and second, doing that in a “flexible, agile way” so human beings (not algorithms) can explore the data effectively.
- Iacono: “We’re seeing a huge transformation in science,” brought about by the shift from relatively small datasets to massive quantities of information. Big data creates “opportunities to address national challenges like clean energy and cyberlearning in completely new ways that we’ve never thought about before.”
- Villasenor: “One of the most remarkable statistics among many in the technology world” is the tremendous decline in storage costs over the last three decades. “It now costs less than seventeen cents to store everything one person says on the telephone in a year.”
- Leiter: “We have to make sure that the same technology that is used to leverage this data for very good purposes can also [be used] and is also used to protect privacy and civil liberties.” That could mean, for example, auditing the information that’s looked at, and putting controls on how it’s used.
- Villasenor: “Advertisers will talk the talk in terms of respecting consumer privacy when it suits their interest.” But as long as there’s a “fundamental underlying financial incentive for advertisers to know as much as they can about you,” they’re always going to push the boundaries in order to gather more information.
- Iacono: “There’s a whole new area called ‘Green IT,’” and a lot of computer scientists want to ensure that huge data centers and server farms are not using excessive amounts of electricity and other resources. “Right now we’re grappling with these issues.”
Next Steps: Download our complimentary “5-Minute Guide to Business Analytics” and learn how analytics technologies can help you uncover the most relevant data when you need it.