Open Data: What’s It Hiding?
Let’s get this straight – I’m all for open data. Yet the fanfare about open data is taking up so much of our attention in the analytics community that many are unaware of looming threats to public data resources.
If you follow tech industry news, chances are you’ve heard some talk about open data, which refers to governments opening up access to information about government activity. Open data is public data, but not all public data is in that category. Public data also includes information that government agencies actively collect about people and businesses. Data collected in the census, for example, is public data.
Open data isn’t about exposing anything that was formerly secret; this information has long been available through the Freedom of Information Act and other means. What’s new is the way the information is exposed: through application programming interfaces (APIs), which make it easier to develop software applications that use the information. If you are a Joe Average, this means nothing to you directly. But to Jane Geekgal, a programmer and aspiring technology giant, open data is the raw material of new applications. If Jane creates a really useful application, Joe Average may benefit, either because he can use the application, or because Jane’s business will grow and bolster the general economy.
Public data has been available to all since the early days of the nation. We’re all aware of the census conducted every ten years, yet this is only one of many public data resources. The Census Bureau conducts research that provides a wide variety of information on businesses as well as people, including import and export statistics, tax collections, construction and many other activities. The Bureau of Labor Statistics provides information on compensation and employment, prices and productivity. These are the agencies best known as sources of public data, yet many others develop and distribute data which enriches our understanding of the people and businesses in our country. For example, the National Center for Education Statistics conducts a variety of surveys specifically focused on education.
Open data can be a terrific thing, but the hype surrounding it is diverting attention away from the threat to public data resources.
Today, the Census Bureau collects much more than a count of how many people live in a dwelling. Through its American Community Survey, the Census Bureau collects information about people’s ages, race, income and many other factors that enable us to understand the nature of our people, information which is vital for business as well as government decision-making. It uses statistical methods to account for flaws in the data collection process and provide accurate information. Here’s what the Census Bureau has to say about the American Community Survey and how it is used:
What is the American Community Survey?
The American Community Survey (ACS) is an ongoing survey that provides data every year -- giving communities the current information they need to plan investments and services. Information from the survey generates data that help determine how more than $400 billion in federal and state funds are distributed each year.
To help communities, state governments, and federal programs, we ask about:
family and relationships
income and benefits
where you work and how you get there
where you live and how much you pay for some essentials
All this detail is combined into statistics that are used to help decide everything from school lunch programs to new hospitals.
So the American Community Survey provides governments with information needed to make informed decisions about serving the American public. The same data is also widely used by businesses and nonprofit organizations, for purposes such as selecting appropriate locations for stores and charitable service centers. Indeed, the American Community Survey is vitally important to businesses, a fact reflected in this article by Michael Phillips in Bloomberg Businessweek: “Killing the American Community Survey Blinds Business.” Yet there’s an active political movement against it.
Sounds like a crazy conspiracy theory, you say? Have a look at this quote from one political organization: “We oppose the Census Bureau’s obtaining data beyond the number of people residing in a dwelling, and we oppose statistical sampling adjustments.” This position goes directly against the American Community Survey, and arguably against many other public data programs.
Who’s opposed to the Census Bureau’s American Community Survey, and why? Where did that quote come from? Here’s a hint. The same document also states, “We urge that the Voter Rights Act of 1965 codified and updated in 1973 be repealed and not reauthorized.” Still not sure? Here’s one more: “Since education is not an enumerated power of the federal government, we believe the Department of Education (DOE) should be abolished.”
OK, I’ll tell you. These quotes do not come from any fringe organization. In fact, they are the words of a large and popular political organization whose views impact us all, the Republican Party of Texas. All of those quotes came directly from the Texas Republican platform.
What happens in Texas does not stay in Texas; its political influence is felt across our country. Indeed, just a few months ago, the US House of Representatives supported a bill to axe the American Community Survey. People who depend on data must sit up and take notice now.
You need to know about the importance of public data to your work and your life. Toward that end, we’ll be posting a series of articles about public data and its significance. We’ve planned these articles in the spirit of the upcoming conference “The Future of the Federal Statistical System in an Era of Open Government Data,” to be held on September 12 and 13 by the Association of Public Data Users. If you’d like to do more than read, please join us there.
Be on the lookout for these topics in the days to come:
September 6: Why Business Needs Public Data by Joan Naymark
September 11: Ending the American Community Survey: Privacy is Not the Issue by Virginia Carlson
September 13: Protecting Public Data by Meta Brown
Meta Brown is author of "Data Mining for Dummies" (forthcoming from John Wiley and Sons). She has introduced and expanded the use of analytics in offices and factories across the US and beyond. Got a question about promoting analytics? Or on using analytics? Just want to say hello? Tweet Meta @metabrown312 or visit http://www.metabrown.com