SmartData Collective

Data Infrastructures

hstevens

It is obvious to say that big data rely on computers. It would be practically impossible to implement big data algorithms with pencil and paper – it would take too much paper and too long to write all the data down. But scale is not the only thing that ties big data to computers – big data depend on computer hardware and software in all kinds of ways. Data have to be manipulated into certain forms to get inside computers, to be transmitted over networks, and so on.

These shapings of data are often hidden from the direct view of data users. But they are important – the shapes that data can take, and the ways in which data can be manipulated, determine the kinds of things that data can show and tell us. I call these shapes ‘data infrastructures’ – that is, all the structures and forms that data must take inside the computer in order to be manipulated as big data. To give a sense of how significant such infrastructures are and what kinds of influence they have on our thinking, I will explore one ubiquitous example here in some detail.

In the twenty-first century, perhaps the most significant data infrastructure of all is the World Wide Web (WWW). What is the structure of the WWW? Well, it’s a web, of course! The WWW’s most important feature, and arguably the reason for its great success, is the hyperlink: text from any one WWW source can be “marked up” so that it forms a “link” to any other WWW source. This means that data can be cross-linked, suggesting ways of reading and writing that are multiple and non-linear. 
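
To make the idea of “marking up” a link a little more concrete, here is a minimal sketch in Python that pulls the link targets out of a fragment of HTML, using the standard library's `html.parser` module. The page text and the addresses in it are invented for illustration; the point is only that a link is a small piece of structure embedded in the text, and that any page can point to any other.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the link targets found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


# Hypothetical page: the text and addresses are made up for this example.
PAGE = """
<p>Mosquitoes are <a href="/insects.html">insects</a> that can carry
<a href="/diseases/malaria.html">malaria</a>.</p>
"""

links = extract_links(PAGE)
```

Nothing in the markup says where a link must lead or how the linked pages relate to one another, which is what makes the structure a web rather than a tree.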

What was the context in which this system was designed and the purposes it was intended to serve? In the 1980s, Tim Berners-Lee, the WWW’s designer, was a computer programmer working at the Conseil Européen pour la Recherche Nucléaire (CERN – a massive high energy physics lab straddling the border between France and Switzerland). Berners-Lee saw a failure of information management: the many computers at CERN stored information in different ways and in different formats. Although the computers were networked, there was little practical way to find out anything about what was stored on other machines. Berners-Lee saw work being duplicated and effort wasted due to the inability to share data effectively. The WWW was his solution: it was intended to radically expand the circulation of all kinds of information within the closed community of CERN.

In the late 1980s and early 1990s, the WWW was not the only solution to the problem of managing information in an online network. As various networks around the world were joined together into the Internet, different ideas emerged as to how to organize all this newly accessible information. For instance, in 1991, a team at the University of Minnesota released Gopher – a protocol for retrieving documents over the Internet. Unlike the WWW, Gopher consisted of a series of menus: if you wanted to find a page about, for example, mosquitoes, you might navigate to a menu of animals, then to a menu of insects, and then to the page you wanted. In the early 1990s, Gopher was a real alternative to the WWW – it imposed more hierarchy and organization on data and was therefore faster and more intuitive for finding many kinds of information.
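
The difference between the two structures can be sketched in a few lines of Python. This is not the Gopher protocol itself, only an illustration of a menu hierarchy as nested dictionaries, set beside a web-style structure in which any page may link to any other. All menu names, page names, and contents here are invented.

```python
# Hypothetical Gopher-style menu tree: each entry leads either to a
# sub-menu (a dict) or to a document (a string). All content is invented.
GOPHER_MENUS = {
    "Animals": {
        "Insects": {
            "Mosquitoes": "A page about mosquitoes.",
            "Beetles": "A page about beetles.",
        },
        "Mammals": {
            "Bats": "A page about bats.",
        },
    },
}


def follow_menus(root, path):
    """Walk down the menu tree, making one menu selection per step."""
    node = root
    for choice in path:
        if not isinstance(node, dict) or choice not in node:
            raise KeyError(f"no menu entry {choice!r}")
        node = node[choice]
    return node


# Hypothetical web-style structure: each page lists the pages it links to.
# Links can point anywhere, so no single path to a page is the "right" one.
WEB_LINKS = {
    "mosquitoes.html": ["bats.html", "malaria.html"],
    "bats.html": ["mosquitoes.html"],
    "malaria.html": ["mosquitoes.html"],
}

page = follow_menus(GOPHER_MENUS, ["Animals", "Insects", "Mosquitoes"])
```

In the menu tree, each document sits at the end of exactly one sequence of choices, so finding it means knowing (or guessing) the categories above it. In the link structure there is no top and no categories at all, only pages pointing at one another.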

I describe Gopher to show that, however much we now take the WWW for granted, it is in fact one amongst several possible alternatives. It is one particular way of structuring information and the relationships between different pieces of information. It was designed for a particular purpose and that purpose rendered the structure of the WWW particularly decentralized, freeform, and non-hierarchical. This has some advantages, such as accommodating many different kinds of information. But it also has some disadvantages, such as a lack of organization or indexing of information (a problem we have had to solve using search engines).

In any case, the structure of the WWW is not neutral. It makes doing some tasks easier and others harder; it makes some paths or connections simple to follow, others not so simple. Structures like the WWW are so ubiquitous that they become invisible. This, however, does not diminish their importance.

When it comes to big data, we find data infrastructures everywhere: from the structure of hard drives to the organization of algorithms and databases. These physical and virtual structures place constraints on how data can be organized, processed, and accessed. Understanding the advantages and disadvantages of big data (in various forms) means understanding these structures – in particular it means knowing where they came from and what they were designed to do. Ultimately, what we get out of big data will be constrained by the structures we put it into.

 
