Data Mining

Early Indications April 2010 The Web of Opinion: Metadata as conversation

JohnJordan1
Last updated: 2010/04/29 at 7:28 PM
12 Min Read

In the beginning, there was data, enumerating how many, what kind,
where. Data was kept in proprietary formats and physically located:
if the library was missing the Statistical Abstract for 1940, or some
other grad student had sequestered it, you had little chance to
determine corn production in Nebraska before World War II. Such
statistics were the exception: most data remained unpublished, in lab
notebooks and elsewhere.

Once data escaped from print into bits, it became potentially
ubiquitous, and once formats became less proprietary, more people
could gain access to more forms of data. The early history of the web
was built in part on a footing of public access to data: online
collections of maps, congressional votes, stock prices, phone numbers,
product catalogs, and other data proliferated.

Data has always required metadata: that table of corn production had a
title and probably a methodological footnote. Such metadata was
typically contributed by an expert in either the technical field or in
the practice of categorizing. Official taxonomies have continued the
tradition of creators and curators having cognitive authority in the
process of organizing. In addition, as Clay Shirky has pointed out in
“Ontology is Overrated,” the heritage of physicality created the need
for a single correct answer so that an asset could be found: a book
about Russian and American agricultural policy during the 1930s had to
live among books on Russian history, agricultural history, or U.S.
history. It was arguably about any or all of those things, but someone
(most likely at the Library of Congress) assigned it a catalog number
that finalized the discussion: the book in question was officially and
forever “about” this more than it was about that.

In the past decade, the so-called read-write web has allowed anyone to
become both a content creator and a metadata creator. Sometimes these
activities coincide, as when someone tags their own YouTube video.
More often, creations are submitted to a commons, and the
commoners (rather than a cognitive authority) determine what the
contribution “is” and what it is “about.” Rather than editors or peer
reviewers judging an asset’s quality before publication, in more and
more settings the default process is publication then collaborative
filtering for definition, quality, and meaning.

Imagine a particular propane torch for sale on Amazon.com. So-called
social metadata has been nurtured and collected for years on the site.
If I appreciate the way the torch works for its intended use of
brazing copper pipe, I can submit a review with both a star rating and
prose. Amazon soon enabled another layer of social metadata: you, the
reader of my review, can now rate my review, thus creating metadata
about metadata.
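The two layers can be sketched as a small data structure. The class and field names below are illustrative, not Amazon's actual schema: a review is metadata about a product, and votes on that review are metadata about metadata.

```python
from dataclasses import dataclass, field

@dataclass
class Review:
    stars: int                                  # 1-5 star rating of the product
    text: str                                   # prose portion of the review
    votes: list = field(default_factory=list)   # readers' ratings of the review itself

    def rate_review(self, helpful: bool) -> None:
        """A reader rates the review: metadata about metadata."""
        self.votes.append(helpful)

    def helpfulness(self) -> float:
        """Fraction of voters who found the review helpful."""
        return sum(self.votes) / len(self.votes) if self.votes else 0.0

review = Review(stars=5, text="Perfect for brazing copper pipe.")
review.rate_review(True)
review.rate_review(True)
review.rate_review(False)
print(f"{review.helpfulness():.2f} found this helpful")  # 0.67
```

Note that nothing in this structure records what the review is "about," which is exactly the gap the comment threads fill.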

Here is where the discussion gets complicated and extremely
interesting. Suppose I say in my review that I use the Flamethrower
1000 for creme brulee even though the device is not rated (by whatever
safety or sanitation authority) for kitchen use. The comments about
my torch review can quickly become a foodie discussion thread: the
best creme brulee recipe, the best restaurants at which to order it,
regional variations in the naming or preparation of creme brulee, and
so forth. Amazon’s moderators might truncate the discussion to the
extent it’s not “about” the Flamethrower 1000 under review, but the
urge to digress has long been, and will continue to be, demonstrated
elsewhere.

Enter Facebook. The platform is in essence a gigantic metadata
generation and distribution system. (“I liked the concert.” “The
person who liked the concert did not know what she was talking about.”
“My friend was at the concert and said it was uneven.” And so on.)
Strip Facebook of attribute data and there is little left: it’s
essentially a mass of descriptors (including “complicated”), created
by amateurs and never claimed as authoritative, linked by a
21st-century kinship network. Facebook’s announcement on April 21st
of the Open Graph institutionalizes this collection of conversations
as one vast, logged, searchable metadata repository. If I “like”
something, my social network can be alerted, and the website object of
my affection will know as well.
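That dual fan-out, in which one click notifies both the social network and the website, can be sketched in a few lines. The class and method names here are illustrative, not Facebook's actual Open Graph API:

```python
from collections import defaultdict

class LikeGraph:
    def __init__(self):
        self.friends = defaultdict(set)   # user -> set of friends
        self.likes = defaultdict(set)     # object URL -> users who like it
        self.feed = defaultdict(list)     # user -> notifications received

    def befriend(self, a: str, b: str) -> None:
        self.friends[a].add(b)
        self.friends[b].add(a)

    def like(self, user: str, obj_url: str) -> None:
        self.likes[obj_url].add(user)        # the website object "knows as well"
        for friend in self.friends[user]:    # the social network is alerted
            self.feed[friend].append((user, obj_url))

g = LikeGraph()
g.befriend("alice", "bob")
g.like("alice", "example.com/concert")
print(g.feed["bob"])  # [('alice', 'example.com/concert')]
```

Even in this toy version, note that the `likes` log only grows; there is no method for taking a preference back, which is the "unliking" question raised below.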

Back in November, Bruce Schneier laid out five categories of social
networking data:

1. Service data. Service data is the data you need to give to a social
networking site in order to use it. It might include your legal name,
your age, and your credit card number.
2. Disclosed data. This is what you post on your own pages: blog
entries, photographs, messages, comments, and so on.
3. Entrusted data. This is what you post on other people’s pages. It’s
basically the same stuff as disclosed data, but the difference is that
you don’t have control over the data — someone else does.
4. Incidental data. Incidental data is data the other people post
about you. Again, it’s basically the same stuff as disclosed data, but
the difference is that 1) you don’t have control over it, and 2) you
didn’t create it in the first place.
5. Behavioral data. This is data that the site collects about your
habits by recording what you do and who you do it with.
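Schneier's taxonomy reads naturally as a decision procedure. The rule below is my own restatement of the five definitions above, not part of his essay, keyed on who created the item, where it lives, and whether the site demanded or recorded it:

```python
from enum import Enum

class Category(Enum):
    SERVICE = 1      # required to use the site
    DISCLOSED = 2    # posted by you, on your own pages
    ENTRUSTED = 3    # posted by you, on someone else's pages
    INCIDENTAL = 4   # posted about you by others
    BEHAVIORAL = 5   # recorded by the site about your habits

def categorize(required_at_signup: bool, recorded_by_site: bool,
               posted_by_me: bool, on_my_page: bool) -> Category:
    if required_at_signup:
        return Category.SERVICE
    if recorded_by_site:
        return Category.BEHAVIORAL
    if posted_by_me:
        return Category.DISCLOSED if on_my_page else Category.ENTRUSTED
    return Category.INCIDENTAL

# A comment I leave on a friend's page is entrusted data:
print(categorize(False, False, True, False))  # Category.ENTRUSTED
```

The last two branches are the ones you control least: entrusted and incidental data live on pages that belong to someone else.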

What does that list look like today? A user’s trail of “like” clicks
makes this list, or her Netflix reviews and star ratings (themselves
the subject of privacy concerns), seem like merely the tip of the
iceberg. As Dan Frankowski said in his Google Talk on data mining,
people have been defined by their preferences for millennia —
sometimes to the point of dying for them.

With anything so new and so massive in scale (50,000 sites adopted the
“like” software toolkit in the first week), the unexpected
consequences will take months and more likely years to accumulate.
What will it mean when every opinion we express online, from the
passionate to the petty, gets logged in the Great Preference
Repository in the Sky, never to be erased and forever available to be
correlated, associated, regressed, and otherwise algorithmically
parsed?

Several questions follow: who will have either direct or indirect
access to the metadata conversation? What are the opt-in, opt-out,
and monitoring/correction provisions? If I once mistakenly clicked a
Budweiser button but have since publicly declared myself a Molson man,
can I see my preference library as if it’s a credit score and remedy
any errors or misrepresentations? What will be the rewards for brand
monogamy versus the penalties for promiscuous “liking” of every
product with a prize or a coupon attached?

While this technology appears to build barriers to competitive entry
for Facebook, what happens if I establish a preference profile when
I’m 14, then decide I no longer like zoos, American Idol, or Gatorade?
Will people seek a fresh start at some point in an undefined network,
with no prehistory? What is the mechanism for “unliking” something,
and how far retrospectively will it apply?

Precisely because Facebook is networked, we’ve come a very long way
from that Statistical Abstract on the library shelf. What
happens to my social metadata once it traverses my network? How much
or how little control do I have over what my network associates
(“friends” in Facebook-speak) do with my behavioral and opinion data
that comes their way? As both the Burger King “Whopper Sacrifice”
(defriend ten people, get a hamburger coupon) and a more recent
Ikea-spoofing scam have revealed, Facebook users will sell out their
friends for rewards large and small, whether real or fraudulent.

Finally, to the extent that Facebook is both free to use and expensive
to operate, the Open Graph model opens a fascinating array of revenue
streams. If beggars can’t be choosers, users of a free system have
limited say in how that system survives. At the same time, the global
reach of Facebook exposes it to a broad swath of regulators, not the
least formidable of whom come out of the European Union’s strict
privacy rights milieu. As both the uses and inevitable abuses of the
infinite metadata repository unfold, the reaction is sure to be
newsworthy.

Link to original post

TAGGED: data mining, facebook, metadata