SmartData Collective
Data Mining

Early Indications April 2010 The Web of Opinion: Metadata as conversation

Posted by JohnJordan1

In the beginning, there was data, enumerating how many, what kind,
where. Data was kept in proprietary formats and physically located:
if the library was missing the Statistical Abstract for 1940, or some
other grad student had sequestered it, you had little chance to
determine corn production in Nebraska before World War II. Such
statistics were the exception: most data remained unpublished, in lab
notebooks and elsewhere.

Once data escaped from print into bits, it became potentially
ubiquitous, and once formats became less proprietary, more people
could gain access to more forms of data. The early history of the web
was built in part on a footing of public access to data: online
collections of maps, congressional votes, stock prices, phone numbers,
product catalogs, and other data proliferated.

Data has always required metadata: that table of corn production had a
title and probably a methodological footnote. Such metadata was
typically contributed by an expert in either the technical field or in
the practice of categorizing. Official taxonomies have continued the
tradition of creators and curators having cognitive authority in the
process of organizing. In addition, as Clay Shirky has pointed out in
“Ontology is Overrated,” the heritage of physicality led to the need
for one answer being correct so that an asset could be found: a book
about Russian and American agricultural policy during the 1930s had to
live among books on Russian history, agricultural history, or U.S.
history: it was arguably about any or all of those things, but someone
(most likely at the Library of Congress) assigned it a catalog number
that finalized the discussion: the book in question was officially and
forever “about” this more than it was about that.

In the past decade, the so-called read-write web has allowed anyone to
become both a content creator and a metadata creator. Sometimes these
activities coincide, as when, for example, someone tags their own
YouTube video. More often, creations are submitted to a commons, and the
commoners (rather than a cognitive authority) determine what the
contribution “is” and what it is “about.” Rather than editors or peer
reviewers judging an asset’s quality before publication, in more and
more settings the default process is publication, then collaborative
filtering for definition, quality, and meaning.

Imagine a particular propane torch for sale on Amazon.com. So-called
social metadata has been nurtured and collected for years on the site.
If I appreciate the way the torch works for its intended use of
brazing copper pipe, I can submit a review with both a star rating and
prose. Amazon quickly allowed for more social metadata: you, the
reader of my review, can now rate my review, thus creating metadata
about metadata.
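
The layering can be made concrete with a minimal sketch. The class and field names below are hypothetical, chosen only for illustration; nothing here reflects how Amazon actually stores reviews. A review is metadata about a product, and each reader's rating of that review is metadata about the review.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewRating:
    """Second-order metadata: one reader's verdict on a review."""
    rater: str
    helpful: bool


@dataclass
class Review:
    """First-order metadata: one customer's opinion of a product."""
    author: str
    stars: int  # star rating, assumed here to run from 1 to 5
    text: str
    ratings: List[ReviewRating] = field(default_factory=list)

    def helpful_share(self) -> float:
        """Fraction of raters who found the review helpful (0.0 if unrated)."""
        if not self.ratings:
            return 0.0
        return sum(r.helpful for r in self.ratings) / len(self.ratings)


# Hypothetical data: my torch review, then two readers rating the review.
review = Review(author="me", stars=5, text="Works well for brazing copper pipe.")
review.ratings.append(ReviewRating(rater="reader_a", helpful=True))
review.ratings.append(ReviewRating(rater="reader_b", helpful=False))
```

Nothing stops the nesting at two levels: a comment thread on the review would simply be a third layer hanging off the same object.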

Here is where the discussion gets complicated and extremely
interesting. Suppose I say in my review that I use the Flamethrower
1000 for creme brulee even though the device is not rated (by whatever
safety or sanitation authority) for kitchen use. The comments about
my torch review can quickly become a foodie discussion thread: the
best creme brulee recipe, the best restaurants at which to order it,
regional variations in the naming or preparation of creme brulee, and
so forth. Amazon’s moderators might truncate the discussion to the
extent it’s not “about” the Flamethrower 1000 under review, but the
urge to digress has long been and will be demonstrated elsewhere.

Enter Facebook. The platform is in essence a gigantic metadata
generation and distribution system. (“I liked the concert.” “The
person who liked the concert did not know what she was talking about.”
“My friend was at the concert and said it was uneven.” And so on.)
Strip Facebook of attribute data and there is little left: it’s
essentially a mass of descriptors (including “complicated”), created
by amateurs and never claimed as authoritative, linked by a
21st-century kinship network. Facebook’s announcement on April 21st
of the Open Graph institutionalizes this collection of conversations
as one vast, logged, searchable metadata repository. If I “like”
something, my social network can be alerted, and the website object of
my affection will know as well.

Back in November, Bruce Schneier laid out five categories of social
networking data:

1. Service data. Service data is the data you need to give to a social
networking site in order to use it. It might include your legal name,
your age, and your credit card number.
2. Disclosed data. This is what you post on your own pages: blog
entries, photographs, messages, comments, and so on.
3. Entrusted data. This is what you post on other people’s pages. It’s
basically the same stuff as disclosed data, but the difference is that
you don’t have control over the data — someone else does.
4. Incidental data. Incidental data is data the other people post
about you. Again, it’s basically the same stuff as disclosed data, but
the difference is that 1) you don’t have control over it, and 2) you
didn’t create it in the first place.
5. Behavioral data. This is data that the site collects about your
habits by recording what you do and who you do it with.
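
Read as a data structure, the list sorts each category along two axes: who created the data and who controls it. The sketch below is one possible encoding, not Schneier's own; the names are hypothetical, and where the list is silent about control the value is left as None rather than guessed.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Category(Enum):
    SERVICE = "service"
    DISCLOSED = "disclosed"
    ENTRUSTED = "entrusted"
    INCIDENTAL = "incidental"
    BEHAVIORAL = "behavioral"


class Traits(NamedTuple):
    created_by_you: bool
    controlled_by_you: Optional[bool]  # None: the list does not say


# An interpretation of the five definitions above, not an official mapping.
TRAITS = {
    Category.SERVICE: Traits(created_by_you=True, controlled_by_you=None),
    Category.DISCLOSED: Traits(created_by_you=True, controlled_by_you=True),
    Category.ENTRUSTED: Traits(created_by_you=True, controlled_by_you=False),
    Category.INCIDENTAL: Traits(created_by_you=False, controlled_by_you=False),
    # Assumption: data the site records about you counts as created by the site.
    Category.BEHAVIORAL: Traits(created_by_you=False, controlled_by_you=None),
}


def out_of_your_hands() -> list:
    """Categories the list explicitly describes as outside your control."""
    return [c for c, t in TRAITS.items() if t.controlled_by_you is False]
```

Calling out_of_your_hands() returns the entrusted and incidental categories, the two cases in which the list says outright that someone else holds the data.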

What does that list look like today? A user’s trail of “like” clicks
makes this list, or her Netflix reviews and star ratings (themselves
the subject of privacy concerns), seem like merely the tip of the
iceberg. As Dan Frankowski said in his Google Talk on data mining,
people have been defined by their preferences for millennia —
sometimes to the point of dying for them.

With anything so new and so massive in scale (50,000 sites adopted the
“like” software toolkit in the first week), the unexpected
consequences will take months and more likely years to accumulate.
What will it mean when every opinion we express online, from the
passionate to the petty, gets logged in the Great Preference
Repository in the Sky, never to be erased and forever available to be
correlated, associated, regressed, and otherwise algorithmically
parsed?

Several questions follow: who will have either direct or indirect
access to the metadata conversation? What are the opt-in, opt-out,
and monitoring/correction provisions? If I once mistakenly clicked a
Budweiser button but have since publicly declared myself a Molson man,
can I see my preference library as if it’s a credit score and remedy
any errors or misrepresentations? What will be the rewards for brand
monogamy versus the penalties for promiscuous “liking” of every
product with a prize or a coupon attached?

While this technology appears to build barriers to competitive entry
for Facebook, what happens if I establish a preference profile when
I’m 14, then decide I no longer like zoos, American Idol, or Gatorade?
Will people seek a fresh start at some point in an undefined network,
with no prehistory? What is the mechanism for “unliking” something,
and how far retrospectively will it apply?

Precisely because Facebook is networked, we’ve come a very long way
from from that Statistical Abstract on the library shelf. What
happens to my social metadata once it traverses my network? How much
or how little control do I have over what my network associates
(“friends” in Facebook-speak) do with my behavioral and opinion data
that comes their way? As both the Burger King “Whopper Sacrifice”
(defriend ten people, get a hamburger coupon) and a more recent
Ikea-spoofing scam have revealed, Facebook users will sell out their
friends for rewards large and small, whether real or fraudulent.

Finally, to the extent that Facebook is both free to use and expensive
to operate, the Open Graph model opens a fascinating array of revenue
streams. If beggars can’t be choosers, users of a free system have
limited say in how that system survives. At the same time, the global
reach of Facebook exposes it to a broad swath of regulators, not the
least formidable of whom come out of the European Union’s strict
privacy rights milieu. As both the uses and inevitable abuses of the
infinite metadata repository unfold, the reaction is sure to be
newsworthy.

Tags: data mining facebook metadata
JohnJordan1 April 29, 2010
