SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

The Data Time Investment

Venky Ganti

In a prior blog post on challenges beyond the 3 V’s of working with data, I discussed some issues that hinder the efficiency of data analysts and drastically raise the bar on the motivation they need to begin working with new data.

Here, I will drill down into those issues and share my past experience with them.

How do I Find and Understand Data?

Let’s consider the scenario in which an engineer or a data analyst inside Google wants to find relevant data, say, a table in Dremel or an SSTable on GFS. She still has to remember the name of the table and which of Google’s myriad data stores contains it. Further, unlike documents, which are self-describing, it is not easy to “understand” what is inside a dataset and how to use it. The user needs to understand the data by talking to people who know about it, or through some alternative means. Contrast the effort an engineer within Google spends finding and understanding data with the effort an external user spends using Google to find and understand information on the web.

Let me recall one of my own frustrating experiences in a similar scenario. I worked on the AdWords team at Google. I needed to find information about search queries that led to similar user behavior on Google’s products, specifically Search and Ads. I felt that there must be several such datasets in the Search and Ads teams. I found two in the Ads teams because I knew someone who worked on those projects. But it turned out, after further investigation, that I could not use either because of differences in their target applications. I had little luck finding similar information from the Search teams. I tried building my own dataset, spent months on it, and didn’t succeed. Recently, after I left Google, an ex-colleague told me he chanced upon a pointer to the right data and successfully used it!

Of course, these problems around finding and understanding data are not peculiar to Google; they exist at any organization that leverages data to enhance its decision-making and its products. In general, an engineer at Google has a better chance of overcoming these problems thanks to awesome internal tools (e.g., code search).

Much of the technology related to data has focused on processing massive amounts of data and visualizing results better. But there is no focus on empowering users to find and understand data within these databases so that they can prepare queries and programs more reliably and efficiently.

The primary reason for the lack of focus on these issues, in my opinion, is that it is much easier to measure and show concrete progress on query processing efficiency and visualization capabilities. On the other hand, it is hard today to articulate the benefits of helping data users find and understand data. By the way, wasn’t this true for search over the web until Google came along and illustrated the economic and productivity gains across a wide spectrum of users? I believe that we are on the cusp of a similar revolution in data consumption.

Who do I Ask?

After an analyst finds a dataset, she needs to understand how other analysts and applications use it. Often, it is very hard to find such knowledgeable users. There were many times when I found it quite hard, even at Google, to identify the people I needed to talk to about such questions; when I did find them, I felt the pain of distracting engineers with run-of-the-mill questions that they must have answered many times over.

As an example, I was responsible for migrating an application from reading data in one engine to a newer, more robust engine. A big part of the migration involved rewriting queries to read from the new schema. I was among the last few to do this migration, so similar questions must already have been answered. But the wiki that I was pointed to didn’t have all the information I needed. So, very reluctantly, I had to drag myself to a very busy principal engineer, the only one I knew directly, to get help. I would have appreciated it a lot if I could have quickly found someone else who had gone through a similar migration.

On the flip side, I would answer the same set of questions over and over about data that I produced and maintained. I tried creating a wiki page, but I was still asked lots of questions. As we all know, this approach comes with its own challenge: keeping the wiki updated and reliable over time. In retrospect, I wouldn’t be surprised if my colleagues or I missed a few updates.

How much Time?

So, how much time do analysts actually spend on finding and understanding data? I haven’t tried measuring this yet. We just don’t have the methodology and the tools to do it. The answer varies widely depending on whom you ask and which data they need to use. Users new to a particular dataset will spend upwards of 80% of their time on these tasks, while experts spend much less. However, experts spend their time answering other users’ questions over and over.
