SmartData Collective

The Data Time Investment

Venky Ganti

In a prior blog post on challenges beyond the 3V’s of working with data, I discussed some issues that hinder the efficiency of data analysts and drastically raise the bar on the motivation they need to begin working with new data.

Here, I will drill down into those issues and my past experience with them.

How do I Find and Understand Data?

Let’s consider the scenario in which an engineer or a data analyst inside Google wants to find relevant data, say, a table in Dremel or an SSTable on GFS. She has to remember the name of the table and which of Google’s myriad data stores contains it. Further, unlike documents, which are self-describing, it is not easy to “understand” what is inside a dataset and how to use it. The user needs to understand the data by talking to people who know about it, or through some alternative means. Contrast the effort an engineer within Google spends finding and understanding data with the effort an external user spends using Google to find and understand information on the web.

Let me recall one of my own frustrating experiences with a similar scenario. I worked on the AdWords team at Google, and I needed to find information about search queries that led to similar user behavior on Google’s products, specifically Search and Ads. I felt that there must be several such datasets within the Search and Ads teams. I found two in the Ads teams because I knew someone who worked on those projects, but further investigation showed that I could not use either because of differences in target applications. I had little luck finding similar information from the Search teams. I tried building my own, spent months on it, and didn’t succeed. Recently, after I left Google, an ex-colleague told me he had chanced upon a pointer to the right data and successfully used it!

Of course, these problems around finding and understanding data are not peculiar to Google; they exist at any organization that leverages data to enhance its decision-making and its products. In general, an engineer at Google has a better chance of overcoming these problems thanks to awesome internal tools (e.g., code search).

Much of the technology related to data has focused on processing massive amounts of data and visualizing results better. But there is no focus on empowering users to find and understand data within these databases so that they can prepare queries and programs more reliably and efficiently.

The primary reason for the lack of focus on these issues, in my opinion, is that it is much more concrete to measure and show progress on query processing efficiency and visualization capabilities. On the other hand, it is hard today to articulate the benefits of helping data users find and understand data. By the way, wasn’t this true for search over the web until Google came along and illustrated the economic and productivity gains across a wide spectrum of users? I believe that we are at the cusp of a similar revolution in data consumption.

Who do I Ask?

After an analyst finds a dataset, she needs to understand how other analysts and applications use it. Often, it is very hard to find such knowledgeable users. There were many times when I found it quite hard, even at Google, to identify the people I needed to talk to for such questions; when I did find them, I felt the pain of distracting engineers with run-of-the-mill questions that they must have answered many times over.

As an example, I was responsible for migrating an application that read data from one engine to a newer, more robust engine. A big part of the migration involved rewriting queries to read from the new schema. I was among the last few to do this migration, so similar questions must already have been answered. But the wiki that I was pointed to didn’t have all the information I needed. So, I had to drag myself very reluctantly to a very busy principal engineer, the only one I knew directly, to get help. I would have greatly appreciated being able to quickly find someone else who had gone through a similar migration.

On the flip side, I would answer the same set of questions over and over about data that I produced and maintained. I tried creating a wiki page, but was still asked lots of questions. As we all know, this approach comes with its own challenge: keeping the wiki updated and reliable over time. In retrospect, I wouldn’t be surprised if my colleagues or I had missed a few updates.

How much Time?

So, how much time do analysts actually spend on these activities of finding and understanding data? I haven’t tried measuring this yet. We just don’t have the methodology and the tools to do it. But, depending on whom you ask and which data they need to use, the answer varies widely. Users new to a particular dataset will spend upwards of 80% of their time on these tasks, while experts spend much less. However, experts spend time answering other users’ questions over and over.
