Online survey research – how accurate?

By David Bakken
Last updated: 2009/11/02

The debate over the accuracy – and quality – of survey research conducted online is flaring at the moment, at least partly in response to a paper by Yeager, Krosnick, Chang, Javitz, Levendusky, Simpson and Wang: “Comparing the accuracy of RDD telephone surveys and Internet surveys conducted with probability and non-probability samples.” Gary Langer, director of polling at ABC News, wrote about the paper in his blog “The Numbers” on September 1. In a nutshell, the paper compares survey results obtained via random-digit dialing (RDD) with those from an Internet panel whose panelists were originally recruited by means of RDD, and from a number of “opt-in” Internet panels whose panelists were “sourced” in a variety of ways. The results produced by the probability sampling methods are, according to the authors, more accurate than those obtained from the non-probability Internet samples. You can find a response from Doug Rivers, CEO of YouGov/Polimetrix (and Professor of Political Science at Stanford) at “The Numbers,” as well as some other comments.


The analysis presented in the paper is based on surveys conducted in 2004/5. In recent years the coverage of the RDD sampling frame has deteriorated as the number of cellphone-only users has increased (to 20% currently). In response to concerns of several major advertisers about the quality of online panel data, the Advertising Research Foundation (ARF) established an Online Research Quality Council and just this past year conducted new research comparing online panels with RDD telephone samples. Joel Rubinson, Chief Research Officer of The ARF, has summarized some of the key findings in a blog post. According to Rubinson, this study reveals no clear pattern of greater accuracy for the RDD sample. There are, of course, differences between the two studies, both in purpose and method, but it seems that we can no longer assume that RDD samples represent the best benchmark against which to compare all other samples.

Comparing the “accuracy” of different sampling methods is no easy task. There are multiple sources of “survey error,” including measurement error and non-response in addition to pure sampling error. The benchmark measures may have errors as well. For example, some of the accuracy measures reported by Yeager et al. are based on comparison to rigorously conducted probability sample surveys with high (e.g., 80%) response rates. Non-survey criteria, such as the incidence of passport ownership, also provide measures of accuracy. Still, even non-survey measures may be approximations. Yeager et al. estimated a population incidence by dividing the number of passports in existence by the size of the population, but they point out a discrepancy between the age range covered by the actual passport count and that covered by their survey data. While this probably does not have a material effect on their conclusions, it does illustrate the difficulty of finding or developing accuracy criteria.
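
To make the benchmarking idea concrete, here is a minimal sketch of the kind of comparison described above. All of the numbers are hypothetical placeholders, not figures from Yeager et al. or the passport data:

```python
# Comparing a survey estimate against a non-survey benchmark, in the spirit of
# the passport-ownership criterion discussed above. All numbers are hypothetical.

passports_in_circulation = 85_000_000   # hypothetical count of valid passports
population_size = 300_000_000           # hypothetical adult population

benchmark_incidence = passports_in_circulation / population_size

survey_estimate = 0.31                  # hypothetical weighted survey proportion

# Absolute error against the benchmark; averaging such errors across many
# benchmark items is one common way to summarize a sample source's accuracy.
absolute_error = abs(survey_estimate - benchmark_incidence)

print(f"benchmark incidence: {benchmark_incidence:.3f}")
print(f"survey estimate:     {survey_estimate:.3f}")
print(f"absolute error:      {absolute_error:.3f}")
```

Note that if the benchmark itself covers a slightly different population (a different age range, say), part of the measured “error” belongs to the benchmark, not the survey.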

Another problem in making judgments about method accuracy lies in the relatively small sample of observations. For example, the Yeager et al. study compares a single RDD sample with several non-probability samples from different online panel providers. While they identify and add some additional RDD samples for part of the analysis, we are still looking at only a handful of samples. Similarly, the ARF Foundations of Quality study compares a limited number of samples (and only one sample from each online panel provider). Probability sampling is the gold standard because it comes with a theoretically specified sampling error. In practice, however, we almost never have true “probability” samples. In the case of RDD samples, each telephone number has some known probability of being sampled, but the probability of any individual being included in the final data is unknown, given contact failures, varying household size, refusal to participate when contacted, and so forth. It’s convenient to assume that differences in the probability of reaching a given individual are randomly distributed across the sampling frame, but that’s not always the case. Selection bias may be as problematic for telephone surveys as it is for opt-in online surveys.
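
For reference, the “theoretically specified sampling error” available for a true probability sample is just the textbook standard error of a proportion. A minimal sketch, with hypothetical numbers:

```python
import math

def proportion_ci(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for a proportion under simple
    random sampling. This is the error theory a true probability sample buys
    you; it says nothing about non-response, coverage gaps, or measurement
    error, which is exactly the point made above."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Hypothetical example: 52% of 1,000 respondents answer "yes".
low, high = proportion_ci(0.52, 1000)
print(f"95% CI: [{low:.3f}, {high:.3f}]")  # roughly [0.489, 0.551]
```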

One of the arguments for developing online panels in the first place was based on the belief that if the panel provided coverage of the population of interest (meaning that the sample encompassed the range of variability in the population, if not its distribution), you could use post-stratification or “weighting” to approximate the population distribution. Both Yeager et al. and my reading of the ARF study results posted by Rubinson suggest that post-stratification may not achieve the desired results.
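
Here is a minimal sketch of what post-stratification does, and of why it can fall short. The cells, shares, and per-cell estimates below are hypothetical illustrations:

```python
# Post-stratification: weight each respondent so the sample's demographic mix
# matches known population shares. All numbers below are hypothetical.

population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
sample_share     = {"18-34": 0.15, "35-54": 0.35, "55+": 0.50}  # panel skews older

# Weight for each cell: population share divided by sample share.
weights = {c: population_share[c] / sample_share[c] for c in population_share}

# Hypothetical per-cell survey estimates (e.g., proportion answering "yes").
cell_estimate = {"18-34": 0.60, "35-54": 0.45, "55+": 0.30}

unweighted = sum(sample_share[c] * cell_estimate[c] for c in sample_share)
weighted = sum(sample_share[c] * weights[c] * cell_estimate[c] for c in sample_share)

print(f"cell weights:             {weights}")
print(f"unweighted estimate:      {unweighted:.3f}")
print(f"post-stratified estimate: {weighted:.3f}")
# Weighting fixes the demographic mix, but if panelists differ from the
# population *within* a cell, the bias remains; that is the concern above.
```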

I think it’s safe to say that online research with non-probability samples is here to stay. For one thing, the cost advantage can be considerable, especially when trying to reach a small, specialized target group. For a pharmaceutical company, for example, the ability to conduct surveys among a non-probability panel of individuals with a particular chronic illness at a fraction of the cost of RDD sampling may well outweigh the advantages of probability sampling. That being the case, is there any way to increase our confidence in the results we get from these non-probability samples?

Much of the effort to date in quality improvement for online interviewing has focused on respondent quality: verifying identity and blocking fraudulent respondents from participating in surveys. While this is important, I think that the online sample providers have an opportunity to develop a better understanding of the variability that occurs in online sampling. This would require consistent and ongoing analysis of all samples generated (including the final sample of respondents for any project). It would probably also require some “standard” measures of demographics, and perhaps some key non-demographic variables, for each panel member. Ideally, this will lead to a better understanding of the differences between non-probability opt-in panels and probability samples. New sampling strategies may be effective as well; for one example, check out this white paper on representative sampling in Internet panels by Doug Rivers. And we should remind ourselves that random sampling error is only one way to construct “confidence” intervals. We can look at convergent sources of information and perhaps apply some Bayesian thinking to our judgment processes.
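
As one illustration of that Bayesian framing, the sketch below encodes a convergent source (say, a benchmark or a prior study) as a Beta prior and updates it with survey data. The prior strength and the counts are hypothetical, and the interval computation assumes SciPy is available:

```python
# Beta-Binomial update: combine a prior belief about a proportion with survey
# data and report a credible interval. All numbers are hypothetical.
from scipy.stats import beta

# Prior roughly centered on 0.30 (e.g., from a benchmark source), given the
# weight of about 100 observations.
prior_a, prior_b = 30, 70

# Hypothetical survey result: 380 "yes" out of 1,000 respondents.
yes, n = 380, 1000

# Conjugate update: add successes and failures to the prior parameters.
post_a, post_b = prior_a + yes, prior_b + (n - yes)

posterior_mean = post_a / (post_a + post_b)
low, high = beta.ppf([0.025, 0.975], post_a, post_b)

print(f"posterior mean:        {posterior_mean:.3f}")
print(f"95% credible interval: [{low:.3f}, {high:.3f}]")
```

The posterior blends the survey with the prior in proportion to their effective sample sizes, which is one way to formalize “looking at convergent sources of information.”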

It’s possible that broader technological changes, perhaps a mass migration to an email platform like Google’s Gmail, will lead to a more comprehensive sampling frame for online panels, so that something resembling a probability sample can be constructed using email recruitment rather than RDD sampling.

Copyright 2009 by David G. Bakken. All rights reserved.
