Data Collaboration: Crowdsourcing for Health Care

By Dave Handelsman (SAS)

Crowdsourcing is the ultimate in data collaboration. It is an Internet-age phenomenon in which problems are distributed to “the crowd,” and members of the online community offer partial solutions. These partial solutions, when taken in aggregate, solve the overall problem. This approach continues to gain popularity. Great examples in health care already exist with regard to protein folding (supported by the University of Washington) and cancer cell identification (supported by Cancer Research UK). In both of these examples, the crowd – typically untrained in the relevant specialty (in these instances, either proteomics or cellular biology) – is contributing to scientific advances in health care.

While the crowdsourcing of cures appears to be a new concept, it’s really quite similar to the approach that’s been used for hundreds of years, with arguably the very first instance being the identification of a treatment for scurvy. In that case, as in all modern-day clinical trials, researchers identified new treatments based upon the valuable contributions of patients – through their health care data, their laboratory and tissue samples and their time. These contributors – or crowds – are not the experts on their illness. But without them, it would be impossible to successfully identify new treatments.

In 2013, there is growing recognition that the contributions of patients are far too valuable to be confined within the walls of individual corporations. The FDA is pooling submitted data from multiple companies to better identify trends across classes of drugs. Companies like GSK have announced data transparency projects through which they are enabling qualified external researchers and analytics experts to access historically proprietary clinical trial data. This availability will not only enable broader audiences to explore and identify new trends regarding treatments, but it will also help rebuild the trust and confidence that consumers have in commercial biopharmaceutical corporations.

Project Data Sphere is taking a different approach by actively recruiting biopharmaceutical companies that conduct oncology research to collaborate on data. The expectation is that aggregating the clinical trial data will create a larger pool of crowdsourced data, enabling new discoveries and better insights than the compartmentalized approach historically followed. Once the data is aggregated, as with GSK’s transparency project, access to it and to relevant analysis capabilities will be available to qualified external researchers and analytics experts.

It’s a long way from scurvy to cancer – spanning oceans, science, time and technology – but the goals remain the same. Patients want to be cured and will contribute, literally, their blood, sweat and tears to the effort. Scientists, health care professionals and analytics experts are all working to help identify the cures and treatments.

Crowdsourced data. Crowdsourced analytics. Crowdsourced cures.