- Focus on current solutions: The ability to make a difference in government missions in the very near term was the most important evaluation factor.
- Focus on government teams: Industry supporting government was also considered, but this award is about government missions.
- Consideration of new approaches: New business processes, techniques, tools, and models for enhancing analysis are key.
- Veterans Health Administration: New Big Data approaches and frameworks provide data and tools for 20,000 clinicians to track medical trends and better anticipate outcomes. The data set spans over 80 billion data files. Focused on service to 25 million veterans. Judges selected Veterans Health Administration because of the impact and best practices in Big Data solutions.
- NASA: Multiple and extensive activities. One of many exemplars was the NASA Center for Climate Simulation (NCCS). Their work includes scalable Hadoop clusters for large-scale climate simulations.
- Bureau of Engraving and Printing: This government agency is the largest producer of security documents in the country. They have fielded an Oracle-based Big Data solution that enhanced quality and mission support and reduced waste. Judges characterized this as a good match of the right business processes to a modern technical approach.
- AMSAA: Army Materiel Systems Analysis Activity. Its vehicle data analysis program instruments vehicles in theater to collect historical data on operational and environmental parameters. Massive data pattern screening and analysis toolsets have been put in place. Result: rapid identification of issues before mission impact.
- National Cancer Institute: Extensive research and working prototypes of cutting-edge systems based on Cloudera Distribution of Apache Hadoop (CDH) and the Oracle Big Data Appliance. Judges noted the significant potential impact of this solution as well as the strength of the technical approach.
The winner of the 2012 Government Big Data Solutions Award is the National Cancer Institute.
The National Cancer Institute (NCI) has spent years focusing thought and technical research on issues of concern to all of humanity. They are the part of the National Institutes of Health (NIH) responsible for coordinating the US National Cancer Program. They conduct and support research, training, health information dissemination and other activities related to the causes, prevention, diagnosis and treatment of cancer; the supportive care of cancer patients and their families; and cancer survivorship.
Among the many activities of the NCI is the cancer Biomedical Information Grid (caBIG), a framework of open source, open access information enabling secure exchange of information for cancer research as well as many research tools. An example of the broad collaboration of work supported by NCI is The Cancer Genome Atlas (TCGA). caBIG forms the information infrastructure of TCGA. TCGA is an integrated database of molecular and clinical data. Goals of TCGA include accelerating our understanding of the genetics of cancer using innovative genome analysis technologies. One of these innovative technologies was judged to be the most stellar example of Big Data solutions in the federal government in 2012.
NCI’s Frederick National Laboratory has been using Big Data solutions in pioneering ways to support researchers working on complex challenges around the relationship between genes and cancers. In a recent example, they built infrastructure capable of cross-referencing the relationships between 17,000 genes and five major cancer subtypes across 20 million biomedical publication abstracts. They did this by cross-referencing TCGA gene expression data from a simulated 60 million patients and miRNA expression data from a simulated 900 million patients. The result: understanding additional layers of the pathways these genes operate in and the drugs that target them. This will help researchers accelerate their work in areas of importance for all humanity. This solution, based on the Oracle Big Data Appliance with the Cloudera Distribution of Apache Hadoop (CDH), leverages capabilities available from the Big Data community today in pioneering ways that can serve a broad range of researchers. The promising approach of this solution is repeatable across many other Big Data challenges in bioinformatics, making it worthy of selection for the 2012 Government Big Data Solutions Award.
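To make the idea of cross-referencing genes and cancer subtypes across publication abstracts more concrete, here is a minimal sketch of the kind of co-occurrence count such a system might perform. It is not NCI's implementation, which is not described here. It is a toy, in-memory Python version of the map and reduce steps that a Hadoop job would distribute across a cluster. The gene symbols, subtype terms, and function names are all illustrative assumptions.

```python
from collections import Counter
from itertools import product
import re

# Hypothetical toy vocabularies; the real gene and subtype lists are not given in the text.
GENES = {"TP53", "BRCA1", "EGFR"}
SUBTYPES = {"breast cancer", "lung cancer"}


def map_abstract(text):
    """Map step: emit one (gene, subtype) pair per combination mentioned in an abstract."""
    tokens = set(re.findall(r"[A-Za-z0-9]+", text))  # gene symbols matched as whole tokens
    lowered = text.lower()                           # subtype phrases matched case-insensitively
    genes = sorted(g for g in GENES if g in tokens)
    subtypes = sorted(s for s in SUBTYPES if s in lowered)
    return list(product(genes, subtypes))


def reduce_pairs(pair_lists):
    """Reduce step: sum how many abstracts mention each (gene, subtype) pair."""
    counts = Counter()
    for pairs in pair_lists:
        counts.update(pairs)
    return counts


def cross_reference(abstracts):
    """Run the map step over every abstract, then reduce the emitted pairs."""
    return reduce_pairs(map_abstract(a) for a in abstracts)
```

In a cluster setting the map step would run in parallel over shards of the abstract corpus and the reduce step would aggregate counts per pair; the resulting counts could then be joined against expression data, a step this sketch leaves out.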
On behalf of our judges and all those who may one day benefit from research into the prevention and treatment of cancer, we congratulate the National Cancer Institute for this pioneering work, and add our appreciation to the teams of industry technologists developing well-engineered solutions that make this work possible (with a special thanks to Oracle and Cloudera).