Big Data Success in Government

Alex Olesker
Last updated: 2012/01/20 at 3:00 PM

On January 19, Carahsoft hosted a webinar on Big Data success in government with Bob Gourley and Omer Trajman of Cloudera. Bob began by explaining the current state of Big Data in the government, where there are four areas of significant activity. Federal integrators are making large investments in research and development of solutions; large firms like Lockheed Martin as well as boutique organizations have made major contributions. The Department of Defense and the Intelligence Community have been major adopters of Big Data solutions to handle intelligence and information overload. Typically, they use Big Data technology to help analysts “connect the dots” and “find a needle in a haystack.” The national labs under the Department of Energy have been developing and implementing Big Data solutions for research as well, primarily in the field of bioinformatics, the application of computer science to biology. This ranges from organizing millions of short reads to sequence a genome to better tracking of patients and treatments. The last elements in government use of Big Data are the Office of Management and Budget and the General Services Administration, which primarily ensure the sharing of lessons and solutions.

Gourley also recapped the Government Big Data Solutions Award presented at Hadoop World last year, highlighting the best uses of Big Data in the Federal Government. The winner was the GSA for USASearch, which uses Hadoop to host search services over more than 500 government sites effectively and economically. The other top nominees were GCE Federal, which provides cloud-based financial management solutions for federal agencies using Apache Hadoop and HBase; Pacific Northwest National Laboratory, for the work of leading researcher Dr. Ronald Taylor in applying Hadoop, MapReduce, and HBase to bioinformatics; Wayne Wheeles’ Sherpa Surfing, which uses Cloudera’s Distribution including Apache Hadoop in a cybersecurity solution for DoD networks; and the Bureau of Consular Affairs for the Consular Consolidated Database, which searches and analyzes travel documents from around the world for fraud and security threats.

Omer Trajman then gave some background on Apache Hadoop, the technology that powers many of these solutions. Narrowly defined, Hadoop consists of the Hadoop Distributed File System (HDFS), which provides distributed storage on clusters of commodity hardware, and MapReduce, the processing layer that coordinates analysis across those clusters. But when people say Hadoop they typically mean the entire ecosystem of solutions. Hadoop is scalable, fault tolerant, and open source, and can process all types of data. Trajman explained some of the members of the Hadoop ecosystem, such as Sqoop, developed by Cloudera and contributed to Apache, which moves bulk data between SQL databases and Hadoop; Flume, which streams massive amounts of data into Hadoop as it is generated; HBase, a Hadoop database; Pig, a data-flow-oriented language for transforming data; and Hive, which delivers SQL-based data warehousing in Hadoop. All of these solutions are integrated in Cloudera’s Distribution including Apache Hadoop, available for free download. Trajman also covered Cloudera’s enterprise software for helping organizations manage their Hadoop deployments.
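To make the division of labor between HDFS and MapReduce concrete, here is a minimal word-count sketch against the standard org.apache.hadoop Java API (Hadoop’s canonical introductory example, not something shown in the webinar): the map step emits a count of one for every word on every line of the input stored in HDFS, and the reduce step sums those counts per word.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    // Map step: runs in parallel on each split of the input files stored
    // in HDFS, emitting a (word, 1) pair for every word on every line.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce step: the framework groups all counts emitted for a given
    // word and hands them to one reduce call, which sums them.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : counts) {
                sum += count.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }
}
```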

Listeners asked how they could learn about Hadoop on their own and were pointed to the Cloudera website, which has many free resources, documentation, and tutorials, as well as regular courses on Hadoop around the country. Another attendee asked how big data has to be to warrant Hadoop and how big is too big, to which Trajman replied that if a job is too large for a single machine to handle effectively, Hadoop is a good option. So far, no job has been found to be too large, since you can add as many machines as you need to a cluster, and Hadoop now supports federation, or clusters of clusters, for truly massive jobs. When asked who uses Hadoop, he explained that we all do, through services like Twitter, Facebook, LinkedIn, and Yahoo. Bob explained when it makes sense to migrate data to Hadoop and which types of problems are best taken on by it: if you have too much data to analyze on your current infrastructure, you should consider moving it to Hadoop, and, while not every problem is well suited for distributed computing, “if a problem is partitionable, it’s Hadoopable.” The driver sketch below illustrates the point.
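The “partitionable is Hadoopable” rule of thumb shows up clearly in the driver for the word-count sketch above: nothing in the job configuration mentions how many machines exist. Hadoop partitions the HDFS input into splits, schedules a map task per split on whatever nodes are available, and shuffles the intermediate pairs to the reducers, so the same code runs on one machine or a thousand. This is a hedged illustration using the standard Job API; the input and output paths are placeholders passed on the command line.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Driver for the WordCount sketch above. Cluster size never appears:
// the framework partitions the input and farms the work out to however
// many machines the cluster happens to have.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(WordCount.TokenizerMapper.class);
        job.setCombinerClass(WordCount.IntSumReducer.class); // local pre-aggregation on each node
        job.setReducerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Placeholder HDFS paths: args[0] is the input directory, args[1] the output.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```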

TAGGED: government