SmartData Collective

First Look: Datameer

JamesTaylor

As part of an ongoing expansion of our ecosystem mapping to include more Hadoop-based products, I recently got an update from Datameer. Datameer was founded in 2009 by Stefan Groschupf, one of the original contributors to Nutch, the open source project that spun off Hadoop. Prior to starting Datameer, he and the founding team architected and implemented custom distributed big data analytic systems. After several years of implementing the same kind of custom solutions over and over, they started Datameer to productize their experience. Datameer is a Hadoop-centric product, purpose-built from the beginning as a Hadoop solution. The company is VC funded and has a team of about 80, headquartered in San Mateo with a core engineering team in Germany.

Datameer’s product is an analytic application – a BI and analytics layer – that runs on Hadoop. Datameer aims to abstract the complexity of Hadoop through three key elements:

  • Wizard-based data integration for business users that ingests data into Hadoop from 55+ different sources (databases, some cloud sources, email, etc.) or through an open API.
  • Point and click analytics based on an interactive spreadsheet.
    This has 240+ pre-built analytic functions (from nearest neighbor and text analytics to simple joins) and provides instant previews based on a Smart Sample built when the data was integrated. Once your analytic pipeline is defined, functions can be executed against the full data set at once, on a schedule, or on demand.
  • Drag and drop visualization based on HTML 5.
    The graphics start with a blank canvas for flexibility and allow annotation (including video and other more advanced annotations). All visualizations are updated automatically with new data. Part of the driver for this was a realization that business users need to present results, and that taking a screenshot of dashboards to annotate elsewhere meant the data displayed was instantly out of date.
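
The preview-then-execute pattern described above can be sketched roughly as follows. This is only an illustration of the idea, not Datameer's implementation: the function names (`build_smart_sample`, `pipeline`), the sample size, and the toy data are all assumptions made for the example.

```python
import random


def build_smart_sample(rows, size, seed=0):
    """Draw a fixed-size random sample once, at ingest time (hypothetical helper)."""
    if len(rows) <= size:
        return list(rows)
    return random.Random(seed).sample(rows, size)


def pipeline(rows):
    """Hypothetical analytic pipeline: keep large orders, total them by region."""
    totals = {}
    for row in rows:
        if row["amount"] >= 100:
            totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


# Toy stand-in for data that has already been ingested.
regions = ["east", "west", "north"]
full_data = [{"region": regions[i % 3], "amount": (i * 37) % 500} for i in range(10_000)]

# The sample is built once, so every preview is cheap.
smart_sample = build_smart_sample(full_data, size=200)

preview = pipeline(smart_sample)  # fast, approximate feedback while designing
result = pipeline(full_data)      # the same pipeline, run against everything
```

The point of the design is that the pipeline definition is identical in both calls; only the input changes, so what the user sees in the preview is the same logic that later runs on the full data set.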

There is also an app market for packaged analytic solutions, which can be built by any user of the product using a one-click “create app” option and then shared or sold. Datameer aims to play nice with all the various Hadoop distributions and has bi-directional connections to a wide range of databases, making it easy to fit into a company’s current environment.

Customers tend to be either larger companies with lots of existing databases and BI tools, where Datameer acts as a data hub that brings data sources together in Hadoop, or emerging web 2.0 businesses that have all their data in Hadoop and don’t have a lot of historical BI infrastructure. Datameer claims plenty of large customers in both categories: banks, telcos, online gaming, etc.

They recently announced 3.0, which builds on these core functions in a number of ways. In particular they have introduced what they call “Smart Analytics” – or self-service machine learning for Hadoop. This delivers a set of four key algorithms:

  • Clustering
    The k-means algorithm is used to group data into clusters that are alike.
  • Column dependencies
    This shows the degree of correlation between columns in a sheet.
  • Decision trees
    Shows the different combinations of data attributes that result in a desired outcome.
  • Recommendations
    This develops a rating or preference for an item not yet associated with a record, based on which items are already associated with records.
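
To make the first two items concrete, here is a minimal sketch of k-means clustering and a column correlation on a tiny in-memory "sheet". It is not Datameer's code: the farthest-first initialization is a choice made here so the example is deterministic, and Pearson correlation is just one common measure of column dependency (the article does not say which measure Datameer uses). All names and data are illustrative.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then keep adding the
    point that is farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iterations=50):
    """Plain k-means; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# A toy two-column sheet with two obvious groups of rows.
sheet = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]

centroids, labels = kmeans(sheet, k=2)

# A new sheet: the source columns plus the cluster assigned to each row.
clustered_sheet = [row + (label,) for row, label in zip(sheet, labels)]

# Dependency between the two source columns.
dependency = pearson([r[0] for r in sheet], [r[1] for r in sheet])
```

Here the first three rows end up in one cluster and the last three in the other, and the two columns come out strongly positively correlated.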

A simple click takes a set of data and runs these machine learning algorithms against it. A graphical preview, using the Smart Sample, is available. Running the algorithms creates new sheets containing the source data as well as the results of the machine learning algorithms. These can be stored back, used for additional analysis, etc. The algorithms are based on publicly available ones, and the implementation is designed both to execute against Hadoop and to allow very non-technical users to use machine learning models without a data scientist.

In addition, Datameer allows you to take a PMML model and execute it against data stored in Hadoop. A PMML file generated from any data mining environment can be loaded and turned into a custom Datameer function. This function can then be used like any other in Datameer, so the scores calculated from the model can be stored back into Hadoop and visualized or analyzed, or pushed out to another environment (using Datameer as a data integration platform). With increased support among data mining tools for big data, this is very timely, allowing you to extract data to your modeling environment (from Datameer perhaps), build the model(s) you want, and then push them back into Datameer for ongoing use in your analysis.
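
As a rough illustration of what "turning a PMML file into a function" means, the sketch below parses a small hand-written PMML regression model with Python's standard library and returns a scoring function. It handles only the simplest case (one `RegressionTable` with `NumericPredictor` terms) and says nothing about how Datameer itself loads models; the model, field names, and the helper `load_linear_pmml` are all made up for the example.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written PMML 4.1 document (illustrative model, not a real one).
PMML_TEXT = """
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income_k" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income_k"/>
      <MiningField name="score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income_k" coefficient="0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"p": "http://www.dmg.org/PMML-4_1"}


def load_linear_pmml(text):
    """Turn a simple PMML RegressionModel into a Python scoring function.
    Only the intercept and NumericPredictor terms are supported here."""
    root = ET.fromstring(text)
    table = root.find(".//p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    terms = [
        (el.get("name"), float(el.get("coefficient")), float(el.get("exponent", "1")))
        for el in table.findall("p:NumericPredictor", NS)
    ]

    def score(record):
        # intercept + sum of coefficient * field ** exponent
        return intercept + sum(coef * record[name] ** exp for name, coef, exp in terms)

    return score


score = load_linear_pmml(PMML_TEXT)

# Apply the function to each record and keep the score alongside the source fields.
records = [{"age": 40.0, "income_k": 50.0}, {"age": 20.0, "income_k": 0.0}]
scored = [dict(r, score=score(r)) for r in records]
```

Once the model is an ordinary function, scoring becomes just another column computation, which is why the results can be stored, visualized, or exported like any other derived data.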

You can get more information on Datameer and its support for PMML here.
