
First Look: Dataiku’s Data Science Studio

James Taylor

Dataiku, a company founded in 2013 and based in France, launched its product, Data Science Studio (DSS), in February 2014. DSS is a web-based analytic software platform designed for data scientists and analysts. It aims to improve the effectiveness and productivity of data teams, especially when it comes to turning raw data into an analytic API. The tool is focused on creating and running analytic applications in a production environment. A few hundred people are using the product, which includes both a free version, the Community Edition, and a subscription-based version, the Enterprise Edition. More than 15 companies are using the Enterprise Edition for production problems.

The product is web-based and designed to support the loading, preparation, analysis, and deployment of analytic models in a collaborative environment. When users sign in to their DSS company account, the initial environment displays a set of projects as tiles. In each project, the user's work with a set (or sets) of data is saved as a visual workflow. These workflows run from integration through cleansing and analysis to modeling (though there are some visualization and display options as well).

The data preparation elements of the tool allow a user to access data in a wide variety of formats: uploaded files, Hadoop, relational and NoSQL databases, fabrics such as Cascading and web services, and even Excel or CSV files. The tool helps the user visually discover the kind of data involved by profiling the data and providing a set of automated tools to detect and apply typical data cleansing and enriching activities (geolocating IP addresses, parsing dates in text fields, merging datasets, etc.). The user can apply these transformations interactively, and the tool also recognizes obvious cleansing and enriching activities that the user can choose to apply automatically. Users can also inject their own code for a very specific cleaning or enriching feature. All the integration and cleaning activities can be saved as a recipe that can be included in the workflow and reused. Additional steps like joins, custom formulas, etc. can also be defined and saved to the workflow.
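To make two of the cleansing activities mentioned above more concrete (parsing dates held in text fields and merging datasets), here is a minimal sketch in plain Python. It is not DSS code and does not use any Dataiku API; the function names, the list of candidate date formats, and the sample rows are all assumptions made up for illustration.

```python
from datetime import datetime

# Hypothetical list of formats to try; a real tool would infer these by profiling the data.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def parse_date(text):
    """Return a datetime for the first format that matches, or None if none do."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def merge_on_key(left_rows, right_rows, key):
    """Left-join two lists of dicts on a shared key column."""
    lookup = {row[key]: row for row in right_rows}
    return [{**row, **lookup.get(row[key], {})} for row in left_rows]


# Made-up sample data: dates arrive as text in inconsistent formats.
orders = [
    {"customer_id": 1, "ordered": "2014-02-03"},
    {"customer_id": 2, "ordered": "03/02/2014"},
    {"customer_id": 3, "ordered": "not a date"},
]
customers = [
    {"customer_id": 1, "country": "FR"},
    {"customer_id": 2, "country": "US"},
]

# Step 1: replace the text field with a parsed date (None when unparseable).
cleaned = [{**row, "ordered": parse_date(row["ordered"])} for row in orders]
# Step 2: enrich each order with the matching customer record, if any.
merged = merge_on_key(cleaned, customers, "customer_id")
```

Unparseable values are kept as `None` rather than dropped, so a later step (or a person reviewing the data) can decide what to do with them.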


Once the data is ready, the workbench allows the user to develop a predictive model using a wide variety of machine learning algorithms. Thanks to the product's connectors to analytical machine learning frameworks, the user can fine-tune the algorithms with a visual editor to build optimal models. Multiple models can be developed in parallel and compared to each other to see which one(s) yield the best results. A recipe (workflow) can be saved to create the predictions based on the selected approach. Additional workflows can be generated to periodically re-train the model. In this latter case the tool automatically identifies the date segmentation used in the model to see what new data should be considered.
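The idea of building several candidate models and comparing them can be sketched in a few lines of plain Python. This is only an illustration of the general technique (fit each candidate on training data, score each on held-out data, keep the best), not a description of how DSS does it; the two toy models, the data, and every name below are assumptions.

```python
from statistics import mean


def fit_mean_baseline(xs, ys):
    """Candidate 1: always predict the average of the training targets."""
    avg = mean(ys)
    return lambda x: avg


def fit_linear(xs, ys):
    """Candidate 2: one-variable least-squares regression."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mse(model, xs, ys):
    """Mean squared error of a fitted model on a dataset."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


# Made-up data: train on the first six points, hold out the last two.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
test_x = [7, 8]
test_y = [14.1, 15.9]

candidates = {"mean_baseline": fit_mean_baseline, "linear": fit_linear}

# Fit every candidate on the training data and score it on the held-out data.
scores = {
    name: mse(fit(train_x, train_y), test_x, test_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)
```

Scoring on data the models did not see during fitting is what makes the comparison meaningful; comparing training error alone would tend to favor whichever model is most flexible.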

These workflows can be extended with custom steps that use Python, R, SQL, Hive or Pig scripts. The custom node provides an editor for each, with some code validation based on the language selected. The workflows also support versioning, commenting, and sharing of datasets. The option to share and reuse recipes and workflow fragments is on the roadmap.

There is a cloud version of the product as well as an on-premise version (which can also be downloaded and tried for free).

You can get more information on Dataiku here, and the company will be included in a future release of our Decision Management Systems Platform Technology Report.

Copyright © 2014 http://jtonedm.com James Taylor