By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    customer experience analytics
    Using Data Analysis to Improve and Verify the Customer Experience and Bad Reviews
    6 Min Read
    data analytics and CRO
    Data Analytics is Crucial for Website CRO
    9 Min Read
    analytics in digital marketing
    The Importance of Analytics in Digital Marketing
    8 Min Read
    benefits of investing in employee data
    6 Ways to Use Data to Improve Employee Productivity
    8 Min Read
    Jira and zendesk usage
    Jira Service Management vs Zendesk: What Are the Differences?
    6 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-23 SmartData Collective. All Rights Reserved.
Reading: MongoDB, BI and non-Relational Databases
Share
Notification Show More
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Software > SQL > MongoDB, BI and non-Relational Databases
Business IntelligenceSQL

MongoDB, BI and non-Relational Databases

Barry Devlin
Last updated: 2011/06/17 at 12:50 PM
Barry Devlin
6 Min Read
SHARE

A chat with Max Schireson, President of 10gen makers of MongoDB (from humongous database), yesterday provided some food for thought on the topic of our assumptions about the best database choices for different applications.  Such thinking is particularly relevant for BI at the moment, as the database choices expand rapidly.

A chat with Max Schireson, President of 10gen makers of MongoDB (from humongous database), yesterday provided some food for thought on the topic of our assumptions about the best database choices for different applications.  Such thinking is particularly relevant for BI at the moment, as the database choices expand rapidly.

But first, for traditionalist BI readers, a brief introduction to MongoDB, which is one of the growing horde of so-called NoSQL “databases”, some of which have very few of the characteristics of databases.  NoSQL stores come in a half a dozen generic classes, and MongoDB is in the class of “document stores” along with tools such as Apache’s CouchDB and Terrastore.  Documents?  If you’re confused, you are not alone.  In this context, we don’t mean textual documents used by humans, but rather use the word in a programming sense as a collection of data items stored together for convenience and ease of processing, generally without a predefined schema.  From a database point of view, such a document is rather like the pre-First Normal Form set of data fields that users present to you as what they need in their application.  Think, for example, of an order consisting of an order header and multiple line items.  In the relational paradigm, you’ll make two (or more) tables and join them via foreign keys. In a document paradigm, you’ll keep all the data related to that order in one document.

Two characteristics–the lack of a predefined schema and the absence of joins–are very attractive in certain situations, and these turn out to be key design points for MongoDB. The lack of a schema makes it very easy to add new data fields to an existing database without having to reload the old data; so if you are in an emerging industry or application space, especially where data volumes are large, this is very attractive.  The absence of joins also plays well for large data volumes; if you have to shard your data over multiple servers, joins can be very expensive.  So, MongoDB, like most NoSQL tools, play strongly in the Web space with companies needing fact processing of large volumes of data with emergent processing needs.  Sample customers include Shutterfly, foursquare, Intuit, IGN Entertainment, Craigslist and Disney.  In many cases, the use of the database would be classed as operational by most BI experts.  However, there are some customers using it for real-time analytics, and that leads us to the question of using non-relational databases for BI.

When considering implementing Operational BI solutions, many implementers first think of copying the operational data to an operational data store (ODS), data warehouse or data mart and analysing it there.  They are immediately faced with the problem of how to update the informational environment fast enough to satisfy the timeliness requirement of the users.  As that approaches real-time, traditional ETL tools begin to struggle.  Furthermore, in the case of the data warehouse, the question arises of the level of consistency among these real-time updates and between the updates and the existing content.  The way MongoDB is used points immediately to an alternative, viable approach–go directly against the operational data.

As always, there are pros and cons.  Avoiding storing and maintaining a second copy of large volumes of data is always a good thing.  And if the analysis doesn’t require joining with data from another source, using the original source data can be advantageous.  There are always questions about performance impacts on the operational source, and sometimes security implications as well.  However, the main question is around the types of query possible against a NoSQL store in general or a document-oriented database in this case.  It is generally accepted that normalizing data in a relational database leads to a more query-neutral structure, allowing a wider variety of queries to be handled.  On the other hand, as we saw with the emergence of dimensional schemas and now columnar databases, query performance against normalized databases often leaves much to be desired.  In the case of Operational BI, however, most experience indicates that the queries are usually relatively simple, and closely related to the primary access paths used operationally for the data concerned.  The experience with MongoDB bears this out, at least in the initial analyses users have required.

I’d be interested to hear details of readers’ experience with analytics use of this and other non-Relational approaches.

Barry Devlin June 17, 2011 June 17, 2011
Share This Article
Facebook Twitter Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

ai can help with nurse burnout
Breakthroughs in AI Are Helping to Prevent Nurse Burnout
Artificial Intelligence Exclusive
AI in marketing
AI Can’t Replace Creativity When Crafting Digital Content
Artificial Intelligence
ai in furniture design
Top 5 AI-Driven Furniture Engineering Design Applications
Artificial Intelligence
data protection regulation
Benefits of Data Management Regulations for Consumers & Businesses
Data Management

Stay Connected

1.2k Followers Like
33.7k Followers Follow
222 Followers Pin

You Might also Like

ai can help with nurse burnout
Artificial IntelligenceExclusive

Breakthroughs in AI Are Helping to Prevent Nurse Burnout

9 Min Read
AI in marketing
Artificial Intelligence

AI Can’t Replace Creativity When Crafting Digital Content

6 Min Read
ai in furniture design
Artificial Intelligence

Top 5 AI-Driven Furniture Engineering Design Applications

11 Min Read
ai software failure
Artificial Intelligence

Why Do AI Startups Have High Failure Rates?

13 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

ai chatbot
The Art of Conversation: Enhancing Chatbots with Advanced AI Prompts
Chatbots
AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Lost your password?