SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

The problem with a full box of big data tools

TonyBain

“NoSQL”, for lack of a better name, is a generic term that describes any data management system that does not use SQL as a query interface. Generally this means any data management system that is non-relational, but the term has also been stretched to the very boundaries of what constitutes a data management system at all.

Early on (a couple of years back in NoSQL time), when the term was coined, I think the positioning was much more aggressive. More recently this has been softened, so now NoSQL is commonly quoted as meaning “Not only SQL” or “next generation databases” (whatever that means). The common message you get now is something along the lines of: NoSQL systems are more “specialized”, each being designed to solve a smaller number of problems than the generic RDBMS sets out to. NoSQL is another tool in your toolbox. A better option in certain cases where the RDBMS doesn’t fit well. A different hammer for a different type of nail. It all makes sense in theory, but in reality this brings its own set of troubles.

There are now dozens of NoSQL systems available for a developer to choose from: MongoDB, Cassandra, Voldemort, HBase, CouchDB, Riak, Neo4j, HamsterDB and so on. And there are several different orientations of NoSQL system, including document, key/value and graph. It seems the same energy we saw open-source hackers putting into MySQL 10 years ago has now been transferred into a myriad of NoSQL systems. Again the argument is: more choice, better for everyone.

The problem, and I am putting it out there as a problem so we can think of ways to fix it, is that while that is fine in theory, in practice having many choices also creates difficulties. Real-world development projects have certain skill bases they draw on, with experience and an ability to “make things work” based on years of hard slog cobbling things together. And there are very few surprises left when deploying an application on a mainstream RDBMS (of course it will, like any software, still have issues from time to time).

One of the key reasons the RDBMS has been so dominant is the fact that you could use it for pretty much any requirement. And using it for any requirement meant that your developers had lots of experience building applications on it and your DBAs had lots of experience running it. But you also knew that you could almost always make any requirement work “good enough” by buying extra hardware and/or indexing the heck out of it. Regardless of whether it was technically the best fit or not, when all things were considered the RDBMS was a stable constant given short project timeframes and limited development budgets. It was exactly its generic nature, its ability to do most things well enough, that has led the RDBMS to become the default option for any new development project (with the various flavors, such as MySQL, Oracle, DB2 and SQL Server, being less relevant).

As humans, we all have limited brain capacity and most of us can only be experts in a small number of things. And our expertise typically comes from our history: making mistakes and learning what works and what doesn’t through the hard yards of experience. So given a buffet of specialized NoSQL systems, how on earth do we choose the most appropriate tool for the job, while at the same time dealing with the lack of expertise we will invariably have? Also, what will be the impact on development projects of choosing the wrong tool for the job? The RDBMS is very forgiving of poor design, poor implementation and the subsequent addition of unforeseen application requirements (you want to run OLAP now that we have built you a busy OLTP database? Sure, but do it overnight). Will a specialist NoSQL system have the same tolerance for our incompetence?

So now I return to the point that is really the keystone of the NoSQL motivation: “there are requirements which an RDBMS doesn’t work at all well for”. I agree with this, but I have yet to see any quantification of what this actually means. Is it 5% or 10% of current development projects? And should the question really be “what percentage of development projects is the RDBMS unusable for”? Technical purity, and even reduced license costs, need to be balanced against one of the largest costs: re-skilling development and production teams to understand this new data platform.

There are some clear cases, the Googles, Twitters, Facebooks etc., where scale alone is clearly outside the boundaries of what is possible on today’s RDBMS platforms. But what percentage of today’s development projects have scalability requirements like these? 1%? Less? Sure, we are going through somewhat of a data explosion, and by all accounts the volume of data we collect and manage in our databases is growing at an alarming rate. So the demand for scale will continue, but let’s also not forget that the big RDBMS vendors are very market driven, and as the market changes their products will continue to change with it. It is very unlikely they will be asleep at the wheel and lose their dominant share of the ~$30b market without a fight.

Contrary to how it may appear, I am actually supportive of a number of NoSQL initiatives and I am even hands-on with a few. But I do have concerns about how we quantify the market and how we ensure that people are making the right decisions in choosing a NoSQL platform. I am also concerned about how we bridge the gap in skill sets and experience for developers who will have years upon years of RDBMS experience but, by nature, will only have exposure to NoSQL systems periodically, based on certain application requirements.
