NoSQL, NewSQL... NonplussedSQL
I was the analyst on The Briefing Room this week with NuoDB's CEO, Barry Morris. The product itself is extremely interesting, both in concept and in technology, and will formally launch in the next month or so after a long series of betas. More about that a little later...
But first, I need to vent! For some time now, I've been taking an interest in NoSQL because of its positioning in the Big Data space. I've always had a real problem with the term - whether it means Not SQL or Not Only SQL - because defining anything by what it's not is logical nonsense. Even the definitive nosql-database.org falls into the trap, listing 122+ examples from "Possibly the oldest NoSQL DB: Adabas" to the wonderfully-named StupidDB. A simple glance at the list of categories used to classify the entries shows the issue: NoSQL is a catch-all for a potentially endless list of products and tools. The mere fact that they don't use SQL as an access language is insufficient as a definition.
But my irritation now extends to "NewSQL", a term I went a-Googling when I saw that NuoDB is sometimes put in this category. This picture from a presentation by Matthew Aslett of 451 Research was interesting if somewhat disappointing: another gathering of tools with a mixed and overlapping set of characteristics, most of which relate to their storage and underlying processing approaches, rather than anything new about SQL, which is, of course, at heart a programming language. So why invent the term NewSQL when the aim is to keep the same syntax? The term totally misses the real innovation that's going on.
This innovation at a physical storage level has been happening for a number of years now. Columnar storage on disk, from companies such as Vertica and ParAccel, was the first innovative concept to challenge traditional RDBMS approaches in the mid-2000s. Not forgetting Sybase IQ from the mid-1990s, which was, of course, column-oriented, but didn't catch the market as the analytic database vendors did later. With cheaper memory and 64-bit addressing, the move is underway towards using main memory as the physical storage medium and disk as a fallback. SAP HANA champions this approach at the high end, while various BI tools, such as QlikView and MicroStrategy, hold the lower end. And don't forget that the world's most unloved (by IT, at least) BI tool, Excel, has always been in-memory!
The other aspect of innovation relates to parallel processing. Massively parallel processing (MPP) relational databases have been around for many years in the scientific arena and in commercial data warehousing from Teradata (1980s) and IBM DB2 Parallel Edition (1990s). These powerful, if proprietary, platforms are usually forgotten (or ignored) when NoSQL vendors lament the inability of traditional RDBMSs to scale out to multiple processors, blithely citing comparisons of their products to MySQL, probably more popular for its price than its technical prowess. Relational databases do indeed run across multiple processors, and must evolve to do so more easily and efficiently, as increases in processing power are now coming mainly from increasing the number of cores in processors. Which finally brings me back to NuoDB.
NuoDB takes a highly innovative, object-oriented, transaction/messaging-system approach to the underlying database processing, eliminating the concept of a single control process responsible for all aspects of database integrity and organization. Invented by Jim Starkey, an éminence grise of the database industry, the approach is described as elastically scalable - cashing in on the cloud and big data. It also touts emergent behavior, a concept central to the theory of complex systems. Together with an in-memory model for data storage, NuoDB appears very well positioned to take advantage of the two key technological advances of recent years mentioned already: extensive memory and multi-core processors. And all of this behind a traditional SQL interface to maximize use of existing, widespread skills in the database industry. What more could you ask?
However, it seems there's an added twist. Apparently, SQL is just one personality the database presents, and it is the focus of the initial release. Morris also claims that NuoDB is able to behave as a document, object or graph database, personalities slated for later releases in 2013 and beyond. Whether this emerges remains to be seen. Interestingly, however, when saving to disk, NuoDB stores data in key-value format.
I'll be discussing big data, NoSQL and NewSQL in speaking engagements in Europe in November: the IRM DW&BI Conference in London (5-7 Nov) and Big Data Deutschland in Frankfurt (20-21 Nov). I look forward to meeting you there!
Dr. Barry Devlin is a founder of the data warehousing industry and among the foremost authorities worldwide on business intelligence (BI) and beyond. He is a widely respected consultant, lecturer and author of “Data Warehouse—from Architecture to Implementation”. Barry has 30 years of experience in the IT industry, previously with IBM, as an architect, consultant, manager and software ...