SmartData Collective

The Technology of VoltDB

BobGourley

VoltDB is a company fielding a technology designed by DBMS pioneer Mike Stonebraker. It is designed to address the performance limitations of existing systems, and it also offers significant potential cost savings, giving it the virtuous position of delivering more functionality at a lower cost.

In conversations with Stonebraker I learned a bit more about the VoltDB approach and would like to share a bit of context here.

First, for background, consider that most transaction-focused databases were designed decades ago. The need for new approaches has led to the movement many call “Big Data” and also gave rise to a group of efforts called the “NoSQL” approach. Mike underscored that the benefits of NoSQL for some problems are clear, but for transactions SQL is still key. What is needed, he says, is a NewSQL: something designed fresh to take advantage of new compute capabilities. The result is VoltDB, a re-engineered approach to SQL.

VoltDB describes itself as “The NewSQL database for high velocity applications.” It is an in-memory database, which means it primarily relies on main memory for data storage. Main memory is the memory directly accessible by the CPU; it is not secondary storage like hard disks or offline storage like tapes. Being based in main memory provides many speed benefits, the most basic being that main memory access is much faster than disk access. Main memory databases are also faster since internal optimization algorithms are simpler.

VoltDB’s design provides the benefits of ACID (atomicity, consistency, isolation, durability) at very high rates of transactions and changes. The design also provides the benefits of a “shared nothing” architecture, which lets each database node operate in an independent and self-sufficient way. There need be no single point of contention across the system. Shared nothing architectures are known for their scalability. They scale by simply adding nodes.
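
To make the “shared nothing” idea concrete, here is a minimal Python sketch of hash partitioning. It is an illustration only and not VoltDB’s implementation: the `Partition` and `Cluster` classes, the CRC32 hash, and the modulo routing are all assumptions of mine. The point it shows is that every key has exactly one owning partition, so partitions never need to coordinate over shared data.

```python
import zlib


class Partition:
    """One self-contained node: it owns its rows and shares nothing with peers."""

    def __init__(self, partition_id):
        self.partition_id = partition_id
        self.rows = {}


class Cluster:
    """Hypothetical router that sends each key to exactly one partition."""

    def __init__(self, partition_count):
        self.partitions = [Partition(i) for i in range(partition_count)]

    def partition_for(self, key):
        # A deterministic hash means every client routes a key the same way.
        index = zlib.crc32(str(key).encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def put(self, key, value):
        self.partition_for(key).rows[key] = value

    def get(self, key):
        return self.partition_for(key).rows.get(key)


cluster = Cluster(partition_count=4)
for account_id in range(1000):
    cluster.put(account_id, {"balance": 100})
```

One caveat on this sketch: with plain modulo routing, adding a node changes where most keys live, so a real system needs some way to redistribute data when it scales out. That step is left out here.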

VoltDB is a relational database that provides SQL access from within pre-compiled Java stored procedures, with SQL statements interspersed in the Java code. Since each stored procedure is the unit of transaction, time and compute power are saved on its execution: data does not have to make a round trip between the application and the database for each SQL statement. Stored procedures can be executed serially and to completion in a single thread without locking or latching. Since data is in memory and local to the partition, a stored procedure can execute in microseconds.
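
The serial execution model can be sketched in a few lines of Python. This is a toy under stated assumptions, not VoltDB’s engine or its Java API: `PartitionExecutor`, `deposit`, and `transfer` are hypothetical names, and there is no SQL and no rollback. What it does show is why no locks are needed: one worker thread owns the partition’s data and runs each procedure to completion before starting the next, so many concurrent callers can never interleave inside a transaction.

```python
import queue
import threading


class PartitionExecutor:
    """Runs whole procedures one at a time on the single thread that owns the data."""

    def __init__(self):
        self.data = {}  # touched only by the worker thread, so no locks
        self._inbox = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            procedure, args, done = self._inbox.get()
            if procedure is None:  # shutdown signal
                break
            try:
                done["result"] = procedure(self.data, *args)
            except Exception as exc:
                done["error"] = exc
            finally:
                done["event"].set()

    def call(self, procedure, *args):
        """Queue a procedure and block until it has run to completion."""
        done = {"event": threading.Event()}
        self._inbox.put((procedure, args, done))
        done["event"].wait()
        if "error" in done:
            raise done["error"]
        return done["result"]

    def close(self):
        self._inbox.put((None, (), None))
        self._worker.join()


def deposit(data, account, amount):
    data[account] = data.get(account, 0) + amount
    return data[account]


def transfer(data, source, target, amount):
    # Validate before mutating: this sketch has no rollback.
    if data.get(source, 0) < amount:
        raise ValueError("insufficient funds")
    data[source] -= amount
    data[target] = data.get(target, 0) + amount
    return data[source], data[target]


executor = PartitionExecutor()
executor.call(deposit, "alice", 100)

# Fifty concurrent callers, yet each transfer runs alone on the worker thread.
callers = [
    threading.Thread(target=executor.call, args=(transfer, "alice", "bob", 1))
    for _ in range(50)
]
for caller in callers:
    caller.start()
for caller in callers:
    caller.join()

balances = executor.call(lambda data: dict(data))
executor.close()
```

The trade-off in this design is that a slow procedure holds up everything queued behind it on that partition, which is presumably why the approach pairs naturally with short, in-memory transactions.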

The results of this new design are much faster execution and more of the computing power being focused on results rather than administrative overhead. In benchmark testing of old systems, up to 90% of the computing effort is spent on things like managing the buffer pool, concurrency control, record locks, crash recovery, and managing multiple access threads. This overhead is eliminated with the newly engineered approach of NewSQL.
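
To put that 90% figure in perspective, here is a back-of-the-envelope bound. It is my own arithmetic, not a VoltDB benchmark. Assume a fraction f of the work on an old system is overhead, and assume, optimistically, that the new design removes all of it while the useful work costs the same as before. Then the best-case speedup S is:

```latex
S = \frac{1}{1 - f}, \qquad f = 0.9 \;\Rightarrow\; S = \frac{1}{1 - 0.9} = 10
```

So under those assumptions, overhead of “up to 90%” translates to at most about a tenfold gain from removing overhead alone. Any gain beyond that would have to come from elsewhere, such as faster access to data held in main memory.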

Here is a bit more from the VoltDB website:

VoltDB is a blazingly fast relational database system. It is specifically designed for modern software applications that are pushed beyond their limits by high velocity data sources. This new generation of systems – real-time feeds, machine-generated data, micro-transactions, high performance content serving – requires database throughput that can reach millions of operations per second. What’s more, the applications that use this data must be able to scale on demand, provide flawless fault tolerance and give real-time visibility into the data that drives business value.

The volume and velocity of data are exploding, fueled by social applications, sensor automation, mobile networking, and other data-intensive forces. Moore’s Law signals massive data tier scale-outs as networks and servers become faster and less expensive. Enabling that scale-out is a new generation of relational DBMSs, led by VoltDB, designed to exploit networked and virtualized computing environments. VoltDB provides the throughput, scale and accuracy needed to handle high velocity applications.

I asked Mike about use cases for this new approach. In his view, any organization that has racks of old-style SQL databases should consider the cost savings and performance benefits of this approach, and I think he is right. When you get the benefits of needing less hardware, spending less on power, and spending less on cooling data centers, while getting higher performance on transaction databases, that is very virtuous.

I also asked Mike about how they fit into an architecture where non-SQL-type analytics must be done rapidly over big data. He immediately pointed out how they fit with Hadoop and the Cloudera approach. Organizations can selectively stream high velocity data from a VoltDB cluster into the Hadoop Distributed File System (HDFS) by leveraging Cloudera’s Distribution Including Apache Hadoop (CDH), which has SQL-to-Hadoop integration technology (Apache Sqoop) built in.

The following is from a recent VoltDB press release on that topic:

VoltDB Announces Enterprise-grade Hadoop Integration

Billerica, Mass., June 22, 2011 – VoltDB, a leading provider of high-velocity data management systems, today announced the release of VoltDB Integration for Hadoop. The new product functionality, available in VoltDB Enterprise Edition, allows organizations to selectively stream high velocity data from a VoltDB cluster into Hadoop’s native HDFS file system by leveraging Cloudera’s Distribution Including Apache Hadoop (CDH), which has SQL-to-Hadoop integration technology, Apache Sqoop, built in.

“The term ‘big data’ is being applied to a diverse set of data storage and processing problems related to the growing volume, variety and velocity of data and the desire of organizations to store and process data sets in their totality,” said Matt Aslett, senior analyst, enterprise software, The 451 Group. “Choosing the right tool for the job is crucial: high velocity data requires an engine that offers fast throughput and real-time visibility; high volume data requires a platform that can expose insights in massive data sets. Integration between VoltDB and CDH will help organizations to combine two special purpose engines to solve increasingly complex data management problems.”

Volume, Velocity and Variety

The volume, velocity and variety of data are exploding, fueled by social applications, sensor automation, mobile networking, and other data intensive forces. Organizations are increasingly turning to specialized, task-specific data management solutions. Leading examples include VoltDB, which is designed to process high velocity data in real time, and Cloudera’s Distribution Including Apache Hadoop (CDH), which provides organizations with a reliable and elastic infrastructure for data processing and deep analytics. VoltDB’s Integration for Hadoop allows customers to rapidly move high velocity data from VoltDB to CDH for long term storage and analysis.

“Customers across a wide variety of industries, from retail and web services to government and telecommunications, are using Cloudera’s Distribution Including Apache Hadoop to identify new value from a wide variety of data sources and then process that data into new product features for their end users,” said Ed Albanese, Head of Business Development for Cloudera. “It’s exciting that companies using CDH are now able to collect data from VoltDB – a next-generation, real-time database, process that data into high value insights and then deliver the results back to VoltDB for real-time consumption. This integration introduces new opportunities for processing and delivering information derived from a previously untapped class of data.”

Commercial-grade Integration

VoltDB Integration for Hadoop is designed specifically to handle the widest variety of customer deployment scenarios including end-user applications, site-based OEM installations and Cloud-based deployments. It combines VoltDB’s enterprise-grade export environment with Apache Sqoop, a Cloudera-sponsored solution for integrating relational databases with Hadoop infrastructures, and delivers the following capabilities:

  • Simple, fast set-up. Establishing integration between VoltDB and a Hadoop installation is fast and easy. A user identifies which VoltDB data will be exported to Hadoop, configures the VoltDB export client with the location of Hadoop, the location of a VoltDB cluster, Sqoop options such as output formatting, and other installation-specific instructions (e.g., frequency of import). The VoltDB export client automatically manages periodic Sqoop jobs based on this configuration. The entire set-up process can be completed in about 15 minutes.
  • Loosely-coupled, push-pull operation. VoltDB automatically pushes copies of export data, in real-time, to the VoltDB export client, which in turn automatically queues that data. The Sqoop receiver then pulls data from the VoltDB export client and imports that data into HDFS on whatever frequency and in whatever amounts the user has defined. VoltDB’s export client manages its data buffer in a way that eliminates possible “impedance mismatches” (i.e., VoltDB exporting data faster than Sqoop imports that data).
  • Automatic overflow management. VoltDB’s export client also automatically writes overflow data to disk to optimize memory utilization. This feature protects against large-scale overflows that could occur if the Sqoop receiver terminates, and allows export data to be retained across sessions if the VoltDB database is stopped.
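
The push-pull and overflow behavior described in the list above can be sketched as a small buffer in Python. This is a simplified illustration, not the VoltDB export client or Sqoop: `ExportBuffer`, its `memory_limit` parameter, and the JSON-lines overflow file are all hypothetical choices of mine. The database side calls `push` at its own pace, the importer calls `pull` in whatever batch size it likes, and rows that do not fit in memory spill to disk so the producer is never blocked.

```python
import collections
import json
import os
import tempfile


class ExportBuffer:
    """Push side appends rows; pull side drains them in batches, oldest first."""

    def __init__(self, memory_limit, overflow_path):
        self.memory_limit = memory_limit
        self.overflow_path = overflow_path
        self._memory = collections.deque()
        self._spilled = 0      # rows currently waiting in the overflow file
        self._read_offset = 0  # file position of the next unread overflow row

    def push(self, row):
        # Once anything has spilled, keep spilling so ordering is preserved.
        if self._spilled == 0 and len(self._memory) < self.memory_limit:
            self._memory.append(row)
            return
        with open(self.overflow_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(row) + "\n")
        self._spilled += 1

    def pull(self, max_rows):
        batch = []
        # In-memory rows are always older than spilled rows, so drain them first.
        while self._memory and len(batch) < max_rows:
            batch.append(self._memory.popleft())
        if len(batch) < max_rows and self._spilled:
            with open(self.overflow_path, "r", encoding="utf-8") as f:
                f.seek(self._read_offset)
                while len(batch) < max_rows and self._spilled:
                    batch.append(json.loads(f.readline()))
                    self._spilled -= 1
                self._read_offset = f.tell()
            if self._spilled == 0:
                # Everything on disk has been consumed; start fresh next time.
                os.remove(self.overflow_path)
                self._read_offset = 0
        return batch


path = os.path.join(tempfile.mkdtemp(), "overflow.jsonl")
buffer = ExportBuffer(memory_limit=3, overflow_path=path)
for i in range(10):          # the producer outruns the consumer
    buffer.push({"id": i})

first = buffer.pull(4)       # three rows from memory, one from disk
rest = buffer.pull(100)      # the remaining spilled rows
```

Note that this sketch keeps its counters in memory only, so unlike the feature described in the release it would not recover spilled rows after a restart.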

“Big Data applications come with a complex combination of operational and analytical challenges,” said VoltDB CEO Scott Jarr. “In response, many organizations are evolving rapidly toward specialized database engines that must function in a co-ordinated way. Recognizing this need, VoltDB and Cloudera are working co-operatively to deliver high-powered product integrations that are easy to use, fast to deploy, and reliable to operate in production.”

With all the progress being made in the direction of NewSQL you can expect a bit of drama in the community. That is something I love about our field: folks are not afraid to voice opinions, and some of the giants have been rumbling about on this topic. For more reading on the drama see:

http://gigaom.com/cloud/facebook-trapped-in-mysql-fate-worse-than-death and

http://www.theregister.co.uk/2011/07/13/mike_stonebraker_versus_facebook/

But perhaps more importantly, if you are designing changes to your organization’s data approaches, consider how VoltDB will fit in your approach. And stay on your path to leveraging Cloudera as well, of course.

 
