First Look – Pervasive DataRush

June 4, 2009

Copyright © 2009 James Taylor. Visit the original article at First Look – Pervasive DataRush.

Pervasive is a global software company with 200+ employees that has been profitable for the last 8 years and is best known for Btrieve (now Pervasive PSQL) and their data integration products. The company is busy expanding into new markets, and Pervasive DataRush is one of their new products. They see a new generation of data-intensive applications that are Smart, Green, Scalable and Efficient.

  • Smart means applications that handle lots of data, use personalized data, focus on predictive analytics and enable time-critical decisions. Today they see problems with software choking on large data volumes, applications being hampered by unclean or incomplete data, limits to accuracy and simulation, and days of processing time for complex problems.
  • Green means applications with smaller carbon footprints that get higher utilization rates (e.g. of multi-core servers) so companies can limit the number of data centers. Currently data processing software tends to need lots of blades and cannot utilize existing hardware as well as it might.
  • Scalable means leveraging multi-core processors, scaling dynamically to match the hardware and running on any platform. Current software, they say, loses scalability after about 4 cores and requires rewrites or adjustments to use the full capacity of a given piece of hardware. Such a rewrite may also need to be written and maintained for every operating system.
  • Efficient means finding ways to limit software and hardware costs and free up time on your existing hardware. Current software often limits the number of jobs, requires additional databases for analytic work and requires more blades or servers to scale – hardware is being thrown at the problem.
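To make the "Scalable" point concrete, here is a minimal sketch of what matching the hardware looks like when done by hand with the standard `java.util.concurrent` classes. It is not the DataRush API; the class name `CoreScaledSum` and the method `parallelSum` are hypothetical names chosen for illustration. The worker count is read from the JVM at run time rather than hard-coded, so the same code uses 4 workers on a 4-core machine and 8 on an 8-core one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: plain java.util.concurrent, not the DataRush API.
class CoreScaledSum {

    // Sums the array by giving each worker one contiguous chunk.
    static long parallelSum(int[] values, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (values.length + workers - 1) / workers; // ceiling division
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int start = 0; start < values.length; start += chunk) {
                final int from = start;
                final int to = Math.min(values.length, start + chunk);
                tasks.add(() -> {
                    long partial = 0;
                    for (int i = from; i < to; i++) {
                        partial += values[i];
                    }
                    return partial;
                });
            }
            long total = 0;
            // invokeAll blocks until every chunk has been processed.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        // Size the pool to the machine instead of hard-coding a thread count.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("workers=" + cores + " sum=" + parallelSum(data, cores));
    }
}
```

Even for a trivial sum, the partitioning, thread pool lifecycle and error handling are all the developer's responsibility here, which is the kind of plumbing a framework of this sort aims to take over.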

Obviously they feel that Pervasive DataRush will address these issues, and that it can change the way a company does business. When a company can process and understand data on the fly or very rapidly rather than overnight, this allows different approaches and creates new opportunities. Taking Netflix as an example, they ran more than 100M ratings from 480,000 users (the public data set for the Netflix Prize challenge) in 16.31 minutes with a reasonably competitive result. In contrast, most teams spend days running the algorithms. And this was run on a standard 8-core box, not a massive piece of hardware. Clearly, if you can run something several times an hour instead of over the weekend you would use it differently, and you can easily see how this could create new opportunities for analytic applications.

DataRush is a data processing engine and software platform with a family of embeddable software solutions for data-intensive applications. They have some patent-pending parallel processing technology so that low-level parallel processing issues are handled by the framework, leaving application developers to focus on their problem. Developers write Java in Eclipse or NetBeans against the APIs in the JRush framework, and scalability, exploiting additional cores and so on are then handled automatically by the framework without recoding. DataRush has a large and growing library of operators. It handles locking, memory, threading etc., enabling much easier development and more design-time productivity than attempting to use the standard Java libraries. They use JMX for monitoring, and the Pervasive DataRush Engine runs on a standard JVM/JMX setup. The engine supports an execution API and an operator library. On top of this they have some applications such as transforms, join/sort/aggregate, predictive analytics, matching and a data profiler.
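As a rough illustration of what a standard JVM/JMX setup exposes, the sketch below reads a few of the platform MXBeans that every JVM provides through `java.lang.management`. It does not touch any DataRush-specific MBeans, which are not described here; the class name `JvmSnapshot` and the method `describe` are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;

// Illustrative only: standard platform MXBeans, not DataRush-specific ones.
class JvmSnapshot {

    // Returns a one-line summary of cores, live threads and heap in use.
    static String describe() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

        long heapUsedMb = memory.getHeapMemoryUsage().getUsed() / (1024 * 1024);
        return "cores=" + os.getAvailableProcessors()
                + " liveThreads=" + threads.getThreadCount()
                + " heapUsedMB=" + heapUsedMb;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The same beans can be browsed remotely with a JMX client such as JConsole, which is presumably the kind of monitoring a JMX-based engine plugs into.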

They have a blog you can check out at http://cs.pervasive.com/blogs/datarush/.


