Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data analytics
    How Data Analytics Can Help You Construct A Financial Weather Map
    4 Min Read
    financial analytics
    Financial Analytics Shows The Hidden Cost Of Not Switching Systems
    4 Min Read
    warehouse accidents
    Data Analytics and the Future of Warehouse Safety
    10 Min Read
    stock investing and data analytics
    How Data Analytics Supports Smarter Stock Trading Strategies
    4 Min Read
    predictive analytics risk management
    How Predictive Analytics Is Redefining Risk Management Across Industries
    7 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Two Approaches to Scalable Database Design
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Two Approaches to Scalable Database Design
Big Data

Two Approaches to Scalable Database Design

Eran Levy
Eran Levy
7 Min Read
SHARE

Any form of application used for data analysis is stringently dependant on its ability to retrieve queries fast. However, when working with larger or more complex datasets, as well as an increasing amount of concurrent users, the performance depends largely on the underlying analytical database – whether this is built into the application as part of a single-stack tool, or implemented via a separate data warehouse layer.

Contents
  • What Makes a Scalable Database?
    • The Traditional Approach: Denormalization
    • Elasticube: Optimizing Performance via In-Chip Optimization

Any form of application used for data analysis is stringently dependant on its ability to retrieve queries fast. However, when working with larger or more complex datasets, as well as an increasing amount of concurrent users, the performance depends largely on the underlying analytical database – whether this is built into the application as part of a single-stack tool, or implemented via a separate data warehouse layer.

What Makes a Scalable Database?

Database scalability is a concept in database design that emphasizes on the capability of a database to handle growth in the amount of data and users. In the modern applications sphere two types of workloads have emerged – namely analytical and transactional workloads. Planning for workload growth must take into account operating system, database design and hardware design decisions.

More Read

Battling Data Silos: 3 Tips to Finance and Operations Integration
Here Are Bank of America’s Revelations of the Future of Big Data
Big Data in the Music Industry: Richard Bowman
Using Web 2.0 for Analytics 2.0
4 Ways Big Data Can Drive Your Sales to the Next Level

The Traditional Approach: Denormalization

At the hardware level vertical and horizontal scalability exist. Vertical scaling involves increasing the capacity of a single machine. The capacity of a single machine can be increased by adding more CPU cores, RAM and storage. Horizontal scaling involves adding more machines to cope with growing workloads. This article will only highlight database design decisions required for a scalable application.

At the database level configuration, schema design, indexing and query design affect the capability of a database to scale. Each relational database system provides a set of parameters that can be tweaked to improve performance. Therefore it is advisable to check parameters that are used in the specific database system. Schema design offers the biggest opportunity for a system developer to make decisions that will affect the scalability and performance of a database. Transaction workloads require a normalized design while analytical workloads require a denormalized design. Each design will be briefly explained.

In a transaction workload the main objective is to process a large number of short UPDATE, INSERT and DELETE queries. Such databases are normalized to the third normal form (3NF). In this form business entities are broken into small tables. The business entities to be used in informing how to break up the tables come from a thorough analysis of the business. Normalization reduces redundant storage of data and enables tables to be joined to others. Indexing in a transaction workload does improve query response but slows update, insert and delete queries.

Therefore indexing can speed up or slow down a transaction database depending on how it is implemented. An iterative indexing strategy of implementing an index and testing its effect is appropriate. Although partitioning was developed for data warehouses it can still be beneficial in a transaction system especially where databases are big. A 3NF assures data consistency and accuracy but performance may be reduced due to the multiple joins involved. Trying to optimize query performance in a transaction system will affect the core functions of recording business data. Therefore it is advisable to migrate data analysis to a specialized system that uses an analytical workload design.

In an analytical workload the objective is to process few complex queries that arise in data analysis. This design is optimized for fast query performance. The data is denormalized meaning the business entities that were broken into different tables in the transaction system are joined together into one table. In this design the two major table types are dimension and fact tables. Dimension tables contain descriptive attributes about the business. Fact tables contain measurable attributes that show the performance of a business. For example in a sales process the dollar amounts are the facts while region, customer type, and product category are the dimensions. In this design indexing and partitioning are very beneficial in improving the response times.

When working with either workload query optimization is very important. One optimization technique is avoiding use of SELECT * instead you should list all columns. Loops and correlated sub-queries slow down performance so they need to be used judiciously.

Elasticube: Optimizing Performance via In-Chip Optimization

The approach outlined above, while scalable, often means sacrificing a great deal of the granularity of the original data on account of the need to denormalize and index said data in advance. This also limits the ability of the typically less-technical, front-end users of analytical applications to perform ad-hoc analysis – as they can only access certain views or queries according to a predefined schema.

The Elasticube present an alternative form of analytical repository designed for fast query retrieval and scale. It’s based on the notion that by employing In-Chip™ technology to maximize the capabilities of modern hardware, business intelligence tools can achieve optimal performance when processing analytical queries, even without sacrificing the granularity of the data or flexibility of the front-end analysis – since faster query retrieval eliminates the need for summary tables, indexes or denormalization.

 

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

Edge Computing in IoT
Unique Capabilities of Edge Computing in IoT
Exclusive Internet of Things
Turning Geographic Data Into Competitive Advantage
The Rise of Location Intelligence: Turning Geographic Data Into Competitive Advantage
Big Data Exclusive
AI Recruitment Software Solution
The Best AI Recruitment Software Solution: Transforming Hiring with Smarter Tech
Artificial Intelligence Exclusive
real estate data
How Big Data Is Changes How We Buy and Sell Real Estate
Big Data Exclusive

Stay Connected

1.2KFollowersLike
33.7KFollowersFollow
222FollowersPin

You Might also Like

big data approaches to marketing
Big DataExclusive

Can Big Data Approaches To Marketing Slash Business Closure Rates?

8 Min Read

How the Cloud is Changing the Data Center’s Bad Reputation for Energy Inefficiency

10 Min Read
Image
AnalyticsBig Data

The Value Architect and Data Scientist: Are They Really New Roles?

9 Min Read

Talk Analytics with Executives: 4 Things You Must Understand

8 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

data-driven web design
5 Great Tips for Using Data Analytics for Website UX
Big Data
ai chatbot
The Art of Conversation: Enhancing Chatbots with Advanced AI Prompts
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?