Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data driven insights
    How Data-Driven Insights Are Addressing Gaps in Patient Communication and Equity
    8 Min Read
    pexels pavel danilyuk 8112119
    Data Analytics Is Revolutionizing Medical Credentialing
    8 Min Read
    data and seo
    Maximize SEO Success with Powerful Data Analytics Insights
    8 Min Read
    data analytics for trademark registration
    Optimizing Trademark Registration with Data Analytics
    6 Min Read
    data analytics for finding zip codes
    Unlocking Zip Code Insights with Data Analytics
    6 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Two Approaches to Scalable Database Design
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Two Approaches to Scalable Database Design
Big Data

Two Approaches to Scalable Database Design

Eran Levy
Eran Levy
7 Min Read
SHARE

Any form of application used for data analysis is stringently dependant on its ability to retrieve queries fast. However, when working with larger or more complex datasets, as well as an increasing amount of concurrent users, the performance depends largely on the underlying analytical database – whether this is built into the application as part of a single-stack tool, or implemented via a separate data warehouse layer.

Contents
What Makes a Scalable Database?The Traditional Approach: DenormalizationElasticube: Optimizing Performance via In-Chip Optimization

Any form of application used for data analysis is stringently dependant on its ability to retrieve queries fast. However, when working with larger or more complex datasets, as well as an increasing amount of concurrent users, the performance depends largely on the underlying analytical database – whether this is built into the application as part of a single-stack tool, or implemented via a separate data warehouse layer.

What Makes a Scalable Database?

Database scalability is a concept in database design that emphasizes on the capability of a database to handle growth in the amount of data and users. In the modern applications sphere two types of workloads have emerged – namely analytical and transactional workloads. Planning for workload growth must take into account operating system, database design and hardware design decisions.

More Read

Visual Complexity is a unified resource space for anyone…
Quality vs Quantity in Customer Service Analytics
7 Ways Big Data Is Essential For Life Insurance Settlements
Big Data vs. Small Data: An Analysis Of Its Latest News
Big Data ROI? Not Likely in Year 1

The Traditional Approach: Denormalization

At the hardware level vertical and horizontal scalability exist. Vertical scaling involves increasing the capacity of a single machine. The capacity of a single machine can be increased by adding more CPU cores, RAM and storage. Horizontal scaling involves adding more machines to cope with growing workloads. This article will only highlight database design decisions required for a scalable application.

At the database level configuration, schema design, indexing and query design affect the capability of a database to scale. Each relational database system provides a set of parameters that can be tweaked to improve performance. Therefore it is advisable to check parameters that are used in the specific database system. Schema design offers the biggest opportunity for a system developer to make decisions that will affect the scalability and performance of a database. Transaction workloads require a normalized design while analytical workloads require a denormalized design. Each design will be briefly explained.

In a transaction workload the main objective is to process a large number of short UPDATE, INSERT and DELETE queries. Such databases are normalized to the third normal form (3NF). In this form business entities are broken into small tables. The business entities to be used in informing how to break up the tables come from a thorough analysis of the business. Normalization reduces redundant storage of data and enables tables to be joined to others. Indexing in a transaction workload does improve query response but slows update, insert and delete queries.

Therefore indexing can speed up or slow down a transaction database depending on how it is implemented. An iterative indexing strategy of implementing an index and testing its effect is appropriate. Although partitioning was developed for data warehouses it can still be beneficial in a transaction system especially where databases are big. A 3NF assures data consistency and accuracy but performance may be reduced due to the multiple joins involved. Trying to optimize query performance in a transaction system will affect the core functions of recording business data. Therefore it is advisable to migrate data analysis to a specialized system that uses an analytical workload design.

In an analytical workload the objective is to process few complex queries that arise in data analysis. This design is optimized for fast query performance. The data is denormalized meaning the business entities that were broken into different tables in the transaction system are joined together into one table. In this design the two major table types are dimension and fact tables. Dimension tables contain descriptive attributes about the business. Fact tables contain measurable attributes that show the performance of a business. For example in a sales process the dollar amounts are the facts while region, customer type, and product category are the dimensions. In this design indexing and partitioning are very beneficial in improving the response times.

When working with either workload query optimization is very important. One optimization technique is avoiding use of SELECT * instead you should list all columns. Loops and correlated sub-queries slow down performance so they need to be used judiciously.

Elasticube: Optimizing Performance via In-Chip Optimization

The approach outlined above, while scalable, often means sacrificing a great deal of the granularity of the original data on account of the need to denormalize and index said data in advance. This also limits the ability of the typically less-technical, front-end users of analytical applications to perform ad-hoc analysis – as they can only access certain views or queries according to a predefined schema.

The Elasticube present an alternative form of analytical repository designed for fast query retrieval and scale. It’s based on the notion that by employing In-Chip™ technology to maximize the capabilities of modern hardware, business intelligence tools can achieve optimal performance when processing analytical queries, even without sacrificing the granularity of the data or flexibility of the front-end analysis – since faster query retrieval eliminates the need for summary tables, indexes or denormalization.

 

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

agenic ai
How Businesses Are Using AI to Make Smarter, Faster Decisions
Artificial Intelligence Exclusive
accountant using ai
AI Improves Integrity in Corporate Accounting
Exclusive
ai and law enforcement
Forensic AI Technology is Doing Wonders for Law Enforcement
Artificial Intelligence Exclusive
langgraph and genai
LangGraph Orchestrator Agents: Streamlining AI Workflow Automation
Artificial Intelligence Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

Scenario Testing, Stress Testing and Decision Management

3 Min Read
Image
Big DataData Quality

True Data-Driven Decision Making Requires Smart, Honest Data Collection Strategies

7 Min Read

Big Data Without Integration Is Broken

7 Min Read

Top 10 analytics mistakes

7 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI and chatbots
Chatbots and SEO: How Can Chatbots Improve Your SEO Ranking?
Artificial Intelligence Chatbots Exclusive
giveaway chatbots
How To Get An Award Winning Giveaway Bot
Big Data Chatbots Exclusive

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?