So You Think You’re Ready for a Data Warehouse Appliance, Part 2

EvanLevy
Forklift by Bien Stephenson via Flickr (Creative Commons license)

As I wrote in last week’s blog post, a data warehouse appliance simplifies platform and system resource administration. It doesn’t simplify the traditional time-intensive efforts of managing and integrating disparate data and addressing performance and tuning of various applications that contend for the same resources.

Many data warehouse appliance vendors offer sophisticated parallel processing environments, query optimization, and specialized storage structures to improve query processing (e.g., columnar-based engines). It’s naïve to think that taking data from an SMP (Symmetric Multi-Processing) relational database and moving it into a parallel processing environment will effectively scale without any adjustments or changes. Moving onto an appliance can be likened to moving into a new house. When you move into a new, larger house, you quickly learn that it’s not as simple as dumping all of your stuff into the new house. The different dimensions of the new rooms cause you to realize that some of your old furniture or rugs simply don’t fit. You inevitably have to make adjustments if you want to truly enjoy your new home. The same goes for a data warehouse appliance: it likely has numerous features to support growth and scalability, but you have to make adjustments to leverage their benefits.

Companies that expect to simply dump their data from a few legacy data marts over to a new appliance should expect to make some adjustments, or they’re likely to experience some unpleasant surprises. Here are some that we’ve already seen.

Everyone agrees that the biggest cost issue behind building a data warehouse is ETL design and development. Hoping to migrate existing ETL jobs into a new hardware and processing environment without expecting rework is short-sighted. While you can probably force-fit your existing job streams, you’ll inevitably misuse the new system, waste system resources, and dramatically reduce the lifespan of the appliance. Each appliance has its own way of handling the intensive resource requirements of data loading, in much the same way that each incumbent database product addresses these same situations. If you’ve justified an appliance through the benefits of consolidating multiple data marts (that contain duplicate data), it only makes sense to consolidate and integrate the ETL processes as well, to prevent duplicated processing and waste.
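
To make the consolidation point concrete, here is a minimal sketch of how you might spot ETL jobs that extract the same source more than once across data marts. Everything in it is hypothetical: the job names, the source and target tables, and the idea that your job inventory is available as a simple list. Treat it as an illustration of the kind of review involved, not as a feature of any particular ETL tool.

```python
from collections import defaultdict


def duplicated_extracts(jobs):
    """Return {source: sorted job names} for sources read by more than one job."""
    by_source = defaultdict(list)
    for job in jobs:
        by_source[job["source"]].append(job["name"])
    # Only sources that are extracted by two or more jobs are consolidation candidates.
    return {src: sorted(names) for src, names in by_source.items() if len(names) > 1}


# Hypothetical job inventory for two legacy data marts (all names are made up).
jobs = [
    {"name": "sales_mart_load_customers",
     "source": "crm.customers", "target": "sales_mart.customer_dim"},
    {"name": "finance_mart_load_customers",
     "source": "crm.customers", "target": "finance_mart.customer_dim"},
    {"name": "sales_mart_load_orders",
     "source": "erp.orders", "target": "sales_mart.order_fact"},
]

candidates = duplicated_extracts(jobs)
for source, names in candidates.items():
    print(f"{source} is extracted by {len(names)} jobs: {', '.join(names)}")
```

In this made-up inventory, the two customer loads read the same source table, so they are the natural place to start merging job streams before moving anything onto the appliance.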

It’s also misguided to assume that, because you’ve built your ETL architecture on the latest and greatest ETL software technology, you won’t have to review the underlying ETL architecture. While there’s no question that migrating tool-based ETL jobs to a new platform can be much easier than migrating lower-level code, the issue at hand isn’t the source and destination; it’s the underlying table structures. Not every table will change in definition on a new platform, but the largest (and most used) tables are the most likely candidates for review and redesign. Each appliance handles data distribution and database design differently. Consequently, since the underlying table structures are likely to require adjustment, plan on a redesign of the actual ETL process too.
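
One way to see why the largest tables deserve a design review: in a parallel environment, rows are typically spread across processing nodes according to some distribution key, and a poor key leaves most nodes idle. The sketch below is an assumption-laden illustration, not any vendor’s actual algorithm. The node count, the CRC32-based hash, and the sample order table are all hypothetical, and each appliance has its own distribution mechanism.

```python
import zlib
from collections import Counter


def node_for(key, num_nodes):
    """Assign a row to a node by hashing its distribution key (stable across runs)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def rows_per_node(rows, key_column, num_nodes):
    """Count how many rows land on each node for a given distribution key."""
    counts = Counter(node_for(row[key_column], num_nodes) for row in rows)
    return [counts.get(n, 0) for n in range(num_nodes)]


def skew(counts):
    """Ratio of the busiest node to the average node; 1.0 means perfectly even."""
    avg = sum(counts) / len(counts)
    return max(counts) / avg if avg else 0.0


# Hypothetical fact table: 10,000 orders but only three distinct regions.
REGIONS = ("EAST", "WEST", "NORTH")
rows = [{"order_id": i, "region": REGIONS[i % 3]} for i in range(10_000)]

NUM_NODES = 8  # assumed cluster size, for illustration only

by_region = rows_per_node(rows, "region", NUM_NODES)
by_order = rows_per_node(rows, "order_id", NUM_NODES)

print("distributed by region:  ", by_region, "skew =", round(skew(by_region), 2))
print("distributed by order_id:", by_order, "skew =", round(skew(by_order), 2))
```

A key with only three distinct values can occupy at most three of the eight nodes, so the rest of the hardware sits idle no matter how fast it is. A table design that was perfectly reasonable on an SMP database can therefore need a different key on the appliance, and the ETL jobs that load the table have to change with it.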

I’m also surprised by the casual attitude regarding technical training. After all, it’s just a SQL database, right? But application developers and data warehouse development staff need to understand how the appliance product differs (after all, it’s a different database version or product). Most of this knowledge can be gained by reading the manuals, but when was the last time the DBAs or database developers actually had a full set of manuals, much less the time required to read them? The investment in training isn’t significant: usually just a few days of classes. If you’re going to provide your developers with a product that claims to be bigger, better, and faster than its competitors, doesn’t it make sense to prepare them adequately to use it?

There’s also an assumption that, since most data warehouse appliance vendors are software-only, there are no hardware implications. On the contrary, you should expect to change your existing hardware. The way memory and storage are configured on a data warehouse appliance can differ from a general-purpose server, yet it’s still rare that the hardware costs are factored into the development plan. And believing that older servers can be re-purposed has turned out to be a myth. If you’re attempting to support more storage, more processing, and more users, how can using older equipment (with the related higher maintenance costs) make financial sense?

You could certainly fork-lift your data, leave all the ETL jobs alone, and not change any processing.  Then again, you could save a fortune on a new data warehouse appliance and simply do nothing. After all, no one argues with the savings associated with doing nothing—except, of course, the users that need the data to run your business.

