SmartData Collective
Schema on Read vs Schema on Write and Why Shakespeare Hates Me

Paige Roberts

A couple of months ago, I found myself without a full-time gig for the first time in decades, and I did a little freelance blogging. Being an overachiever, I wrote such a long post for Adaptive Systems Inc. that I broke it into two parts. The first part got published before I dove headfirst into documenting and unit testing a big Hadoop implementation. The second part got published last week.

It was interesting reading my opinions from a few months ago on the nature and comparative strengths of the various strategies and technologies. It had been long enough that I didn’t remember what I’d written. I got a kick out of comparing my perspective then with my perspective now that I have some recent hands-on experience digging through Hive code and comparing query speed with ORC vs. without, or with MapReduce vs. Tez.

In part 1, I made Shakespeare roll in his grave by misquoting Hamlet repeatedly, while talking about the merits of schema on read versus schema on write strategies in Hadoop data lake projects. This time, I butchered Romeo and Juliet when looking at SQL in Hadoop technologies that use these two strategies, and recommended how to decide which ones to use for your next Hadoop data lake project. Main point: it depends on which balcony you want to climb. And either way, SQL is not off the menu just because you’re using Hadoop.
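For anyone who hasn't read the two parts, here is a minimal sketch of the difference between the two strategies. It is plain Python rather than anything Hadoop-specific, and the sample data, the two-column schema, and every function name in it are made up for illustration. Schema on write validates rows as they are loaded, while schema on read stores the raw lines as-is and applies the schema only when a query runs.

```python
import csv
import io

# Hypothetical sample data: the second row has a bad amount.
RAW = "id,amount\n1,9.50\n2,oops\n3,4.25\n"


def parse_row(row):
    """Apply the (assumed) schema: id is an int, amount is a float."""
    return {"id": int(row["id"]), "amount": float(row["amount"])}


def load_schema_on_write(raw_text):
    """Validate while loading; rows that break the schema never get stored."""
    stored, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_text)):
        try:
            stored.append(parse_row(row))
        except ValueError:
            rejected.append(row)
    return stored, rejected


def load_schema_on_read(raw_text):
    """Store the raw lines untouched; no schema is applied at load time."""
    return raw_text.splitlines()


def total_amount_schema_on_read(raw_lines):
    """Apply the schema at query time, skipping rows that do not fit it."""
    total = 0.0
    for row in csv.DictReader(raw_lines):
        try:
            total += parse_row(row)["amount"]
        except ValueError:
            continue
    return total


if __name__ == "__main__":
    stored, rejected = load_schema_on_write(RAW)
    print(len(stored), "rows stored,", len(rejected), "rejected at load time")
    print("total at query time:", total_amount_schema_on_read(load_schema_on_read(RAW)))
```

In this toy version, the trade-off is where the parsing cost and the bad-row handling land: up front on every load, or later on every query.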

[Image: Hadoop Data Lake Balcony]

Yup, still feel pretty much the same on the subject. Nothing new there. Glad to see the post get released into the wild at last.

Life has taken a radical shift, though. This is the first time in ages that I’ve had a job where social media activity wasn’t part of the job description. My blog has been neglected. My Twitter followers probably think I fell off the face of the earth. But no. Still here. Just heads down a lot of the time, digging through code and quizzing folks about why this strategy was picked, or how this problem was solved.

I go through phases in my life where I learn and practice, and other phases where I assimilate and share. Of course, all of my life has been a mishmash of both, but the pendulum swings more toward one end of the spectrum or the other sometimes. Right now, I’m swimming in information and splashing around like a kid in summer.

[Image: cover of The Content Pool by Alan J. Porter]

My friend Alan J. Porter wrote a great book about content management called The Content Pool, and another friend, Doug Potter, did this awesome cover illustration. I feel like that guy on the cover. (Buy that book, btw, if you do anything content management related. It’s a must have.)

I still owe everybody a Storm post, and it is coming, but I’ve also been learning a lot about Apache NiFi, ironically recently renamed DataFlow. Expect a post to come on that. I wrote something up a couple of weeks ago about reasons Hadoop implementations fail. That’s bound to show up somewhere soon.

Stay tuned … Same bat time, same bat channel.

On another note, I’m interested in comparing ETL workflow orchestration tools, especially open source ones, but also good commercial ones if they’re not priced out of the usual Hadoop market. I’m looking for things that are Oozie-like, but better than Oozie. (Oozie is NOT one of my favorite Hadoop ecosystem bits.)

Suggestions?
