Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    image fx (67)
    Improving LinkedIn Ad Strategies with Data Analytics
    9 Min Read
    big data and remote work
    Data Helps Speech-Language Pathologists Deliver Better Results
    6 Min Read
    data driven insights
    How Data-Driven Insights Are Addressing Gaps in Patient Communication and Equity
    8 Min Read
    pexels pavel danilyuk 8112119
    Data Analytics Is Revolutionizing Medical Credentialing
    8 Min Read
    data and seo
    Maximize SEO Success with Powerful Data Analytics Insights
    8 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Schema on Read vs Schema on Write and Why Shakespeare Hates Me
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Uncategorized > Schema on Read vs Schema on Write and Why Shakespeare Hates Me
Uncategorized

Schema on Read vs Schema on Write and Why Shakespeare Hates Me

Paige Roberts
Paige Roberts
5 Min Read
SHARE

A couple of months ago, I found myself without a full time gig for the first time in decades, and I did a little freelance blogging. Being an overachiever, I wrote such a long post for Adaptive Systems Inc. that I broke it into two parts. The first part got published before I dove head first into documenting and unit testing a big Hadoop implementation. The second part got published last week.

A couple of months ago, I found myself without a full time gig for the first time in decades, and I did a little freelance blogging. Being an overachiever, I wrote such a long post for Adaptive Systems Inc. that I broke it into two parts. The first part got published before I dove head first into documenting and unit testing a big Hadoop implementation. The second part got published last week.

It was interesting reading my opinions on the nature and comparative strengths of the various strategies and technologies from a few months ago. It had been long enough that I didn’t remember what I’d written. I got a kick out of comparing my perspective, now that I have some recent hands-on experience digging through Hive code, comparing query speed with ORC vs without, or with MapReduce vs Tez.

In part 1, I made Shakespeare roll in his grave by misquoting Hamlet repeatedly, while talking about the merits of schema on read versus schema on write strategies in Hadoop data lake projects. This time, I butchered Romeo and Juliet when looking at SQL in Hadoop technologies that use these two strategies, and recommended how to decide which ones to use for your next Hadoop data lake project. Main point: it depends on which balcony you want to climb. And either way, SQL is not off the menu just because you’re using Hadoop.

More Read

Image
5 Challenges Facing the Internet of Things
Overcoming Objections to a Data Governance Program
Do You Have a Love-Hate Relationship With Your Data?
Are people more willing to pay for digital goods on mobile devices?
On Holiday in Snowy Sheffield

Hadoop Data Lake Balcony

Yup, still feel pretty much the same on the subject. Nothing new there. Glad to see the post get released into the wild at last.

Life has taken a radical shift, though. This is the first time in ages that I’ve had a job where social media activity wasn’t part of the job description. My blog has been neglected. My Twitter followers probably think I fell off the face of the earth. But no. Still here. Just heads down a lot of the time, digging through code and quizzing folks about why this strategy was picked, or how this problem was solved.

I go through phases in my life where I learn and practice, and other phases where I assimilate and share. Of course, all of my life has been a mishmash of both, but the pendulum swings more toward one end of the spectrum or the other sometimes. Right now, I’m swimming in information and splashing around like a kid in summer.

The Content Pool by Alan J Porter

My friend Alan J. Porter wrote a great book about content management called The Content Pool, and another friend, Doug Potter, did this awesome cover illustration. I feel like that guy on the cover. (Buy that book, btw, if you do anything content management related. It’s a must have.)

I still owe everybody a Storm post, and it is coming, but I’ve also been learning a lot about Apache Nifi, ironically recently re-named DataFlow. Expect a post to come on that. I wrote something up a couple weeks ago about reasons Hadoop implementations fail. That’s bound to show up somewhere soon.

Stay tuned … Same bat time, same bat channel.

On another note, I’m interested in comparing ETL workflow orchestration tools, especially open source ones, but also good commercial ones if they’re not priced out of the usual Hadoop market. I’m looking for things that are Oozie-like, but better than Oozie. (Oozie is NOT one of my favorite Hadoop ecosystem bits.)

Suggestions?

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

image fx (2)
Monitoring Data Without Turning into Big Brother
Big Data Exclusive
image fx (71)
The Power of AI for Personalization in Email
Artificial Intelligence Exclusive Marketing
image fx (67)
Improving LinkedIn Ad Strategies with Data Analytics
Analytics Big Data Exclusive Software
big data and remote work
Data Helps Speech-Language Pathologists Deliver Better Results
Analytics Big Data Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

Matt Cutts Keeps Google Honest

2 Min Read

An Assessment on the Cyber Threat

8 Min Read

Relational Database Design Tips to Boost Performance

6 Min Read

Non-linearity of technology adoption

6 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

giveaway chatbots
How To Get An Award Winning Giveaway Bot
Big Data Chatbots Exclusive
AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?