Trystan here, Software Engineer and doer of all things technical at AtScale. Which SQL-on-Hadoop engine performs best? We get this question all the time!
We looked around and found that no one had done a complete and impartial benchmark test of real-life workloads across multiple SQL-on-Hadoop engines (Impala, Spark, Hive, etc.).
So, we decided to put our enterprise experience to work and deliver the world's first BI-on-Hadoop performance benchmark.
What did we find out? Well, it turns out that the right question to ask is: "Which engine performs best for which query type?" We looked across three of the most common types of BI queries and found that each engine had a particular niche. Bottom line: one engine does NOT fit all.
Read on to find out the details of our environment and configuration, the types of queries we tested... (or download the full whitepaper here)