Data Lake Intelligence with AtScale
In my recent Data Lake 2.0 article I described how the worlds of big data and cloud are coming together to reshape the concept of the data lake. The data lake is an important element of any modern data architecture, and the data lake footprint will continue to expand. However, the data lake investment is only one part of delivering a modern data architecture. At Yahoo!, in addition to building a Hadoop-based data lake, we also needed to solve the problem of connecting traditional business intelligence workloads to this Hadoop data. Although the term “Data Lake” didn’t exist back then, we were solving the problem of “How can you deliver an interactive BI experience on top of a scale-out Data Lake?” It turns out we were pioneers in delivering Data Lake Intelligence.
Our experiences and learnings from those initial efforts led to the architecture that sits at the core of the AtScale Intelligence Platform. Because AtScale has been built from the ground up to deliver business-friendly insights from the vast amounts of information in data lakes, AtScale has experienced tremendous success and adoption in enterprises ranging from financial services to retail to digital media. With the release of AtScale 6.5, we’ve continued to build on and expand AtScale’s unique ability to deliver on the promise of Data Lake Intelligence. If this sounds like something you might be interested in knowing more about… keep reading!
Topics:
Business Intelligence,
bi-on-hadoop,
Big Data,
Cloud,
BI,
Analytics,
BI on Big Data,
Data Strategy,
data driven
A version of this article originally appeared on the Cloudera VISION blog.
One of my favorite parts of my role is that I get to spend time with customers and prospects, learning what’s important to them as they move to a modern data architecture. Lately, a consistent set of six themes has emerged during these discussions. The themes span industries, use cases and geographies, and I’ve come to think of them as the key principles underlying an enterprise data architecture.
Whether you’re responsible for data, systems, analysis, strategy or results, you can use these principles to help you navigate the fast-paced modern world of data and decisions. Think of them as the foundation for data architecture that will allow your business to run at an optimized level today, and into the future.
Topics:
Hadoop,
Business Intelligence,
Big Data,
Hadoop Summit,
Chief Data Officer
The Forecast Calls for Cloudy Weather
You don’t have to be clairvoyant to see the ever-increasing trend in cloud adoption among start-ups and enterprises alike, and the world of Big Data is no exception: the past few years have shown a significant increase in cloud adoption there as well. While Amazon initially led the way with cloud data products - with Amazon Elastic MapReduce (EMR) for Hadoop and Redshift for data warehousing - the past 12 months have seen new entrants on the scene.
Topics:
Hadoop,
bi-on-hadoop,
Analytics,
BI on Big Data,
6.0,
azure,
HDInsight,
Microsoft
Is it October already?
It’s hard to believe that October is here. It feels like only a few days ago that we released AtScale 5.0 and AtScale 5.5. Both releases contained a number of great new features that I was excited to share with the Big Data and Business Intelligence communities. Although I may be a little biased, I really do believe that the release of AtScale 6.0 marks one of the biggest releases in the history of the company.
Topics:
Hadoop,
Tableau,
bi-on-hadoop,
Analytics,
BI on Big Data,
Google BigQuery,
BigQuery,
6.0
The rapidly exploding demand for business intelligence on big data is nothing new - this trend is clearly indicated in the latest Big Data Maturity surveys (2015 and 2016). As shown in the graphic below, 75% of respondents are planning on deploying BI workloads on their big data platforms, and 73% of respondents already have some BI use cases deployed.
Topics:
Hadoop,
hive,
bi-on-hadoop,
Analytics,
BI on Big Data,
druid
In the world of Business Intelligence and Big Data there continue to be a number of exciting innovations as new and improved options for processing large data sets appear on the market. You may be familiar with AtScale’s BI-on-Hadoop Benchmarks - where we focus on evaluating the top SQL-on-Hadoop engines and their fitness to support traditional BI-style queries. As we continue to work with customers who are navigating their journey to BI on Big Data, we are increasingly getting questions about the emerging cloud-based data processing engines.
In this blog post, we will take a deeper look at BigQuery from Google, and how it stacks up in the BI-on-Big Data ecosystem.
Topics:
Business Intelligence,
Big Data,
olap,
BI,
Google BigQuery
I’ve asked it before and I’ll ask it again. Wouldn’t it be great if you could easily analyze ALL your data from a single Excel file? We all know this isn’t feasible, especially when dealing with big data and complex business analytics needs.
In working at the intersection of Big Data and traditional Business Intelligence, the AtScale team has encountered a number of complex business analytics use cases that are difficult, if not near-impossible, to solve using typical table-based data models and SQL. Today, I’m going to share why and how complex analysis, such as multi-level metrics, is no longer as ‘difficult’ or ‘near-impossible’ as it once was.
Topics:
Business Intelligence,
Big Data,
olap,
BI
Wouldn’t it be great if you could load all of your data from a single file into an Excel pivot table for easy analysis?
Unfortunately, this approach isn’t usually viable when dealing with complex business analytics and big data. Take for example a typical use case found in the world of healthcare insurance. A large insurance provider has tens of millions of members, and processes hundreds of millions of claims a year. As flexible as Excel is, we all know it won’t handle this volume or velocity of data.
As a result, more and more enterprises store large data sets in big data platforms like Hadoop. And while Hadoop provides a low-cost and performant approach to store and process this information, there is still the challenge of supporting the many types of analytics required on claims and member data sets. But why? Why and how, with all of the advances in technology, can a simple calculation cause so much complexity?
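The excerpt does not say which calculation it has in mind. One common source of this kind of complexity, assumed here purely for illustration, is a metric that is not additive across levels, such as the number of distinct members with claims. The sketch below uses made-up records and hypothetical function names to show that summing monthly distinct counts overstates the overall count, which in turn skews any "per member" average built on it.

```python
from collections import defaultdict

# Hypothetical claims records: (member_id, month, claim_amount).
# The values are invented for illustration only.
claims = [
    ("m1", "2016-01", 120.0),
    ("m1", "2016-02", 80.0),
    ("m2", "2016-01", 200.0),
    ("m3", "2016-02", 50.0),
]


def members_with_claims_by_month(rows):
    """Count distinct members with at least one claim, per month."""
    seen = defaultdict(set)
    for member_id, month, _amount in rows:
        seen[month].add(member_id)
    return {month: len(ids) for month, ids in seen.items()}


def members_with_claims_overall(rows):
    """Count distinct members with at least one claim across all months."""
    return len({member_id for member_id, _month, _amount in rows})


monthly = members_with_claims_by_month(claims)
naive_total = sum(monthly.values())               # m1 is counted in both months
true_total = members_with_claims_overall(claims)  # each member counted once
total_amount = sum(amount for _member, _month, amount in claims)

print("monthly distinct members:", monthly)
print("sum of monthly counts:", naive_total)
print("true distinct members:", true_total)
print("average claim cost per member (naive):", total_amount / naive_total)
print("average claim cost per member (correct):", total_amount / true_total)
```

Because the monthly counts cannot simply be rolled up, the metric has to be recomputed from member-level detail at each level of the hierarchy, which is where the cost shows up once the data reaches hundreds of millions of claims.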
Topics:
Business Intelligence,
Big Data,
olap,
BI
Just this week, AtScale published the Q4 Edition of our BI-on-Hadoop Benchmark, and we found 1.5X to 4X performance improvements across SQL engines Hive, Spark, Impala and Presto for Business Intelligence and Analytic workloads on Hadoop.
Bottom line, the benchmark results are great news for any company looking to analyze their big data in Hadoop because you can now do so faster, on more data, for more users than ever before.
While this blog provides a high-level summary of our findings, you can access the full Q4 2016 Edition of the BI-on-Hadoop Benchmarks here, and also listen to our webinar replay discussing this in more detail here.
Topics:
Hadoop,
Business Intelligence,
spark,
hive,
bi-on-hadoop,
Big Data,
impala,
presto
The growing popularity of big data analytics, coupled with the adoption of technologies like Spark and Hadoop, has allowed enterprises to collect an ever-increasing amount of data, in terms of both breadth and volume. At the same time, the need for traditional business analysis of these data sets using widely adopted tools like Microsoft Excel, Tableau, and Qlik still remains. Historically, data has been provided to these visualization front ends using OLAP interfaces and data structures. OLAP makes the data easy for business users to consume, and offers interactive performance for the types of queries that the business intelligence (BI) tools generate.
However, as data volumes explode, reaching hundreds of terabytes or even petabytes of data, traditional OLAP servers have a hard time scaling. To surmount this modern data challenge, many leading enterprises are now in search of the next generation of business intelligence capabilities, falling into the category of scale-out BI. In this blog I'll share how you can leverage the familiar interface and performance of an OLAP server while scaling out to the largest of data sets.
And if you don't have time to read the whole thing, don't miss the 10-minute 'cliff-note' video of scale-out BI on Hadoop near the end.
Topics:
Hadoop,
Business Intelligence,
spark,
hive,
bi-on-hadoop,
Big Data