AtScale Blog

Supercharge Your Percentile Calculations for Big Data (Part III)

Posted by Daren Drummond on Feb 27, 2018

Additional contribution by: Santanu Chatterjee, Trystan Leftwich, Bryan Naden.

In the previous post, we demonstrated how to model percentile estimates and use them in Tableau without moving large amounts of data.  You may ask, "How accurate are the results, and how much load is placed on the cluster?"  In this post, we discuss the accuracy and scaling properties of the AtScale percentile estimation algorithm.


To learn how to be a data-driven organization, watch this webinar now!


Read More

Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, percentiles

Supercharge Your Percentile Calculations for Big Data (Part II)

Posted by Daren Drummond on Feb 26, 2018

Additional contribution by: Santanu Chatterjee, Trystan Leftwich, Bryan Naden.

In the previous post, we discussed typical use cases for percentiles and the advantages of percentile estimates.  In this post, we illustrate how to model percentile estimates with AtScale and use them from Tableau.



Read More

Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, percentiles

Supercharge Your Percentile Calculations for Big Data (Part I)

Posted by Daren Drummond on Feb 23, 2018

Additional contribution by: Santanu Chatterjee, Trystan Leftwich, Bryan Naden.

 

A new and powerful method of computing percentile estimates on Big Data is now available to you! By combining the well-known t-Digest algorithm with AtScale’s semantic layer and smart aggregation features, AtScale addresses gaps in both the Business Intelligence and Big Data landscapes. Most BI tools can compute and display various percentiles (e.g., medians and interquartile ranges), but they move data for processing, which dramatically limits the size of the analysis.  The Hadoop-based SQL engines (Hive, Impala, Spark) can compute approximate percentiles on large datasets; however, these expensive calculations are not aggregated and reused to answer similar queries.  AtScale offers robust percentile estimates that work with AtScale’s semantic layer and aggregate tables to provide fast, accurate, and reusable percentile estimates.
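
To make the idea of reusable percentile estimates concrete, here is a minimal Python sketch of a mergeable summary. It is only an illustration under stated assumptions: it uses a simple fixed-width histogram instead of t-Digest, and the names `build_sketch`, `merge`, `estimate_percentile`, and `BIN_WIDTH` are hypothetical and not part of any AtScale API. The point is that a small summary computed once per partition can be merged later to answer a broader query without rescanning the raw rows.

```python
from collections import Counter

# Hypothetical fixed-width histogram summary (NOT t-Digest).
# Assumption: a bin width of 10 is fine-grained enough for this illustration.
BIN_WIDTH = 10.0


def build_sketch(values):
    """Summarize raw values as a Counter of {bin_index: count}."""
    return Counter(int(v // BIN_WIDTH) for v in values)


def merge(*sketches):
    """Combine per-partition summaries without touching the raw data."""
    total = Counter()
    for sketch in sketches:
        total.update(sketch)
    return total


def estimate_percentile(sketch, p):
    """Estimate the p-th percentile (0 < p <= 100) from a summary.

    Walks the bins in order and interpolates linearly inside the bin
    that contains the target rank.
    """
    n = sum(sketch.values())
    if n == 0:
        raise ValueError("cannot estimate a percentile from an empty sketch")
    target = p / 100.0 * n
    seen = 0
    for bin_index in sorted(sketch):
        count = sketch[bin_index]
        if seen + count >= target:
            fraction = (target - seen) / count
            return (bin_index + fraction) * BIN_WIDTH
        seen += count
    return (max(sketch) + 1) * BIN_WIDTH


# Usage: summarize two partitions once, then reuse the summaries.
east = build_sketch(range(0, 100))
west = build_sketch(range(100, 200))
combined = merge(east, west)
print(estimate_percentile(east, 50), estimate_percentile(combined, 50))
```

The estimate is only accurate to within one bin width; t-Digest instead adapts its resolution so that the tails of the distribution are captured more precisely, but the merge-and-reuse pattern is the same.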

In this three-part blog series we discuss the benefits of percentile estimates and how to compute them in a Big Data environment.  Subscribe today to learn the best practices of percentile estimation on Big Data and more.  Let's dive right in!



Read More

Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, percentiles

The 6 Principles of Modern Data Architecture

Posted by Joshua Klahr on Jan 19, 2018

A version of this article originally appeared on the Cloudera VISION blog.

One of my favorite parts of my role is that I get to spend time with customers and prospects, learning what’s important to them as they move to a modern data architecture. Lately, a consistent set of six themes has emerged during these discussions. The themes span industries, use cases and geographies, and I’ve come to think of them as the key principles underlying an enterprise data architecture.

Whether you’re responsible for data, systems, analysis, strategy or results, you can use these principles to help you navigate the fast-paced modern world of data and decisions. Think of them as the foundation for data architecture that will allow your business to run at an optimized level today, and into the future.

 

Read More

Topics: Hadoop, Business Intelligence, Big Data, Hadoop Summit, Chief Data Officer

Revolutionizing The Cloud with AtScale and Amazon RedShift

Posted by Lucio Daza on Nov 29, 2017

We Told You It Would Be Cloudy

You have probably already heard all about it, read all about it and know all about it, but if you are anything like me (or Einstein), you have already figured out that the more we learn about BI on Big Data, the more we realize how little we know. Or, do we?


Read More

Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, 6.0, Amazon, AWS, RedShift

AtScale and HDInsight on Microsoft Azure

Posted by Joshua Klahr on Nov 2, 2017

The Forecast Calls for Cloudy Weather

You don’t have to be a clairvoyant to know that there is an ever-increasing trend in cloud adoption among start-ups and enterprises alike.  In the world of Big Data, the past few years have shown a significant increase in cloud adoption.  While Amazon initially led the way with cloud data products - with Amazon Elastic MapReduce (EMR) for Hadoop and Redshift for data warehousing - the past 12 months have seen new entrants on the scene.

Read More

Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, 6.0, azure, HDInsight, Microsoft

TECH TALK: AtScale 6.0 brings Universal Semantic Layer Benefits to Google Cloud

Posted by Joshua Klahr on Oct 10, 2017

Is it October already?

It’s hard to believe that October is here. It feels like only a few days ago that we released AtScale 5.0 and AtScale 5.5.  Both releases contained a number of great new features that I was excited to share with the Big Data and Business Intelligence communities. Although I may be a little biased, I really do believe that the release of AtScale 6.0 marks one of the biggest releases in the history of the company. 

Read More

Topics: Hadoop, Tableau, bi-on-hadoop, Analytics, BI on Big Data, Google BigQuery, BigQuery, 6.0

The True Cost of doing Big Data...the old fashioned way...

Posted by Bruno Aziza on Aug 16, 2017

Despite the challenges associated with data warehousing, enterprise IT leaders have accepted it as a necessary evil of deriving value from information within Hadoop and other Big Data ecosystems. How much does it cost to create data warehouses or datamarts that extract data out of Hadoop? Is there a better way to do BI on Big Data?!

Read More

Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data

TECH TALK: AtScale, Hive, Druid: A Match Made In Heaven

Posted by Joshua Klahr on May 11, 2017

The rapidly exploding demand for business intelligence on big data is nothing new - this trend is clearly indicated in the latest Big Data Maturity surveys (2015 and 2016).  As shown in the graphic below, 75% of respondents are planning on deploying BI workloads on their big data platforms (and 73% of respondents already have some BI use cases deployed).

Read More

Topics: Hadoop, hive, bi-on-hadoop, Analytics, BI on Big Data, druid

What's the best BI tool for Hadoop?

Posted by Bruno Aziza on Apr 20, 2017

Every once in a while, the ultimate question comes up: "What is the best analysis tool for BI on Hadoop?!"  AtScale is not in the business of favoring one tool over another.  We are in the business of making all of them work.  There are indeed many reasons why business users and IT departments choose particular analysis tools.  Here are a few things to consider.

Read More

Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data

Learn about BI & Hadoop

The AtScale Blog is the one-stop shop for cutting-edge news and insights about BI on Hadoop and all things AtScale.
