AtScale Blog

Supercharge Your Percentile Calculations for Big Data (Part III)

Posted by Daren Drummond on Feb 27, 2018

Additional contribution by: Santanu Chatterjee, Trystan Leftwich, Bryan Naden.

In the previous post, we demonstrated how to model percentile estimates and use them in Tableau without moving large amounts of data. You may ask, "How accurate are the results, and how much load is placed on the cluster?" In this post, we discuss the accuracy and scaling properties of the AtScale percentile estimation algorithm.


To learn how to be a data-driven organization, watch this webinar now!



Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, percentiles

Supercharge Your Percentile Calculations for Big Data (Part II)

Posted by Daren Drummond on Feb 26, 2018

Additional contribution by: Santanu Chatterjee, Trystan Leftwich, Bryan Naden.

In the previous post, we discussed typical use cases for percentiles and the advantages of percentile estimates.  In this post, we illustrate how to model percentile estimates with AtScale and use them from Tableau.





Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, percentiles

Supercharge Your Percentile Calculations for Big Data (Part I)

Posted by Daren Drummond on Feb 23, 2018

Additional contribution by: Santanu Chatterjee, Trystan Leftwich, Bryan Naden.

 

A new and powerful method of computing percentile estimates on Big Data is now available to you! By combining the well-known t-Digest algorithm with AtScale’s semantic layer and smart aggregation features, AtScale addresses gaps in both the Business Intelligence and Big Data landscapes. Most BI tools have features to compute and display various percentiles (e.g., medians and interquartile ranges), but they move data for processing, which dramatically limits the size of the analysis. The Hadoop-based SQL engines (Hive, Impala, Spark) can compute approximate percentiles on large datasets; however, these expensive calculations are not aggregated and reused to answer similar queries. AtScale offers robust percentile estimates that work with AtScale’s semantic layer and aggregate tables to provide fast, accurate, and reusable results.
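To make the idea of "aggregated and reused" percentile estimates concrete, here is a minimal Python sketch. It is not t-Digest and not AtScale's implementation; it uses a toy fixed-width histogram as a stand-in for a mergeable summary. The names `BIN_WIDTH`, `summarize`, `merge`, `estimate_percentile`, and the `east`/`west` partitions are all hypothetical and chosen only for illustration.

```python
from collections import Counter

BIN_WIDTH = 1.0  # assumed resolution of the toy summary (not a t-Digest parameter)


def summarize(values):
    """Build a mergeable summary: a count of values per fixed-width bin."""
    return Counter(int(v // BIN_WIDTH) for v in values)


def merge(a, b):
    """Combine two summaries without revisiting the raw rows."""
    return a + b


def estimate_percentile(summary, q):
    """Return the midpoint of the bin that contains the q-th quantile (0 < q <= 1)."""
    total = sum(summary.values())
    if total == 0:
        raise ValueError("empty summary")
    target = q * total
    seen = 0
    for bin_index in sorted(summary):
        seen += summary[bin_index]
        if seen >= target:
            return (bin_index + 0.5) * BIN_WIDTH


# Two hypothetical partitions of a larger dataset.
east = [i * 0.01 for i in range(10_000)]       # values from 0 up to about 100
west = [50 + i * 0.01 for i in range(10_000)]  # values from 50 up to about 150

# Each partition is summarized once; the summaries are then merged and reused.
east_summary = summarize(east)
west_summary = summarize(west)
combined = merge(east_summary, west_summary)

median_estimate = estimate_percentile(combined, 0.5)
all_values = sorted(east + west)
exact_median = all_values[len(all_values) // 2]
print(median_estimate, exact_median)
```

The point of the sketch is that the merged summary answers a percentile query over both partitions without rescanning the raw rows, which is the property that lets a summary be stored in an aggregate table and reused. A per-partition median, by contrast, cannot be combined into a correct overall median. t-Digest offers the same mergeability with much better accuracy in the tails than this fixed-bin toy.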

In this three-part blog series we discuss the benefits of percentile estimates and how to compute them in a Big Data environment.  Subscribe today to learn the best practices of percentile estimation on Big Data and more.  Let's dive right in!





Topics: Hadoop, bi-on-hadoop, Analytics, BI on Big Data, percentiles
