It is no surprise that as businesses discover more ways to turn data into value, the appetite for storing large amounts of data keeps growing. Low storage costs on cloud platforms such as Google BigQuery, Amazon Web Services, and Microsoft Azure allow enterprises to house vast quantities of data with minimal risk. Yet analysts note that up to 75% of that data goes unused, meaning most companies are not truly running their businesses on data.
The primary reason for this missed opportunity is excessively long “time to insight.”
The cause of most companies’ lengthy time to insight is their inability to rapidly obtain answers to key business questions by querying their data warehouses. Companies suffering from a slow “time to insight” struggle to effectively leverage the large amounts of data at their disposal. All too often, by the time a company receives the answer to a business question, the answer has lost its value to the business.
Challenges that lengthen time to insight:
Today’s average enterprise uses more than two BI tools to analyze its data. For enterprises using business intelligence tools with a cloud data warehouse, highly available concurrent access is a must-have to deliver self-service BI to end users. Gartner defines self-service BI as a scenario in which line-of-business professionals are “enabled and encouraged to perform queries and generate reports on their own, with nominal IT support.” Efficient data-driven organizations allow users to generate as many queries as needed against their data lake, enabling many departments and teams to gain rapid insights from the data they own. Supporting rapid query generation by large numbers of users is easier said than done, primarily because of data warehouses’ limitations on query concurrency. Currently, Google BigQuery, Amazon Redshift, and Microsoft Azure enforce concurrency limits of 50 or fewer simultaneous queries. Enterprises with many teams and users can quickly hit these limits, stretching time to insight to weeks or even months and preventing the enterprise from receiving actionable data in time to answer a pressing business question.
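To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python of how a hard concurrency cap compounds into long waits; the cap, query runtime, and query volume below are illustrative assumptions, not any vendor’s actual figures.

```python
# Illustrative sketch: queries beyond a warehouse's concurrency cap queue
# up and drain in "waves," so peak wait time grows with query volume.
# All three numbers are hypothetical.
CONCURRENCY_LIMIT = 50     # assumed max simultaneous queries
AVG_QUERY_SECONDS = 30     # assumed average query runtime
peak_queries = 500         # queries issued by BI users at peak

waves = -(-peak_queries // CONCURRENCY_LIMIT)  # ceiling division
total_wait = waves * AVG_QUERY_SECONDS
print(f"{waves} waves of queries -> ~{total_wait}s before the last user sees results")
```

Ten waves of a thirty-second query already mean a five-minute wait for the last user at peak, and real dashboards routinely fan out into far more queries than this.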
A strong, centrally defined data model is integral to reducing the number of queries that BI users must write, which in turn speeds up time to insight. When end users can navigate data without constantly defining new queries, they get consistent results in a timely manner. A centrally defined BI model can be difficult for enterprises to achieve, especially when working across multiple business intelligence tools and data warehouses. Many data warehouses have no interface for creating a business-centered data model, which results in multiple models spread across various BI tools and data warehouses. Each of these models must be updated individually as new data arrives or as business priorities change, which greatly hinders time to insight.
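As a rough illustration of what “centrally defined” means in practice, the Python sketch below declares facts, dimensions, and relationships once and has every BI tool read the same copy; all table, column, and tool names are hypothetical.

```python
# A single shared model definition: facts, dimensions, and the joins
# between them live in one place rather than in each BI tool.
SEMANTIC_MODEL = {
    "facts": {"sales_fact": ["revenue", "units_sold"]},
    "dimensions": {
        "date_dim": ["year", "quarter", "month"],
        "product_dim": ["category", "brand"],
    },
    "relationships": [
        ("sales_fact.date_key", "date_dim.date_key"),
        ("sales_fact.product_key", "product_dim.product_key"),
    ],
}

# Two different BI tools consuming the one shared definition: a change to
# SEMANTIC_MODEL propagates to both instead of being re-coded per tool.
for tool in ("BI tool A", "BI tool B"):
    print(tool, "sees measures:", SEMANTIC_MODEL["facts"]["sales_fact"])
```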
How AtScale solves these challenges to drastically cut down time to insight:
To ensure successful use of BI tools on a data warehouse, enterprises should design a business-centered model that defines facts, dimensions, and clear relationships between their data points, so end users can consume information and gain insights through their BI tool of choice. AtScale allows business users to design, develop, and publish this type of model. The AtScale virtual cube designer is based on concepts that BI developers already understand. Models produced through AtScale contain the metadata that BI applications need to browse the data, and they query data directly from the data lake. AtScale makes the data lake look like any other multidimensional data mart or relational data warehouse, without ETL, processing, or moving the data out of the GCP environment. Because AtScale’s universal semantic layer involves no data movement, it minimizes the back-and-forth between IT and BI teams when preparing data for consumption, reducing preparation time from weeks to minutes. This reduction in data prep time drastically cuts an enterprise’s overall time to insight.
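As a sketch of the idea (not AtScale’s actual implementation), the snippet below shows how a semantic layer can rewrite a BI tool’s cube-style request, a measure browsed by a dimension, into SQL that runs directly against the data lake, so no data is extracted or copied; the table and column names are assumptions for illustration.

```python
# Hypothetical pushdown: a (measure, dimension) browse request from a BI
# tool becomes SQL executed in place on the data lake / warehouse.
def rewrite_cube_query(measure: str, dimension: str) -> str:
    return (
        f"SELECT d.{dimension}, SUM(f.{measure}) AS {measure}\n"
        f"FROM sales_fact f\n"
        f"JOIN date_dim d ON f.date_key = d.date_key\n"
        f"GROUP BY d.{dimension}"
    )

# A user browsing revenue by quarter never writes SQL; the layer does.
print(rewrite_cube_query("revenue", "quarter"))
```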
Empowering users with interactive modeling and data consumption frees up time to invest in the things that matter: complex analysis and deeper insights from data. However, adding many users who query data can still cause performance issues for the underlying data warehouse because of its concurrency limitations. AtScale prevents this problem with aggregate tables. These tables contain measures from one or more fact tables, with values pre-aggregated at the level of one or more dimensional attributes; if no dimensional attributes are included, the aggregated data is a grand total of the values for the included measures. AtScale aggregates reduce the number of rows a query has to scan to produce the results for a report or dashboard, dramatically shortening the time needed to return them. Because the data warehouse handles queries against much smaller data sets, it executes them faster and completes more queries in the same amount of time, resulting in reduced time to insight.
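The sketch below, with hypothetical table names and row counts, shows the mechanic: the raw query must scan and group every fact row, while the aggregate table already holds one pre-summed row per (month, product), so the same answer comes from a tiny fraction of the rows.

```python
# Same logical question, two physical targets. The aggregate is built
# once, then every matching query is answered from it.
RAW_QUERY = """
SELECT month, product, SUM(revenue) AS revenue
FROM sales_fact                      -- e.g., billions of raw rows
GROUP BY month, product
"""

AGG_QUERY = """
SELECT month, product, revenue       -- already summed at build time
FROM agg_sales_by_month_product      -- e.g., thousands of rows
"""

raw_rows, agg_rows = 2_000_000_000, 24_000   # illustrative row counts
print(f"rows scanned shrink by roughly {raw_rows // agg_rows:,}x")
```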
In short, AtScale’s business-centric data models and aggregate tables address the two major pain points behind slow time to insight. Enterprises in the retail, automotive, and financial services industries have used AtScale to bring their time to insight from weeks down to hours.