Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze an organization’s data using existing business intelligence tools. Redshift is fast. Very fast! … except when it’s not. Right out of the box, Redshift can search through vast amounts of data and provide complex analytic results in shockingly short query times. Then why do we spend significant time with clients improving slow queries as their data volumes grow?

The 3 Major Reasons for Slow Redshift Queries

There are three primary reasons for poorly performing queries in Amazon Redshift:

  1. Ineffective zone maps
  2. Excessive network traffic
  3. Too many query steps

Addressing these three issues can resolve the majority of slow Redshift queries. These issues generally surface as datasets begin to grow large and cluster size increases to match. This blog post will focus on the first issue, ineffective zone maps. Excessive network traffic and numerous query steps will be addressed in subsequent blog posts.

To the first order, performance on Redshift is proportional to how much data is moved, how far it is moved and how often it is moved. If a query is running 10x slower than expected, it generally means too much data is being moved, too far and too often.

What is an Ineffective Zone Map?

The best way to look through large amounts of data is to NOT look through large amounts of data! Zone Maps allow Redshift to do this.

Zone maps (sometimes also called “block metadata”) are the information that describes the contents of each 1M block of each column of each table on your Redshift cluster. It is this metadata that enables Redshift to quickly pare down the needed data for a query. If zone maps aren’t being effectively used, significantly more data has to be read into the cluster from disk.

We often find in customer engagements that shortly after a Redshift data warehouse solution goes into production, the data being managed dwarfs the dataset used for testing and some query times can start to extend significantly.

What is “Block Rejection” and How to Analyze It?

We call the ability for Redshift to exclude blocks of data from being read into the cluster from disk as “block rejection”. Redshift uses the zone map metadata to determine whether or not data must be read to resolve a particular query – that is, it can reject many blocks outright, without having to read their data at all! Good block rejection rates allow Redshift to maintain high query performance on very large tables. Reading terabytes of data across a disk interface takes significant time and most queries do not require all the data in these tables.

Luckily, it is quite apparent when your queries are being slowed down by insufficient block rejection. The Redshift console will alert you in many cases when large amounts of a table are being read from disk but not used by the query. In the cases where there are not alerts, it is still simple to see if block rejection is occurring at all and to the expected degree:

AWS Redshift console -> Clusters -> {your cluster name} -> Queries -> {your query}

Under “Query Execution Details” you will see the Explain plan. Look for the tab called “Actual.” Find the step that is labeled “Sequential Scan on {table}.” This will bring up a sub-window. Find the step labeled “Range restricted Scan on {table}.” This will tell you how many rows of data were “scanned” from disk (brought into the compute node from disk) and how many were needed by this step in the query plan. The Redshift console may place a red triangle “alert” next to the table scan if its heuristics indicate that there may be a problem with block rejection. This alert should be viewed as a hint, not a truth. Review your large table scans for high amounts of row scanning when the resulting rows are not needed.

Example red triangle “alert” next to the table scan letting us know there may problem with block rejection and rows scanned vs. returned
Example red triangle “alert” next to the table scan letting us know there may problem with block rejection and rows scanned vs. returned

If you find that your long query is reading many more rows from disk than you think are needed and there is poor or no block rejection, there are many tools that can help resolve this issue. These include:

If you and your team find that you need help with zone maps and block rejection, 47Lining’s team of experts offer Redshift “Blackbelt” Optimization services. Contact us today to discuss your needs.

Next up, excessive network traffic…

— The Redshift Whisperer