> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-fix-nav-issues.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

> Article describing what lazy materialization is in ClickHouse

# Lazy materialization

This article describes how lazy materialization works and how it fits into ClickHouse’s broader I/O optimization stack.
It presents a real-world example demonstrating how lazy materialization increases query performance.

<Tip>
  **Available from version 25.4**

  Lazy materialization was introduced in version 25.4 of ClickHouse, and is turned on by default.
</Tip>

<h2 id="overview">
  Overview
</h2>

Over the years, ClickHouse has introduced a series of layered optimizations to aggressively reduce I/O.
These techniques form the foundation of its speed and efficiency:

| Optimization                                                                                                                                                                                                                                                   | Description                                                                                                                                                                                                                                                                           |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [**Columnar storage**](/get-started/about/intro#row-oriented-vs-column-oriented-storage)                                                                                                                                                                       | Allows skipping entire columns that aren't needed for a query and also enables high compression by grouping similar values together, minimizing I/O during data loading.                                                                                                              |
| [**Sparse primary indexes**](/guides/clickhouse/data-modelling/sparse-primary-indexes) \| [**secondary data-skipping indexes**](/concepts/features/performance/skip-indexes/skipping-indexes) \| [**projections**](/concepts/features/projections/projections) | Prune irrelevant data by identifying which [granules](/concepts/core-concepts/glossary#granule) (row blocks) might match filters on *indexed columns*. These techniques operate at the granule level and can be used individually or in combination.                                  |
| [**PREWHERE**](/reference/statements/select/prewhere)                                                                                                                                                                                                          | Checks matches also for filters on *non-indexed* columns to skip data early that would otherwise be loaded and discarded. It can work independently or refine the granules selected by indexes, complementing granule pruning by skipping rows that don't match *all* column filters. |
| [**Query condition cache**](/concepts/features/performance/caches/query-condition-cache)                                                                                                                                                                       | Speeds up repeated queries by remembering which granules matched all filters last time. ClickHouse can then skip reading and filtering granules that didn't match, even if the query shape changes.                                                                                   |

While the aforementioned I/O optimizations can significantly reduce data read, they still assume that all columns for rows passing the `WHERE` clause must be loaded before running operations like sorting, aggregation, or `LIMIT`. But what if some columns aren’t needed until later, or some data, despite passing the `WHERE` clause, is never needed at all?
That’s where lazy materialization comes in. It is an orthogonal enhancement that completes the I/O optimization stack:

* Indexing, together with `PREWHERE`, ensures that only rows matching column filters in the `WHERE` clause are processed.
* Lazy materialization builds on this by deferring column reads until they’re actually required by the query execution plan.
  Even after filtering, only the columns needed for the next operation - such as sorting - are loaded immediately.
  Others are postponed and, due to `LIMIT`, are often read only partially, just enough to produce the final result.
  This makes lazy materialization especially powerful for Top N queries, where the final result may only require a handful of rows from certain, often large, columns.

<h2 id="worked-example">
  A worked example
</h2>

We highly recommend the blogpost ["ClickHouse gets lazier (and faster): Introducing lazy materialization"](https://clickhouse.com/blog/clickhouse-gets-lazier-and-faster-introducing-lazy-materialization#speed-without-filters-lazy-materialization-in-isolation)
for a deep dive on lazy materialization. The example below is taken from the aforementioned blogpost and reproduced here to demonstrate how a ClickHouse query can go from 219 seconds to just 139 milliseconds (a 1576× speedup) with lazy materialization.

To benefit from indexing and `PREWHERE`, a query needs filters, on primary key columns for indexing, and on any columns for `PREWHERE`.
Lazy materialization layers cleanly on top, but unlike the other optimizations mentioned previously, it can also speed up queries with no column filters at all.

Consider the following example query which finds Amazon reviews with the highest number of helpful votes, regardless of date, product, rating, or verification status, and returns the top 3 along with their title, headline, and full text.

First running the query (with cold filesystem caches) with lazy materialization disabled (using [`query_plan_optimize_lazy_materialization`](/reference/settings/session-settings#query_plan_optimize_lazy_materialization)):

```sql title="Query" theme={null}
SELECT
    helpful_votes,
    product_title,
    review_headline,
    review_body
FROM amazon.amazon_reviews
ORDER BY helpful_votes DESC
LIMIT 3
FORMAT Vertical
SETTINGS
    query_plan_optimize_lazy_materialization = false;
```

```response title="Response" highlight={22-23} theme={null}
Row 1:
──────
helpful_votes:   47524
product_title:   Kindle: Amazon's Original Wireless Reading Device (1st generation)
review_headline: Why and how the Kindle changes everything
review_body:     This is less a \"pros and cons\" review than a hopefully use...

Row 2:
──────
helpful_votes:   41393
product_title:   BIC Cristal For Her Ball Pen, 1.0mm, Black, 16ct (MSLP16-Blk)
review_headline: FINALLY!
review_body:     Someone has answered my gentle prayers and FINALLY designed ...

Row 3:
──────
helpful_votes:   41278
product_title:   The Mountain Kids 100% Cotton Three Wolf Moon T-Shirt
review_headline: Dual Function Design
review_body:     This item has wolves on it which makes it intrinsically swee...

0 rows in set. Elapsed: 219.071 sec. Processed 150.96 million rows, 71.38 GB (689.08 thousand rows/s., 325.81 MB/s.)
Peak memory usage: 1.11 GiB.
```

Next the query is rerun (again with a cold filesystem cache), but this time with lazy materialization enabled:

```sql title="Query" highlight={11} theme={null}
SELECT
    helpful_votes,
    product_title,
    review_headline,
    review_body
FROM amazon.amazon_reviews
ORDER BY helpful_votes DESC
LIMIT 3
FORMAT Vertical
SETTINGS
query_plan_optimize_lazy_materialization = true;
```

<Tip>
  Ordinarily you don't need to explicitly set `query_plan_optimize_lazy_materialization = true` to gain the benefit of lazy materialization.
  It is enabled by default.
</Tip>

```response title="Response" highlight={22-23} theme={null}
Row 1:
──────
helpful_votes:   47524
product_title:   Kindle: Amazon's Original Wireless Reading Device (1st generation)
review_headline: Why and how the Kindle changes everything
review_body:     This is less a \"pros and cons\" review than a hopefully use...

Row 2:
──────
helpful_votes:   41393
product_title:   BIC Cristal For Her Ball Pen, 1.0mm, Black, 16ct (MSLP16-Blk)
review_headline: FINALLY!
review_body:     Someone has answered my gentle prayers and FINALLY designed ...

Row 3:
──────
helpful_votes:   41278
product_title:   The Mountain Kids 100% Cotton Three Wolf Moon T-Shirt
review_headline: Dual Function Design
review_body:     This item has wolves on it which makes it intrinsically swee...

0 rows in set. Elapsed: 0.139 sec. Processed 150.96 million rows, 1.81 GB (1.09 billion rows/s., 13.06 GB/s.)
Peak memory usage: 3.80 MiB.
```

Consider the difference in performance with lazy materialization turned off and turned on:

| Metric       | Lazy materialization off | Lazy materialization on | Improvement    |
| ------------ | ------------------------ | ----------------------- | -------------- |
| Elapsed time | 219.071 sec              | 0.139 sec               | \~1576× faster |
| Data read    | 71.38 GB                 | 1.81 GB                 | \~40× less     |
| Peak memory  | 1.11 GiB                 | 3.80 MiB                | \~300× less    |

<h2 id="confirm-lazy-materialization">
  How to confirm lazy materialization in the query execution plan
</h2>

You can observe usage of lazy materialization for the previous query by inspecting the query’s logical execution plan using the [`EXPLAIN`](/reference/statements/explain) clause:

```sql theme={null}
EXPLAIN actions = 1
SELECT
    helpful_votes,
    product_title,
    review_headline,
    review_body
FROM amazon.amazon_reviews
ORDER BY helpful_votes DESC
LIMIT 3
SETTINGS
    query_plan_optimize_lazy_materialization = true;
```

```response highlight={2} theme={null}
...
Lazily read columns: review_headline, review_body, product_title
  Limit
    Sorting
      ReadFromMergeTree
```

You can read the operator plan from bottom to top and observe that ClickHouse defers reading the three large String columns until after sorting and limiting.
