Introduction
This guide focuses on the most common and effective performance optimizations for ClickStack, sufficient to optimize the majority of real-world observability workloads, typically up to tens of terabytes of data per day. The optimizations are presented in a deliberate order, starting with the simplest and highest impact techniques and progressing toward more advanced and specialized tuning. Early optimizations should be applied first and will often deliver substantial gains on their own. As data volumes grow and workloads become more demanding, the latter techniques become increasingly worthwhile to explore.ClickHouse concepts
Before applying any of the optimizations described in this guide, it’s important to be familiar with a few core ClickHouse concepts. In ClickStack, each data source maps directly to one or more ClickHouse tables. When using OpenTelemetry, ClickStack creates and manages a set of default tables that store logs, traces, and metrics data. If you’re using custom schemas or managing your own tables, you may already be familiar with these concepts. However, if you’re simply sending data via the OpenTelemetry Collector, these tables are created automatically, and are where all optimizations described below will be applied.| Data type | Table |
|---|---|
| Logs | otel_logs |
| Traces | otel_traces |
| Metrics (gauges) | otel_metrics_gauge |
| Metrics (sums) | otel_metrics_sum |
| Metrics (histogram) | otel_metrics_histogram |
| Metrics (Exponential histograms) | otel_metrics_exponentialhistogram |
| Metrics (summary) | otel_metrics_summary |
| Sessions | hyperdx_sessions |
default database is used - this can be modified in the OpenTelemetry collector.
At a minimum, you should understand the following ClickHouse fundamentals:
| Concept | Description |
|---|---|
| Tables | How data sources in ClickStack correspond to underlying ClickHouse tables. Tables in ClickHouse mainly use the MergeTree engine. |
| Parts | How data is written in immutable parts and merged over time. |
| Partitions | Partitions group the data parts of a table into organised logical units. These units are easier to manage, query, and optimize. |
| Merges | The internal process that merges parts together to ensure that there are a lower number of parts to query. Essential for maintaining query performance. |
| Granules | The smallest unit of data that ClickHouse reads and prunes during query execution. |
| Primary (ordering) keys | How the ORDER BY key determines on-disk data layout, compression, and query pruning. |
- Creating tables in ClickHouse - A simple introduction to tables.
- Parts
- Partitions
- Merges
- Primary keys/indexes
- How ClickHouse stores data: parts and granules - More advanced guide on how data is structured and queried in ClickHouse, covering granules and primary keys in detail.
- MergeTree- Advanced MergeTree reference guide useful for commands and for internal specifics.
Optimization 1. Materialize frequently queried attributes
The first and simplest optimization for ClickStack users is to identify commonly queried attributes inLogAttributes, ScopeAttributes, and ResourceAttributes, and promote them to top-level columns using materialized columns.
This optimization alone is often sufficient to scale ClickStack deployments to tens of terabytes per day and should be applied before considering more advanced tuning techniques.
Why materialize attributes
ClickStack stores metadata such as Kubernetes labels, service metadata, and custom attributes inMap(String, String) columns. While this provides flexibility, querying map subkeys has an important performance implication.
When querying a single key from a Map column, ClickHouse must read the entire map column from disk. If the map contains many keys, this results in unnecessary IO and slower queries compared to reading a dedicated column.
Materializing frequently accessed attributes avoids this overhead by extracting the value at insert time and storing it as a first-class column.
Materialized columns:
- Are computed automatically during inserts
- Can’t be explicitly set in INSERT statements
- Support any ClickHouse expression
- Allow type conversion from String to more efficient numeric or date types
- Enable skip indexes and primary key usage
- Reduce disk reads by avoiding full map access
ClickStack automatically detects materialized columns extracted from maps and transparently uses them during query execution, even when users continue to query the original attribute path.
Example
Consider the default ClickStack schema for traces, where Kubernetes metadata is stored inResourceAttributes:
ResourceAttributes.k8s.pod.name:"checkout-675775c4cc-f2p9c":
This results in a SQL predicate similar to:
ResourceAttributes column for each matching row - potentially very large if the Map contains many keys.
If this attribute is queried frequently, it should be materialized as a top-level column.
To extract the pod name at insert time, add a materialized column:
PodName.
Users can now query for pod names efficiently, using Lucene syntax e.g. PodName:"checkout-675775c4cc-f2p9c"
For newly inserted data, this avoids map access entirely and significantly reduces I/O.
However, even if users continue querying the original attribute path e.g. ResourceAttributes.k8s.pod.name:"checkout-675775c4cc-f2p9c", ClickStack will automatically rewrite the query internally to use the materialized PodName column i.e. using the predicate:
By default, materialized columns are excluded from
SELECT * queries. This preserves the invariant that query results can always be reinserted into the table.Materializing historical data
Materialized columns only apply automatically to data inserted after the column is created. For existing data, queries against the materialized column will transparently fall back to reading from the original map. If historical performance is critical, the column can be backfilled using a mutation e.g.system.mutations table e.g.
is_done = 1 for the corresponding mutation.
Optimization 2. Adding skip indexes
After materializing frequently queried attributes, the next optimization is to add data skipping indexes to further reduce the amount of data ClickHouse needs to read during query execution. Skip indexes allow ClickHouse to avoid scanning entire blocks of data when it can determine that no matching values exist. Unlike traditional secondary indexes, skip indexes operate at the granule level and are most effective when query filters exclude large portions of the dataset. When used correctly, they can significantly accelerate filtering on high-cardinality attributes without changing query semantics. Consider the default traces schema for ClickStack, which includes skip indexes:- High-cardinality string filtering, such as TraceId, session identifiers, attribute keys, or values
- Map subkey filtering accelerated by text indexes on the
*AttributeItemscolumns - Numeric range filtering, such as span duration
text(tokenizer = 'array') indexes throughout instead of Bloom filters, and adds a text(tokenizer = 'splitByNonAlpha') index on lower(Body) for full-text search. See “Tables and schemas used by ClickStack” for the full DDL.
Bloom filters
Bloom filter indexes are the most commonly used skip index type in ClickStack. They’re well-suited for string columns with high cardinality, typically at least tens of thousands of distinct values. A false-positive rate of 0.01 with granularity 1 is a good default starting point and balances storage overhead with effective pruning. Continuing the example from Optimization 1, suppose the Kubernetes pod name has been materialized from ResourceAttributes:PodName:"checkout-675775c4cc-f2p9c".
Bloom filters are most effective when the distribution of values is such that a given value appears in a relatively small number of parts. This often occurs naturally in observability workloads where metadata like pod names, trace IDs, or session identifiers is correlated with time, and therefore clustered by the table’s ordering key.
As with all skip indexes, Bloom filters should be added selectively and validated against real query patterns to ensure they provide measurable benefit - see “Evaluating skip index effectiveness.”
Text indexes
Text indexes offer an alternative to Bloom filters. A Bloom filter is a probabilistic structure that can definitively rule out granules, but it has a false-positive rate, so the granules it doesn’t exclude must still be loaded and evaluated against theWHERE condition. Text indexes are inverted indexes that map tokens to exact offsets within a part. Because they evaluate offsets rather than granules and produce no false positives, they can usually answer the WHERE condition without loading the underlying column. This is an optimization known as direct read. Since loading data is often the largest contributor to query time, direct read can meaningfully reduce query latency.
Additionally, text indexes are themselves queryable, powering autocomplete and other introspection in ClickStack.
Two tokenizers cover most ClickStack patterns:
| Tokenizer | Used for | Typical column |
|---|---|---|
array | Indexing Array(String) elements as whole tokens | mapKeys(...), *AttributeItems |
splitByNonAlpha | Word-level full-text search on prose strings | Body, lower(Body), SpanName |
Array tokenizer for Map and array columns
The default logs schema indexesmapKeys and the materialized item arrays with
the array tokenizer:
splitByNonAlpha for log bodies
Full-text search on the Body column benefits from a splitByNonAlpha text
index. ClickStack defines this index on lower(Body) so case-insensitive
Lucene searches can use it:
text(tokenizer = 'splitByNonAlpha') index on
lower(Body), it rewrites implicit-column Lucene queries like error or
"connection refused" into hasAllTokens(lower(Body), lower(...)), which the
index can satisfy without reading the full Body column. For most
observability log workloads this is the single largest filtering speedup
available.
Text indexes vs
tokenbf_v1The older tokenbf_v1 index type (still used in the default traces schema for
lower(SpanName)) is functionally similar but deprecated in ClickHouse 26.2
and above. New text-search indexes should use text(tokenizer = ...).Text indexes in the default logs schema
The defaultotel_logs schema synced from upstream ships every text index discussed above: text(tokenizer = 'array') on TraceId, on each mapKeys(...) and *AttributeItems array, and text(tokenizer = 'splitByNonAlpha') on lower(Body) for full-text search. For the canonical DDL, see “Tables and schemas used by ClickStack”; the same schema is reproduced below.
Min-max indexes
Minmax indexes store the minimum and maximum value per granule and are extremely lightweight. They’re particularly effective for numeric columns and range queries. While they may not accelerate every query, they’re low-cost and almost always worth adding for numeric fields. Minmax indexes work best when numeric values are either naturally ordered or confined to narrow ranges within each part. Suppose a Kafka offset is frequently queried fromSpanAttributes:
Materialize skip index
After a skip index has been added, it only applies to newly ingested data. Historical data won’t benefit from the index until it’s explicitly materialized. If you have already added a skip index, for example:Materializing skip indexesMaterializing a skip index is typically lightweight and safe to run, especially for minmax indexes. For Bloom filter indexes on large datasets, users may prefer to materialize on a per-partition basis to better control resource usage e.g.
is_done = 1 for the corresponding mutation.
Once complete, confirm that the index data has been created:
0.01 to 0.05 produces a smaller index that evaluates faster, at the cost of less aggressive pruning. While fewer granules may be skipped, overall query latency can improve due to faster index evaluation.
Tuning Bloom filter parameters is therefore a workload-dependent optimization and should be validated using real query patterns and production-like data volumes.
For further details on skip indexes, see the guide “Understanding ClickHouse data skipping indexes.”
Evaluating skip index effectiveness
The most reliable way to evaluate skip index pruning is to useEXPLAIN indexes = 1, which shows how many parts and granules are eliminated at each stage of query planning. In most cases, you want to see a large reduction in granules at the Skip stage, ideally after the primary key has already reduced the search space. Skip indexes are evaluated after partition pruning and primary key pruning, so their impact is best measured relative to the remaining parts and granules.
EXPLAIN confirms whether pruning occurs, but it doesn’t guarantee a net speedup. Skip indexes have a cost to evaluate, especially if the index is large. Always benchmark queries before and after adding and materializing an index to confirm real performance improvements.
For example, consider the default Bloom filter skip index for TraceId included with the default Traces schema:
EXPLAIN indexes = 1 to see how effective it’s for a selective query:
FORMAT Null to avoid result serialization overhead, and disable the query condition cache to keep runs repeatable:
use_query_condition_cache ensures results aren’t affected by cached filtering decisions, and setting use_skip_indexes = 0 provides a clean baseline for comparison. If the pruning is effective and index evaluation cost is low, the indexed query should be materially faster, as in the example above.
When to add skip indexes
Skip indexes should be added selectively, based on the types of filters users run most frequently and the shape of the data in parts and granules. The goal is to prune enough granules to offset the cost of evaluating the index itself, which is why benchmarking on production-like data is essential. For numeric columns that are used in filters, a minmax skip index is almost always a good choice. It’s lightweight, cheap to evaluate, and can be effective for range predicates - especially when values are loosely ordered, or confined to narrow ranges inside parts. Even when minmax doesn’t help a specific query pattern, its overhead is typically low enough that it’s still reasonable to keep. For string columns, prefer text indexes where supported; fall back to Bloom filters otherwise. Text indexes accelerate the same equality andIN filters that Bloom filters do, and additionally enable token-based predicates (hasToken, hasAllTokens, has) used by full-text search and the Map direct read optimization. On older clusters that don’t yet support text indexes, Bloom filters remain a solid choice.
Bloom filters are most effective for high-cardinality string columns where each value has relatively low frequency, meaning most parts and granules don’t contain the searched value. As a rule of thumb, Bloom filters are most promising when the column has at least 10,000 distinct values, and often perform best with 100,000+ distinct values. They’re also more effective when matching values are clustered into a small number of sequential parts, which typically happens when the column is correlated with the ordering key. Again, your mileage here may vary - nothing replaces real world-testing.
Optimization 3. Map direct read
When you filter on a Map subkey such asLogAttributes['k8s.pod.name'] = 'checkout', ClickHouse must read the entire LogAttributes Map column from
disk and unpack every row to evaluate the predicate. Materializing frequently queried attributes
solves this for keys you know about ahead of time, but it doesn’t scale to
arbitrary attributes that users filter on ad hoc.
Even if a schema has indexes on mapKeys and mapValues, those indexes can tell you whether a row has a given key, and whether it has a given value, but not whether the key and value belong to the same entry. In other words, mapKeys answers mapContainsKey(ResourceAttributes, 'foo') and mapValues answers mapContainsValue(ResourceAttributes, 'bar'), but neither answers ResourceAttributes['foo'] = 'bar'.
By concatenating the keys and values into a single Array(String) column, the
Map direct read optimization enables ResourceAttributes['foo'] = 'bar' to be
answered without loading the underlying map. Maps are often large, and increase
in size with an increase in volume. Combined with an application-level query
rewrite, equality filters on any Map subkey become a single has(...) call
backed by that index, with no Map deserialization at query time. Additionally,
the only storage cost paid is for the text index, as the underlying column is
an ALIAS column and is not stored.
This optimization is automatic. ClickStack ships the necessary columns and
indexes in the default logs and trace tables, and rewrites Map subscript
filters at runtime when the connected ClickHouse server supports the underlying
primitive. If your schema does not contain these columns, or you have
additional Map columns you want to accelerate beyond the defaults, read on to
enable them.
Schema
For each Map column you want to accelerate, ClickStack defines anArray(String) ALIAS column that joins each key and value with =:
text(tokenizer = 'array') skip index on
the ALIAS column stores one token per key=value pair, which ClickHouse uses
to prune granules without touching the source Map:
| Table | ALIAS columns | Text indexes |
|---|---|---|
otel_logs | ResourceAttributeItems, ScopeAttributeItems, LogAttributeItems | idx_res_attr_items, idx_scope_attr_items, idx_log_attr_items |
otel_traces | ResourceAttributeItems, SpanAttributeItems | idx_res_attr_items, idx_span_attr_items |
Query rewrite
When a user filters on a Map subkey through the ClickStack UI or the SDK, ClickStack rewrites:LogAttributeItems, prunes entire
rows that don’t contain the key=value token, and never deserializes the
source LogAttributes Map for non-matching rows. For high-cardinality
observability workloads this typically delivers an order-of-magnitude reduction
in I/O over Map subscript access.
The rewrite happens automatically — saved queries, dashboards, and alerts that
reference LogAttributes['key'] see the speedup without any change.
ClickHouse version requirements
The query rewrite requires a ClickHouse version that supports direct token-level pruning on text-indexed array columns. ClickStack detects the connected server version (SELECT version(), cached per connection) and only
emits the rewritten form when the server is at or above the threshold. Older
servers fall back to the original Map subscript form automatically.
| ClickHouse branch | Minimum version |
|---|---|
| 26.2 | 26.2.19.43 |
| 26.3 | 26.3.12.3 |
| 26.4 | 26.4.3.37 |
| 26.5+ | All versions |
Why ALIAS, not MATERIALIZEDThe items array is a window onto data that already lives in the Map column.
Storing it twice — once in the Map, once in the array — would double write I/O
without unlocking new query patterns. The text index on the
ALIAS column is
built at insert time from the same source data, so the optimization adds only
the index footprint to disk.Optimization 4. Modifying the primary key
The primary key is one of the most important components of ClickHouse performance tuning for most workloads. To tune it effectively, you must understand how it works and how it interacts with your query patterns. Ultimately, the primary key should align with how users access the data, particularly which columns are most commonly filtered on. While the primary key also influences compression and storage layout, its primary purpose is query performance. In ClickStack, the out-of-the-box primary keys are already optimized for the most common observability access patterns and for strong compression. The default keys for logs, traces, and metrics tables are designed to perform well for typical workflows. Filtering on columns that appear earlier in the primary key is more efficient than filtering on columns that appear later. While the default configuration is sufficient for most users, there are cases where modifying the primary key can improve performance for specific workloads.A note on terminologyThroughout this document, the term “ordering key” is used interchangeably with “primary key.” Strictly speaking, these differ in ClickHouse, but for ClickStack, they typically refer to the same columns specified in the table
ORDER BY clause. For details, see the ClickHouse documentation on choosing a primary key that differs from the sorting key.- Logs (
otel_logs) -(toStartOfFiveMinutes(Timestamp), ServiceName, Timestamp) - Traces (
otel_traces) -(ServiceName, SpanName, toDateTime(Timestamp))
Choosing a primary key
First, identify whether your access patterns differ substantially from the defaults for a specific table. For example, if you most commonly filter logs by Kubernetes node before service name, and this represents a dominant workflow, it may justify changing the primary key.Modifying the default primary keyThe default primary keys are sufficient in most cases. Changes should be made cautiously and only with a clear understanding of query patterns. Modifying a primary key can degrade performance for other workflows, so testing is essential.
- Select columns that align with your common filters and access patterns. If you typically start Observability investigations by filtering by a specific column e.g. pod name, this column will be used frequently in
WHEREclauses. Prioritize including these in your key over those that are used less frequently. - Prefer columns that help exclude a large percentage of the total rows when filtered, thus reducing the amount of data that needs to be read. Service names and status codes are often good candidates - in the latter case, only if you filter by values which exclude most rows e.g. filtering by 200 codes will, in most systems, match most rows, in comparison to 500 errors, which will correspond to a small subset.
- Prefer columns that are likely to be highly correlated with other columns in the table. This will help ensure these values are also stored contiguously, improving compression.
GROUP BY(aggregations for charts) andORDER BY(sorting) operations for columns in the ordering key can be made more memory efficient.
Changing the primary key
If you’re confident of your access patterns prior to data ingestion, simply drop and re-create the table for the relevant data type. The example below shows a simple way to create a new logs table with the existing schema, but with a new primary key that includes the columnSeverityText before the ServiceName.
Create new table
Ordering key vs primary keyNote in the above example, you’re required to specify a
PRIMARY KEY and ORDER BY.
In ClickStack, these are almost always the same.
The ORDER BY controls the physical data layout, while the PRIMARY KEY defines the sparse index.
In rare, very large workloads, these may differ, but most users should keep them aligned.Exchange and drop table
TheEXCHANGE statement is used to swap the names of the tables atomically. The temporary table (now the old default table), can be dropped.Backfilling existing data into a new table is rarely worthwhile at scale. The compute and IO cost is usually high, and doesn’t justify the performance benefits. Instead, allow older data to expire via TTL while newer data benefits from the improved key.
SeverityText as the first column in the primary key is used below. In this case, a table is created for new data, retaining the old table for historical analysis.
Create new table
Create the new table with the desired primary key. Note the_23_01_2025 suffix - adapt this to be the current date. e.g.Create a Merge table
The Merge engine (not to be confused with MergeTree) doesn’t store data itself, but allows reading from any number of other tables simultaneously.currentDatabase() assumes the command is run in the correct database. Otherwise, specify the database name explicitly.otel_logs.Update the ClickStack UI to read from the merge table
Configure the ClickStack UI to useotel_logs_merge as the table for the logs data source.At this point, writes continue to otel_logs with the original primary key, while reads use the merge table. There is no visible change for users or impact on ingestion.Exchange the tables
AnEXCHANGE statement is now used to swap the names of the otel_logs and otel_logs_23_01_2025 tables atomically.otel_logs table with the updated primary key. Existing data remains in otel_logs_23_01_2025 and is still accessible via the merge table. The suffix indicates the date the change was applied and represents the latest timestamp contained in that table.This process allows primary key changes with no ingest interruption and no user-visible impact.SeverityNumber should be part of the primary key one week later, rather than the SeverityText. The following process can be adapted as many times as primary key changes are required.
Create new table
Create the new table with the desired primary key. In the example below30_01_2025 is used as our suffix to denote the date of the table. e.g.Exchange the tables
AnEXCHANGE statement is now used to swap the names of the otel_logs and otel_logs_30_01_2025 tables atomically.otel_logs table with the updated primary key. The old data remains in otel_logs_30_01_2025, accessible via the merge table.Redundant tablesIf TTL policies are in place, which is recommended, tables with older primary keys that are no longer receiving writes will gradually empty as data expires. They should be monitored and periodically cleaned up once they contain no data. At present, this cleanup process is manual.
Row lookup acceleration with block columns
The default ClickStack logs schema enables two MergeTree settings that don’t affect query performance directly, but materially speed up row-detail lookups in the ClickStack UI:(_block_number, _block_offset) pair that uniquely identifies it inside a
part. When you click a log row in the ClickStack UI to open the detail panel, ClickStack
issues a follow-up query to fetch that single row. Without block columns, the
row’s WHERE clause must include enough columns — typically the primary key
plus Body and SeverityText — to disambiguate the row. With block columns,
the primary key plus _block_number plus _block_offset is sufficient. Large
columns like Body are never read for the lookup, effectively speeding up the query.
ClickStack detects the setting from the table’s CREATE statement and emits
the leaner WHERE clause automatically when both columns are enabled. No
application config change is required.
To opt into the optimization on an existing logs or traces table:
ALTER. Existing parts continue
to use the old per-row lookup until they’re rewritten by a merge.
Optimization 5. Exploiting materialized views
ClickStack can exploit Incremental Materialized Views to accelerate visualizations that rely on aggregation-heavy queries, such as computing average request duration per minute over time. This feature can dramatically improve query performance and is typically most beneficial for larger deployments, around 10 TB per day and above, while enabling scaling into the petabytes-per-day range. Incremental Materialized Views are in Beta and should be used with care. For details on using this feature in ClickStack, see our dedicated guide “ClickStack - Materialized Views.”Optimization 6. Exploiting Projections
Projections represent a final, advanced optimization that can be considered once materialized columns, skip indexes, primary keys, and materialized views have been evaluated. While projections and materialized views may appear similar, in ClickStack, they serve different purposes, and are best used in different scenarios.ORDER BY key, allowing ClickHouse to prune data more effectively for access patterns that don’t align with the original ordering.
Materialized views can achieve a similar effect by explicitly writing rows into a separate target table with a different ordering key. The key difference is that projections are maintained automatically and transparently by ClickHouse, while materialized views are explicit tables that must be registered and selected intentionally by ClickStack.
When a query targets the base table, ClickHouse evaluates the base layout and any available projections, samples their primary indexes, and selects the layout that can produce the correct result while reading the fewest granules. This decision is made automatically by the query analyzer.
In ClickStack, projections are therefore best suited for pure data reordering, where:
- Access patterns are fundamentally different from the default primary key
- It’s impractical to cover all workflows with a single ordering key
- You want ClickHouse to transparently choose the optimal physical layout
Example projections
Suppose your traces table is optimized for the default ClickStack access pattern:Use wildcardsIn the example projection above, a wildcard (
SELECT *) is used. While selecting a subset of columns can reduce write overhead, it also limits when the projection can be used, since only queries that can be fully satisfied by those columns are eligible. In ClickStack, this often restricts projection usage to very narrow cases. For this reason, it is generally recommended to use a wildcard to maximize applicability.Materializing a projection can take a long time and consume significant resources. Because observability data typically expires via TTL, this should only be done when absolutely necessary. In most cases, it’s sufficient to let the projection apply only to newly ingested data, allowing it to optimize the most frequently queried time ranges, such as the last 24 hours.
SELECT *) and the query filters strongly align with the projection’s ORDER BY.
Queries that filter on TraceId (especially equality) and include a time range would benefit from the above projection. For example:
TraceId, or that primarily filter on other dimensions that aren’t leading in the projection’s ordering key, typically won’t benefit (and may read via the base layout instead).
Projections can also store aggregations (similar to materialized views). In ClickStack, projection-based aggregations aren’t generally recommended because selection depends on the ClickHouse analyzer, and usage can be harder to control and reason about. Instead, prefer explicit materialized views that ClickStack can register and select intentionally at the application layer.
Costs and guidance
- Insert overhead: A
SELECT *projection with a different ordering key is effectively writing the data twice, which increases write I/O and can require additional CPU and disk throughput to sustain ingestion. - Use sparingly: Projections are best reserved for genuinely diverse access patterns where a second physical ordering unlocks meaningful pruning for a large share of queries, for example, two teams querying the same dataset in fundamentally different ways.
- Validate with benchmarks: As with all tuning, compare real query latency and resource usage before and after adding and materializing a projection.
Lightweight projections with _part_offset
Lightweight projections are Beta for ClickStack
_part_offset-based lightweight projections aren’t recommended for ClickStack workloads. While they reduce storage and write I/O, they can introduce more random access at query time, and their production behavior at the observability scale is still being evaluated. This recommendation may change as the feature matures and we gain more operational data._part_offset pointer into the base table, rather than duplicating full rows. This can greatly reduce storage overhead, and recent improvements enable granule-level pruning, making them behave more like true secondary indexes. See:
Alternatives
If you need multiple ordering keys, projections aren’t the only option. Depending on operational constraints and how you want ClickStack to route queries, consider:- Configuring your OpenTelemetry collector to write to two tables with different
ORDER BYkeys, and create separate ClickStack sources for each table. - Create a materialized view as a copy pipeline i.e. attach a materialized view to the main table that selects raw rows into a secondary table with a different ordering key (a denormalization or routing pattern). Create a source for this target table. Examples can be found here.