High Cardinality: What It Is and Why It Breaks Systems

High cardinality describes data with a very large number of unique values, like user IDs, session IDs, or trace IDs. It is a frequent source of performance and cost problems in time-series and observability platforms.

The high-cardinality problem

Many monitoring systems were designed for low-cardinality metrics, such as CPU usage per host. They create a separate series, with its own index entry, for each unique combination of label values. That works well until you add a high-cardinality label.

The moment you tag metrics with something like customer ID or request ID, the number of unique series can jump into the millions. The index balloons, memory use spikes, queries slow down, and in usage-based pricing models your bill climbs sharply. Teams often respond by dropping the very labels that would have made debugging possible.
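To make the jump concrete, here is a minimal sketch of the arithmetic. The label names and counts are made-up assumptions, not measurements from any real system. The worst-case number of series is the product of each label's cardinality; real systems only create series for combinations that actually occur, so treat the result as an upper bound.

```python
import math


def max_series(cardinalities):
    """Upper bound on distinct series: the product of per-label cardinalities."""
    return math.prod(cardinalities.values())


# Hypothetical label cardinalities (assumed for illustration only).
baseline = {"host": 50, "region": 3}
with_customer = {**baseline, "customer_id": 100_000}

print(max_series(baseline))       # 150
print(max_series(with_customer))  # 15000000
```

Adding a single label with 100,000 values multiplies the worst case by 100,000, which is why one extra tag can move a system from hundreds of series to millions.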

The fix is a storage approach that treats unique values as ordinary data rather than as index entries. That is where columnar analytical stores have a structural advantage.

How Arc handles High Cardinality

Arc treats a high-cardinality field as just another column, not as a new indexed series. That means you can keep the labels you actually need for debugging, like user and trace IDs, without watching memory and cost explode.
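The difference between the two models can be sketched with a toy example. This is not Arc's code or API; the event fields and the function names `build_series_index`, `build_columns`, and `latencies_for_user` are hypothetical and exist only to show the idea. In the series-per-label-set model, index size grows with the number of unique label combinations. In the columnar model, the number of columns is fixed by the schema, and a high-cardinality field is just more values in one column.

```python
from collections import defaultdict

# Hypothetical events (assumed data, for illustration only).
events = [
    {"ts": 1, "host": "a", "user_id": "u1", "latency_ms": 12.0},
    {"ts": 2, "host": "a", "user_id": "u2", "latency_ms": 40.0},
    {"ts": 3, "host": "b", "user_id": "u1", "latency_ms": 8.0},
    {"ts": 4, "host": "b", "user_id": "u3", "latency_ms": 95.0},
]


def build_series_index(rows, label_keys):
    """Series model: one index entry per unique combination of label values."""
    index = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in label_keys)
        index[key].append((row["ts"], row["latency_ms"]))
    return index


def build_columns(rows):
    """Columnar model: one list per field, however many unique values it holds."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


def latencies_for_user(columns, user_id):
    """Filter by scanning the user_id column alongside the latency column."""
    return [
        latency
        for uid, latency in zip(columns["user_id"], columns["latency_ms"])
        if uid == user_id
    ]


index = build_series_index(events, ["host", "user_id"])
columns = build_columns(events)

print(len(index))    # 4 entries, one per unique (host, user_id) pair
print(len(columns))  # 4 columns, fixed by the schema
print(latencies_for_user(columns, "u1"))  # [12.0, 8.0]
```

With more users, `index` gains an entry for every new (host, user_id) pair, while `columns` still has four entries and each list simply grows by one value per event. A real columnar engine would add compression and statistics-based skipping on top of this, which the sketch leaves out.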

Arc is a high-performance columnar database that stores data as open Parquet files on storage you own. It ships as a single Go binary and is production-ready in 30 seconds.
