Cardinality in Databases: What It Means

Cardinality

Cardinality is the number of distinct values in a column or dataset. A column of countries has low cardinality. A column of unique user IDs or request traces has high cardinality.

Why cardinality matters

Cardinality matters because many time-series and monitoring systems build an index entry for every unique combination of labels or tags. As the number of unique combinations grows, that index grows with it, and memory use and query time can explode. This is the infamous "cardinality problem" in observability.

It shows up fast in modern systems. Add a label like user ID, container ID, or trace ID, and your cardinality can jump from thousands to millions of unique series. Systems that were fine yesterday start struggling, and your monitoring bill climbs with it.

The way out is a storage model that does not pay a steep penalty for unique values. Columnar systems handle high cardinality far more gracefully than tag-indexed time-series databases.

How Arc handles Cardinality

Arc does not hit the cardinality cliff that tag-indexed time-series databases do. Because it stores data columnar in Parquet rather than building a series index per unique label combination, high-cardinality data like user IDs and trace IDs is just more column data, not an exploding index.

Arc is a high-performance columnar database. Open Parquet on storage you own, single Go binary, production-ready in 30 seconds.

Get Arc->See it live->

Analytical Database

Streaming

AI Memory

By industry

Explore

Read

Migrate from…

Forum

Source & Issues

Real-time chat

Cardinality

Why cardinality matters

How Arc handles Cardinality

Related terms