Vectorized Execution

Vectorized execution is a query processing technique where the engine operates on batches of column values at once, instead of one row at a time. It makes far better use of modern CPUs and is a key reason columnar analytical databases are fast.

Why processing in batches is faster

A traditional query engine processes data row by row, invoking the same logic once for every row. Each invocation carries fixed costs such as function calls, branching, and type checks, and the CPU spends more time on that bookkeeping than on the actual computation. For analytics over billions of rows, that overhead dominates.

A vectorized engine instead processes a batch of values from one column in a tight loop. This plays to how modern CPUs actually work: it keeps data in cache, reduces function-call overhead, and lets the processor use SIMD instructions that operate on many values at once. The result is often an order-of-magnitude speedup on analytical queries.
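The difference in control flow can be sketched in a few lines of Python. This is an illustration only, not how DuckDB or Arc is implemented: the table, the function names (keep, price_of, sum_batch, sum_vectorized), and the batch size of 1024 are all assumptions made for the example. Pure Python does not emit SIMD instructions, so the sketch shows only how the number of operator invocations shrinks when work is done per batch instead of per row.

```python
# Hypothetical sample data: 10,000 orders as (price, quantity) rows.
N = 10_000
rows = [(float(i % 100), i % 7) for i in range(N)]


# Row-at-a-time: the filter and the projection are invoked once per row.
def keep(row, min_qty):
    return row[1] >= min_qty


def price_of(row):
    return row[0]


def sum_row_at_a_time(rows, min_qty):
    total, calls = 0.0, 0
    for row in rows:
        calls += 1  # one round of operator calls per row
        if keep(row, min_qty):
            total += price_of(row)
    return total, calls


# Batch-at-a-time: each column is stored separately and processed in chunks.
prices = [r[0] for r in rows]
quantities = [r[1] for r in rows]


def sum_batch(price_batch, qty_batch, min_qty):
    # One tight loop over a whole batch of column values.
    return sum(p for p, q in zip(price_batch, qty_batch) if q >= min_qty)


def sum_vectorized(prices, quantities, min_qty, batch_size=1024):
    total, calls = 0.0, 0
    for start in range(0, len(prices), batch_size):
        end = start + batch_size
        calls += 1  # one operator call per batch
        total += sum_batch(prices[start:end], quantities[start:end], min_qty)
    return total, calls


row_total, row_calls = sum_row_at_a_time(rows, min_qty=3)
vec_total, vec_calls = sum_vectorized(prices, quantities, min_qty=3)
```

Both functions compute the same filtered sum, but the batch version enters its operator once per 1024 values rather than once per value. In a real engine the inner loop is compiled code over a typed buffer, which is where cache locality and SIMD come into play.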

Vectorized execution pairs naturally with columnar storage, since the data is already laid out one column at a time.
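A small sketch of why the two fit together, using only the Python standard library. The table contents, column names, and batch size of 2048 are hypothetical, and this is not Parquet's or DuckDB's actual in-memory format. It shows only that when each column is one contiguous typed buffer, a batch is just a slice of that buffer and can be handed to an operator without copying or touching the other columns.

```python
from array import array

# Hypothetical table in row layout: (order_id, price, quantity) tuples.
rows = [(i, float(i % 100), i % 7) for i in range(5_000)]

# Column layout: one contiguous typed buffer per column.
columns = {
    "order_id": array("q", (r[0] for r in rows)),
    "price": array("d", (r[1] for r in rows)),
    "quantity": array("q", (r[2] for r in rows)),
}


def batches(column, batch_size=2048):
    """Yield zero-copy views over consecutive slices of one column."""
    view = memoryview(column)
    for start in range(0, len(view), batch_size):
        yield view[start:start + batch_size]


# Aggregate the price column batch by batch; the other columns are never read.
price_batches = list(batches(columns["price"]))
total = sum(sum(batch) for batch in price_batches)
```

With a row layout, producing the same batch would mean walking every row and pulling out one field at a time, which is the per-row work a vectorized engine is trying to avoid.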

How Arc handles vectorized execution

Arc uses DuckDB as its query engine, which is vectorized. Combined with columnar Parquet storage, that is what lets Arc scan and aggregate billions of rows quickly on a single instance.

Arc is a high-performance columnar database. Open Parquet on storage you own, single Go binary, production-ready in 30 seconds.