Schema on Read

Schema on read is an approach where data is stored in its raw form and structure is applied when the data is read or queried, rather than enforced when it is written. It contrasts with schema on write, where structure is fixed at ingestion time.

Schema on read versus schema on write

Schema on write, the traditional database approach, requires you to define the structure up front. Every row must fit the schema before it is stored. This guarantees consistency but is rigid: changing the schema usually means migrating the data already stored, and records that do not fit the structure are rejected or must be reshaped before loading.
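As a rough illustration of enforcement at write time, the sketch below uses Python's built-in sqlite3 module. The table name, column names, and the helper functions build_store and try_insert are hypothetical, chosen only for this example. SQLite is loosely typed by default, so the sketch adds an explicit CHECK constraint to get the strict behavior described above.

```python
import sqlite3


def build_store():
    """Create an in-memory table whose structure is fixed before any data arrives."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE readings (
            sensor_id TEXT NOT NULL,
            -- SQLite is permissive about types, so the CHECK makes the rule explicit.
            temp_c REAL NOT NULL CHECK (typeof(temp_c) = 'real')
        )
        """
    )
    return conn


def try_insert(conn, sensor_id, temp_c):
    """Return True if the row fits the schema and was stored, False if it was rejected."""
    try:
        conn.execute(
            "INSERT INTO readings (sensor_id, temp_c) VALUES (?, ?)",
            (sensor_id, temp_c),
        )
        return True
    except sqlite3.IntegrityError:
        # The row violated NOT NULL or the CHECK constraint, so nothing was written.
        return False


if __name__ == "__main__":
    conn = build_store()
    print(try_insert(conn, "a1", 21.5))    # fits the schema
    print(try_insert(conn, "b2", "warm"))  # wrong type for temp_c
    print(try_insert(conn, "c3", None))    # missing value
```

Only the first row is stored. The other two never reach the table, which is the consistency guarantee, and also the rigidity, of schema on write.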

Schema on read stores the data as-is and interprets the structure at query time. This makes it flexible: it handles evolving and heterogeneous data well, and it suits data lakes where many kinds of data land together. The tradeoff is that some validation and performance work shifts to query time: malformed records surface only when a query reads them, and each query pays the cost of parsing and typing the raw data.
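A minimal sketch of the idea in Python, using only the standard library. The raw lines, the field names, and the read_with_schema helper are all hypothetical and exist only for illustration. The raw records are kept exactly as they arrived, and a schema (here just a mapping from field name to a casting function) is applied when the data is read.

```python
import json

# Raw records stored exactly as they arrived: mixed shapes, mixed types, some junk.
RAW_LINES = [
    '{"sensor_id": "a1", "temp_c": 21.5}',
    '{"sensor_id": "b2", "temp_c": "22", "site": "roof"}',
    '{"sensor": "c3", "temperature": 19}',
    "not json at all",
]


def read_with_schema(lines, schema):
    """Apply a schema (field name -> cast function) while reading raw lines.

    Returns (rows, rejected): rows that fit the schema after casting,
    and the raw lines that could not be interpreted under it.
    """
    rows, rejected = [], []
    for line in lines:
        try:
            record = json.loads(line)
            if not isinstance(record, dict):
                raise ValueError("expected a JSON object")
            # Keep only the fields the schema names, cast to the requested types.
            rows.append({field: cast(record[field]) for field, cast in schema.items()})
        except (ValueError, KeyError, TypeError):
            # Validation happens here, at read time, not at ingestion.
            rejected.append(line)
    return rows, rejected


if __name__ == "__main__":
    rows, rejected = read_with_schema(RAW_LINES, {"sensor_id": str, "temp_c": float})
    print(rows)
    print(rejected)

    # The same stored data can be read under a different schema later.
    other_rows, _ = read_with_schema(RAW_LINES, {"sensor": str, "temperature": float})
    print(other_rows)
```

Nothing is rejected when the data lands. Each reader decides which structure to apply, and pays for the parsing and validation every time it reads.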

Most modern lakehouse architectures lean on schema-on-read flexibility while still keeping enough structure, such as columnar file formats and table metadata, for fast queries.
