Stream Processing
Stream processing is a way of handling data continuously, record by record or in small windows, as it arrives. It contrasts with batch processing, which collects data and processes it in large scheduled chunks.
Streaming versus batch
Batch processing waits. It gathers a day's or an hour's worth of data and processes it all at once. That is simple and efficient, but it adds latency: a result can lag behind reality by up to the length of the batch window, plus the time the job takes to run.
Stream processing acts on data as it flows. Systems such as Kafka and Flink let you transform, aggregate, and react to events in near real time. This is what enables live dashboards, immediate alerting, and pipelines that feed fresh data into analytical stores.
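To make the contrast concrete, here is a minimal sketch in plain Python rather than Kafka or Flink. The event list, the function names, and the 5-second window are all assumptions chosen for illustration. The batch function needs the whole collection before it can answer, while the streaming function emits a result each time a fixed-size (tumbling) window closes.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical events as (timestamp_seconds, value), assumed to arrive in time order.
EVENTS = [(0, 5), (1, 7), (4, 1), (5, 3), (9, 2), (10, 8), (12, 4)]


def batch_sum(events: Iterable[Tuple[int, int]]) -> int:
    """Batch style: wait for the whole collection, then compute once."""
    return sum(value for _, value in events)


def tumbling_window_sums(
    events: Iterable[Tuple[int, int]], window_seconds: int
) -> Iterator[Tuple[int, int]]:
    """Stream style: yield (window_start, sum) as soon as a window closes."""
    current_start = None
    running = 0
    for ts, value in events:
        start = ts - ts % window_seconds  # window this event belongs to
        if current_start is None:
            current_start = start
        elif start != current_start:
            # An event from a later window arrived, so the previous one is closed.
            yield current_start, running
            current_start, running = start, 0
        running += value
    if current_start is not None:
        yield current_start, running  # flush the last, still-open window


if __name__ == "__main__":
    print("batch total:", batch_sum(EVENTS))
    for window_start, total in tumbling_window_sums(EVENTS, window_seconds=5):
        print("window", window_start, "sum", total)
```

Both approaches see the same data and agree on the overall total. The difference is when answers become available: the streaming version reports the first window as soon as an event from the next window shows up. A real engine also has to handle out-of-order and late events, which this sketch ignores.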
Stream processing and analytical databases are complementary. The stream moves and shapes the data, and the database stores and lets you query it.
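The division of labor can be sketched as follows. This is an illustration only: Python's built-in `sqlite3` stands in for an analytical database, and the event shape, table name, and helper functions are hypothetical. The stream stage filters and reshapes events one at a time, and the database stage stores them in small batches and answers queries.

```python
import sqlite3
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical raw events; a reading of None represents a malformed record.
RAW_EVENTS = [
    {"sensor": "a", "temp_c": 20.0},
    {"sensor": "b", "temp_c": None},
    {"sensor": "a", "temp_c": 25.0},
    {"sensor": "b", "temp_c": 15.0},
]


def shape(events: Iterable[Dict[str, Optional[float]]]) -> Iterator[Tuple[str, float]]:
    """Stream stage: drop bad records and convert Celsius to Fahrenheit."""
    for event in events:
        if event["temp_c"] is None:
            continue
        yield event["sensor"], event["temp_c"] * 9 / 5 + 32


def load(conn: sqlite3.Connection, rows: Iterable[Tuple[str, float]], batch_size: int = 2) -> None:
    """Database stage: write shaped rows in small batches as they arrive."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO readings VALUES (?, ?)", batch)
            conn.commit()
            batch = []
    if batch:  # write whatever is left over
        conn.executemany("INSERT INTO readings VALUES (?, ?)", batch)
        conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in store, not a real analytical database
conn.execute("CREATE TABLE readings (sensor TEXT, temp_f REAL)")
load(conn, shape(RAW_EVENTS))

# The database side: query what the stream delivered.
result = conn.execute(
    "SELECT sensor, AVG(temp_f) FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
print(result)
```

The stream stage never needs to know how the data is queried, and the store never needs to know how events were cleaned. That separation is what lets each side be swapped or scaled independently.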
How Arc handles stream processing
Arc is the analytical store at the end of a streaming pipeline. Streaming tools move and transform events; Arc ingests them through a fast write path and makes them queryable in about 100 milliseconds. Arc is a database, not a message queue, so it sits downstream of the streaming layer rather than replacing it.
Arc is a high-performance columnar database. Open Parquet on storage you own, single Go binary, production-ready in 30 seconds.