Arc Cloud is live. Start free — no credit card required.
ClickBench Verified

Arc vs Elasticsearch

Elasticsearch is a search engine. Arc is a columnar analytical database. When you run analytical workloads on Elasticsearch, you pay the price of a Lucene inverted index doing work it was never designed for.

45x
faster log ingestion
6–21x
faster analytical queries
1ms
p50 latency vs. Elasticsearch's 38ms

ClickBench Results

99.9M rows, 43 analytical queries. Arc runs true cold runs: service restart and OS cache flush before every query. Verify on benchmark.clickhouse.com →

Combined Score (lower is better)

System          Machine          Score
Arc             c8g.metal-48xl   ×1.29
Arc             c6a.4xlarge      ×2.00
Elasticsearch   c8g.metal-48xl   ×11.97
Elasticsearch   c6a.4xlarge      ×12.43

Cold Run (lower is better)

System          Machine          Score
Arc             c8g.metal-48xl   ×1.32
Arc             c6a.4xlarge      ×1.54
Elasticsearch   c8g.metal-48xl   ×4.46
Elasticsearch   c6a.4xlarge      ×5.82

Hot Run (lower is better)

System          Machine          Score
Arc             c8g.metal-48xl   ×1.03
Arc             c6a.4xlarge      ×2.84
Elasticsearch   c8g.metal-48xl   ×27.67
Elasticsearch   c6a.4xlarge      ×28.41

Log Ingestion Benchmark

Sustained 60-second log ingestion load. Same log schema, same machine.

System          Throughput       p50 latency
Arc             4.58M logs/sec   1ms
Elasticsearch   101K logs/sec    38ms

Arc achieves ~45x higher throughput with 38x lower p50 latency.

Why Arc Is Different: Under the Hood

Elasticsearch was designed for full-text search. Arc was designed for analytical queries on structured data. These are different problems with different optimal data structures.

Storage Format

Columnar Parquet vs. Lucene inverted index

Arc stores data as Apache Parquet files in time-partitioned paths (db/measurement/YYYY/MM/DD/HH/). Parquet is columnar: aggregating one field across 100M rows reads only that column from disk. Elasticsearch stores data in Lucene segments, an inverted index structure optimized for finding documents by term. For GROUP BY aggregations over numeric fields, Lucene must traverse posting lists that were never designed for that access pattern.
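The time-partitioned layout described above can be sketched in a few lines. Only the path scheme (db/measurement/YYYY/MM/DD/HH/) comes from the text; the function and example names are illustrative, not Arc's actual code:

```python
from datetime import datetime, timezone

def partition_path(db: str, measurement: str, ts: datetime) -> str:
    """Build the time-partitioned storage prefix: db/measurement/YYYY/MM/DD/HH/."""
    return f"{db}/{measurement}/{ts:%Y/%m/%d/%H}/"

# A query bounded to a single hour only has to list Parquet files
# under one prefix, skipping every other partition entirely.
path = partition_path("logs", "nginx", datetime(2025, 6, 1, 14, tzinfo=timezone.utc))
print(path)  # logs/nginx/2025/06/01/14/
```

Because the partition key is encoded in the path, time-range pruning happens before any file is opened, which compounds with Parquet's column pruning once a file is read.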

Query Engine

Vectorized SIMD aggregation vs. bucket trees

Arc embeds DuckDB, a vectorized OLAP engine that processes 2,048 rows per SIMD operation and executes aggregations directly on columnar Arrow arrays. Arc also rewrites SQL before execution: regex calls become string functions, time bucketing becomes epoch arithmetic. Elasticsearch implements GROUP BY as nested bucket trees over Lucene data, which works well for term faceting but is not competitive for analytical aggregations over high-cardinality numeric columns.
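To make the "time bucketing becomes epoch arithmetic" rewrite concrete, here is a minimal sketch of the idea in Python: flooring a Unix timestamp to its bucket is a single modulo, which is exactly the kind of branch-free integer math a vectorized engine executes across thousands of rows per operation. The names here are illustrative, not Arc's rewriter:

```python
from collections import Counter

def bucket_epoch(ts_seconds: int, width_seconds: int) -> int:
    """Floor a Unix timestamp to the start of its bucket (epoch arithmetic)."""
    return ts_seconds - (ts_seconds % width_seconds)

# Group a column of timestamps into 60-second buckets and count per bucket,
# mimicking a GROUP BY time_bucket(...) aggregation.
timestamps = [1700000003, 1700000042, 1700000065, 1700000118]
counts = Counter(bucket_epoch(ts, 60) for ts in timestamps)
print(counts)
```

A date-parsing or calendar-function implementation of the same bucketing would be far more expensive per row; the rewrite keeps the hot loop in plain integer arithmetic.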

Ingestion Protocol

4.58M logs/sec vs. 101K logs/sec

Arc accepts MessagePack binary columnar batches (18M+ records/s), InfluxDB Line Protocol for Telegraf compatibility, and bulk CSV/Parquet import with efficient batches starting at ~1,000 rows. Elasticsearch uses the Bulk API, which requires JSON-encoded documents with per-document metadata lines, index analysis, and Lucene segment writes. The per-document indexing overhead (tokenization, inverted index updates, field data structures) limits Elasticsearch to ~101K logs/sec on the same hardware where Arc reaches 4.58M/sec.
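The row-vs-columnar batch distinction above can be illustrated with a small pivot. This sketch uses JSON rather than MessagePack (which is not in the standard library) and assumes a uniform record schema; it shows the layout, not Arc's wire format:

```python
import json

def to_columnar(records: list[dict]) -> dict[str, list]:
    """Pivot row-oriented log records into a columnar batch: one list per
    field, so each column can be encoded, compressed, and scanned as a unit.
    Assumes every record carries the same fields."""
    columns: dict[str, list] = {}
    for rec in records:
        for key, value in rec.items():
            columns.setdefault(key, []).append(value)
    return columns

rows = [
    {"ts": 1700000001, "level": "info",  "msg": "started"},
    {"ts": 1700000002, "level": "error", "msg": "timeout"},
]
batch = to_columnar(rows)
print(json.dumps(batch))
# {"ts": [1700000001, 1700000002], "level": ["info", "error"], "msg": ["started", "timeout"]}
```

A columnar batch amortizes per-field overhead across the whole batch, which is why batches starting at ~1,000 rows ingest efficiently, while a per-document protocol pays metadata and indexing costs on every single record.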

Deployment Model

Single Go binary vs. JVM cluster stack

Arc ships as a single Go binary with no external dependencies and no JVM. Optional clustering uses embedded Raft consensus. A 3-node Arc cluster is 3 processes. Elasticsearch requires a JVM on each node, the Elasticsearch server process, and typically Kibana for visualization. A production HA cluster also needs dedicated master-eligible nodes separate from data nodes, with heap tuning, GC pauses, shard rebalancing, and split-brain prevention adding to the operational surface area.

Feature Comparison

Feature                              Arc    Elasticsearch
Standard SQL analytics
Portable Parquet storage
Open source
Edge / single-binary deployment
Columnar storage for analytics
InfluxDB Line Protocol ingestion
Retention policies

Frequently Asked Questions

Why is Elasticsearch so much slower on analytical queries?

Elasticsearch is built on Apache Lucene, an inverted index optimized for full-text search, not columnar aggregations. When you run GROUP BY or range aggregations over billions of rows, Elasticsearch must traverse the inverted index structure, which is fundamentally inefficient for that access pattern. Arc uses a vectorized columnar engine (DuckDB) that processes analytical queries orders of magnitude faster.

Can Arc replace Elasticsearch for log analytics?

For structured log analytics (aggregations, filtering, dashboards, and alerting on log fields): yes. Arc is purpose-built for that workload. If you need full-text search or fuzzy matching across unstructured text bodies, Elasticsearch remains the right tool. The two workloads are different.

How do I migrate log ingestion from Elasticsearch to Arc?

Most logging pipelines (Fluent Bit, Logstash, Vector, OpenTelemetry Collector) support HTTP output. Point them at Arc's HTTP ingestion endpoint. Arc accepts JSON arrays or MessagePack. Migration is typically a configuration change, not a re-engineering effort.
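As a sketch of what such a configuration change amounts to, here is a minimal sender built on the standard library. The endpoint URL and payload shape are assumptions for illustration; check Arc's documentation for the actual ingestion path:

```python
import json
import urllib.request

# Hypothetical endpoint; Arc's real ingestion path may differ.
ARC_URL = "http://localhost:8000/write"

def build_request(records: list[dict]) -> urllib.request.Request:
    """Package a batch of log records as a JSON-array POST to Arc."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        ARC_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request([{"ts": 1700000001, "level": "info", "msg": "started"}])
# urllib.request.urlopen(req)  # uncomment to actually send the batch
```

In practice you would not write this yourself: Fluent Bit, Logstash, Vector, and the OpenTelemetry Collector all have HTTP outputs that produce an equivalent request from a few lines of pipeline configuration.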

What about full-text search?

Arc does not have an inverted index for full-text search. DuckDB supports LIKE, regex, and ILIKE patterns on string columns, which covers most structured log filtering. For unstructured document search, Elasticsearch is still the right choice.
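To show what LIKE- and regex-style filtering covers for structured logs, here is the same pair of filters expressed in Python; the SQL shown in the comments is a rough equivalent, and the log lines are invented for illustration:

```python
import re

logs = [
    "GET /api/users 200",
    "POST /api/orders 500",
    "GET /health 200",
]

# Roughly:  WHERE msg LIKE '%/api/%'   (substring match)
api_hits = [line for line in logs if "/api/" in line]

# Roughly:  WHERE regexp_matches(msg, ' 5\d\d$')   (server errors)
errors = [line for line in logs if re.search(r" 5\d\d$", line)]

print(len(api_hits), len(errors))  # 2 1
```

Patterns like these handle the bulk of structured log filtering (paths, status codes, levels, prefixes). What they do not give you is relevance-ranked search over free text, which is where an inverted index earns its cost.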

Pricing

Start free with open source. Scale with enterprise features when you need them.

Open Source

Free forever
AGPL-3.0 licensed
  • 18M records/sec ingestion
  • Full SQL query engine (DuckDB)
  • Parquet storage (S3, GCS, local)
  • Docker and Kubernetes ready
  • Community support (Discord)

Arc Cloud

from $50/month

Managed hosting. No infrastructure. Free 30-day trial.

  • Deploy in 30 seconds
  • Dedicated physical servers
  • Daily backups to S3
  • Arc Enterprise included
  • No credit card required
Coming Q2 2026

Enterprise

$5,000/year

Starting price for up to 8 cores. Clustering, RBAC, and dedicated support.

  • Everything in Open Source
  • Horizontal clustering and HA
  • Role-based access control (RBAC)
  • Tiered storage and auto-aggregation
  • Dedicated support and SLAs
View all plans →

Enterprise Features

Clustering

Horizontal scaling with automatic data distribution. Query routing and load balancing across nodes.

Security

Fine-grained RBAC with database and table-level permissions. LDAP/SAML integration available.

Data Management

Automated retention policies, continuous queries for aggregation, and tiered storage for cost optimization.

Ready to handle billion-record workloads?

Deploy Arc in minutes. Own your data in Parquet. Use for analytics, observability, AI, IoT, or data warehousing.

Get Started →