Arc 25.12.1: We Rewrote Everything in Go. Here's Why.

Almost two months ago, we released the bits of Arc on GitHub. A month after that we released the first stable version, 25.11.1. Today we're announcing Arc 25.12.1—and this one is different.
But wait, wasn't Arc already good enough? Let me explain.
From Prototype to Production
When we started working on Arc—back when the name didn't even exist—our goal was to validate a hypothesis: is there hunger for a new time-series database that handles not just the typical IoT and observability scenarios, but also analytics workloads? One that truly separates storage from compute and fits into the modern data stack?
We wanted to demonstrate that with the right combination of technologies (DuckDB + Parquet + Arrow), we could build something that actually handles real-world workloads.
And we did it. The prototype worked. We open-sourced it and got incredible feedback from the community.
The validation exceeded our expectations. Over 100 deployments. 16 billion rows ingested across all instances—people were testing Arc hard. Conversations with investors who saw the potential. Ex-coworkers from my InfluxData days who understood the pain points we were solving. The signal was clear: yes, there's a real need for this.
But that prototype was in Python. That was intentional—we chose Python to move fast, get an MVP out the door, and start talking to users as quickly as possible. Speed of development over speed of execution. Ship first, optimize later.
And it worked. We validated the market. But once we had that validation, we hit a wall.
The Python Problem
Here's what we were dealing with:
- 372 MB memory leak per 500 queries with 42 workers
- pymalloc arena caching that doesn't release memory back to the OS
- Setting `PYTHONMALLOC=malloc` made it worse, not better
- The multi-worker architecture was fundamentally leaky
We tried everything. Different memory allocators. Different worker configurations. Different garbage collection strategies. Nothing worked.
The Python runtime wasn't built for this workload. So we made a decision.
We Rewrote Arc in Go
Yes, a full rewrite. 18,280 lines of Python → Go.
Why Go?
- Better memory management - GC actually returns memory to the OS
- Native concurrency - goroutines instead of multi-worker processes
- Single binary deployment - no Python dependencies, no virtual environments
- Excellent DuckDB bindings - `go-duckdb` just works (see the sketch after this list)
- Production stability - the same language that powers Prometheus and InfluxDB
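To make the `go-duckdb` point concrete, here's a minimal sketch of querying a Parquet file through Go's standard `database/sql` interface. It's illustrative only: the file path and query are placeholders, not Arc's internals.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/marcboeker/go-duckdb" // registers the "duckdb" driver
)

func main() {
	// Empty DSN opens an in-memory DuckDB instance (no database file on disk).
	db, err := sql.Open("duckdb", "")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Query a Parquet file directly; "metrics.parquet" is a placeholder path.
	rows, err := db.Query(`SELECT count(*) FROM read_parquet('metrics.parquet')`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var n int64
		if err := rows.Scan(&n); err != nil {
			log.Fatal(err)
		}
		fmt.Println("rows:", n)
	}
}
```

Because the driver plugs into `database/sql`, connection handling and result scanning work the way any Go service expects.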
We estimated 4-6 weeks. It took about that.
The Numbers
Let me show you what we got.
Python Arc (baseline):
- 4.21M records/sec
- 42 workers
- Memory: leaking
Go Arc (25.12.1):
- 9.47M records/sec sustained
- 16 parallel writers
- Memory: stable, no leak
That's 125% faster with 38% of the workers.
Here's the 60-second sustained load test:
[ 5.0s] RPS: 9,678,800 | Total: 48,394,000 | Errors: 0
[ 10.0s] RPS: 9,680,800 | Total: 96,798,000 | Errors: 0
[ 15.0s] RPS: 9,709,000 | Total: 145,343,000 | Errors: 0
[ 20.0s] RPS: 9,717,600 | Total: 193,931,000 | Errors: 0
[ 25.0s] RPS: 9,673,400 | Total: 242,298,000 | Errors: 0
[ 30.0s] RPS: 9,560,600 | Total: 290,101,000 | Errors: 0
...
[ 60.0s] RPS: 9,085,200 | Total: 567,220,000 | Errors: 0
THROUGHPUT: 9,447,110 records/sec (sustained)
p50: 8.40ms | p95: 25.83ms | p99: 42.29ms
No degradation. No memory leak. Just stable performance.
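For a rough picture of what "16 parallel writers" means in Go, here's a simplified sketch of a goroutine-based writer pool fed by a channel. It illustrates the pattern only; `Batch` and `writeBatch` are made-up names, and Arc's real ingestion path is more involved.

```go
package main

import (
	"log"
	"sync"
)

// Batch is a placeholder for a chunk of ingested records.
type Batch struct {
	Records [][]byte
}

// writeBatch stands in for the real work: encoding to Parquet and flushing to storage.
func writeBatch(b Batch) error {
	// ... encode and persist ...
	return nil
}

func main() {
	const numWriters = 16 // goroutines instead of separate worker processes

	batches := make(chan Batch, numWriters*2) // buffered channel feeding the pool
	var wg sync.WaitGroup

	// A fixed pool of writer goroutines sharing one address space, so there is
	// no per-process interpreter and no duplicated memory arenas.
	for i := 0; i < numWriters; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for b := range batches {
				if err := writeBatch(b); err != nil {
					log.Printf("writer %d: %v", id, err)
				}
			}
		}(i)
	}

	// Producer side: in a real server this would be the HTTP ingest handler.
	for i := 0; i < 100; i++ {
		batches <- Batch{Records: [][]byte{[]byte("demo")}}
	}
	close(batches) // closing the channel lets the writers drain and exit
	wg.Wait()
}
```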
What's New in 25.12.1
The rewrite wasn't just about performance. We added features we couldn't build properly in Python.
Partition Pruning
Queries now only scan the data they need. Before, a query for the last hour would read all files. Now it reads only the relevant partition.
Before: 16 files scanned for any query
After: 1 file scanned for a 1-hour query
That's 16x fewer files and ~10x faster queries on filtered data.
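Conceptually, pruning means intersecting the query's time range with each partition's time bounds before any file is handed to the query engine. Here's a minimal sketch of that idea; the hourly file layout and the `partition`/`prunePartitions` names are hypothetical, not Arc's on-disk format.

```go
package main

import (
	"fmt"
	"time"
)

// partition describes one Parquet file and the time range it covers.
type partition struct {
	Path  string
	Start time.Time
	End   time.Time
}

// prunePartitions keeps only the files whose time range overlaps the query window,
// so a one-hour query touches one file instead of all of them.
func prunePartitions(all []partition, qStart, qEnd time.Time) []partition {
	var keep []partition
	for _, p := range all {
		if p.End.After(qStart) && p.Start.Before(qEnd) {
			keep = append(keep, p)
		}
	}
	return keep
}

func main() {
	base := time.Date(2025, 12, 1, 0, 0, 0, 0, time.UTC)

	// Pretend there are 16 hourly partitions on disk (hypothetical layout).
	var parts []partition
	for h := 0; h < 16; h++ {
		start := base.Add(time.Duration(h) * time.Hour)
		parts = append(parts, partition{
			Path:  fmt.Sprintf("data/cpu/2025-12-01/%02d.parquet", h),
			Start: start,
			End:   start.Add(time.Hour),
		})
	}

	// A query for the last hour of that window only needs the final file.
	hit := prunePartitions(parts, base.Add(15*time.Hour), base.Add(16*time.Hour))
	fmt.Println("files scanned:", len(hit)) // 1 instead of 16
}
```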
Arrow Streaming
Large query results are now streamed directly to the client. No more loading 10M rows into memory before sending the response.
| Rows | Format | Server Time | Throughput |
|---|---|---|---|
| 1M | Arrow | 432ms | 2.31M rows/s |
| 1M | JSON | 536ms | 1.87M rows/s |
| 10M | Arrow | 3,471ms | 2.88M rows/s |
| 10M | JSON | 4,479ms | 2.23M rows/s |
Arrow is consistently faster and 2x more compact than JSON.
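To show what streaming means in practice, here's a minimal sketch that writes Arrow record batches to the HTTP response as they're produced instead of materializing the full result first. It assumes the Apache Arrow Go v14 module path; the endpoint path, schema, and batch sizes are illustrative, not Arc's actual response code.

```go
package main

import (
	"log"
	"net/http"

	"github.com/apache/arrow/go/v14/arrow"
	"github.com/apache/arrow/go/v14/arrow/array"
	"github.com/apache/arrow/go/v14/arrow/ipc"
	"github.com/apache/arrow/go/v14/arrow/memory"
)

func streamHandler(w http.ResponseWriter, r *http.Request) {
	pool := memory.NewGoAllocator()
	schema := arrow.NewSchema([]arrow.Field{
		{Name: "ts", Type: arrow.PrimitiveTypes.Int64},
		{Name: "value", Type: arrow.PrimitiveTypes.Float64},
	}, nil)

	w.Header().Set("Content-Type", "application/vnd.apache.arrow.stream")

	// The IPC stream writer emits the schema once, then one message per record batch.
	writer := ipc.NewWriter(w, ipc.WithSchema(schema), ipc.WithAllocator(pool))
	defer writer.Close()

	builder := array.NewRecordBuilder(pool, schema)
	defer builder.Release()

	// In a real server each batch would come from the query engine; here we fake 3 batches.
	for batch := 0; batch < 3; batch++ {
		for i := 0; i < 1024; i++ {
			builder.Field(0).(*array.Int64Builder).Append(int64(batch*1024 + i))
			builder.Field(1).(*array.Float64Builder).Append(float64(i))
		}
		rec := builder.NewRecord() // resets the builder for the next batch
		if err := writer.Write(rec); err != nil {
			rec.Release()
			log.Printf("stream write: %v", err)
			return
		}
		rec.Release() // each batch is freed as soon as it is flushed
	}
}

func main() {
	// Demo server only; the port and route are placeholders.
	http.HandleFunc("/query/stream", streamHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```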
Write-Ahead Log (WAL)
Optional durability layer for crash recovery. Only ~6% overhead when enabled.
ARC_WAL_ENABLED=true
ARC_WAL_DIR=./data/wal
ARC_WAL_SYNC_MODE=fdatasync
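At its core, a sync-on-write WAL append is "write the record, then force it to disk before acknowledging". Here's a stdlib-only sketch of that idea using `File.Sync` (fsync); an fdatasync mode would call the platform-specific syscall instead. The record framing below is made up and is not Arc's WAL format.

```go
package main

import (
	"encoding/binary"
	"log"
	"os"
)

// appendEntry writes one length-prefixed record and forces it to stable storage
// before returning, which is what gives the WAL its crash-recovery guarantee.
func appendEntry(f *os.File, payload []byte) error {
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := f.Write(hdr[:]); err != nil {
		return err
	}
	if _, err := f.Write(payload); err != nil {
		return err
	}
	// Sync() is fsync; an fdatasync sync mode would skip flushing file metadata.
	return f.Sync()
}

func main() {
	if err := os.MkdirAll("data/wal", 0o755); err != nil {
		log.Fatal(err)
	}
	f, err := os.OpenFile("data/wal/segment-000001.log",
		os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := appendEntry(f, []byte(`{"m":"cpu","v":0.42}`)); err != nil {
		log.Fatal(err)
	}
}
```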
S3/MinIO Storage
Full S3-compatible object storage support. Tested at 8.41M records/sec to MinIO, only 11% slower than local filesystem.
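For reference, shipping a finished Parquet segment to MinIO or any S3-compatible endpoint looks roughly like this with the `minio-go` v7 client. The endpoint, credentials, bucket, and object layout are placeholders; Arc's own storage layer may differ.

```go
package main

import (
	"context"
	"log"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Connect to a local MinIO instance (placeholder endpoint and credentials).
	client, err := minio.New("localhost:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("minioadmin", "minioadmin", ""),
		Secure: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Upload a local Parquet file as one object; FPutObject streams it from disk.
	info, err := client.FPutObject(context.Background(),
		"arc-data",                       // bucket (placeholder)
		"cpu/2025-12-01/15.parquet",      // object key (placeholder layout)
		"data/cpu/2025-12-01/15.parquet", // local file
		minio.PutObjectOptions{ContentType: "application/octet-stream"})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("uploaded %d bytes", info.Size)
}
```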
Data Management APIs
- Delete operations with WHERE clause support
- Retention policies with automatic enforcement
- Continuous queries for downsampling and aggregation
Security Hardening
We ran govulncheck and gosec on the codebase. Fixed SQL injection vectors, path traversal vulnerabilities, and added proper security headers. Zero known vulnerabilities.
Migration from Python Arc
If you're running Python Arc, here's what you need to know:
- Parquet files are compatible - Go Arc reads Python Arc's data files. No migration needed.
- API endpoints are identical - Same routes, same payload formats. Drop-in replacement.
- Configuration changed - We moved from `arc.conf` (ConfigParser) to `arc.toml`. Environment variables still work with the `ARC_` prefix.
The only breaking change: we removed some experimental endpoints that nobody was using.
Getting Started
Docker
docker run -d -p 8000:8000 \
-e STORAGE_BACKEND=local \
-v arc-data:/app/data \
ghcr.io/basekick-labs/arc:25.12.1
Kubernetes
helm install arc https://github.com/Basekick-Labs/arc/releases/download/v25.12.1/arc-25.12.1.tgz
Native Packages (Debian, RPM)
For the first time, Arc ships as native packages for Linux. No Python. No dependencies. Just install and run.
Debian/Ubuntu (x86_64, aarch64):
# Download and install
wget https://github.com/basekick-labs/arc/releases/download/v25.12.1/arc_25.12.1_amd64.deb
sudo dpkg -i arc_25.12.1_amd64.deb
# Start and enable
sudo systemctl enable arc
sudo systemctl start arc
# Check status
curl http://localhost:8000/health
RHEL/Fedora/Rocky (x86_64, aarch64):
# Download and install
wget https://github.com/basekick-labs/arc/releases/download/v25.12.1/arc-25.12.1-1.x86_64.rpm
sudo rpm -i arc-25.12.1-1.x86_64.rpm
# Start and enable
sudo systemctl enable arc
sudo systemctl start arc
Both packages include systemd service files, so Arc starts automatically on boot.
What We Learned
Rewriting 18,000 lines of code isn't something you do lightly. Here's what we took away:
Python was the right choice for prototyping. We moved fast, validated the architecture, and shipped something useful. Without Python's speed of development, we wouldn't have gotten the feedback that told us we were on the right track.
Go is the right choice for production. For a database that needs to handle millions of writes per second with stable memory, Go's runtime is simply better suited.
The architecture held up. DuckDB as the query engine, Parquet for storage, Arrow for data interchange—these choices were sound. The rewrite validated them.
What's Next
Arc 25.12.1 is production-ready for single-node deployments. Here's what we're working on:
Coming soon:
- Official SDKs (Python, Go, JavaScript)
- MQTT ingestion for IoT edge devices
- Arc Cloud managed hosting (beta in Q1 2026)
We're also building a Control Center for managing Arc instances—think metrics, health monitoring, and token management in one place.
More details coming as we ship. For now, 25.12.1 is ready for production.
Try It
Two months ago, Arc was an experiment. Today, it's a production database that's faster than anything else we've tested for time-series workloads.
If you're dealing with IoT telemetry, observability data, or any time-series workload—give Arc a shot. The performance speaks for itself.
Resources:
Thank you for being part of this journey.