Arc 25.12.1: We Rewrote Everything in Go. Here's Why.

Almost two months ago, we released the bits of Arc on GitHub. A month after that we released the first stable version, 25.11.1. Today we're announcing Arc 25.12.1—and this one is different.
But wait, wasn't Arc already good enough? Let me explain.
From Prototype to Production
When we started working on Arc—back when the name didn't even exist—our goal was to validate a hypothesis: is there hunger for a new time-series database that handles not just the typical IoT and observability scenarios, but also analytics workloads? One that truly separates storage from compute and fits into the modern data stack?
We wanted to demonstrate that with the right combination of technologies (DuckDB + Parquet + Arrow), we could build something that actually handles real-world workloads.
And we did it. The prototype worked. We open-sourced it and got incredible feedback from the community.
The validation exceeded our expectations. Over 100 deployments. 16 billion rows ingested across all instances—people were testing Arc hard. Conversations with investors who saw the potential. Ex-coworkers from my InfluxData days who understood the pain points we were solving. The signal was clear: yes, there's a real need for this.
But that prototype was in Python. That was intentional—we chose Python to move fast, get an MVP out the door, and start talking to users as quickly as possible. Speed of development over speed of execution. Ship first, optimize later.
And it worked. We validated the market. But once we had that validation, we hit a wall.
The Python Problem
Here's what we were dealing with:
- 372 MB memory leak per 500 queries with 42 workers
- pymalloc arena caching that doesn't release memory back to the OS
- Setting `PYTHONMALLOC=malloc` made it worse, not better
- The multi-worker architecture was fundamentally leaky
We tried everything. Different memory allocators. Different worker configurations. Different garbage collection strategies. Nothing worked.
The Python runtime wasn't built for this workload. So we made a decision.
We Rewrote Arc in Go
Yes, a full rewrite. 18,280 lines of Python → Go.
Why Go?
- Better memory management - GC actually returns memory to the OS
- Native concurrency - goroutines instead of multi-worker processes
- Single binary deployment - no Python dependencies, no virtual environments
- Excellent DuckDB bindings - `go-duckdb` just works (see the sketch after this list)
- Production stability - the same language that powers Prometheus and InfluxDB
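To make the `go-duckdb` point concrete, here's a minimal sketch of querying a Parquet file through Go's standard `database/sql` interface. It's illustrative only: the file path and query are placeholders, not Arc's internals.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/marcboeker/go-duckdb" // registers the "duckdb" driver
)

func main() {
	// Empty DSN opens an in-memory DuckDB instance (no database file on disk).
	db, err := sql.Open("duckdb", "")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Query a Parquet file directly; "metrics.parquet" is a placeholder path.
	rows, err := db.Query(`SELECT count(*) FROM read_parquet('metrics.parquet')`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var n int64
		if err := rows.Scan(&n); err != nil {
			log.Fatal(err)
		}
		fmt.Println("rows:", n)
	}
}
```

Because the driver plugs into `database/sql`, connection handling and result scanning work the way any Go service expects.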
We estimated 4-6 weeks. It took about that.
The Numbers
Let me show you what we got.
Python Arc (baseline):
- 4.21M records/sec
- 42 workers
- Memory: leaking
Go Arc (25.12.1):
- 9.47M records/sec sustained
- 16 parallel writers
- Memory: stable, no leak
That's 125% faster with 38% of the workers.
Here's the 60-second sustained load test:
[ 5.0s] RPS: 9,678,800 | Total: 48,394,000 | Errors: 0
[ 10.0s] RPS: 9,680,800 | Total: 96,798,000 | Errors: 0
[ 15.0s] RPS: 9,709,000 | Total: 145,343,000 | Errors: 0
[ 20.0s] RPS: 9,717,600 | Total: 193,931,000 | Errors: 0
[ 25.0s] RPS: 9,673,400 | Total: 242,298,000 | Errors: 0
[ 30.0s] RPS: 9,560,600 | Total: 290,101,000 | Errors: 0
...
[ 60.0s] RPS: 9,085,200 | Total: 567,220,000 | Errors: 0
THROUGHPUT: 9,447,110 records/sec (sustained)
p50: 8.40ms | p95: 25.83ms | p99: 42.29ms
No degradation. No memory leak. Just stable performance.
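For a rough picture of what "16 parallel writers" means in Go, here's a simplified sketch of a goroutine-based writer pool fed by a channel. It illustrates the pattern only; `Batch` and `writeBatch` are made-up names, and Arc's real ingestion path is more involved.

```go
package main

import (
	"log"
	"sync"
)

// Batch is a placeholder for a chunk of ingested records.
type Batch struct {
	Records [][]byte
}

// writeBatch stands in for the real work: encoding to Parquet and flushing to storage.
func writeBatch(b Batch) error {
	// ... encode and persist ...
	return nil
}

func main() {
	const numWriters = 16 // goroutines instead of separate worker processes

	batches := make(chan Batch, numWriters*2) // buffered channel feeding the pool
	var wg sync.WaitGroup

	// A fixed pool of writer goroutines sharing one address space, so there is
	// no per-process interpreter and no duplicated memory arenas.
	for i := 0; i < numWriters; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for b := range batches {
				if err := writeBatch(b); err != nil {
					log.Printf("writer %d: %v", id, err)
				}
			}
		}(i)
	}

	// Producer side: in a real server this would be the HTTP ingest handler.
	for i := 0; i < 100; i++ {
		batches <- Batch{Records: [][]byte{[]byte("demo")}}
	}
	close(batches) // closing the channel lets the writers drain and exit
	wg.Wait()
}
```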
What's New in 25.12.1
The rewrite wasn't just about performance. We added features we couldn't build properly in Python.
Partition Pruning
Queries now only scan the data they need. Before, a query for the last hour would read all files. Now it reads only the relevant partition.
Before: 16 files scanned for any query
After: 1 file scanned for a 1-hour query
That's 16x fewer files and ~10x faster queries on filtered data.
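Conceptually, pruning means intersecting the query's time range with each partition's time bounds before any file is handed to the query engine. Here's a minimal sketch of that idea; the hourly file layout and the `partition`/`prunePartitions` names are hypothetical, not Arc's on-disk format.

```go
package main

import (
	"fmt"
	"time"
)

// partition describes one Parquet file and the time range it covers.
type partition struct {
	Path  string
	Start time.Time
	End   time.Time
}

// prunePartitions keeps only the files whose time range overlaps the query window,
// so a one-hour query touches one file instead of all of them.
func prunePartitions(all []partition, qStart, qEnd time.Time) []partition {
	var keep []partition
	for _, p := range all {
		if p.End.After(qStart) && p.Start.Before(qEnd) {
			keep = append(keep, p)
		}
	}
	return keep
}

func main() {
	base := time.Date(2025, 12, 1, 0, 0, 0, 0, time.UTC)

	// Pretend there are 16 hourly partitions on disk (hypothetical layout).
	var parts []partition
	for h := 0; h < 16; h++ {
		start := base.Add(time.Duration(h) * time.Hour)
		parts = append(parts, partition{
			Path:  fmt.Sprintf("data/cpu/2025-12-01/%02d.parquet", h),
			Start: start,
			End:   start.Add(time.Hour),
		})
	}

	// A query for the last hour of that window only needs the final file.
	hit := prunePartitions(parts, base.Add(15*time.Hour), base.Add(16*time.Hour))
	fmt.Println("files scanned:", len(hit)) // 1 instead of 16
}
```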
Arrow Streaming
Large query results are now streamed directly to the client. No more loading 10M rows into memory before sending the response.
| Rows | Format | Server Time | Throughput |
|---|---|---|---|
| 1M | Arrow | 432ms | 2.31M rows/s |
| 1M | JSON | 536ms | 1.87M rows/s |
| 10M | Arrow | 3,471ms | 2.88M rows/s |
| 10M | JSON | 4,479ms | 2.23M rows/s |
Arrow is consistently faster and 2x more compact than JSON.
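To show what streaming means in practice, here's a minimal sketch that writes Arrow record batches to the HTTP response as they're produced instead of materializing the full result first. It assumes the Apache Arrow Go v14 module path; the endpoint path, schema, and batch sizes are illustrative, not Arc's actual response code.

```go
package main

import (
	"log"
	"net/http"

	"github.com/apache/arrow/go/v14/arrow"
	"github.com/apache/arrow/go/v14/arrow/array"
	"github.com/apache/arrow/go/v14/arrow/ipc"
	"github.com/apache/arrow/go/v14/arrow/memory"
)

func streamHandler(w http.ResponseWriter, r *http.Request) {
	pool := memory.NewGoAllocator()
	schema := arrow.NewSchema([]arrow.Field{
		{Name: "ts", Type: arrow.PrimitiveTypes.Int64},
		{Name: "value", Type: arrow.PrimitiveTypes.Float64},
	}, nil)

	w.Header().Set("Content-Type", "application/vnd.apache.arrow.stream")

	// The IPC stream writer emits the schema once, then one message per record batch.
	writer := ipc.NewWriter(w, ipc.WithSchema(schema), ipc.WithAllocator(pool))
	defer writer.Close()

	builder := array.NewRecordBuilder(pool, schema)
	defer builder.Release()

	// In a real server each batch would come from the query engine; here we fake 3 batches.
	for batch := 0; batch < 3; batch++ {
		for i := 0; i < 1024; i++ {
			builder.Field(0).(*array.Int64Builder).Append(int64(batch*1024 + i))
			builder.Field(1).(*array.Float64Builder).Append(float64(i))
		}
		rec := builder.NewRecord() // resets the builder for the next batch
		if err := writer.Write(rec); err != nil {
			rec.Release()
			log.Printf("stream write: %v", err)
			return
		}
		rec.Release() // each batch is freed as soon as it is flushed
	}
}

func main() {
	// Demo server only; the port and route are placeholders.
	http.HandleFunc("/query/stream", streamHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```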
Write-Ahead Log (WAL)
Optional durability layer for crash recovery. Only ~6% overhead when enabled.
ARC_WAL_ENABLED=true
ARC_WAL_DIR=./data/wal
ARC_WAL_SYNC_MODE=fdatasync
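At its core, a sync-on-write WAL append is "write the record, then force it to disk before acknowledging". Here's a stdlib-only sketch of that idea using `File.Sync` (fsync); an fdatasync mode would call the platform-specific syscall instead. The record framing below is made up and is not Arc's WAL format.

```go
package main

import (
	"encoding/binary"
	"log"
	"os"
)

// appendEntry writes one length-prefixed record and forces it to stable storage
// before returning, which is what gives the WAL its crash-recovery guarantee.
func appendEntry(f *os.File, payload []byte) error {
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := f.Write(hdr[:]); err != nil {
		return err
	}
	if _, err := f.Write(payload); err != nil {
		return err
	}
	// Sync() is fsync; an fdatasync sync mode would skip flushing file metadata.
	return f.Sync()
}

func main() {
	if err := os.MkdirAll("data/wal", 0o755); err != nil {
		log.Fatal(err)
	}
	f, err := os.OpenFile("data/wal/segment-000001.log",
		os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := appendEntry(f, []byte(`{"m":"cpu","v":0.42}`)); err != nil {
		log.Fatal(err)
	}
}
```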
S3/MinIO Storage
Full S3-compatible object storage support. Tested at 8.41M records/sec to MinIO, only 11% slower than local filesystem.
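For reference, shipping a finished Parquet segment to MinIO or any S3-compatible endpoint looks roughly like this with the `minio-go` v7 client. The endpoint, credentials, bucket, and object layout are placeholders; Arc's own storage layer may differ.

```go
package main

import (
	"context"
	"log"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Connect to a local MinIO instance (placeholder endpoint and credentials).
	client, err := minio.New("localhost:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("minioadmin", "minioadmin", ""),
		Secure: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Upload a local Parquet file as one object; FPutObject streams it from disk.
	info, err := client.FPutObject(context.Background(),
		"arc-data",                       // bucket (placeholder)
		"cpu/2025-12-01/15.parquet",      // object key (placeholder layout)
		"data/cpu/2025-12-01/15.parquet", // local file
		minio.PutObjectOptions{ContentType: "application/octet-stream"})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("uploaded %d bytes", info.Size)
}
```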
Data Management APIs
- Delete operations with WHERE clause support
- Retention policies with automatic enforcement
- Continuous queries for downsampling and aggregation
Security Hardening
We ran govulncheck and gosec on the codebase. Fixed SQL injection vectors, path traversal vulnerabilities, and added proper security headers. Zero known vulnerabilities.
Migration from Python Arc
If you're running Python Arc, here's what you need to know:
- Parquet files are compatible - Go Arc reads Python Arc's data files. No migration needed.
- API endpoints are identical - Same routes, same payload formats. Drop-in replacement.
- Configuration changed - We moved from `arc.conf` (ConfigParser) to `arc.toml`. Environment variables still work with the `ARC_` prefix.
The only breaking change: we removed some experimental endpoints that nobody was using.
Getting Started
Docker
docker run -d -p 8000:8000 \
-e STORAGE_BACKEND=local \
-v arc-data:/app/data \
ghcr.io/basekick-labs/arc:25.12.1
Kubernetes
helm install arc https://github.com/Basekick-Labs/arc/releases/download/v25.12.1/arc-25.12.1.tgz
Native Packages (Debian, RPM)
For the first time, Arc ships as native packages for Linux. No Python. No dependencies. Just install and run.
Debian/Ubuntu (x86_64, aarch64):
# Download and install
wget https://github.com/basekick-labs/arc/releases/download/v25.12.1/arc_25.12.1_amd64.deb
sudo dpkg -i arc_25.12.1_amd64.deb
# Start and enable
sudo systemctl enable arc
sudo systemctl start arc
# Check status
curl http://localhost:8000/health
RHEL/Fedora/Rocky (x86_64, aarch64):
# Download and install
wget https://github.com/basekick-labs/arc/releases/download/v25.12.1/arc-25.12.1-1.x86_64.rpm
sudo rpm -i arc-25.12.1-1.x86_64.rpm
# Start and enable
sudo systemctl enable arc
sudo systemctl start arc
Both packages include systemd service files, so Arc starts automatically on boot.
What We Learned
Rewriting 18,000 lines of code isn't something you do lightly. Here's what we took away:
Python was the right choice for prototyping. We moved fast, validated the architecture, and shipped something useful. Without Python's speed of development, we wouldn't have gotten the feedback that told us we were on the right track.
Go is the right choice for production. For a database that needs to handle millions of writes per second with stable memory, Go's runtime is simply better suited.
The architecture held up. DuckDB as the query engine, Parquet for storage, Arrow for data interchange—these choices were sound. The rewrite validated them.
What's Next
Arc 25.12.1 is production-ready for single-node deployments. Here's what we're working on:
Coming soon:
- Official SDKs (Python, Go, JavaScript)
- MQTT ingestion for IoT edge devices
- Arc Cloud managed hosting (beta in Q1 2026)
We're also building a Control Center for managing Arc instances—think metrics, health monitoring, and token management in one place.
More details coming as we ship. For now, 25.12.1 is ready for production.
Try It
Two months ago, Arc was an experiment. Today, it's a production database that's faster than anything else we've tested for time-series workloads.
If you're dealing with IoT telemetry, observability data, or any time-series workload—give Arc a shot. The performance speaks for itself.
Resources:
Thank you for being part of this journey.