Arc 26.05.1: Enterprise GA. Helm, Replication, Failover, Hardening.

26.05.1 is the Enterprise GA release.
For the last few months, every release has been quietly converging on this one: the cluster manifest, the role separation, the failover paths, the WAL replication, the peer file replication. None of it shipped as a single big bang; each piece landed when it was ready, behind feature flags or as opt-in code paths. 26.05.1 is the release where it all comes together, with a production-ready Helm chart wrapped around it.
It's also the release where we ran the ingestion path and the query path through two separate four-agent staff/principal-engineer reviews and fixed everything they found. That part nobody asked for, but it's the part that matters most when you're running this in production.
Let me walk through what's in here.
Production-Ready Helm Chart
helm/arc-enterprise/ is now a real, opinionated chart, not a starter template.
It ships role-separated StatefulSets for writer, reader, and compactor, with per-role scheduling (`writer.nodeSelector`, `writer.tolerations`, `writer.affinity`, and the same for reader and compactor), per-role resource sizing, and per-role PVCs sized for each deployment pattern.
A few details that matter:
- Correct HA bootstrap: only pod ordinal 0 bootstraps Raft. The chart refuses `writer.replicas=2` (Raft split-brain hazard); use 1 for dev or 3+ for HA.
- Automatic failover wiring: a single `cluster.failover.enabled` knob activates both writer and compactor failover in the binary.
- Durable ingest by default: writers ship with WAL enabled (`writer.wal.enabled=true`), sync mode configurable.
- Fail-fast install validation: missing license key, shared secret, TLS secret (when TLS is enabled), MinIO credentials, or external-S3 credentials all produce a clear error at `helm install` time instead of a `CreateContainerConfigError` pod stuck later.
- Secure defaults: no default MinIO credentials (the chart refuses to install without explicit values), reader `Service` defaults to `ClusterIP`, MinIO runs as non-root with the console disabled, bundled MinIO pinned to a tagged release (not `:latest`), `seccompProfile: RuntimeDefault` and `allowPrivilegeEscalation: false` on Arc pods.
- Initial admin token plumbing: `auth.bootstrapToken.value` or `auth.bootstrapToken.existingSecret` pre-sets the admin token, removing the first-boot log-scraping step from deploy automation.
- Cluster TLS: when `cluster.tls.enabled=true`, the chart wires both the server cert/key and the CA certificate (for mutual TLS peer verification) from the referenced Kubernetes Secret.
Quick-start preset files land in the chart root: `values-shared-storage.yaml` (S3/MinIO/Azure) and `values-local-storage.yaml` (per-node SSDs with peer replication).
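For orientation, a minimal HA override might look like the sketch below. The key names follow the bullets above, but treat the exact nesting as an assumption and check the chart's `values.yaml` for the authoritative schema; the `nodeSelector` label is a made-up example.

```yaml
# Hypothetical HA values override -- verify key nesting against the chart.
writer:
  replicas: 3              # 1 for dev, 3+ for HA; 2 is rejected (split-brain)
  wal:
    enabled: true          # durable ingest (the default)
  nodeSelector:
    arc.example.com/tier: write   # illustrative label, not a chart default
reader:
  replicas: 2
compactor:
  replicas: 1
cluster:
  failover:
    enabled: true          # wires writer + compactor failover in the binary
  tls:
    enabled: true          # cert/key + CA come from the referenced Secret
auth:
  bootstrapToken:
    existingSecret: arc-admin-token   # pre-set admin token for automation
```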
A new Deployment Patterns page in the docs compares the two topologies side-by-side with sizing guidance, operational trade-offs, and the security posture for each.
Peer File Replication (Enterprise)
This is the feature that unlocks bare-metal, VM, and edge deployments: environments where each node has its own local SSDs and shared object storage isn't available.
Arc Enterprise clusters now replicate Parquet files between nodes automatically. When a writer flushes a file, all other nodes pull the bytes from the origin peer (or any healthy peer that already has a copy) and write them to their own local storage.
Every file is SHA-256 hashed at flush time and the hash is committed into a Raft-backed cluster manifest. Receivers verify the hash after download; checksum mismatches trigger automatic retry. When a node starts (or restarts), it walks the manifest and catches up on any files it missed, pulling from whichever healthy peer has them. The original writer doesn't need to be alive: that's the property that makes this safe.
Non-leader nodes forward manifest commands to the Raft leader automatically, so writes don't silently fall on the floor if the writer isn't the Raft leader.
The transfer path is resumable. Interrupted pulls resume from the last committed byte instead of restarting from zero, which is especially valuable for large compacted Parquet outputs on slow or flaky links. On retry, the puller checks how many bytes are already on disk, hashes the prefix to continue the SHA-256 chain, and requests only the remaining tail. Full-file hash verification is still enforced. S3 and Azure Blob backends fall back to a full re-fetch on resume (those APIs don't support append); local-SSD nodes get complete resume support.
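To make the resume mechanics concrete, here's a minimal Go sketch of the pattern just described: hash the on-disk prefix to continue the SHA-256 chain, fetch only the remaining tail, then enforce the full-file hash. The Range-style peer API, names, and error handling are illustrative assumptions, not Arc's internals.

```go
// Sketch only: resume an interrupted peer file pull.
package peersync

import (
	"crypto/sha256"
	"fmt"
	"io"
	"net/http"
	"os"
)

// resumePull continues a transfer into localPath. expected is the SHA-256
// hex digest committed in the cluster manifest.
func resumePull(peerURL, localPath, expected string) error {
	f, err := os.OpenFile(localPath, os.O_RDWR|os.O_CREATE, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()

	// Hash the prefix already on disk so the SHA-256 chain can continue;
	// this also leaves the file offset at the end of the prefix.
	h := sha256.New()
	offset, err := io.Copy(h, f)
	if err != nil {
		return err
	}

	// Request only the remaining tail (hypothetical Range-style peer API).
	req, _ := http.NewRequest(http.MethodGet, peerURL, nil)
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	// Append the tail while extending the running hash.
	if _, err := io.Copy(io.MultiWriter(f, h), resp.Body); err != nil {
		return err
	}

	// Full-file hash verification is still enforced.
	if got := fmt.Sprintf("%x", h.Sum(nil)); got != expected {
		return fmt.Errorf("checksum mismatch: got %s, want %s", got, expected)
	}
	return nil
}
```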
Operator visibility:
- `GET /api/v1/cluster/files` returns the full cluster manifest (supports a `?database=` filter; example below)
- Replication status and catch-up progress are surfaced via `/api/v1/cluster/status`
- Zero overhead for OSS / standalone deployments
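A quick way to eyeball replication state during a rollout. Host, port, and database name are placeholders; the response shape is whatever your build returns:

```bash
curl -H "Authorization: Bearer $ARC_TOKEN" \
  "https://arc-writer-0.example.com/api/v1/cluster/files?database=metrics"
```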
Dedicated Compactor Role with Automatic Failover (Enterprise)
In clustered deployments, compaction now runs on exactly one node to prevent duplicate outputs on shared storage and eliminate redundant CPU/IO work across nodes.
Set `ARC_CLUSTER_ROLE=compactor` on one node. Writers and readers automatically gate their compaction schedulers. OSS deployments are unaffected: compaction runs unconditionally.
Compacted outputs are registered in the Raft manifest and replicated to all peers automatically.
Automatic failover: when `cluster.failover_enabled=true`, the Raft leader monitors the active compactor and reassigns the role to another healthy node after ~30s of unresponsiveness. No restart required. It prefers compactor-role nodes and falls back to writers; a 60s cooldown prevents rapid cycling.
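Putting the two knobs together, a minimal sketch (shown YAML-style to match the cluster block later in this post; exact key placement may differ in your config format):

```yaml
# On the dedicated node, set the env var: ARC_CLUSTER_ROLE=compactor
cluster:
  failover_enabled: true   # leader reassigns compaction after ~30s of silence
```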
Health warnings surface when no compactor is elected or when multiple compactors are configured (a misconfiguration that re-introduces duplicate outputs).
Cluster TLS + Shared-Secret Auth (Enterprise)
Arc Enterprise clustering now supports encrypted inter-node communication and authenticated cluster joins.
Shared-secret authentication (`cluster.shared_secret`): when configured, join requests include an HMAC-SHA256 signature over a random nonce, node ID, cluster name, and timestamp. The leader validates the signature and rejects unauthorized joins. Timestamps are checked within a 5-minute tolerance to prevent replay attacks.
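In outline, the scheme looks like the sketch below. The field layout and encoding are assumptions, not Arc's wire format; the constant-time compare and timestamp window are the parts that matter.

```go
// Sketch of an HMAC-SHA256 join signature with replay protection.
package clusterauth

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

const replayTolerance = 5 * time.Minute

func signJoin(secret, nonce, nodeID, clusterName string, ts time.Time) string {
	mac := hmac.New(sha256.New, []byte(secret))
	fmt.Fprintf(mac, "%s|%s|%s|%d", nonce, nodeID, clusterName, ts.Unix())
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyJoin is what the leader would run: reject stale timestamps, then
// compare signatures in constant time.
func verifyJoin(secret, nonce, nodeID, clusterName, sig string, ts time.Time) bool {
	if d := time.Since(ts); d > replayTolerance || d < -replayTolerance {
		return false
	}
	want := signJoin(secret, nonce, nodeID, clusterName, ts)
	return hmac.Equal([]byte(want), []byte(sig))
}
```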
TLS encryption (`cluster.tls_enabled` + cert/key files): all inter-node TCP connections (coordinator, WAL replication, shard replication, and Raft consensus) are wrapped in TLS. Raft transport uses a custom `TLSStreamLayer` implementing `raft.StreamLayer`. An optional CA certificate (`cluster.tls_ca_file`) enables mutual TLS for peer certificate verification.
Both features are opt-in and backward compatible. When disabled, behavior is unchanged.
```yaml
cluster:
  shared_secret: "my-cluster-secret"
  tls_enabled: true
  tls_cert_file: "/etc/arc/cluster-cert.pem"
  tls_key_file: "/etc/arc/cluster-key.pem"
```

Kubernetes-Ready Node Identity (Enterprise)
Cluster node identity is now stable across pod reschedules. When `cluster.node_id` isn't explicitly set, Arc uses the OS hostname as the node ID; in Kubernetes StatefulSets this is the pod name (e.g., `arc-writer-0`), which survives reschedules. A restarted pod rejoins the cluster with the same identity instead of registering as a new node and leaving a dead Raft voter behind.
Cluster nodes also broadcast a LeaveNotify message to all peers during graceful shutdown. Peers immediately remove the departing node from Raft and the registry, rather than waiting for the heartbeat timeout to detect the departure. Rolling updates and scale-down operations are now clean and predictable.
Reader Query Freshness via WAL Replication (Enterprise)
Reader nodes now apply replicated WAL entries to their local ArrowBuffer, enabling near-real-time query freshness.
Previously, readers received WAL entries from the writer but only persisted them to their local WAL; the data was invisible to queries until flushed to Parquet. Now, replicated entries are decoded (both columnar and row formats) and written directly to the reader's in-memory buffer, making unflushed writer data queryable on readers.
This is the foundation for near-zero-latency reads across clustered deployments: ingested data is queryable on readers within milliseconds of arriving at the writer.
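In outline, the reader-side apply path looks something like this sketch. All names are stand-ins for Arc's internals, and the decode step is elided:

```go
package readerapply

// Stand-in types -- Arc's real structures differ.
type WALEntry struct {
	Database, Measurement string
	Payload               []byte
}

type Row map[string]any

// Buffer abstracts the reader's in-memory ArrowBuffer.
type Buffer interface {
	Write(db, measurement string, rows []Row) error
}

// decodeEntry stands in for the columnar/row decoders mentioned above.
func decodeEntry(e WALEntry) ([]Row, error) {
	// ... format detection and decoding elided ...
	return nil, nil
}

// applyReplicated is the reader-side apply step: decode, then write into the
// in-memory buffer so unflushed writer data is queryable on the reader.
func applyReplicated(buf Buffer, e WALEntry) error {
	rows, err := decodeEntry(e)
	if err != nil {
		return err
	}
	return buf.Write(e.Database, e.Measurement, rows)
}
```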
Manifest-vs-Storage Reconciliation (Enterprise)
A periodic reconciler now detects and repairs drift between the Raft-replicated cluster file manifest and physical storage. Two kinds of drift are addressed:
- Orphan manifest entries: the manifest references a file path that no longer exists in storage. Caused by retention/compaction/delete succeeding storage-side, then losing Raft quorum before the manifest update commits.
- Orphan storage files: a file exists in storage but no manifest entry references it. Caused by a crash between `storage.Write` and the file-registrar Raft propose, or by files predating the Phase 1 manifest.
The reconciler ships disabled by default and report-only. Once enabled (`reconciliation.enabled=true`), a cron runs daily at 04:17 producing dry-run audit reports. After reviewing the reports, operators flip `manifest_only_dry_run=false` to allow real deletes. Pre-manifest cleanup (files outside the standard Arc layout) requires a separate explicit opt-in via `delete_pre_manifest_orphans=true`.
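The staged rollout, as a config sketch. The three keys are the ones named above; the nesting is an assumption, so consult the release notes for the exact schema:

```yaml
reconciliation:
  enabled: true                       # start producing dry-run audit reports
  manifest_only_dry_run: true         # flip to false after reviewing reports
  delete_pre_manifest_orphans: false  # separate opt-in for pre-manifest cleanup
```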
Steady-state behavior matches Druid's coordinator kill task and Pinot's retention manager: opt-in, conservative grace window, blast cap. Per-run cap, 24-hour grace window, manifest-size ceiling, per-prefix list timeout, and a root-walk fan-out cap all default to conservative values.
Every run produces an audit trail. The most recent 10 runs are queryable via `GET /api/v1/reconciliation/status`.
Ingestion Hardening: ~17% Better p99
A four-agent post-implementation review of all four ingest paths (MessagePack columnar, MessagePack row, Line Protocol, TLE) surfaced and fixed five critical issues. Sustained-load benchmarks after the fixes:
| Metric | Before | After |
|---|---|---|
| p99 latency | 3.68ms | 3.13ms (~17% better) |
| Throughput (MsgPack columnar) | ~18.6M rec/s | ~19M rec/s |
| Errors over 60s | 0% | 0% |
What the review caught:
- Multi-hour flush atomicity: when a write spanned multiple hour buckets, the manifest could be updated for completed hours even if a later hour's write failed. The fix uses a collect-then-register pattern: all hour buckets write first, then all registrations proceed.
- Graceful-shutdown panic eliminated: `ArrowBuffer.Close()` previously closed the internal flush queue after cancelling the buffer context, creating a narrow race where a writer goroutine past the shard mutex but not yet at the channel send would panic with "send on closed channel". Fixed with a `closing` atomic flag set before the cancel; workers exit on context cancellation (sketched after this list).
- Schema-evolution corruption under concurrent writes eliminated: when two writers raced through a schema change against the same `(database, measurement)`, one writer could append records into a buffer the other had just re-keyed with a third schema. The schema-change path now uses a bounded retry loop with `ctx.Err()` checks per iteration. Hitting the cap returns a typed `ingest.ErrSchemaChurnExceeded` sentinel (HTTP 503) rather than silently committing a wide schema-mixed buffer.
- WAL backpressure no longer masquerades as durability: when the WAL's async entry channel was full, `AppendRaw` previously returned `nil` while incrementing a hidden counter. Downstream code logged "data preserved in WAL for recovery", which was untrue for the dropped entries. WAL writes now return a typed `wal.ErrWALDropped` sentinel; ingest callers increment a separate `total_wal_dropped` counter and emit a sampled (max 1/sec) Warn instead of an unsampled per-record Error.
- Cluster-replication receivers tolerate WAL backpressure: replication and shard receivers previously treated `LocalWAL.AppendRaw` errors as fatal, silently diverging the follower from the primary. They now treat `wal.ErrWALDropped` as non-fatal and rely on the primary's WAL plus peer Parquet replication for durability.
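Here's the graceful-shutdown pattern from the first fix, in miniature. This is a simplified sketch, not Arc's actual buffer code: the flag is set before the cancel, and the channel is never closed while senders may exist, so the panic can't occur.

```go
package buffer

import (
	"context"
	"errors"
	"sync/atomic"
)

type flushQueue struct {
	ch      chan []byte
	closing atomic.Bool
	cancel  context.CancelFunc
}

// enqueue is the writer-side hot path: refuse new work once closing is set,
// and never block past context cancellation.
func (q *flushQueue) enqueue(ctx context.Context, b []byte) error {
	if q.closing.Load() {
		return errors.New("buffer closing")
	}
	select {
	case q.ch <- b:
		return nil
	case <-ctx.Done():
		return ctx.Err() // a racing writer parks here instead of panicking
	}
}

// Close sets the flag *before* cancelling; workers exit on ctx.Done(), and
// the channel itself is never closed under potential senders.
func (q *flushQueue) Close() {
	q.closing.Store(true)
	q.cancel()
}
```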
Performance improvements that fell out of the same review:
- Cached column signatures: schema-change detection previously recomputed a sorted, joined column-name string on every write. The signature is now computed once and cached, so hot-path schema checks are a field read with zero allocation (see the sketch after this list). The signature also encodes each column's Go type, so an `int64` → `float64` change on the same column name is now detected as schema evolution.
- Pre-built Parquet writer properties: `parquet.WriterProperties` and `pqarrow.ArrowWriterProperties` were reconstructed on every flush. Now built once in `NewArrowWriter` and reused.
- Sort permutation reuse: `sortTypedColumnBatchByKeys` previously sorted twice (once for data, once for validity bitmaps). Now it computes the permutation once and applies it directly.
- `mergeBatches` value types: `colInfo` structs were heap-allocated as pointers. Changed to value semantics, removing per-column heap allocation during batch merges.
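The cached-signature idea, sketched with illustrative names (not Arc's actual types):

```go
package ingest

import (
	"sort"
	"strings"
)

type Column struct {
	Name   string
	GoType string
}

// signature encodes column name AND Go type, so int64 -> float64 on the
// same column name registers as schema evolution.
func signature(cols []Column) string {
	parts := make([]string, len(cols))
	for i, c := range cols {
		parts[i] = c.Name + ":" + c.GoType
	}
	sort.Strings(parts)
	return strings.Join(parts, "|")
}

// colBuffer caches its signature at creation; the hot-path check is a
// single string comparison with zero allocation.
type colBuffer struct {
	sig string
}

func newColBuffer(cols []Column) *colBuffer {
	return &colBuffer{sig: signature(cols)}
}

func (b *colBuffer) schemaChanged(incomingSig string) bool {
	return b.sig != incomingSig
}
```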
Query Path Hardening
A second four-agent review covered the read execution path. The ClickBench `hits` regression budget held: ~3% on a 99.9M-row aggregate, well within the 5% no-regression target.
Defense in depth on the read-only query API. The query API is intended for read-only SELECT workloads, but in some edge cases the validation layer was looser than the intent. We tightened the surface in three places:
- Expanded SQL denylist with comment-strip and literal-mask normalization: the guardrail now also gates `ATTACH`/`DETACH`/`COPY`/`EXPORT`/`IMPORT`/`PRAGMA`/`SET`/`RESET`/`LOAD`/`INSTALL`/`CALL`. The validation pipeline strips SQL comments before the regex check (so `DROP /* */ TABLE x` can't slip past via interleaved comments) and masks string literals first (so `SELECT 'DROP TABLE x'` isn't falsely rejected); a sketch of the pipeline follows this list. The test matrix covers every blocked keyword plus comment-injection, quoted-identifier false-positive, and `SET x TO 1` / `CALL(proc)` bypass shapes.
- `x-arc-database` header validation + universal `read_parquet` path quoting: the header value lands inside the `read_parquet('<base>/<database>/<measurement>/.../*.parquet')` storage path Arc generates internally. It's now validated at every entry point against `[a-zA-Z_][a-zA-Z0-9_-]*`, and every `read_parquet('PATH', ...)` interpolation site routes through a single source-of-truth quoting helper that doubles single quotes per the DuckDB literal-escape rule.
- Direct `read_parquet()` calls in user SQL rejected: Arc's transformation layer is the only legitimate source of `read_parquet` in a user query. A user submitting `SELECT * FROM read_parquet('...')` directly produced zero extracted table references and bypassed the RBAC pair-check. User SQL containing `read_parquet(` is now rejected at validation time.
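The normalization pipeline, sketched. The regexes here are simplified for illustration (the real validator is more thorough), but the ordering is the point: mask literals first, then strip comments, then run the denylist.

```go
package sqlguard

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	stringLit    = regexp.MustCompile(`'(?:[^']|'')*'`)
	lineComment  = regexp.MustCompile(`--[^\n]*`)
	blockComment = regexp.MustCompile(`(?s)/\*.*?\*/`)
	denied       = regexp.MustCompile(`(?i)\b(DROP|ATTACH|DETACH|COPY|EXPORT|IMPORT|PRAGMA|SET|RESET|LOAD|INSTALL|CALL)\b`)
)

func Validate(sql string) error {
	// Mask literals first: SELECT 'DROP TABLE x' must not trip the denylist.
	masked := stringLit.ReplaceAllString(sql, "'?'")
	// Strip comments next: DROP /* */ TABLE x must still trip it.
	masked = blockComment.ReplaceAllString(masked, " ")
	masked = lineComment.ReplaceAllString(masked, " ")
	// Only Arc's own transformation layer may emit read_parquet.
	if strings.Contains(strings.ToLower(masked), "read_parquet(") {
		return fmt.Errorf("read_parquet() is not allowed in user SQL")
	}
	if m := denied.FindString(masked); m != "" {
		return fmt.Errorf("statement blocked by read-only guard: %s", m)
	}
	return nil
}
```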
Correctness: partial failures are now surfaced.
- Streaming-response error semantics: `streamTypedJSON` and `streamArrowJSON` previously returned only the row count; in some edge cases (Scan failures mid-stream, `ctx.Err()` after the response envelope was flushed) the loop silently continued and the caller marked the query `Complete`. Both functions now return errors and check `ctx.Err()` at every row/batch boundary. Operators get distinct signals for "stream truncated after headers committed" vs. clean completion.
- Parallel-partition partial failure surfaced as a request error: when the parallel executor fans out a query across N partitions and one or more partition queries error, the merged iterator previously returned the surviving partitions' rows as `success: true`. The handler now fails the whole request with HTTP 500. Companion fix: the goroutine fan-out semaphore is now acquired in the launch loop instead of inside each spawned goroutine, so for a query with 10K partition paths, in-flight goroutines are bounded by `MaxConcurrentPartitions` (default 4) instead of spawning 10K.
- Arrow IPC streaming memory pinned by deferred Release: the `executeQueryArrowIPC` stream loop used `defer batch.Release()` inside `for reader.Next()`. Defers accumulate until the function returns, so for a 10M-row result with 10K-row batches, 1,000 deferred Release calls held all casted Arrow records alive until the entire stream completed. Memory grew proportional to result size, defeating the streaming-memory contract. Fixed: each casted batch is released explicitly after `ipcWriter.Write` returns (sketched below).
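The Release fix in miniature. Local interfaces stand in for Arrow's `RecordReader` and `ipc.Writer` so the sketch is self-contained; the pattern is what matters, not the concrete types.

```go
package stream

import "fmt"

type record interface{ Release() }

type recordReader interface {
	Next() bool
	Record() record
}

type recordWriter interface {
	Write(record) error
}

// streamBatches releases each batch as soon as its write returns. The bug:
// `defer casted.Release()` inside the loop accumulates one deferred call per
// batch, pinning every batch until the function exits -- memory then grows
// with result size instead of staying at roughly one batch.
func streamBatches(r recordReader, w recordWriter, cast func(record) record) error {
	for r.Next() {
		casted := cast(r.Record())
		err := w.Write(casted)
		casted.Release() // explicit, per-batch -- not deferred
		if err != nil {
			return fmt.Errorf("stream write: %w", err)
		}
	}
	return nil
}
```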
Security
Beyond the hardening passes above:
- Write endpoints now require write-tier auth: five ingest endpoints (`/api/v1/write/msgpack`, `/write`, `/api/v2/write`, `/api/v1/write/line-protocol`, `/api/v1/write/tle`) and four bulk-import endpoints lacked explicit write-tier auth. A token issued with read-only permissions could write data when RBAC was disabled (the OSS default). All write endpoints now use `auth.RequireWrite`. Import endpoints, which can rewrite history, and the global flush endpoint use `auth.RequireAdmin`.
- Gzip and zstd decompression-bomb fixes: Fiber's transparent gunzip on `Content-Encoding: gzip` had no decompressed-size cap. A 1 MB gzip payload that decompressed to multiple gigabytes would OOM the process. Both Line Protocol and TLE handlers now read the raw request body, detect gzip/zstd by magic bytes, and decompress through a pooled limited reader with a hard 100 MB cap, same as the MessagePack handler (see the sketch after this list). The zstd path uses streaming decode rather than `DecodeAll`, so a 28 KB → 256 MB zstd bomb is rejected with bounded allocation.
- Cluster-safe DELETE: readers reject delete requests with 503 before any storage scan; database/measurement inputs are validated against `..`, `/`, `\`. Manifest-before-storage ordering is preserved for full-file deletes.
- Cluster-safe retention and continuous queries: retention and CQs now gate on `IsPrimaryWriter()`, checked at every tick. Failover and demotion take effect without restart. Manifest updates are batched into single Raft entries (~200ms → ~5ms apply latency for typical 20-output manifests).
- Directory permissions 0700: auth DB, CQ definitions, retention policies, Raft state, telemetry, import output. Existing deployments retain prior permissions; operators may `chmod 700` manually.
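The bounded-decompression guard, sketched: detect gzip by magic bytes and decode through a limited reader with a hard cap. Simplified relative to the real handlers (no pooling, gzip only, zstd elided):

```go
package ingest

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

const maxDecompressed = 100 << 20 // 100 MB hard cap

func decompressBody(raw []byte) ([]byte, error) {
	// gzip magic bytes: 0x1f 0x8b
	if len(raw) < 2 || raw[0] != 0x1f || raw[1] != 0x8b {
		return raw, nil // not gzip; pass through
	}
	zr, err := gzip.NewReader(bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	// Read one byte past the cap so "exactly at cap" and "bomb" are
	// distinguishable; allocation stays bounded either way.
	out, err := io.ReadAll(io.LimitReader(zr, maxDecompressed+1))
	if err != nil {
		return nil, err
	}
	if len(out) > maxDecompressed {
		return nil, fmt.Errorf("decompressed payload exceeds %d bytes", maxDecompressed)
	}
	return out, nil
}
```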
Deprecations
`?p=token` query-parameter authentication is now deprecated. Tokens passed in URLs leak through reverse proxies, load balancers, and access logs; that's a credential-leak risk. The `?p=` method continues to work, but Arc now logs a one-time warning on first use. Migrate clients to `Authorization: Bearer <token>`.
This is the InfluxDB 1.x compat path. If you're using the InfluxDB Go/Python clients with bearer-token configuration, you're already on the safe path.
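Migration is a one-line change in most clients (endpoint path illustrative):

```bash
# Deprecated: the token lands in proxy and load-balancer access logs
curl "https://arc.example.com/api/v1/query?p=$ARC_TOKEN&q=SELECT+1"

# Preferred: the token travels in a header
curl -H "Authorization: Bearer $ARC_TOKEN" \
  "https://arc.example.com/api/v1/query?q=SELECT+1"
```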
Bug Fixes
A handful of meaningful correctness fixes:
- WAL filename rotation collision: second-precision filenames could collide on rapid rotation, and the second rotation would reopen the existing file via `O_APPEND`, corrupting the WAL structure. Filenames now use nanosecond precision (`arc-YYYYMMDD_HHMMSS.000000000.wal`).
- Query registry reported 0 row count for Arrow-path queries: the Arrow path streams asynchronously via `SetBodyStreamWriter`, so `Complete(queryID, 0)` was always called with the placeholder count. Error and timeout branches also returned without notifying the registry, leaving queries stuck in "running". Fixed with `onComplete`/`onFail`/`onTimeout` callbacks.
- Low-volume measurements starved of age-based flushes under load: `periodicFlush` unconditionally reset its self-adjusting timer every time a new buffer was created. Under high write load, the timer was continuously reset, preventing low-volume buffers from being flushed within `max_buffer_age_ms`. Now the timer is only reset when the new computed deadline is earlier than the already-scheduled deadline.
- Memory not released after delete or retention: DuckDB's Parquet metadata cache and data block cache were populated during `read_parquet()` queries executed by delete and retention but never cleared. Both handlers now call `ClearHTTPCache()` after completion. `debug.FreeOSMemory()` is debounced via atomic CAS to fire at most once every 30 seconds, preventing GC storms when concurrent operations complete in rapid succession (sketched after this list).
- Writer-only schedulers skipped all ticks without failover enabled: `IsPrimaryWriter()` returned `true` only when `WriterState == WriterStatePrimary`, which is set exclusively via the failover manager. With failover disabled (the default), retention and CQs silently no-op'd on every tick. `Coordinator.IsPrimaryWriter()` now falls back to a role check (`Role == RoleWriter`) when no failover manager is configured.
- CQ not scheduled after API creation: a CQ created via `POST /api/v1/continuous_queries` wasn't picked up by the scheduler until the node was restarted. `handleCreate` now calls `scheduler.StartJobDirect` after a successful insert.
- RBAC goroutine leak: `RBACManager`'s background cache-cleanup goroutine ran in an infinite loop with no shutdown mechanism. Added a `Close()` method following the same pattern used by `AuthManager`. RBAC permission and token caches are also bounded with a configurable `MaxCacheSize` (default 10,000 entries each).
- Row-format MessagePack flush hardening: issue #401 reported that row-format MessagePack writes could be silently dropped at flush time with "no time data in batch". We couldn't reproduce it end-to-end on current builds, but the path now has explicit regression coverage and a dedicated Prometheus counter (`arc_buffer_flush_failures_total`) for visibility.
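The CAS debounce from the memory-release fix, sketched (names illustrative): at most one `debug.FreeOSMemory()` per 30-second window, enforced lock-free, with exactly one winner under concurrent completions.

```go
package memrelease

import (
	"runtime/debug"
	"sync/atomic"
	"time"
)

var lastRelease atomic.Int64 // unix nanos of the last FreeOSMemory call

const minInterval = 30 * time.Second

// maybeFreeOSMemory fires only if 30s have passed since the last successful
// call; losers of the CompareAndSwap simply skip the expensive release.
func maybeFreeOSMemory() {
	now := time.Now().UnixNano()
	last := lastRelease.Load()
	if now-last < int64(minInterval) {
		return
	}
	if lastRelease.CompareAndSwap(last, now) {
		debug.FreeOSMemory()
	}
}
```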
Dependencies
- DuckDB v2.5.5 (DuckDB 1.4.4) → v2.10501.0 (DuckDB 1.5.1): all platform-specific binary packages updated in lockstep.
- `aws-sdk-go-v2`: core 1.40 → 1.41.5, `service/s3` 1.92 → 1.99, `smithy-go` 1.23 → 1.24. DNS timeout errors are now retried automatically (improves S3 reliability on flaky networks); also fixes a config-load failure when a non-existent AWS profile is configured.
How to Update
```bash
docker pull ghcr.io/basekick-labs/arc:26.05.1
```

For binary installations, download 26.05.1 from https://github.com/Basekick-Labs/arc/releases.
For Kubernetes:
```bash
helm install arc https://github.com/basekick-labs/arc/releases/download/v26.05.1/arc-26.05.1.tgz
```

The full release notes (long-form, with config tables and per-fix detail) are in https://github.com/Basekick-Labs/arc/blob/release/26.05.1/RELEASE_NOTES_2026.05.1.md on the release branch.
Get started:
- Arc documentation
- Arc Enterprise deployment patterns
- https://github.com/Basekick-Labs/arc
- Join the Discord
Questions? Discord or https://github.com/Basekick-Labs/arc/issues.
Ready to handle billion-record workloads?
Deploy Arc in minutes. Own your data in Parquet. Use for analytics, observability, AI, IoT, or data warehousing.