Arc 26.06.2: FIPS Build, Signed Supply Chain, Sandbox Fix

26.06.2 is the first patch release under Arc's new quarterly cadence. It lands a few weeks after 26.06.1, which is exactly the point we made when we slowed the major cycle down: the quarterly window is for features, but patches ship whenever they need to. This one did.

It's also a patch that carries the things defense and regulated buyers ask about first. There's a FIPS 140-3 build now, and every release ships with a signed, attested supply chain: SBOMs, vulnerability scans, cosign signatures, and SLSA provenance.

Despite all of that, 26.06.2 is a drop-in upgrade. No config change, no schema migration, no breaking API changes. Your existing arc.toml, license keys, and tokens keep working. If you're on 26.06.1, update.

A FIPS 140-3 build for regulated environments

Arc now ships an optional arc-fips build for US defense, aerospace, and other environments that require validated cryptography. It is the same source at the same version as the standard build, not a separate product or version line. It is compiled with a fips build tag against the CMVP-certified Go Cryptographic Module v1.0.0 (GOFIPS140=v1.0.0), runs the module in FIPS-only mode (GODEBUG=fips140=only is baked into the binary), and fails closed at startup if it is not actually running in FIPS mode. It carries the same Arrow/DuckDB performance build as the standard binary.

What runs through the FIPS module:

  • TLS for the API, cluster, and MQTT paths is restricted to FIPS-approved cipher suites and curves.
  • API-token hashing moves from bcrypt (not FIPS-approved) to PBKDF2-HMAC-SHA256.
  • Cluster replication key derivation uses the standard library's crypto/hkdf.

The FIPS binary links no non-approved crypto, verified in CI by an import-graph check. DuckDB and SQLite perform no cryptography and sit outside the module boundary.

The variant ships as arc-fips-linux-amd64 and arc-fips-linux-arm64 (each with a cosign .bundle), container images tagged :26.06.2-fips on GHCR and Docker Hub, and arc-fips .deb / .rpm packages. Both variants report the same version; the FIPS build is identified by its artifact name and "fips_mode":true in its startup log.

One honest caveat, stated plainly. Arc's FIPS build is compiled against a CMVP-certified cryptographic module. That means the module is validated. Arc itself is not a CMVP-listed module, and this is not a claim that "Arc is FIPS 140-3 validated." If you're putting this in a compliance submission, confirm the live certificate number on the NIST CMVP list first.

Operator note: moving a deployment to the FIPS build requires rotating (recreating) your existing API tokens. Tokens created by a non-FIPS build are stored as bcrypt hashes, which the FIPS build refuses to verify; new tokens are PBKDF2 automatically. Arc Enterprise customers get FIPS by running the arc-fips build with their license key — there is no separate enterprise FIPS binary.
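To make the token-rotation requirement concrete, here is a minimal sketch of PBKDF2-HMAC-SHA256 token hashing that refuses legacy bcrypt hashes. It uses the standard library's crypto/pbkdf2 package (Go 1.24 or later). The stored-hash format, the iteration count, and the names hashToken and verifyToken are assumptions made for illustration; these notes do not describe Arc's actual storage layout or parameters.

```go
package main

import (
	"crypto/pbkdf2"
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Hypothetical parameters and storage format, chosen for this sketch only.
const (
	scheme     = "pbkdf2-sha256"
	iterations = 600_000 // assumption, not Arc's documented value
	saltLen    = 16
	keyLen     = 32
)

var errLegacyHash = errors.New("bcrypt hash refused: recreate this token")

// hashToken returns "pbkdf2-sha256$<iterations>$<salt hex>$<key hex>".
func hashToken(token string) (string, error) {
	salt := make([]byte, saltLen)
	if _, err := rand.Read(salt); err != nil {
		return "", err
	}
	key, err := pbkdf2.Key(sha256.New, token, salt, iterations, keyLen)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s$%d$%x$%x", scheme, iterations, salt, key), nil
}

// verifyToken checks a token against a stored hash. Hashes that look like
// bcrypt ("$2a$", "$2b$", "$2y$") are rejected outright rather than verified.
func verifyToken(token, stored string) (bool, error) {
	if strings.HasPrefix(stored, "$2") {
		return false, errLegacyHash
	}
	parts := strings.Split(stored, "$")
	if len(parts) != 4 || parts[0] != scheme {
		return false, errors.New("unrecognized hash format")
	}
	iter, err := strconv.Atoi(parts[1])
	if err != nil || iter < 1 {
		return false, errors.New("invalid iteration count")
	}
	salt, err := hex.DecodeString(parts[2])
	if err != nil {
		return false, err
	}
	want, err := hex.DecodeString(parts[3])
	if err != nil || len(want) != keyLen {
		return false, errors.New("invalid derived key")
	}
	got, err := pbkdf2.Key(sha256.New, token, salt, iter, keyLen)
	if err != nil {
		return false, err
	}
	// Constant-time comparison avoids leaking how many bytes matched.
	return subtle.ConstantTimeCompare(got, want) == 1, nil
}

func main() {
	stored, err := hashToken("example-token")
	if err != nil {
		panic(err)
	}
	ok, err := verifyToken("example-token", stored)
	fmt.Println("new token verifies:", ok, err)

	_, err = verifyToken("example-token", "$2b$12$legacyhashplaceholder")
	fmt.Println("legacy token:", err)
}
```

The point of the sketch is the first branch in verifyToken: a bcrypt hash cannot be converted in place because the original token is never stored, so the only path forward is issuing a new token.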

If you're running Arc on satellite telemetry or at the tactical edge, this is the build you've been asking for.

A signed, attested supply chain

Every Arc release now ships machine-readable supply-chain evidence as first-class artifacts, targeting EO 14028 and defense supply-chain review requirements. SBOMs are generated with Syft (https://github.com/anchore/syft), container images are scanned with Trivy (https://github.com/aquasecurity/trivy), the Go module graph is scanned with govulncheck (a finding blocks the build), binaries and images are signed with cosign (https://github.com/sigstore/cosign) using keyless OIDC signing anchored in the Rekor transparency log, and the binaries carry SLSA Level 3 provenance.

To be precise about what this is: the deliverable is signed scan evidence, not "zero findings." Transitive OS CVEs from the Debian base layer, outside Arc's control, are reported but do not gate the release. Air-gapped and Zarf-based deployments can verify everything offline.

What you can pull from each GitHub release:

  • arc-26.06.2-sbom-container.spdx.json — SPDX SBOM from the container image (DuckDB native libraries, SQLite, and every Debian package in the runtime layer)
  • arc-26.06.2-sbom-source.cyclonedx.json — CycloneDX SBOM from the Go source tree (every module with its exact version and SPDX license)
  • arc-26.06.2-trivy-report.json — full Trivy container scan; SARIF results are also uploaded to the GitHub Security tab
  • arc-linux-amd64.bundle, arc-linux-arm64.bundle — cosign signature bundles (signature + Rekor transparency-log entry)
  • arc-26.06.2.intoto.jsonl — SLSA Level 3 provenance attestation
  • FIPS counterparts: arc-26.06.2-fips-sbom-container.spdx.json, arc-26.06.2-fips-trivy-report.json, the :26.06.2-fips images, and the arc-fips packages

Verifying a container image is one command:

cosign verify ghcr.io/basekick-labs/arc:26.06.2 \
  --certificate-identity-regexp "^https://github.com/Basekick-Labs/arc/" \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com

Security: the sandbox CVE is now fully closed, update now

This release completes the fix for CVE-2026-47735. The 26.06.1 sandbox fix blocked the literal read_parquet() and arc_partition_agg() spellings, but DuckDB exposes the same filesystem reads through a wider family of path-taking table functions: parquet_scan (an alias), glob, read_blob, read_csv / read_csv_auto, read_json*, parquet_metadata, read_text, delta_scan, iceberg_scan, and more. On an RBAC multi-tenant deployment, any authenticated principal — including a least-privileged scoped token — could read another database's Parquet data, file listings, and column statistics by naming one of these functions directly. The query validator now rejects this entire family by name, in any query position (comma cross-joins, subqueries, IN-lists, lateral joins, quoted-identifier spellings), across every query endpoint. Pure non-I/O table functions (generate_series, range, unnest) and all scalar/aggregate functions are unaffected.
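The reject-by-name idea can be illustrated with a deliberately small sketch. It is not Arc's validator: it uses a regular expression rather than a SQL parser, the function deniedCall and the denylist contents are assumptions drawn only from the names listed above, and it will produce false positives on string literals that happen to contain a call-like pattern.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Illustrative subset of path-taking table functions named in the notes.
var deniedTableFuncs = map[string]bool{
	"read_parquet": true, "parquet_scan": true, "glob": true,
	"read_blob": true, "read_csv": true, "read_csv_auto": true,
	"parquet_metadata": true, "read_text": true,
	"delta_scan": true, "iceberg_scan": true,
}

// callRe matches a bare or double-quoted identifier followed by "(".
var callRe = regexp.MustCompile(`"([^"]+)"\s*\(|([A-Za-z_][A-Za-z0-9_]*)\s*\(`)

// deniedCall reports the first denied function name that appears in call
// position anywhere in the query text, regardless of case or quoting.
func deniedCall(sql string) (string, bool) {
	for _, m := range callRe.FindAllStringSubmatch(sql, -1) {
		name := m[1] // quoted spelling
		if name == "" {
			name = m[2] // bare spelling
		}
		name = strings.ToLower(name)
		// read_json* is handled as a prefix, matching the wildcard above.
		if deniedTableFuncs[name] || strings.HasPrefix(name, "read_json") {
			return name, true
		}
	}
	return "", false
}

func main() {
	queries := []string{
		`SELECT * FROM read_parquet('/data/other_db/cpu.parquet')`,
		`SELECT * FROM cpu, "PARQUET_SCAN" ('/data/other_db/cpu.parquet')`,
		`SELECT * FROM cpu WHERE host IN (SELECT file FROM glob('/data/*'))`,
		`SELECT * FROM generate_series(1, 3)`,
	}
	for _, q := range queries {
		name, denied := deniedCall(q)
		fmt.Printf("denied=%v name=%q\n", denied, name)
	}
}
```

Matching on the name in any call position is what covers cross-joins, subqueries, and IN-lists at once. A production validator would want a real parse tree so that literals and comments are not mistaken for calls.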

S3 credentials are no longer readable through the query API. Arc previously configured DuckDB's S3 access with SET GLOBAL s3_secret_access_key=…, and DuckDB exposes settings to plain SQL — so any authenticated principal could read the secret back with SELECT current_setting('s3_secret_access_key'). S3 credentials now live in DuckDB's secrets manager via CREATE SECRET, which is not reachable through current_setting(). Primary and cold-tier storage get separate, scope-bound secrets, and S3/Azure access now supports the AWS credential chain (IRSA, IAM roles, environment credentials). This affects only deployments that configure S3-compatible or Azure storage.
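As a rough illustration of the secrets-manager approach, the sketch below builds DuckDB CREATE SECRET statements for a static-key configuration and for the credential chain, each bound to a scope. The s3Secret type, the secret names, and the bucket URLs are hypothetical; only the CREATE SECRET options (TYPE, PROVIDER, KEY_ID, SECRET, REGION, SCOPE) are DuckDB's own.

```go
package main

import (
	"fmt"
	"strings"
)

// s3Secret describes one scope-bound DuckDB secret. Name must be a trusted
// identifier (it is not quoted here); the other fields are quoted as SQL
// string literals. All field names are assumptions for this sketch.
type s3Secret struct {
	Name   string
	Scope  string // e.g. "s3://bucket": the secret applies only under this prefix
	KeyID  string // empty means "use the AWS credential chain"
	Secret string
	Region string
}

// quote renders s as a SQL string literal, doubling embedded single quotes.
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// createSQL returns the CREATE SECRET statement for this configuration.
func (s s3Secret) createSQL() string {
	opts := []string{"TYPE s3"}
	if s.KeyID == "" {
		opts = append(opts, "PROVIDER credential_chain")
	} else {
		opts = append(opts, "KEY_ID "+quote(s.KeyID), "SECRET "+quote(s.Secret))
	}
	if s.Region != "" {
		opts = append(opts, "REGION "+quote(s.Region))
	}
	opts = append(opts, "SCOPE "+quote(s.Scope))
	return fmt.Sprintf("CREATE OR REPLACE SECRET %s (%s)", s.Name, strings.Join(opts, ", "))
}

func main() {
	primary := s3Secret{Name: "arc_primary", Scope: "s3://hot-bucket"}
	cold := s3Secret{
		Name: "arc_cold", Scope: "s3://cold-bucket",
		KeyID: "AKIAEXAMPLE", Secret: "example-secret", Region: "us-east-1",
	}
	fmt.Println(primary.createSQL())
	fmt.Println(cold.createSQL())
}
```

Binding each secret to a SCOPE is what keeps the primary and cold-tier credentials separate: DuckDB picks the secret whose scope matches the path being read.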

A handful of other hardening fixes round out the release:

  • GET /api/v1/logs now requires an admin token. It had been registered among the public base routes (health, readiness, metrics), which placed it ahead of the auth middleware and left it reachable without a token. Reported by @sondt99.
  • Query endpoints consistently require read permission. The query routes previously accepted any valid token, so a write-only token could run SELECTs. They now require read (or admin), matching the write endpoints.
  • POST / DELETE /api/v1/databases now require admin tokens, matching every other mutating endpoint. Read-only routes are unchanged.
  • The auth database file is locked to 0600. The SQLite database that stores token hashes and RBAC roles (plus its -wal / -shm siblings) was created with the process umask, world-readable on most systems. It's now owner-only.
  • Clustering fails closed without a shared secret (Enterprise). A node with cluster.enabled = true but no cluster.shared_secret now refuses to start instead of silently skipping inter-node authentication. Reported by @sondt99 through coordinated disclosure. There's a rolling-upgrade caveat: the HMAC payload for join/leave/heartbeat now carries a per-message-type label, so cross-version messages fail validation during a mixed-version window. Upgrade all cluster nodes in one window; the transient "node unhealthy" / "authentication failed" log lines clear once every node is on 26.06.2.
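The clustering change in the last item can be sketched in a few lines. This shows one plausible way to bind a message-type label into an HMAC and to fail closed on a missing secret; the function names, the label strings, and the length-prefixed encoding are assumptions, not Arc's wire format.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
)

// checkClusterConfig fails closed: clustering without a shared secret is a
// startup error rather than a silent downgrade to unauthenticated messages.
func checkClusterConfig(enabled bool, sharedSecret string) error {
	if enabled && sharedSecret == "" {
		return errors.New("cluster.enabled = true requires cluster.shared_secret")
	}
	return nil
}

// sign computes HMAC-SHA256 over a length-prefixed message-type label
// followed by the payload. The length prefix keeps ("ab", "c") and
// ("a", "bc") from producing the same MAC input.
func sign(secret []byte, msgType string, payload []byte) []byte {
	mac := hmac.New(sha256.New, secret)
	var n [4]byte
	binary.BigEndian.PutUint32(n[:], uint32(len(msgType)))
	mac.Write(n[:])
	mac.Write([]byte(msgType))
	mac.Write(payload)
	return mac.Sum(nil)
}

// verify recomputes the tag for the expected message type and compares it
// in constant time.
func verify(secret []byte, msgType string, payload, tag []byte) bool {
	return hmac.Equal(sign(secret, msgType, payload), tag)
}

func main() {
	secret := []byte("example-shared-secret")
	payload := []byte(`{"node":"arc-2"}`)
	tag := sign(secret, "heartbeat", payload)

	fmt.Println(verify(secret, "heartbeat", payload, tag)) // true
	fmt.Println(verify(secret, "join", payload, tag))      // false
}
```

Because the label is part of the MAC input, a captured heartbeat cannot be replayed as a join. It also explains the mixed-version caveat: a node that signs without the label produces tags that a labelled verifier rejects.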

Full per-fix detail, including the RBAC normalization and multi-statement-query rejection work, is in the long-form notes linked at the bottom.

Bug fixes operators will feel

Compaction no longer wedges a partition when files disagree on the time column type. A regression from 26.05.1 (https://github.com/Basekick-Labs/arc/issues/411) made the ingest buffer's schema signature type-aware, which meant a writer that sent time as a string for some batches and integer epochs for others routed those batches into separate Parquet files in the same partition. Compaction couldn't reconcile the conflicting types (TIMESTAMP WITH TIME ZONE != VARCHAR), so the partition failed to compact on every cycle while raw files piled up. The fix forces time to an integer-microsecond timestamp at ingest (a string or null time is now rejected with a clear error), and the compaction query normalizes time so partitions already written by an affected build self-heal on the next cycle. A follow-up replaced the original CTE-based normalization with a temp-table materialization — the CTE held up on a narrow test fixture but not at the column width of real production partitions, where the dedup path failed to bind even when no VARCHAR file was present.

Primary S3 reads now authenticate via the AWS credential chain. On an EKS deployment using IRSA (no static keys), Arc previously created no DuckDB S3 secret for the primary backend, so s3:// query reads failed with an opaque permission error even though writes succeeded and the pod's IAM role was valid. The query path now provisions a credential-chain secret whenever the primary backend is S3-compatible. Azure gets the matching fix for connection-string-only configurations. Deployments with static keys are unchanged.

Pre-1970 timestamps now partition into the correct hour (https://github.com/Basekick-Labs/arc/issues/312). Hour bucketing used integer division, which truncates toward zero rather than flooring, so a negative timestamp landed one hour too late — a row at 1969-12-31 23:30 filed into 1970/01/01/00/. Bucketing now floors toward negative infinity. Non-negative timestamps are unaffected.
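The difference between truncating and flooring is easy to see in a short sketch. Go's integer division truncates toward zero, so a floor has to be written explicitly. The helper names and the microsecond unit are assumptions; the path layout follows the 1970/01/01/00/ example above.

```go
package main

import (
	"fmt"
	"time"
)

const microsPerHour = int64(3600) * 1_000_000

// floorDiv divides a by b, rounding toward negative infinity.
func floorDiv(a, b int64) int64 {
	q := a / b // Go truncates toward zero
	if a%b != 0 && (a < 0) != (b < 0) {
		q-- // step down when the exact quotient is negative and inexact
	}
	return q
}

// hourPath returns the year/month/day/hour partition prefix for a
// timestamp given in microseconds since the Unix epoch (UTC).
func hourPath(tsMicros int64) string {
	hour := floorDiv(tsMicros, microsPerHour)
	return time.Unix(hour*3600, 0).UTC().Format("2006/01/02/15/")
}

func main() {
	// 1969-12-31 23:30:00 UTC is 1800 seconds before the epoch.
	ts := int64(-1800) * 1_000_000

	truncated := ts / microsPerHour
	fmt.Println(time.Unix(truncated*3600, 0).UTC().Format("2006/01/02/15/")) // 1970/01/01/00/
	fmt.Println(hourPath(ts))                                                // 1969/12/31/23/
}
```

For non-negative timestamps the two divisions agree, which is why only pre-1970 data was ever misfiled.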

CSV and Parquet bulk imports work again against the shipped DuckDB version. The import path used to introspect uploaded files with DuckDB queries, and the DESCRIBE-as-subquery form was rejected by the version Arc ships, so CSV imports failed with a 422 before any data landed. Both formats now parse in-process through the same streaming pipeline as Line Protocol. Empty or malformed files now fail fast with a clear 400 (empty file, blank or duplicate column names, a colliding time_column rename, or a NaN/Inf float time value), and Parquet DECIMAL columns import as DOUBLE.
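The fail-fast checks listed above might look roughly like the following. This is a sketch under assumptions: the function names are invented, and it assumes the configured time column is renamed to "time" on import, which is what makes a pre-existing "time" column a collision.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strings"
)

// validateColumns rejects headers that cannot be imported safely. Each
// error here would map to an HTTP 400 in the import handler.
func validateColumns(cols []string, timeColumn string) error {
	if len(cols) == 0 {
		return errors.New("empty file")
	}
	seen := make(map[string]bool, len(cols))
	for i, c := range cols {
		name := strings.TrimSpace(c)
		if name == "" {
			return fmt.Errorf("column %d has a blank name", i+1)
		}
		if seen[name] {
			return fmt.Errorf("duplicate column name %q", name)
		}
		seen[name] = true
	}
	// Assumption: the time column is renamed to "time", so another column
	// already called "time" would collide with it.
	if timeColumn != "time" && seen["time"] {
		return fmt.Errorf("renaming %q to \"time\" collides with an existing column", timeColumn)
	}
	return nil
}

// validateTime rejects float time values that have no timestamp meaning.
func validateTime(v float64) error {
	if math.IsNaN(v) || math.IsInf(v, 0) {
		return errors.New("time value is NaN or Inf")
	}
	return nil
}

func main() {
	fmt.Println(validateColumns([]string{"ts", "host", "usage"}, "ts"))
	fmt.Println(validateColumns([]string{"ts", "host", "host"}, "ts"))
	fmt.Println(validateColumns([]string{"ts", "time", "usage"}, "ts"))
	fmt.Println(validateTime(math.NaN()))
}
```

Running all of these checks on the header before any row is written is what makes the failure clean: nothing lands in storage for a file that is going to be rejected.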

A few more worth a line each:

  • The WAL reader now uses io.ReadFull for fixed-size header reads, so a short read can't cascade into corrupting every subsequent entry.
  • The WAL writer now tracks write failures (arc_wal_failed_writes_total) and rotates to a fresh file on error instead of hammering a bad handle.
  • S3/MinIO and Azure backends now record storage read/write/error metrics (https://github.com/Basekick-Labs/arc/issues/349) — these counters previously sat at zero on cloud deployments.
  • Two tiering policy-cache fixes (https://github.com/Basekick-Labs/arc/issues/345, https://github.com/Basekick-Labs/arc/issues/499): default-policy lookups no longer hit SQLite on every call, and a lookup racing a deletion can no longer re-cache the deleted policy. Enterprise-only.
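The WAL reader fix in the first item comes down to a property of io.Reader: a single Read call may return fewer bytes than requested without that being an error. A minimal sketch, assuming a hypothetical 8-byte header (4-byte length plus 4-byte CRC, little-endian) that is not Arc's actual WAL layout:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
)

// headerSize and the field layout are hypothetical.
const headerSize = 8

type entryHeader struct {
	Length uint32
	CRC    uint32
}

// readHeader reads exactly headerSize bytes. io.ReadFull keeps reading
// until the buffer is full, returning io.EOF if nothing was read and
// io.ErrUnexpectedEOF if the header was cut short.
func readHeader(r io.Reader) (entryHeader, error) {
	var buf [headerSize]byte
	if _, err := io.ReadFull(r, buf[:]); err != nil {
		return entryHeader{}, err
	}
	return entryHeader{
		Length: binary.LittleEndian.Uint32(buf[0:4]),
		CRC:    binary.LittleEndian.Uint32(buf[4:8]),
	}, nil
}

// oneByteReader returns at most one byte per Read call, simulating the
// short reads a plain r.Read(buf) would mishandle.
type oneByteReader struct{ data []byte }

func (r *oneByteReader) Read(p []byte) (int, error) {
	if len(r.data) == 0 {
		return 0, io.EOF
	}
	if len(p) == 0 {
		return 0, nil
	}
	p[0] = r.data[0]
	r.data = r.data[1:]
	return 1, nil
}

func main() {
	raw := []byte{5, 0, 0, 0, 0xEF, 0xBE, 0xAD, 0xDE}

	h, err := readHeader(&oneByteReader{data: raw})
	fmt.Printf("%+v %v\n", h, err)

	_, err = readHeader(&oneByteReader{data: raw[:3]})
	fmt.Println(err) // unexpected EOF
}
```

With a bare Read, the first call in this example would fill one byte of the header and leave the stream offset seven bytes short of where the parser thinks it is, which is how one short read can misalign every entry after it.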

Performance

Three changes on the hot paths:

  • Compaction cleanup uses batch-delete APIs on S3 and Azure. It previously deleted compacted source files one API call at a time; on a large cycle (hundreds of files), that's hundreds of sequential calls. It now uses DeleteObjects / BlobBatch, cutting S3 DELETE calls by up to 1000×. Local storage is unaffected.
  • Single-hour flush no longer builds and discards a per-hour index. Every flush allocated an index bucket over every buffered row, then threw it away when the data fell in one hour — the common case for live ingest. On a 5M-row buffer that was ~201 MB allocated and discarded per flush. A cheap inline min/max scan now makes the decision, and the buckets are only built on the multi-hour backfill path.
  • The flush time-sort is now a radix sort. A closure-based sort.Slice cost ~6.9% of ingest CPU on the effectively-unordered merged buffer. A sign-aware LSD radix sort (which also keeps pre-1970 timestamps ordering correctly) replaces it — about 5× faster on that shape.

Single-hour flush (5M rows)    Before     After
Time                           16 ms      1.25 ms
Allocated                      201 MB     0 B

Together, sustained MessagePack-columnar ingest improved from about 20.0M to 20.9M rec/s over a 60-second run, with a flatter throughput curve and lower tail latency.
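For readers curious what "sign-aware LSD radix sort" means in practice, here is a compact sketch over int64 timestamps. It is an illustration of the technique, not Arc's implementation: the real sort presumably moves row indices or whole records alongside the keys, and the function name is invented.

```go
package main

import "fmt"

// signBit flips the top bit so that two's-complement int64 order maps onto
// unsigned order: negative values sort before non-negative ones.
const signBit = uint64(1) << 63

// radixSortInt64 sorts a in ascending order with a least-significant-digit
// radix sort: eight stable counting passes, one per byte.
func radixSortInt64(a []int64) {
	if len(a) < 2 {
		return
	}
	src, dst := a, make([]int64, len(a))
	for shift := uint(0); shift < 64; shift += 8 {
		// Count how many keys fall in each of the 256 byte buckets.
		var count [256]int
		for _, v := range src {
			count[byte((uint64(v)^signBit)>>shift)]++
		}
		// Turn counts into starting offsets (exclusive prefix sum).
		sum := 0
		for i := range count {
			c := count[i]
			count[i] = sum
			sum += c
		}
		// Scatter in input order, which keeps each pass stable.
		for _, v := range src {
			b := byte((uint64(v) ^ signBit) >> shift)
			dst[count[b]] = v
			count[b]++
		}
		src, dst = dst, src
	}
	// Eight passes means an even number of swaps, so the sorted data
	// ends up back in the caller's slice.
}

func main() {
	ts := []int64{1_700_000_000_000_000, -1_800_000_000, 0, 42, -1, 42}
	radixSortInt64(ts)
	fmt.Println(ts)
}
```

The sort does no comparisons and makes no calls through a closure, which is where the gain over sort.Slice comes from on a large, effectively unordered buffer. The cost is one scratch slice the size of the input.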

Deployment

Helm: IRSA support for external S3 (EKS). The arc-enterprise chart can now authenticate to external S3 through the pod's IAM role instead of static keys. Set storage.shared.credentials.useIRSA=true and the chart omits the access/secret-key env vars so Arc falls back to the AWS credential chain; attach the role via serviceAccount.create=true and the eks.amazonaws.com/role-arn annotation. Install fails fast if useIRSA=true without the annotation. Point image.tag at 26.06.2 to get the primary-S3 IRSA fix on the query path — on 26.06.1 only the write path used the chain.

Operator action items

  • Update. It's a drop-in upgrade; no config or schema change.
  • If you issue single-permission tokens, make sure tokens used for querying carry read, and any automation that provisions databases via POST /api/v1/databases uses an admin token.
  • Enterprise clustering without a shared secret: set ARC_CLUSTER_SHARED_SECRET on every node before upgrading, and upgrade all nodes in one window.
  • Moving to the FIPS build: rotate (recreate) your existing API tokens so they re-store as PBKDF2.
  • CSV/Parquet import callers: malformed files now return 400 instead of partially succeeding, and Parquet DECIMAL columns import as DOUBLE — review any automation that ignored import error responses.

How to update

# Docker Hub
docker pull basekicklabs/arc:26.06.2
 
# or GitHub Container Registry
docker pull ghcr.io/basekick-labs/arc:26.06.2

For binary installations, download 26.06.2 from the releases page (https://github.com/Basekick-Labs/arc/releases). For Kubernetes:

helm upgrade --install arc https://github.com/basekick-labs/arc/releases/download/v26.06.2/arc-26.06.2.tgz

The full long-form release notes, with per-fix detail, the FIPS crypto boundary, the verify commands for every signed artifact, and the threat-model notes for each security fix, are in RELEASE_NOTES_2026.06.2.md on the release branch: https://github.com/Basekick-Labs/arc/blob/release/26.06.2/RELEASE_NOTES_2026.06.2.md


Questions? Ask on Discord or open an issue at https://github.com/Basekick-Labs/arc/issues.
