Arc 26.02.2: Bulk Import, Query Governance, and the WAL Fix That Saved Your Data

Ten days since 26.02.1 and we're already shipping again. This release brings the feature I've been asked about the most since launch — bulk import — along with a wave of enterprise capabilities and a bug fix that some of you have been waiting for.
Let's get into it.
Bulk Import API (CSV & Parquet)
You have a 500MB CSV sitting on disk. Or a pile of Parquet files from a data pipeline. Until now, getting that data into Arc meant writing a script to stream it through the write API line by line.
Not anymore.
Arc now has dedicated import endpoints. Upload your file via multipart form data, Arc analyzes the time column to determine hourly partitions, converts each partition to Parquet with Snappy compression, and writes directly to your configured storage backend. The entire process bypasses the normal Arrow ingestion buffer, so bulk loads don't compete with streaming ingestion for resources.
```bash
# Import a CSV file
curl -X POST "http://localhost:8000/api/v1/import/csv?measurement=temperature&time_column=timestamp&time_format=epoch_s" \
  -H "Authorization: Bearer $ARC_TOKEN" \
  -H "x-arc-database: mydb" \
  -F "file=@sensors.csv"

# Import a Parquet file
curl -X POST "http://localhost:8000/api/v1/import/parquet?measurement=device_metrics&time_column=ts" \
  -H "Authorization: Bearer $ARC_TOKEN" \
  -H "x-arc-database: mydb" \
  -F "file=@telemetry.parquet"

# Check import stats
curl http://localhost:8000/api/v1/import/stats \
  -H "Authorization: Bearer $ARC_TOKEN"
```

CSV imports support custom delimiters, row skipping, and time format configuration (epoch seconds, milliseconds, microseconds, nanoseconds, or auto-detect). Both formats get automatic hourly partitioning, atomic writes per partition, and work across all storage backends — local, S3, and Azure Blob Storage.
No additional configuration needed. If your Arc instance is running, bulk import is ready to go.
Enterprise Features
This release is heavy on enterprise. Six new capabilities, all unlocked with a license key on the same binary.
Query Governance
Per-token rate limiting and query quotas. Rate limits use a sliding-window algorithm, configurable per minute and per hour; quotas cap total queries per hour and per day.
When a client exceeds their limit, they get a 429 with a Retry-After header telling them exactly when to try again. No guessing, no ambiguity.
This is the kind of thing you need when you have multiple teams querying the same Arc cluster and one team's runaway dashboard is eating all the resources.
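The sliding-window check behind those limits can be sketched as follows (an illustrative Python model of the algorithm, not Arc's actual implementation):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Illustrative sliding-window rate limiter.

    Allows at most `limit` requests in any rolling `window`-second span,
    and reports a Retry-After hint when a request is rejected.
    """

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.hits = deque()  # timestamps of accepted requests

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have slid out of the window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True, 0.0
        # The request can retry once the oldest hit leaves the window.
        return False, self.window - (now - self.hits[0])
```

On rejection, the second return value is the number of seconds a server would put in the Retry-After header of the 429 response.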
Audit Logging
Every authentication attempt, RBAC change, data operation, and administrative action is now logged to a structured SQLite audit trail. Events are categorized (auth.failed, token.created, data.write, rbac.role.updated, etc.) and queryable through the REST API.
You can filter by event type, actor, database, and time range. If compliance matters to your organization, this is the foundation for it.
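Filtering a SQLite audit trail by those dimensions looks roughly like this in Python; the table schema here is hypothetical, invented purely for illustration:

```python
import sqlite3

# Hypothetical schema for illustration; Arc's actual audit table
# layout is not documented here.
SCHEMA = """CREATE TABLE IF NOT EXISTS audit (
    ts INTEGER, event_type TEXT, actor TEXT, db TEXT, detail TEXT
)"""

def query_audit(conn, event_type=None, actor=None, db=None,
                since=None, until=None):
    """Filter audit events by type, actor, database, and time range,
    mirroring the filters the REST API exposes."""
    sql = "SELECT ts, event_type, actor, db, detail FROM audit WHERE 1=1"
    args = []
    for clause, value in (("event_type = ?", event_type),
                          ("actor = ?", actor),
                          ("db = ?", db),
                          ("ts >= ?", since),
                          ("ts <= ?", until)):
        if value is not None:
            sql += " AND " + clause
            args.append(value)
    return conn.execute(sql + " ORDER BY ts", args).fetchall()
```

Parameterized clauses are appended only for the filters the caller actually supplies, so one function covers every combination the API allows.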
Automatic Writer Failover
Arc Enterprise clusters now support automatic single-writer failover. If the active writer goes down, a standby node is promoted through Raft consensus in under 30 seconds. Health monitoring, candidate selection, and coordinated promotion — all handled automatically.
Cluster-Wide Core Limits
License core limits are now enforced across the entire cluster, not per node. The cluster leader validates that adding a new node won't exceed your licensed core count. Simple, fair, and predictable.
Long-Running Query Management
You can now track active queries in real time, cancel runaway operations mid-flight, and review query history. Each query gets a unique ID with execution metadata. When that one analyst runs a SELECT * on 12 billion rows and you need to stop it, now you can.
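The bookkeeping behind this feature can be modeled as a thread-safe registry of active queries. This is a sketch of the idea, not Arc's code:

```python
import threading
import uuid

class QueryRegistry:
    """Track active queries by unique ID and support cancellation.

    Illustrative model only: each query carries a cancel flag that the
    executing worker is expected to poll between processing batches.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._active = {}  # query_id -> {"sql": str, "cancel": Event}

    def start(self, sql):
        qid = uuid.uuid4().hex
        with self._lock:
            self._active[qid] = {"sql": sql, "cancel": threading.Event()}
        return qid

    def cancel(self, qid):
        with self._lock:
            entry = self._active.get(qid)
        if entry is None:
            return False
        entry["cancel"].set()  # worker sees this flag and aborts
        return True

    def is_cancelled(self, qid):
        with self._lock:
            entry = self._active.get(qid)
        return bool(entry and entry["cancel"].is_set())

    def finish(self, qid):
        with self._lock:
            self._active.pop(qid, None)

    def list_active(self):
        with self._lock:
            return {qid: e["sql"] for qid, e in self._active.items()}
```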
Tiered Storage Migration Gate
The cold-tier migration system got smarter. It now only migrates daily-compacted files to cold storage (S3 GLACIER, Azure Archive), skipping raw and hourly-compacted files. Fewer objects in cold storage means lower costs and simpler management.
Bug Fixes
WAL Duplication (#199)
This one was painful. The periodic WAL maintenance process was replaying entries that had already been persisted to Parquet, causing 2x data duplication. If you noticed your data counts doubling after restarts, this was why.
The fix properly distinguishes between normal operation (purge old WAL files) and recovery scenarios (replay only when flush failures actually occurred). Your data is now written once, as it should be.
S3 Flush Hangs (#197)
Storage write operations had no timeout. When S3 became unresponsive, workers would block indefinitely. All writes now have a configurable timeout with a 30-second default. If S3 goes away, you get an error instead of a frozen process.
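The shape of the fix, bounding a potentially hanging write with a timeout, can be sketched in Python (Arc's implementation isn't Python, so this is purely illustrative):

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FuturesTimeout

def write_with_timeout(write_fn, payload, timeout=30.0):
    """Run a storage write in a worker thread and give up after
    `timeout` seconds instead of blocking the caller forever."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(write_fn, payload)
        try:
            return future.result(timeout=timeout)
        except FuturesTimeout:
            raise TimeoutError(f"storage write exceeded {timeout}s")
    finally:
        # Don't wait for a hung worker; abandon it and surface the error.
        pool.shutdown(wait=False)
```

The caller now gets a `TimeoutError` it can log and retry instead of a frozen worker.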
Empty Measurement Queries (#198)
Querying a measurement with no data returned HTTP 500 instead of an empty result set. The handler now properly distinguishes between "no files exist for this measurement" and "something actually broke."
Backfilled Data Compaction (#187)
Daily compaction was rejecting partitions from historical backfills because they failed a creation-time check. Partitions older than 7 days now bypass that validation, so backfilling months of historical data works correctly.
Timeout Mismatch (#185)
The HTTP WriteTimeout (30s) was shorter than the query timeout (300s), which meant long-running queries would get killed by the HTTP layer before the query engine timed them out. Arc now auto-synchronizes these values at startup.
Improvements
WHERE clause optimization. Queries combining LIKE with empty string checks now reorder predicates to filter the cheaper operation first. 12.6% latency improvement on affected query patterns.
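The idea is to evaluate cheap predicates first so that expensive LIKE matches see fewer rows. A toy cost-based reordering, with costs invented for illustration:

```python
# Rough per-row cost model; the real optimizer's costs are internal to Arc.
PREDICATE_COST = {
    "is_empty": 1,   # cheap equality check against ''
    "equals": 2,
    "like": 10,      # per-row pattern match, much more expensive
}

def order_predicates(predicates):
    """Reorder AND-ed predicates so cheaper ones run first, sparing
    expensive LIKE matches from rows already filtered out.

    Each predicate is a tuple whose first element names its kind,
    e.g. ("like", "name", "%sensor%").
    """
    return sorted(predicates, key=lambda p: PREDICATE_COST.get(p[0], 5))
```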
Per-database compaction. You can now trigger compaction for a specific database instead of running it across all databases simultaneously. Useful when one database is massive and the others are small.
Upgrade Notes
Bulk import is available immediately with no configuration changes. The default payload size limit is 1GB — adjust if you need to import larger files.
Enterprise features require a license key. All tiers include a 14-day free trial. Contact enterprise@basekick.net to get started.
Get It
```bash
docker run -d \
  -p 8000:8000 \
  -e STORAGE_BACKEND=local \
  -v arc-data:/app/data \
  ghcr.io/basekick-labs/arc:latest
```

- GitHub: https://github.com/basekick-labs/arc
- Documentation: https://docs.basekick.net/arc
- Discord: https://discord.gg/nxnWfUxsdm
- Python SDK: https://pypi.org/project/arc-tsdb-client/
Ready to handle billion-record workloads?
Deploy Arc in minutes. Own your data in Parquet. Use for analytics, observability, AI, IoT, or data warehousing.
