Getting Started with the Python SDK for Arc

We've been getting requests for a Python SDK since day one. "Love Arc, but I don't want to craft HTTP requests manually." Fair enough.
With the release of Arc 26.01.1, the official Python SDK is now available on PyPI as arc-tsdb-client. It gives you high-performance MessagePack ingestion, query responses directly into pandas or polars DataFrames, buffered writes with automatic batching, and the full management API for retention policies, tokens, and continuous queries.
Let's walk through it.
Installation
The SDK is available via pip or uv. Install the base package or add optional dependencies for DataFrame support:
# Base installation
pip install arc-tsdb-client
# With pandas support
pip install arc-tsdb-client[pandas]
# With polars support
pip install arc-tsdb-client[polars]
# Everything (pandas, polars, PyArrow)
pip install arc-tsdb-client[all]
If you're using uv:
uv add arc-tsdb-client[all]
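One shell gotcha: zsh treats square brackets as glob characters, so quote the extras there:
pip install 'arc-tsdb-client[all]'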
Quick Start
First, create a client and connect to your Arc instance:
from arc_client import ArcClient
client = ArcClient(
    host="localhost",
    port=8000,
    token="your-arc-token",
    database="default"
)
# Verify the connection
info = client.auth.verify()
print(f"Connected to Arc. Token: {info.token_info.name}")That's it. You're connected.
Writing Data
Arc supports multiple ingestion formats. The SDK makes all of them straightforward.
MessagePack Columnar (Recommended)
This is the fastest way to write data—18M+ records per second with automatic gzip compression. Use write_columnar() for bulk ingestion:
client.write.write_columnar(
    measurement="temperature",
    columns={
        "time": [1705660800000000, 1705660801000000, 1705660802000000],
        "device": ["sensor_01", "sensor_02", "sensor_01"],
        "location": ["warehouse_a", "warehouse_a", "warehouse_b"],
        "value": [22.5, 23.1, 21.8]
    },
    database="iot"
)
The columns dictionary maps column names to lists of values. All lists must have the same length. Timestamps are in microseconds since epoch.
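If your timestamps start out as Python datetimes, convert them to microseconds since the epoch before passing them in. A minimal sketch using only the standard library (the readings are made up for illustration):
from datetime import datetime, timezone

readings = [
    (datetime(2026, 1, 19, 10, 0, 0, tzinfo=timezone.utc), "sensor_01", 22.5),
    (datetime(2026, 1, 19, 10, 0, 1, tzinfo=timezone.utc), "sensor_02", 23.1),
]

client.write.write_columnar(
    measurement="temperature",
    columns={
        # datetime.timestamp() returns seconds, so scale to microseconds
        "time": [int(ts.timestamp() * 1_000_000) for ts, _, _ in readings],
        "device": [device for _, device, _ in readings],
        "value": [value for _, _, value in readings],
    },
    database="iot"
)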
InfluxDB Line Protocol
If you're migrating from InfluxDB or want compatibility with existing tools, use Line Protocol:
# Single line
client.write.write_line_protocol(
    "temperature,device=sensor_01,location=warehouse_a value=22.5 1705660800000000000",
    database="iot"
)
# Multiple lines
lines = """
temperature,device=sensor_01,location=warehouse_a value=22.5 1705660800000000000
temperature,device=sensor_02,location=warehouse_a value=23.1 1705660801000000000
temperature,device=sensor_01,location=warehouse_b value=21.8 1705660802000000000
"""
client.write.write_line_protocol(lines, database="iot")
Note: Line Protocol timestamps are in nanoseconds, while MessagePack uses microseconds.
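A quick way to keep the units straight when you generate timestamps yourself (standard library only):
import time

now_ns = time.time_ns()   # nanoseconds: what Line Protocol expects
now_us = now_ns // 1_000  # microseconds: what MessagePack columnar writes expect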
Writing DataFrames
If you already have data in pandas or polars, write it directly:
import pandas as pd
df = pd.DataFrame({
    "time": pd.to_datetime(["2026-01-19 10:00:00", "2026-01-19 10:00:01"]),
    "device": ["sensor_01", "sensor_02"],
    "value": [22.5, 23.1]
})
client.write.write_dataframe(
    df,
    measurement="temperature",
    time_column="time",
    tag_columns=["device"],
    database="iot"
)
The SDK handles the conversion to MessagePack format automatically.
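The polars path looks almost identical. A sketch, assuming write_dataframe() accepts a polars DataFrame with the same keyword arguments:
import polars as pl
from datetime import datetime

pl_df = pl.DataFrame({
    "time": [datetime(2026, 1, 19, 10, 0, 0), datetime(2026, 1, 19, 10, 0, 1)],
    "device": ["sensor_01", "sensor_02"],
    "value": [22.5, 23.1]
})

client.write.write_dataframe(
    pl_df,
    measurement="temperature",
    time_column="time",
    tag_columns=["device"],
    database="iot"
)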
Buffered Writes
For high-throughput scenarios where you're writing records one at a time (like processing a stream), use buffered writes. The buffer batches records and flushes automatically:
with client.write.buffered(batch_size=10000, flush_interval=5.0) as buffer:
    for record in incoming_records:
        buffer.write(
            measurement="events",
            tags={"source": record.source, "type": record.event_type},
            fields={"value": record.value, "count": record.count},
            timestamp=record.timestamp
        )
# Buffer automatically flushes on exit
This is useful when you're processing events in a loop and don't want to make an HTTP request for every single record.
Querying Data
Now let's read data back. The SDK supports multiple response formats depending on your use case.
JSON Response
The simplest option—returns a dictionary with columns and data:
result = client.query.query(
    "SELECT * FROM temperature WHERE time > NOW() - INTERVAL '1 hour' LIMIT 10",
    database="iot"
)
print(result["columns"]) # ['time', 'device', 'location', 'value']
print(result["data"]) # List of rowsPyArrow Table
PyArrow Table
For zero-copy data interchange with other Arrow-based tools:
table = client.query.query_arrow(
    "SELECT * FROM temperature WHERE device = 'sensor_01' ORDER BY time DESC LIMIT 1000",
    database="iot"
)
print(table.num_rows)
print(table.column_names)
Arrow IPC delivers around 5.2M rows/sec—use this when performance matters.
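From there, handing the table to other Arrow-aware libraries is cheap. For example, pandas and polars can both consume it directly (zero-copy where the column types allow it):
import polars as pl

pdf = table.to_pandas()      # pyarrow.Table -> pandas DataFrame
pldf = pl.from_arrow(table)  # pyarrow.Table -> polars DataFrame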
pandas DataFrame
Query directly into a DataFrame for analysis:
df = client.query.query_pandas(
    """
    SELECT
        time_bucket(INTERVAL '1 hour', time) as hour,
        device,
        AVG(value) as avg_temp,
        MAX(value) as max_temp
    FROM temperature
    WHERE time > NOW() - INTERVAL '24 hours'
    GROUP BY hour, device
    ORDER BY hour DESC
    """,
    database="iot"
)
print(df.head())
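Once it's a DataFrame, the usual pandas tooling applies. For instance, pivoting those hourly averages so each device becomes its own column:
pivot = df.pivot(index="hour", columns="device", values="avg_temp")
print(pivot.head())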
polars DataFrame
Same thing, but with polars:
pl_df = client.query.query_polars(
    "SELECT * FROM temperature WHERE location = 'warehouse_a' LIMIT 1000",
    database="iot"
)
print(pl_df.head())
List Measurements
See what tables exist in a database:
measurements = client.query.list_measurements(database="iot")
for m in measurements:
    print(f"{m.measurement}: {m.file_count} files, {m.total_size_mb:.1f} MB")
Complete Example
Here's a full script that writes sensor data and queries it back:
from arc_client import ArcClient
import time
# Connect
client = ArcClient(
    host="localhost",
    port=8000,
    token="your-arc-token",
    database="sensors"
)
# Write some data
print("Writing sensor data...")
client.write.write_columnar(
    measurement="temperature",
    columns={
        "time": [
            int(time.time() * 1_000_000) - 3_000_000,
            int(time.time() * 1_000_000) - 2_000_000,
            int(time.time() * 1_000_000) - 1_000_000,
            int(time.time() * 1_000_000),
        ],
        "device": ["sensor_01", "sensor_02", "sensor_01", "sensor_02"],
        "location": ["floor_1", "floor_1", "floor_2", "floor_2"],
        "celsius": [21.5, 22.0, 20.8, 21.2],
    }
)
print("Data written.")
# Query it back
print("\nQuerying recent data...")
df = client.query.query_pandas(
    """
    SELECT
        time,
        device,
        location,
        celsius
    FROM temperature
    ORDER BY time DESC
    LIMIT 10
    """
)
print(df)
# Aggregate by location
print("\nAverage temperature by location:")
df_agg = client.query.query_pandas(
    """
    SELECT
        location,
        AVG(celsius) as avg_temp,
        COUNT(*) as readings
    FROM temperature
    GROUP BY location
    """
)
print(df_agg)
Save this as arc_example.py and run it:
python arc_example.py
Output:
Writing sensor data...
Data written.
Querying recent data...
                               time     device location  celsius
0  2026-01-19 16:32:53.165162+00:00  sensor_02   floor_2     21.2
1  2026-01-19 16:32:52.165162+00:00  sensor_01   floor_2     20.8
2  2026-01-19 16:32:51.165162+00:00  sensor_02   floor_1     22.0
3  2026-01-19 16:32:50.165161+00:00  sensor_01   floor_1     21.5
Average temperature by location:
  location  avg_temp  readings
0  floor_1     21.75         2
1  floor_2     21.00         2
Async Support
All operations have async variants via AsyncArcClient. If you're building async applications:
from arc_client import AsyncArcClient
import asyncio
async def main():
    async with AsyncArcClient(host="localhost", token="your-token", database="sensors") as client:
        # Query the data we wrote earlier
        df = await client.query.query_pandas(
            "SELECT * FROM temperature ORDER BY time DESC LIMIT 10"
        )
        print(df)

asyncio.run(main())
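The payoff of the async client is running independent queries concurrently. A sketch using asyncio.gather, reusing the imports above (the queries themselves are just examples):
async def fetch_both():
    async with AsyncArcClient(host="localhost", token="your-token", database="sensors") as client:
        recent, by_location = await asyncio.gather(
            client.query.query_pandas(
                "SELECT * FROM temperature ORDER BY time DESC LIMIT 10"
            ),
            client.query.query_pandas(
                "SELECT location, AVG(celsius) AS avg_temp FROM temperature GROUP BY location"
            ),
        )
        print(recent)
        print(by_location)

asyncio.run(fetch_both())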
What's Next
This is version 1.0 of the SDK. It covers the core workflows—writing data, querying, and basic management operations. But we're just getting started.
We'd love to hear from you:
- What's working well? Let us know what you're building.
- What's missing? Features you need that aren't there yet.
- What's broken? Bugs, edge cases, confusing APIs.
The SDK is open source. If you want to contribute, check out the repo:
https://github.com/Basekick-Labs/arc-client-python
Open an issue, submit a PR, or just star the repo if you find it useful.
Resources
- PyPI: pypi.org/project/arc-tsdb-client
- GitHub: https://github.com/Basekick-Labs/arc-client-python
- Arc Documentation: docs.basekick.net/arc
- Discord: discord.gg/nxnWfUxsdm
Questions? Drop by the Discord or reach out on Twitter.
Ready to handle billion-record workloads?
Deploy Arc in minutes. Own your data in Parquet.