Aerospace Data Has a Storage Problem, Not a Speed Problem

Aerospace Week, Day 1 of 5. All this week (May 25–29, 2026) we're publishing a daily post on the real data problems aerospace teams face, and how Arc helps. Today: storage, ownership, and longevity. Tomorrow: The Ground Station Stack Has Too Many Moving Parts.

In November 2023, NASA's Voyager 1 started sending home gibberish.

The spacecraft was 15 billion miles away, running code written in the 1970s, and its telemetry unit had jammed into a repeating loop of unintelligible binary. No science data. No engineering data. Just noise from the most distant human-made object in existence.

The fix took months. Engineers had to remotely relocate 46-year-old code around a failed memory chip, sending the patch on a 22.5-hour one-way trip into interstellar space and waiting another 22.5 hours to find out if it worked. By June 2024, Voyager 1 was transmitting science data from all four instruments again.

Think about what made that repair possible. The engineers who saved Voyager 1 weren't the engineers who built it. The data they needed to diagnose the spacecraft had been collected and structured decades earlier, by people who were retired or gone. They could read it anyway. The telemetry outlived the team.

That's the real aerospace data problem. And almost nobody sells you on it.

Everyone sells speed. Speed is table stakes.

Every telemetry vendor leads with the same number: writes per second. Millions of them. Real-time dashboards. Ingest as fast as the sensors fire.

Fine. Arc does ~20 million records per second on a single binary. The competition posts big ingest numbers too. If you're choosing a telemetry database in 2026 on write throughput alone, you're solving a problem that was solved years ago.

The problem nobody puts on the landing page is what happens to that data after you write it, over the 10, 20, 30 years a serious aerospace program actually runs.

Flight programs outlive databases

A commercial aircraft program runs 20+ years. A defense program can run longer. A satellite stays in orbit for its entire operational life, and the ground archive lives longer still. Voyager has been transmitting for 48 years.

That telemetry never gets thrown away. It can't. You need it for certification. For failure analysis. For the anomaly that surfaces three years into the program and sends you back through two years of historical data to find a root cause. You need it readable by engineers who weren't in the room when it was recorded.

So the question that actually matters isn't "how fast can I write it." It's "how much does it cost to keep, who owns it, and can someone still read it in twenty years."

That's where most telemetry stacks quietly fall apart.

The three costs nobody quotes you up front

Storage cost at retention scale. Ingest is a one-time event. Retention is forever. When you're holding years of high-rate telemetry, the storage bill is the bill. Vendors that keep your data locked in their engine, on their managed cloud, charge you to hold it, and charge you again to get it back. The number that matters isn't writes per second. It's dollars per terabyte per month, times the life of the program.
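
As a rough illustration of why "times the life of the program" dominates the bill, here is a back-of-the-envelope sketch. The ingest volume and the price per terabyte-month are invented placeholders, not figures from Arc or any vendor, and the model assumes flat pricing with nothing ever deleted or tiered.

```python
def cumulative_retention_cost(tb_per_month, usd_per_tb_month, years):
    """Total storage spend when telemetry accumulates and is never deleted."""
    stored_tb = 0.0
    total_usd = 0.0
    for _ in range(years * 12):
        stored_tb += tb_per_month                  # this month's telemetry lands
        total_usd += stored_tb * usd_per_tb_month  # pay for everything held so far
    return total_usd


# Hypothetical program: 1 TB of new telemetry per month at $20 per TB-month.
first_year = cumulative_retention_cost(1.0, 20.0, 1)
full_program = cumulative_retention_cost(1.0, 20.0, 20)
print(f"year 1:   ${first_year:,.0f}")    # year 1:   $1,560
print(f"20 years: ${full_program:,.0f}")  # 20 years: $578,400
```

Under these assumptions the 20-year bill is roughly 370 times the first-year bill, not 20 times, because every month you pay again for everything already stored.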

Lock-in through format. If your telemetry lives in a proprietary storage format, you don't own your data. You own a dependency on the one vendor that can read it. When the contract renews, when pricing changes, when an agency audit wants the raw data in an open format, "it's in our proprietary store" is not an answer you want to give. Voyager's engineering data was readable 48 years on because the structure was understood, not trapped behind a dead vendor's software.

The egress tax. The cloud-hosted telemetry model has a trap built in: cheap to put data in, expensive to take it out. By the time you've stored years of flight telemetry, moving it costs real money. That's not an accident. That's the business model.
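
To put a number on "moving it costs real money", the sketch below prices a one-time export of an archive. The archive size, the per-gigabyte transfer price, and the storage price are all illustrative placeholders rather than any provider's actual rates.

```python
def egress_cost_usd(archive_tb, usd_per_gb):
    """One-time cost to move an archive out at a flat per-GB transfer price."""
    return archive_tb * 1000 * usd_per_gb  # decimal terabytes to gigabytes


def egress_in_storage_months(archive_tb, usd_per_gb, usd_per_tb_month):
    """Express the exit fee as months of storage for the same archive."""
    monthly_storage_usd = archive_tb * usd_per_tb_month
    return egress_cost_usd(archive_tb, usd_per_gb) / monthly_storage_usd


# Hypothetical: a 240 TB archive, $0.09 per GB to move out, $20 per TB-month to hold.
print(f"exit fee: ${egress_cost_usd(240, 0.09):,.0f}")
print(f"equals {egress_in_storage_months(240, 0.09, 20.0):.1f} months of storage")
```

With these placeholder prices, leaving costs about as much as four and a half months of holding the data, and the fee grows with every terabyte you add.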

What Arc does differently

Arc writes your telemetry to open Apache Parquet on storage you own. Your S3 bucket. Your Azure account. Your on-prem MinIO. Your sovereign cloud.

The data is yours, in an open format, on infrastructure you control. Read it from Arc. Read it from DuckDB, Spark, Snowflake, pandas, or anything else that speaks Parquet. There is no proprietary layer between you and your own telemetry.

If Arc disappeared tomorrow, you'd still have every byte, in a format the entire data ecosystem can read. No vendor egress tax. No format hostage situation. No "let us export that for you." Walk away anytime. That's the point.

And yes, it's fast: ~20M records/sec ingest, 9.2M rows/sec queried, sub-second dashboards on billion-row datasets after compaction. Speed is in the box. It just isn't the thing that should keep you up at night when you're signing a contract for data you'll hold for two decades.

Try it in 30 seconds

Arc is a single Go binary. No cluster, no dependencies, no three weekends of YAML. Spin it up, point your telemetry at it, query with standard SQL.

# Run Arc (Docker)
docker run -d -p 8000:8000 \
  -e STORAGE_BACKEND=local \
  -v arc-data:/data \
  ghcr.io/basekick-labs/arc:latest
 
# Write telemetry (InfluxDB Line Protocol; point Telegraf or your existing pipeline here)
# Assumes ARC_TOKEN already holds a valid API token for this Arc instance
curl -X POST http://localhost:8000/api/v1/write/line-protocol \
  -H "Authorization: Bearer $ARC_TOKEN" \
  -H "Content-Type: text/plain" \
  -H "x-arc-database: default" \
  --data-binary "telemetry,craft=voyager1,subsystem=power bus_voltage=22.4"
 
# Query it back with SQL
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer $ARC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"sql":"SELECT * FROM telemetry WHERE craft = '\''voyager1'\'' ORDER BY time DESC LIMIT 10"}'

Point it at object storage you own and that same telemetry lands as open Parquet you control, not a proprietary file only one vendor can read.
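
If your pipeline builds the write payload in code rather than through Telegraf, the line-protocol string from the write call above can be assembled like this. It is a sketch that follows InfluxDB's published escaping rules; `to_line` is a hypothetical helper, and since this post only shows a single float field, whether Arc accepts integer, string, and boolean fields or explicit nanosecond timestamps in the same way is an assumption to verify.

```python
def _escape(text, chars):
    """Backslash-escape each character listed in `chars`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render a field value using InfluxDB line-protocol type syntax."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    return '"' + _escape(str(value), '\\"') + '"'


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one line-protocol record; tags are emitted in sorted key order."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    line = head + " " + body
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


print(to_line("telemetry",
              {"subsystem": "power", "craft": "voyager1"},
              {"bus_voltage": 22.4}))
# telemetry,craft=voyager1,subsystem=power bus_voltage=22.4
```

The output matches the string passed to `--data-binary` in the curl example, so the result can be sent as the request body unchanged.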

The questions to actually ask

When you evaluate a telemetry database for an aerospace program, don't stop at the ingest benchmark. Ask the questions that matter over the life of the program:

  • What does it cost to retain this data for 10, 15, 20 years?
  • What format is my data stored in, and can I read it without this vendor?
  • If I want to leave, what does it cost to take my data with me?
  • Can I deploy on-prem or in a sovereign environment, or am I forced onto someone else's cloud?

Speed gets you through the demo. Ownership gets you through the program.

Voyager's still talking, 48 years on. Build like your data has to last that long, because in aerospace, it does.


Arc is a high-performance columnar database. Open Parquet on storage you own, single Go binary, production-ready in 30 seconds.

Tomorrow, Aerospace Week Day 2: The Ground Station Stack Has Too Many Moving Parts.
