Migration Guide

Move from Apache Cassandra to Ferrosa with the same CQL protocol, the same drivers, and the same LWT semantics. Most core application paths migrate without code changes. A small set of features (materialized views, GROUP BY, PER PARTITION LIMIT, Java or JavaScript UDFs, and server-push schema events) requires the changes described below.

Developer Preview: Ferrosa is in active development. Run representative dual-read and failure-recovery tests before migrating production traffic.

Migration Overview

Ferrosa aims for Cassandra-compatible CQL protocol access, so validate each migration against your own schema, drivers, and workload. The recommended strategy is incremental:

Start a Ferrosa node alongside your existing Cassandra cluster
Point a test workload at Ferrosa to validate compatibility
Import existing SSTables directly (Ferrosa reads Cassandra BTI format)
For migration evaluations, move keyspaces one at a time with dual-read verification
Decommission Cassandra nodes as Ferrosa proves stable

No driver changes required: Ferrosa speaks the CQL native protocol (negotiated at v4) with SASL authentication, LZ4/Snappy compression, and prepared statement support. Your Python, Java, Go, Node.js, C#, and Rust drivers connect without modification. Driver compatibility is verified by automated smoke tests on every CI build — see tests/drivers/ and .github/workflows/driver-tests.yml.
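Because negotiation is capped at v4, it can help to pin the driver to that version so it skips its own downgrade attempt. The following is a minimal sketch, assuming the DataStax Python driver (`cassandra-driver`); the helper name `ferrosa_cluster_kwargs` and the host name are illustrative, not part of Ferrosa.

```python
def ferrosa_cluster_kwargs(hosts, username=None, password=None):
    """Build keyword arguments for cassandra.cluster.Cluster.

    Pins protocol_version to 4 so the driver does not negotiate a
    newer version first and then fall back.
    """
    kwargs = {
        "contact_points": list(hosts),
        "port": 9042,
        "protocol_version": 4,
        # True lets the driver use LZ4 or Snappy if the matching
        # Python package is installed and the server offers it.
        "compression": True,
    }
    if username is not None:
        # Imported lazily so the helper works without credentials
        # even when the driver is not installed.
        from cassandra.auth import PlainTextAuthProvider
        kwargs["auth_provider"] = PlainTextAuthProvider(
            username=username, password=password
        )
    return kwargs
```

With the driver installed, `Cluster(**ferrosa_cluster_kwargs(["ferrosa-host"])).connect()` opens a session with these settings.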

What's Compatible

Feature	Status	Notes
CQL protocol v4	Compatible	All 16 opcodes; negotiation capped at v4 (v5 STARTUP falls back to v4)
SASL authentication	Compatible	PasswordAuthenticator flow
Frame compression	Compatible	LZ4, Snappy
Prepared statements	Compatible	W-TinyLFU cache
Unlogged batches	Compatible	Unlogged and counter batches
Logged batches	Compatible	Atomic via commit-log group write (single-node); 3-phase batchlog protocol (cluster). No batchlog table in single-node mode — equivalent crash-recovery via commit log.
system_schema.*	Compatible	All standard tables
system.local	Compatible	Including tokens column
Murmur3Partitioner	Compatible	Same token distribution
BTI SSTables	Compatible	Read Cassandra 5.x SSTables directly
Consistency levels	Compatible	ONE, TWO, THREE, QUORUM, ALL, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM
cqlsh	Compatible	Tested with Cassandra cqlsh
Hinted handoff	Compatible	Stores hints for down nodes, replays on recovery
Node lifecycle	Compatible	Join and decommission via `ferrosa-ctl`
Token rebalancing	Compatible	Operator-triggered via `ferrosa-ctl rebalance`
Secondary indexes	Compatible	8 index types including vector (HNSW, IVFFlat)
UDTs	Supported	CREATE/ALTER/DROP TYPE implemented
Materialized views	Not yet	Planned
LWT	Compatible	IF NOT EXISTS, IF conditions, batch CAS — Accord protocol (strict serializable)
Transactions	Supported	BEGIN TRANSACTION / COMMIT / ROLLBACK — multi-partition atomic operations
Gossip protocol	Replaced	Ferrosa uses Raft for metadata (not wire-compatible)
Internode protocol	Replaced	Custom binary protocol (not wire-compatible)

Important: Ferrosa targets CQL client compatibility for common driver paths, but it is not cluster-compatible: a Ferrosa node cannot join an existing Cassandra ring. Migration is done by importing data, not by mixed-version rolling upgrades.

Step-by-Step Migration

Audit your schema and queries

Export your Cassandra schema and check for unsupported features. Ferrosa supports all standard CQL types, partition/clustering keys, table options, UDTs, WASM UDFs, and lightweight transactions. The following require code changes:

Materialized views — not supported; rewrite against base tables.
GROUP BY / PER PARTITION LIMIT — not parsed; restructure queries.
Java or JavaScript UDFs — rejected; recompile to WebAssembly.
Schema-change EVENT push — REGISTER accepted but EVENT frames never sent; configure drivers to poll for schema changes if needed.

# Export schema from Cassandra
cqlsh -e "DESCRIBE SCHEMA" > schema.cql

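# Check for unsupported features in the exported schema.
# GROUP BY and PER PARTITION LIMIT appear in queries rather than DDL,
# so run the same search over your application source as well.
grep -iE "MATERIALIZED VIEW|GROUP BY|PER PARTITION LIMIT|LANGUAGE (java|javascript)" schema.cql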
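The same audit can be sketched in Python when you want line numbers and a feature label per hit, for example in a CI check. This is an illustrative helper, not a Ferrosa tool; the patterns mirror the grep expression above, and the name `audit_cql` is hypothetical.

```python
import re

# Patterns taken from the unsupported-feature list above.
UNSUPPORTED = {
    "materialized view": re.compile(r"\bMATERIALIZED\s+VIEW\b", re.IGNORECASE),
    "GROUP BY": re.compile(r"\bGROUP\s+BY\b", re.IGNORECASE),
    "PER PARTITION LIMIT": re.compile(r"\bPER\s+PARTITION\s+LIMIT\b", re.IGNORECASE),
    "Java/JavaScript UDF": re.compile(r"\bLANGUAGE\s+(java|javascript)\b", re.IGNORECASE),
}


def audit_cql(text):
    """Return (line_number, feature, line) for every unsupported construct."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for feature, pattern in UNSUPPORTED.items():
            if pattern.search(line):
                findings.append((lineno, feature, line.strip()))
    return findings
```

Feed it the contents of schema.cql, or any source file that embeds CQL strings. It is a line-based text match, so it can report false positives inside comments or string literals.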

Start a Ferrosa node

Run Ferrosa alongside your existing cluster. It doesn't need to join the Cassandra ring.

# Build and start
cargo build --release
FERROSA_CQL_BIND=0.0.0.0:9042 \
FERROSA_AUTH_DISABLED=true \
./target/release/ferrosa

Apply your schema

Run your exported schema against Ferrosa. Remove any unsupported objects first.

# Apply schema to Ferrosa
cqlsh ferrosa-host 9042 -f schema.cql
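Removing unsupported objects by hand is error-prone on large schemas. As a rough sketch, the helper below drops CREATE MATERIALIZED VIEW statements from an exported schema and returns them separately for review. It is a simplified splitter (it honours single-quoted strings and `$$` bodies but not comments), and both function names are hypothetical.

```python
import re


def split_statements(cql):
    """Split CQL text on ';', ignoring semicolons inside '...' or $$...$$."""
    statements, buf = [], []
    in_quote = in_dollar = False
    i = 0
    while i < len(cql):
        ch = cql[i]
        if not in_quote and cql.startswith("$$", i):
            in_dollar = not in_dollar
            buf.append("$$")
            i += 2
            continue
        if ch == "'" and not in_dollar:
            in_quote = not in_quote
        if ch == ";" and not in_quote and not in_dollar:
            stmt = "".join(buf).strip()
            if stmt:
                statements.append(stmt + ";")
            buf = []
        else:
            buf.append(ch)
        i += 1
    tail = "".join(buf).strip()
    if tail:
        statements.append(tail)
    return statements


def drop_materialized_views(cql):
    """Return (schema_without_views, removed_statements)."""
    kept, removed = [], []
    for stmt in split_statements(cql):
        if re.match(r"CREATE\s+MATERIALIZED\s+VIEW\b", stmt, re.IGNORECASE):
            removed.append(stmt)
        else:
            kept.append(stmt)
    return "\n\n".join(kept) + "\n", removed
```

Write the first return value to a new file and apply that file with cqlsh; keep the removed statements as a checklist of views to rewrite against base tables.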

Import data

Two options: SSTable import (fastest for large datasets) or CQL COPY/INSERT (simpler for smaller datasets).

# Option A: CQL COPY (simple, works for moderate datasets)
# Export from Cassandra
cqlsh cassandra-host -e "COPY social.users TO '/tmp/users.csv'"

# Import to Ferrosa
cqlsh ferrosa-host -e "COPY social.users FROM '/tmp/users.csv'"

# Option B: SSTable import (see SSTable Import section below)
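One way to sanity-check a COPY import is to export the table again from Ferrosa (to a second CSV, file name of your choosing) and compare the two files without regard to row order. The sketch below does that with the standard library only; it assumes both exports were produced by the same cqlsh version so value formatting matches, and the function names are illustrative.

```python
import csv
from collections import Counter


def csv_row_counts(path):
    """Count occurrences of each row so the comparison ignores row order."""
    with open(path, newline="") as fh:
        return Counter(tuple(row) for row in csv.reader(fh))


def diff_exports(cassandra_csv, ferrosa_csv):
    """Return (rows missing from Ferrosa, rows only in Ferrosa)."""
    source = csv_row_counts(cassandra_csv)
    target = csv_row_counts(ferrosa_csv)
    # Counter subtraction keeps only positive differences.
    return source - target, target - source
```

Two empty results mean the exports contain the same rows. This loads both files into memory, so it suits the moderate datasets that COPY is meant for.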

Validate with dual reads

Run your application against both Cassandra and Ferrosa, comparing results. See the dual-read verification section below.

Cut over

Once validation passes, update your driver contact points from Cassandra to Ferrosa.

# Your application code — just change the host
# Before:
# cluster = Cluster(['cassandra-node-1.prod'])

# After:
cluster = Cluster(['ferrosa-node-1.prod'])
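If the contact points are hard-coded, consider reading them from configuration so that cutover and rollback need no redeploy. A minimal sketch follows; the environment variable name `CQL_CONTACT_POINTS` is an assumption made for this example, not something Ferrosa or the driver defines.

```python
import os


def contact_points(env=os.environ, default="cassandra-node-1.prod"):
    """Parse a comma-separated host list from the environment."""
    raw = env.get("CQL_CONTACT_POINTS", default)
    return [host.strip() for host in raw.split(",") if host.strip()]
```

The application then calls `Cluster(contact_points())`, and switching databases becomes a matter of changing one variable.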

SSTable Import

Ferrosa's ferrosa-sstable crate reads Cassandra BTI (Big Trie-Indexed) SSTables natively. BTI is the trie-indexed format introduced in Cassandra 5.0, so you can import existing BTI SSTable files directly without an intermediate conversion step.

What Ferrosa reads

A BTI SSTable consists of 7 component files:

*-Data.db — row data
*-Partitions.db — partition index (on-disk trie)
*-Rows.db — row-level index
*-Filter.db — Bloom filter
*-CompressionInfo.db — compression metadata
*-Statistics.db — SSTable statistics
*-TOC.txt — table of contents

Copy these files from your Cassandra data directory to Ferrosa's data directory, organized by keyspace and table. Ferrosa will pick them up on next read.

Note: SSTable import is currently a manual file-copy process. A dedicated ferrosa-import CLI tool with validation and progress reporting is planned for a future release.
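Since the copy is manual, it is worth checking that every SSTable in a table directory has all of its component files before moving anything. The sketch below is a hypothetical pre-copy check, not part of Ferrosa; it uses the component list above and treats everything before `Data.db` in a file name as the SSTable's prefix.

```python
from pathlib import Path

# Component suffixes from the list above.
BTI_COMPONENTS = (
    "Data.db", "Partitions.db", "Rows.db", "Filter.db",
    "CompressionInfo.db", "Statistics.db", "TOC.txt",
)


def incomplete_sstables(table_dir):
    """Map each SSTable prefix in table_dir to the components it lacks."""
    table_dir = Path(table_dir)
    problems = {}
    for data_file in sorted(table_dir.glob("*-Data.db")):
        prefix = data_file.name[: -len("Data.db")]
        missing = [
            component for component in BTI_COMPONENTS
            if not (table_dir / (prefix + component)).exists()
        ]
        if missing:
            problems[prefix] = missing
    return problems
```

An empty dictionary means every SSTable is complete. Tables written without compression may legitimately lack CompressionInfo.db, so treat that entry as a warning rather than an error.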

Compression support

Ferrosa supports LZ4 and Zstd compressed SSTables. No decompression step needed — compressed SSTables are read directly.

Big format: If your cluster stores data in the Big SSTable format (Cassandra 4.x and earlier, and Cassandra 5.x clusters that have not switched formats), convert it to BTI first: upgrade to Cassandra 5.x, select the BTI format in cassandra.yaml, and rewrite the existing SSTables with nodetool upgradesstables. Big format read support is planned for a future Ferrosa release.
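To see which format a table directory actually contains, you can tally the format field in the SSTable file names. This sketch assumes the Cassandra 3.0+ naming layout `<version>-<id>-<format>-Data.db` (for example a `big` or `bti` segment before the component name); verify that against your own files, since the function name and the layout assumption are mine rather than Ferrosa's.

```python
from collections import Counter
from pathlib import Path


def sstable_formats(table_dir):
    """Count SSTables per format name, based on the Data.db file names."""
    counts = Counter()
    for data_file in Path(table_dir).glob("*-Data.db"):
        parts = data_file.name.split("-")
        # Assumed layout: <version>-<id>-<format>-Data.db
        fmt = parts[-2] if len(parts) >= 4 else "unknown"
        counts[fmt] += 1
    return counts
```

Any count under a key other than `bti` points to SSTables that need conversion before import.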

Dual-Read Verification

Before considering any production cutover, run dual reads against both databases and compare results:

from cassandra.cluster import Cluster

# Connect to both clusters using the same keyspace
cass = Cluster(['cassandra-host']).connect('social')
ferro = Cluster(['ferrosa-host']).connect('social')

# Prepare the same query on each side
query = "SELECT * FROM users WHERE user_id = ?"
stmt_c = cass.prepare(query)
stmt_f = ferro.prepare(query)

# Fill with a representative sample of real user_id values
sample_user_ids = []

# Compare row by row; the assert stops at the first mismatch
for uid in sample_user_ids:
    row_c = cass.execute(stmt_c, [uid]).one()
    row_f = ferro.execute(stmt_f, [uid]).one()
    assert row_c == row_f, f"Mismatch for {uid}: {row_c!r} != {row_f!r}"

Run this against a representative sample of your real queries. Cover:

Point reads by partition key
Range scans on clustering columns
Collection types (list, set, map)
Prepared statements with bind variables
Batch operations
Lightweight transactions (INSERT IF NOT EXISTS, UPDATE IF)
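For the multi-row cases in this list, a harness that collects every mismatch is usually more useful than one that stops at the first. The sketch below is one possible shape: it takes two callables, one per database, so the same function covers point reads, range scans, and collection columns. The name `compare_reads` is illustrative.

```python
def compare_reads(run_on_cassandra, run_on_ferrosa, keys):
    """Run the same read on both systems and collect all mismatches.

    Each callable takes a key and returns an iterable of rows.
    """
    mismatches = []
    for key in keys:
        rows_c = list(run_on_cassandra(key))
        rows_f = list(run_on_ferrosa(key))
        # Row order matters here, which is what you want for
        # range scans over clustering columns.
        if rows_c != rows_f:
            mismatches.append((key, rows_c, rows_f))
    return mismatches
```

With two driver sessions and prepared statements, a call might look like `compare_reads(lambda k: cass.execute(stmt_c, [k]), lambda k: ferro.execute(stmt_f, [k]), sample_keys)`, where the session and statement names are placeholders for your own.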

S3 Storage Setup

For production-style evaluations, configure S3 or an S3-compatible service as the durable storage backend. Local disk acts as a hot cache.

# AWS S3 with IAM instance profile (no explicit keys needed)
FERROSA_S3_ENDPOINT=https://s3.amazonaws.com \
FERROSA_S3_BUCKET=my-ferrosa-data \
FERROSA_S3_REGION=us-east-1 \
FERROSA_DATA_DIR=/var/lib/ferrosa \
./target/release/ferrosa

# MinIO for local development
FERROSA_S3_ENDPOINT=http://localhost:9000 \
FERROSA_S3_BUCKET=ferrosa \
FERROSA_S3_ACCESS_KEY_ID=minioadmin \
FERROSA_S3_SECRET_ACCESS_KEY=minioadmin \
FERROSA_S3_ALLOW_HTTP=true \
./target/release/ferrosa
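A small pre-flight check can catch obvious mistakes in these variables before the node starts. The sketch below only inspects the environment; the two rules it enforces (plain-HTTP endpoints need FERROSA_S3_ALLOW_HTTP=true, and the access key and secret come as a pair) are assumptions inferred from the examples above, not documented Ferrosa behavior.

```python
import os
from urllib.parse import urlparse


def check_s3_env(env=os.environ):
    """Return a list of likely configuration problems (empty if none)."""
    problems = []
    if not env.get("FERROSA_S3_BUCKET"):
        problems.append("FERROSA_S3_BUCKET is not set")

    endpoint = env.get("FERROSA_S3_ENDPOINT", "")
    if endpoint:
        scheme = urlparse(endpoint).scheme
        if scheme not in ("http", "https"):
            problems.append("FERROSA_S3_ENDPOINT should start with http:// or https://")
        # Assumption: plain HTTP is refused unless explicitly allowed.
        elif scheme == "http" and env.get("FERROSA_S3_ALLOW_HTTP") != "true":
            problems.append("http endpoint without FERROSA_S3_ALLOW_HTTP=true")

    # Assumption: static credentials are only valid as a pair.
    has_key = bool(env.get("FERROSA_S3_ACCESS_KEY_ID"))
    has_secret = bool(env.get("FERROSA_S3_SECRET_ACCESS_KEY"))
    if has_key != has_secret:
        problems.append("set both access key ID and secret access key, or neither")
    return problems
```

Calling `check_s3_env()` with no arguments inspects the current process environment; pass a dictionary to test a configuration before exporting it.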

S3-compatible providers

Ferrosa works with any S3-compatible object store. Set FERROSA_S3_ENDPOINT for non-AWS providers:

Provider	Endpoint
AWS S3	(default — no endpoint needed)
MinIO	`http://minio:9000`
Cloudflare R2	`https://<account>.r2.cloudflarestorage.com`
DigitalOcean Spaces	`https://<region>.digitaloceanspaces.com`
Backblaze B2	`https://s3.<region>.backblazeb2.com`

Storage cost comparison

With S3-backed storage, you trade some local disk and snapshot costs for object storage, request costs, lifecycle policy, and restore behavior. The table below is illustrative only; benchmark and price your own workload before making migration decisions:

Scale	EBS (gp3, 3 replicas)	S3 Standard	Savings
1 TB	varies by provider	model from S3 + requests	workload-dependent
10 TB	depends on RF + snapshots	depends on lifecycle policy	benchmark required
100 TB	depends on retention	depends on cache misses	benchmark required
1 PB	depends on topology	depends on restore SLA	benchmark required

S3 request costs (GET/PUT), restore latency, and cache misses matter for high-throughput workloads. The local NVMe cache is intended to absorb hot reads, and write-behind uploads batch SSTables to amortize PUT costs. Treat this section as a model to validate against your own workload, not as a guaranteed cost outcome.
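To turn the table into numbers for your own workload, a simple cost model is enough to start with. The sketch below is deliberately minimal: every price is a parameter you supply from your provider's current price list, and it leaves out data transfer, snapshots, and the local cache disk. The function names are illustrative.

```python
def monthly_block_cost(data_tb, replicas, price_per_gb_month, headroom=0.0):
    """Replicated block storage: every replica stores a full copy.

    headroom is the fraction of extra capacity you provision,
    for example 0.5 for 50% free space.
    """
    provisioned_gb = data_tb * 1000 * replicas * (1 + headroom)
    return provisioned_gb * price_per_gb_month


def monthly_object_cost(data_tb, price_per_gb_month,
                        puts, gets, price_per_1k_puts, price_per_1k_gets):
    """Object storage: one logical copy plus per-request charges."""
    storage = data_tb * 1000 * price_per_gb_month
    requests = puts / 1000 * price_per_1k_puts + gets / 1000 * price_per_1k_gets
    return storage + requests
```

The GET count is the term most sensitive to cache-miss rate, so measure it under a representative workload rather than estimating it.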

Key Differences from Cassandra

Area	Cassandra	Ferrosa
Cluster membership	Gossip protocol	Raft (openraft) for metadata consensus
Internode protocol	Cassandra messaging	Custom binary protocol with 3 priority lanes, PSK auth
Storage durability	Local disk (EBS/SSD)	S3 (local NVMe as cache)
Node recovery	Stream from replicas (hours)	Read from S3 (seconds)
GC pauses	JVM stop-the-world	None (Rust, no GC)
Observability	JMX + nodetool	CQL virtual tables + Prometheus + TUI + Web console
Real-time pub/sub	CDC (requires Kafka/Debezium)	Experimental SUBSCRIBE syntax with EVERY/DELTA modes
Graph queries	Not supported	Cypher via HTTP/JSON on the same tables
Transactions	LWT via Paxos (Accord is not part of the 5.0 release)	LWT via Accord (strict serializable, 1-RTT fast path)
SSTable format	Big + BTI	Reads BTI, writes BTI (native format planned)
Commit log	Standard	CAS-allocated segments, 3 sync modes, built-in CDC

LWT & Transaction Migration

Lightweight transactions

Ferrosa fully supports Cassandra's lightweight transaction (LWT) syntax. Your existing LWT queries work without modification:

INSERT ... IF NOT EXISTS — conditional inserts
UPDATE ... IF condition — compare-and-set updates
DELETE ... IF EXISTS / DELETE ... IF condition — conditional deletes
Batch CAS — BEGIN BATCH with IF conditions across statements
SERIAL and LOCAL_SERIAL consistency levels

Under the hood, Ferrosa uses the Accord consensus protocol instead of Paxos. This is transparent to your application — the CQL syntax and semantics are identical. Accord provides strict serializability with better performance characteristics: 1-RTT fast path via leaseholder, compared to Paxos's minimum 2-RTT.
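From the application's side, the observable contract of an LWT is the `[applied]` flag on the result. A minimal sketch with the DataStax Python driver is shown below; the `usernames` table and the function name are hypothetical, and `was_applied` is the driver's accessor for that flag.

```python
def claim_username(session, username, user_id):
    """Insert only if the username is free.

    Returns True when this call created the row, False when another
    writer already holds the username.
    """
    result = session.execute(
        "INSERT INTO usernames (username, user_id) VALUES (%s, %s) IF NOT EXISTS",
        (username, user_id),
    )
    return result.was_applied
```

Running the same sequence of calls against a Cassandra session and a Ferrosa session, and comparing the returned flags, is a direct way to include LWT behavior in dual-read verification.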

Temporal compatibility: Ferrosa implements the LWT patterns used by Temporal's Cassandra persistence path. If you run Temporal on Cassandra, evaluate Ferrosa against the same INSERT IF NOT EXISTS, conditional UPDATE IF, and batch CAS statements. The only expected configuration change is the Cassandra contact points.

Beyond LWT: multi-statement transactions

Ferrosa also supports explicit multi-statement transactions for operations that go beyond what LWT can express:

-- Atomic multi-partition operation
BEGIN TRANSACTION
  UPDATE accounts SET balance = balance - 100
    WHERE id = 'acct-1' IF balance >= 100;
  UPDATE accounts SET balance = balance + 100
    WHERE id = 'acct-2';
COMMIT TRANSACTION;

This is a Ferrosa extension — not available in Apache Cassandra. Use it for new functionality after migration, or to replace application-level two-phase patterns.

Rollback Plan

If you need to roll back to Cassandra:

Keep Cassandra running during the migration period — don't decommission until you're confident
Your application code is unchanged — rolling back is just changing the contact point back to Cassandra
Export data from Ferrosa using cqlsh COPY if you need to sync writes that went only to Ferrosa
No schema changes needed — your schema is the same on both systems

Low risk: Because Ferrosa uses the same CQL protocol and the same drivers, rollback is a configuration change, not a code change, as long as any writes that reached only Ferrosa are copied back first. Keep both systems running during validation and cut over only when you're confident.