Developer preview

Let's get Ferrosa Memory running.

Ferrosa Memory is a persistent memory layer for your AI agents. You give it your code and docs once, and your agent can recall exactly what it needs later — instead of re-reading files and re-discovering context every session. In a few minutes you'll have it running locally and connected to your agent.

Three steps to memory

That's the whole journey. The dense reference for every option is below — but most people only need these three:

1 · Install & start

Run one setup script. You'll have Ferrosa Database and the Ferrosa Memory MCP server up locally.

2 · Connect your agent

Point your MCP client at the local memory server. Your agent gains memory tools instantly.

3 · Ingest & retrieve

Feed in knowledge, then ask for it back. The agent fetches the relevant pieces semantically.

Why it's worth a few minutes: ingest your whole codebase and docs once, and your agent retrieves the relevant pieces semantically instead of grepping, listing files, and re-reading whole files every session. That means fewer tokens and less latency: no re-reading a large file just to find one function, and no re-deriving context the agent already had. Memoization skips redundant LLM sub-calls, and session-restore lets the agent start with context instead of rebuilding it from scratch. It's still a developer preview, so run it locally and kick the tires.

Step 1 · Fast setup

Good news: you don't need to clone anything by hand. The hosted setup scripts do the heavy lifting for you.

# Ferrosa Database only
curl -fsSL https://ferrosadb.com/install.sh | bash

# Ferrosa Memory plus LLM onboarding
curl -fsSL https://ferrosadb.com/setup-memory.sh | bash

setup-memory.sh downloads ONBOARDING.md, optionally clones or updates the public repos, offers to pull the Nomic embedding model, and then hands off to your selected LLM harness with the prompt "onboard me using ONBOARDING.md".

What you will run

Minimal native mode

One Ferrosa Database process using local filesystem storage plus one Ferrosa Memory MCP/workbench process.

Full Compose mode

Three local Ferrosa nodes plus optional MinIO/S3-compatible storage for cluster/operator testing.

Nomic embeddings

Optional nomic-embed-text-v2-moe via Ollama. If skipped, semantic/vector search is degraded.

Agent harnesses

Use ONBOARDING.md to configure Hermes, Claude, Codex, skills, hooks, hints, and prompts.

Service                  Local endpoint
MCP/workbench            http://127.0.0.1:18765/
MCP JSON-RPC             http://127.0.0.1:18765/mcp
Viz                      http://127.0.0.1:18766/viz
CQL                      127.0.0.1:19042-19044
Graph HTTP               http://127.0.0.1:17474-17476
Bolt                     127.0.0.1:17687-17689
MinIO (full stack only)  http://127.0.0.1:19000, console 19001

Manual source setup

1. Check prerequisites

rustc --version
cargo --version
git --version
curl --version
python3 --version
ollama --version              # optional semantic search layer
docker compose version        # full Docker stack only
podman compose version        # full Podman stack only

2. Optional Nomic embedding layer

ollama pull nomic-embed-text-v2-moe
ollama list | grep nomic-embed-text-v2-moe

If embeddings are skipped, lexical/phonetic search still works, but semantic/vector search quality is degraded.

Retrieval defaults

Ferrosa Memory keeps live agent turns compact while evaluation runs use wider candidate generation.

[retrieval]
default_limit = 10

[embeddings]
provider = "ollama"
model = "nomic-embed-text-v2-moe"
dimensions = 768

[eval]
retrieval_k = 25

default_limit = 10 is used when an agent omits limit or k. Lower it with the config MCP tool if memory results are consuming too much context.

For evaluation, widen candidate generation in the harness rather than raising the live default. For example, use a wider candidate_limit, a full fusion profile, LLM query decomposition, and embedding variants:

candidate_limit=50
fusion_profile=all
query_decomposition=llm
query_variant_limit=5
query_embed_variants=true
chunk_expansion=none
rerank=false

Tune these in the eval harness for your own corpus; keep the live defaults compact so memory results don't crowd out an agent's working context.
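To make the fallback behaviour of default_limit concrete, here is a minimal Python sketch of how a retrieval call could resolve its result cap. The function name resolve_limit is hypothetical, and the precedence of limit over k when both are supplied is an assumption; the text above only states that the default applies when both are omitted.

```python
from typing import Any, Mapping

# Mirrors default_limit = 10 from the [retrieval] section above.
DEFAULT_LIMIT = 10


def resolve_limit(arguments: Mapping[str, Any], default_limit: int = DEFAULT_LIMIT) -> int:
    """Return the result cap for one retrieval call.

    Assumption: `limit` wins over `k` when both are present.
    Falls back to the configured default when neither is given.
    """
    for key in ("limit", "k"):
        value = arguments.get(key)
        if value is not None:
            return int(value)
    return default_limit
```

Under these assumptions, a call with only a query resolves to 10 results, while an explicit limit of 5 resolves to 5.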

Optional live judge/reranker

[judge]
enabled = false
provider = "ollama"
base_url = "http://127.0.0.1:11434"
model = "qwen2.5-coder:7b"
timeout_seconds = 60

Keep the live judge disabled unless a local or remote model endpoint is available. Judge failures and no-decisions are recorded as abstentions, while agent/user feedback can use compact +1 and -1 item scores to tune future rankings by workspace, query shape, and retrieval channel.

3. Clone the public repositories, if contributing from source

mkdir -p ~/src/ferrosa-suite
cd ~/src/ferrosa-suite

git clone https://github.com/ferrosadb/ferrosa.git ferrosa
git clone https://github.com/ferrosadb/ferrosa-memory.git ferrosa-memory

4. Native minimal mode

cd ~/src/ferrosa-suite/ferrosa
cargo build --release

cd ~/src/ferrosa-suite/ferrosa-memory
cargo build --release

Use repository config examples or setup-memory.sh/ONBOARDING.md to choose ports, local data directories, credentials, embedding settings, skills, hooks, and harness prompts.

5. Full Compose development stack

cd ~/src/ferrosa-suite/ferrosa
docker build -t ferrosa-memory-node:latest .
# or: podman build -t ferrosa-memory-node:latest .

cd ~/src/ferrosa-suite/ferrosa-memory
docker compose up -d
# or: podman compose up -d

6. Verify health

curl -fsS http://127.0.0.1:18765/healthz/live && echo
curl -fsS http://127.0.0.1:18765/healthz/ready && echo
curl -fsS http://127.0.0.1:18766/viz | head -c 64 && echo

Expected output includes ok, ready, and an HTML doctype from the viz UI.
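If you would rather script the same checks, the following Python sketch probes the three endpoints used in the curl commands. The helper names http_ok and check_all are hypothetical, and the sketch only checks for a 2xx status; it does not inspect the response bodies.

```python
import urllib.request
from typing import Callable, Dict

# Endpoints taken from the curl commands above.
CHECKS = {
    "live": "http://127.0.0.1:18765/healthz/live",
    "ready": "http://127.0.0.1:18765/healthz/ready",
    "viz": "http://127.0.0.1:18766/viz",
}


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError both subclass OSError.
        return False


def check_all(probe: Callable[[str], bool] = http_ok) -> Dict[str, bool]:
    """Run the probe against every endpoint and report pass/fail by name."""
    return {name: probe(url) for name, url in CHECKS.items()}
```

Calling check_all() with the default probe returns a dictionary such as {"live": True, "ready": True, "viz": True} when everything is up; passing a custom probe makes the function easy to exercise without a running stack.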

Step 2 · Connect an MCP client

Now hand your agent the keys. Configure an MCP HTTP server pointing at:

http://127.0.0.1:18765/mcp

Your agent won't be overwhelmed. The server progressively discloses its ~80 tools across two tiers: a small, focused default set shows up first, and the fuller toolbox is revealed only as it's needed. That keeps your agent's tool context small: fewer tool schemas loaded means fewer tokens spent and less chance of the agent reaching for the wrong tool.

Generic MCP server shape:

{
  "name": "ferrosa-memory",
  "transport": "http",
  "url": "http://127.0.0.1:18765/mcp",
  "headers": {
    "Authorization": "Basic <base64 username:password>"
  }
}

Use unique credentials and TLS/auth boundaries for shared deployments. Default local credentials are only appropriate for single-user development on loopback.
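The Authorization value in the generic shape is standard HTTP Basic auth: the base64 encoding of username:password. A small Python sketch for producing it follows. The function names are hypothetical, and the config field names are simply copied from the generic example; your MCP client may expect a different shape.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value for an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def mcp_server_config(url: str, username: str, password: str) -> dict:
    """Fill in the generic server shape; field names come from the example above."""
    return {
        "name": "ferrosa-memory",
        "transport": "http",
        "url": url,
        "headers": {"Authorization": basic_auth_header(username, password)},
    }
```

For instance, basic_auth_header("user", "pass") returns "Basic dXNlcjpwYXNz". Remember that base64 is an encoding, not encryption, which is why TLS matters for anything beyond loopback.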

Smoke-test MCP with curl

curl -sS -u ferrosa_user:ferrosa_user \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
  http://127.0.0.1:18765/mcp

curl -sS -u ferrosa_user:ferrosa_user \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"get_stats","arguments":{}}}' \
  http://127.0.0.1:18765/mcp
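The same smoke test can be sketched in Python with only the standard library. This assumes, as the curl examples imply, that the server answers each POST with a single JSON body; the helper names build_request and call are hypothetical, and the default credentials are the local development ones from the curl commands.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, Optional

MCP_URL = "http://127.0.0.1:18765/mcp"  # endpoint from the curl examples


def build_request(method: str, params: Optional[Dict[str, Any]] = None,
                  request_id: int = 1, username: str = "ferrosa_user",
                  password: str = "ferrosa_user",
                  url: str = MCP_URL) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 POST with a Basic auth header, without sending it."""
    payload = {"jsonrpc": "2.0", "id": request_id,
               "method": method, "params": params or {}}
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )


def call(method: str, params: Optional[Dict[str, Any]] = None, **kwargs: Any) -> Any:
    """Send the request and decode the JSON response body."""
    request = build_request(method, params, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, call("tools/list") mirrors the first curl command and call("tools/call", {"name": "get_stats", "arguments": {}}, request_id=2) mirrors the second.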

Step 3 · Ingest and retrieve

This is the payoff. Store a piece of knowledge once, then ask for it back in plain language — your agent gets the relevant memory without re-reading anything. Try these from your agent or the smoke-test tools above.

Ingest a memory entity

{
  "tool": "smart_ingest",
  "arguments": {
    "entity_name": "Project migration rule",
    "entity_type": "decision",
    "content": "Schema migrations should be additive by default so live memory clusters can be upgraded without deleting data."
  }
}

Retrieve related memory

{
  "tool": "hybrid_search",
  "arguments": {
    "query": "what migration rule did we decide on for live memory clusters",
    "limit": 5
  }
}

Record an evolving fact

{
  "tool": "write_temporal_fact",
  "arguments": {
    "entity_id": "<entity UUID from smart_ingest>",
    "fact_text": "Current rule: prefer additive migrations for live memory clusters."
  }
}

Search raw context segments

{
  "tool": "search_context_segments",
  "arguments": {
    "query": "migration policy additive",
    "limit": 5,
    "expand": {"prev": 1, "next": 1, "max_tokens": 2000}
  }
}
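The examples above use a compact tool/arguments shape, while the curl smoke test sends a JSON-RPC tools/call envelope. When driving the server by hand rather than through an agent, a conversion along these lines may help. This is a sketch; the function name to_tools_call is hypothetical, and the envelope layout is taken from the get_stats curl example.

```python
import json
from typing import Any, Dict


def to_tools_call(example: Dict[str, Any], request_id: int = 1) -> Dict[str, Any]:
    """Wrap a {"tool": ..., "arguments": ...} example in a JSON-RPC tools/call envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": example["tool"],
            "arguments": example.get("arguments", {}),
        },
    }


# The hybrid_search example from above, ready to POST to /mcp.
search = {
    "tool": "hybrid_search",
    "arguments": {
        "query": "what migration rule did we decide on for live memory clusters",
        "limit": 5,
    },
}
body = json.dumps(to_tools_call(search, request_id=3))
```

The resulting body string can be passed to curl with -d, in place of the inline JSON used in the smoke test.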

Workbench, hooks, and evals

The workbench at http://127.0.0.1:18765/ includes CQL, SPARQL, Datalog, graph/viz, aliases, rules, approvals, explanations, and Judge Config pages. Use the separate viz page at http://127.0.0.1:18766/viz to inspect linked entities and graph neighborhoods.

The onboarding flow can install Codex, Claude, and Hermes hooks. Hooks capture session turns, working directory metadata, and compact retrieval feedback so memories learned in a repository can be preferred when future agents work from the same directory.

cd ~/src/ferrosa-suite/ferrosa-memory
./setup.sh --harness auto

# or run only hook installation
python3 scripts/install-agent-hooks.py --harness auto --verify

Agents should call feedback after retrieval when they can judge returned items: 1 for useful, -1 for irrelevant or wrong, 0 for neutral, and "-" when a judge abstains or fails.
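As an illustration of that score vocabulary, the sketch below maps a judgement onto the four values. The function name score_item and its parameters are hypothetical; the argument shape of the feedback tool itself is not shown here, so the sketch only produces the score.

```python
from typing import Optional, Union

Score = Union[int, str]
ABSTAIN = "-"  # recorded when a judge abstains or fails


def score_item(useful: Optional[bool], judged: bool = True) -> Score:
    """Map a judgement onto the compact feedback scores.

    useful=True  -> 1   (useful)
    useful=False -> -1  (irrelevant or wrong)
    useful=None  -> 0   (neutral)
    judged=False -> "-" (judge abstained or failed)
    """
    if not judged:
        return ABSTAIN
    if useful is None:
        return 0
    return 1 if useful else -1
```

Keeping abstentions distinct from a neutral 0 preserves the difference between "judged and found unremarkable" and "never judged".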

# deterministic harness smoke test
scripts/run-official-evals.sh --self-test

# tune MCP fusion/decomposition profiles
FMEM_EVAL_QUERY_DECOMPOSITION=llm \
FMEM_EVAL_QUERY_EMBED_VARIANTS=true \
scripts/run-fusion-ablations.sh

# start a resumable full-corpus BRIGHT-Pro MCP ingest
scripts/start-bright-pro-full-load.sh

The full-corpus loader writes heartbeat.json, progress.json, and load.log under diagnostics/eval-runs/... so long runs can be monitored or resumed without attaching to a terminal.
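One way to watch a long run without attaching to a terminal is to check how recently heartbeat.json was touched. The Python sketch below does only that. It assumes the loader rewrites heartbeat.json periodically while healthy; the 300-second threshold and the function names are hypothetical, and the file's contents are deliberately not parsed because its schema is not described here.

```python
import os
import time
from typing import Optional


def heartbeat_age_seconds(run_dir: str, now: Optional[float] = None) -> Optional[float]:
    """Seconds since heartbeat.json was last modified, or None if it is missing."""
    path = os.path.join(run_dir, "heartbeat.json")
    try:
        modified = os.path.getmtime(path)
    except FileNotFoundError:
        return None
    current = time.time() if now is None else now
    return current - modified


def is_stalled(run_dir: str, max_age_seconds: float = 300.0,
               now: Optional[float] = None) -> bool:
    """Treat a missing or old heartbeat as a stalled run (threshold is an assumption)."""
    age = heartbeat_age_seconds(run_dir, now)
    return age is None or age > max_age_seconds
```

Point run_dir at the specific run directory under diagnostics/eval-runs/... and call is_stalled from a cron job or a shell loop.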

Operate safely

Stop and start without deleting data:

cd ~/src/ferrosa-suite/ferrosa-memory
docker compose stop
docker compose start
# or: podman compose stop && podman compose start

After rebuilding an image, recreate nodes one at a time and wait for health before moving on:

docker compose up -d --no-deps --force-recreate node1

Do not run docker compose down -v (or podman compose down -v) unless you intentionally want to delete persisted memory data; the -v flag removes the stack's volumes.
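The one-node-at-a-time procedure can be sketched as a small Python loop. The node names node1 to node3 match the service names used in the logs command under Troubleshooting; the function names, the timeout, and the readiness callback are hypothetical. What counts as "healthy" for a node is left to the caller, since the per-node health signal is not specified here.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def recreate_command(node: str, engine: str = "docker") -> List[str]:
    """Compose command that force-recreates a single service without its dependencies."""
    return [engine, "compose", "up", "-d", "--no-deps", "--force-recreate", node]


def _run_checked(command: List[str]) -> None:
    subprocess.run(command, check=True)


def rolling_recreate(nodes: Sequence[str],
                     is_ready: Callable[[str], bool],
                     run: Callable[[List[str]], None] = _run_checked,
                     timeout_seconds: float = 300.0,
                     poll_seconds: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> None:
    """Recreate nodes in order, waiting for each to report ready before the next."""
    for node in nodes:
        run(recreate_command(node))
        deadline = time.monotonic() + timeout_seconds
        while not is_ready(node):
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{node} did not become ready in time")
            sleep(poll_seconds)
```

Run it from the ferrosa-memory directory with a readiness callback of your choice, for example one that probes http://127.0.0.1:18765/healthz/ready. Stopping on the first timeout is deliberate: it leaves the remaining nodes untouched so the cluster keeps serving while you investigate.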

Troubleshooting

Symptom                         Check
live fails                      MCP container status, port mapping, HTTP vs HTTPS config.
ready fails                     CQL node health, auth, contact points.
Agent cannot list tools         MCP config path and harness restart/reload.
get_stats times out             Node replay/OOM logs and CQL read timeouts.
Container MCP cannot reach CQL  Host networking or container-routable contact points if config uses localhost.

cd ~/src/ferrosa-suite/ferrosa-memory
docker compose logs --tail=100 node1 node2 node3 ferrosa-memory-mcp

You're up — what next?

Nice work. Once memory is running and connected, here's where to go from here: