Ferrosa Memory 0.15 (Developer Preview): structured, linked, auditable memory for agentic systems.
Usage guide · 0.15 · Developer Preview

How to use it.

A hands-on guide to putting Ferrosa Memory to work in a real agent loop — ingest your codebase once, retrieve instead of grepping, link what relates, record decisions as they evolve, and forget on purpose.

The core loop

Five verbs you'll use every session.

Day-to-day, Ferrosa Memory comes down to a short loop. Ingest what you learn, retrieve before you do work, link facts that relate, record facts as they change, and forget deliberately when something is wrong or stale. The MCP tool names map straight onto these verbs.

  • Ingest (smart_ingest): Store an insight, decision, or relationship. The server decides whether to create, update, supersede, or skip.
  • Retrieve (hybrid_search): Ask in natural language; get a fused, ranked answer before you grep or read files.
  • Link (create_edge): Connect two related facts with a typed edge such as depends_on.
  • Record (write_temporal_fact): Note a value that changes over time. The old one is superseded, not erased.
  • Forget (forget): Propose candidates, review the blast radius, then confirm. Reversible by default.

Ingest — store what you learned

{
  "tool": "smart_ingest",
  "arguments": {
    "entity_name": "Auth token rotation",
    "entity_type": "decision",
    "content": "Access tokens rotate every 15 min; refresh tokens are single-use and stored hashed."
  }
}

Store insights, decisions, relationships, and facts — not raw file contents. smart_ingest deduplicates against what's already there and supersedes when it recognizes an update.

Link — connect what relates

{
  "tool": "create_edge",
  "arguments": {
    "from": "Auth token rotation",
    "to": "Session store schema",
    "edge_type": "depends_on"
  }
}

After learning two related facts, link them. Edge types include depends_on, contains, part_of, related_to, calls, implements, uses, and references.

Retrieve — ask before you grep

{
  "tool": "hybrid_search",
  "arguments": {
    "query": "how do we rotate auth tokens",
    "limit": 5
  }
}

If hybrid_search returns what you need, you're done — no grep, no find, no re-reading files. For document-chunk hits, call get_chunk_context to expand around a match and recover the surrounding lines.

Record — facts that evolve

{
  "tool": "write_temporal_fact",
  "arguments": {
    "entity_id": "<entity UUID from smart_ingest>",
    "fact_text": "Rotation window changed from 15 min to 10 min."
  }
}

Temporal facts are timestamped and superseded rather than overwritten. Retrieval returns the latest value by default; the full chain stays inspectable for audit.
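To make "superseded rather than overwritten" concrete, here is a minimal in-memory sketch of a supersession chain. It illustrates the idea only and is not Ferrosa Memory's storage model; the `TemporalFact` and `FactChain` names and their fields are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class TemporalFact:
    fact_text: str
    recorded_at: datetime
    superseded_by: Optional[int] = None  # index of the newer fact, if any


@dataclass
class FactChain:
    """Append-only history for one entity (hypothetical model)."""
    facts: List[TemporalFact] = field(default_factory=list)

    def write(self, fact_text: str, at: Optional[datetime] = None) -> None:
        at = at or datetime.now(timezone.utc)
        if self.facts:
            # Point the previous head at its successor instead of deleting it.
            self.facts[-1].superseded_by = len(self.facts)
        self.facts.append(TemporalFact(fact_text, at))

    def latest(self) -> Optional[TemporalFact]:
        # Default read path: only the newest value.
        return self.facts[-1] if self.facts else None

    def history(self) -> List[TemporalFact]:
        # Audit path: the full chain, oldest first.
        return list(self.facts)


chain = FactChain()
chain.write("Rotation window is 15 min.")
chain.write("Rotation window changed from 15 min to 10 min.")
```

Reading `chain.latest()` gives the 10-minute fact, while `chain.history()` still holds the 15-minute one with a pointer to its successor.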

Forgetting is two-phase and safe to use day-to-day: call forget with a query to propose candidates and see each one's blast radius, then call it again with confirm: true and the IDs you approved. The default mode is a reversible retraction (restore with restore_forgotten); a hard delete is permanent. Never confirm on a user's behalf.
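The two phases can be sketched as a pair of payload builders in the same `tool` / `arguments` shape used above. The `query` and `confirm` arguments come from this guide; the `ids` argument name is an assumption, so check the tool schema for the real field. The empty-list guard reflects the rule that an agent should never confirm without explicit approval.

```python
import json
from typing import Dict, List


def propose_forget(query: str) -> Dict:
    """Phase 1: ask for candidates and their blast radius. Nothing is removed."""
    return {"tool": "forget", "arguments": {"query": query}}


def confirm_forget(approved_ids: List[str]) -> Dict:
    """Phase 2: confirm only the IDs a human explicitly approved."""
    if not approved_ids:
        raise ValueError("nothing approved; refusing to build a confirm call")
    # "ids" is a hypothetical argument name; "confirm" is named in the guide.
    return {"tool": "forget", "arguments": {"confirm": True, "ids": list(approved_ids)}}


proposal = propose_forget("stale notes about the old session store")
print(json.dumps(proposal, indent=2))
```

Keeping the two builders separate makes it hard to send a confirmation by accident: the second call cannot be constructed until someone supplies the approved IDs.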

Ingest your codebase. Stop grepping.

The single most useful habit: ingest your whole codebase and docs into memory once, then let the agent retrieve the relevant functions and files semantically — instead of grepping, listing directories, and re-reading whole files at the start of every session.

Step 1

Ingest the repo and its docs — once

Walk the source tree and feed files through smart_ingest (or batch_ingest for bulk). Documents are stored as a document → section → chunk hierarchy with semantic previous/next links, so a single hit can be expanded back into full surrounding context. Re-run only on the files that changed to keep memory fresh — you don't re-ingest the world every session.
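One way to re-run ingestion only on changed files is to keep a manifest of content hashes on the client side. The sketch below is an assumption about how a harness might do this, not a Ferrosa Memory feature; the `changed_files` helper, the manifest format, and the suffix filter are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(
    root: Path,
    manifest: Dict[str, str],
    suffixes: Tuple[str, ...] = (".py", ".md"),
) -> List[Path]:
    """Return files that are new or modified since the manifest was last updated.

    The manifest maps a POSIX-style relative path to the last ingested digest
    and is updated in place.
    """
    changed: List[Path] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        key = path.relative_to(root).as_posix()
        digest = file_digest(path)
        if manifest.get(key) != digest:
            manifest[key] = digest
            changed.append(path)
    return changed
```

On the first run every matching file is returned; on later runs only edited or new files are, and those are the ones to send through smart_ingest or batch_ingest. Persisting the manifest between sessions (for example as a JSON file) is left to the harness.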

Step 2

Retrieve functions and files semantically

Next session, the agent asks hybrid_search "where do we validate refresh tokens?" and gets the relevant function back, ranked by a fusion of vector, lexical (BM25), phonetic, and graph signals. No directory listing, no full-file re-read to locate one symbol. When a chunk hit needs its neighbors, get_chunk_context pulls the adjacent chunks.
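This guide does not say how the signals are fused. As an illustration of what fusing several rankings can look like, here is reciprocal rank fusion (RRF), a common technique swapped in for the example; the result IDs are made up, and `k = 60` is the conventional default rather than a Ferrosa Memory setting.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: each list contributes 1 / (k + rank) per item."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-signal rankings for "where do we validate refresh tokens?"
vector = ["validate_refresh", "rotate_tokens", "hash_token"]
lexical = ["validate_refresh", "hash_token", "session_schema"]
graph = ["rotate_tokens", "validate_refresh"]

fused = reciprocal_rank_fusion([vector, lexical, graph])
print(fused)
# ['validate_refresh', 'rotate_tokens', 'hash_token', 'session_schema']
```

An item that ranks well under several signals rises to the top even if no single signal is decisive, which is the general appeal of rank-level fusion.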

Step 3

Reuse work the agent already did

Memoization caches the results of completed, deterministic sub-calls by content hash. When the same sub-task recurs, the prior result comes back from cache instead of paying for a redundant LLM call. And session-restore hooks mean the agent opens each session already knowing what it was doing — it doesn't re-derive context it computed last time.
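Caching by content hash can be sketched in a few lines. This is an illustrative client-side model, not the server's implementation: the `ContentHashMemo` class and its key format are hypothetical, and the `summarize` function stands in for a deterministic LLM sub-call.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class ContentHashMemo:
    """Cache results of deterministic sub-calls, keyed by a hash of their inputs."""

    def __init__(self) -> None:
        self._cache: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(task: str, inputs: Dict[str, Any]) -> str:
        # Canonical JSON (sorted keys) so argument order does not change the hash.
        payload = json.dumps(
            {"task": task, "inputs": inputs}, sort_keys=True, separators=(",", ":")
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def call(
        self, task: str, inputs: Dict[str, Any], compute: Callable[[Dict[str, Any]], Any]
    ) -> Any:
        k = self.key(task, inputs)
        if k in self._cache:
            self.hits += 1
            return self._cache[k]
        self.misses += 1
        result = compute(inputs)
        self._cache[k] = result
        return result


def summarize(inputs: Dict[str, Any]) -> str:
    # Stand-in for an expensive, deterministic sub-call.
    return inputs["text"].upper()


memo = ContentHashMemo()
first = memo.call("summarize", {"text": "rotate tokens", "lang": "en"}, summarize)
second = memo.call("summarize", {"lang": "en", "text": "rotate tokens"}, summarize)
```

The second call hits the cache even though its keys arrive in a different order. This only pays off for sub-calls that really are deterministic, which is why the guide limits memoization to those.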

Without memory — every session

  • Grep the tree to find where a function lives.
  • List directories to rebuild a mental map.
  • Re-read a 2,000-line file to locate one function.
  • Re-derive context and decisions computed last time.
  • Repeat the same LLM sub-calls on the same inputs.

With memory — after one ingest

  • Ask in natural language; get the relevant function back.
  • Follow typed edges instead of re-reading to map structure.
  • Expand one chunk's neighbors instead of a whole file.
  • Open the session already knowing the prior plan.
  • Reuse cached sub-call results instead of recomputing.
Why this saves tokens, honestly: every grep result, directory listing, and full-file re-read is text that lands in the model's context window and is paid for on every turn it stays there. Retrieving a ranked, scoped answer puts only the relevant lines in context. Memoization skips redundant LLM sub-calls entirely, and session-restore avoids re-deriving context that was already computed. The mechanism is "load less, recompute less" — the exact savings depend on your repo, your queries, and your harness.
Progressive tool disclosure

A focused toolbox first. The rest when you need it.

Ferrosa Memory exposes around 80 MCP tools — but it doesn't hand them all to the model at once. They're organized into two tiers and disclosed progressively: a small, focused everyday set is exposed by default, and the fuller toolbox is revealed when the task reaches for it.

21 tools

Tier 1 — default

The everyday memory loop: smart_ingest and hybrid_search, typed edges, context expansion, triggered intentions, durable session tasks, feedback, stats, runtime config, and forget. This is what the agent sees first.

58 tools

Tier 2 — on demand

Batch work, memo & plan tracking, the full intention lifecycle, trajectory folds, bi-temporal facts, stored skills, consolidation and graph inference, and the governance plane — surfaced by passing include_all: true on tools/list.

To reveal Tier 2, your client sends include_all: true on the MCP tools/list request; the all_tools tool also enumerates the full catalog from inside a session. The agent effectively "graduates" to more tools as the work demands them.

// default — focused everyday set
{ "method": "tools/list", "params": {} }

// reveal the full toolbox when the task needs it
{ "method": "tools/list", "params": { "include_all": true } }
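The two requests above are abbreviated. On the wire, MCP messages are JSON-RPC 2.0, so each request also carries `jsonrpc` and `id` fields. A minimal sketch of building both variants follows; the `list_tools` and `jsonrpc_request` helpers are hypothetical, and most MCP client libraries build this envelope for you.

```python
import itertools
import json
from typing import Any, Dict, Optional

_request_ids = itertools.count(1)


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Wrap a method call in a JSON-RPC 2.0 envelope with a fresh request id."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params or {},
    }


def list_tools(include_all: bool = False) -> Dict[str, Any]:
    """Build a tools/list request; include_all=True asks for Tier 2 as well."""
    params = {"include_all": True} if include_all else {}
    return jsonrpc_request("tools/list", params)


default_request = list_tools()
full_request = list_tools(include_all=True)
print(json.dumps(full_request))
```

Starting with `default_request` and sending `full_request` only when a task needs batch, fold, or governance tools keeps the extra schemas out of context until then.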
🪶

Fewer tokens spent on tool schemas

Every tool definition the model sees is text in its context window. Loading a focused default set instead of all ~80 schemas means fewer tokens spent on tool definitions before the agent has done any work.

🎯

Less chance of the wrong tool

A shorter, focused list is easier for the model to choose from correctly. The advanced and operator tools stay out of the way until a task actually calls for batch work, folds, or the governance plane — then they're one flag away.

A few shapes that show up again and again.

The same loop supports very different jobs. Here are three concrete ones.

coding agent

A codebase that persists across sessions

Ingest the repo and docs once, and record decisions as they're made. Next session, the agent runs check_intentions and a hybrid_search at startup, opening with what it already knows — the codebase map, the open decisions, and where it left off — before reading a single file.

research · notes

Notes that accumulate into knowledge

Ingest papers, articles, and your own notes as you go; link findings with create_edge so related ideas connect instead of piling up. Background consolidation surfaces relationships between notes that arrived separately, and hybrid_search recalls them by meaning later — not just by the words you happened to use.

team · workgroup

One shared memory for a team

Point several agents or teammates at the same Ferrosa Memory instance and they share one graph: a decision recorded by one is retrievable by all. Durable session tasks (a focus stack, working set, and recovery hints) survive restarts and support handoff, so an interrupted agent, or a colleague's agent, can pick up where the previous one left off.

For shared or team deployments, use unique credentials and proper TLS/auth boundaries — the default local credentials are only appropriate for single-user development on loopback. See Setup for connecting an MCP client.
Good habits

Five habits that make memory pull its weight.

Memory is only as useful as the loop around it. These habits keep recall sharp and the graph trustworthy.

🧭

Set up once

Install the server, connect your MCP client, and wire the session hooks so recall and capture happen automatically. The Setup guide covers ports, embeddings, credentials, and harness hooks end to end.

🔬

Understand the internals

Want to know how the fused ranking, bi-temporal facts, dream-cycle consolidation, and auditable forgetting actually work? The How it works page walks the architecture layer by layer.

Set it up, then put it to work.

Start by ingesting one repo and asking it a question. Then add the session hooks so the loop runs itself.
