Run semantic similarity search over embeddings, and choose the index strategy that fits your storage and latency budget. This tutorial creates a secondary BTree index, a full-precision HNSW vector index, and a quantized HVQ vector index — then runs the same nearest-neighbour query against each.
Note: This tutorial assumes you have a running Ferrosa node. If you haven’t set one up yet, follow the 3-Node Cluster Setup guide first; it takes about 5 minutes.
The three index strategies
Ferrosa gives you three index strategies through plain CQL:
| Strategy | When to use it |
|---|---|
| BTree secondary index | Exact-match lookups on a scalar column. |
| HNSW vector index | Full-precision approximate nearest-neighbour search. Highest recall; stores every vector in a navigable graph sidecar. |
| HVQ vector index | Hybrid vector quantization (Q8/Q4 codes plus a staged exact-rerank tier). Near-HNSW recall while reading far fewer bytes per query; the right default when the index is larger than memory or lives in object storage. |
HVQ is selected with a single option on an otherwise ordinary vector index:
USING 'vector' WITH OPTIONS = {'method': 'hvq'}. Omit the option (or pass
'method': 'hnsw') and you get the full-precision HNSW path.
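To make the idea of "compact codes plus an exact rerank" concrete, here is a minimal Python sketch of generic scalar quantization followed by a full-precision rerank. It is an illustration under stated assumptions, not Ferrosa's HVQ implementation: the function names, the fixed [0, 1] value range, the `shortlist` size, and the choice of cosine similarity are all hypothetical and picked only for this example. The vectors are the six sample embeddings loaded later in this tutorial.

```python
# Illustrative only: generic scalar quantization with an exact rerank stage.
# This is NOT Ferrosa's HVQ code; every name and parameter here is hypothetical.
import math

# The six sample embeddings from this tutorial, keyed by article id.
VECTORS = {
    1: [0.90, 0.10, 0.00, 0.00],
    2: [0.80, 0.20, 0.10, 0.00],
    3: [0.00, 0.00, 0.90, 0.10],
    4: [0.10, 0.00, 0.80, 0.20],
    5: [0.85, 0.15, 0.05, 0.00],
    6: [0.05, 0.05, 0.85, 0.15],
}


def quantize(vec, bits):
    """Map each component (assumed to lie in [0, 1]) to a `bits`-bit integer code."""
    levels = (1 << bits) - 1
    return [min(levels, max(0, round(x * levels))) for x in vec]


def dequantize(codes, bits):
    """Approximate the original components from their integer codes."""
    levels = (1 << bits) - 1
    return [c / levels for c in codes]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def code_bytes(dim, bits):
    """Bytes needed to store `dim` components at `bits` bits each."""
    return math.ceil(dim * bits / 8)


def search(query, k, bits=8, shortlist=4):
    """Return the ids of the k nearest vectors using a two-stage search."""
    # Stage 1: score every row using only its compact codes.
    codes = {i: quantize(v, bits) for i, v in VECTORS.items()}
    coarse = sorted(
        codes,
        key=lambda i: cosine(query, dequantize(codes[i], bits)),
        reverse=True,
    )[:shortlist]
    # Stage 2: rerank only the shortlist with the full-precision vectors.
    exact = sorted(coarse, key=lambda i: cosine(query, VECTORS[i]), reverse=True)
    return exact[:k]


if __name__ == "__main__":
    print(search([0.90, 0.10, 0.00, 0.00], k=3))           # Q8 codes
    print(search([0.90, 0.10, 0.00, 0.00], k=3, bits=4))   # Q4 codes
    print(code_bytes(4, 32), code_bytes(4, 8), code_bytes(4, 4))
```

In this sketch a 4-dimensional vector of 32-bit floats takes 16 bytes, against 4 bytes of 8-bit codes or 2 bytes of 4-bit codes. The first stage touches only the codes, and full-precision vectors are read just for the shortlist, which is the general reason a quantized index can read fewer bytes per query.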
Create the schema
We model a small article catalogue with a 4-dimensional embedding and a scalar
category column. To compare HNSW and HVQ on identical data we create two
tables of the same shape — one indexed each way — and add a BTree index on the
scalar column.
-- Vector index strategies: BTree + HNSW + HVQ on the same data.
--
-- Ferrosa exposes three index strategies a user can reach from CQL:
-- 1. A secondary BTree index for exact-match lookups on a scalar column.
-- 2. A full-precision HNSW vector index (the default for USING 'vector').
-- 3. A quantized HVQ vector index (USING 'vector' WITH OPTIONS={'method':'hvq'}),
-- which stores Q8/Q4 codes plus a staged exact-rerank tier for far less
-- storage and bytes-read per query at a small recall cost.
--
-- HNSW and HVQ are demonstrated on two identically-shaped tables so the same
-- ANN query can run against each.
CREATE KEYSPACE IF NOT EXISTS semantic
  WITH replication = {
    'class': 'SimpleStrategy',
    'replication_factor': 1
};
USE semantic;
-- Full-precision HNSW table.
CREATE TABLE articles_hnsw (
    id int PRIMARY KEY,
    category text,
    title text,
    embedding vector<float, 4>
);
-- Identically-shaped table whose vector index uses the quantized HVQ method.
CREATE TABLE articles_hvq (
    id int PRIMARY KEY,
    category text,
    title text,
    embedding vector<float, 4>
);
-- 1) Secondary BTree index on a scalar column for exact-match lookups.
CREATE INDEX articles_category_idx ON articles_hnsw (category);
-- 2) Default vector index: full-precision HNSW navigable graph.
CREATE INDEX articles_hnsw_ann ON articles_hnsw (embedding) USING 'vector';
-- 3) Quantized vector index: hybrid vector quantization (Q8/Q4 + staged rerank).
CREATE INDEX articles_hvq_ann ON articles_hvq (embedding) USING 'vector'
    WITH OPTIONS = {'method': 'hvq'};
Load sample data
The six articles fall into two clear clusters of the embedding space: a
science cluster near [0.9, 0.1, 0, 0] and a history cluster near
[0, 0, 0.9, 0.1]. Vectors are written as ordinary CQL list literals.
USE semantic;
-- Six articles in two clear clusters of the 4-dimensional embedding space:
-- a "science" cluster near [0.9, 0.1, 0, 0] and a "history" cluster near
-- [0, 0, 0.9, 0.1]. Vectors are written as CQL list literals.
-- Science cluster.
INSERT INTO articles_hnsw (id, category, title, embedding) VALUES (1, 'science', 'Quantum Foundations', [0.90, 0.10, 0.00, 0.00]);
INSERT INTO articles_hnsw (id, category, title, embedding) VALUES (2, 'science', 'Relativity in Practice', [0.80, 0.20, 0.10, 0.00]);
INSERT INTO articles_hnsw (id, category, title, embedding) VALUES (5, 'science', 'Statistical Mechanics', [0.85, 0.15, 0.05, 0.00]);
-- History cluster.
INSERT INTO articles_hnsw (id, category, title, embedding) VALUES (3, 'history', 'The Bronze Age', [0.00, 0.00, 0.90, 0.10]);
INSERT INTO articles_hnsw (id, category, title, embedding) VALUES (4, 'history', 'Maritime Empires', [0.10, 0.00, 0.80, 0.20]);
INSERT INTO articles_hnsw (id, category, title, embedding) VALUES (6, 'history', 'Industrial Revolution', [0.05, 0.05, 0.85, 0.15]);
-- The same six rows in the HVQ-indexed table.
INSERT INTO articles_hvq (id, category, title, embedding) VALUES (1, 'science', 'Quantum Foundations', [0.90, 0.10, 0.00, 0.00]);
INSERT INTO articles_hvq (id, category, title, embedding) VALUES (2, 'science', 'Relativity in Practice', [0.80, 0.20, 0.10, 0.00]);
INSERT INTO articles_hvq (id, category, title, embedding) VALUES (5, 'science', 'Statistical Mechanics', [0.85, 0.15, 0.05, 0.00]);
INSERT INTO articles_hvq (id, category, title, embedding) VALUES (3, 'history', 'The Bronze Age', [0.00, 0.00, 0.90, 0.10]);
INSERT INTO articles_hvq (id, category, title, embedding) VALUES (4, 'history', 'Maritime Empires', [0.10, 0.00, 0.80, 0.20]);
INSERT INTO articles_hvq (id, category, title, embedding) VALUES (6, 'history', 'Industrial Revolution', [0.05, 0.05, 0.85, 0.15]);
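Because both tables receive the same six rows, the statements above could also be generated from a single row list rather than typed twice. The following Python sketch shows one way to do that; `ROWS`, `TABLES`, `vector_literal`, and `insert_statement` are hypothetical names introduced for illustration, and the string formatting is only safe here because the sample titles contain no quote characters.

```python
# Hypothetical helper: build the INSERT statements for both tables from one row list.
# Plain string formatting is used only because the sample data has no quotes to escape.

ROWS = [
    (1, "science", "Quantum Foundations", [0.90, 0.10, 0.00, 0.00]),
    (2, "science", "Relativity in Practice", [0.80, 0.20, 0.10, 0.00]),
    (5, "science", "Statistical Mechanics", [0.85, 0.15, 0.05, 0.00]),
    (3, "history", "The Bronze Age", [0.00, 0.00, 0.90, 0.10]),
    (4, "history", "Maritime Empires", [0.10, 0.00, 0.80, 0.20]),
    (6, "history", "Industrial Revolution", [0.05, 0.05, 0.85, 0.15]),
]

TABLES = ["articles_hnsw", "articles_hvq"]


def vector_literal(vec):
    """Render a vector as a CQL list literal with two decimal places."""
    return "[" + ", ".join(f"{x:.2f}" for x in vec) + "]"


def insert_statement(table, row):
    """Render one INSERT statement for the given table and row tuple."""
    row_id, category, title, embedding = row
    return (
        f"INSERT INTO {table} (id, category, title, embedding) "
        f"VALUES ({row_id}, '{category}', '{title}', {vector_literal(embedding)});"
    )


def all_inserts():
    """Return the statements for every table, in the order the tables are listed."""
    return [insert_statement(table, row) for table in TABLES for row in ROWS]


if __name__ == "__main__":
    for statement in all_inserts():
        print(statement)
```

For real data, values should be passed as bound parameters through a driver rather than formatted into the statement text.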
Query each index
The BTree index answers an exact-match lookup. The two vector indexes answer the
same ORDER BY … ANN OF … LIMIT nearest-neighbour query — the science-cluster
query vector returns the three science articles, and HVQ returns the same
ranking as HNSW.
USE semantic;
-- 1) BTree secondary index: exact-match lookup on the scalar `category` column.
SELECT id, title FROM articles_hnsw WHERE category = 'science';
-- 2) HNSW vector index: approximate nearest-neighbour search.
-- The query vector sits in the science cluster, so the three science
-- articles (1, 5, 2) rank highest.
SELECT id, title FROM articles_hnsw
    ORDER BY embedding ANN OF [0.90, 0.10, 0.00, 0.00] LIMIT 3;
-- 3) HVQ vector index: the identical ANN query against the quantized index.
-- Results match the HNSW ranking while the index reads far fewer bytes.
SELECT id, title FROM articles_hvq
    ORDER BY embedding ANN OF [0.90, 0.10, 0.00, 0.00] LIMIT 3;
-- 4) A history-cluster query, to confirm both clusters are separable.
SELECT id, title FROM articles_hvq
    ORDER BY embedding ANN OF [0.00, 0.00, 0.90, 0.10] LIMIT 3;
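The expected rankings in the comments above can be checked by hand with an exact, brute-force scan over the six sample vectors. The Python sketch below does that. It assumes cosine similarity as the ranking function, which this tutorial does not state explicitly, and `ARTICLES`, `cosine_similarity`, and `nearest` are hypothetical names. It is a ground-truth check for the sample data, not a model of how either index searches.

```python
# Brute-force ground truth for the tutorial's ANN queries.
# Assumption: rows are ranked by cosine similarity (the tutorial does not name the metric).
import math

# id -> (title, embedding), copied from the sample data.
ARTICLES = {
    1: ("Quantum Foundations", [0.90, 0.10, 0.00, 0.00]),
    2: ("Relativity in Practice", [0.80, 0.20, 0.10, 0.00]),
    3: ("The Bronze Age", [0.00, 0.00, 0.90, 0.10]),
    4: ("Maritime Empires", [0.10, 0.00, 0.80, 0.20]),
    5: ("Statistical Mechanics", [0.85, 0.15, 0.05, 0.00]),
    6: ("Industrial Revolution", [0.05, 0.05, 0.85, 0.15]),
}


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, k):
    """Return the ids of the k most similar articles, best match first."""
    ranked = sorted(
        ARTICLES,
        key=lambda article_id: cosine_similarity(query, ARTICLES[article_id][1]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    print(nearest([0.90, 0.10, 0.00, 0.00], 3))  # science-cluster query: [1, 5, 2]
    print(nearest([0.00, 0.00, 0.90, 0.10], 3))  # history-cluster query: [3, 6, 4]
```

For these six vectors, ranking by Euclidean distance produces the same two orderings, so the check does not hinge on the choice between those two metrics.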
Tip: The quantized index trades a small amount of recall for a large reduction in bytes read per query. For a head-to-head measurement of recall, bytes read, and latency across HNSW, IVFFlat, and HVQ, see the Vector Indexes reference and evaluation.