AtomicMemory provides two search endpoints that let you retrieve the most relevant memories for a given query. The standard search endpoint uses a hybrid scoring approach that combines vector similarity with keyword-based scoring, fused via Reciprocal Rank Fusion (RRF). The fast search endpoint uses vector similarity only, which reduces latency at the cost of some recall on exact-match or keyword-heavy queries.

POST /v1/memories/search

Hybrid semantic search scores memories using both vector embeddings and full-text search, then fuses the two ranked lists with RRF to produce a single ordered result set. This gives strong recall across natural-language questions, short keyword queries, and mixed phrasing, and it is the recommended default for most agent retrieval workflows.
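To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of memory IDs. The function name, the sample IDs, and the constant k=60 (a value commonly used for RRF) are assumptions for illustration; this page does not state which constant or tie-breaking rule AtomicMemory uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each item earns 1 / (k + rank) from every list it appears in.
    k=60 is an assumed, commonly used constant, not a documented
    AtomicMemory setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, mem_id in enumerate(ranking, start=1):
            scores[mem_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda mem_id: (-scores[mem_id], mem_id))


# Hypothetical rankings from the vector pass and the keyword pass.
vector_ranking = ["mem_a", "mem_b", "mem_c"]
keyword_ranking = ["mem_b", "mem_d", "mem_a"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

A memory that ranks well in both lists (mem_b here) ends up ahead of one that ranks first in only one list, which is why the hybrid endpoint tends to handle both semantic and keyword-style queries.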

Request

POST /v1/memories/search
query (string, required)
The search query. Can be a natural language question, a short keyword phrase, or any text you want to match against stored memory content.

scope (object, required)
Restricts search to memories that belong to the specified scope. At least one of the following keys must be present:

limit (number, default: 10)
Maximum number of memories to return. Results are ordered by descending relevance score.
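A client can check a request body against the field rules above before sending it. The sketch below is one way to do that; the function name is hypothetical, and the stricter checks (non-empty query, positive integer limit) are assumptions rather than documented server behaviour.

```python
def validate_search_body(body):
    """Return a list of problems found in a search request body."""
    errors = []

    query = body.get("query")
    if not isinstance(query, str) or not query.strip():
        errors.append("query must be a non-empty string")

    # The scope object is required and must carry at least one key.
    scope = body.get("scope")
    if not isinstance(scope, dict) or not scope:
        errors.append("scope must be an object with at least one key")

    # limit is optional and defaults to 10 when omitted.
    limit = body.get("limit", 10)
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        errors.append("limit must be a positive integer")

    return errors


ok_body = {"query": "editor preferences", "scope": {"user": "alice"}}
bad_body = {"query": "", "scope": {}, "limit": 0}

print(validate_search_body(ok_body))
print(validate_search_body(bad_body))
```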

Example Request

curl -X POST http://127.0.0.1:17350/v1/memories/search \
  -H "Authorization: Bearer local-dev-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are Alice'\''s editor preferences?",
    "scope": {"user": "alice"},
    "limit": 5
  }'
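The same request can be issued from Python using only the standard library. This is a sketch, not an official client: the helper names are hypothetical, and the base URL and API key are copied from the curl example and will differ outside local development.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:17350"  # local address from the curl example
API_KEY = "local-dev-key"            # placeholder key from the curl example


def build_search_request(query, scope, limit=10, fast=False):
    """Build (but do not send) a POST request for either search endpoint."""
    path = "/v1/memories/search/fast" if fast else "/v1/memories/search"
    body = json.dumps({"query": query, "scope": scope, "limit": limit})
    return urllib.request.Request(
        BASE_URL + path,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search_memories(query, scope, limit=10, fast=False):
    """Send the request and return the decoded JSON response."""
    request = build_search_request(query, scope, limit=limit, fast=fast)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


request = build_search_request(
    "What are Alice's editor preferences?", {"user": "alice"}, limit=5
)
print(request.get_method(), request.full_url)
```

Calling search_memories with the same arguments sends the request to a running server; passing fast=True targets /v1/memories/search/fast instead.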

Response

memories (array)
A ranked list of memory objects, ordered by descending relevance score.

total (number)
The total number of memories in scope that matched the query, before the limit was applied.

Example Response

{
  "memories": [
    {
      "id": "mem_abc123",
      "content": "Alice prefers TypeScript over JavaScript.",
      "score": 0.94,
      "createdAt": "2024-01-15T10:30:00Z",
      "scope": {"user": "alice"}
    }
  ],
  "total": 1
}
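A response like the one above can be decoded into typed objects. The sketch below assumes every memory object carries exactly the five fields shown in the example; the Memory class and parse_search_response function are hypothetical names, not part of any published SDK.

```python
import json
from dataclasses import dataclass


@dataclass
class Memory:
    id: str
    content: str
    score: float
    created_at: str
    scope: dict


def parse_search_response(payload):
    """Decode a search response into (memories, total)."""
    data = json.loads(payload)
    memories = [
        Memory(
            id=item["id"],
            content=item["content"],
            score=item["score"],
            created_at=item["createdAt"],
            scope=item["scope"],
        )
        for item in data["memories"]
    ]
    return memories, data["total"]


sample = """
{
  "memories": [
    {
      "id": "mem_abc123",
      "content": "Alice prefers TypeScript over JavaScript.",
      "score": 0.94,
      "createdAt": "2024-01-15T10:30:00Z",
      "scope": {"user": "alice"}
    }
  ],
  "total": 1
}
"""

memories, total = parse_search_response(sample)
# total counts matches before the limit, so it can exceed len(memories).
truncated = total > len(memories)
print(memories[0].content, total, truncated)
```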

POST /v1/memories/search/fast

Fast search skips the keyword-scoring pass and queries only the vector index, returning the nearest neighbours by embedding similarity. This reduces query latency compared to hybrid search. Use it when your application is latency-sensitive and your queries are well-represented by semantic similarity alone (e.g., dense natural language questions rather than sparse keyword lookups).
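As a rough illustration of a vector-only lookup, the sketch below ranks stored embeddings by cosine similarity to a query embedding and keeps the top results. Cosine similarity, the brute-force scan, and the toy two-dimensional vectors are all assumptions; this page does not specify the similarity metric or the index structure behind the fast endpoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_neighbours(query_vec, stored, limit=10):
    """Return up to `limit` memory IDs, most similar first."""
    scored = [
        (cosine_similarity(query_vec, vec), mem_id)
        for mem_id, vec in stored.items()
    ]
    # Highest similarity first; ties are broken by ID so output is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [mem_id for _, mem_id in scored[:limit]]


# Toy embeddings; real embeddings have hundreds of dimensions.
stored_vectors = {
    "mem_a": [0.9, 0.1],
    "mem_b": [0.0, 1.0],
    "mem_c": [0.5, 0.5],
}

top = nearest_neighbours([1.0, 0.0], stored_vectors, limit=2)
print(top)
```

Because no keyword pass runs, a query that depends on an exact term only ranks well if its embedding happens to sit close to the stored memory, which is the recall trade-off described above.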

Request

The request body is identical to /v1/memories/search. The same query, scope, and optional limit fields apply.

Example Request

curl -X POST http://127.0.0.1:17350/v1/memories/search/fast \
  -H "Authorization: Bearer local-dev-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "programming language preferences",
    "scope": {"user": "alice"},
    "limit": 5
  }'

Response

The response format is identical to /v1/memories/search: a memories array of ranked memory objects and a total count.