POST /v1/memories/ingest — Store Memories with Extraction
Two ingest endpoints let you store memories from conversations: full LLM extraction with AUDN-SC mutation, or fast embedding-based deduplication only.
AtomicMemory provides two paths for writing memories from a conversation. The full ingest endpoint runs LLM-based fact extraction and applies the AUDN-SC mutation algorithm (Add, Update, Delete, No-op, Supersede, Clarify) to keep your memory store consistent over time. The quick ingest endpoint skips extraction entirely and stores the raw content after an embedding-based deduplication check — trading accuracy for throughput.
Full ingest processes the conversation messages through the memory engine: facts and claims are extracted and compared against existing memories in the specified scope, and the store is then mutated according to the AUDN-SC decision logic. Contradictions are detected and resolved automatically. This is the recommended path for durable facts, stated preferences, and decisions you want to persist reliably.
The final assistant response text, if it is not already included as the last message in the messages array. Providing it here lets you pass the raw messages array from your model call without modification.
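A minimal sketch of building a full ingest request in Python, constructed but not sent. Only the path /v1/memories/ingest and the messages array come from this page; the host, the bearer-token header, and the field names scope and assistant_response are assumptions made for illustration, so check them against the actual request schema.

```python
import json
import urllib.request

# Hypothetical host; replace with your real AtomicMemory base URL.
BASE_URL = "https://api.example.com"


def build_full_ingest_request(messages, scope, assistant_response=None,
                              api_key="YOUR_API_KEY"):
    """Build (but do not send) a POST request for full ingest.

    The body field names "scope" and "assistant_response" are assumed,
    not taken from the documented schema.
    """
    body = {"messages": messages, "scope": scope}

    # Attach the final assistant text only when it is not already
    # the last message in the array.
    last = messages[-1] if messages else None
    already_included = (
        last is not None
        and last.get("role") == "assistant"
        and last.get("content") == assistant_response
    )
    if assistant_response is not None and not already_included:
        body["assistant_response"] = assistant_response

    return urllib.request.Request(
        url=f"{BASE_URL}/v1/memories/ingest",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption; use whatever scheme your account requires.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Keeping construction separate from sending makes it easy to inspect the payload first.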
Quick ingest stores content directly after running an embedding-based near-duplicate check. No LLM extraction occurs: the content you provide is stored as-is if it is sufficiently different from existing memories in scope. This makes it significantly faster and cheaper than full ingest, but it gives up fact extraction and AUDN-SC mutation, so contradictions with existing memories are not detected.
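The near-duplicate check described above can be illustrated with a small local sketch. This is not the service's implementation: the cosine-similarity metric, the 0.9 threshold, and the in-memory list standing in for the memory store are all assumptions chosen to show the general idea.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def quick_ingest(content, embedding, store, threshold=0.9):
    """Store content as-is unless a near-duplicate already exists.

    `store` is a plain list standing in for the memories in scope, and
    `threshold` is an assumed cutoff. Returns True when the content is stored.
    """
    for existing in store:
        if cosine_similarity(embedding, existing["embedding"]) >= threshold:
            return False  # near-duplicate found, skip the write
    store.append({"content": content, "embedding": embedding})
    return True
```

With a cutoff like this, a paraphrase whose embedding falls just below the threshold is stored as a second memory, which is the kind of duplicate quick ingest can let through.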
Use full ingest (/v1/memories/ingest) for durable facts, user preferences, and decisions where accuracy and contradiction detection matter. Use quick ingest (/v1/memories/ingest/quick) for high-throughput pipelines, ephemeral notes, or situations where you want fast writes and can tolerate occasional near-duplicates that fall below the similarity threshold and are stored anyway.
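One way to encode this guidance in client code is a small routing helper. The two paths come from this page; the kind labels ("fact", "preference", "decision") are hypothetical categories an application might assign, not part of the API.

```python
FULL_INGEST_PATH = "/v1/memories/ingest"
QUICK_INGEST_PATH = "/v1/memories/ingest/quick"

# Hypothetical application-level labels for content that should be
# extracted and checked for contradictions.
DURABLE_KINDS = {"fact", "preference", "decision"}


def choose_ingest_path(kind: str) -> str:
    """Route durable content to full ingest and everything else to quick ingest."""
    return FULL_INGEST_PATH if kind in DURABLE_KINDS else QUICK_INGEST_PATH
```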