Memory

What Amy remembers between turns: your goals, your preferences, and the insights that earned validation. Memory is read into every turn's context, written at the end of every turn, and is fully visible to and editable by you.

Amy is built to be a continuous companion, not a stateless oracle. Without memory, every conversation would start from zero: "remind me again, what's your morning routine?" Memory is what turns Amy from a question-answering API into something that watches your trends and nudges you toward the goals you said mattered.

This page covers what memory is, what categories exist, how it's extracted, how it's injected, and how you read, write, and delete it through the API.


What Amy remembers, and what it doesn't

Remembered:

  • Goals you've stated ("I want to lift deep sleep by 15 minutes").
  • Preferences you've revealed ("vegetarian, no fish").
  • Insights the validator confirmed about your data ("HRV drops 8.2% on days you drink >2 espressos after 2pm, validated, ρ=-0.34, N=87").
  • A bookkeeping record of every quantitative hypothesis Amy has tested on your data, so it doesn't re-test the same hypothesis four times in four different conversations.

Not remembered:

  • The full text of past conversations. Amy doesn't keep transcripts; your client does, by passing the messages array on each turn.
  • Raw biomarker or wearable data. That lives in the data tables (/v1/data/*) and is queried as needed, not memorised.
  • The contents of failed validation gates. Rejected findings are visible in the turn trace but never enter durable memory.

The dividing line: memory is for things that are durably true about the user ("you sleep poorly after late workouts") or things the user told us to act on ("the goal is to fix that"). Everything else lives elsewhere: transcripts with your client, raw measurements in the data tables, and rejected findings in the turn trace.


The four categories

| Category | What it is | Example |
| --- | --- | --- |
| goal | Something the user is trying to achieve. Tracked across turns; coaching is anchored to it. | "Lift deep sleep average by 15 minutes over the next 6 weeks." |
| insight | A validated finding about the user. Sourced from the validator's fact sheet. | "Recovery score correlates with sleep consistency (validated, ρ=0.41 over 90 days)." |
| preference | A constraint or preference the user has stated. Coaching respects these as hard filters. | "Vegetarian. No fish. Mornings only for workouts." |
| history | Significant facts about the user's past. Lifestyle, history, contextual notes that don't fit the other three. | "Marathon runner for 8 years. Resting HR was historically 48 pre-2024." |

A fifth internal category, tested_hypothesis, exists for bookkeeping. The validator writes one of these for every finding it processed (validated, conditional, or rejected) so the Hypothesis Investigator doesn't re-propose them. These are excluded from GET /v1/memory by default; pass ?include=tested_hypothesis to see them.


The Memory object

{
  "id": "mem_01HX2K3M4N5P6Q7R8S9T0V1W2X",
  "text": "Goal: lift deep sleep by 15 minutes over the next 6 weeks.",
  "category": "goal",
  "created_at": "2026-05-20T14:33:12Z",
  "source_turn_id": "turn_01HW8X...",
  "confidence": 0.8,
  "meta": null
}

| Field | Type | Notes |
| --- | --- | --- |
| id | string | Typed prefix mem_…, ULID under the prefix. |
| text | string | The memory itself. One sentence, plain English. Max 500 chars. |
| category | enum | goal · insight · preference · history. (Plus internal tested_hypothesis.) |
| created_at | ISO-8601 | When the memory was written. Memory is append-only, so there's no updated_at. |
| source_turn_id | string or null | The turn that produced this memory. null for memories you wrote directly via POST /v1/memory. |
| confidence | number or null | 0-1, from the extractor. Higher = more certain the user explicitly stated it. |
| meta | object or null | For insight memories sourced from validated findings: { finding_id, feature, target, verdict, effect }. For others: null. |

A typical user accumulates 50-200 memory entries over a few months of active use. There's no hard cap; the summary that's injected into turn context is capped at the most recent ~80 entries.
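
If you consume memory entries in a typed client, the field table can be mirrored as a type plus a runtime guard. This is a minimal sketch: the names Memory, MemoryCategory, and isMemory are illustrative rather than SDK exports, and meta is left loosely typed because the page lists its keys but not their types.

```typescript
// Illustrative types derived from the field table; not taken from the SDK.
type MemoryCategory = "goal" | "insight" | "preference" | "history" | "tested_hypothesis";

interface Memory {
  id: string;                            // "mem_" prefix followed by a ULID
  text: string;                          // one sentence, max 500 chars
  category: MemoryCategory;
  created_at: string;                    // ISO-8601 timestamp
  source_turn_id: string | null;         // null for entries written via POST /v1/memory
  confidence: number | null;             // 0-1
  meta: Record<string, unknown> | null;  // key types are not documented, so kept loose
}

const KNOWN_CATEGORIES: string[] = ["goal", "insight", "preference", "history", "tested_hypothesis"];

// Runtime check that an unknown JSON value has the documented shape.
function isMemory(value: unknown): value is Memory {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const confidenceOk =
    v.confidence === null ||
    (typeof v.confidence === "number" && v.confidence >= 0 && v.confidence <= 1);
  return (
    typeof v.id === "string" && v.id.slice(0, 4) === "mem_" &&
    typeof v.text === "string" && v.text.length <= 500 &&
    typeof v.category === "string" && KNOWN_CATEGORIES.indexOf(v.category) !== -1 &&
    typeof v.created_at === "string" && !isNaN(Date.parse(v.created_at)) &&
    (v.source_turn_id === null || typeof v.source_turn_id === "string") &&
    confidenceOk &&
    (v.meta === null || typeof v.meta === "object")
  );
}
```

The guard counts length in UTF-16 code units, which may differ from how the server counts the 500-char limit.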


How extraction works

Memory is extracted at the end of every turn (step 9 of the pipeline; see Turns: The pipeline). The extractor runs a small Sonnet call with the user's message, the assistant's answer, and a prompt that says, in effect:

Read the exchange. Emit any new durable facts about the user as JSON. Skip things that are already obvious from context.

Two sources flow in:

  1. The LLM extractor produces goal / insight / preference / history entries based on what was said.
  2. The validator writes one tested_hypothesis entry per processed finding, with the verdict and effect attached. This is deterministic, no LLM call.

Memories are append-only. The extractor never deletes: if your preferences change, write a new entry ("Switched to pescatarian 2026-05"), and it takes precedence by recency in the prompt summary.

If the extractor fails (rare: a Sonnet timeout or malformed JSON), the turn still completes successfully. Memory extraction is best-effort; the turn doesn't fail because step 9 hiccuped.

What you'll see in the trace

After the synthesis event, the SSE stream includes:

id: 87
event: agent.completed
data: {"agent":"memory","duration_ms":3200,"cost_usd":0.0014,"output_summary":"+3 entries (1 insight, 1 goal, 1 history)"}

Then turn.completed fires. The final Turn.result doesn't carry the new memory entries directly; fetch them with GET /v1/memory?after=<turn.completed_at> if you need them in the same flow.
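
A client that reads the raw SSE stream can watch for the memory agent's agent.completed event before fetching new entries. The sketch below parses one frame using the standard "field: value" SSE line format and the payload shape from the example event; parseSseFrame and isMemoryAgentDone are hypothetical helper names, not SDK functions.

```typescript
interface SseFrame {
  id: string | null;
  event: string | null;
  data: unknown; // parsed JSON payload, or null when the frame has no data lines
}

// Parse a single SSE frame (the lines between two blank lines).
function parseSseFrame(frame: string): SseFrame {
  let id: string | null = null;
  let event: string | null = null;
  const dataLines: string[] = [];
  for (const line of frame.split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep <= 0) continue; // skip blank lines and comment lines that start with ":"
    const field = line.slice(0, sep);
    // One space after the colon is a separator, not part of the value.
    const value = line.slice(sep + 1).replace(/^ /, "");
    if (field === "id") id = value;
    else if (field === "event") event = value;
    else if (field === "data") dataLines.push(value);
  }
  const raw = dataLines.join("\n");
  return { id, event, data: raw === "" ? null : JSON.parse(raw) };
}

// True when the frame reports that the memory agent finished.
function isMemoryAgentDone(frame: SseFrame): boolean {
  const d = frame.data as { agent?: unknown } | null;
  return frame.event === "agent.completed" && typeof d === "object" && d !== null && d.agent === "memory";
}
```

Because extraction is best-effort, a client should not block forever waiting for this event; turn.completed remains the signal that the turn is over.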


How injection works

Every turn injects a compact memory summary at the top of every agent's system prompt, unless you explicitly opt out. The summary is not the full JSONL but a compressed view that fits the model's attention budget:

## Goals
- (2026-05-20) Lift deep sleep average by 15 minutes over the next 6 weeks.

## Preferences
- (2026-04-12) Vegetarian. No fish.
- (2026-04-12) Mornings only for workouts.

## Insights
- (2026-05-15) Recovery score correlates with sleep consistency (validated, ρ=0.41).

## Tested hypotheses (already validated/rejected)
- (2026-05-22) [validated] Late workouts (>8pm) drop next-morning recovery (ρ=-0.31).
- (2026-05-18) [rejected] Caffeine after 2pm hurts deep sleep (ρ=0.04, no signal).

Constraints:

  • Most recent first, capped at ~80 entries total.
  • Tested hypotheses are summarised with their verdict and effect size so the Investigator can de-prioritise them.
  • The summary is read-only context; agents cannot mutate memory mid-turn. Mutations happen only at extraction (step 9).
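
To preview roughly what agents will see, the constraints above can be approximated client-side. This is a sketch under assumptions, not Amy's actual summariser: the section order, the "History" heading, and reading the verdict from meta.verdict are guesses based on the example summary.

```typescript
interface SummaryEntry {
  text: string;
  category: string;
  created_at: string; // ISO-8601
  meta: { verdict?: string } | null;
}

// Section order and the "History" heading are assumptions; the example summary shows the other four.
const SECTIONS: Array<[string, string]> = [
  ["goal", "Goals"],
  ["preference", "Preferences"],
  ["insight", "Insights"],
  ["history", "History"],
  ["tested_hypothesis", "Tested hypotheses (already validated/rejected)"],
];

function buildSummary(entries: SummaryEntry[], cap = 80): string {
  // Most recent first, then keep at most `cap` entries across all categories.
  const recent = entries
    .slice()
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at))
    .slice(0, cap);
  const blocks: string[] = [];
  for (const [category, heading] of SECTIONS) {
    const lines = recent
      .filter((e) => e.category === category)
      .map((e) => {
        const verdict = e.meta && e.meta.verdict ? `[${e.meta.verdict}] ` : "";
        return `- (${e.created_at.slice(0, 10)}) ${verdict}${e.text}`;
      });
    if (lines.length > 0) blocks.push(`## ${heading}\n${lines.join("\n")}`);
  }
  return blocks.join("\n\n");
}
```

Applying the cap before grouping means a burst of recent entries in one category can push older entries from other categories out of the summary.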

To skip injection entirely:

const turn = await amy.turns.create({
  messages: [...],
  context: { include_memory: false }
});

When to skip:

| Reason | Example |
| --- | --- |
| Running evals on isolated turns | Frozen-input regression suite |
| Memory-extraction debugging | Want to see what Amy would remember without re-using prior memory |
| Cost-sensitive batch jobs | Memory inflates the prompt by 2-5kB per agent; for a 100-turn batch that's a measurable saving |

include_memory defaults to true. Most clients should leave it on.


Reading memory

GET /v1/memory
Authorization: Bearer amy_live_…

Response:

{
  "data": [
    {
      "id": "mem_01HX...",
      "text": "Goal: lift deep sleep by 15 minutes over the next 6 weeks.",
      "category": "goal",
      "created_at": "2026-05-20T14:33:12Z",
      "source_turn_id": "turn_01HW8X...",
      "confidence": 0.8,
      "meta": null
    }
  ],
  "next_cursor": null,
  "has_more": false
}

Standard cursor pagination (API conventions). Filters:

| Param | Type | Default |
| --- | --- | --- |
| category | enum | all (excluding tested_hypothesis) |
| after | ISO-8601 | unbounded; useful for ?after=<turn.completed_at> |
| before | ISO-8601 | unbounded |
| include | tested_hypothesis | excluded by default; pass this to include |
| limit | 1–100 | 20 |
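
If you call the endpoint without the SDK, the filters can be assembled into a request path like this. A minimal sketch: memoryListPath and ListFilters are hypothetical names, and the limit check simply mirrors the documented 1–100 range on the client side.

```typescript
interface ListFilters {
  category?: "goal" | "insight" | "preference" | "history";
  after?: string;   // ISO-8601
  before?: string;  // ISO-8601
  include?: "tested_hypothesis";
  limit?: number;   // 1-100
}

// Build the path and query string for GET /v1/memory from the documented filters.
function memoryListPath(filters: ListFilters = {}): string {
  const limit = filters.limit;
  if (limit !== undefined && (Math.floor(limit) !== limit || limit < 1 || limit > 100)) {
    throw new RangeError("limit must be an integer from 1 to 100");
  }
  const order: Array<keyof ListFilters> = ["category", "after", "before", "include", "limit"];
  const pairs: string[] = [];
  for (const key of order) {
    const value = filters[key];
    if (value !== undefined) pairs.push(`${key}=${encodeURIComponent(String(value))}`);
  }
  return pairs.length > 0 ? `/v1/memory?${pairs.join("&")}` : "/v1/memory";
}
```

For example, passing a turn's completed_at as after and include: "tested_hypothesis" yields a path that returns everything written by that turn, bookkeeping records included.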

TypeScript SDK:

const { data: facts } = await amy.memory.list({ category: "goal" });
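
To walk every page (for an export, say), loop until has_more is false. The sketch below takes the page fetcher as a parameter because this page doesn't name the request parameter that carries the cursor; how you pass next_cursor back to the API or SDK is an assumption you'll need to check against the API conventions page.

```typescript
interface Page<T> {
  data: T[];
  next_cursor: string | null;
  has_more: boolean;
}

// fetchPage receives null for the first page, then each next_cursor in turn.
type FetchPage<T> = (cursor: string | null) => Promise<Page<T>>;

async function listAll<T>(fetchPage: FetchPage<T>, maxPages = 1000): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.data);
    if (!page.has_more || page.next_cursor === null) return all;
    cursor = page.next_cursor;
  }
  // Guard against a server or fetcher bug that never reports the last page.
  throw new Error("pagination did not terminate within maxPages");
}
```

With 50-200 entries per typical user and limit=100, a full listing usually takes one to three requests.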

Writing memory

You can write memory directly, useful for onboarding, settings screens, or when the user explicitly states a fact:

POST /v1/memory
Authorization: Bearer amy_live_…
Content-Type: application/json

{
  "text": "Vegetarian. No fish.",
  "category": "preference"
}

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| text | string | yes | Max 500 chars. Plain text. |
| category | enum | yes | goal · insight · preference · history. |
| confidence | number | no | Defaults to 1.0 for user-written entries. |

Response (201 Created):

{
  "id": "mem_01HX...",
  "text": "Vegetarian. No fish.",
  "category": "preference",
  "created_at": "2026-05-25T10:14:33Z",
  "source_turn_id": null,
  "confidence": 1.0,
  "meta": null
}

source_turn_id is null for user-written entries. Use it to distinguish what Amy inferred (non-null) from what the user told Amy directly (null).
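
A settings screen might use that distinction to show "things you told Amy" separately from "things Amy inferred". A small sketch; splitByProvenance is a hypothetical helper name.

```typescript
interface HasProvenance {
  source_turn_id: string | null;
}

// Partition entries by whether a turn produced them (inferred) or the client wrote them directly.
function splitByProvenance<T extends HasProvenance>(entries: T[]): { inferred: T[]; userWritten: T[] } {
  const inferred: T[] = [];
  const userWritten: T[] = [];
  for (const entry of entries) {
    if (entry.source_turn_id === null) userWritten.push(entry);
    else inferred.push(entry);
  }
  return { inferred, userWritten };
}
```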

TypeScript SDK:

const fact = await amy.memory.create({
  text: "Vegetarian. No fish.",
  category: "preference",
});

Why write directly?

  • Onboarding: ask the user about goals and preferences during setup; write them as memory entries before the first turn runs.
  • Settings UI: let the user toggle dietary preferences, contraindications, etc., as durable memory.
  • Correcting the extractor: if Amy inferred something wrong, write the correction explicitly. The summary is biased toward recent entries, so a new preference from today will override an old inference.

Deleting memory

DELETE /v1/memory/mem_01HX...
Authorization: Bearer amy_live_…

Response: 204 No Content.

Deletes are immediate and hard: the entry is removed from the JSONL store. No tombstone, no undo. The Investigator's tested_hypothesis records remain unaffected unless you delete those specifically.

TypeScript SDK:

await amy.memory.delete(fact.id);

To clear all memory at once, list and delete: there's no DELETE /v1/memory bulk endpoint in v1. The CLI's amy reset command does the equivalent client-side as part of a factory reset.
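
A list-and-delete loop might look like the sketch below. The MemoryApi interface is a hypothetical stand-in for the SDK calls shown on this page; in particular, passing limit and include to list is an assumption based on the REST filters, and this is not how amy reset is known to be implemented.

```typescript
// Hypothetical minimal client surface; adapt to the real SDK.
interface MemoryApi {
  list(params: { limit?: number; include?: "tested_hypothesis" }): Promise<{ data: Array<{ id: string }> }>;
  delete(id: string): Promise<void>;
}

// Delete every entry, including tested_hypothesis records. Returns the number deleted.
async function clearAllMemory(api: MemoryApi, maxRounds = 1000): Promise<number> {
  let deleted = 0;
  for (let round = 0; round < maxRounds; round++) {
    // Re-read the first page each round instead of following a cursor,
    // since deleting entries may shift what a saved cursor points at.
    const page = await api.list({ limit: 100, include: "tested_hypothesis" });
    if (page.data.length === 0) return deleted;
    for (const entry of page.data) {
      await api.delete(entry.id); // hard delete: no undo
      deleted++;
    }
  }
  throw new Error("memory was not empty after maxRounds rounds");
}
```

Drop the include parameter if you want to keep the Investigator's bookkeeping records.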


Retention and ownership

| Question | Answer |
| --- | --- |
| How long is memory kept? | Forever, until you delete it. No automatic expiry. |
| Who can read it? | Only the user it belongs to, via their bearer token. There's no cross-user sharing in v1. |
| What happens on account deletion? | All memory rows are dropped within 30 days, irreversibly. |
| Can I export it? | GET /v1/memory?limit=100 with cursor pagination gives you the full JSON dump. No CSV export endpoint in v1. |
| Does memory inform the model's training? | No. Memory is per-user, sent only to the LLM as context for that user's turns. Anthropic's API zero-data-retention terms apply to all model traffic. |

Privacy notes

  • Memory text is stored in D1 (SQLite) at rest, encrypted at the Cloudflare layer. It is not encrypted at the application layer in v1, so anyone with access to the database (you, in self-hosted; the Amy team, in managed deployments) can read it.
  • Memory is sent to the LLM provider (Anthropic, OpenRouter, or whichever backend you configured) as part of every turn's prompt. Provider data-retention policies apply.
  • Memory is never sent to third parties besides the LLM backend. Terra (wearable normalization) does not see memory; PubMed lookups do not include memory in queries.
  • Don't write secrets to memory. It's designed for personal health context, not credentials. If the user pastes a token into chat, the extractor is biased against capturing it, but assume nothing is filtered.

Common mistakes

Sending memory yourself in the messages array

You don't need to. Memory is injected automatically when include_memory: true (the default). Stuffing memory into the user message wastes tokens and confuses the extractor.

Deleting memory to "reset context"

Memory and conversation are separate. To start a fresh conversation, just send a fresh messages array; don't delete memory. Deletion is for things the user no longer wants Amy to know.

Treating insight memories as ground truth indefinitely

An insight is true as of when it was extracted. If the user's behaviour changes, the old insight is stale. The validator's tested_hypothesis records carry verdicts, but no one auto-invalidates an old insight. Write a new history entry ("Switched from late workouts to mornings 2026-06") so the summary's recency bias surfaces it.

Writing memory without a category

category is required. Sending null or omitting it returns 400 invalid_field.
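
Checking the payload before sending avoids the round-trip to a 400. A minimal sketch based on the documented field rules; validateNewMemory is a hypothetical helper, and treating empty text and out-of-range confidence as errors is an assumption about server behaviour.

```typescript
const WRITABLE_CATEGORIES: string[] = ["goal", "insight", "preference", "history"];

// Returns a list of problems; an empty list means the payload looks sendable.
function validateNewMemory(input: { text?: unknown; category?: unknown; confidence?: unknown }): string[] {
  const errors: string[] = [];
  if (typeof input.text !== "string" || input.text.length === 0) {
    errors.push("text is required");
  } else if (input.text.length > 500) {
    errors.push("text exceeds 500 chars");
  }
  if (typeof input.category !== "string" || WRITABLE_CATEGORIES.indexOf(input.category) === -1) {
    errors.push("category must be one of goal, insight, preference, history");
  }
  const c = input.confidence;
  if (c !== undefined && (typeof c !== "number" || c < 0 || c > 1)) {
    errors.push("confidence must be a number between 0 and 1");
  }
  return errors;
}
```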

Expecting mem_… IDs to reflect exact insertion order

ULIDs sort lexicographically by creation timestamp, which has millisecond resolution, so ID order is not guaranteed to match insertion order: two mem_… IDs created in the same millisecond can sort either way. For ordering, use created_at.
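
A sort on created_at can be sketched as follows. The tie-break on id is an addition for determinism only; it does not recover true insertion order for entries that share a timestamp.

```typescript
// Return a new array ordered oldest-first by created_at.
function sortByCreatedAt<T extends { id: string; created_at: string }>(entries: T[]): T[] {
  return entries.slice().sort((a, b) => {
    const diff = Date.parse(a.created_at) - Date.parse(b.created_at);
    if (diff !== 0) return diff;
    // Same timestamp: fall back to id so repeated sorts give the same result.
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
  });
}
```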

Asking Amy "what do you remember about me?"

This works, because the model has the memory summary in context, but it's expensive (a full turn for what could be a GET /v1/memory call). For "show me what's stored," use the API directly. The CLI's amy memory command does exactly this without burning an LLM round-trip.

Bulk-importing memories without a category mapping

If you're migrating from another system, map source categories onto Amy's four explicitly. Don't fall back to insight as a default, because that pollutes the validator's bookkeeping channel. Use history for anything you can't categorise cleanly.
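
An explicit mapping with a history fallback could look like this. The source labels in CATEGORY_MAP are invented for illustration; replace them with whatever your previous system used.

```typescript
type AmyCategory = "goal" | "insight" | "preference" | "history";

// Hypothetical source-system labels mapped onto Amy's categories.
const CATEGORY_MAP: Record<string, AmyCategory> = {
  objective: "goal",
  target: "goal",
  diet: "preference",
  setting: "preference",
  background: "history",
};

// Unknown labels fall back to "history", never to "insight".
function mapCategory(sourceLabel: string): AmyCategory {
  const key = sourceLabel.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(CATEGORY_MAP, key) ? CATEGORY_MAP[key] : "history";
}
```

The hasOwnProperty check keeps labels such as "constructor" from matching inherited object properties.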

Re-writing a memory instead of deleting + writing

There's no PATCH /v1/memory/:id. To "edit" a memory, write a new entry with the updated text and delete the old one. The summary respects recency, so the new entry wins in agent context immediately.
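
The write-then-delete sequence can be wrapped in a helper. A sketch only: MemoryWriter is a hypothetical interface shaped like the create and delete SDK calls shown earlier, and replaceMemory is not an SDK method.

```typescript
// Hypothetical minimal client surface; adapt to the real SDK.
interface MemoryWriter {
  create(input: { text: string; category: string }): Promise<{ id: string }>;
  delete(id: string): Promise<void>;
}

// "Edit" an entry by writing the new text and then removing the old entry.
async function replaceMemory(api: MemoryWriter, oldId: string, text: string, category: string): Promise<string> {
  // Create first: if this call fails, the old entry is still intact.
  const created = await api.create({ text, category });
  await api.delete(oldId);
  return created.id;
}
```

If the delete fails after the create succeeds, both entries exist for a while; the newer one still wins in the summary by recency, so retrying the delete later is safe.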

