Agent memory · 2026 guide

Which memory provider should you use?

Eight providers compared — hosting, free tiers, benchmarks, architecture, and a concrete pick for each major use case.

🕐 Last verified April 2026
All providers at a glance

Scroll right on mobile. "Best for" and pricing are the most decision-relevant columns.

| Provider | Best for | Free tier | Paid from | Hosting | License | Stars | Tools | Benchmark |
|---|---|---|---|---|---|---|---|---|
| Mem0 | All-around default | 10K adds / 1K recalls | $19/mo Starter · $249/mo Pro (Graph) | Cloud + self-host | Apache 2.0 | 51.4K ⭐ | 3 | LongMemEval-S 67.6% |
| Hindsight | Best benchmarks, coding | Full local, free | $15/M retain · $0.75/M recall · $3/M reflect | Local + cloud | MIT | 2.4K ⭐ | 4 | LongMemEval 91.4–94.6% · BEAM 64.1% · LoCoMo 89.6% |
| ByteRover | Multi-hop / temporal | Local CLI free | $19/mo Pro · $35/user/mo Team | Local + cloud | Partial OSS | 4.2K ⭐ | 3 | LoCoMo 92.2% (single-hop 95.4% · temporal 94.4%) |
| Supermemory | Search-heavy / RAG | 1M tokens · 10K searches | $19/mo Pro · $399/mo Scale | Cloud only | Proprietary | ~18K ⭐ | 4 | LongMemEval 81.6% (with GPT-4o) |
| Holographic | Zero cost, privacy-first | Fully local, free | — | Local only | MIT | — | 2 | — |
| OpenViking | On-prem / air-gapped | Self-host, free | — | Self-host only | AGPL-3.0 | ~17.9K ⭐ | 5 | — |
| Honcho | User modeling | $100 free credits | $2/M ingested · $0.001+/query | Cloud + self-host | AGPL-3.0 | 414 ⭐ | 4 | — |
| RetainDB | Structured schema recall | None | $20/mo | Cloud only | Proprietary | — | 5 | — |
Benchmarks are not directly comparable. LoCoMo (ByteRover), LongMemEval (Hindsight, Supermemory), LongMemEval-S (Mem0), and BEAM test different tasks on different data. Hindsight's scores come from an independently validated arXiv paper (Virginia Tech) — the 94.6% figure is from Vectorize's own homepage. Use as directional guidance only. As of April 2026.
The fast version

Architecture, pricing, tools exposed, and what each provider actually does well.

🧠
Mem0
Apache 2.0 · mem0ai/mem0 · 51.4K ⭐

Hybrid triple-store: vector + key-value + knowledge graph. An LLM pass extracts structured facts from conversations and stores them across all three layers. Most widely integrated option — first-class Python, TypeScript, and OpenAI-compatible SDKs.

Freemium Cloud + self-host 3 tools
Pricing: Free 10K adds / 1K recalls · $19/mo Starter · $249/mo Pro (Graph Memory)
Benchmark: LongMemEval-S 67.6%
Best for: all-around use, teams wanting a managed service with a strong SDK and no vendor lock-in (self-host option, Apache license).
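The triple-store division of labor described above can be sketched in a few lines. This is a toy illustration of the pattern, not Mem0's actual implementation: the class, fields, and hand-typed embeddings are all invented, and a real deployment would get embeddings from a model.

```python
import math

# Toy hybrid triple-store: one extracted fact fans out to a key-value
# store, a vector index, and an entity graph (all names illustrative).
class TripleStore:
    def __init__(self):
        self.kv = {}        # fact id -> raw text
        self.vectors = {}   # fact id -> embedding
        self.graph = set()  # (subject, relation, object) triples

    def add(self, fid, text, embedding, triple):
        self.kv[fid] = text
        self.vectors[fid] = embedding
        self.graph.add(triple)

    def search(self, query_emb, k=1):
        # fuzzy recall: rank stored facts by cosine similarity
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) *
                          math.sqrt(sum(y * y for y in b)))
        ranked = sorted(self.vectors,
                        key=lambda f: cos(self.vectors[f], query_emb),
                        reverse=True)
        return [self.kv[f] for f in ranked[:k]]

    def related(self, entity):
        # relation hops: every triple touching an entity
        return {t for t in self.graph if entity in (t[0], t[2])}

store = TripleStore()
store.add("f1", "Alice prefers PostgreSQL", [0.9, 0.1],
          ("Alice", "prefers", "PostgreSQL"))
store.add("f2", "Deploys run on Fridays", [0.1, 0.9],
          ("team", "deploys_on", "Friday"))
```

Vector search answers fuzzy queries, the key-value layer returns exact text, and the graph supports relation lookups, mirroring the three layers the description assigns to Mem0.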
🔬
Hindsight
MIT · vectorize-io/hindsight · 2.4K ⭐

TEMPR architecture: four parallel retrieval strategies — temporal, entity, metadata, and BM25 for exact keyword matches. Strong at structured technical recall: port numbers, error codes, service names, deployment configs. Three-stage pipeline: retain (ingest) → recall (retrieve) → reflect (synthesize across stored knowledge).

Local = free Local + cloud 4 tools
Pricing: Full local free · Cloud: $15/M retain · $0.75/M recall · $3/M reflect
Benchmarks: LongMemEval 91.4–94.6% · BEAM 64.1% · LoCoMo 89.6%
Best for: coding agents, privacy-first local setups, anyone who wants the best published benchmark scores. Highest LongMemEval score of any provider listed.
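To make "parallel retrieval strategies" concrete, here is a minimal two-strategy sketch: a BM25 keyword scorer fused with a second ranking via reciprocal rank fusion. It assumes a TEMPR-like design in spirit only; the corpus, the recency stand-in, and all function names are invented for illustration and are not Hindsight's code.

```python
import math
from collections import Counter

DOCS = {
    "d1": "service api-gateway listens on port 8443",
    "d2": "redis cache eviction policy set to allkeys-lru",
    "d3": "error code 502 from api-gateway during deploy",
}

def bm25_rank(query, docs, k1=1.5, b=0.75):
    # classic Okapi BM25 over whitespace tokens
    toks = {d: text.split() for d, text in docs.items()}
    avgdl = sum(len(t) for t in toks.values()) / len(toks)
    n = len(docs)
    scores = Counter()
    for term in query.split():
        df = sum(term in t for t in toks.values())
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for d, t in toks.items():
            tf = t.count(term)
            if tf:
                scores[d] += idf * tf * (k1 + 1) / (
                    tf + k1 * (1 - b + b * len(t) / avgdl))
    return [d for d, _ in scores.most_common()]

def rrf(rankings, k=60):
    # reciprocal rank fusion: merge parallel rankings without score calibration
    fused = Counter()
    for ranking in rankings:
        for rank, d in enumerate(ranking):
            fused[d] += 1.0 / (k + rank + 1)
    return [d for d, _ in fused.most_common()]

keyword = bm25_rank("port 8443", DOCS)  # exact technical terms: BM25's job
recency = ["d3", "d2", "d1"]            # stand-in for a temporal strategy
merged = rrf([keyword, recency])
```

The BM25 lane is what catches exact tokens like port numbers and error codes that pure embedding search tends to blur; the fusion step is one common way to merge parallel lanes.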
🤖
ByteRover
CLI: open source (custom license) · cloud: proprietary · 4.2K ⭐

Leads the LoCoMo benchmark — specifically designed for multi-hop and temporal reasoning across long conversation histories. Local CLI is open source and free. Cloud sync for cross-device persistence costs $19/mo. Built with coding agents as the primary use case.

Local CLI free Local + cloud 3 tools
Pricing: Local free · $19/mo Pro (cloud sync) · $35/user/mo Team
Benchmark: LoCoMo 92.2% (single-hop 95.4%, temporal 94.4%)
Best for: coding agents and autonomous workflows that need multi-hop or temporal reasoning. Strong #2 on LongMemEval-class tasks.
🔍
Supermemory
Proprietary · cloud API · ~18K ⭐ (consumer frontend)

Optimized for search-heavy workloads. Ingests content from many sources and surfaces it via semantic search. Generous free tier (1M tokens). The star count reflects the consumer app frontend — the core memory engine is closed source. Strong LongMemEval score but behind Hindsight.

Generous free tier Cloud only 4 tools
Pricing: Free 1M tokens · 10K searches · $19/mo Pro · $399/mo Scale
Benchmark: LongMemEval 81.6% (with GPT-4o)
Best for: knowledge bases and chatbots that need to surface relevant content from a large corpus quickly. No self-host option.
💾
Holographic
MIT · built into Hermes · local SQLite

Uses Holographic Reduced Representations (HRR) algebra on a local SQLite + FTS5 store. Zero external dependencies — no API keys, no network calls, no Docker. Memory lives in a single file in your Hermes home directory. The most private option by definition. The fact_store tool exposes 9 actions: add, search, probe, related, reason, contradict, update, remove, list.

Fully local, free Local only 2 tools
Pricing: Completely free — no cloud tier
Benchmark: Not published
Best for: zero-cost local setups, privacy-critical environments, anyone who wants memory working instantly with no accounts or config.
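The HRR algebra itself is compact enough to sketch. Binding a role vector to a filler via circular convolution produces a trace from which the filler can be approximately recovered by correlating with the role's involution. This is a toy of the math only (pure Python, small dimension), not the provider's storage code.

```python
import math
import random

random.seed(0)
D = 256  # vector dimension; real systems use larger D for cleaner recovery

def vec():
    # random HRR vector with elements ~ N(0, 1/D)
    return [random.gauss(0, 1 / math.sqrt(D)) for _ in range(D)]

def bind(a, b):
    # circular convolution: result[i] = sum_k a[k] * b[(i - k) mod D]
    return [sum(a[k] * b[(i - k) % D] for k in range(D)) for i in range(D)]

def involution(a):
    # a* = [a0, a_{D-1}, ..., a1]; convolving with a* approximates correlation
    return [a[0]] + a[:0:-1]

def unbind(trace, role):
    return bind(trace, involution(role))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

role, filler = vec(), vec()
trace = bind(role, filler)          # store: role (x) filler
recovered = unbind(trace, role)     # query: noisy copy of filler
```

Recovery is approximate (cosine similarity well above chance but below 1.0), which is why HRR systems typically clean up the result against a list of known vectors.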
🏛️
OpenViking
AGPL-3.0 · self-hosted · ~17.9K ⭐

Tiered context loading by resolution depth: L0 loads ~50-token abstracts, L1 loads ~500-token overviews, L2 loads full content on demand. Only the detail level needed for each query gets pushed into the context window — that's the mechanism behind the 80–90% token savings. Self-hosted only, AGPL. Requires Docker and an LLM provider for extraction.

Self-host, free Self-host only 5 tools
Pricing: Free to self-host · AGPL means server modifications must be disclosed if distributed
Benchmark: Not published
Best for: on-prem deployments and regulated industries needing full data sovereignty. 80–90% token reduction is real if you have large memory stores.
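The tiered-loading idea can be sketched as a loop that escalates resolution only when a shallower tier doesn't cover the query. The tier names follow the description above, but the store, the toy relevance check, and the function names are invented, not OpenViking's API.

```python
# Each entry keeps three resolutions; a query loads the shallowest tier
# that covers it, so most lookups cost ~50 tokens instead of the full doc.
MEMORY = {
    "deploy-runbook": {
        "L0": "Runbook: how we deploy the api service.",
        "L1": "Deploys run via CI on merge to main; rollback uses the "
              "previous image tag.",
        "L2": "(full runbook text, thousands of tokens)",
    },
}

def load(key, query):
    for tier in ("L0", "L1", "L2"):
        ctx = MEMORY[key][tier]
        # toy relevance check; the real system decides depth per query
        if all(word in ctx.lower() for word in query.lower().split()):
            return tier, ctx
    return "L2", MEMORY[key]["L2"]

tier_a, _ = load("deploy-runbook", "deploy")    # answered by the abstract
tier_b, _ = load("deploy-runbook", "rollback")  # needs the overview
```

Only the tier that answers gets pushed into the context window; that selectivity is the mechanism behind the 80–90% savings claim.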
👤
Honcho
AGPL-3.0 · plastic-labs/honcho · 414 ⭐

Three specialized LLM agents — Deriver (extracts user preferences), Dialectic (surfaces them in context), Dreamer (synthesizes across sessions). The only provider focused on building a persistent user model ("dialect"), not just storing facts. Available as cloud or self-hosted via the AGPL repo.

$100 free credits Cloud + self-host 4 tools
Pricing: $100 free credits · $2/M tokens ingested · $0.001+/query
Benchmark: Not published
Best for: personal AI companions and multi-user apps where each user's preferences and working style need to shape responses over time.
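The derive-then-surface loop looks roughly like this. These are rule-based stand-ins for what are LLM agents in Honcho, with all names and the regex heuristic invented for illustration.

```python
import re

profile = {}  # persistent user model, updated across sessions

def deriver(message):
    """Stand-in for the LLM pass that extracts preferences from chat."""
    m = re.search(r"i prefer (\w+)", message.lower())
    if m:
        profile["preference"] = m.group(1)

def dialectic(prompt):
    """Stand-in for the agent that surfaces the model at response time."""
    if "preference" in profile:
        return f"[user prefers {profile['preference']}] {prompt}"
    return prompt

deriver("For the record, I prefer tabs over spaces.")
enriched = dialectic("Format this file.")
```

The key design point is that the derived profile persists outside any one conversation, so every future prompt gets shaped by it without the user restating anything.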
🗃️
RetainDB
Proprietary · cloud only · domain status unverified

Database-style memory with structured schema. Explicit control over what gets stored and how it's queried — more like a managed database than an LLM memory layer. No free tier. Domain availability was inconsistent at time of writing — verify before depending on it.

No free tier Cloud only 5 tools
Pricing: $20/mo base · enterprise: contact sales
Benchmark: Not published
Best for: production apps that need structured data recall with predictable query behavior and no LLM extraction overhead.
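"More like a managed database" translates directly to SQL: an explicit schema, explicit writes, and deterministic queries with no LLM extraction step. A minimal stdlib sketch of that style (the schema and names are invented, not RetainDB's actual API):

```python
import sqlite3

# Explicit schema: you decide what is stored and how it is keyed.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE memories (
        user_id TEXT,
        key     TEXT,
        value   TEXT,
        PRIMARY KEY (user_id, key)
    )
""")

# Writes are explicit upserts, not LLM-extracted facts.
conn.execute("INSERT OR REPLACE INTO memories VALUES (?, ?, ?)",
             ("alice", "favorite_db", "postgres"))

# Reads are deterministic: same query, same answer, no retrieval ranking.
row = conn.execute(
    "SELECT value FROM memories WHERE user_id = ? AND key = ?",
    ("alice", "favorite_db"),
).fetchone()
```

The trade-off versus the LLM-extraction providers above is predictability for flexibility: nothing gets remembered unless your code writes it.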
Just tell me what to pick

Ranked by fit for each scenario — not by partnership or popularity.

💻
Coding agents

State across long sessions, multi-hop reasoning, exact technical recall (ports, configs, error codes).

  1. Hindsight: TEMPR's BM25 layer handles exact keyword matches; structured entity extraction built for technical memory; highest overall benchmarks
  2. ByteRover: leads LoCoMo (92.2% overall, temporal 94.4%); local CLI free; purpose-built for coding agents
  3. Mem0: mature SDK, works well with tool-calling patterns, self-hostable fallback
📚
Knowledge wiki / RAG

Indexing a large, growing body of content and surfacing the relevant slice at query time.

  1. Hindsight: reflect operation synthesizes across all stored knowledge; highest LongMemEval scores (91.4–94.6%)
  2. Supermemory: built for search-heavy workloads; most generous free tier (1M tokens); good LongMemEval score (81.6%)
  3. Mem0: solid recall + open Apache license if you want to self-host the index
⚖️
All-around / default pick

Best option for most use cases when you don't have a strong constraint pushing you elsewhere.

  1. Mem0: best SDK quality, largest community (51.4K stars), self-host option, Apache license — covers the most ground
  2. Hindsight: MIT, fully local free, best benchmarks — slightly more niche but stronger on technical workloads
🆓
Free / zero cost

You need memory working now with no billing setup, or your budget is zero.

  1. Holographic: built-in, MIT, local SQLite, zero config — operational in seconds
  2. Hindsight: full features locally for free; MIT; best benchmarks at zero cost
  3. OpenViking: AGPL, self-host, no external calls — free forever if you can run Docker
🔐
Privacy / air-gap

Data cannot leave your infrastructure. No external API calls, no cloud.

  1. Holographic: SQLite file on disk, zero network calls, zero external dependencies — most private by construction
  2. Hindsight: full local mode, MIT, no cloud required
  3. OpenViking: self-host only, no external calls, AGPL — needs Docker + LLM for extraction
🧑‍💬
Personal AI companion

The agent needs to learn who you are — your style, preferences, working patterns — and apply that across every session.

  1. Honcho: the only provider built specifically for user modeling; three-agent pipeline builds a persistent "dialect" of each user's preferences
  2. Mem0: reliable fact recall across sessions; works well for preference tracking even without dedicated user modeling
🏢
Enterprise / production

SSO, audit logs, SLA, on-prem support, no AGPL/copyleft risk in your product.

  1. Mem0: Apache 2.0 (no copyleft), on-prem enterprise tier, SSO, audit logs — covers the enterprise checklist
  2. Hindsight: MIT license, strong benchmarks, self-hostable with pay-per-token cloud option
Get started in 3 lines

Add the snippet to your config.yaml — the full reference is linked below.

Mem0 (cloud)

memory_provider: mem0
mem0:
  api_key: your-mem0-key   # get key at app.mem0.ai

Hindsight (local)

memory_provider: hindsight
# no API key needed for local mode
# add hindsight.api_key for cloud sync

Holographic (local, zero config)

memory_provider: holographic
# that's it — built into Hermes
# SQLite file at ~/.hermes/memory.db

ByteRover (local CLI)

memory_provider: byterover
# local free — no key required
# add byterover.api_key for cloud sync

Supermemory

memory_provider: supermemory
supermemory:
  api_key: your-supermemory-key

Honcho

memory_provider: honcho
honcho:
  api_key: your-honcho-key
# or self-host via plastic-labs/honcho

Full config reference and advanced options at hermes-agent.nousresearch.com/docs/…/memory-providers

If one thing matters most

When a single dimension drives the decision.

LongMemEval benchmark
Hindsight
91.4–94.6% — independently validated, leads by a wide margin
LoCoMo benchmark
ByteRover
92.2% — multi-hop and temporal recall leader
Most private
Holographic
Local SQLite, zero network calls, zero dependencies
Easiest to start
Holographic
Built in, one config line, no account needed
SDK + ecosystem
Mem0
51.4K stars, Python/TS/OpenAI-compat — most mature
Free tier volume
Supermemory
1M tokens + 10K searches/mo — most generous cloud free tier
User modeling
Honcho
Only provider that builds a structured model of each user
Token efficiency
OpenViking
80–90% token savings — loads only needed resolution depth (abstract → overview → full)
Tools exposed
OpenViking / RetainDB
5 tools each — most capability surface area
Commercial-safe license
Mem0 / Hindsight
Apache 2.0 and MIT — no copyleft, no disclosure requirements
On-prem / air-gap
OpenViking
Self-host only, AGPL, designed for full data sovereignty
Cheapest paid
Mem0 / Supermemory / ByteRover
All start at $19/mo — Hindsight is pay-per-token

Ready to add memory to Hermes?

Full config reference, advanced options, and provider-specific setup guides in the docs.

Memory provider docs →