Eight providers compared — hosting, free tiers, benchmarks, architecture, and a concrete pick for each major use case.
🕐 Last verified April 2026. Scroll right on mobile; "Best for" and pricing are the most decision-relevant columns.
| Provider | Best for | Free tier | Paid from | Hosting | License | Stars | Tools | Benchmark |
|---|---|---|---|---|---|---|---|---|
| Mem0 | All-around default | 10K adds / 1K recalls | $19/mo Starter · $249 Pro (Graph) | Cloud + self-host | Apache 2.0 | 51.4K ⭐ | 3 | LongMemEval-S 67.6% |
| Hindsight | Best benchmarks, coding | Full local, free | $15/M retain · $0.75/M recall · $3/M reflect | Local + cloud | MIT | 2.4K ⭐ | 4 | LongMemEval 91.4–94.6% · BEAM 64.1% · LoCoMo 89.6% |
| ByteRover | Multi-hop / temporal | Local CLI free | $19/mo Pro · $35/user/mo Team | Local + cloud | Partial OSS | 4.2K ⭐ | 3 | LoCoMo 92.2% · single-hop 95.4% · temporal 94.4% |
| Supermemory | Search-heavy / RAG | 1M tokens · 10K searches | $19/mo Pro · $399/mo Scale | Cloud only | Proprietary | ~18K ⭐ | 4 | LongMemEval 81.6% (with GPT-4o) |
| Holographic | Zero cost, privacy-first | Fully local, free | — | Local only | MIT | — | 2 | — |
| OpenViking | On-prem / air-gapped | Self-host, free | — | Self-host only | AGPL-3.0 | ~17.9K ⭐ | 5 | — |
| Honcho | User modeling | $100 free credits | $2/M ingested · $0.001+/query | Cloud + self-host | AGPL-3.0 | 414 ⭐ | 4 | — |
| RetainDB | Structured schema recall | None | $20/mo | Cloud only | Proprietary | — | 5 | — |
Architecture, pricing, tools exposed, and what each provider actually does well.
**Mem0.** Hybrid triple-store: vector + key-value + knowledge graph. An LLM pass extracts structured facts from conversations and stores them across all three layers. The most widely integrated option, with first-class Python, TypeScript, and OpenAI-compatible SDKs.
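A toy sketch of the triple-store idea (illustrative only; the class and field names below are invented, not Mem0's API): one extracted fact fans out to a vector index, a key-value map, and a knowledge graph, so different query styles can each hit their natural layer.

```python
from dataclasses import dataclass, field

@dataclass
class TripleStoreMemory:
    """Hypothetical hybrid store: each fact is written to all three layers."""
    vectors: list = field(default_factory=list)   # (embedding, fact) pairs
    kv: dict = field(default_factory=dict)        # key -> fact
    graph: list = field(default_factory=list)     # (subject, relation, object)

    def add(self, key, fact, embedding, triple):
        self.vectors.append((embedding, fact))
        self.kv[key] = fact
        self.graph.append(triple)

mem = TripleStoreMemory()
mem.add(
    key="user.editor",
    fact="Alice prefers VS Code",
    embedding=[0.1, 0.7, 0.2],                    # stand-in for a real embedding
    triple=("Alice", "prefers", "VS Code"),
)
print(mem.kv["user.editor"])   # exact key-value recall
print(mem.graph[0])            # graph-style recall by relation
```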
**Hindsight.** TEMPR architecture: four parallel retrieval strategies (temporal, entity, metadata, and BM25 for exact keyword matches). Strong at structured technical recall: port numbers, error codes, service names, deployment configs. Three-stage pipeline: retain (ingest) → recall (retrieve) → reflect (synthesize across stored knowledge).
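A minimal sketch of the parallel-strategies idea, assuming nothing about Hindsight's internals: two toy scorers (a small BM25 and a crude entity matcher) run over the same memories and merge by max score, so an exact identifier like a service name can win even when lexical overlap is otherwise weak.

```python
import math

DOCS = [
    "service auth-api listens on port 8443",
    "redis cache deployed 2024-06-01 in us-east",
    "error code E502 means upstream timeout",
]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Minimal BM25 over whitespace tokens (illustrative, not Hindsight's code)."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = [0.0] * n
    for term in query.lower().split():
        df = sum(term in t for t in tokenized)
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for i, toks in enumerate(tokenized):
            tf = toks.count(term)
            scores[i] += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(toks) / avgdl))
    return scores

def entity_scores(query, docs):
    """Crude entity strategy: match query tokens that look like identifiers."""
    ents = {t for t in query.lower().split()
            if any(c.isdigit() for c in t) or "-" in t}
    return [sum(t in d.lower().split() for t in ents) for d in docs]

query = "which port does auth-api use"
# Run the strategies in parallel and merge by max score per document.
merged = [max(b_, e_) for b_, e_ in zip(bm25_scores(query, DOCS), entity_scores(query, DOCS))]
best = DOCS[merged.index(max(merged))]
print(best)  # the exact-keyword strategies surface the auth-api port fact
```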
**ByteRover.** Leads the LoCoMo benchmark, which is specifically designed for multi-hop and temporal reasoning across long conversation histories. The local CLI is open source and free; cloud sync for cross-device persistence costs $19/mo. Built with coding agents as the primary use case.
**Supermemory.** Optimized for search-heavy workloads. Ingests content from many sources and surfaces it via semantic search. Generous free tier (1M tokens). The star count reflects the consumer app frontend; the core memory engine is closed source. Strong LongMemEval score, but behind Hindsight.
**Holographic.** Uses Holographic Reduced Representations (HRR) algebra on a local SQLite + FTS5 store. Zero external dependencies: no API keys, no network calls, no Docker. Memory lives in a single file in your Hermes home directory, making it the most private option by definition. The fact_store tool exposes 9 actions: add, search, probe, related, reason, contradict, update, remove, list.
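The HRR primitive itself is easy to demonstrate. A pure-Python sketch (not this provider's code): bind a role to a filler with circular convolution, then approximately recover the filler by probing the trace with the role via circular correlation.

```python
import random

def circ_conv(a, b):
    """Circular convolution: the HRR 'bind' operator."""
    n = len(a)
    return [sum(a[k] * b[(i - k) % n] for k in range(n)) for i in range(n)]

def circ_corr(a, b):
    """Circular correlation: the approximate 'unbind' operator."""
    n = len(a)
    return [sum(a[k] * b[(i + k) % n] for k in range(n)) for i in range(n)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

random.seed(0)
dim = 512

def rand_vec():
    # HRR vectors: i.i.d. Gaussian elements with variance 1/dim
    return [random.gauss(0, 1 / dim ** 0.5) for _ in range(dim)]

role, filler, other = rand_vec(), rand_vec(), rand_vec()
trace = circ_conv(role, filler)    # store "role:filler" as one fixed-size vector
decoded = circ_corr(role, trace)   # probe with the role to recover the filler

print(cosine(decoded, filler))  # high similarity: the filler comes back
print(cosine(decoded, other))   # near zero: unrelated vectors stay unrelated
```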
**OpenViking.** Tiered context loading by resolution depth: L0 loads ~50-token abstracts, L1 loads ~500-token overviews, L2 loads full content on demand. Only the detail level needed for each query gets pushed into the context window; that's the mechanism behind the 80–90% token savings. Self-hosted only, AGPL. Requires Docker and an LLM provider for extraction.
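The tiering mechanism can be sketched in a few lines. The routing heuristic below is an assumption for illustration, not OpenViking's actual logic: shallow questions get abstracts, deeper questions get overviews or full content.

```python
# Toy memory store with three resolution tiers per document.
MEMORY = {
    "deploy-runbook": {
        "L0": "Abstract: production deploy steps (~50 tokens).",
        "L1": "Overview: build, canary, then full rollout; rollback via blue/green.",
        "L2": "Full runbook text (thousands of tokens in a real store).",
    },
}

def load_context(doc_id, depth):
    """Return only the resolution tier the query needs (L0 < L1 < L2)."""
    return MEMORY[doc_id][depth]

def route(query):
    """Hypothetical depth router -- keyword heuristic, purely illustrative."""
    q = query.lower()
    if "exact" in q or "step" in q:
        return "L2"
    if "how" in q or "why" in q:
        return "L1"
    return "L0"

print(load_context("deploy-runbook", route("what docs do we have?")))  # abstract tier
print(load_context("deploy-runbook", route("how do we roll back?")))   # overview tier
```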
**Honcho.** Three specialized LLM agents: Deriver (extracts user preferences), Dialectic (surfaces them in context), Dreamer (synthesizes across sessions). The only provider focused on building a persistent user model ("dialect") rather than just storing facts. Available as cloud or self-hosted via the AGPL repo.
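A toy sketch of the three-agent pipeline. The function names mirror the agents, but the extraction logic is a keyword stand-in, not Honcho's implementation (which uses LLM calls at each stage).

```python
from collections import Counter

def deriver(transcript):
    """Extract candidate facts from one session (stub: lines tagged 'pref:')."""
    return [ln.split("pref:", 1)[1].strip() for ln in transcript if "pref:" in ln]

def dreamer(session_facts):
    """Synthesize across sessions: promote facts observed more than once."""
    counts = Counter(f for facts in session_facts for f in facts)
    return sorted(f for f, c in counts.items() if c >= 2)

def dialectic(user_model, query):
    """Surface the slice of the user model relevant to the current query."""
    words = set(query.lower().split())
    return [f for f in user_model if words & set(f.lower().split())]

sessions = [
    deriver(["pref: replies should be concise", "let's debug the build"]),
    deriver(["pref: replies should be concise", "pref: python over js"]),
]
model = dreamer(sessions)                        # persistent cross-session model
print(dialectic(model, "how concise should replies be?"))
```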
**RetainDB.** Database-style memory with a structured schema. Explicit control over what gets stored and how it's queried; more like a managed database than an LLM memory layer. No free tier. Domain availability was inconsistent at the time of writing; verify before depending on it.
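A hypothetical sketch of schema-constrained memory (not RetainDB's actual API): writes must match a declared schema, and reads are exact queries rather than semantic search.

```python
from dataclasses import dataclass

@dataclass
class Preference:
    """An explicit schema: only fields declared here can be stored."""
    user: str
    key: str
    value: str

class SchemaMemory:
    """Toy database-style memory layer, purely illustrative."""
    def __init__(self):
        self.rows = []

    def insert(self, row: Preference):
        self.rows.append(row)

    def query(self, **where):
        """Exact-match filtering, like a WHERE clause."""
        return [r for r in self.rows
                if all(getattr(r, k) == v for k, v in where.items())]

db = SchemaMemory()
db.insert(Preference(user="alice", key="editor", value="vscode"))
print(db.query(user="alice", key="editor")[0].value)  # prints "vscode"
```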
Ranked by fit for each scenario — not by partnership or popularity.
- **Coding agents:** state across long sessions, multi-hop reasoning, exact technical recall (ports, configs, error codes).
- **Search-heavy / RAG:** indexing a large, growing body of content and surfacing the relevant slice at query time.
- **All-around default:** the best option for most use cases when you don't have a strong constraint pushing you elsewhere.
- **Zero cost, instant start:** you need memory working now with no billing setup, or your budget is zero.
- **Privacy / air-gapped:** data cannot leave your infrastructure. No external API calls, no cloud.
- **User modeling:** the agent needs to learn who you are (your style, preferences, working patterns) and apply that across every session.
- **Enterprise:** SSO, audit logs, SLA, on-prem support, no AGPL/copyleft risk in your product.
Add to your config.yaml — full docs at the link below each snippet.
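A minimal illustrative snippet; the key names below are assumptions for illustration, since the actual schema lives in the linked docs.

```yaml
# Illustrative only -- key names are assumed, not the documented schema.
memory:
  provider: mem0              # one of the providers compared above
  api_key: ${MEM0_API_KEY}    # keep credentials in the environment
```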
Full config reference and advanced options at hermes-agent.nousresearch.com/docs/…/memory-providers
When a single dimension drives the decision.
Full config reference, advanced options, and provider-specific setup guides in the docs.
Memory provider docs →