Memory tokens are a portion of the prompt's token budget reserved for persistent assistant memory (user preferences, ongoing tasks), kept separate from the transient conversation.
Design Patterns
Trade-offs
- More memory improves personalization but adds tokens to every request, which raises cost and latency.
- Stale memories lead to outdated suggestions, so the store needs an aging mechanism.
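One way to handle both trade-offs together is to score each memory by a decaying weight and then fill a fixed token budget with the highest-scoring items. The sketch below is illustrative only: the `Memory` fields, the 30-day half-life, and the greedy selection are assumptions, not something these notes prescribe.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    tokens: int               # token count of the stored text
    age_days: float           # days since the memory was last confirmed
    importance: float = 1.0   # hypothetical base weight


def decayed_score(m: Memory, half_life_days: float = 30.0) -> float:
    """Importance halves every `half_life_days` (an assumed value)."""
    return m.importance * 0.5 ** (m.age_days / half_life_days)


def select_memories(memories: list[Memory], budget_tokens: int) -> list[Memory]:
    """Greedily keep the highest-scoring memories that fit the token budget."""
    chosen: list[Memory] = []
    used = 0
    for m in sorted(memories, key=decayed_score, reverse=True):
        if used + m.tokens <= budget_tokens:
            chosen.append(m)
            used += m.tokens
    return chosen
```

Raising `budget_tokens` keeps more memories in the prompt at higher cost per request; shortening the half-life pushes stale items out sooner.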
Current Trends (2025)
- Daily summarization jobs condense stored memories to roughly a third of their original token count (about 3× compression).
- Per-user memory is encrypted with keys held in a PII vault.
- Dynamic allocation gives power users a larger memory token budget.
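Dynamic allocation can be as simple as a capped function of recent activity. The following is a minimal sketch under assumed numbers: the function name `memory_budget`, the activity signal (sessions in the last 30 days), and the base, step, and cap values are all hypothetical.

```python
def memory_budget(
    sessions_last_30d: int,
    base: int = 256,         # assumed floor for every user, in tokens
    per_session: int = 16,   # assumed extra tokens per recent session
    cap: int = 1024,         # assumed hard ceiling to bound cost
) -> int:
    """Return a per-user memory token budget that grows with activity."""
    if sessions_last_30d < 0:
        raise ValueError("sessions_last_30d must be non-negative")
    return min(cap, base + per_session * sessions_last_30d)
```

The cap matters as much as the growth rule: without it, the heaviest users would drive per-request cost up without bound.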
Implementation Tips
- Expire memories older than 90 days unless pinned.
- Separate safety-critical memories (bans) from preferences.
- Monitor average memory token count per user over time.
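The three tips above can be sketched together as follows. All names here (`MemoryItem`, `split_stores`, `expire`, `average_tokens_per_user`) are hypothetical, and exempting the safety-critical store from expiry is an assumption about what the separation is for, not something stated in these notes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean

MAX_AGE = timedelta(days=90)  # expiry window from the tip above


@dataclass
class MemoryItem:
    text: str
    tokens: int
    created_at: datetime
    pinned: bool = False
    safety_critical: bool = False  # e.g. a ban


def split_stores(items: list[MemoryItem]) -> tuple[list[MemoryItem], list[MemoryItem]]:
    """Separate safety-critical memories from ordinary preferences."""
    safety = [i for i in items if i.safety_critical]
    prefs = [i for i in items if not i.safety_critical]
    return safety, prefs


def expire(prefs: list[MemoryItem], now: datetime) -> list[MemoryItem]:
    """Keep preferences that are pinned or at most 90 days old."""
    return [i for i in prefs if i.pinned or now - i.created_at <= MAX_AGE]


def average_tokens_per_user(store_by_user: dict[str, list[MemoryItem]]) -> float:
    """Mean of per-user total memory tokens; 0.0 when there are no users."""
    totals = [sum(i.tokens for i in items) for items in store_by_user.values()]
    return mean(totals) if totals else 0.0
```

Logging `average_tokens_per_user` on a schedule gives the trend line the last tip asks for; a steady climb suggests expiry or summarization is not keeping up.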