Model usage logging captures metadata about each request and response—prompt length, latency, cost—to enable monitoring, debugging, and billing.
Typical Log Schema
Design Trade-offs
- Verbose logs improve observability but raise storage cost.
- Storing raw prompts aids debugging but may expose PII; hash or redact.
- Real-time streaming to ELK adds ingest latency.
Current Trends (2025)
- Differentially private logging scrubs rare token sequences1.
- Columnar log storage (Iceberg) reduces cost 35 % vs row stores.
Implementation Tips
- Sample low-value traffic (e.g., health checks) at 1 %.
- Separate hot (7-day) and cold (90-day) retention tiers.
- Encrypt logs at rest and restrict analyst roles.