Subodh Jena
Blog

Writing

Life, tech, and everything in-between.

Persistence and Checkpointing: Time Travel and Recovery for LLM Agents

Apr 24, 2026 · 8 min · AI

A long-running agent that loses its state on the next deploy is not a production system. Checkpointing saves agent state after every step, enabling conversational memory, human-in-the-loop pauses, time travel for debugging, and fault-tolerant resumption.
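As a rough illustration of saving state after every step, here is a minimal in-memory sketch. Every name in it (`InMemoryCheckpointer`, `run`, `append_message`, the `thread_id` key) is hypothetical and not taken from any particular agent framework; a production store would write to a durable backend such as a database rather than a Python dict.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]
Step = Callable[[State], State]


@dataclass
class InMemoryCheckpointer:
    """Hypothetical store: one snapshot per completed step, keyed by thread."""
    history: Dict[str, List[State]] = field(default_factory=dict)

    def save(self, thread_id: str, state: State) -> None:
        # Deep-copy so later mutations cannot alter a stored snapshot.
        self.history.setdefault(thread_id, []).append(copy.deepcopy(state))

    def latest(self, thread_id: str) -> Optional[State]:
        snapshots = self.history.get(thread_id, [])
        return copy.deepcopy(snapshots[-1]) if snapshots else None

    def at(self, thread_id: str, index: int) -> State:
        # "Time travel": fetch the state as it was after an earlier step.
        return copy.deepcopy(self.history[thread_id][index])


def run(steps: List[Step], checkpointer: InMemoryCheckpointer,
        thread_id: str, initial: State) -> State:
    """Run steps in order, resuming from the last checkpoint if one exists."""
    state = checkpointer.latest(thread_id)
    if state is None:
        state = dict(initial, step=0)
    while state["step"] < len(steps):
        state = steps[state["step"]](state)
        state["step"] += 1
        # Checkpoint only after the step succeeded, so a crash mid-step
        # resumes from the previous consistent state.
        checkpointer.save(thread_id, state)
    return state


def append_message(text: str) -> Step:
    """Build a toy step that appends one message to the state."""
    def step(state: State) -> State:
        return {**state, "messages": state["messages"] + [text]}
    return step
```

Calling `run` again with the same `thread_id` after a failure picks up at the first step that did not complete, and `at(thread_id, i)` returns an earlier snapshot for inspection. The same saved snapshots are what a conversational memory or a human-in-the-loop pause would read from.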


© 2026 Subodh Jena
