Finally! GKE Pod Snapshots are Generally Available
Getting your AI/ML pods back up and running on Kubernetes just got a whole lot faster.

If you've been running AI/ML workloads on Kubernetes, you know how painful startup times can be. Loading large models into memory and initializing GPU state can take minutes, which really slows down scaling and recovery scenarios.
Well, good news! Google Kubernetes Engine (GKE) just announced that Pod Snapshots are now Generally Available. This is a pretty big deal for anyone running inference workloads or other GPU-intensive applications in GKE.
What does it mean? GKE Pod Snapshots capture a pod's in-memory state and GPU execution state—not persistent volumes. Think of it like checkpointing your running application's memory and GPU state so you can restore it almost instantly later. This can reduce startup times from minutes to seconds for workloads that load large models or initialize complex GPU state.
For GPU workloads using GKE Sandbox (gVisor), Pod Snapshots leverage NVIDIA cuda-checkpoint to capture CUDA state. This is a game-changer for AI/ML inference scenarios where you need to scale up quickly or recover pods without waiting for model initialization.
Note: Pod Snapshots are different from Volume Snapshots—those are for backing up and restoring persistent storage. Pod Snapshots focus on the runtime execution state of your application.
The best part? It's generally available on clusters running GKE version 1.35.3-gke.1234000 or later. That means it's considered stable and ready for your production workloads. No more messing around with beta features or custom solutions just to get fast pod startup.
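If you want to check whether a cluster clears that version bar, here's a minimal sketch in Python. It assumes GKE version strings look like `1.35.3-gke.1234000` (optionally with a leading `v`, as `kubectl version` prints them) and that comparing the four numeric parts in order is good enough. The helper names are made up for this post, and the minimum version is just the one quoted above, so double-check it against the release notes.

```python
import re

# Minimum version quoted in this post (an assumption; confirm in the release notes).
MIN_GA_VERSION = "1.35.3-gke.1234000"

# Matches strings such as "1.35.3-gke.1234000" or "v1.35.3-gke.1234000".
_GKE_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)$")


def parse_gke_version(version: str) -> tuple:
    """Split a GKE version string into (major, minor, patch, gke_build) integers."""
    match = _GKE_VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"not a GKE version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version: str, minimum: str = MIN_GA_VERSION) -> bool:
    """Return True if `version` is at or above `minimum`, comparing part by part."""
    return parse_gke_version(version) >= parse_gke_version(minimum)


if __name__ == "__main__":
    # Hypothetical version strings, for illustration only.
    for candidate in ["1.35.3-gke.1234000", "1.34.9-gke.5000000", "v1.36.0-gke.1000"]:
        status = "ok" if meets_minimum(candidate) else "too old"
        print(f"{candidate}: {status}")
```

Feed it whatever version string your cluster reports. The comparison runs on integer tuples, not raw strings, so `1.35.10` correctly sorts after `1.35.3`.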
Faster scale-up isn't the only win. Snapshots also make your applications more resilient: if something goes wrong, a pod can be restored from its captured in-memory and GPU state instead of initializing from scratch.
You can find all the nitty-gritty details in the official GKE release notes. Definitely check them out if you want to dive deeper into how it works and which versions you need.
It's a step in the right direction for making GPU-heavy workloads on Kubernetes less stressful. Go try it out!




