Google Gemini API now supports event-driven webhooks — no more polling for batch jobs

Polling is one of those problems that seems fine until it isn’t. You kick off a Gemini Batch API job, and then your code calls GET /operations every 30 seconds for an hour, burning API quota and blocking threads, only to learn about completion up to 30 seconds after the job actually finished. For low-volume prototypes, that’s tolerable. For production pipelines processing thousands of prompts overnight, it’s wasteful and fragile.

Google addressed this directly on May 4, 2026, by shipping event-driven webhook support in the Gemini API. The short version: instead of your code repeatedly asking “are you done yet?”, the Gemini API now calls you the moment a job finishes.

What’s actually changing

The new webhook system covers three categories of asynchronous work in the Gemini API:

Batch API jobs — the bulk processing feature that lets you submit thousands of prompts at a 50% cost discount, with jobs that can run for minutes or hours
Video generation — Veo model completions, including veo-3.1-generate-preview
Agentic interactions — workflows that reach a decision point and need human input before continuing

The full event catalog includes batch.succeeded, batch.expired, batch.failed, interaction.requires_action, interaction.completed, interaction.failed, and video.generated.

When one of these events fires, the Gemini API sends an HTTP POST to your registered endpoint with a signed payload containing status details and pointers to results. The payload deliberately does not include the raw output file itself, which keeps delivery fast and bandwidth predictable.
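
As a rough illustration of what a receiver might do with such a payload, here is a minimal dispatch sketch. The event names come from the catalog above; the payload field names "type" and "data", the "name" key, and the handler functions are assumptions made for the example, not the documented schema.

```python
import json
from typing import Callable, Dict

# Event names are from the catalog above. The payload fields "type" and
# "data" (and the "name" key inside it) are assumptions for illustration.
Handler = Callable[[dict], str]


def on_batch_succeeded(data: dict) -> str:
    # The payload only points at results, so the next step is a fetch.
    return f"fetch results for {data.get('name', 'unknown job')}"


def on_batch_failed(data: dict) -> str:
    return f"alert on-call about {data.get('name', 'unknown job')}"


HANDLERS: Dict[str, Handler] = {
    "batch.succeeded": on_batch_succeeded,
    "batch.failed": on_batch_failed,
    "batch.expired": on_batch_failed,
}


def dispatch(raw_body: bytes) -> str:
    """Route a webhook body to a handler based on its event type."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event.get("type", ""))
    if handler is None:
        # Unknown or unhandled event types are ignored rather than rejected.
        return "ignored"
    return handler(event.get("data", {}))
```

Keeping the handlers in a lookup table means adding video.generated or the interaction events later is a one-line change.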

Two ways to configure webhooks

Google has built two configuration modes, each suited to different architectures.

Static webhooks are registered once at the project level. Every qualifying event in that project routes to the configured endpoint. This works well if you have a centralised event ingestion layer, say a Pub/Sub topic or an SQS queue sitting in front of your processing logic.

Dynamic webhooks let you specify a different destination URL per request, passed inside a webhook_config parameter when you submit the job. This is more useful in multi-tenant systems or agent orchestration pipelines where different jobs need to notify different downstream services. Dynamic webhooks also support a user_metadata field for custom routing logic.
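
To make the multi-tenant case concrete, the sketch below builds a per-request configuration and routes an incoming event by tenant. Only webhook_config and user_metadata are named in the announcement; the "uri" key, the tenant_id field, the queue names, and the idea that user_metadata is echoed back in the payload are all assumptions for illustration.

```python
from typing import Dict


def build_webhook_config(callback_url: str, tenant_id: str) -> dict:
    """Build a value for the webhook_config parameter (shape is hypothetical)."""
    return {
        "uri": callback_url,  # hypothetical key name for the destination URL
        "user_metadata": {"tenant_id": tenant_id},  # custom routing data
    }


def pick_queue(event: dict, queues: Dict[str, str],
               default: str = "default-queue") -> str:
    """Choose a downstream queue from the event's user_metadata.

    Assumes the metadata supplied at submission time comes back in the
    webhook payload under the same "user_metadata" key.
    """
    tenant_id = event.get("user_metadata", {}).get("tenant_id")
    return queues.get(tenant_id, default)
```

For example, pick_queue({"user_metadata": {"tenant_id": "acme"}}, {"acme": "acme-results"}) selects the tenant's own queue, and events without metadata fall back to the default.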

The security model differs between the two. Static webhooks use HMAC with a shared secret generated at registration time. Dynamic webhooks use JSON Web Key Sets (JWKS), because there is no pre-shared secret to rely on. Your endpoint verifies the JWT signature using Google’s public keys at https://generativelanguage.googleapis.com/.well-known/jwks.json. Either way, the signing approach follows the Standard Webhooks specification, using webhook-signature, webhook-id, and webhook-timestamp headers.
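
For the static (HMAC) case, verification under the Standard Webhooks specification could look roughly like the sketch below. It follows the spec's scheme: the signed content is the id, timestamp, and raw body joined by dots, signed with HMAC-SHA256 and base64-encoded. That Gemini's secret uses the spec's base64 "whsec_" format is an assumption, and the function names are made up for the example.

```python
import base64
import hashlib
import hmac


def expected_signature(secret: str, msg_id: str, timestamp: str, body: bytes) -> str:
    """Compute the base64 HMAC-SHA256 signature per Standard Webhooks."""
    # Assumption: the secret is base64, optionally prefixed with "whsec_".
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.".encode() + body
    digest = hmac.new(key, signed_content, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def verify_signature(secret: str, msg_id: str, timestamp: str,
                     body: bytes, signature_header: str) -> bool:
    """Check the webhook-signature header against the raw request body."""
    expected = expected_signature(secret, msg_id, timestamp, body).encode()
    # The header may hold several space-separated "v1,<sig>" entries,
    # for example while a secret is being rotated.
    for entry in signature_header.split():
        version, _, candidate = entry.partition(",")
        if version == "v1" and hmac.compare_digest(candidate.encode(), expected):
            return True
    return False
```

Verify against the raw bytes of the request body, before any JSON parsing, since re-serialising the payload can change the bytes and break the signature.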

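That last point matters more than it might look. If your team already validates Stripe or GitHub webhooks, the pattern will be familiar: recompute a signature over the raw request body and compare it with the one in the header. Stripe and GitHub use their own header formats, so that code will not drop in unchanged, but existing libraries that implement the Standard Webhooks specification should work here.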

What this means for your pipelines

The practical impact depends on what you’re building.

If you have a nightly batch job processing thousands of documents, you can now fire and forget the submission, then let a webhook trigger your downstream pipeline the moment results land in Cloud Storage. No polling loop, no wasted quota, no artificial latency between job completion and your application finding out.

For video generation, the video.generated event can trigger cache pre-warming, user notifications, or CDN updates the instant a video is ready, rather than checking every few seconds.

The interaction.requires_action event is particularly useful for teams building human-in-the-loop agent workflows. Previously, if an agent reached a decision point that required human approval, you had to build your own approval queue from scratch, typically involving database polling, WebSockets, or long-poll HTTP. The native event changes that. The agent pauses, fires a webhook, your system drops a Slack message or opens a ticket, a human responds, and the agent continues via API. That whole pattern now has a supported primitive rather than requiring custom infrastructure.

The delivery mechanism provides at-least-once guarantees with exponential backoff retries for up to 24 hours. Use the consistent webhook-id header to deduplicate events on your side, since under congestion the same event can occasionally arrive more than once. Also validate the webhook-timestamp header and reject payloads older than five minutes to protect against replay attacks.
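
The deduplication and replay checks described above can be sketched as a small guard object. This version keeps seen ids in memory, which only suits a single process; the class name is hypothetical, and it assumes webhook-timestamp is an integer count of Unix seconds, as in the Standard Webhooks specification.

```python
import time
from typing import Dict, Optional


class ReplayGuard:
    """Reject stale timestamps and already-seen webhook ids."""

    def __init__(self, tolerance_seconds: int = 300) -> None:
        self.tolerance = tolerance_seconds  # five minutes, as suggested above
        # In production this would be a shared store (for example a cache
        # with a TTL covering the 24-hour retry window), not a local dict.
        self._seen: Dict[str, float] = {}

    def accept(self, webhook_id: str, webhook_timestamp: str,
               now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        try:
            sent_at = int(webhook_timestamp)  # assumed to be Unix seconds
        except ValueError:
            return False
        if abs(now - sent_at) > self.tolerance:
            return False  # too old (or too far in the future): possible replay
        if webhook_id in self._seen:
            return False  # duplicate delivery of an event already handled
        self._seen[webhook_id] = now
        return True
```

A handler would call accept() after the signature check passes and skip processing when it returns False, while still answering a rejected duplicate with a success status so the sender stops retrying.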

A few practical notes before you start

The SDK requirement is google-genai>=1.73.1. Webhooks are strictly for asynchronous jobs; they cannot be attached to standard synchronous inference calls. There is no AI Studio UI for managing webhook endpoints at launch, so configuration is API-only for now. When you list registered webhooks, the API returns up to 50 by default, with a maximum of 1,000.

One thing worth remembering: when you create a webhook, the signing secret is returned only once. Store it immediately in an environment variable or secrets manager. If you lose it, you will need to rotate it.
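
One way to act on that advice is to read the secret from the environment at startup and fail fast if it is missing. The variable name GEMINI_WEBHOOK_SECRET and the function name are hypothetical, chosen only for this sketch.

```python
import os


def load_signing_secret(env_var: str = "GEMINI_WEBHOOK_SECRET") -> str:
    """Read the webhook signing secret from the environment.

    The default variable name is a made-up convention for this example.
    """
    secret = os.environ.get(env_var, "").strip()
    if not secret:
        # Failing at startup is safer than accepting unverified webhooks.
        raise RuntimeError(f"{env_var} is not set; cannot verify webhooks")
    return secret
```

Loading the secret once at startup, rather than on each request, surfaces a misconfiguration before the first webhook arrives.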

Polling continues to work and Google has not announced any deprecation path. For simple, low-volume integrations, polling is still a perfectly reasonable choice. The webhook support makes the most difference for long-running jobs and for backends that are already event-driven.

Full documentation is available on the Gemini API webhooks page, and Google has published an end-to-end Cookbook notebook on GitHub if you want to see a working implementation before writing your own.