Developer Tools & APIs

OpenAI Container Sessions Now Billed Per Minute — Short Sessions Get Cheaper from June 2

OpenAI switches container billing to per-minute with a 5-minute minimum from June 2, 2026, cutting costs for sessions under 20 minutes.

developer tools apis category

OpenAI’s flat container billing is getting fixed

If you use OpenAI’s Code Interpreter or Hosted Shell through the Responses API, you’ve been paying for a 20-minute container session every time, regardless of whether your task finished in two minutes or eighteen. Starting June 2, 2026, that changes. OpenAI is switching to per-minute billing with a 5-minute minimum, and the per-minute rate itself stays the same.

The practical result: any session that completes in under 20 minutes now costs less than it did before.

How the old model worked

When OpenAI launched container billing on March 31, 2026, sessions were charged at a flat 20-minute block rate. The pricing was structured around container memory, so a 4 GB container ran at $0.12 per session and a 64 GB container at $1.92, regardless of how long the container was actually active.

That made sense as a simple pricing model to launch with, but it created an obvious problem for short-lived tasks. A quick code execution that finishes in 90 seconds costs the same as a complex multi-step job that runs for the full 20 minutes. For developers running lots of small, fast tasks, that’s a real overpayment.

What changes on June 2

The per-minute rate underlying those session prices hasn’t changed. What changes is that OpenAI now applies it to actual usage rather than assuming every session fills the full block.

With the new model:

  • You’re charged for the minutes your container actually runs
  • There’s a 5-minute minimum per session, so very short tasks (anything under 5 minutes) pay the equivalent of a 5-minute session
  • Sessions that genuinely run for 20 minutes or more cost exactly the same as before

The 5-minute floor is worth noting. It means there’s a baseline cost per container invocation, which keeps the economics reasonable for OpenAI while still removing the biggest source of billing waste for developers.

What this means if you’re using Code Interpreter or Hosted Shell

These are the two tools directly affected. Both use OpenAI’s hosted container infrastructure: sandboxed environments with ephemeral block storage, wiped clean when the container expires or is deleted.

If your typical workflow involves spinning up a container, running a calculation or code execution, and getting a result back quickly, you’ve been overpaying. A session that completes in 6 minutes was previously charged as 20 minutes. After June 2, it’s charged as 6.

For agentic workflows where containers are invoked repeatedly for short tasks, the savings compound across many calls. For longer analytical or computational sessions that regularly hit or exceed 20 minutes, there’s no difference.

One thing worth keeping in mind: tokens used within built-in tools are still billed at standard model rates. The container billing change only affects the container session charge itself, not the token cost of whatever model is powering the work.

Monitoring your spend

If you’re running Code Interpreter or Hosted Shell at scale, it’s worth reviewing your tool line items in the API billing dashboard after June 2. The per-session cost will likely drop if your workload skews toward shorter sessions, and it’s useful to verify that against your expectations.

OpenAI also recently added Admin API capabilities for querying granular billing line items, which is relevant here. If you’re managing spend across a team or organisation, those new controls give you better visibility into exactly what container usage is costing.

For teams with data residency requirements: regional processing endpoints carry a 10% uplift for eligible models, and that applies to any container usage routed through those endpoints as well.

The broader direction

This change is a direct response to developer feedback that the flat 20-minute block was blunt pricing for what is often fast, automated work. Billing at the granularity of actual use is the more natural model for infrastructure that’s increasingly being integrated into agentic workflows, where containers might be spun up dozens of times in a single run with varying durations.

It’s a small but meaningful quality-of-life improvement for anyone building on top of these tools. The per-minute rate didn’t need to change for the overall cost to come down, just the unit of measurement.

If you’re actively using Code Interpreter or Hosted Shell through the Responses API, June 2 is a date worth noting. No action required on your end, the change applies automatically to eligible container sessions.