OpenAI unveils its first AI chip "Jalapeño" with Broadcom: designed in 9 months, aiming to slash inference costs

On 24/06/2026, OpenAI and Broadcom announced Jalapeño, the first "Intelligence Processor" designed by OpenAI itself. It is a reticle-sized ASIC built exclusively for LLM inference, not a repurposed training chip. The most striking detail: it went from design to tape-out in just 9 months, partly thanks to OpenAI's own models assisting in the chip design process. Deployment is expected from late 2026.

Quick summary

When: announced 24/06/2026; deployment from late 2026, with a multi-generation roadmap.
What: a reticle-sized ASIC dedicated to LLM inference — the first chip designed by OpenAI itself, manufactured by Broadcom, with Celestica building boards/racks.
Speed: 9 months from design to tape-out — with OpenAI's own AI taking part in the design.
Efficiency: 'significantly better' perf/watt (per OpenAI); Bloomberg reports a target of cutting inference costs by ~50%.
Why it matters: major AI companies are building their own hardware to escape GPU dependence.

What happened?

According to OpenAI's announcement and news sources (TechCrunch, Tom's Hardware, CNBC), Jalapeño was designed for a single goal: running inference more cheaply and efficiently at massive scale. Unlike a general-purpose GPU, a dedicated ASIC trades flexibility for performance per watt, a sensible trade-off now that the computational shape of ChatGPT's inference workload has stabilized.
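To make the perf/watt trade-off concrete, here is a minimal Python sketch of how tokens per joule translate into electricity cost per million tokens. The function name and every number in it are hypothetical and chosen only for illustration; none of them are published figures for Jalapeño or any GPU, and the calculation covers electricity only, not hardware, cooling, or staffing.

```python
def energy_cost_per_million_tokens(tokens_per_joule, price_per_kwh):
    """Electricity cost of generating one million tokens.

    tokens_per_joule is the perf/watt figure: (tokens/s) divided by watts (joules/s).
    """
    joules = 1_000_000 / tokens_per_joule
    kwh = joules / 3_600_000  # 1 kWh = 3.6 million joules
    return kwh * price_per_kwh


# Hypothetical inputs, not measured values for any real chip.
general_purpose = energy_cost_per_million_tokens(tokens_per_joule=2.0, price_per_kwh=0.10)
dedicated = energy_cost_per_million_tokens(tokens_per_joule=4.0, price_per_kwh=0.10)

# Doubling perf/watt halves the energy cost per token.
print(dedicated / general_purpose)
```

Under these assumptions, energy cost per token scales inversely with perf/watt, which is why a fixed-function chip can pay off once the workload stops changing.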

The detail that most impressed the engineering community: a 9-month cycle from design to tape-out — unusually fast for a reticle-sized chip — which OpenAI says was aided by its own AI models during the design process.

Table — Jalapeño at a glance
Item	Detail
Chip type	Reticle-sized ASIC dedicated to LLM inference
Design	OpenAI
Manufacturing	Broadcom
Boards/racks	Celestica
Design → tape-out cycle	9 months
Deployment	From late 2026
Efficiency	'significantly better' perf/watt (OpenAI); target of cutting inference costs by ~50% (Bloomberg, not confirmed)

[Image: circuit board and chip on a dark background] An inference-only ASIC: trading flexibility for performance per watt. Photo: Miguel Á. Padriñán / Pexels

Why this matters

Inference, not training, is the long-term operating cost of AI. By building its own inference chip, OpenAI both reduces its dependence on NVIDIA and paves the way for lower API prices over the long run. Bloomberg reports a target of cutting inference costs by ~50%, a figure OpenAI has not publicly confirmed.

The bigger picture: Google has TPUs, Amazon has Trainium/Inferentia, Microsoft has Maia, and now OpenAI has Jalapeño — hardware independence is becoming a competitive requirement for major AI companies.

Table — Major AI companies & their in-house chips
Company	In-house chip
Google	TPU
Amazon	Trainium / Inferentia
Microsoft	Maia
OpenAI	Jalapeño

[Image: rows of servers in a data center] Hardware independence is becoming a competitive requirement for major AI companies. Photo: panumas nikhomkhai / Pexels

The takeaway for businesses

The lesson is not "businesses should build their own chips" — it is the principle behind it: whoever controls the infrastructure controls their costs and their destiny. At the scale of a Vietnamese business, this principle takes a much simpler form: running internal AI on on-premise hardware (such as an energy-efficient Apple Silicon cluster) — fixed costs, with no dependence on cloud GPU rental prices or a vendor's discount cycles.
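The fixed-cost argument can be checked with simple break-even arithmetic. The sketch below is illustrative only: the function name and all figures are hypothetical, and a real comparison would also need to account for depreciation, maintenance, and utilization.

```python
import math


def break_even_months(hardware_cost, onprem_monthly_cost, cloud_monthly_cost):
    """First month at which cumulative on-premise cost is no higher than cloud cost.

    Returns None when on-premise running costs are not lower than the cloud bill,
    because the up-front purchase then never pays for itself.
    """
    monthly_saving = cloud_monthly_cost - onprem_monthly_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical figures in an arbitrary currency unit.
months = break_even_months(
    hardware_cost=20_000,       # one-off purchase of the on-premise cluster
    onprem_monthly_cost=300,    # electricity and upkeep
    cloud_monthly_cost=1_500,   # equivalent cloud GPU rental
)
print(months)
```

With these made-up inputs the hardware pays for itself in under a year and a half; if cloud prices fall or utilization is low, the break-even point moves out or disappears entirely.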

Frequently asked questions

What is Jalapeño?

It is the first 'Intelligence Processor' designed by OpenAI together with Broadcom — a reticle-sized ASIC dedicated to LLM inference, announced 24/06/2026, with deployment from late 2026.

Does it replace NVIDIA GPUs?

Not entirely — an inference-only ASIC complements GPUs (still needed for training and flexible workloads). But it reduces dependence and costs for stable inference workloads.

Is the 50% cost reduction certain?

The ~50% figure was reported by Bloomberg and has not been publicly confirmed by OpenAI — OpenAI only says perf/watt is 'significantly better'. Treat it as an indicative target.

What can Vietnamese businesses take away?

The principle 'controlling infrastructure = controlling costs' applies at every scale — for Vietnamese businesses, that means internal AI on on-premise hardware instead of depending entirely on cloud rental prices.

Take control of your AI infrastructure

Namtech deploys internal AI on on-premise hardware — fixed costs, no dependence on cloud GPU rental prices.

Note: This article is compiled from public sources as of 02/07/2026; information is for reference and may change.