On 24/06/2026, OpenAI and Broadcom announced Jalapeño, the first "Intelligence Processor" designed by OpenAI itself. It is a reticle-sized ASIC built exclusively for LLM inference, not a repurposed training chip. The most striking detail is the schedule: just 9 months from design to tape-out, partly thanks to OpenAI's own models assisting in the chip design process. Deployment is expected from late 2026.
Quick summary
- When: announced 24/06/2026; deployment from late 2026, with a multi-generation roadmap.
- What: a reticle-sized ASIC dedicated to LLM inference; the first chip designed by OpenAI itself, developed with Broadcom, with Celestica building boards/racks.
- Speed: 9 months from design to tape-out — with OpenAI's own AI taking part in the design.
- Efficiency: "significantly better" perf/watt (per OpenAI); Bloomberg reports a target of cutting inference costs by ~50%.
- Why it matters: major AI companies are building their own hardware to escape GPU dependence.
What happened?
According to OpenAI's announcement and news coverage (TechCrunch, Tom's Hardware, CNBC), Jalapeño was designed for a single goal: running inference more cheaply and efficiently at massive scale. Unlike a general-purpose GPU, a dedicated ASIC trades flexibility for performance per watt. That trade-off makes sense now that the computational profile of ChatGPT's inference workload has stabilized.
The detail that most impressed the engineering community: a 9-month cycle from design to tape-out — unusually fast for a reticle-sized chip — which OpenAI says was aided by its own AI models during the design process.
Why this matters
Inference, not training, is the long-term operating cost of AI. By building its own inference chip, OpenAI both reduces its dependence on NVIDIA and paves the way for lower API prices over the long run. Bloomberg reports a target of cutting inference costs by ~50%, a figure OpenAI has not publicly confirmed.
The bigger picture: Google has TPUs, Amazon has Trainium/Inferentia, Microsoft has Maia, and now OpenAI has Jalapeño — hardware independence is becoming a competitive requirement for major AI companies.
The takeaway for businesses
The lesson is not "businesses should build their own chips" — it is the principle behind it: whoever controls the infrastructure controls their costs and their destiny. At the scale of a Vietnamese business, this principle takes a much simpler form: running internal AI on on-premise hardware (such as an energy-efficient Apple Silicon cluster) — fixed costs, with no dependence on cloud GPU rental prices or a vendor's discount cycles.
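The fixed-cost argument above can be checked with simple break-even arithmetic. The following is a minimal sketch, not a pricing model: the function name break_even_months and every figure in it are hypothetical placeholders chosen for illustration, not numbers from OpenAI, Bloomberg, or any vendor. It assumes a one-off hardware purchase, a constant monthly running cost on-premise, and a constant monthly cloud bill.

```python
import math


def break_even_months(hardware_cost, onprem_monthly_cost, cloud_monthly_cost):
    """Return the first whole month in which cumulative on-premise spending
    is no higher than cumulative cloud spending, or None if on-premise
    never catches up (its monthly cost is not lower than the cloud's)."""
    monthly_saving = cloud_monthly_cost - onprem_monthly_cost
    if monthly_saving <= 0:
        return None
    # Months needed for the monthly saving to repay the upfront purchase.
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical placeholder figures, in any single currency unit.
months = break_even_months(
    hardware_cost=12000,       # assumed one-off purchase price
    onprem_monthly_cost=200,   # assumed power and maintenance per month
    cloud_monthly_cost=1200,   # assumed cloud GPU rental per month
)
print(months)  # 12
```

With real quotes substituted for the placeholders, the same calculation shows how sensitive the payback period is to a change in cloud rental prices, which is the dependence the paragraph above describes.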
Frequently asked questions
What is Jalapeño?
It is the first "Intelligence Processor" designed by OpenAI together with Broadcom: a reticle-sized ASIC dedicated to LLM inference, announced on 24/06/2026, with deployment expected from late 2026.
Does it replace NVIDIA GPUs?
Not entirely — an inference-only ASIC complements GPUs (still needed for training and flexible workloads). But it reduces dependence and costs for stable inference workloads.
Is the 50% cost reduction certain?
The ~50% figure was reported by Bloomberg and has not been publicly confirmed by OpenAI, which says only that perf/watt is "significantly better". Treat it as an indicative target.
What can Vietnamese businesses take away?
The principle "controlling infrastructure = controlling costs" applies at every scale. For Vietnamese businesses, that means running internal AI on on-premise hardware instead of depending entirely on cloud rental prices.
Take control of your AI infrastructure
Namtech deploys internal AI on on-premise hardware — fixed costs, no dependence on cloud GPU rental prices.
Book a free consultation
Note: This article is compiled from public sources as of 02/07/2026; information is for reference and may change.