Trained from scratch and designed for practical deployment, Mellum2 is built for routing, Q&A, sub-agents, and private AI use in software engineering systems.
Dubai, United Arab Emirates, June 03, 2026: JetBrains today announced that it is open-sourcing Mellum2, a 12B model engineered to solve the hardest parts of production AI: latency, throughput, and cost.
Built from scratch and released under the Apache 2.0 license, Mellum2 offers a high-performance, cost-efficient alternative for infrastructure. Mellum began with code completion and has evolved to handle both natural language and code. It is now a versatile tool ready to power routing, summarization, and intermediate reasoning steps across modern AI workflows.
Whether users want to experiment, fine-tune, or deploy at scale, Mellum2 is ready to run in their own systems. Mellum2 is engineered to solve the bottlenecks of production-scale systems through its architecture and focused, efficiency-driven design.
The model features 12B total parameters, but because it uses a Mixture-of-Experts (MoE) design, only 2.5B parameters are active per token. This reduces compute costs while enabling high-throughput, low-latency inference for real-time workloads.
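To make the active-parameter figures concrete, the arithmetic can be sketched as below. The 12B and 2.5B values come from the announcement; the "about 2 FLOPs per active parameter per token" figure is a common rule of thumb for a forward pass and is an assumption here, not a published Mellum2 measurement.

```python
# Back-of-the-envelope estimate only; not a benchmark of Mellum2.
TOTAL_PARAMS = 12e9    # total parameters, from the announcement
ACTIVE_PARAMS = 2.5e9  # parameters active per token, from the announcement


def forward_flops_per_token(params: float) -> float:
    """Rule-of-thumb estimate (assumption): ~2 FLOPs per parameter used per token."""
    return 2.0 * params


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
moe_flops = forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = forward_flops_per_token(TOTAL_PARAMS)  # hypothetical dense 12B model
compute_ratio = dense_flops / moe_flops

print(f"Active fraction per token: {active_fraction:.1%}")
print(f"Estimated forward FLOPs per token (MoE): {moe_flops:.2e}")
print(f"Estimated forward FLOPs per token (dense 12B): {dense_flops:.2e}")
print(f"Dense-to-MoE compute ratio: {compute_ratio:.1f}x")
```

Under these assumptions, roughly a fifth of the parameters participate in each token. Note that the estimate covers compute only: in a typical MoE deployment, all expert weights still need to be held in memory.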
Unlike many modern models, Mellum2 is not multimodal. It is trained specifically on natural language and code data. This specialization ensures the model excels in software engineering environments while remaining lean and fast.
Mellum2 can be used to:
- Route and orchestrate AI workloads: By analyzing incoming prompts and helping select the right model or tool for each task.
- Build low-latency RAG pipelines: By summarizing retrieved context and generating responses with minimal delay.
- Power fast sub-agents in complex workflows: By handling agent pipeline steps like context gathering, planning, and validation.
- Enable private, local AI deployment: By running locally or on self-hosted infrastructure, keeping code and data fully under your control.
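As an illustration of the routing use case in the list above, here is a minimal sketch of the pattern. The classifier is passed in as a plain callable that stands in for a call to a small, fast model; the route labels, target names, and function names are all hypothetical and are not part of any Mellum2 API.

```python
from typing import Callable, Dict

# Hypothetical route table: task label -> downstream model or tool name.
ROUTES: Dict[str, str] = {
    "completion": "small-code-model",
    "summarize": "small-code-model",
    "complex_reasoning": "frontier-model",
}
DEFAULT_ROUTE = "frontier-model"  # fallback when the label is not recognized


def route(prompt: str, classify: Callable[[str], str]) -> str:
    """Return the target for a prompt, given a classifier callable."""
    label = classify(prompt).strip().lower()
    return ROUTES.get(label, DEFAULT_ROUTE)


def keyword_classifier(prompt: str) -> str:
    """Stand-in classifier for the demo; a real system would call a model here."""
    text = prompt.lower()
    if "summarize" in text or "summarise" in text:
        return "summarize"
    if "complete" in text:
        return "completion"
    return "complex_reasoning"


print(route("Summarize this stack trace", keyword_classifier))
print(route("Design a migration plan for our monolith", keyword_classifier))
```

Keeping the classifier behind a callable means the keyword stub can be swapped for a locally hosted model without changing the routing logic.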
As AI systems become more complex, performance bottlenecks shift from raw capability to latency, throughput, and cost at scale. Not every task requires the largest model. Many steps in modern AI systems are repetitive, latency-sensitive, and high-frequency. These steps benefit from a fast and reliable model that can be efficiently routed, hosted, and controlled.
JetBrains believes that the future belongs to coordinated systems, not single models. Frontier models will continue to push the limits, but practical AI products also require focal models: fast, specialized components that handle high-frequency tasks efficiently.
JetBrains sees Mellum2 playing this role in the next generation of AI software tooling. Mellum2 is available as an open-source model under the Apache 2.0 license.
About JetBrains:
JetBrains creates intelligent software development tools trusted by over 15 million users and 88 Fortune Global Top 100 companies. Its lineup of more than 30 products includes award-winning IDEs like IntelliJ IDEA and PyCharm; the JetBrains AI-powered coding assistant; the coding agent Junie; Mellum, JetBrains’ focal LLM purpose-built for code-related tasks; and productivity-boosting team tools like YouTrack, Qodana, and TeamCity. JetBrains is also the creator of Kotlin, a cross-platform language used by more than 2.5 million developers a year worldwide. The company is headquartered in Amsterdam, the Netherlands, and has offices around the world. For more information, please visit https://www.jetbrains.com/.
Media Contacts:
Chandni Chugh
Wallis PR
chandni.chugh@wallispr.com
Mobile: +971 56 440 4798
