What is an SLM?

An SLM (Small Language Model) is a language model with far fewer parameters than a flagship LLM: typically a few billion to tens of billions, versus the hundreds of billions or trillions in frontier LLMs. The trade-off is lower raw capability in exchange for being cheap, fast, deployable on-device, and easy to fine-tune for narrow tasks. Examples: Claude Haiku, Microsoft Phi, Gemini Nano, Llama 3B/8B, Mistral 7B.

Real lesson: 90% of our simple agent tasks (classification, summarization, Linear card creation, TG message routing) run on Claude Haiku 4.5. Moving the same task to Opus costs 12x more and runs 3x slower, for only about a 10% gain in quality. The rise of SLMs is what shifts AI from “pay OpenAI per API call” to “run it on your own GPU”, and it is a key driver of Edge AI adoption.
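To make the cost trade-off concrete, here is a minimal sketch of tier routing and the resulting blended cost. It is an illustration, not our production router: the tier labels, the task-type names, and the function names are hypothetical, and the arithmetic assumes every task uses a similar number of tokens. Only the 12x cost ratio comes from the figures above.

```python
# Hypothetical tier labels for illustration; these are not real API model IDs.
SMALL = "small-model"  # a Haiku-class SLM
LARGE = "large-model"  # an Opus-class LLM

# Task types treated as simple enough for the small tier (assumed names).
SIMPLE_TASKS = {"classification", "summarization", "card_creation", "message_routing"}

# Ratio quoted in the text: the large tier costs about 12x the small tier.
COST_RATIO = 12.0


def pick_model(task_type: str) -> str:
    """Send known-simple task types to the small tier, everything else to the large one."""
    return SMALL if task_type in SIMPLE_TASKS else LARGE


def blended_cost(small_share: float) -> float:
    """Average cost per task, measured in units of one small-model call.

    small_share is the fraction of tasks handled by the small tier (0.0 to 1.0).
    Assumes equal token usage per task, which real workloads will not match exactly.
    """
    return small_share * 1.0 + (1.0 - small_share) * COST_RATIO


if __name__ == "__main__":
    print(pick_model("classification"))
    print(pick_model("multi_step_planning"))
    print(round(blended_cost(0.9), 2))
```

Under these assumptions, sending 90% of tasks to the small tier brings the average cost to about 2.1 small-call units per task, compared with 12 if everything went to the large tier.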