The Model Fatigue Problem: Why AI Companies Are Missing the Point
Which model to use for what? No one cares.
In the booming world of AI, companies like OpenAI, Anthropic, Google DeepMind, Mistral, and Meta are locked in an arms race. Each release cycle is marked by a cryptic new model name (GPT-4.1, Claude 3.5 Sonnet, Gemini 1.5 Pro, Mistral Medium, and so on) and a flurry of benchmarks showing marginal wins on tasks most people never actually do.
But here’s the uncomfortable truth: no one cares.
Outside of AI Twitter, a few newsletters, and competitive dev teams, nobody wakes up asking, “Should I use GPT-4 Turbo or Claude 3 Opus for my use case today?” This obsession with model minutiae misses the forest for the trees.
Model names are starting to look like Android version numbers crossed with ’90s CPU SKUs: o1, Q*, Sonnet, Haiku, Pro, Ultra, Turbo. Users aren’t comparing models the way gamers compare GPUs. They’re here to get things done.
And yet, AI companies keep trying to position model selection as a feature:
- “Try Gemini 1.5 Pro — it outperforms GPT-4 on DROP by 2%!”
- “Claude 3.5 Sonnet has a longer context window than GPT-4 Turbo!”
- “Mixtral does sparse mixture-of-experts routing for cost-effective inference!”