GPT-5 was released, and the general vibe I’m hearing is that it’s a significant upgrade in many areas, but not so much in general intelligence. This has caused many to update their priors on A.I. progress in the bearish direction. In this post I wish to make a simple observation regarding a particularly bearish sign for near-term A.I.-based productivity improvements, as distinct from A.I. progress per se.
The launch was in many ways botched by OpenAI. Here is Zvi Mowshowitz describing the problems with the launch:
- The name GPT-5 and all the hype led to great expectations and underdelivery.
- All the different models were launched at once when they’re actually different.
- GPT-4o and other models were taken away without warning.
- GPT-5’s baseline personality is off-putting to a lot of people right now, and it isn’t noticeably more intelligent than GPT-4o was on typical normal-person usage.
- Severe temporary limits were imposed that people thought would be permanent.
- The router was broken, and even when not broken doesn’t work great.
It’s the last one that I wish to speak about here. Apparently the ‘model-router’ was ‘down’. I’m not entirely sure what ‘down’ means here: not running at all, or producing incorrect results. Either way, it seems many users experienced low-quality responses because their queries were routed to the wrong model, presumably to the non-thinking models (gpt-5-main and gpt-5-main-mini) rather than to one of the thinking ones.
I’m not an A.I. engineer, but to me the problem of selecting the best model for a query seems like something that should be done by the A.I., and done well, if not solved entirely. There are many ‘choose your model’ guides available; here is one from Ethan Mollick. You would think such a guide would be relatively easy to convert into a prompt for a model router; it almost already is such a prompt. You would think the best entity for such a task would be the entity that created the models, and yet OpenAI didn’t get this right? Why not?
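To make the idea concrete, here is a toy sketch of what “routing” means at its simplest. This is emphatically not OpenAI’s actual router (which is presumably a learned classifier, not keyword matching), and the model names and heuristics below are hypothetical; the point is only that a “choose your model” guide is, in spirit, a short decision procedure like this one:

```python
# Toy model-router sketch. Hypothetical model names and heuristics,
# for illustration only; a real router would likely be a trained
# classifier or a small LLM call, not keyword matching.

REASONING_HINTS = ("prove", "step by step", "derive", "debug", "plan")

def route(query: str) -> str:
    """Pick a (hypothetical) model tier for a query.

    Long or reasoning-heavy queries go to a 'thinking' model;
    everything else goes to the cheaper non-thinking default.
    """
    q = query.lower()
    if len(q) > 500 or any(hint in q for hint in REASONING_HINTS):
        return "gpt-5-thinking"   # hypothetical thinking-tier name
    return "gpt-5-main"           # hypothetical fast/cheap default
```

The salient failure mode is visible even in this toy version: if the router misfires, the user still gets *an* answer, just a worse one from the cheaper model, so the degradation is silent rather than an obvious error.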
One answer could be that the model router would have worked perfectly if invoked correctly, but that it was not invoked correctly. Okay, but if OpenAI cannot utilise its own models in coding and dev-ops to avoid a high-profile failure on a very high-importance launch, what chance do other companies have? Why hasn’t A.I. enabled OpenAI to improve its development and deployment practices enough to avoid such technical glitches?
All of that makes me a little more bearish on massive productivity boosts from A.I. in the very near future.