I've been thinking about Internet Explorer 10 lately. Not because I miss it exactly, but because it represents something we rarely talk about in technology: the moment when software reaches its peak and everything after that is a slow slide into accumulated frustration.
IE10 was just right. It rendered everything perfectly, websites loaded exactly as they should, and I never had to think about the browser itself. It was functionally invisible in the best possible way. Then IE11 arrived with all its technical improvements and standards compliance upgrades, and suddenly I was dealing with constant small annoyances. Nothing catastrophic, mind you. Just a steady drip of small problems that slowly eroded my confidence until I finally gave up and switched to Firefox.
The GPT-5 launch reminded me exactly of that experience.
When "Better" Makes Things Worse
OpenAI released GPT-5 in August 2025 with tremendous fanfare. CEO Sam Altman promised "Ph.D.-level intelligence" for all users. The benchmark scores were impressive. The technical specifications looked outstanding on paper. And within days, users were complaining that it couldn't handle basic maths problems and geography questions that GPT-4 managed just fine.
Sound familiar?
GPT-4 had become that reliable workhorse that users could count on. People built workflows around its predictable behaviour. It wasn't perfect, but it was consistently imperfect in ways users understood. Then GPT-5 arrived optimising for benchmark scores and technical reviewers rather than preserving the core user experience that made GPT-4 successful.
The most dangerous phrase in software development might be "users will adapt." Because they don't. They accumulate frustration.
The Invisible Software Principle
The best software becomes functionally invisible. When IE10 was working properly, I wasn't thinking about Internet Explorer at all. I was thinking about the websites I needed to visit and the tasks I needed to complete. The browser had achieved that magical quality where the tool disappeared and you could focus entirely on your work.
This is what I call peak software: the moment when a program does exactly what users need without requiring them to think about the program itself.
Most "improvements" after that point optimise for the wrong stakeholders. IE11 was optimised for web standards compliance and enterprise requirements. GPT-5 appears to be optimised for benchmark scores and technical superiority. Both forgot about the users who just wanted their tools to work consistently.
Death by a Thousand Cuts
In my work in policy development, I see this pattern constantly. A regulation works fine initially, then gets "enhanced" with small changes that individually seem reasonable but collectively make compliance more burdensome. None of the changes is catastrophic on its own, but together they add a little more friction to every interaction with the system than there used to be.
That's exactly what happened with IE11. It wasn't one massive failure that drove me to Firefox. It was accumulated frustration. A site that used to load instantly now took an extra second. A form that used to submit cleanly now occasionally hiccupped. Developer tools that behaved slightly differently. Each problem was small and theoretically fixable, but the cumulative effect was a browsing experience that felt unreliable in ways IE10 never did.
The GPT-5 rollout follows the same pattern. Users aren't reporting catastrophic system failures. They're sharing examples of simple mistakes on basic tasks, exactly the kind of everyday reliability that GPT-4 users had come to depend on. Each individual error might be correctable, but the collective impact is users losing trust in the system's fundamental reliability.
The Wrong Metrics Problem
Once software reaches peak functionality, the people making decisions about version 2.0 usually aren't the people who have to live with the daily friction. They're responding to critics, competitors, and metrics rather than preserving the core user experience.
This creates a fundamental misalignment. Engineers and product managers see benchmark improvements and standards compliance as obvious wins. Users see their reliable tools becoming unpredictable and frustrating. Both perspectives are valid, but only one determines whether people actually keep using the software.
The most successful updates are the rare ones that manage to add genuinely useful capabilities while fanatically preserving the "invisibility" that made users love the original version. But that requires the discipline to say no to improvements that look impressive on paper but add cognitive overhead to daily use.
The Policy Parallel
In policy work, the difference is stark between regulations that genuinely solve problems and those that primarily demonstrate regulatory activity. The best policies become part of the background. People comply without thinking about it because the requirements align naturally with what they were already trying to accomplish.
The worst policies require constant attention, interpretation, and workarounds. They may address more edge cases and look more comprehensive to regulators, but they create friction for everyone who has to implement them, just as IE11's improved standards compliance created friction for everyone who simply wanted to browse websites.
The Peak Software Challenge
The real challenge isn't building good software. It's recognising when you've built software that works and having the restraint not to "improve" it into mediocrity.
IE10 worked. GPT-4 worked. Both became functionally invisible tools that users could rely on. Their successors optimised for different audiences and lost that essential quality in the process.
Perhaps the most important question we can ask about any software update isn't "What new features does this add?" but rather "Does this preserve the invisibility that made the previous version valuable?"
Because once you break that invisible quality, once you make users think about your tool instead of their work, you've started the slow slide toward accumulated frustration. And accumulated frustration always wins in the end.
I should know. I'm still using Firefox.