Let's stop pretending that managers and executives care about productivity
I’ve just been on a bit of a summer break. Did a bit of travel locally. Visited Hvalfjörður. Walked a lot.
I know from experience that if I don’t take a summer break, the winter becomes more of a slog and my thoughts become groggier.
Often, as soon as you rest, your mind starts to “helpfully” come up with ideas to help fill your time. One of the intrusive thoughts that kept prodding my brain during my break was about modern management theory and how modelling various “AI” tools using those approaches and practices might play out.
I kept thinking that an analysis of productivity interventions with high variability in both time and outcomes (like “AI”) might be interesting.
Most of the fields that touch on modern management – systems-thinking, work psychology, even economics – have strong opinions on the kinds of variability and information asymmetry that seem baked into how generative models work.
But this thought immediately shatters on the cliffs of reality:
Businesses today don’t care about management, productivity, or even costs.
Manifestly and demonstrably, businesses only care about control over labour and about stock prices, and they ignore anything resembling modern management theory or related fields.
Modern management theory itself isn’t that modern. It was born out of the US’s World War Two efforts – that’s where people like W. E. Deming came up – and matured during the rebuilding of Japan, where, again, Deming was hugely influential. It got reimported into the US via Japan in the seventies and eighties, but has since been almost entirely ignored by tech, and increasingly ignored by the rest of the economy as the influence of tech and financialisation increased. Today, big chunks of English-language management and executive culture are effectively the opposite of what we know works when managing organisations.
One example is the open office: open-plan offices have been proven many times over to harm productivity, focus, collaboration, and employee well-being. They only improve real estate costs and employee surveillance (no private spaces, and executive corner offices work like panopticons).
Another is working from home: as a practice it’s a bit of a wash in terms of productivity when compared to a sensibly designed office, and a loss when compared to a collaborative cross-functional single-project team working closely together in a single functional space, but it’s an overall improvement over working in open offices.
Working from home also – in at least a few studies I’ve seen – tends to lead to substantially better sleep patterns among employees. Better sleep consistently improves work outcomes across the board for pretty much any kind of work you can think of, so that would in and of itself be a major win. Home offices also lower office real estate costs, so you’d think executives would love them, but they also make employee surveillance harder. Turns out that if your core philosophy is authoritarianism, surveillance and control matter more than anything else, including profits and business outcomes.
So, an analysis of “AI” from the perspective of modern management, modelling how it might play out, is kind of meaningless. Managers and executives have already demonstrated, en masse, that they don’t care about costs, productivity, effectiveness, or employee well-being. They only care about control and personal success. (This is even a core observation from modern management theory on why companies are dysfunctional. This is what Deming meant when he said “nobody gives a hoot about profits”.)
Even if Large Language Models resulted in a 20% improvement in productivity (unlikely), that would still be substantially less than the negative impact of the overall productivity-hostile design of the modern workplace. And if they harm overall productivity (as is likely), companies have already demonstrated that they do not care one single jot. Control and stock prices are all that matters.
This brings us to my dilemma.
For those of us who care about improving how we work and, specifically, about making better software, is there even an audience out there that both cares about doing things well and has the ability to change their practices?
Sure, there are probably a few that care about good management as a general principle. I’m guessing there are a few sensibly run small to medium-sized businesses out there that take these things to heart, simply because they’d collapse quite quickly if they didn’t.
There are also probably a few corners here and there in larger companies that care – sheltered by a sensible middle manager somewhere in the hierarchy.
But are any of those sensible corners all-in on generative models?
Probably not.
There’s plenty of reason not to use these tools if you have even a shred of common sense:
- There’s obviously a financial bubble in full force.
- The lock-in is a real issue.
- The costs are already higher than seems warranted for many businesses.
- The environmental impact is real.
- The political dimension to generative models (close collaboration with authoritarian governments, promises to make models “less woke”, attacks on labour and workers, etc.) is a big risk in and of itself.
- The religiosity of some of the management involved (singularity, accelerationism, longtermism, etc.) calls into question how rational any of their decisions are.
People likely to pay attention to rational management analysis are already unlikely to be heavily using generative models, so there’s probably no real audience for an analysis that would end up doing little more than demonstrating just how incredibly damaging this tech is, along multiple dimensions, to most organisations.
To those locked into working for an organisation that’s all-in on “AI”, such an analysis would just be a depressing rehash of what they’re already seeing at work.
It’s also very difficult to change minds once they’re already made. Convincing people who already believe in Large Language Models that their chatbot buddy isn’t a benefit but is instead an overall detriment to workplaces and the economy is next to impossible.
Anybody who has experienced prior major bubbles knows this. I remember how, back in 2007, conversations with friends, relatives, and co-workers about the real-estate bubble hit a wall of conviction. Everybody was absolutely convinced it wasn’t a bubble and there wasn’t a problem.
“Everybody I know who has bought real estate using a 90% mortgage made a tonne of money on it. Nobody ever loses money on real-estate.”
Every conversation had more and more examples of peer success – “my uncle made a lot of money on it and has flipped two houses” – and a greater conviction that I was full of shit for trying to warn people about a bubble.
After the bubble popped, every single one of them was completely convinced that none of it was their fault and the reason why they risked losing their homes was entirely the fault of the banks.
The banks were at fault too, much like tech is at fault today for making outrageous claims about “AI” capabilities, but people also very specifically decided not to listen to the warnings they got.
So a detailed analysis of how these tools could hypothetically play out is kind of pointless. The people who need to hear it won’t listen and the people who would listen are already opting out of the bubble.
But, since I ended up already doing much of this analysis, and because I’m the kind of asshole who thinks Akerlof and Romer’s “Looting” paper is cracking reading and genuinely enjoys this sort of thing, here’s a quick gallop over a couple of the issues.
No, I won’t be explaining any of it any more than I do below.
And, yeah, this is a messy and quick overview with huge gaps in the analysis.
Task sequences as vectors #
If you model work as a sequence of tasks that each have some variability in duration and outcomes, the sequence as a whole will have proportionally less variability than the individual tasks do. If that variability is symmetric – as in, equally likely to slow down or speed up a task by a similar amount – its net effect on the whole sequence will trend towards zero as the sequence gets longer.
So, if you adopt the same productivity intervention, with the same kind of variability, across all tasks, the overall effect will trend towards zero. At the same time, you will randomly see productivity gains or losses when you study tasks in isolation. Small studies of isolated tasks will get results that are all over the place, while larger samples will eventually revert to the mean (the same as before the intervention). Systemic analysis would show a small impact, if any.
An analogy to explain this: think of a traffic system with a single car. Without any other car on the streets, the overall driving time is down to route and skill. If you made a change to the traffic rules that injects a consistent, symmetric variability at each turn – some turns would take more time, some less – while preserving the same route and driver, the overall journey time would remain the same as before, as long as the trip is long enough for the variability to “even out”. On one leg of the journey the time would be shorter, but since the variability is symmetric, it’s likely that another leg would be slower by a degree that offsets the time gained earlier.
And if the variability is asymmetric, the aggregate effect still converges towards the average over the sequence, meaning you’re never going to get the full benefit you see in the best-case scenarios.
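To make that concrete, here’s a minimal simulation sketch in Python. The numbers are made up purely for illustration: a 50-task project, nominally 8 hours per task, and a hypothetical intervention that changes each task’s duration by a symmetric random factor of up to ±30%.

```python
import random
import statistics

random.seed(1)

N_TASKS = 50
BASELINE_HOURS = 8.0

def affected_task() -> float:
    """One task's duration under a symmetric ±30% per-task effect."""
    return BASELINE_HOURS * (1.0 + random.uniform(-0.3, 0.3))

def project_total() -> float:
    """Total duration of one 50-task project under the intervention."""
    return sum(affected_task() for _ in range(N_TASKS))

# Individual tasks are all over the place...
tasks = [affected_task() for _ in range(1000)]
print("single task range:", round(min(tasks), 1), "to", round(max(tasks), 1), "hours")

# ...but whole projects cluster tightly around the unchanged baseline.
projects = [project_total() for _ in range(1000)]
print("baseline project: ", N_TASKS * BASELINE_HOURS, "hours")
print("mean project:     ", round(statistics.mean(projects), 1), "hours")
print("project spread:   ", round(statistics.stdev(projects), 1), "hours (stdev)")
```

Individual tasks swing by up to 30% in either direction, but the whole-project spread works out at roughly 2–3% of the total, and the mean stays where it was. Study isolated tasks and you’ll “find” gains or losses; study whole projects and the effect mostly washes out.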
Queueing theory #
But, obviously, a single car never has an entire city to itself, you always have to deal with everybody else being in the way.
That’s where queueing theory comes into it.
Most work in organisations can be modelled as sequences and queues. That means vector maths (the above nonsense) and queueing algorithms.
Tasks, problems, issues, customers, etc., all collect in queues and lists, and each organisation has a set number of people working on each “queue” or task inventory.
How the queue is designed matters a lot, but this is a gallop so I won’t get into it. What matters here is that:
- The time tasks/customers/issues spend in the queue responds non-linearly to load.
- At some point wait times will sharply bend towards the infinite, the queue overloads, and all work stalls. (A “knee” in the curve when you graph wait times against load.)
- That means 100% utilisation of capacity (as in, all the workers are working all the time) is impossible.
- One of the biggest variables for deciding this “overload” point is variability. Y’know, the thing that “AI” tools have oodles of. More variability means lower “overload” points (there’s a rough numerical sketch of this after the list).
- Even systems with low variability, like well-designed mass-manufacturing, can’t work at 100% capacity without risking disastrous delays and overloads, which is why most of them tend to operate at 80-90% capacity.
- Systems with high variability need to operate at much lower capacity to prevent overloads.
- A tool that, when looked at in terms of task sequence, might represent an overall improvement in mean productivity but does so with high variability, will be disastrous for most companies.
- Real-world organisational capacity drops like a stone with higher variability. Projects become much more prone to delays, overruns, and outright failure. To maintain the same level of organisational function as before, the company would have to massively increase overall capacity and, with that, costs.
- If you couple a highly variable productivity intervention with layoffs, you’re simultaneously reducing capacity while massively increasing the capacity required to maintain prior levels of function. This is outright inviting disaster.
- This means that, for most organisations, variability is a much bigger issue than productivity: you never operate at 100% capacity anyway, and lowering variability has a bigger effect because it increases the load the organisation can handle with the same capacity.
- And a productivity “improvement” that comes coupled with an increase in variability might as well be organisational poison.
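To put rough numbers on the knee and on the variability effect, here’s a sketch using Kingman’s approximation for a single-server queue – a standard queueing-theory result, not anything specific to “AI” – with purely illustrative numbers.

```python
# Kingman's approximation for the average wait in a single-server queue:
#   wait ≈ (load / (1 - load)) * ((ca² + cs²) / 2) * average task time
# where ca and cs are the coefficients of variation of arrivals and of
# task durations. Waits are expressed in multiples of the average task time.

def avg_wait(load: float, ca: float, cs: float) -> float:
    """Approximate average time a task spends waiting, in task-time units."""
    return (load / (1.0 - load)) * ((ca**2 + cs**2) / 2.0)

for cs in (0.5, 1.0, 2.0):  # low, moderate, and high task variability
    waits = [round(avg_wait(load, ca=1.0, cs=cs), 1) for load in (0.6, 0.8, 0.9, 0.95)]
    print(f"cs={cs}: waits at 60/80/90/95% load -> {waits}")
```

Two things to notice: waits roughly double when you go from 90% to 95% load (that’s the knee), and going from low (cs = 0.5) to high (cs = 2.0) task variability quadruples every wait. The only lever that keeps waits sane under high variability is running at lower load, i.e. keeping more slack capacity.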
The best-case scenario is that you’re going to need a lot more staff to make up for the “productivity tool” you’re using primarily to boost stock prices.
The worst-case scenario is that you keep the productivity intervention because internal studies show a 5–10% boost in task sequences and use that to reduce headcount by 5–10%, triggering an organisation-wide cascade of dysfunction and a death spiral.
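As a back-of-the-envelope illustration of that worst case, using the same Kingman approximation and equally made-up numbers: a team running at 85% load with moderate variability, then a 10% headcount cut (the same work spread over fewer people pushes load to roughly 94%) combined with the extra variability the new tooling brings.

```python
def avg_wait(load: float, ca: float, cs: float) -> float:
    """Approximate average time a task spends waiting, in task-time units."""
    return (load / (1.0 - load)) * ((ca**2 + cs**2) / 2.0)

before = avg_wait(load=0.85, ca=1.0, cs=1.0)        # steady state before the cut
after = avg_wait(load=0.85 / 0.90, ca=1.0, cs=1.5)  # 10% fewer people, more variability

print(f"before: work waits roughly {before:.0f}x its task time")  # ~6x
print(f"after:  work waits roughly {after:.0f}x its task time")   # ~28x
```

A modest-sounding headcount cut plus a modest-sounding bump in variability, and the average wait for any given piece of work roughly quintuples. That’s the cascade.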
I could go on #
There are so many more ways you could slice this particular shit cake. You could look at constraints and defect management, the price of automation and labour, and the nature of the various tasks they want to automate.
But the short version is that once you start looking at the impact on a systemic and organisational level, these tools look bad in every dimension.
But, apparently, “AI” is now mandatory so all of these issues are probably baked right into the very fabric of tech as an industry.
Which is a little bit worrying, if I am to be honest.