AI summaries and AI healthcare (links & notes)

AI summaries are almost certainly unreliable

It’s like the past few years' discussions on bias in training data and issues with shortcut learning in AI models just didn’t happen at all?

Like, our industry didn’t take in any of it, did it?

I thought we’d spent the past few years digging into issues and looking for ways to use these systems on problems they can handle, like data and media conversion and modification.

But, no, we’re going straight for the dystopian nightmare stuff:

  • Knowledgebases that straight up lie.
  • Creating an underclass that’s only served by mediocre AIs instead of actual lawyers or doctors.
  • Summarisers that hallucinate in their summaries.
  • Decision-making systems with shortcuts and biases.

And the AI companies are behaving like cults:

  • Withholding information on their work to prevent the singularity
  • Theatrical “threat” testing to demonstrate the imminent arrival of a mechanical Yaldabaoth—a demiurge of a new world threatening humanity
  • Rules for disseminating knowledge about the models that aren’t even remotely scientific. It’s just insinuations and secrecy
  • Claims about the imminent arrival of the end times if we don’t follow their rules and ideology

This is all so fucked up.

I’m absolutely serious when I say that all of this is starting to look like the mainstreaming of a cult.

And I’m not the only one to make this observation. Timnit Gebru has been pointing this out for a while. And Émile P. Torres goes into more detail on this over on the birdsite. This definitely smells like a cult to me.

(They are working on a paper on this, which I’m dreading/looking forward to reading in abject horror at the state of our world.)

My “summarisers that hallucinate” comment from earlier in this thread is based on a worry that runs through a lot of the research of the past few years: that hallucinations are a big problem for summaries generated by AI models.

See:

The trend in these papers is that hallucinations are an emergent property that seems to increase, not decrease, as the models grow in size, and that they happen just as often in summaries as in regular responses. Considering how prevalent hallucinations seem to be for people testing ChatGPT, Bing Chat, and GPT-4, it seems extremely unsafe to assume that using them to generate summaries will lead to accurate results.

The tech industry’s reaction?

😜

Who cares if AIs lie in their summaries? Nobody, that’s who!

None of these papers on hallucinations and factual consistency in AI-generated summaries are testing OpenAI’s super-secret-private chat/AI thingamabobs.

That’s because it’s impossible to verify their claims due to their secrecy. Not that they make any solid claims. They just say that the current model is X% better at avoiding hallucinations than the previous one, which was also super-secret-private, so we don’t have any meaningful statistics on their reliability either.

And according to the almost-100-page PDF they posted about GPT-4, their approach to reducing hallucinations consists of teaching it *manually* that some facts are facts. That’s not an approach that scales when the long tail of falsehoods is infinite. It is, however, an approach that does wonders when your primary audience is journalists who don’t have the time to do thorough testing, and gullible tech pundits.

It’s all just theatrics.


AI in healthcare

“Epic’s overhaul of a flawed algorithm shows why AI oversight is a life-or-death issue”

The ratio of false alarms to true positives was about 30 to 1, according to CT Lin, the health system’s chief medical information officer.

What’s interesting about this case is that when the AI tool for spotting sepsis was deployed as designed, as directed by the vendor, it was essentially unusable. Too many false alarms. Continuing to use it as designed would have killed people. The tool didn’t begin to be useful until they moved it off to the side and turned it into a monitoring aid for a dedicated team that was responsible for alerting other teams that a patient might be developing sepsis.
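
To put that ratio in context: 30 false alarms for every true positive means only about 3% of alerts pointed at a real case of sepsis. Here’s a minimal sketch of the arithmetic (the 30:1 ratio is from the article; the daily alert volume is a made-up illustration):

    # Precision implied by a 30:1 false-alarm-to-true-positive ratio
    false_alarms_per_true_positive = 30  # figure quoted in the article

    precision = 1 / (1 + false_alarms_per_true_positive)
    print(f"Share of alerts that are real: {precision:.1%}")  # ~3.2%

    # Hypothetical ward seeing 100 alerts a day: roughly 97 are noise
    alerts_per_day = 100  # illustrative number, not from the article
    print(f"False alarms per day: {alerts_per_day * (1 - precision):.0f}")

At that hit rate, every individual alert reads as noise unless somebody’s entire job is to triage them.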

What made this work were the human relationships. At shift changes, members of the sepsis AI team introduced themselves to the teams and nurses in the departments they were monitoring. People could build trust in each other, and the AI team could develop a sense for what was and wasn’t the AI’s strong suit.

To bring this home and connect it to how everybody is injecting AI into their software, productivity tools, and knowledge bases:

AI didn’t begin to have positive results until they stopped using it for automation or reasoning and instead used it as an assistant that was assumed to be very unreliable.

I suspect that almost every AI integration we’ve seen announced to date will end up being a mistake, except for possibly the copilot-style ones, and even those are still too unreliable for broad use. (In my personal opinion. Most of tech is justifiably going to disagree. Especially since you can tell from their output that their software doesn’t need to work or have any long-term reliability for it to be considered a success.)

What seems especially risky is using these systems for automated decision-making and predictions. There they seem especially prone to taking potentially catastrophic shortcuts:

“Against Predictive Optimization: On the Legitimacy of Decision-Making Algorithms that Optimize Predictive Accuracy”

And regulation seems to lag reality, as you can see from this 2022 blog post:

“No Doctor Required: Autonomy, Anomalies, and Magic Puddings – Lauren Oakden-Rayner”

By calling the device a normal detector, it makes us think that the model is only responsible for low risk findings.

And:

We need to regulate these devices not based on what they are called or what cases they operate on a majority of the time, but based on the worst mistakes they can make.


Trustworthy AI tools and companies

I’m not against using generative AI tools but, AFAICT, there are very few trustworthy actors in the field: OpenAI, Facebook, Google, Stability AI, and Midjourney all have a track record that’s dodgy, to say the least.

The way they’re rushing into AI doesn’t add to the trust either.

Anthropic, which is an AI startup focused on safety, pivoted to be about generative AI in their recent round of funding. And, as it turns out, a keyword in their safety research is “alignment”, which is oogedy-boogedy speak for “let’s make sure this hypothetical god-like AI we think we’re fumbling into existence will be on our side.”

“Safety” as used by these people is a pseudo-religious catechism and they aren’t genuinely interested in reducing the harms done by these systems.

I’d like at least one trustworthy actor: one who behaves with a modicum of decency, or at least attempts to wear a fig-leaf of ethics, and is genuinely trying to develop these systems as productive tools for modern society.

And, no, Microsoft isn’t it. Are you kidding me?


I don’t think it’s controversial to say that between the gig economy, Amazon’s labour abuses, the surveillance economy, and crypto, the public in general has plenty of reasons both to distrust tech companies and to distrust the notion that they know what they’re doing with AI and can do it safely.

In fact, I’d say that we have plenty of cause to assume harm until otherwise proven and the precautionary principle should apply.


We’ve had over thirty years of experience in assessing the credibility of online sources. Dismissing the fact that you can’t trust LLMs with “how is that different from a search engine” is just arrant nonsense. LLMs don’t have any of the markers we rely on for assessing sources.


Optimistic visions of the future

What depresses me is that the people most enthusiastically hyping AI don’t get that their “optimistic” vision of the future is worse than a future where these systems either don’t work or remain imperfect.

Perfect art generation will eradicate art as a practice. All of the data sets all of these AIs have been trained on will stop growing. Every creative field will stagnate.

And once you’ve injected an AI into every process, you really think tech cos will resist the temptation to use that to manipulate?

With pervasive unemployment, you really think US companies will allow the taxation required to implement Universal Basic Income? You think that countries will take the destruction of local economies and the theft of entire industries by US tech cos lying down?

Where do you think demand will come from with near-universal unemployment?

You think allies will remain allies when you’ve taken all of their jobs?

The vision of the future where these technologies work is a nightmare, not a utopia.

This is something that AI and blockchain fans have in common. They don’t get that “this is kinda garbage and maybe full of fraud” is the optimistic take.

The idea that the worst-case scenario is a bunch of companies cycling themselves into worse business outcomes by buying into the unfulfilled promise of a half-baked idea, and that we’ll only have to deal with the abuse and fraud generated by criminals and bad actors, is the optimistic take on this tech.

Because the alternative is much worse.

(I’d just like to point out the word “optimistic” in the posts above. If you think “this tech leads to AGI and a new species of slaves and/or the creation of new mechanical gods” is an optimistic vision of the future, then there’s something seriously wrong with you.)


I don’t know if I’ve just been reading too many academic papers, but this looks like fairly straightforward guidance. I see that some media outlets are saying that lying about whether you used an AI to generate a work might cause you to lose copyright protection for that work, but that isn’t strictly true.

If you lie and register the copyright for an AI-generated work, you will have claimed another’s work (the AI’s) as your own.

All that happens when the lie is revealed is that the work’s authorship gets correctly attributed to the AI, whose works aren’t eligible for copyright protection, so the work doesn’t get any. Nobody ‘lost’ anything. You just committed fraud.


Remember when we had the ostensibly reasonable side of the tech influencer sphere saying that web3 was too big to fail?

Good times. Good times.

Not a deranged industry at all.


“The stupidity of AI”

The belief in this kind of AI as actually knowledgeable or meaningful is actively dangerous. It risks poisoning the well of collective thought, and of our ability to think at all.

Honestly, this week’s must-read.


“How to tell if AI threatens YOUR job”

I’ve come to a pretty grim, if obvious, realization: the more excited someone is by the prospect of AI making their job easier, the more they should be worried.

Interesting take.


“This Week in AI Doublespeak - by Gary Marcus”

‘AI “art” is like moving into a house, painting the living room, and declaring that you built the house.’

"‘ChatGPT said I did not exist’: how artists and writers are fighting back against AI - Artificial intelligence (AI) - The Guardian"

“Why open data is critical during review: An example - Steve Haroz’s blog”

“What do tools like ChatGPT mean for Math and CS Education?”

Nothing could be farther from how a calculator works.

“Tooled. — Ethan Marcotte”. “But I’ll just note that labor economics has an old, old term for [gestures around] all this: de-skilling.”

“OpenAI’s GPT-4 Is Closed Source and Shrouded in Secrecy”

“Microsoft just laid off one of its responsible AI teams”. Building an ethics and responsibility team for AI is a productive way of getting all the troublemakers into one room to get rid of them all at once.

“The climate cost of the AI revolution - Wim Vanderbauwhede”

“Don’t ask an AI for plant advice - Tradescantia Hub”. Using AI for specialist problems (which is most of them) is a trap. The AI will lie confidently, and you won’t have the expertise to spot the lie. These are not information systems.


Best of the rest

“All data is health data. – Hi, I’m Heather Burns”

It is merely contextual. So you need to think of all of your data inputs, collections, and sharing, in that contextual way.

“Google won’t honor medical leave during its layoffs, outraging employees”

She was let go by Google from her hospital bed shortly after giving birth. She worked at the company for nine years.

Tech cos are run by scumbags, and seeing media outlets frame this act, which would be illegal in most of the civilised world, as something that only “outrages” the employees, and not society in general, is disappointing.

The rest of the best of the rest
