AI text-generation tools have been getting into a bit of trouble lately.
Recently, OpenAI released “ChatGPT”, a quite remarkable chatbot. It’s built atop GPT-3, OpenAI’s language model, which is very fluent at autocompleting and summarizing text. Denizens of the Internet quickly discovered the fun of getting ChatGPT to do weirdly creative tasks — like rewriting “Baby Got Back” in the style of Chaucer, or creating text games on the fly, or writing a sonnet about string cheese. I myself spent an evening getting ChatGPT to generate radio plays of famous historical figures arguing about what to have for dinner. It was pretty delightful!
But the problems begin when you require ChatGPT to be factually accurate.
When it comes to facts, the AI sometimes flies off the rails spectacularly. When the biology professor Carl T. Bergstrom asked ChatGPT to write a Wikipedia entry about him, it got basic dates of his career wrong, said he’d won awards he hadn’t, and claimed he held a professorship that doesn’t even exist. When Mike Pearl asked it what color the uniforms of Napoleon’s Royal Marines were, it utterly muffed it. (And ChatGPT wasn’t the only AI running afoul of facts. A few weeks ago, Meta released Galactica, an AI it claimed could summarize and sift through scientific findings, but it mangled so much basic scientific info that Meta pulled it offline after only two days.)
Over at Stack Overflow, ChatGPT was causing even more havoc. Stack Overflow is a site where people post — and answer — questions about coding. But people started posting answers that had been written by ChatGPT, and many were incorrect; worse, they were often wrong in a subtle way. This meant that Stack Overflow’s moderators were suddenly having to spend hours carefully assessing this flood of AI-authored material.
The heads of Stack Overflow got so frustrated that they imposed a blanket ban on any answers created by ChatGPT …
Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking or looking for correct answers.
Why is the AI getting basic facts wrong?
It’s probably because AI models like this don’t actually understand things. Having been trained on oodles of text, they’re great at grokking the patterns in how we humans wield language. That means they autocomplete text nicely, predicting the next likely phrases. They can grasp a lot of context.
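(To make that concrete, here’s a rough sketch of what “predicting the next likely phrase” looks like under the hood. It uses the small, open-source GPT-2 model from Hugging Face’s transformers library as a stand-in for the far larger GPT-3; the prompt and the model choice are just my illustration, not anything from OpenAI.)

```python
# Illustrative sketch: ask a small language model which tokens it thinks
# are most likely to come next. GPT-2 stands in for much bigger models.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Probabilities the model assigns to the very next token in the sequence
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r}: {float(prob):.3f}")
```

Notice what’s missing: there’s no lookup of facts anywhere. The model is just ranking which token is statistically most plausible given the words so far.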
But human intelligence is not merely pattern recognition and prediction. We also understand basic facts about the world in an abstract fashion, and we use that understanding to reason about the world. AI like GPT-3 cannot reason very well because it doesn’t seem to truly know any facts. It is, as the scientist Gary Marcus notes, merely the “king of pastiche”, blending together snippets of language that merely sound plausible.
Which leads to the real problem: The bot always sounds confident, even when it’s talking out of its ass.
As the Stack Overflow post notes …
The primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce.
This isn’t just a problem for Stack Overflow. In pretty much every other example where you see ChatGPT screwing up basic facts, it does so with absolute self-assurance. It does not admit a smidgen of doubt about what it’s saying. Whatever question you ask, it’ll merrily Dunning-Kruger its way along, pouring out a stream of text.
It is, in other words, bullshitting.
I’m borrowing the definition from Harry G. Frankfurt’s short 2005 book On Bullshit.
In the book, Frankfurt notes that bullshitting is different from lying. When someone is lying, they know the truth and are attempting to conceal it. That’s their goal.
But when someone is bullshitting, they don’t care whether what they’re saying is true or not. Their goal is merely to seduce the audience, to give off the appearance of being authoritative. So the bullshitter confidently bloviates. The words and ideas gush out logorrheically. Is what they’re saying correct, or true? Maybe! Maybe not? Who cares, so long as it holds the audience’s attention.
Bullshit is unavoidable whenever circumstances require someone to talk without knowing what he is talking about. Thus the production of bullshit is stimulated whenever a person’s obligations or opportunities to speak about some topic exceed his knowledge of facts that are relevant to that topic. This discrepancy is common in public life, where people are frequently impelled — whether by their own propensities or by the demands of others — to speak extensively about matters of which they are to some degree ignorant.
We can see, at a glance, how applicable this concept of bullshit is to ChatGPT — and indeed to the whole realm of “AI-generated prose” apps that are now on offer.
After all, on the one hand, the companies that create these AIs quite clearly state that you shouldn’t factually rely on what the bots tell you. As the folks at OpenAI themselves write, “ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers.” (OpenAI also notes that “it will sometimes respond to harmful instructions or exhibit biased behavior”.) These are responsible and important caveats!
But of course, they’re somewhat lost in the architecture of the chatbot itself — which the creators designed to barge cheerfully along, bullshitting enthusiastically when asked about, like, whatevs. Officially, we aren’t supposed to trust what the AI writes. In practice? People are gonna cut-and-paste that stuff all over the place, lol.
It is probably no accident that the industries that have most enthusiastically adopted “AI-generated content” are the ones where bullshit — human-authored bullshit — is historically common: content marketing, PR, certain tech firms, and the more brackish, clickbaity tide-pools of blogging and journalism.
Now, with this critique out of the way, allow me to say: I actually really dig generative models — so long as you’re not relying on them for facts.
I think they’ll have many fabulous uses, particularly as they improve. I’m personally stoked by the possibility of auto-summarizing tech becoming better and better. In a world with so much published material, I’d love to have an AI that could give me the gist of scores of documents, white papers and blog posts, so I could figure out which ones to read more closely. I wouldn’t need the summaries to be perfect; I’m only going to truly rely on material I read directly myself.
I also already enjoy using language models as creative tools — for fiction or drama or poetry, where being factually correct isn’t an issue. (Indeed, it’s often fun to crank the temperature on a model, to get some truly weird-ass results.)
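(For the curious, “temperature” is just a knob in the sampling code: the model’s raw scores get divided by it before the next token is picked. Here’s a minimal, hedged sketch of the idea, not any particular product’s actual API.)

```python
# Minimal sketch of temperature sampling. Temperatures below 1.0 make the
# model stick to its safest guesses; values above 1.0 flatten the
# distribution, so weirder continuations get picked more often.
import torch

def sample_next_token(logits: torch.Tensor, temperature: float = 1.0) -> int:
    scaled = logits / temperature
    probs = torch.softmax(scaled, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))
```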
I could also imagine text generators being useful for helping people structure a piece of writing. You could ask the AI to write its version of an essay on subject X, to see how it pulls it off; then you do your own version. You wouldn’t copy or mimic its facts or points (which might, after all, be unreliable). You’d just fashion your work after the shape and structure of the AI’s piece. When people feel “frozen” or “stuck” with a blank page, the problem — I’ve often observed — is that they have difficulty figuring out “where to begin, where to end, what to put in the middle” … or, structure. Having examples really helps, so an AI that could give you a customized example of structure could help even more.
So I’m actually quite into AI becoming part of our creative toolset!
I’m just not into the bullshit.
I’m a contributing writer for the New York Times Magazine, a columnist for Wired and Smithsonian magazines, and a regular contributor to Mother Jones. I’m also the author of Coders: The Making of a New Tribe and the Remaking of the World, and Smarter Than You Think: How Technology is Changing our Minds for the Better. I’m @pomeranian99 on Twitter and Instagram, and @clive@saturation.social on Mastodon.