The perils of consulting an Electric Monk

Don't blame ChatGPT for the infamous incident of the made-up cases. And don't be too hard on the lawyer, either. We're all susceptible to a machine that tells us exactly what we want to hear.

May 30, 2023

Illustration by Midjourney

I’d planned to write about “dispute avoidance” this week, partly because it’s a concept with huge potential to affect the litigation industry, and partly because I didn’t want to write about AI yet again. But then the “ChatGPT Lawyer” story happened, and all hell broke loose on LawTwitter and LawLinkedIn, and I felt I needed to make three points, one of which involves an extra-terrestrial robot.

My first two points are pretty straightforward:

1. The tsunami of gleeful overreaction from lawyers on social media, urging bans on the use of ChatGPT and predicting prison time for the hapless practitioner, speaks not only to their fear and loathing of generative AI, but also to their desperate hope that it’s all really nothing but hype and won’t disturb their happy status quo. Good luck with that.
2. The condemnation and mockery of the lawyer himself, who made a bad mistake but who’s been buried by an utterly disproportionate avalanche of derision, speaks to the lack of compassion in this profession, whose members should pray that their worst day as a lawyer never makes it to the front page of The New York Times. There but for the grace of God.

My third point needs a little more explanation, because it references a publication with which many lawyers might not be familiar: Douglas Adams’s light-hearted 1987 science-fiction novel, Dirk Gently’s Holistic Detective Agency. The story is a little convoluted, but suffice to say the plot turns in part on a character called an “Electric Monk,” a highly intelligent walking appliance from another planet:

“The Electric Monk was a labour-saving device, like a dishwasher or a video recorder. Dishwashers washed tedious dishes for you, thus saving you the bother of washing them yourself; video recorders watched tedious television for you, thus saving you the bother of looking at it yourself. Electric Monks believed things for you, thus saving you what was becoming an increasingly onerous task, that of believing all the things the world expected you to believe.”

Disaster befalls the crew of a spaceship from this planet when the chief engineer relies on an Electric Monk, which specializes in believing things, to determine whether the ship can safely launch following an accident. The Monk says it is safe; the ship disagrees and explodes. The engineer recalls:

“It’s probably hard for you to understand how reassuring [the Electric Monks] were. And that was why I made my fatal mistake. When I wanted to know whether it was safe to take off, I didn’t want to know that it might not be safe. I just wanted to be reassured that it was.”

When you read the details of the “ChatGPT Lawyer” case, it seems evident that he treated the AI like an Electric Monk. “Is Varghese a real case?” he challenged the AI. “What is your source?” “Are the other cases you provided fake?” He was skeptical, and properly so. These were exactly the right questions to ask.

But ChatGPT insisted the cases were real, and the lawyer wanted them to be real, and so he accepted its answer. I’m not casting aspersions, though, because until you’ve used this program, it’s difficult to appreciate, as the chief engineer said, just “how reassuring” it can be.

Several weeks ago, I was doing some research into a legal regulation issue, and I decided to try the newly released ChatGPT-4 and see if it could help. I asked it to locate any scholarly articles that supported a point I wanted to make, and I was delighted when it returned three articles from reputable law reviews, bearing titles perfectly aligned with my theory.

Then I did something sensible. I pasted the first article’s title into Google to read it myself. “404 — not found,” came the reply from the law review’s website. Puzzled, I tried again, this time with the authors’ names — same result. With a growing sense of apprehension, I tried the other two articles. Nothing. All three were entirely fictional.

But here’s the crazy thing: They were all grounded in reality. The first paper listed two co-authors, law professors at two prestigious American universities. Both professors were real. Both had published law review articles on similar subjects in the past. But neither had ever written about the subject I was researching, and they had never collaborated.

And here’s the other weird thing: I felt let down and hurt, as if a newly hired and seemingly trustworthy intern had lied to my face. I felt like I’d been misled, even betrayed, because dammit, I wanted those articles to be real. They would have supported my point wonderfully. I felt, irrationally, like I’d been made a fool of.

So I went back to ChatGPT to challenge it, asking (with a certain amount of pique) the same kind of questions the ChatGPT Lawyer posed: “Are these citations real? I can’t find them anywhere. Did you make them up?” At first the AI provided more fictional links, but after repeated challenges, it finally admitted it couldn’t find the articles and apologized for its error. I still felt upset, but I also felt like I’d dodged a bullet, because if I had submitted those “articles” to my client to support my point, I would have destroyed my credibility.

But it wouldn’t have been entirely the AI’s fault. True responsibility would have rested with me, because the AI was telling me what I wanted to hear. ChatGPT, as Ethan Mollick has noted, wants to make you happy, and it will tell you what it thinks will make you happy whether it’s true or not — like “this case exists” or “the spaceship is safe to launch.” And wanting something to be true can override all your instincts to the contrary.

It’s not hard to think about places and people that already seem to own an Electric Monk. International analysts wonder who’s advising Vladimir Putin about the actual state of his war on Ukraine. Business analysts shake their heads at Elon Musk’s decision to spend $44 billion on a criticism factory. Powerful but insecure people have always surrounded themselves with sycophants who interpret reality in pleasing ways. Well, now it seems we’ve invented an Electric Sycophant that can do it at scale.

Rest assured, Large Language Models are going to get better at this. Bing Chat’s GPT-4 model has the great advantage of access to the internet and rarely if ever provides links to thin air. Legally trained LLMs from powerhouse knowledge companies like Thomson Reuters, Lexis, and Casetext will be incapable of linking to cases or authorities that don’t exist. ChatGPT’s creators will (or absolutely ought to) bolster the program’s ability to say “I don’t know” when it can’t find an answer, rather than lying in order to keep its operator happy.

But the real weak link in this chain will continue to be the humans. I look at all the people convinced that the last US presidential election was stolen, or that vaccines cause autism, or that the world is flat, all in the face of overwhelming evidence to the contrary. And I wish I could ask these people, not, “Why do you believe that?” but rather, “Why do you want that to be true? Why is it important to you that the world should be this way? Why does it give you comfort?”

Those are questions we’ll need to ask each other more often in future, thanks to our new Electric Monks. More importantly, they’re questions we need to ask ourselves as lawyers. Are you looking for evidence to support the side that’s hired you? Or are you looking for the truth? Choosing the first option has never been easier. It’s also never been more dangerous.

Jordan Furlong
