AI Fails as a Seinfeld Fact-Checker
Large language models continue to hallucinate when quoting iconic TV shows.
With a few words, anyone can prompt an AI chatbot to create just about anything. But is its output factually correct?
The answer, of course, varies. Consider two extremes:
- A single, easily verifiable sports statistic.
- A nuanced or contentious social, political, or economic issue.
If those two examples mark the ends of a continuum of facts, then Seinfeld quotes should fall near the easy end. After all, you can find entire episode scripts online in seconds.
Let's see how Notion AI did when I prompted it to create a database of quotes from the iconic show:
Initial Prompt
Here's my prompt:
True to form, Notion invoked its hodgepodge of AI models and dutifully created a database. I then added a few new properties that I would use to assess the accuracy of its output.