Perverse Incentives: Google Search in the Age of Hallucination
It is a truth universally acknowledged, that a student in want of facts will consult Google and Wikipedia. Though students are often taught to shun these tools in favor of academic databases, the barrier to entry is so low, the interfaces so familiar, and the rewards so tangible that the advice is quickly revealed as counterproductive. It is advice heavily coated in hypocrisy, as most of the instructors offering it have probably used Google and Wikipedia in their own research.
Rather than forbidding the use of tools I use myself, I have for many years taught a 2011 chapter by Randall McClure called “Googlepedia.” Even as it became dated, I appreciated what it demonstrated about Google and Wikipedia as research tools, as well as McClure’s efforts to lead instructors and students away from prescriptive bans and toward best practices. Students are going to use Google and Wikipedia, so let’s make sure they know how to do it right.
The best practices of 2011 are no longer enough for the Google of 2024. The advent of large language models and text-to-image models, in addition to changes that Google has made to its search tools, makes former search practices not only useless, but sometimes actively dangerous. While Wikipedia has in many ways improved considerably since 2011, Google Search, alas, has been chasing the red cape waved by its shareholders and venture capitalists. This year I am cutting “Googlepedia” from my syllabus.
That charge at the red cape is how Google was pushed into adding generative AI to its search results years before the product was ready—if it ever will be. When the company tried to hold back, the market gored it for the delay. Now, of course, the market is turning on generative AI, so Google may be gored again for following the whims of people whose only incentive is seeing a line go up.
And since Google is a monopoly, the incentive to improve Search is limited. It’s the default search engine on most devices sold today, and according to StatCounter, as of July 2024 it holds around 90% of the global search market share, while its closest competitor, Microsoft’s Bing, holds around 4%. Google is too big to avoid and too big to care about its users.
So what would I recommend to students now? If possible, avoid Google. I’m using DuckDuckGo these days. It’s far from perfect, but it has better privacy protections than Google, and when I’ve compared the results of my searches, they’ve been comparable. Change the default search engine in your browser. (And, for that matter, avoid Google’s Chrome if you can. I use Firefox, but my kids recommend Opera.)
Skip any automated summaries that appear at the top of your search results, whether you’re using Google or something else. These are often so misleading as to be actively damaging. It’s important to understand that even if a generative AI tool has access to the internet, it doesn’t have the understanding to parse what it sees. It’s a fancier version of the autocomplete your phone does in texts. By picking whatever the algorithm calculates is most likely to come next in a sequence of words, it can create text that looks authoritative but is often disastrously wrong.

The most famous example is when Google’s AI summary advised people to put glue on pizza, advice it picked up from a joke post by a Reddit user called “Fucksmith,” but these summaries make other incorrect claims that are much harder to catch, such as giving the wrong dates or authors for articles, or swapping in similar words. Generated summaries should be treated as disinformation, even if they are sometimes right. A shop that puts mouse droppings in only 3% of its sandwiches is a shop that would rightly be shut down. Don’t put mouse droppings in your research (unless you’re researching the digestive systems of mice, I guess).
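To make the “fancy autocomplete” point concrete, here is a deliberately toy sketch in Python. It is not how large language models actually work (they operate over learned representations at enormous scale, not raw word counts), but it illustrates the core move: predicting a continuation from statistics rather than understanding. The corpus and examples are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration. A real model trains on
# billions of words, including joke posts scraped from the web.
corpus = "put glue on paper put glue on cardboard put cheese on pizza".split()

# Count how often each word follows each other word (a bigram table).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def autocomplete(word):
    """Return the statistically most likely next word, or None.

    The function has no idea whether "glue on pizza" is good advice;
    it only knows which sequences appeared in its training text.
    """
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(autocomplete("glue"))  # continues with whatever most often followed "glue"
```

The output looks fluent and confident, but the program is only replaying frequencies from its training text, which is exactly why a joke post can surface as a recommendation.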
Go to the websites your search brings up and check that they’re reputable. Cross-reference what you find with sources in your library databases. This is all stuff people should be doing regardless, but it’s especially important now that disinformation is infiltrating so much more of the web. There are human limits to creating bullshit; we have to eat and sleep, and we can only type so fast. Large language models and text-to-image generators have no such limits. It’s also worth looking up authors, since formerly reputable sites now intersperse articles by real people with articles attributed to fake ones.
This all sucks. Finding accurate information is a lot more work than it was even a few months ago. I’m old enough to remember being introduced to Google back in the late ’90s, when it was a revelation to be able to just enter a query and find real results that weren’t curated by your browser. Google Search made the web usable and information findable in a way it never had been before. Now it’s making the web a morass of disinfo, privacy nightmares, and general disdain for the users who depend on its products.
While I would like to see search improved in library databases, libraries and librarians remain our best resources for finding accurate information in a world that is drowning in bullshit.