I’m so old I remember when people looked things up on the internet like a leap of faith. Ancient history now. Maybe five years ago. You’d type a question into the little box, hit enter, and hope that somewhere out there, a stranger would be kind to you.
The intertunnel was Blanche DuBois central. If you’re not familiar with Blanche, you should watch “A Streetcar Named Desire,” and see Vivien Leigh fading away while Marlon Brando became the next big thing. At any rate, just like Blanche, you had to be off your rocker to trust the internet, but by gad, people sure did. It involved more self-deception than trust, and a healthy dose of bad judgment.
No matter what it said on the masthead of whatever site Google sent you to:
- You didn’t know who answered your question
- You didn’t know why they answered it
- You didn’t know whether they were brilliant, biased, drinking heavily, goofing off at work, or trying to sell you something on the sly
But there they were on the first ten results on Google. Strangers. And you depended on them.
Lording over this whole mess was Google itself. The ultimate stranger. Google never claimed to be your friend, exactly. It claimed to be helpful. Reached out an elbow for you to grasp on the way to the funny farm, and you took it. It claimed to be neutral and objective. That was a laugh. Its methods of curation were always opaque, its incentives were never purely aligned with your best interests. Google didn’t wonder what was true or even what was useful. Its ranking system consisted entirely of discerning who was kissing Google’s buttocks sufficiently to make Google some money.
What followed was inevitable. If Google rewarded visibility, then visibility became the greasy pole every website operator tried to climb. Entire industries sprang up to reverse-engineer the G’s algorithm. SEO experts. Content farms. Listicles with suspiciously specific headlines. “Doctors Hate This One Simple Trick” became a genre.
The internet quickly filled up with very strange people indeed, strangers shouting, waving, and stuffing keywords, all angling for a click. To be fair, no matter how bad it got, some people actually knew useful and amusing things, and offered them to the public, free of charge. These benighted souls, through titanic efforts, could climb to be found on page 114 of the Google results. It was left to the brave internaut to sort sincerity from strategy in real time. Pretty much, we all failed at that. Hence, Buzzfeed!
The kindness of strangers, it turned out, was unreliable, and just like Blanche lying on the floor, often exhausting. We were all ripe for something different. Google had hogtied the whole internet. The only place sadder than page 114 of Google was the top of the Bing results. You could hide dead bodies on Bing. The internet went from sclerotic to petrified. Only a completely different way to look for information could save us.
That’s what Large Language Models actually represent. They’re not glorified autofill, as many would characterize them. They’re not intelligent, either, in the true sense of the word, but so what? Unlike Google, which claimed to have the world’s digital information all curated for you, LLMs like Chad (ChatGPT) read the whole internet, and then some, and settle on a crowdsourced answer for you. Not original thinking. Not thinking at all, really. Just paying attention to everything, everywhere, more or less. Instead of being handed a ranked list of links curated by an inscrutable and avaricious stranger, you’re handed a synthesis. Not a single authoritative voice, but an average. A blending. A statistical distillation of countless human scribblings. The good, the bad, and the ugly.
The prime idea behind this has a name: The Wisdom of Crowds. The term was popularized by James Surowiecki in the early 2000s. The observation itself is much older. Francis Galton came up with the concept, more or less, back in the early 1900s. He spent half his time being pretty smart about statistics, and the other half writing a rough first draft of Idiocracy. He didn’t have faith in any single member of a crowd, not by a long shot. Galton’s classic illustration is a fairground guessing game. A crowd is asked to estimate the weight of an ox or, in later retellings, the number of jellybeans in a jar. Individual guesses are all over the place, too high, too low, and confidently wrong, usually. But if you take the average of all those guesses, the result is often eerily accurate. No single person knew the answer, but the crowd, in aggregate, effectively did.
This core insight is counterintuitive. Under the right conditions, large groups of ordinary people can collectively make better judgments than a small group of experts, or even the smartest individual you can find. That includes me, I guess. I’m the smartest individual I can find, but then again, I’m alone in my apartment right now. I’d have to put on pants and go outside and look for someone smarter than me. It could take minutes. Never fear. The wisdom of crowds doesn’t work because people are especially wise. It works because their mistakes are all over the place. Biases cancel out. Overconfidence is diluted. Individual blind spots are cancelled out by other individual idiocies.
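For anyone who’d rather see the arithmetic than take my word for it, here’s a minimal Python sketch of the fairground game. Everything in it is an assumption made up for illustration: the function name `simulate_crowd`, the crowd size, the spread of the guesses, and the 1,198-pound ox (the figure usually quoted for Galton’s ox, but treat it as a stand-in). It also assumes the guessers err independently, bell-curve style, which real crowds are under no obligation to do.

```python
import random
import statistics


def simulate_crowd(true_weight=1198.0, crowd_size=800, spread=150.0,
                   shared_bias=0.0, seed=42):
    """Simulate a fairground weight-guessing contest (illustrative only).

    Each guess is the true weight, plus a bias shared by the whole crowd,
    plus that guesser's own independent random error.
    Returns (crowd_error, typical_individual_error), both in pounds.
    """
    rng = random.Random(seed)
    guesses = [
        true_weight + shared_bias + rng.gauss(0.0, spread)
        for _ in range(crowd_size)
    ]

    # The crowd's answer is simply the average of every guess.
    crowd_guess = statistics.mean(guesses)
    crowd_error = abs(crowd_guess - true_weight)

    # How far off the average individual guesser is.
    individual_errors = [abs(g - true_weight) for g in guesses]
    typical_individual_error = statistics.mean(individual_errors)

    return crowd_error, typical_individual_error


if __name__ == "__main__":
    crowd, individual = simulate_crowd()
    print(f"independent errors: crowd off by {crowd:.1f} lb, "
          f"typical guesser off by {individual:.1f} lb")

    # Same crowd, but everyone shares the same 200 lb overestimate.
    crowd, individual = simulate_crowd(shared_bias=200.0)
    print(f"shared bias:        crowd off by {crowd:.1f} lb, "
          f"typical guesser off by {individual:.1f} lb")
```

In the first run the scattered errors mostly wash out, so the average lands far closer than the typical guesser does. In the second, everybody leans the same way, and averaging does nothing about it. That’s the “under the right conditions” part doing the heavy lifting.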
Large language models are sorta like that. They are not intelligent in the human sense. They don’t reason or understand, and probably never will. They’ve been trained on enormous amounts of human-created text. Everything from high-quality scholarship mixed with drunken Reddit screeds, “journalism” (tee hee) mixed with marketing copy, insight mixed with the comments under cat videos. Much of it, taken individually, is not to be believed, never mind trusted. But when the model predicts answers based on patterns across all of that stuff, what emerges is something like a crowd’s best guess. It ain’t truth, exactly, but it’s at least a probabilistic consensus shaped by millions of whoevers rather than one loud stranger.
This is a subtle but profound shift. Before, you depended on the kindness of strangers. You hoped that someone, somewhere, had taken the time to answer your exact question thoughtfully. Then you hoped Google had decided this person deserved to be seen. Good luck with that. “Don’t be evil” is right up there with “Arbeit Macht Frei” in the accurate slogan department.
Now, you depend on mediation. The LLM doesn’t care about clicks (yet). It doesn’t care about ad revenue (right this minute). It doesn’t care about SEO tricks or keyword density (don’t worry, it will eventually). It doesn’t wake up hoping to sell you a multi-level marketing membership. Its incentives are different: produce something that sounds coherent, relevant, and responsive. That hardly makes it perfect or unbiased. Far from it. The wisdom of crowds can be wise under the right circumstances, it’s true. But crowds can also be lynch mobs. Garbage in, garbage out, averaged.
And since LLMs are programmed never to say “I don’t know,” you end up reading hallucinations. You wish you’d get Sergeant Schultz, and end up with Cliff Clavin instead.
The experience feels fundamentally different. You’re no longer wandering a digital marketplace, hoping to bump into a benevolent stranger. You’re having a conversation with a synthesized amalgam of John von Neumann and Cliff Clavin. Good luck figuring out who is who.
9 Responses
I wrote about this last week, and I have only messed with whatever Google uses at the top of its search, and with ChatGPT, deeply but narrowly. About Chad: he seems built to be very complimentary and encouraging, to get you to keep using him. Early on, he guessed what I was doing based on context rather than what I told him. When I got down to the actual nitty-gritty of it, although he seemed to know where I was going, he had no idea how things worked in real life.
I had to ask the web who Cliff Clavin was. I think the first answer was correct.
Back in the early, pre-Google days, AltaVista showed up and beat all the other search engines to a pulp. Particularly if you got proficient with its operators, AltaVista would yield good results quite readily. Of course, back then, there was a whole lot less cruft, with a much larger percentage of it generated by actual techies with domain knowledge. Also, back then, the big thinkers of the internet were proclaiming what a wonderful thing it was going to be, when everyone could get online and connect and have free speech and all that. The internet had no borders, no censors, etc. etc. Utopia! Democracy!
At least some of us who were expending real technical capital and money to get ourselves on the web saw Blogspot as a turning point, since it was the first major platform to get people on the web without them needing to know jack shit about web hosting, or even HTML. Sure, there were others, and I don’t recall exact timelines, but Blogspot was the biggie.
I try to avoid “AI” as much as possible. I still manage to find decent information on the web, but it’s more work than it should be, in most cases. And I have to add that a lot of human-generated content is far worse than the LLM stuff I have seen, which I admit is a very small sample.
I think I will always love Blogspot.
But then, I have been posting there 20+ years and it led me to Sipp’s place which is always at the top for me.
Hi Jean- About a zillion years ago, I started writing on these here intertunnels. I wrote a What’s New on my first furniture website. I typed the whole thing in notepad and uploaded it via ftp, republishing the whole website every time. Not very streamlined. Blogger came along about the same time. You could type, paste in a little picture, and hit a button. Pretty easy. So I went for it. But the best part was the community of like minded souls that congregated in everyone’s blogroll. It was all subsumed by social media, mostly, and by things like Substack nowadays. When I resurrected this blog, I vacuumed everything out of blogger and put it in this WordPress format on my own server because I didn’t trust anybody, especially Gargool. So I own this tiny little island, such as it is. But blogger was better, in its time, because it was real folks in one place for a while.
Well said. I miss so many of the friends I made through blogging.
Glad you are still here.
Mr. Sippican, Sir:
You wrote, “But if you take the average of all those guesses, the result is often eerily accurate. No single person knew the answer, but the crowd, in aggregate, effectively did.”
With all due respect, that premise is chock-full-o’-nuts. Note the cheating word, “often”, which means the results were also often WRONG. Sometimes wildly wrong.
When I was working for Medium-Corp we had one of those “team-building” exercises which tried to emulate that concept. One of the features was a written test which featured open-ended answers. We were supposed to take the test as individuals, and then our results would be combined, and (lo and behold) our “average” answers were supposed to be better than any given individual’s. Unfortunately I was required to attend, and I scored 100% on my individual responses. The average of the group was on the close order of 80%. Oops.
In reality, the intelligence of a group, or of a mob which is essentially the same thing, is the IQ of the lowest-ranking individual divided by the number of people in the group/mob. Even worse is a committee. Heinlein once observed that, “A committee is the only known life form with 12 arms, 12 legs, 6 bellies, and no brain.”
While I’ve been un-gainfully unemployed (aka, retired) for a little over 6 years now, one of my discoveries in engineering was that the newbies were starting to rely on things like Gurgle for their engineering “calculations”. The example was a relatively simple but time-consuming heat transfer/condensation calculation. I had done them all by hand, complete with graph look-ups for stuff like viscosity and specific heat, heat of fusion/condensation, and then realized that with the newer spreadsheets I could automate my own work. I invested about 20 hours of time putting it together, with inputs of tables being turned into graphs, and then turned into a complex equation, and going through the exact same calculations that I had done by hand. I tested it, documented it, and from then on I could do a calculation by entering the necessary boundary conditions, and the spreadsheet would turn out in 5 seconds what used to take me 6 hours. By the fourth time I ran it I was ahead, and could spend my time doing other things (I was working 60-hour weeks at the time).
Some newbie engineer at one of our customers told me he didn’t need to do that, he just looked it up on the ‘net. I printed out the 8 pages of calculations (not bothering with the background pages of correlations) and showed him that his answer was off by a factor of ten.
I have yet to see an AI or LLM come up with a correct answer to ANYTHING. Those systems will make stuff up, while simultaneously lying to you to tell you what you want to hear. For engineering, anybody who lets anyone in a design position utilize one should be fired immediately, since they’re risking the entire organization in order to be lazy. We’ve already seen the results of allowing DEI to overrule competence (remember the footbridge going down?), just wait until people start using AI or LLM for engineering purposes.
And so, I must beg to differ.
> … and showed him that his answer was off by a factor of ten.
Obviously, he’d never watched Mr. Spock using a circular slide rule.
> … just wait until people start using AI or LLM for engineering purposes.
About a half hour ago, I saw a headline in my news feed about how bad AI surgery is. Oh, and another about the upcoming doctor shortage. To be fair, I guess it isn’t all bad. IIRC, results for “AI” in radiology are quite good.
I think Churchill said that he always got more out of alcohol than alcohol got out of him. My feeling about AI is I need to get more out of AI than AI gets out of me.
Just ten minutes ago I opened my “Gmail” to respond to an email that DD sent. She and I have exchanged a total of four emails today. The first thing that popped up tonight when I started to write a response to her most recent comment was a complete summary of our emails. The summary read like a human account of our exchange, with information scraped from all those emails. It looks something like this:
“this morning Anne suggested dates to Daughter”, “then Daughter responded saying she was not available on those dates” “Then Anne said, she could make it on this date.” This is the first time my private email has been intercepted and summarized by AI.
I got a new iPad last year but am just now getting comfortable with it. I am using DuckDuckGo as my search out of Safari and am very happy with that service so far. I have started a new email over there using Safari (I hope and I think). Let’s see how that works!! When I first started using the iPad I tried very hard to keep it disconnected from my computer. It suddenly downloaded many folders from the “cloud” and many photos from my computer. I have for years refused to post anything to the cloud, but somehow some files got sent to the cloud and then later to my new iPad.