I’ve been working with artificial intelligence for 25 years. Most of the applications I’ve used it for have been games: I needed AI for things like pathfinding and combat, so I learned what I needed along the way.
More recently, I’ve used it for products that improve corporate training and for systems that deliver conversational avatars.
So I’ve got a pretty good sense of the state of the art in AI, and for many years it was a steady, incremental improvement. But recently, with the introduction of an AI called GPT3, the game has changed.
For example, here is how GPT3 continued this article after I used just the above text as input. The green words are the AI:
This is a fascinating response. It’s correct in the first two lines, but then it makes a bold claim:
“The result is an AI that can generate text that is indistinguishable from human-generated text.”
This is false. And the way you can tell is from the next few lines, where GPT3 makes all sorts of claims that read like marketing copy. But, technically speaking, it’s just making stuff up that “sounds good” based on the material it was trained on plus the input “prompt.”
GPT3 is a “Large Language Model,” or LLM for short. An LLM starts from a snapshot of an enormous amount of text, drawn from whatever sources its creators choose, and uses it to “train” a series of layered neural networks to predict which word is most likely to come next, given only a “prompt” as input.
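To make that concrete, here’s a minimal sketch of the loop an LLM runs at generation time. Everything in it is a stand-in for illustration: a real model has a learned tokenizer, a vocabulary of tens of thousands of tokens, and billions of parameters, not a toy scoring function.

```python
import random

# Hypothetical stand-ins for illustration only.
VOCAB = ["the", "first", "president", "a", "general", "of", "America", "."]

def next_token_scores(context):
    """Toy scoring function: returns one score per vocabulary word,
    given everything generated so far. A real model computes these
    scores with its trained neural network."""
    random.seed(" ".join(context))          # deterministic toy scores
    return [random.uniform(-2.0, 2.0) for _ in VOCAB]

def generate(prompt, new_words=5):
    words = prompt.split()
    for _ in range(new_words):
        scores = next_token_scores(words)
        # Greedy decoding: always append the single most likely next word.
        best = max(range(len(VOCAB)), key=lambda i: scores[i])
        words.append(VOCAB[best])
    return " ".join(words)

print(generate("George Washington was"))
```

That’s the whole trick: one word at a time, each choice conditioned on everything that came before it.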
In the case of GPT3, they trained it on basically the entire web, plus the Libraries of Congress and Alexandria. Also, for some goddamned reason, Reddit.
But what’s important to realize, despite all the hubbub, is that this is really just a fancy version of the predictive text you’ve had on your phone for years. It simply uses a LOT more context than your phone’s version does to make its predictions.
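If you’ve never thought about how that phone-keyboard prediction works, here is about the crudest possible version, assuming nothing smarter than counting which word follows which. The tiny corpus is made up for illustration:

```python
from collections import Counter, defaultdict

# Toy stand-in for your phone's history of everything you've typed.
corpus = "the cat sat on the mat the cat ate the fish".split()

# For each word, count which words followed it (a bigram model).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Suggest the word that most often followed `word` in the corpus."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

print(predict_next("the"))   # -> "cat" (it followed "the" twice in the corpus)
```

An LLM is doing the same basic job of predicting the next word, but with billions of learned parameters and thousands of words of context instead of one.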
The results you get from an LLM are stochastic, meaning they are non-deterministic except in rare and useless cases. If you want the results to be interesting, you can never be certain what you are going to get.
There is a variable that you can provide to GPT3 to determine just how “creative” the answers will be. It’s called “temperature.”
Basically, if you set the temperature to zero, it will always give you the same, most-likely prediction. But the higher you set it, the more likely it is to produce an unlikely answer. For example, here are two responses to the same “prompt” with the temperature set to 0 and 1.
If you say “George Washington was” to this version of GPT3 and set the temperature to zero, you will always get this answer:
But if you set it higher, GPT3 is more likely to go in an interesting direction, like this one, where it decides the most relevant thing about George Washington is his rocking chair patent.
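Under the hood, temperature is just a number the model’s raw output scores are divided by before they’re turned into probabilities and sampled. Here’s a minimal sketch of that step; the candidate continuations and their scores are invented for illustration and aren’t GPT3’s actual outputs:

```python
import math
import random

def sample_with_temperature(scores, temperature):
    """Convert raw scores into probabilities (a softmax) and pick one index.
    Temperature 0 is treated as greedy: always the top score.
    Higher temperatures flatten the distribution, so unlikely picks
    happen more often."""
    if temperature == 0:
        return max(range(len(scores)), key=lambda i: scores[i])
    scaled = [s / temperature for s in scores]
    peak = max(scaled)                                  # for numeric stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return random.choices(range(len(scores)),
                          weights=[w / total for w in weights])[0]

# Invented continuations of "George Washington was" with made-up scores.
candidates = ["the first President of the United States",
              "a general in the Continental Army",
              "born in 1732",
              "the inventor of a famous rocking chair"]
scores = [4.0, 2.5, 2.0, 0.5]

print(candidates[sample_with_temperature(scores, 0)])       # always the top answer
for _ in range(3):
    print(candidates[sample_with_temperature(scores, 1.0)])  # varies run to run
```

At temperature 0 you get the boring, most-likely answer every time; turn it up and the long tail, rocking chairs included, starts to show up.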
So, what we have is a technology that does a very convincing job of acting human but has absolutely no idea what it’s talking about. It’s just predicting a plausible next word based on lots of human data that may or may not be correct, in either the input or the output.
My fear has always been that people would use the technology for harm: that they would set up harmful LARPs that took no human effort to run.
Well, this guy went and did it. He created an LLM, but instead of training it on normal human text, he trained it exclusively on three and a half years of 4chan /pol/ posts.
Like all dangerous trolls, this guy portrays the project as a joke. It is not.
He trained this vile, antisemitic, violent, misogynistic AI and then set it loose back on 4chan, where it posted thirty thousand times over 48 hours.
Guess what happened? 4chan immediately started creating conspiracy theories about this prolific new user that talked just like them but talked a LOT. This LLM represented 10% of all /pol/ traffic over those two days.
Technically, doing this on Twitter or Facebook is easier than it was for this guy on 4chan.
We are entering a world where the text you read has a shrinking chance of having been written by a human. With the next iteration of LLMs coming in just the next few months, many of the surface-level problems will go away. The output really will start to be “indistinguishable from a human,” as GPT3 falsely brags.
With no moral guidance or ethical boundaries, these AIs are the perfect tool for cult disinformation and indoctrination. Bad actors will set them up to spew lies, and to attack and recruit people. If we don’t have tools that make it possible to see reality for what it is, and not what someone wants us to see, we will sink into a bottomless pit of AI-created garbage.
I will go ahead and let GPT3 close this article out. AI is green:
Very scary and unfortunately things like this never seem to be done for the benefit of the good guys
YES YES YES YES YES YES YES YES YES YES YES YES