So You Want to Write Fiction With AI
Last week, I had the opportunity to briefly try a tool that uses GPT-3 to assist writers. The intention behind this tool is that it will help you think in unexpected ways when you get stuck, and help you generate momentum when you are trying to write something you find difficult or tedious. GPT-3’s ability to compose texts dazzled us last summer when some of its outputs went viral. A statistic from the OpenAI website illustrates its power: when asked to guess whether a 200-word article was written by the largest version of GPT-3 or by an actual human, people only got it right 52% of the time. This is clearly a huge step forward for AI, and in a way that the Turing test doesn’t quite capture.
In 1966, during the first AI boom, people mistook ELIZA for a human in brief interactions. They did this because we project intentionality onto anything and everything we interact with, and we are even capable of anthropomorphizing random sequences of white noise, imagining patterns and deliberations where none could possibly exist. It’s called pareidolia when you perceive an arrangement of clouds or rocks or hardware fixtures as a face; but in fact all human perceptions are tinged with the pareidoliac. For this reason, we also instinctively assume that anything producing human-like blobs of text is human. Our default behavior is to try to strain rationale out of nonsense, in part because our fellow humans’ attempts at communication often fail to clear the bar of coherency.
Indeed, and I say this with all sincerity: it is difficult to claim with certainty that no animating spirit is hidden behind the output of GPT-3, purely on the basis that so many of the people around us speak with so little originality or clarity. This may sound arrogant, cynical, or misanthropic, but if you are reading this I do not have to persuade you to spend a day browsing twitter, where you will observe the spectrum of human intelligence in all its repugnance and glory. There is something of a paradox in that any individual can see how stupid other people are in aggregate, and yet when he participates in that aggregate, he himself is no better.
The theory of the GPT-3-assisted writing tool is that this software, being capable of writing text that often passes for human, might also have some utility in assisting humans who are trying to write. This sounds good in the abstract, but the moment I was faced with the possibility of the tool, I realized the absurdity of its proposal. The act of asking, “what should I use to seed it?” was enough to show me that there could be no answer. It is not enough to imitate the style of a writer, perhaps by peppering the output with some characteristic words; as I scrolled back through my own works, it occurred to me that not only each story, but each paragraph of each story, was placed there with thought and intention relative to the entire work, a feat that the AI cannot replicate even in principle, given its current implementation.
To understand why this is, we need to first understand how GPT-3 does its magic, and then we need to talk about the mechanics of fiction-writing. But first, this is the paragraph I finally decided to use as a prompt. It comes from my next story:
When he is thirteen, in accordance with economic forecasts, he enrolls in a vocational track to learn about encephalic software engineering. In his sleep, Amy dreams schematics and flowcharts and diagrams into his mind. On the weekends, his father takes him geocaching with an app called Recollect; instead of coordinates, the proprietor of the cache uploads a memory of the act of hiding it. Together, the boy and his dad find a memory of a capsule in a flower pot next to a statue of Nikola Tesla. They join the others who have found it there by uploading their own memories to the chain of recollections.
I allowed GPT-3 to ramble for a while on the above theme, and this is what it produced:
The boy is the only one who has not yet found his capsule. He is fourteen years old.
The boy's father, who is a computer programmer, is a man whose job requires him to build software that can be used for anything from helping the blind to monitoring the health of the planet to making sense of the human genome. He has been working on this project for fifteen years, but he is still not satisfied. His daughter is sixteen, and she is also a computer programmer. She is the one who has been doing most of the work—she has spent her life building software that can be used to help people. Her father is not pleased; he thinks that she should spend more time at home, where she belongs.
The boy and his father are both young men, but they are different. The boy is a bit older than the father, but the father has a long way to go before he can match the boy's intelligence. The boy is smarter; the father is not. The boy is also more ambitious, which makes him more likely to succeed. The father is, as far as the boy knows, just another name for an accountant.
Like you, I instinctively recoil when I am confronted with any substantial text written by an AI. These sentences are located in the uncanny valley, because although they are grammatically correct, they are a semiotic train wreck. Disregard the obviously specious paragraph three, in which it becomes clear that the AI has “lost the plot”; even the first sentence directly contradicts the seed. GPT-3 can count from thirteen to fourteen, and it can model that we are talking about a boy, his father, and a capsule, but what it’s not doing is modeling those entities in any meaningful way. It has no model of why we are talking about a boy and his father. There is likely nothing in its vast training set that approximates the hypothetical Neuralink app called Recollect, and it cannot deduce anything about it.
It does understand that a relationship exists, or at least, that one is likely, which is why those symbols recur many times in its output, but it cannot model the nature of the link, only its structure. By way of example, suppose you don’t speak English, and I present you with the following passage:
On Earth, you can breathe, because the atmosphere contains a suitable concentration of oxygen. On Mars, there is no atmosphere, so it is difficult to breathe.
Now, predict the next word in the sentence. You might very well write “On Mars, you can breathe”. Glossing over the question of how you ingested billions of similar texts in order to accurately model grammar, this is a perfectly likely sentence to write, given that the words have no meaning to you outside of their frequency and the order in which they occur. Contra Wittgenstein, there is much more to knowledge than the bare facts of the words as you say them. If you were blind and deaf, and we gently scrambled your Wernicke’s and Broca’s areas with an electric whisk, you would still be able to understand that water is wet and fire is hot. You might well manipulate these concepts in your “mental workspace” using words as pointers (at some point these analogies may obscure more than they reveal), but there are still other, entirely nonverbal dimensions to cognition and cogitation.
GPT-3 is able to compose novel texts because it has ingested a vast library of text from the internet. It does this by scanning bodies of text and cross-indexing each encountered word with each other word in some proximity to the original, in order to build a mathematical model of how likely each word is to be related to each other word in the same sentence and in previous sentences. This model is stored as a neural network that can retrieve relevant context from a “few shots” of seed data, and then predict, based on that input, the most likely next word in the sequence. Repeat this a few hundred times, and you get an “original” composition.
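To make the “predict the next word, then repeat” loop concrete, here is a minimal sketch in Python. It is emphatically not how GPT-3 is implemented (GPT-3 is a large transformer network operating on tokens, not a table of word counts); it is a toy bigram counter, and the tiny corpus and the function names train_bigrams and generate are invented for illustration. It does, however, reproduce the Mars problem in miniature: a model that knows only which word tends to follow which will cheerfully tell you that you can breathe on Mars.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, seed, length=8):
    """Start from a seed word and repeatedly append the most frequent follower."""
    out = [seed]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # the last word was never followed by anything in the corpus
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus; note that it says you CANNOT breathe on Mars.
corpus = (
    "on earth you can breathe . "
    "on earth you can walk . "
    "in a forest you can breathe . "
    "on mars you cannot breathe ."
)

model = train_bigrams(corpus)
print(generate(model, "mars", length=3))  # prints: mars you can breathe
```

The corpus states the opposite of what the model emits, but “you can” outnumbers “you cannot” three to one, so frequency wins. Nothing in the table of counts represents Mars, breathing, or why the two might be at odds.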
Nostalgebraist, who understands AI better than I do, made the following comments on GPT-3’s writing, after he found it emitting a passage verbatim from another book:
if it gains the ability to regurgitate something verbatim, that thing is still stored only implicitly in some compressed form and mixed together with everything else it knows[, b]ut it’s possible to compress information in this way and still be able to “read it off of” the resulting model in a surprisingly complete way....
It makes me think of (one simplified view of) the model where it essentially has this huge implicit library of phrases and even sentences and paragraphs, which are all sort of “competing” to be part of the next stretch of text. In this view, some of the higher-level abstractions it seems to form (like certain styles complete with diction and sentence structure) may be represented internally not as equally high-level abstractions, even implicitly, but as a large number of noisy/compressed concrete examples which can be “strung together” via lower-level similarities…
this reminds me in some ways of how it feels when I’m coming up with the next thing I’ll write or say, and maybe the lesson is really that I have some misguided intuitions about human cognition.
In light of this analysis, it becomes much clearer why GPT-3 decides to introduce a daughter in the second paragraph of its output, and then use her as a vehicle to present noxious feminist claptrap:
His daughter is sixteen, and she is also a computer programmer. She is the one who has been doing most of the work—she has spent her life building software that can be used to help people. Her father is not pleased; he thinks that she should spend more time at home, where she belongs.
Literally every single word of this paragraph is objectionable to my worldview and my ethos. How did GPT-3 manage to produce this excrement using my earlier input? It’s because on average, in the 45 terabytes of text that were used to train it, when someone talks about a father in a modern style, they are doing it disparagingly, and presenting him as an obstacle to the daughter’s noble humanitarian impulses. This passage treats the father (and the concept of fatherhood, and ultimately society itself) with contempt, and implies that the daughter is morally and intellectually superior. Centrists and leftists will balk at my reading, but that’s because, while they may know how to sound out glyphs and make them into words, they have no idea how to actually read. (Reading and writing are nearly the same skill; a good reader is able to discern intention from beneath layers of moving symbols, and a good writer is able to do the same, in reverse.)
This is a frustrating observation, because at every turn, when we try to talk about the limitations of AI, we also encounter the inexorable limitations of humans, who seem to suffer from many of the same defects and in almost the exact same way. In a video game, an NPC is a non-player character, whose only agency is pre-scripted by a programmer within a narrow domain. In real life, an NPC is a person whose cognitive method is roughly isomorphic to that of GPT-3.
The first computer neural networks were designed based on our understanding of biological neurons in the human brain. Whatever thought and perception consist of, the flow of electricity through neurons in the brain is a large part of the story of what cognition is. Much of your own ability to think is grounded in statistical models of your perceptions, stored in the form of connections between your neurons. Because of this, it’s reasonable to suspect that neural networks built with machine learning are doing something analogous to what your own mind is doing as you read these words. A key difference is that your mind is able to cross-index many different sensory modalities, whereas GPT-3 has only the written word.
But what’s also clear is that “likelihood of perceptions” isn’t the whole story, because after we abstract our perceptions into meta-perceptions and meta-meta-perceptions and so on, we are then able to move those concepts around in our minds to sketch out new hypothetical territory in our mental map, and then plan a route through reality to arrive at a new destination. The level where we ask “what words are likely to be related to other words” is only the outermost, surface level. The level above that is the one where we ask, what is the nature of the relationships I have modeled? When I write about a boy, his father, and a hypothetical new geocaching service, I am not saying “these things are related” and leaving it at that: they are related to each other in a particular way that flows from my understanding of the world at every level. GPT-3 can guess, using its cached “knowledge”, at sentences that might be appropriate to that relationship, but it cannot use its model to determine if a novel configuration of those objects is rationally possible.
And there is another layer, even deeper than that, where the relationship between the father, the son, the geocaching app, and the capsule exists in a meshwork of relationships that I as the author have configured to convey an even more abstract level of concepts. A writer does not deploy the characters and entities in his story for no reason: each idea is carefully arranged with regard to all of the others for a definite purpose, and in concert they will form a unified whole. This is known as concinnity, or, at the risk of sounding pretentious (I mean, OK, that ship has sailed), metaphysics. The AI can’t contribute meaningfully to this endeavor because anything it writes will have, at most, a mixture of words that is statistically similar to the input text.
The only utility this offers to a writer is the same as that of a dice roll, or a tarot card, or spilling of entrails. There may be situations where randomness or pseudo-randomness is called for, but they are rare. For example, I did make use of randomness and Markov chains when I was writing The Gig Economy, but only to help me generate the glossolaliac nonsense that the characters speak when they are going mad.
The fallacy of “the AI can help you write” is based on a faulty understanding of why the writer writes. Whether the purpose of the work is to persuade, entertain, or inform, the writer is only successful when an underlying unity of intention motivates his pen. People who fancy themselves potential writers, especially, wish to write books or stories for the sake of writing. They do this because they overestimate the importance of a particular component of their favorite stories relative to the story as a whole.
A book called Save the Cat!, written by Blake Snyder, and published in 2005, laid out a template for writing screenplays. In some ways it’s not a large deviation from Joseph Campbell’s idea of the monomyth, the hero’s journey. The book is written as a manual of advice for producing a hit movie script, and it lays out a comprehensive structure that any movie story can follow in order to succeed. I recommend this book to everyone; nearly every big budget movie to come out of Hollywood in the past fifteen years has followed it to the letter.
It’s true: there is a family resemblance to every good story, and if we abstract the structure of a story enough, then all stories appear to be the same. Many abstract thinkers suffer from a belief that higher levels of abstraction are truer or more fundamental than lower levels, but in fact, the art of thinking lies not in finding the highest level of abstraction, but in finding the correct level of abstraction. At the highest level, a story is an account of things that happen. “All stories are the same! They’re just accounts of things happening!” Yes, you solved the puzzle, you are very smart. It’s only one or two steps down from the true and useless “all stories are accounts of things happening” to the true and useless, “there are two types of stories: one where someone leaves home, and one where someone new comes to town.” There is a pleasing symmetry in this formulation, but it’s still worthless.
When you operate at a too-high level of abstraction, you are what Joel Spolsky calls an architecture astronaut:
When you go too far up, abstraction-wise, you run out of oxygen. Sometimes smart thinkers just don’t know when to stop, and they create these absurd, all-encompassing, high-level pictures of the universe that are all good and fine, but don’t actually mean anything at all.
Joel Spolsky is a software engineer, and also a better philosopher than most actual philosophers, because unlike “real” philosophers, his code actually has to work. At the risk of thinking too abstractly once again, a software developer is an applied philosopher, whereas thinkers from Aristotle all the way to Deleuze are what you might call theoretical philosophers. Both types of men deal in the arrangement of abstractions. Computers, for the first time in history, facilitate the field of applied philosophy.
Joseph Campbell’s monomyth is the architectural astronautics of storytelling, and slavish adherence to it has the same predictable result as in any other domain: you lose the ability to think clearly, because you stop being able to breathe. What literature, software engineering, and theoretical philosophy all have in common is that they rely on the ability of the author to model and deploy concepts in a deliberate way, relative to a unified vision.
All this is a preamble to introduce a specific concept from Save the Cat!, what the author calls the “Fun and Games:”
[The Fun and Games] provides the promise of the premise. It is the core and essence of the movie’s poster. It is where most of the trailer moments of the movie are found. And it’s where we aren’t as concerned with the forward progress of the story – the stakes won’t be raised until the midpoint – as we are concerned with having “fun.” The fun and games section answers this question: Why did I come to see this movie? What about this premise, this poster, this movie idea, is cool? [...] This is the heart of the movie.
Noticing this component of narratives may be Mr. Snyder’s enduring contribution to the discourse of storytelling. Over-adherence to formula is death, but so is under-attention to structure. Executing the fun and games is the most rewarding part of writing for the writer. Reading the fun and games is the most rewarding activity for the reader. But the fun and games only work, they only become fun, when the rest of the story exists to frame them.
For a text to be a story, it is not sufficient to provide an account of what; you must also provide an account of why. A good story may contain a false theory of physics, but no matter how falsely or fantastically it models material nature, it must contain a realistic theory of human nature. The behaviors of the characters and the emotions they feel are how the author conveys his theory. A great story achieves greatness not through the creativity of its world, but through the truth and clarity of its metaphysics.
Often, aspiring writers begin with a premise they find compelling; they imagine a world where some counterfactual is true, and they want to write a story that takes place in that world, to explore that counterfactual. When you think about the premise, you imagine the fun and games, because that tends to be the part of every story that you enjoy the most. When you think of your favorite stories, this is the piece you remember, and it takes on an outsize importance, despite being only a small part of the story as a percentage of its duration.
So you think to yourself, “OK, time to start writing.” And you open up your editor, and you think about your world, and you make up a character, and then… nothing! You switch over to twitter or whatever real quick. Send off an email. Buy something online. Back to the editor, OK, the premise, the made up character, I guess he should do something. Maybe he could have a conversation with someone…? Check twitter one more time. Post some thot in the GC. WYB? Wait, wait, the story…
What’s going wrong in this behavioral loop? It’s not that you’re a mindless junkie feeding a dopamine addiction with whatever shiny junk is closest at hand in a vicious circle of self-abasement. (Though I wouldn’t rule that out.) It’s that you’re failing to think about your story in terms of what it intends to say. Only people who have never seriously written anything imagine that a story can exist without saying something. Every story has a theory of human nature, and it only succeeds when it deploys its concepts in a way that conveys that theory. AI is NOT going to help you with this. AI cannot break you out of the failure cycle of not understanding your own theory of human nature, and not knowing how to articulate it.
This kind of articulation is work. It requires critical, laborious thought. Good storytellers think deeply about people and how they think and behave. The fantasy of AI-assisted writing is that the AI will do the hard work of developing a theory of human nature for you, so you can focus on the one, flashy, relatively pointless bit in the middle that gives you breezy mind candy feelings. In order for the story to be worth reading, it must also be a story, which means it must contain a plot. A plot is not just a sequence of events, but a sequence of human events, which is to say, it is a sequence of intentions and emotions. Stories without intention or emotion are not just boring, they are fatally boring; they are not even stories, they are only accounts of facts.
A story told by GPT-3 is quite literally a tale told by an idiot, signifying nothing. A model of the statistical likelihood of arrangements of words cannot stand in for a theory of human nature, so although it can produce disjointed, dreamy sequences of “events,” any attempt to discern the intentions that underlie the composition will fail, because the underlying network of symbols and relationships is exclusively built on induction, where a robust faculty of reasoning must also contain deduction. Again, this does not mean that no AI could implement such a model, only that we can be certain, when we understand its method, that it’s doing no such thing.
But despite this, there are echoes of theories of human nature in the texts that GPT-3 generates, because in many cases it is quoting passages from the texts it has ingested, which did themselves contain such theories. Balaji Srinivasan has argued that, owing to GPT-3’s propensity to quote its sources at length, it is more similar to a search engine than it is to an author or a reader. This also aligns with Nostalgebraist’s off-the-cuff analysis. Its intelligence is the efficiency of indexing, not the cleverness of imagination.
The implication of all this is that GPT-3’s theory of human nature, such as it is, is a clumsy, lowest-common-denominator mashup of the texts it has ingested. With a little cajoling, it can no doubt be persuaded to say things that would provoke our current year commissars into a state of aneurysm, but in the main it seems to have an affinity for the kinds of inane leftist pablum that it generated in response to my paragraph, above. Perhaps it’s my wingnut paranoia, but what does it say that a single instance of the word “father” is enough to trigger a feminist tirade about a woman who does all the work while her father “regressively” tries to stifle her?
When people who don’t speak each other’s languages live and communicate in the same place, they develop a broken, hybrid language called a pidgin. The word is derived from the Chinese pronunciation of the English word “business”, and it originally denoted the language that 17th-century Chinese spoke when dealing with English merchants. The term now refers to any hybrid language of this sort. The curious thing about pidgins is that children who grow up hearing them innately end up formalizing and grammaticalizing them. Where their parents’ use of the pidgin is inconsistent and strictly pragmatic, their use follows a more rigid structure.
Some characteristics of pidgins, as opposed to “real” languages, are that they tend to lack embedded clauses, grammatical tense, and conjugation, and they tend to employ reduplication for plurals and superlatives (saying something is “big big” denotes a degree of bigness bigger than merely “big”), among other things. What this means is that pidgins are simple; they lack the richness and majesty of mature languages. As a result, they lack the capacity for belletrism. The theory of human nature that emerges from GPT-3’s schizophrenic reading of 45 terabytes of internet text is probably well-characterized as first generation pidgin. The “language” of its “thought” is inconsistent, haphazard, and practical, in the sense that it is able to use it to produce text.
In light of this, it is no surprise that the implicit theory of human nature in the text it produces is so similar to that of every post-boomer generation, which is most prominently displayed among millennials. Millennials have grown up in a hypermediated postmodern world, where postmodernism is defined as “skepticism of metanarratives”. No one ever sat millennials down and transferred a robust, mature theory of human nature to them, so like the children of pidgin speakers, they gleaned it from the implicit theories in all of the media they grew up consuming. And just as second generation pidgin languages are lacking in morphological complexity and nuance, pidgin theories of human nature are lacking in psychological complexity and nuance.
A big part of the reason that millennials find both “woke” and “trad” morality compelling (both of which are ridiculous caricatures of moral behavior) is that their metaphysical understanding of the world is a ludicrous, miscegenated pidgin of stories told by people whose only shared metanarrative is a skepticism of metanarratives. In this way, GPT-3, which was no doubt built mostly by millennials, is a kind of reification of their own beliefs about the world: deduction is superfluous, ideas are just arrangements of words, and AI is a genie who can fulfill your fantasies about narrative fun and games without having to explore the difficult territory of complex intentions and motivations.