Phew! Google's Gemini AI still can’t write a synopsis of my novel

Most writers hate writing a synopsis, so it seems like the perfect challenge for an AI. What I discovered reveals a lot about the current state and limitations of generative AIs like ChatGPT and Google Gemini.

The current wave of interest in AI is fascinating because it inspires so much hype on both sides. I don’t buy into the utopian fantasies of grifter techbros and Dollar-blinded CEOs. Nor am I convinced that generative AI will be the death of creative arts as a career choice.

An AI toolbox of delights

I have no interest in asking AI to write entire works from a prompt, even if it can. At the same time, I see the potential for generative AI to be used as a creative tool. It could help writers overcome our individual shortcomings or streamline time-consuming tasks that are adjacent to our central goal of creative writing.

Unfortunately, this is complicated by the highly unethical way in which this technology has been developed. So far, it’s involved content piracy on a massive scale. I only hope that new laws and regulations will correct this (though not self-regulation, which is mostly pointless).

But I don’t want writing to become a hobby. I seriously worry about the cognitive and empathic dissonance displayed by AI champions who say things like: “Worried about your career being taken by AI? Don’t worry, you can have a new job for lower pay, labelling data for AI so that it can do your job better.” It’s a bit like saying: “Hey, you know that insatiable monster we just made that ate your life? How about we pay you less to feed the monster instead?”

An error message from Google's Gemini AI — If only the marketing guff was this honest about the current capability of AIs.

Writing a synopsis with Google Gemini

When you submit your novel to an agent or publisher, they’ll want to read a synopsis before the full manuscript. To produce this, you have to shrink your finely-crafted 100,000 words into a summary of less than five hundred.

It’s a harrowing and time-consuming act of narrative compression. You’ll abandon beloved minor characters and subplots as you focus on the main characters, narrative, setting and themes. You battle to hold onto the unique flavour, tone and narrative voice that might win you a deal.

But hang on, did I say “compression”? Isn’t that something that computers are really good at? It seems like the perfect job for a large language model AI like ChatGPT or Google Gemini.

A conversation with the Google Gemini AI — The document was just over 50,000 words, so Gemini’s only about 88% inaccurate.

ChatGPT vs Google Gemini: best choice for your synopsis

ChatGPT is the best-known LLM on the generative AI scene, but it has a major flaw. Everything you give it is ingested into its body of training material (remember the thing full of pirated works?).

Google Gemini, in contrast, will examine documents taken from your Google Workspace without ingesting them. After all, I don’t want to give an AI more free training materials. Gemini also enables you to give feedback, and to iterate or query the response with additional prompts.

[Update: I’ve heard good things about Claude but it’s not available in the EU for now.]

I ran these experiments on four of my own texts: Blood River, my first published horror novel; the first draft of Blood Point, my current horror WIP; the current draft of the first In Machina sci-fi novel; and the first draft of The Stuffing of Nightmares, a short story featuring the murderous plushie pals, Bongo & Sandy.

Blood River and Blood Point have similar narrative styles based on found footage diaries, and are just over 50,000 words long. In Machina #1 is around 101,000 words long but it employs both first and third-person narratives. Bongo and Sandy’s story is a simple third-person narrative of about 10,000 words.

I gave Google Gemini a simple request: “The document in my Google Workspace titled “X” is a novel. Please create a 250-word synopsis of this story.”

Google Gemini lies to cover its mistakes

In the tradition of narrative flashbacks, this is the point where I reveal that I started my experiment with a more complex prompt. I asked for a synopsis that identified the major characters, themes, settings, plot points and resolutions of each story.

In every case, Gemini did a good job of summarising the texts and breaking down the stories…all the way to around 4,500 words. Beyond there, it sort of made things up.

According to Gemini, Blood River and Blood Point are “unfinished”. Gemini even identified a point where it thinks the Blood River story ends, at just over 4,000 words. The summaries began to suggest how the stories might develop instead of how they actually unfold. In both cases, neither the supernatural elements nor the antagonists were accurately described.

The summary of In Machina #1 also stalled at about 5,000 words. In Gemini’s version, the protagonists never meet each other or their nemesis. Gemini identified the genre, characters, settings, and some of the themes, though it was unable to describe their development.

When I asked Gemini how much it had read, it told me that it had read all of Blood River’s 56,700 words. For Blood Point, it said: “I read the entirety of the document…which consists of 5,939 words.” That’s about 12% of the entire document. With In Machina #1, Gemini told me: “I read 100% of the document, or 1,013 words, to create the synopsis.” That’s about 1% of the entire novel.
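For anyone who wants to check my maths, here is a minimal Python sketch of those percentages. The totals are assumptions: I have used the rounded manuscript lengths quoted in this post (50,000 and 101,000 words), not exact counts, and the function name is just one I made up for illustration.

```python
# Share of each manuscript that Gemini reported reading.
# Totals are the rounded lengths quoted in this post (an assumption),
# so the results are approximate.

def coverage_percent(words_read: int, total_words: int) -> float:
    """Return words_read as a percentage of total_words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * words_read / total_words


# (words Gemini said it read, approximate manuscript length)
reports = {
    "Blood Point": (5_939, 50_000),
    "In Machina #1": (1_013, 101_000),
}

for title, (words_read, total_words) in reports.items():
    share = coverage_percent(words_read, total_words)
    print(f"{title}: about {share:.0f}% read")
```

Swap in your own word counts to see how much of a manuscript a reported figure really covers.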

I pressed Gemini to say why it fell short, and it initially claimed it had full access to the document. A short conversation later, it agreed that this wasn’t the problem and we could try again. This attempt repeatedly failed due to an unspecified problem with my internet connection.

My internet connection is fine.

Shucks, I’m just a l’il ol’ language model

The exception is Bongo and Sandy’s story. It’s a tale of plushie-on-plushie violence, shocking sour jelly sweet addiction and lead characters with foul mouths that you wouldn’t believe. I thought that the shortest story would yield the most complete synopsis, but Gemini refused to complete the task.

Time and again I watched several paragraphs of synopsis appear, only to be replaced by variations on this message: “I can’t assist you with that, as I’m only a language model and don’t have the capacity to understand and respond.”

Just when I’d given up, Gemini delivered a truncated and inaccurate synopsis. The story again petered out at about 5,000 words, although Gemini claimed to have read the whole 10,000.

What can we learn from Google Gemini’s synopsis fails?

Asking Gemini about its failures feels like talking to a politician: the question it answers is rarely the one that you asked.

My feeling is that it’s a matter of scale. It’s one thing for Gemini to summarise a few thousand words. It’s significantly harder to summarise ten thousand words, let alone 100,000. I don’t know whether the task scales in a linear fashion or if it becomes exponentially more difficult as the narrative gets longer. I know it’s a challenge for my human brain — and I wrote these stories.

The problem might lie in the availability of computing power for each user, especially when Google, OpenAI et al are currently in the “free sample” phase of getting users hooked on their products. Even if they have enough computing power, the electrical power being hoovered up by AI-hosting data centres has become a major issue worldwide. Maybe they just can’t afford to indulge my experiment.

AI’s limits are a win for creatives

While it would be great to have an AI take on time-consuming tasks like writing a synopsis, its failure here is a good sign for creative industries as a whole. Some jobs, it seems, are still too big for today’s inefficient AIs.

Sure, an AI could create a synopsis chapter-by-chapter and sew it together into something coherent. In fact, that’s exactly what OpenAI did in a 2021 experiment with GPT-3 using classic novels. At the time, the GPT-3 summaries received a rating of 4/7 or less from 85% of people who had read those books. That’s not a great score, and though it’s certainly improved, they admit that this is one of the hardest tasks for these types of AI.
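To make the chapter-by-chapter idea concrete, here is a rough Python sketch of how such a pipeline might be wired up. It is my own illustration under stated assumptions, not OpenAI’s or Google’s actual method. The names summarise_long_text and split_into_chunks are hypothetical, the chunk and target sizes are arbitrary, and first_words is a placeholder that stands in for a real model by keeping only the opening words of each chunk.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_words: int) -> List[str]:
    """Split text into consecutive pieces of at most chunk_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def summarise_long_text(
    text: str,
    summarise: Callable[[str], str],
    chunk_words: int = 4000,
    target_words: int = 500,
    max_rounds: int = 10,
) -> str:
    """Summarise each chunk, join the results, and repeat until short enough.

    max_rounds guards against a summariser that never shrinks its input.
    """
    for _ in range(max_rounds):
        if len(text.split()) <= target_words:
            return text
        chunks = split_into_chunks(text, chunk_words)
        text = " ".join(summarise(chunk) for chunk in chunks)
    return text


def first_words(chunk: str, limit: int = 50) -> str:
    """Placeholder summariser (NOT a real model): keep the first words."""
    return " ".join(chunk.split()[:limit])


if __name__ == "__main__":
    # A fake 100,000-word "novel" made of numbered tokens.
    novel = " ".join(f"word{i}" for i in range(100_000))
    synopsis = summarise_long_text(novel, first_words)
    print(len(synopsis.split()), "words in the final synopsis")
```

In a real tool, the summarise argument would call a language model. The sketch only shows the plumbing. Each pass throws away whatever the per-chunk summaries leave out, which may be one reason that long-range plot threads are hard to keep hold of.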

At least one commercial outfit has now launched a tool dedicated to summarising long-form text. So far, the summaries look little better than those Gemini gave me.

Hopefully, this is good news for editors as well as for writers. AI-enhanced tools might be competent at spelling, grammar and improving short text. They can change your style but they can’t yet learn and follow your individual sense of language over the length of a novel.

As for book-length tasks, like development editing or writing a synopsis, those jobs are likely to remain safe for years to come.