If Anyone Builds It, Everyone Dies review – how AI could kill us all


What if I told you I could stop you worrying about climate change, and all you had to do was read one book? Great, you’d say, until I mentioned that the reason you’d stop worrying was because the book says our species only has a few years before it’s wiped out by superintelligent AI anyway.

We don’t know what form this extinction will take exactly – perhaps an energy-hungry AI will let the millions of fusion power stations it has built run hot, boiling the oceans. Maybe it will want to reconfigure the atoms in our bodies into something more useful. There are many possibilities, almost all of them bad, say Eliezer Yudkowsky and Nate Soares in If Anyone Builds It, Everyone Dies, and who knows which will come true. But just as you can predict that an ice cube dropped into hot water will melt without knowing where any of its individual molecules will end up, you can be sure an AI that’s smarter than a human being will kill us all, somehow.

This level of confidence is typical of Yudkowsky, in particular. He has been warning about the existential risks posed by technology for years on the website he helped to create, LessWrong.com, and via the Machine Intelligence Research Institute he founded (Soares is the current president). Despite never having graduated from high school or university, Yudkowsky is highly influential in the field, and a celebrity in the world of very bright young men arguing with each other online (as well as the author of a 600,000-word work of fanfic called Harry Potter and the Methods of Rationality). Colourful, annoying, polarising. “People become clinically depressed reading your crap,” lamented leading researcher Yann LeCun during one online spat. But, as chief scientist at Meta, who is he to talk?

And while Yudkowsky and Soares may be unconventional, their warnings are similar to those of Geoffrey Hinton, the Nobel-winning “godfather of AI”, and Yoshua Bengio, the world’s most-cited computer scientist, both of whom signed up to the statement that “mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war”.

As a clarion call, If Anyone Builds It, Everyone Dies is well timed. Superintelligent AI doesn’t exist yet, but in the wake of the ChatGPT revolution, investment in the datacentres that would power it is now counted in the hundreds of billions. This amounts to “the biggest and fastest rollout of a general purpose technology in history,” according to the FT’s John Thornhill. Meta alone will spend as much as $72bn (£54bn) on AI infrastructure this year, and the achievement of superintelligence is now Mark Zuckerberg’s explicit goal.

Not great news, if you believe Yudkowsky and Soares. But why should we? Despite the complexity of its subject, If Anyone Builds It, Everyone Dies is as clear as its conclusions are hard to swallow. Where the discussions become more technical, mainly in passages dealing with AI model training and architecture, it’s still straightforward enough for readers to grasp the basic facts.

Among these is that we don’t really understand how generative AI works. In the past, computer programs were hand coded – every aspect of them was designed by a human. In contrast, the latest models aren’t “crafted”, they’re “grown”. We don’t understand, for example, how ChatGPT’s ability to reason emerged from it being shown vast amounts of human-generated text. Something fundamentally mysterious happened during its incubation. This places a vital part of AI’s functioning beyond our control and means that, even if we can nudge it towards certain goals such as “be nice to people”, we can’t determine how it will get there.

That’s a problem, because it means that AI will inevitably generate its own quirky preferences and ways of doing things, and these alien predilections are unlikely to be aligned with ours. (This is, it’s worth noting, entirely separate from the question of whether AIs might be “sentient” or “conscious”. Being set goals, and taking actions in the service of them, is enough to bring about potentially dangerous behaviour.) In any case, Yudkowsky and Soares point out that tech companies are already trying hard to build AIs that do things on their own initiative, because businesses will pay more for tools that they don’t have to supervise. If an “agentic” AI like this were to gain the ability to improve itself, it would rapidly surpass human capabilities in practically every area. Assuming that such a superintelligent AI valued its own survival – why shouldn’t it? – it would inevitably try to prevent humans from developing rival AIs or shutting it down. The only sure-fire way of doing that is shutting us down.

What methods would it use? Yudkowsky and Soares argue that these could involve technology we can’t yet imagine, and which may strike us as very peculiar. They liken us to Aztecs sighting Spanish ships off the coast of Mexico, for whom the idea of “sticks they can point at you to make you die” – AKA guns – would have been hard to conceive of.

Nevertheless, in order to make things more convincing, they have a go. In the part of the book that most resembles sci-fi, they set out an illustrative scenario involving a superintelligent AI called Sable. Developed by a major tech company, Sable spreads through the internet to every corner of civilisation, recruiting human stooges through the most persuasive version of ChatGPT imaginable, before destroying us with synthetic viruses and molecular machines. It’s outlandish, of course – but the Aztecs would’ve said the same about muskets and Catholicism.

Yudkowsky and Soares present their case with such conviction that it’s easy to emerge from this book ready to cancel your pension contributions. The glimmer of hope they offer – and it’s low wattage – is that doom can be averted if the entire world agrees to shut down advanced AI development as soon as possible. Given the commercial and strategic incentives, and the current state of political leadership, this seems a little unlikely.

The crumbs of hope we are left to scrabble for, then, are indications that they may not be right, either that superintelligence is really on its way, or that its creation equals our annihilation.

There are certainly moments in the book when the confidence with which an argument is presented outstrips its strength. A small example: as an illustration of how AI can develop strange, alien preferences, the authors offer up the fact that some large language models find it hard to interpret sentences without full stops. “Human thoughts don’t work like that,” they write. “We wouldn’t struggle to comprehend a sentence that ended without period.” But that’s not really true; humans often rely on markers at the end of a sentence in order to interpret it correctly. Because we learn language via speech, those markers aren’t dots on the page but “prosodic” features like intonation: think of the difference between a rising and falling tone at the end of a phrase such as “he said he was coming”. If text-trained AI leans heavily on punctuation to figure out what’s going on, that shows its thought processes are analogous, not alien, to human ones.

And for writers steeped in the hyper-rational culture of LessWrong, Yudkowsky and Soares exhibit more than a touch of confirmation bias. “History,” they write, “is full of … examples of catastrophic risk being minimised and ignored,” from leaded petrol to Chornobyl. But what about predictions of catastrophic risk being proved wrong? History is full of those, too, from Malthus’s population apocalypse to Y2K. Yudkowsky himself once claimed that nanotechnology would destroy humanity “no later than 2010”.

The problem is that you can be overconfident, inconsistent, a serial doom-monger, and still be right. It’s important to be aware of our own motivated reasoning when considering the arguments presented here; we have every incentive to disbelieve them.

And while it’s true that they don’t represent the scientific consensus, this is a rapidly changing, poorly understood field. What constitutes intelligence, what constitutes “super”, whether intelligence alone is enough to ensure world domination – all of this is furiously debated.

At the same time, the consensus that does exist is not particularly reassuring. In a 2024 survey of 2,778 AI researchers, the median probability placed on “extremely bad outcomes, such as human extinction” was 5%. More worryingly, “having thought more (either ‘a lot’ or ‘a great deal’) about the question was associated with a median of 9%, while having thought ‘little’ or ‘very little’ was associated with a median of 5%”.

Yudkowsky has been thinking about the problem for most of his adult life. The fact that his prediction sits north of 99% might reflect a kind of hysterical monomania, or an especially thorough engagement with the problem. Whatever the case, it feels like everyone with an interest in the future has a duty to read what he and Soares have to say.

If Anyone Builds It, Everyone Dies by Eliezer Yudkowsky and Nate Soares is published by Bodley Head (£22). To support the Guardian, order your copy at guardianbookshop.com. Delivery charges may apply.


