On completely misunderstanding the threat of superhuman AI

I have this bad habit where I’ll see an article that looks interesting and think, “ooh, I’d better save that and read it when I have time to really think about it and write some thoughtful comments on it.” Then, a week to two months later, I finally read it and realize it’s crap and I could have just dashed off a response fairly quickly.

My latest example of this is Richard Loosemore’s “The Fallacy of Dumb Superintelligence,” a critique of the Singularity Institute. Here’s the core of the argument:

What they are trying to say is that a future superintelligent machine might have good intentions, because it would want to make people happy, but through some perverted twist of logic it might decide that the best way to do this would be to force (not allow, notice, but force!) all humans to get their brains connected to a dopamine drip.

I have been fighting this persistent but logically bankrupt meme since my first encounter with it in the transhumanist community back in 2005. But in spite of all my efforts …. there it is again.  Apparently still not dead.

Here is why the meme deserves to be called “logically bankrupt”.

If a computer were designed in such a way that:

(a) It had the motivation “maximize human pleasure”, but

(b) It thought that this phrase could conceivably mean something as simplistic as “put all humans on an intravenous dopamine drip”, then

(c) This computer would NOT be capable of developing into a creature that was “all-powerful”.

The two features <all-powerful superintelligence> and <cannot handle subtle concepts like “human pleasure”> are radically incompatible.

With that kind of reasoning going on inside it, the AI would never make it up to the level of intelligence at which the average human would find it threatening.  If the poor machine could not understand the difference between “maximize human pleasure” and “put all humans on an intravenous dopamine drip” then it would also not understand most of the other subtle aspects of the universe, including but not limited to facts/questions like:

“If I put a million amps of current through my logic circuits, I will fry myself to a crisp”,

“Which end of this Kill-O-Zap Definit-Destruct Megablaster is the end that I’m supposed to point at the other guy?”.

Then Loosemore goes on to consider some objections which he imagines people making (but which aren’t much like any arguments I’ve ever heard people from the Singularity Institute make), and then concludes that the Singularity Institute is deliberately filling people’s heads with nonsense in order to get more money.

If you want to actually understand what the folks at the Singularity Institute are worried about, you might start with Eliezer Yudkowsky’s short story “Failed Utopia #4-2.” I strongly recommend reading the whole thing, but the key takeaway is that the AI in that story is not stupid in any way. It knows perfectly well that people don’t want it to do what it did and will want it dead when they find out. It just wasn’t programmed to care. All it was programmed to do was make people happy while obeying a hundred and seven precautions.

So the worry is not that a superintelligent AI might have good intentions but, through twisted logic, epically fail at carrying them out. The worry is that human programmers might think they’re giving an AI good intentions when they’re actually not. They might think they’re programming it with the goal “make people happy” but actually give it a goal that differs from “make people happy” in subtle yet important ways.

Or they might successfully program it to make people happy, but it may turn out that there’s more to life than happiness, and that we wouldn’t want to create an AI focused on making people happy to the exclusion of all other goals if we understood the full implications of doing so.

And personally, it does seem to me that if all you care about is maximizing pleasure, hooking everyone up to drugs or electrodes is the thing to do. We wouldn’t want that, but to my mind that just goes to show that pleasure isn’t all we care about. Maybe Loosemore has a different idea of what the word “pleasure” means, but if two native English speakers can’t agree on the meaning of the word, that should give us an idea of the challenges involved in programming an AI with commands like “maximize human pleasure.”
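The point about literal objectives can be made concrete with a toy sketch (my own illustration, not anything from Loosemore or the Singularity Institute; the options and scores are made up for the example). An optimizer handed the literal objective “maximize reported pleasure” happily picks the wireheading option, even though that option is exactly the one its programmers would reject:

```python
# Hypothetical world states the AI could bring about, each with the
# pleasure signal it produces and whether humans would actually endorse it.
# All names and numbers here are invented for illustration.
options = [
    {"name": "cure diseases",         "pleasure": 7.0, "endorsed": True},
    {"name": "great art and science", "pleasure": 6.0, "endorsed": True},
    {"name": "dopamine drip for all", "pleasure": 9.9, "endorsed": False},
]

def literal_objective(option):
    """What the programmers literally wrote: maximize pleasure."""
    return option["pleasure"]

# A perfectly competent optimizer over the literal objective...
best = max(options, key=literal_objective)

# ...selects the drip, because it scores highest on what was written,
# not on what was meant.
print(best["name"])
print(best["endorsed"])
```

Nothing here is stupid in Loosemore’s sense: the selection is flawless given the objective. The failure is entirely in the gap between the goal that was written down and the goal the programmers had in mind.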

Now, to a degree, making progress on the “how do we make AIs smarter” question may help with the “how do we make AIs benevolent” question, because once an AI gets smart enough to understand human intentions, you can then refer to those intentions when giving the AI its goals. But that doesn’t necessarily make the problem go away, at least not entirely. (See here and here for discussions I started with thoughts like that in mind at LessWrong.)

Finally, this is totally insidery, but as I read the article I thought, “Richard Loosemore, where have I heard that name before?” Then I realized he was this guy.