At LessWrong: Why a human (or group of humans) might create unFriendly AI halfway on purpose

This post that I just wrote contains some of the stuff on evolutionary psychology I’ve been meaning to write more about here:

Too many people–at least, too many writers of the kind of fiction where the villain turns out to be an all-right guy in the end–seem to believe that if someone is the hero of their own story and genuinely believes they’re doing the right thing, they can’t really be evil. But you know who was the hero of his own story and genuinely believed he was doing the right thing? Hitler. He believed he was saving the world from the Jews and promoting the greatness of the German volk.

We have every reason to think that the psychological tendencies that created these hero-villains are nearly universal. Evolution has no way to give us nice impulses for the sake of having nice impulses. Theory predicts, and observation confirms, that we tend to care more about blood-relatives than mere allies and allies more than strangers. As Hume observed (remarkably, without any knowledge of Hamilton’s rule) “A man naturally loves his children better than his nephews, his nephews better than his cousins, his cousins better than strangers, where every thing else is equal.” And we care more about ourselves than any single other individual on the planet (even if we might sacrifice ourselves for two brothers or eight cousins).
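The arithmetic behind “two brothers or eight cousins” can be sketched as follows. This is a minimal illustration, assuming the standard textbook form of Hamilton’s rule (an altruistic act is favored when r × B > C) and the usual coefficients of relatedness for full siblings (1/2) and first cousins (1/8). The function and dictionary names are made up for the example and appear nowhere in the original discussion.

```python
from fractions import Fraction

# Assumed textbook coefficients of relatedness (r) for an outbred, diploid species.
RELATEDNESS = {
    "full_sibling": Fraction(1, 2),
    "child": Fraction(1, 2),
    "nephew_or_niece": Fraction(1, 4),
    "first_cousin": Fraction(1, 8),
}


def inclusive_benefit(kind, count, benefit_each=1):
    """Return r * B: the benefit to `count` relatives, weighted by relatedness."""
    return RELATEDNESS[kind] * count * benefit_each


def hamilton_favors(kind, count, cost=1, benefit_each=1):
    """Return True when Hamilton's rule r * B > C holds strictly."""
    return inclusive_benefit(kind, count, benefit_each) > cost


# Cost is one's own life (C = 1); each relative saved counts as B = 1.
for kind, count in [("full_sibling", 2), ("full_sibling", 3),
                    ("first_cousin", 8), ("first_cousin", 9)]:
    print(kind, count, inclusive_benefit(kind, count), hamilton_favors(kind, count))
```

With cost and benefit both counted in lives, two siblings or eight cousins only break even (r × B equals C exactly), so under these assumptions the rule strictly favors the sacrifice only from three siblings or nine cousins upward.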

Most of us are not murderers, but then most of us have never been in a situation where it would be in our interest to commit murder. The really disturbing thing is that there is much evidence that ordinary people can become monsters as soon as the situation changes. Science gives us the Stanford Prison Experiment and Milgram’s experiment on obedience to authority; history gives us even more disturbing facts about how many soldiers commit atrocities in wartime. Of the soldiers who came from societies where atrocities are frowned on, most must have seemed perfectly normal before they went off to war. Probably most of them, if they’d thought about it, would have sincerely believed they were incapable of doing such things.

This makes a frightening amount of evolutionary sense. There’s reason for evolution to, as much as possible, give us conditional rules for behavior so we only do certain things when it’s fitness increasing to do so. Normally, doing the kind of things done during the Rape of Nanking leads to swift punishments, but the circumstances when such things actually happen tend to be circumstances where punishment is much less likely, where the other guys are trying to kill you anyway and your superior officer is willing to at minimum look the other way. But if you’re in a situation where doing such things is not in your interest, where’s the evolutionary benefit of even being aware of what you’re capable of?

  • mnb0

    “Too many people seem to believe that if someone is the hero of their own story and genuinely believes they’re doing the right thing, they can’t really be evil.”
    Those people should watch movies like M and A Clockwork Orange. Reading some books of Lawrence Sanders and Ruth Rendell also might help.

  • jamessweet

    Most of us are not murderers, but then most of us have never been in a situation where it would be in our interest to commit murder. The really disturbing thing is that there is much evidence that ordinary people can become monsters as soon as the situation changes.

    I was re-watching the first season of Breaking Bad recently, and there was something that struck me, and I’m pretty sure I had the same reaction on the first viewing as well.

    Let me give a very quick description of the premise of the show, as well as the scenario in question, for anyone who hasn’t watched the show: (Partial spoiler alert, but it only applies to the first two or three episodes) Chemistry teacher Walter White gets diagnosed with terminal lung cancer, and in order to leave a nest egg for his family decides to team up with a meth-dealing former student to cook crystal meth. At the end of the first episode, they are threatened at gunpoint by a couple of mid-level dealers whom they were supposed to sell to, so Walt botches the chemistry in order to create poisonous gas that kills the dealers while he escapes. While planning how to dispose of the bodies, they discover one of the dealers survived. So they tie him up in the basement and… they don’t know what to do, because neither of them wants to be a murderer, but they don’t know what else to do.

    What struck me about it is this: While I seriously doubt I would ever make the series of choices that landed them in that situation, I’m pretty confident that if I ever did wind up in that situation, I would rapidly decide I was going to off the guy, and I would do it within an hour of deciding (Walt and Jesse end up keeping the guy tied up for days in their paralysis over how to handle it). Of course I think the ethical thing to do would be to drop the guy at a hospital and turn themselves in… but I am pretty sure if I’d already gone that far, I would find a way to rationalize killing the guy (“He was trying to kill me, after all! I ‘killed’ him in self-defense, it just didn’t work. Now, I either ruin my life — because he attacked me! — or I subject this guy to needless torture by keeping him alive in my basement, or I just kill him. Gotta do what you gotta do…”)

    As to actually doing it… Well, I know a baby squirrel is not the same as a human, but see I have two cats and two dogs, and in the summer months they will occasionally maim-but-not-kill some poor small animal, and I have to “finish the job”. I hate it… it is just not in my nature to bludgeon a small animal to death. If I think about what I am doing, I can’t do it.

    But I found that if I just rehearse the actions in my head, devoid of context — I don’t think, “I’m going to club this dying chipmunk with the butt end of this ax,” instead I think, “I am going to raise this ax I am holding a few feet, I am going to swiftly bring it down in approximately this area in front of me, and I may have to repeat that action of my arms if necessary” — then it becomes pretty easy. I just focus on the physical movements rather than the intention of what I am doing.

    I don’t think it would necessarily be any different with a human being. If I thought about the fact I was killing a person, it would be impossible. But you reduce it to actions. You reduce it to a process. You divorce the process from any broader meaning.

    The key is deciding that you want to. As you said, most of us simply don’t find ourselves in a situation where murder is in our best interest. If we did, many of us would have very little difficulty rationalizing it, just as we rationalize away all sorts of unethical choices we make on a day to day basis. And then it’s just another small step…

    • chrisj

      So let me get this straight: they deliberately killed some people, and then wibbled about killing one more (who should have died with the first batch) because they didn’t want to be murderers?

      Because if intentionally killing people with poison gas isn’t murder, I’m not quite sure what is. I can believe there might well be a big psychological step between something like poison and physically bludgeoning someone, but anyone who thinks that poisoning people is OK is already several steps beyond acceptable.

      • Stevarious

        Very very few bad people actually think of themselves as bad people.

      • Darren

        It was more subtle than that. The first instance could credibly have been described as self-defense.

        Two men kidnap you at gunpoint. You manage to kill one of them and wound the second. The wounded man is no longer an immediate threat, but should he survive he will surely seek revenge on you and your family. Do you then shoot him in the back as he crawls slowly along the ground?

        The legal distinction is clear, the ethical one less so.

  • Nomen Nescio

    But if you’re in a situation where doing such things is not in your interest, where’s the evolutionary benefit of even being aware of what you’re capable of?

    that was a key insight, to me. i suspect most people actively don’t want to be aware of what they’re capable of, because it’s a thoroughly unpleasant subject to even contemplate. it goes against the grain of fitting into peaceful society, which most of us do want to, most of the time at least.

    there’s a mental exercise among gun nuts (i’m one) that’s recommended for people contemplating getting a concealed weapons permit — simply to mentally roleplay what sorts of scenarios you might need to use a gun in, and decide for yourself if it’s really possible that you would shoot someone dead. to find out, even if only through introspection and speculation, what it would take for you to deliberately kill somebody — or if that handgun you might be contemplating carrying would just end up being something for J. Random Mugger to steal off you.

    thing is, suggesting that sort of mental exercise to “normal” non-gun nuts who aren’t considering carrying a handgun, and suggesting it seriously as something that might be useful for discovering one’s own personal limits and abilities, can get random gun nuts like me some extremely odd looks. it’s as if the whole experiment was infringing some sort of social taboo, not just on killing but on discovering the limits around deadly violence for oneself.

    and in retrospect, it does make sense to me that this might be not selected for, or even selected against, in the evolution of a social, cooperative animal. most of us will never kill, or even come close to needing to — thankfully. why risk wasting energy on something that might make us even potentially more likely to?

  • Kyle Ewan

    Others who thought they were doing the right thing, but who set in motion operations that killed MILLIONS in their attempt to eliminate religion, were Lenin, Trotsky, and Stalin.

    And, YES, they were doing it BECAUSE they were atheists and wanted to eliminate religion.

    Hitchens is another example of an atheist who was Gung Ho for war.

    And he thought it was the right thing.

    • Stevarious

      And, YES, they were doing it BECAUSE they were atheists and wanted to eliminate religion.

      Citation needed.

  • http://rockstarramblings.blogspot.com/ Bronze Dog

    Others who thought they were doing the right thing, but who set in motion operations that killed MILLIONS in their attempt to eliminate religion, were Lenin, Trotsky, and Stalin.

    Except they were doing it to spread Communism, not atheism. Authoritarians don’t like their ideas to compete against others. It’s the same reason why religions burned heretics instead of letting them argue their position.

    Of course, we only advocate eliminating religion by 1) preventing governments from endorsing any religion and 2) free speech. People can believe or say what they want, and we can criticize them if we want.

  • http://angramainyusblog.blogspot.com/ Angra Mainyu

    Most of us are not murderers, but then most of us have never been in a situation where it would be in our interest to commit murder.

    Are you counting ‘our interest’ in terms of reproductive success, or in terms of what we actually want, or what we would want (given sufficient information), or something else?
    If it’s reproductive success, our propensities reflect conditions in the ancestral environment, rather than maximization of reproductive success on specific cases, no matter how unusual.
    So, I suppose humans might have a propensity to commit murder if responding in that manner to certain environmental cues would have been on average conducive to reproductive success in the ancestral environment.
    If it’s about what we want or would want, I see no good reason to suppose that it would ever be in our interest to commit murder – not for all of us, anyway. That would depend on our mental makeup.
    If it’s about something else, I don’t know. I’d like to know more about how you measure self-interest.

    There’s reason for evolution to, as much as possible, give us conditional rules for behavior so we only do certain things when it’s fitness increasing to do so.

    I agree, in the sense that responding in a certain fashion to certain environmental cues would have been on average conducive to reproductive success in the ancestral environment.

    But if you’re in a situation where doing such things is not in your interest, where’s the evolutionary benefit of even being aware of what you’re capable of?

    There may well be an overall benefit in being able to assess how one would react under different circumstances, by means of considering different hypothetical scenarios. That may give an advantage when deciding what course of action to take.
    Also, being able to assess, with at least some confidence, how others would react under certain circumstances, may well be useful. But an ability to predict the behavior of humans in hypothetical scenarios can be applied to oneself, at least to some extent, unless there is some mechanism that actively blocks that (which might or might not be the case, but I’m saying that the matter is pretty complicated).

    • http://angramainyusblog.blogspot.com/ Angra Mainyu

      Some further comments:

      So, I suppose humans might have a propensity to commit murder if responding in that manner to certain environmental cues would have been on average conducive to reproductive success in the ancestral environment.

      I say ‘might have’ because it’s not clear that they would, either, since such situations would have had to happen frequently enough to result in selection pressure.
      But even if that was the case, if humans also evolve a moral sense that assesses that such behavior is murder (and, in particular, immoral), then there is also a counter-propensity not to behave in such fashion.
      If so, then different propensities might have evolved at different times in our past, etc.
      On the other hand, if there isn’t such an evolved moral sense, then what you get is (probably) some metaethical consequences (namely, some kind of antirealism), and then that raises an assortment of other issues.

  • Art

    Japanese behavior in Nanking was both a crime of opportunity, allowed in part by the ingrained belief that the Chinese were sub-human, and a strategy in and of itself. Terror drove the population before them, and this human wave made defense by Nationalist forces more difficult by clogging roads, consuming resources, and spreading disease.

    The Japanese claimed, variously, that such tactics were justified by their being massively outnumbered by the local population, and, of course, that any suffering inflicted would be made good and repaid many times over once the peace and prosperity of the benevolent Japanese influence was established and in full control.

    A classic case of one of the great moral questions: How much evil is justified to gain a greater good?

    Religions play the same game, torturing and maiming the heretic’s body to ‘save their immortal soul’. Like the assumption that Japanese rule would pay off some time in the future, religion offers paradise only at a much later time. In their case, a hypothetical hereafter.

    If the Catholic Church is taken at their word, the ultimate good is drop-kicking souls through the goalposts so those souls can spend eternity in heaven. The offered alternative is eternity in Hell. Purgatory was an option until it was doctrinally made to not exist any more. If you really, I mean really, really, believe that it is either heaven or hell, and the church is the only way to make it through those holy goalposts, then all the lies, manipulations, torture, rapes, exploitation, kiddie diddling, and corruptions are just footnotes about collateral damage during a Manichean struggle over souls.

    It is a fine story. A strong story. A story so grand that it entirely overshadows all the evil they have done. At least it does if you buy into it. If you buy into the whole heaven and hell thing, and believe in souls, and eternity, and salvation, then it all makes sense and the greed and lust for power are just means to a higher end. A down-payment in iniquity for eternal happiness delivered at a later date.

    If, on the other hand, you see the greed and lust for power and wealth as the primary goal, and the fight for souls is just a fairy tale story designed to allow easier access to venal ends then none of the crimes committed by the church may be excused.

    The difference between saint and monster is the amount of traction the story you use as your excuse gets you. No crime is too small to avoid punishment. But given the right story you can justify any crime.

    Judgment can be both pass/fail, and be graded on a curve defined by how good your standup routine is.

    • http://angramainyusblog.blogspot.com/ Angra Mainyu

      Just a quick clarification:

      Purgatory was an option until it was doctrinally made to not exist any more.

      Purgatory wasn’t eliminated from their doctrine (Limbo was).

      • http://rockstarramblings.blogspot.com/ Bronze Dog

        Wait… Yeah, I guess that’s right. I had the two stuck in the same place in my mind, like Art might have.

        Purgatory is where sins get purged, hence the name, and Limbo was the “meh” place where unbaptized babies went when the popes decided auto-damnation was bad for public relations.

  • Pierce R. Butler

    … we might sacrifice ourselves for two brothers or eight cousins.

    I never quite understood the logic of this: personally, it comes out as a total loss, and genetically it’s (at best) break-even.

    Of course, such calculations make it obligatory to throw oneself at the menace of the moment to protect three siblings or nine cousins (which declaration comes quite easily from an only child with only four cousins…).

    • Darren

      Not what we might choose, but what our Machiavellian genes might choose for us.

      The idea being that our genes, as one strategy among others, might ‘choose’ to favor self-sacrifice, but it only works if the self-sacrifice is privileged towards kin (as kin carry a greater or lesser amount of the same genetics as us). Thus, I am more likely to throw myself at the hungry tiger to save the life of my brother’s child than my cousin’s child. Still not as likely as I would be to protect my own child, but certainly much more so than for Ogg the Unrelated’s child.

      The numbers can be deceptive: genes are not so good at counting, but they are pretty good at recognizing kin (or rather, humans appear to have developed better faculties for recognizing and factoring kinship into our decisions than for factoring in numbers). Thus, I will save my own child at the expense of a dozen cousins, even though genetically it might be a better payoff to save the cousins. But I might not save my child at the expense of my entire tribe.

