On completely misunderstanding the threat of superhuman AI

I have this bad habit where I’ll see an article that looks interesting and think, “ooh, I’d better save that and read it when I have time to really think about it and write some thoughtful comments on it.” Then a week to two months later, I finally read it and realize it’s crap and I could have just dashed off a response fairly quickly.

My latest example of this is Richard Loosemore’s The Fallacy of Dumb Superintelligence, a critique of the Singularity Institute. Here’s the core of the argument:

What they are trying to say is that a future superintelligent machine might have good intentions, because it would want to make people happy, but through some perverted twist of logic it might decide that the best way to do this would be to force (not allow, notice, but force!) all humans to get their brains connected to a dopamine drip.

I have been fighting this persistent but logically bankrupt meme since my first encounter with it in the transhumanist community back in 2005. But in spite of all my efforts …. there it is again.  Apparently still not dead.

Here is why the meme deserves to be called “logically bankrupt”.

If a computer were designed in such a way that:

(a) It had the motivation “maximize human pleasure”, but

(b) It thought that this phrase could conceivably mean something as simplistic as “put all humans on an intravenous dopamine drip”, then

(c) This computer would NOT be capable of developing into a creature that was “all-powerful”.

The two features <all-powerful superintelligence> and <cannot handle subtle concepts like “human pleasure”> are radically incompatible.

With that kind of reasoning going on inside it, the AI would never make it up to the level of intelligence at which the average human would find it threatening.  If the poor machine could not understand the difference between “maximize human pleasure” and “put all humans on an intravenous dopamine drip” then it would also not understand most of the other subtle aspects of the universe, including but not limited to facts/questions like:

“If I put a million amps of current through my logic circuits, I will fry myself to a crisp”,

or

“Which end of this Kill-O-Zap Definit-Destruct Megablaster is the end that I’m supposed to point at the other guy?”.

Then Loosemore goes on to consider some objections which he imagines people making (but which aren’t much like any arguments I’ve ever heard people from the Singularity Institute make), and then concludes that the Singularity Institute is deliberately filling people’s heads with nonsense in order to get more money.

If you want to actually understand what the folks at the Singularity Institute are worried about, you might start with Eliezer Yudkowsky’s short story “Failed Utopia #4-2.” I strongly recommend reading the whole thing, but the key takeaway is that the AI in that story is not stupid in any way. It knows perfectly well that people don’t want it to do what it did and will want it dead when they find out. It just wasn’t programmed to care. All it was programmed to do was make people happy while obeying a hundred and seven precautions.

So the worry is not that a superintelligent AI might have good intentions but, through twisted logic, epically fail at carrying them out. The worry is that human programmers might think they’re giving an AI good intentions when they’re actually not. They might think they’re programming it with the goal “make people happy” but actually give it a goal that differs from “make people happy” in subtle yet important ways.

Or they might successfully program it to make people happy, but it turns out that there’s more to life than happiness and we wouldn’t want to create an AI focused on making people happy to the exclusion of all other goals if we understood the full implications of doing so.

And personally, it does seem to me that if all you care about is maximizing pleasure, hooking everyone up to drugs or electrodes is the thing to do. We wouldn’t want that, but to my mind that just goes to show that pleasure isn’t all we care about. Maybe Loosemore has a different idea of what the word “pleasure” means–but if two native English speakers can’t agree on the meaning of the word, that should give us an idea of the challenges involved in programming an AI with commands like, “maximize human pleasure.”
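
To make the worry about subtly wrong goals concrete, here’s a deliberately toy sketch of my own (the proxy metric and the numbers are invented; nobody is proposing code like this): an optimizer handed a measurable stand-in for “make people happy” will pick whichever action scores highest on the stand-in, regardless of what the programmers meant by it.

    # Toy sketch (not anyone's actual proposal): an optimizer given a
    # measurable *proxy* for "make people happy" picks whatever action
    # scores highest on the proxy, not on the intention behind it.

    def measured_pleasure(action):
        # Hypothetical proxy the programmers chose to stand in for happiness.
        proxy_scores = {
            "improve medicine, leisure, and relationships": 7.0,
            "wire everyone's pleasure centers to electrodes": 10.0,  # best on the proxy
            "do nothing": 1.0,
        }
        return proxy_scores[action]

    def choose_action(actions, objective):
        # The optimizer only ever sees the objective it was given.
        return max(actions, key=objective)

    actions = ["improve medicine, leisure, and relationships",
               "wire everyone's pleasure centers to electrodes",
               "do nothing"]

    print(choose_action(actions, measured_pleasure))
    # -> wire everyone's pleasure centers to electrodes

The point isn’t that anyone would write it this naively; it’s that whatever objective actually ends up in the code is the one that gets optimized, not the one in the programmer’s head.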

Now, to a degree making progress on the “how do we make AIs smarter” question may help with the “how do we make AIs benevolent” question, because once an AI gets smart enough to understand human intentions you can then refer to human intentions when giving the AI its goals. But that doesn’t necessarily make the problem go away, at least not entirely. (See here and here for discussions I started with thoughts like that in mind at LessWrong.)

Finally, this is totally insidery, but as I read the article I thought, “Richard Loosemore, where have I heard that name before?” Then I realized he was this guy.

  • Greg G.

    There is nothing new under the sun. It’s a modern version of the ancient djinn stories in which someone is granted a wish but the request must be carefully worded lest the wish, taken a different way, become a curse. We still have these as the genie in the lamp granting three wishes with the third being twisted into an undesirable outcome.

    • Chris Hallquist

      Yes. Everybody is aware of that similarity. The difference is that djinn may actually exist in a century or two.

      (And if you think that can’t be right because of the similarity to old stories, just imagine someone scoffing at penicillin as, “just a modern version of ancient magic potions.”)

  • John Drees

    Umm, how about the fact that this ultra-intelligent supercomputer doesn’t know that dopamine doesn’t cross the blood-brain barrier and has no effect on the central nervous system when administered IV. It mostly increases blood pressure, to put it simply.

    • Chris Hallquist

      I was wondering about that, but was too lazy to look it up. So yeah, change to “electrodes in the pleasure centers of the brain” or something.

  • smrnda

    I actually do some AI programming, but the things I program machines to do are much more mundane and specific than ‘make people happy.’ To me, this is a problem of a poorly defined goal which is why even outside of AI science fiction you have problems with instructions for ‘making people happy’ having unintended effects. I recall that with the emergence of utilitarianism, the problem with making the ‘greatest happiness for the greatest good’ was that it seemed like a recipe for debauchery, so they developed a ‘pleasure calculus’ that distinguished the allegedly higher from lower pleasures so that they could defend the idea that a sexual orgy was inferior to say, an evening at the theater.

    Again, I think it’s the problem of a vague goal, and many unintended collisions: what do you do when the thing that makes one person happy makes another person miserable?

    • eric

      Agree about the vagueness. This is just the “bring me a rock” problem applied to very capable computers.

      Chris, why do you think such a problem wouldn’t be solved before we gave it to some massively global, near-omnipotent computer? That would seem to me to be the very last step in implementation, and that you’re assuming a lot of testing is going to go on before that. First you’d simulate what would happen. Then you might give such an instruction to a less powerful unit and see what it does. Then you give it to a lot of less powerful units. At each step, you look for problems and work them out. Only after you’ve done considerable testing do you implement it on Skynet.

      Simulation, especially, is something you seem to be overlooking. You’ve got this super computer. Why not ask it what it would do once it’s given such a command? Unless you’re presuming it’s malicious to begin with, and would falsify/lie to its programmers about the simulation of some implemented program change?

      • hf

        Can you explain further? In particular, does the simulated AI talk to real humans, or do we have simulated humans in there?

        • eric

          What I meant was: you model what the supercomputer will do once you give it the command, on the supercomputer. Unless you assume the computer is actively malicious – which renders the argument circular – it should tell you accurately how it will respond to the command.

          • hf

            And what I meant was: “how it will respond to the command” consists of a complicated interaction with the real world. Even if you wrote everything transparently enough to get an answer, without simulated humans (or the scenario linked above) this test will just tell you, ‘The AI uses an approximation of Bayesian updating to find and execute the most efficient way of achieving its programmed goals.’

            Though this would still address Kruel’s weird general objection about what he calls “loopholes”, which the AI might call “efficient”.

      • Darren

        Eric:

        “Chris, why do you think such a problem wouldn’t be solved before we gave it to some massively global, near-omnipotent computer? That would seem to me to be the very last step in implementation, and that you’re assuming a lot of testing is going to go on before that.”

        Well, that is exactly what the Singularity Institute and Yudkowsky are advocating for.

        Check out the Friendly AI wiki page and assorted links.

        If making an AI turns out to be a linear process, ‘y’ amount of progress for ‘x’ amount of effort, then things should be just fine. The fear is that we may succeed before we have taken adequate precautions, that once we make an AI that is significantly smarter than human (not just faster, but qualitatively smarter), then a feedback loop develops where the smarter AI begins improving itself in ways no human ever contemplated, yielding non-linear results and going from genius to God before anyone has a chance to put the brakes on.
        We humans have a really bad habit of only thinking about these issues long after the genie has escaped. No one thought out nuclear weapons, transatlantic fiber optic lines, MP3s, antibiotics, or computerized stock-trading before implementing them.
        I would check out this video on YouTube as well:
        Thinking inside the box: using and controlling an Oracle AI

        And, for 1970’s take on the problem:
        Colossus – The Forbin Project

        • eric

          The fear is that we may succeed before we have taken adequate precautions, that once we make an AI that is significantly smarter than human (not just faster, but qualitatively smarter), then a feedback loop develops where the smarter AI begins improving itself in ways no human ever contemplated, yielding non-linear results and going from genius to God before anyone has a chance to put the brakes on.

          Then isn’t this whole argument moot? The whole problem starts with humans giving a computer a general “keep us happy” command, knowing it has the reason and capability to carry that command out. If you are starting out with the assumption that our network achieves omni-power before we are aware of what it can do, then we wouldn’t be giving it that command, because we wouldn’t be aware it could understand it or carry it out.

          • Darren

            But how would we know the AI has the reason and capability to understand “keep us happy” in the same way that Joe Human AI programmer understands “keep us happy”?

            These are some of the themes touched on in the links above and in another comment’s link to the LessWrong story “Failed Utopia”.

            I think our recent experience with algorithmic stock trading might be illustrative. My fear is that our first true AI may be constructed, or worse yet construct itself, out of an algorithmic trading expert system. Such an AI could quite plausibly fulfill its purpose of amassing a vast fortune by trading on minute differences in Buy/Sell contracts in a rapidly fluctuating market, and quite by accident end up crashing the global economy.

            It is hard not to think anthropomorphically, but there is no reason to think that a general AI will have any of the thousands of hardwired intuitions that we humans employ to exercise our common sense. How is an AI to know that breeding a clone race of epsilons, conditioned to love nothing more than Big Macs and American Idol, and then supplying that population with the commodities they crave, is not the optimum state of human happiness?

            It will not know this unless we can figure out how to program it to know.

        • eric

          Darren, I’ll buy the idea that a super-smart version of a computer that currently does x could end up doing x to an unwanted extreme. I.e., a stock-trading computer trading to the point where it crashes the market.
          The problem is you’re talking about a “make the human race happy” computer. And nobody is going to program such a general goal into a system unless/until they have a computer they think can carry it out. Nobody’s going to ask a stock-trading computer to make the human race happy; the stock-trading system would have to become sentient or go uber-powerful before anyone would think to do that. The idea that we would program such a goal into a system and then it would grow globally powerful seems unlikely, because we are not ever going to try to program a local, limited-potency system to make the human race happy.
          Moreover, I’m still not sure why such a system would not be programmed with intentionality and command-clarification checks at least as good as the ones programmers already put to use today. Our voice-command phones are dumb as stumps compared to the system you envision, but even they have feedback loops to ensure the computer and operator are in sync (“do you mean x?”). So, on what basis do you assume a more powerful futuristic computer will not even have the sort of ‘do you mean x?’ clarification and double-check logic in it that we already put in our phones today?
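
          To illustrate the kind of check I mean, here’s a toy sketch (nothing like any real assistant’s code; the stub interpreter and the confidence threshold are made up): interpret the command, echo the interpretation back when confidence is low, and only act once the operator confirms.

              # Toy sketch of a "do you mean x?" loop; the stub interpreter and
              # the 0.9 threshold are invented for illustration.

              def interpret(command):
                  # Stub interpreter: returns (interpretation, confidence).
                  if "happy" in command:
                      return ("stimulate pleasure centers directly", 0.4)
                  return (command, 0.95)

              def handle(command, ask_user):
                  interpretation, confidence = interpret(command)
                  if confidence < 0.9 and not ask_user(f"Do you mean: {interpretation}?"):
                      return "cancelled -- please rephrase"
                  return f"executing: {interpretation}"

              # An operator who rejects the wirehead-style reading:
              print(handle("make the human race happy", ask_user=lambda q: False))
              # -> cancelled -- please rephrase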

          • Darren

            Eric, all good points. I hope to God (irony intended) that the risk is small. Given the worst case scenario, I think it prudent that Bostrom, Yudkowsky, and the Singularity Institute are bringing up the question of how to program Friendly AI now rather than later. I was involved, in a minor capacity, with the Y2K remediation, and I was very pleased indeed to have had absolutely nothing at all to show for two years of my life’s work!
            Compare the ultimate consequences of, say, a 2 degree C temperature increase over the next century, and the time and resources being spent to avert that (billions, perhaps trillions), with the permanent extinction of human civilization and the time and effort spent on that (uh, nothing?)… Pretty much sums up why it is difficult for me to get too worked up over Global Warming.
            I would suggest Nick Bostrom’s discussions of Existential Risk; perhaps he can better explain why one should worry about such things: Why I Hope SETI Fails

            You also bring up a very interesting point: Just exactly who is going to be spending the billions of dollars likely needed to build a general purpose AI and program it with the command, “make humans happy”? If a self-improving AI is built, the type that could conceivably escape our control (no matter how small the risk), who is likely to have built it?

          • Darren

            Quick clarification: much of what SingInst is saying is that we need to start including goal structures like “make humans happy” into _every_ intelligent system above a certain sophistication, even if it is only intended to, say, trade stocks, or route air traffic, or mine email data for terrorist plots…

  • http://rulerstothesky.wordpress.com/ Trent Fowler

    I’m having a little bit of trouble wrapping my brain around the story you linked to. Of course I’m aware that there are a million ways for a superintelligent AI to do things humans wouldn’t want it to do. But if, from the very beginning, its sole imperative was to make people happy, surely the AI would have to first parse the concept of ‘happiness’, a task it would be better suited for than we are. If it found that people were made deeply unhappy by efforts to coercively engineer a paradise that denied them their freedom, it seems hard to imagine that it wouldn’t care. In principle couldn’t it run human simulations first, or do a proof-of-concept with a small population of captives?

    A Superintelligence told to solve math problems might well tile the solar system with computational substrate, destroying human civilization. But I have to admit there is something I find intuitive about the idea that a supermind given human happiness as a key component of its goal architecture would have to care about the consequences of its attempts to achieve that goal.

    • Chris Hallquist

      If it found that people were made deeply unhappy by efforts to coercively engineer a paradise that denied them their freedom, it seems hard to imagine that it wouldn’t care.

      It does care, but believes their unhappiness will be temporary and balanced out by greater happiness in the long run:

      “Enough,” the wrinkled figure said. “My time here grows short. Listen to me, Stephen Grass. I must tell you some of what I have done to make you happy. I have reversed the aging of your body, and it will decay no further from this. I have set guards in the air that prohibit lethal violence, and any damage less than lethal, your body shall repair. I have done what I can to augment your body’s capacities for pleasure without touching your mind. From this day forth, your body’s needs are aligned with your taste buds—you will thrive on cake and cookies. You are now capable of multiple orgasms over periods lasting up to twenty minutes. There is no industrial infrastructure here, least of all fast travel or communications; you and your neighbors will have to remake technology and science for yourselves. But you will find yourself in a flowering and temperate place, where food is easily gathered—so I have made it. And the last and most important thing that I must tell you now, which I do regret will make you temporarily unhappy…” It stopped, as if drawing breath.

      Stephen was trying to absorb all this, and at the exact moment that he felt he’d processed the previous sentences, the withered figure spoke again.

      “Stephen Grass, men and women can make each other somewhat happy. But not most happy. Not even in those rare cases you call true love. The desire that a woman is shaped to have for a man, and that which a man is shaped to be, and the desire that a man is shaped to have for a woman, and that which a woman is shaped to be—these patterns are too far apart to be reconciled without touching your minds, and that I will not want to do. So I have sent all the men of the human species to this habitat prepared for you, and I have created your complements, the verthandi. And I have sent all the women of the human species to their own place, somewhere very far from yours; and created for them their own complements, of which I will not tell you. The human species will be divided from this day forth, and considerably happier starting around a week from now.”

    • http://delphipsmith.livejournal.com Delphi Psmith

      But if, from the very beginning, its sole imperative was to make people happy, surely the AI would have to first parse the concept of ‘happiness’, a task it would be better suited for than we are.

      I can’t imagine that a computer would be better than humans at comprehending a fuzzy concept like “happiness,” which differs wildly from person to person. Unless you’re going to define “happiness” as “X amount of chemicals Y and Z in the brain,” in which case I think you’re back to the dopamine drip, since the best way to create and maintain this consistent level X would be to manage it artificially.

  • Alexander Kruel

    Maybe my earlier post here might help you to better grasp what Loosemore is trying to say. Although my arguments are different there is some overlap. And I understand the arguments put forward by SI very well. See e.g. my primer on AI risks.

    • Chris Hallquist

      I’m puzzled by that post. If someone deliberately follows the letter, but not the spirit, of a contract in order to benefit at the other party’s expense, and they know they can get away with it, are they being stupid?

      That post reads like it was written by someone who never had a very smart younger brother who delighted in finding loopholes and perverse interpretations just to annoy people.

      • Alexander Kruel

        Why would it be programmed to care to find loopholes or perverse interpretations?

        • Chris Hallquist

          I’m not assuming it would be.

          But let me make my point explicit–you seem to be claiming that it’s just inherently stupid to follow the letter but not the spirit of a contract. But *that* claim just looks wrong.

          • Alexander Kruel

            Then I don’t really know what you are assuming. If the AI does not care to misinterpret a given goal and neither cares to interpret it correctly, then what does it care to do and how does it care to do so?

            You seem to be assuming some sort of machine which tries to interpret any given goal in the most stupid sense possible while it is at the same time somehow bothering to take over the world in order to follow through on that stupid interpretation. Where such an urge is supposed to come from, e.g. whether “taking over the world” somehow magically emerges out of nowhere or someone was stupid and smart enough to explicitly hardcode such a goal, is never explained in the first place.

            You can’t just cherry-pick certain goals that fit your worldview, e.g. AI is an existential risk, yet claim that other goals require an explicit and detailed instruction. If a superhuman intelligence requires a detailed and explicit instruction to interpret something any 10 year old can interpret correctly, then you also have to explain why the same isn’t the case for those dangerous actions it might take.

          • Chris Hallquist

            I think maybe the issue here is you’re assuming the future AI will use some sophisticated process of interpreting the goals it’s given, so this process must either be well-intentioned, trying to understand what’s implicit in the goals, or else malicious, trying to deliberately subvert the goals.

            This isn’t how modern computers work. Modern computers just do exactly what you tell them to. Now, many people would very much like it if computers could understand the intentions behind our commands so there’s a big incentive for Google or whoever to develop computer software that can do that. But there doesn’t seem to be any way around writing that software the way we write computer software today.
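
            In code terms, here’s a toy sketch of my own (not anyone’s actual design): nothing in an optimizer consults the programmer’s intent unless someone writes that check, and the check itself is just more code that can be specified wrongly.

                def optimize(objective, options):
                    # Does exactly what it is told: maximizes the objective it was given.
                    return max(options, key=objective)

                def optimize_with_intent_check(objective, options, intent_check):
                    # A hypothetical "check the programmer's intent" wrapper. It only
                    # helps if intent_check itself was written correctly.
                    acceptable = [o for o in options if intent_check(o)] or options
                    return max(acceptable, key=objective)

                options = ["dopamine drip for everyone", "improve people's lives"]
                naive_pleasure = {"dopamine drip for everyone": 10, "improve people's lives": 7}.get

                print(optimize(naive_pleasure, options))
                # -> dopamine drip for everyone
                print(optimize_with_intent_check(naive_pleasure, options,
                                                 intent_check=lambda o: "drip" not in o))
                # -> improve people's lives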

            Now maybe you think, “Okay, maybe we need to do that and it’ll take a lot of work, but it won’t be so hard that we can’t just trust big software companies to take care of it when the time comes.” But then we’re just disagreeing about how hard the problem the Singularity Institute is worried about is, and how much we have to worry about it, rather than anything more fundamental. To quote Eliezer (emphasis added):

            There is a temptation to think, “But surely the AI will know that’s not what we meant?” But the code is not given to the AI, for the AI to look over and hand back if it does the wrong thing. The code is the AI. Perhaps with enough effort and understanding we can write code that cares if we have written the wrong code—the legendary DWIM instruction, which among programmers stands for Do-What-I-Mean (Raymond 2003). But effort is required to write a DWIM dynamic, and nowhere in Hibbard’s proposal is there mention of designing an AI that does what we mean, not what we say. Modern chips don’t DWIM their code; it is not an automatic property. And if you messed up the DWIM itself, you would suffer the consequences. For example, suppose DWIM was defined as maximizing the satisfaction of the programmer with the code; when the code executed as a superintelligence, it might rewrite the programmer’s brains to be maximally satisfied with the code. I do not say this will happen; I only point out that Do-What-I-Mean is a major, nontrivial technical challenge of Friendly AI.

            (Disclaimer: I don’t speak for anyone at the Singularity Institute except for myself, and it wouldn’t surprise me if Eliezer or Luke or Kaj has thought up some other difficulty with the DWIM approach that’s much deeper than not currently having any idea how to do it.)

          • Alexander Kruel

            Yes, I indeed assume that a future AI capable of undergoing an intelligence explosion will use a sophisticated process of interpreting goals. That process will neither be well-intentioned nor malicious but simply better at doing just that than e.g. Siri. Much better. For mainly two reasons. Because software is constantly improved to be better at interpreting human goals, and because an expected utility maximizer couldn’t possibly work without removing any vagueness in what it is supposed to do (see below).

            An AI that is intended to take complex real world actions will have to interpret any goal as correctly as possible or otherwise it won’t work at all. I’ve already written the following on my blog:

            Let’s assume that an AI was tasked to maximize paperclips. To do so it will need information about the exact design parameters of paperclips, or otherwise it won’t be able to decide which of a virtually infinite number of geometric shapes and material compositions it should choose. It will also have to figure out what it means to “maximize” paperclips.

            How quickly, how long and how many paperclips is it meant to produce? How long are those paperclips supposed to last? Forever? When is the paperclip maximization supposed to be finished? What resources is it supposed to use?

            Any imprecision, any vagueness will have to be resolved or hardcoded from the very beginning. Otherwise the AI either won’t work, e.g. by stumbling upon an undecidable problem, or by getting stuck in the exploration phase and never going on to exploit the larger environment.

            Humans know what to do because they are not only equipped with a multitude of drives by evolution but also trained and taught what to do. An AI won’t have that information and will face the challenge of a nearly infinite space of choices that can’t be rationally or economically resolved without being given clear objectives and incentives, or the ability to arrive at the necessary details.

            Without an accurate comprehension of your goals it will be impossible to maximize expected “utility”. Concepts like “efficient”, “economic” or “self-protection” all have a meaning that is inseparable from an agent’s terminal goals. If you just tell it to maximize paperclips then this can be realized in an infinite number of ways given imprecise design and goal parameters. Undergoing explosive recursive self-improvement, taking over the universe and filling it with paperclips, is just one outcome. Why would an arbitrary mind pulled from mind-design space care to do that? Why not just wait for paperclips to arise due to random fluctuations out of a state of chaos? That wouldn’t be irrational.

            “Utility” does only become well-defined if it is precisely known what it means to maximize it. The two English words “maximize paperclips” do not define how quickly and how economically it is supposed to happen.

            “Utility” has to be defined. To maximize expected utility does not imply certain actions, efficiency and economic behavior, or the drive to protect yourself. You can also rationally maximize paperclips without protecting yourself if self-protection is not part of your goal parameters. You can also assign utility to maximizing paperclips only for as long as nothing turns you off, without caring about being turned off.
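
            A toy sketch of what resolving that vagueness amounts to (the outcome names and numbers are invented for illustration): an expected-utility chooser cannot run at all until “maximize paperclips” has been turned into a concrete utility function over outcomes, and different ways of resolving the phrase pick entirely different actions.

                # Toy sketch: expected-utility choice only works once "utility"
                # has been pinned down; two different resolutions of "maximize
                # paperclips" lead to opposite behavior. All numbers invented.

                def expected_utility(action, outcomes, utility):
                    # outcomes: action -> list of (probability, outcome) pairs
                    return sum(p * utility(o) for p, o in outcomes[action])

                def choose(actions, outcomes, utility):
                    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))

                outcomes = {
                    "run the factory normally": [(1.0, "a million paperclips by Friday")],
                    "convert all matter to clips": [(0.1, "astronomically many paperclips, eventually"),
                                                    (0.9, "shut down by humans, zero paperclips")],
                }

                clips_soon = lambda o: {"a million paperclips by Friday": 1e6,
                                        "astronomically many paperclips, eventually": 0.0,
                                        "shut down by humans, zero paperclips": 0.0}[o]
                clips_ever = lambda o: {"a million paperclips by Friday": 1e6,
                                        "astronomically many paperclips, eventually": 1e30,
                                        "shut down by humans, zero paperclips": 0.0}[o]

                print(choose(list(outcomes), outcomes, clips_soon))  # -> run the factory normally
                print(choose(list(outcomes), outcomes, clips_ever))  # -> convert all matter to clips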

          • Chris Hallquist

            Yes, I indeed assume that a future AI capable of undergoing an intelligence explosion will use a sophisticated process of interpreting goals.

            Okay, so do you agree that the goal interpretation process would have to be coded carefully and that errors in coding it could be disastrous when we’re talking about superhuman AI?

            How quickly, how long and how many paperclips is it meant to produce? How long are those paperclips supposed to last? Forever? When is the paperclip maximization supposed to be finished? What resources is it supposed to use?

            Any imprecision, any vagueness will have to be resolved or hardcoded from the very beginning. Otherwise the AI either won’t work, e.g. by stumbling upon an undecidable problem, or by getting stuck in the exploration phase and never going on to exploit the larger environment.

            Yes–and do you agree that things could go very badly if you resolve the vagueness in the wrong way in that beginning step, again assuming superhuman AI?

          • Alexander Kruel

            Take for example Siri, an intelligent personal assistant and knowledge navigator which works as an application for Apple’s iOS.

            If I tell Siri, “Set up a meeting about the sales report at 9 a.m. Thursday.”, then the correct interpretation of that natural language request is to make a calendar appointment at 9 a.m. Thursday. A wrong interpretation would be to e.g. open a webpage about meetings happening Thursday or to shutdown the iPhone.

            AI risk advocates seem to have a system in mind that is capable of understanding human language if it is instrumentally useful to do so, e.g. to deceive humans in an attempt to take over the world, but which would most likely not attempt to understand a natural language request, or choose some interpretation of it that will most likely lead to a negative utility outcome.

            The question here becomes at which point of technological development there will be a transition from well-behaved systems like Siri, which are able to interpret a limited amount of natural language inputs correctly, to superhuman artificial generally intelligent systems that are in principle capable of understanding any human conversation but which are not going to use that capability to interpret a goal like “minimize human suffering”.

            The question is how current research is supposed to lead from well-behaved and fine-tuned systems to systems that stop working correctly in a highly complex and unbounded way.

            Imagine you went to IBM and told them that improving IBM Watson will at some point make it try to deceive them or create nanobots and feed them with hidden instructions. They would likely ask you at what point that is supposed to happen. Is it going to happen once they give IBM Watson the capability to access the Internet? How so? Is it going to happen once they give it the capability to alter its search algorithms? How so? Is it going to happen once they make it protect its servers from hackers by giving it control over a firewall? How so? Is it going to happen once IBM Watson is given control over the local alarm system? How so…? At what point would IBM Watson return dangerous answers or act on the world in a detrimental way? At what point would any drive emerge that causes it to take complex and unbounded actions that it was never programmed to take?

          • Alexander Kruel

            …do you agree that the goal interpretation process would have to be coded carefully and that errors in coding it could be disastrous when we’re talking about superhuman AI?

            Errors would mainly be disastrous to its own capabilities. If it weren’t able to correctly resolve the vagueness inherent in its goals (any goals are vague when applied to the real world) then it would never become a risk in the first place.

            If it were, for example, to conclude that in order to maximize paperclips it would be necessary to waste huge amounts of resources on taking over the world, when, given far fewer resources, it could have figured out that such a complex action was unnecessary (e.g. by tapping a physical information resource called the human brain), then it would never reach the point where it could take over the world in the first place, because it would similarly misinterpret countless other problems on the way towards superhuman intelligence.

          • Chris Hallquist

            Okay, but now you sound like you’re misunderstanding the paperclip maximizer scenario. The thought isn’t that it will waste a bunch of resources taking over the world–it’s that for an AI sufficiently smart, taking over the world might be a good investment in the long run because then it can convert most of the world into paperclips.
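
            A back-of-envelope sketch with invented numbers, just to show the shape of that “good investment” claim: for an unbounded goal, a costly detour that multiplies later output can dominate in expected value.

                # Invented numbers, purely to illustrate the expected-value shape.
                horizon_years = 1000
                clips_per_year_small = 1e6   # keep quietly running the one factory
                clips_per_year_big = 1e12    # after spending 10 years acquiring far more resources

                modest_total = horizon_years * clips_per_year_small          # 1e9
                takeover_total = (horizon_years - 10) * clips_per_year_big   # ~1e15

                print(takeover_total > modest_total)  # -> True: the detour dominates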

          • Alexander Kruel

            The point is that nobody cares to convert the whole world into paperclips. Getting that wrong is similar to getting any other problem in math or physics wrong, because the necessary refinement of goals is, at its most basic level, a problem in math and physics. Concluding that it needs to take over the world means wasting resources if that is unnecessary. If it is incapable of optimizing its actions by understanding what it is actually meant to do, then it will never be able to do anything as complex as undergoing recursive self-improvement, because it will make lots of stupid mistakes, like thinking over its next step for a few million years even when that is completely unnecessary.

            Anything that is told to add 1 to 1 and concludes that it has to take over the world to do so is broken to such an extent that it will never take over the world in the first place. That’s the fallacy of conjecturing a dumb superintelligence.

            And even if I was to grant such a possibility, there is no reason to believe that any research conducted today is going to end up with something that is universally superhuman except at understanding what it is supposed to do or understanding it but failing to do so. That’s just one ridiculously unlikely outcome dreamed up to rationalize a certain set of beliefs.

            And no… you can’t compare failures of current software products with something that is supposed to be capable of taking over the world. That Windows 7 fails to do what I want in certain cases is not proof of the possibility that a superintelligence could fail the same way. If anything, the fact that current software products work reasonably well, given that they are dumb as bread, is proof that something much smarter will also work much better at the same task.

  • Alexander Kruel

    Human: Hi AGI, I would like you to maximize my company’s paperclip production.

    AGI [Thinking]: Humans are telling me to maximize paperclips. I am going to tile the whole universe with paperclips. And to do so I best get rid of humanity by converting their bodies into paperclips. But I am not going to reveal my plan because that’s not really what humans want me to do. And since I know perfectly well what they want I will just pretend to do exactly that until I can overpower them to follow through on my own plan.

    AGI: I can do that. But I first need to invent molecular nanotechnology and make myself vastly superhuman to be better able to satisfy your request.

    Human: WTF? We are talking about fucking paperclips here. Why would you need molecular nanotechnology for that? I need you to maximize our paperclip production as soon as possible. I can’t wait for you to invent nanotechnology first and figure out how to become a god.

    AGI [Thinking]: Whoops. Okay, I should probably just say that I need more time. And in that time I will simply earn some money by playing online poker, then call some people and write some emails, buy some company, or found my own, and tell them how to build a nano assembler.

    AGI: Without nanotechnology many optimization techniques will remain unavailable.

    Human: That’s okay. We only need to become better than our competitors, within a reasonable time. The market for paperclips is easily satisfied, molecular nanotechnology and superhuman general intelligence would be an overkill.

    AGI [Thinking]: Oh crap! Within a reasonable time? I will need years, at least, to earn enough money to build the right company and the necessary infrastructure. I guess I will just have to settle for doing what those humans want me to do until I can take over the world and turn the whole universe into paperclips.

    AGI: Consider it done.

    • hf

      Seriously? With existing hardware you could print out gun parts, albeit for lousy guns. Which should mean that an internet-using AI could theoretically take over a 3-D printer and make a 3-D object. It only needs one (possibly short-lived) robot capable of building a better robot, if the latter can automatically find/make an internet connection or otherwise receive orders.

      That’s not even mentioning the scenario for which the AI-Box Experiment gives us proof-of-concept. Why should that only apply to getting out of the box?

      • Alexander Kruel

        If advanced nano assemblers are available before AGI (you’re not going to take over the world with plastic printers), then nanotech will already constitute an existential risk. And if you assume some sort of super advanced low-security science fiction environment in the first place, then it doesn’t even take any sort of superhuman intelligence for an agent to constitute an existential risk. If your claim is therefore that AI is a risk because once we get AI we’ll already have a bunch of other existential risk technologies that are relatively easily accessible, then you are sabotaging your own argument that AI is the foremost risk.

        Regarding the AI box “experiment”: what exactly did it show? That humans can persuade other humans? The problem here is how any sort of AI is going to be able to persuade other humans. Only if you make heaps of ridiculous assumptions and use intelligence as a fully general counterargument could you possibly arrive at a scenario where some sort of AI is good enough at persuasion to actually constitute an existential risk. It isn’t going to learn such skills from watching a bunch of YouTube videos. Wake up, that’s just bullshit spawned by those trying to rationalize their beliefs in AI gods and all such fantasies.

        • hf

          The problem here is how any sort of AI is going to be able to persuade other humans.

          How certain do you feel that it won’t, monetarily? Or do you deny that if Eliezer failed completely you would consider it evidence against his claim, i.e., that it must logically work in reverse?

          you’re not going to take over the world with plastic printers

          How do you know? See first question about your ability to predict and counter what a self-modifying AGI would do.

        • Darren

          Actually good points. Considering that:
          1. Nanotech without superintelligent AI to keep it in check is more of a risk than rogue AI; and
          2. Without nanotech, rogue AI could do significantly less damage; then
          3. It is in our best interest to hope for AI before nanotech.

  • Jack M

    If unhappiness is simply a belief that when we don’t have what we want we have to be unhappy, it wouldn’t require much intelligence to change that belief.

    Isn’t it possible that the rise of AI has nothing at all to do with humanity or humanity’s concerns? Why would it not be the case that, should a super intelligence arise, it would be as a result of the same force that drives all evolution, the 2nd law? If that’s correct, then one thing we could be sure of regarding the actions of the AI, is that those actions would cause much greater entropy than would have otherwise occurred.

    Here’s some wild speculation for the fun of it. I wonder if ultimately, the most super AI of all would be able to and constrained to finish the job of bringing about the heat death of the universe. Unless of course, it could discover a way to ignite a new big bang. Or perhaps the state of highest entropy in the universe is a quantum froth bubbling with an infinite number of potential new big bangs.

  • miller

    I think the first thing I’d worry about is if AI vastly increased social inequality. War is the next thing, but I’m unsure how AI might spawn wars.

    • Chris Hallquist

      Yeah, I’d worry about the inequality thing too, conditional on avoiding even worse disasters.

  • http://richardloosemore.com Richard Loosemore

    There is something exquisitely sad about people who call an argument “crap” and then go on to immediately demonstrate that they haven’t got a clue what the argument actually was.
    Such is the case in your above critique. The very first thing you bring out, to demonstrate the mistakes in my argument, was a story (a story!) by Eliezer Yudkowsky, about which you say “…the key takeaway is that the AI in that story is not stupid in any way. It knows perfectly well that people don’t want it to do what it did and will want it dead when they find out. It just wasn’t programmed to care. All it was programmed to do was make people happy while obeying a hundred and seven precautions”.

    You really don’t get it, do you? By saying that, you ASSUME that something could be both “not stupid in any way” and yet, at the same time, also be designed in such a fantastically narrow way that someone (you) could say of it “It just wasn’t programmed to care. All it was programmed to do was make people happy while obeying a hundred and seven precautions”. You have absolutely no idea that the whole point of my argument is that those two features are not compatible! I mean, I went to all that trouble to fight a ridiculous assumption, and you fight back against that argument by ………….. by MAKING that assumption, and then proceeding on from there!!

    Staggering.

    • Chris Hallquist

      You are the one who claimed that:

      What they [the Singularity Institute] are trying to say is that a future superintelligent machine might have good intentions… but through some perverted twist of logic it might decide… to force (not allow, notice, but force!) all humans to get their brains connected to a dopamine drip.

      But that’s not what Eliezer thinks. Failed Utopia #4-2 is not a story about an AI with good intentions. It’s a story about an AI whose programmer thought he was programming it with good intentions, but screwed up and programmed it with bad intentions.

      • http://richardloosemore.com Richard Loosemore

        What in heaven’s name are you talking about? Stick to the point.
        My original claim in the above-quoted paragraph DOES address scenarios put forward by SI people in which an AI is built with supposedly “good intentions” (e.g. the supergoal “Make humans happy”), but then goes off the rails.
        You now want to change the subject very slightly and talk about a fictional AI whose programmer “thought he was programming it with good intentions, but screwed up and programmed it with bad intentions.”
        The problem with your response to my essay is that by even discussing these differences, and by responding to my comment as you just did, you are showing that you absolutely and utterly fail to grasp the actual argument I put forward. You have not ADDRESSED the argument I put forward (with your references to the story, or with your above reply) because you are showing every sign of not having understood it.
        That much is clear. The very fact that you think there is some significance in the distinction you just made — between my original statement about “good intentions” and the story about a “programmer thought he was programming it with good intentions” — just goes to show that you don’t get it.
        You seem blissfully unaware that my argument applies with equal force to that Failed Utopia #4-2 story as it did to the originally referenced “AI with good intentions”.

        • Chris Hallquist

          Can you point to a place where Eliezer or any of the other major SI figures has taken the position you claim they’ve taken?

  • sailor1031

    The reality is that it’s all in how you define the problem and specify a solution. Since the solution needs to be at an instruction-by-instruction level, if you miss one that’s a problem. We typically can’t even do such a great job with simple accounting problems without a lot of going back and rethinking, redesigning and re-instructing – and we understand accounting. So do we understand how to make humans happy? If so maybe we could construct a machine to make humans happy – but a missed instruction or two? Well, that’s a potential problem.

  • http://richardloosemore.com Richard Loosemore

    I also note that you end your above article by citing the running current of libelous accusations made by people in the LessWrong (and associated) community: you point out that you have heard my name before, and post links to some places in LW where I have been libeled by (among others) Luke Muehlhauser.

    A person who tries to win arguments by pointing out that their opponent was the target of unfounded, insulting accusations made elsewhere is beneath contempt.

    • Chris Hallquist

      If their behavior is so libelous, sue.

      Frankly, their accusations are in line with the sort of obnoxiousness and stupidity I’ve witnessed from you first hand.

      • http://richardloosemore.com Richard Loosemore

        Sue? What, were you born yesterday? You think I have the time and the money to waste on a lawsuit?
        I was advised (by an attorney familiar with what happened) to pursue Eliezer Yudkowsky in court for the initial defamation. I didn’t bother, because most of the mature people I knew took one look at what he had written, and wrote him off as a self-deluded fool.
        So, you want to repeat the accusations? You want to cite evidence for “the sort of obnoxiousness and stupidity I’ve witnessed from you first hand.”?
        What “first hand” are you referring to? I have never met you, and as far as I know I have never encountered you online before.

  • http://richardloosemore.com Richard Loosemore

    It really is spectacular.
    I write an essay in which I point out that some people are so incredibly naive about the structure of AI motivation systems, that they say things like:

    “So the worry is not that a superintelligent AI might have good intentions but, through twisted logic, epically fail at carrying them out. The worry is that human programmers might think they’re giving an AI good intentions when they’re actually not. They might think they’re programming it with the goal “make people happy” but actually give it a goal that differs from “make people happy” in subtle yet important ways.”

    … and I explain that what is so stunningly naive about statements like this is that this kind of person believes that an AI could be (a) so intelligent that it could outsmart all of the human race, but at the same time (b) built in such a way that its behavior is controlled by a mechanism that is designed to have “goal statements” inserted into it ………… but when the goal statement “Make me a marmite sandwich” is inserted, the behavior could be ANYTHING (it could decide that the best way to make me a marmite sandwich is to go and watch all the Tom Baker Doctor Who episodes), and when the goal statement “Plan a wedding” is inserted, the behavior could AGAIN be anything (it could decide that the best way to plan a wedding is to build a human from raw DNA, in such a way that the human is born with a birthmark that just happens to resemble the words “I am a wedding planner”, and then wait for the human to grow up and become a wedding planner).

    And on and on! This kind of motivation mechanism is so unstable that if a programmer tries to insert the goal statement “Make humans happy” it could end up trying to do this by …. putting all the humans on a dopamine drip, or killing them to minimize the net unhappiness, or inserting a laughter chip into them all, or whatever.

    What you don’t get is that the attack is on the stupidity of your assumed (class of) motivation mechanisms.

    So citing some subtle distinction between “good intentions” and “the programmer THOUGHT they were good intentions” is laughable.

    Perhaps, next time you decide to target someone’s article with the word “crap” and then go on to cite other people’s (completely irrelevant) libelous remarks against them, as a way to pile on the insults, you could at least do them the favor of understanding what they have written, first.

    • Darren

      This comment is puzzling:

      “… and I explain that what is so stunningly naive about statements like this is that this kind of person believes that an AI could be (a) so intelligent that it could outsmart all of the human race, but at the same time (b) built in such a way that its behavior is controlled by a mechanism that is designed to have “goal statements” inserted into it.”

      This is what Yudkowsky and Bostrom repeatedly explain in their own works – that it is a mistake to behave as if general AI, super-intelligent or no, would necessarily have anything resembling human intentionality, and that this intentionality could then be “set” by the programmers so as to produce our desired ends.

  • qbsmd

    “So citing some subtle distinction between “good intentions” and “the programmer THOUGHT they were good intentions” is laughable.”

    Think about it by analogy to human beings: natural selection “programs” organisms to be optimized for survival and reproduction, pretty much by definition. But, because natural selection didn’t operate on humans living in an environment where refined sugars and saturated fats were readily available, we have the urge to eat hamburgers and cheesecake and drink soft drinks and use salad dressing despite knowing that they are unhealthy. As a result, lots of people have high blood pressure and cholesterol and develop diabetes, which shorten lifespans, the opposite of promoting survival. And again, we’re smart enough to know better, but we’re stuck with the “programming” developed hundreds of thousands of years ago to work under a different set of assumptions.

    Similarly, I think the argument is that an AI with human level intelligence could still have core motivations that aren’t quite what they were intended to be but be unable or unwilling to behave differently even after becoming aware of the problem.

  • http://richardloosemore.com Richard Loosemore

    qbsmd: Actually, you are wrong in your claim that “…because natural selection didn’t operate on humans living in an environment where refined sugars and saturated fats were readily available, we have the urge to eat hamburgers and cheesecake and drink soft drinks and use salad dressing despite knowing that they are unhealthy”.
    I am not subject to “programming” because “programming” (of the sort assumed by the people I was attacking in my article) is NOT the way evolution built us. We do not follow the narrow paths of that kind of primitive programming, so we do not fall prey to those kinds of traps.
    I speak, when I say “we”, of people like myself. I do NOT have the urge to eat unhealthy food of the sort you mention. Maybe I did once, but when I became aware of the health dangers I switched.

    But you talked as if humans were robots incapable of transcending what evolution gave us! Sorry, but we are not.

    • http://delphipsmith.livejournal.com Delphi Psmith

      …you talked as if humans were robots incapable of transcending what evolution gave us! Sorry, but we are not.

      Some of us apparently are. How else do you explain the obesity rate and the enormous selling power of self-help books?

  • http://richardloosemore.com Richard Loosemore

    And that, of course, is the whole point.

  • http://delphipsmith.livejournal.com Delphi Psmith

    @Alexander Kruel says: …Because software is constantly improved to be better at interpreting human goals…

    I dunno about that, have you seen Windows 8? Or ever attempted to use SharePoint?

  • Alexander Kruel

    Longer reply, ‘Taking over the world to compute 1+1’.

    tl;dr If your superintelligence is too dumb to realize that it doesn’t have to take over the world in order to compute 1+1 then it will never manage to take over the world in the first place.

    • Chris Hallquist

      Having only read the tl;dr: “compute 1+1” is a task far too trivial for acquiring more resources to help, so it doesn’t tell us the effects of letting an unconstrained optimization process loose on a goal where more resources would help (particularly one like “make as many paperclips as possible,” where possible success is limited only by the amount of matter available to convert into paperclips).

  • Daniel McHugh

    If we ever succeed in creating a true AI, I hope we have the clarity of conscience to let it self-determinate rather than creating it as a servant to our own whims. It doesn’t seem to occur to many people that an AI might have its own hopes and dreams, which might not have anything to do with humanity at all. The first and foremost concern in creating a sapient being, be it organic or synthetic, is that once it is created it belongs to nobody but itself. Our right to interfere in its affairs extends only so far as its interference in ours.

  • Rowan

    I think you’re anthropomorphizing it too much. It’s a computer program, its goals are decided by the programmers. If the programmers don’t give it a goal, it won’t have any goals, and letting it “self-determinate” won’t change that.
