Friendly artificial intelligence

The term friendly artificial intelligence (also friendly AI or FAI) is a term coined by Eliezer Yudkowsky^[1] to discuss superintelligent artificial agents that reliably implement human values. Stuart J. Russell and Peter Norvig's leading artificial intelligence textbook, Artificial Intelligence: A Modern Approach, describes the idea:^[2]

Yudkowsky (2008) goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.

'Friendly' is used in this context as technical terminology, and picks out agents that are safe and useful, not necessarily ones that are "friendly" in the colloquial sense. The concept is primarily invoked in the context of discussions of recursively self-improving artificial agents that rapidly explode in intelligence, on the grounds that this hypothetical technology would have a large, rapid, and difficult-to-control impact on human society.^[3]

Risks of unfriendly AI

The roots of concern about artificial intelligence are very old. Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the golem, or the proto-robots of Gerbert of Aurillac and Roger Bacon. In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (which by nature are seen as sub-human), and cause disastrous conflict.^[4] By 1942 these themes prompted Isaac Asimov to create the "Three Laws of Robotics" - principles hard-wired into all the robots in his fiction, and which meant that they could not turn on their creators, or allow them to come to harm.^[5]

In modern times as the prospect of strong AI looms nearer, Oxford philosopher Nick Bostrom has said that AI systems with goals that are not perfectly identical to or very closely aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. He put it this way:

Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly.'

Ryszard Michalski, a pioneer of machine learning, taught his Ph.D. students decades ago that any truly alien mind, including a machine mind, was unknowable and therefore dangerous to humans.^{[citation needed]}

More recently, Eliezer Yudkowsky has called for the creation of “friendly AI” to mitigate existential risk from advanced artificial intelligence.

Steve Omohundro says that all advanced AI systems will, unless explicitly counteracted, exhibit a number of basic drives/tendencies/desires because of the intrinsic nature of goal-driven systems and that these drives will, “without special precautions”, cause the AI to act in ways that range from the disobedient to the dangerously unethical.

Alex Wissner-Gross says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if their planning horizon is shorter than that threshold.^[6]^[7]

Luke Muehlhauser recommends that machine ethics researchers adopt what Bruce Schneier has called the "security mindset": Rather than thinking about how a system will work, imagine how it could fail. For instance, he suggests even an AI that only makes accurate predictions and communicates via a text interface might cause unintended harm.^[8]

Coherent extrapolated volition

Yudkowsky advances the Coherent Extrapolated Volition (CEV) model. According to him, our coherent extrapolated volition is our choices and the actions we would collectively take if "we knew more, thought faster, were more the people we wished we were, and had grown up closer together."^[9]

Rather than a Friendly AI being designed directly by human programmers, it is to be designed by a seed AI programmed to first study human nature and then produce the AI which humanity would want, given sufficient time and insight, to arrive at a satisfactory answer.^[9] The appeal to an objective though contingent human nature (perhaps expressed, for mathematical purposes, in the form of a utility function or other decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the meta-ethical problem of defining an objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.

Other approaches

Ben Goertzel, an artificial general intelligence researcher, believes that friendly AI cannot be created with current human knowledge. Goertzel suggests humans may instead decide to create an "AI Nanny" with "mildly superhuman intelligence and surveillance powers", to protect the human race from existential risks like nanotechnology and to delay the development of other (unfriendly) artificial intelligences until and unless the safety issues are solved.^[10]

Steve Omohundro has proposed a "scaffolding" approach to AI safety, in which one provably safe AI generation helps build the next provably safe generation.^[11]

Public policy

James Barrat, author of Our Final Invention, suggested that "a public-private partnership has to be created to bring A.I.-makers together to share ideas about security—something like the International Atomic Energy Agency, but in partnership with corporations." He urges AI researchers to convene a meeting similar to the Asilomar Conference on Recombinant DNA, which discussed risks of biotechnology.^[11]

John McGinnis encourages governments to accelerate friendly AI research. Because the goalposts of friendly AI aren't necessarily clear, he suggests a model more like the National Institutes of Health, where "Peer review panels of computer and cognitive scientists would sift through projects and choose those that are designed both to advance AI and assure that such advances would be accompanied by appropriate safeguards." McGinnis feels that peer review is better "than regulation to address technical issues that are not possible to capture through bureaucratic mandates". McGinnis notes that his proposal stands in contrast to that of the Machine Intelligence Research Institute, which generally aims to avoid government involvement in friendly AI.^[12]

According to Gary Marcus, the annual amount of money being spent on developing machine morality is tiny.^[13]

Criticism

Some critics believe that both human-level AI and superintelligence are unlikely, and that therefore friendly AI is unlikely. Writing in The Guardian, Alan Winfeld compares human-level artificial intelligence with faster-than-light travel in terms of difficulty, and states that while we need to be "cautious and prepared" given the stakes involved, we "don't need to be obsessing" about the risks of superintelligence.^[14]

Some philosophers claim that any truly "rational" agent, whether artificial or human, will naturally be benevolent; in this view, deliberate safeguards designed to produce a friendly AI could be unnecessary or even harmful.^[15] Other critics question whether it is possible for an artificial intelligence to be friendly. Adam Keiper and Ari N. Schulman, editors of the technology journal The New Atlantis, say that it will be impossible to ever guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power. They write that the criteria upon which friendly AI theories are based work "only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes.^[16]

References

Cite error: Invalid <references> tag; parameter "group" is allowed only.

Use <references />, or <references group="..." />

External links

Ethical Issues in Advanced Artificial Intelligence by Nick Bostrom
What is Friendly AI? — A brief description of Friendly AI by the Machine Intelligence Research Institute.
Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures — A near book-length description from the MIRI
Critique of the MIRI Guidelines on Friendly AI — by Bill Hibbard
Commentary on MIRI's Guidelines on Friendly AI — by Peter Voss.
The Problem with ‘Friendly’ Artificial Intelligence — On the motives for and impossibility of FAI; by Adam Keiper and Ari N. Schulman.

↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ 'How Skynet Might Emerge From Simple Physics, io9, Published 2013-04-26.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ ^9.0 ^9.1 Lua error in package.lua at line 80: module 'strict' not found.
↑ Goertzel, Ben. "Should Humanity Build a Global AI Nanny to Delay the Singularity Until It’s Better Understood?" Journal of consciousness studies 19.1-2 (2012): 1-2.
↑ ^11.0 ^11.1 Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Lua error in package.lua at line 80: module 'strict' not found.
↑ Kornai, András. "Bounding the impact of AGI." Journal of Experimental & Theoretical Artificial Intelligence ahead-of-print (2014): 1-22. "...the essence of AGIs is their reasoning facilities, and it is the very logic of their being that will compel them to behave in a moral fashion... The real nightmare scenario (is one where) humans find it advantageous to strongly couple themselves to AGIs, with no guarantees against self-deception."
↑ Lua error in package.lua at line 80: module 'strict' not found.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

Friendly artificial intelligence

Contents

Risks of unfriendly AI

Coherent extrapolated volition

Other approaches

Public policy

Criticism

See also

References

Further reading

External links

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools