The only way of discovering the limits of the possible is to venture a little way past them into the impossible (Arthur C. Clarke's 2nd law)

Wednesday 13 April 2011

Arms races and intelligence explosions (extended abstract)

Carl Shulman, Singularity Institute for Artificial Intelligence
Stuart Armstrong, InhibOx

I. Introduction

A number of researchers (Chalmers, 2010; Good, 1965; Kurzweil, 2005; Moravec, 1999; Sandberg, 2009; Solomonoff, 1985; Vinge, 1993; see also Baum and Goertzel, 2010) have argued that sometime in this century humanity will develop Artificial Intelligence (AI) programs capable of substituting for human performance in almost every field, including AI research, and that this will greatly accelerate technological progress as AIs design their successors. That hypothetical event has been described as an “intelligence explosion” or “technological singularity.” While the term “singularity” is sometimes taken to refer to broader claims of accelerating change or limits of prediction (Yudkowsky, 2007), and has been elaborated in diverse formal models (Sandberg, 2009), we will rely on the recent overview in Chalmers (2010) for its account of the intelligence explosion.

Chalmers notes that even if it is technically feasible for humanity to produce an intelligence explosion, we may not exercise that capacity because of “motivational defeaters,” choosing to restrict, slow, and manage the development of advanced AI technologies to reduce risk. On the other hand, since a lead in AI technology may translate into overwhelming military advantage, an arms race dynamic may give states incentives to pursue even very dangerous research in hopes of attaining a leading position.

Not only is the arms race dynamic important for the evaluation of many aspects of the singularity hypothesis, it is also an area where existing empirical evidence and theory can be brought to bear from the study of nuclear weapons. This paper discusses some key parameters on which a race to intelligence explosion might differ from the historical race to nuclear explosion: the potential for small differences in research progress to produce massive military disparities in an intelligence explosion, the high risks of accidental catastrophe during research and development, and additional barriers to verification and enforcement of arms control treaties. Collectively, these factors suggest that states would have more to gain from AI arms control treaties than from nuclear arms control treaties, but would also face greater challenges in coordinating.

II. An AI arms race may be “winner-take-all”

The threat of an AI arms race does not appear to be primarily about the direct application of AI to warfare. While automated combat systems such as drone aircraft have taken on greatly increased roles in recent years (Singer, 2009; Arkin, 2009), they do not greatly disrupt the balance of power between leading militaries: slightly lagging states can use older weapons, including nuclear weapons, to deter or defend against an edge in drone warfare.

Instead, the military impact of an intelligence explosion would seem to lie primarily in the extreme acceleration in the development of new capabilities. A state might launch an AI Manhattan Project to gain a few months or years of sole access to advanced AI systems, and then initiate an intelligence explosion to greatly increase the rate of progress. Even if rivals remain only a few months behind chronologically, they may therefore be left many technological generations behind until their own intelligence explosions. It is much more probable that such a large gap would allow the leading power to safely disarm its nuclear-armed rivals than that any specific technological generation would provide a decisive advantage over the one immediately preceding it.

If states do take AI potential seriously, how likely is it that a government's “in-house” systems will reach the point of an intelligence explosion months or years before competitors? Historically, there were substantial delays between the first nuclear tests of the five initial nuclear powers, in 1945, 1949, 1952, 1960, and 1964. The Soviet Union's 1949 test benefited from extensive espionage and infiltration of the Manhattan Project, and Britain's 1952 test reflected formal joint participation in the Manhattan Project.

If the speedup in progress delivered by an intelligence explosion were large, such gaps would allow the leading power to solidify a monopoly on the technology and military power, at much lower cost in resources and loss of life than would have been required for the United States to maintain its nuclear monopoly of 1945-1949. To the extent that states distrust their rivals with such complete power, or wish to exploit it themselves, there would be strong incentives to vigorously push forward AI research, and to ensure government control over systems capable of producing an intelligence explosion.

In this paper we will discuss factors affecting the feasibility of such a localized intelligence explosion, particularly the balance between internal rates of growth and the diffusion of or exchange of technology, and consider historical analogs including the effects of the Industrial Revolution on military power and nuclear weapons.

III. Accidental risks and negative externalities

A second critical difference between the nuclear and AI cases is in the expected danger of development, as opposed to deployment and use. Manhattan Project scientists did consider the possibility that a nuclear test would unleash a self-sustaining chain reaction in the atmosphere and destroy all human life, conducting informal calculations at the time suggesting that this was extremely improbable. A more formal process conducted after the tests confirmed the earlier analysis (Konopinski, Marvin, & Teller, 1946), although it would not have provided any protection had matters been otherwise. The historical record thus tells us relatively little about the willingness of military and civilian leaders to forsake or delay a decisive military advantage to avert larger risks of global catastrophe.

In contrast, numerous scholars have argued that advanced AI poses a nontrivial risk of catastrophic outcomes, including human extinction (Bostrom, 2002; Chalmers, 2010; Friedman, 2008; Hall, 2007; Kurzweil, 2005; Moravec, 1999; Posner, 2004; Rees, 2004; Yudkowsky, 2008). Setting aside anthropomorphic presumptions of rebelliousness, a more rigorous argument (Omohundro, 2007) relies on the instrumental value of such behavior for entities with a wide variety of goals that are easier to achieve with more resources and with adequate defense against attack. Many decision algorithms could thus appear benevolent when in weak positions during safety testing, only to cause great harm when in more powerful positions, e.g. after extensive self-improvement.

Given abundant time and centralized careful efforts to ensure safety, it seems very probable that these risks could be avoided: development paths that seemed to pose a high risk of catastrophe could be relinquished in favor of safer ones. However, the context of an arms race might not permit such caution. A risk of accidental AI disaster would threaten all of humanity, while the benefits of being first to develop AI would be concentrated, creating a collective action problem insofar as tradeoffs between speed and safety existed.

A first-pass analysis suggests a number of such tradeoffs. Providing more computing power would allow AIs either to operate at superhumanly fast timescales or to proliferate in very numerous copies. Doing so would greatly accelerate progress, but also render it infeasible for humans to engage in detailed supervision of AI activities. To make decisions on such timescales AI systems would require decision algorithms with very general applicability, making it harder to predict and constrain their behavior. Even obviously risky systems might be embraced for competitive advantage, and the powers with the most optimistic estimates or cavalier attitudes regarding risk would be more likely to take the lead.

IV. Barriers to AI arms control

Could an AI arms race be regulated using international agreements similar to those governing nuclear technology? In some ways, there are much stronger reasons for agreement: the stability of nuclear deterrence, and the protection afforded by existing nuclear powers to their allies, mean that the increased threat of a new nuclear power is not overwhelming. No nuclear weapons have been detonated in anger since 1945. In contrast, simply developing AI capable of producing an intelligence explosion puts all states at risk from the effects of accidental catastrophe, or the military dominance engendered by a localized intelligence explosion.

However, AI is a dual-use technology, with incremental advances in the field offering enormous economic and humanitarian gains that far outweigh near-term drawbacks. Restricting these benefits to reduce the risks of a distant, novel, and unpredictable advance would be very politically challenging. Superhumanly intelligent AI promises even greater rewards: advances in technology that could vastly improve human health, wealth, and welfare while addressing other risks such as climate change. Efforts to ban or relinquish AI technology outright would seem to require strong evidence of very high near-term risks. Nonetheless, agreements might prove highly beneficial if they could avert an arms race, allow for more controlled AI development with more rigorous safety measures, and share the benefits among all powers.

Such an agreement would face increased problems of verification and enforcement. Where nuclear weapons require rare radioactive materials, large specialized equipment, and other easily identifiable inputs, AI research can proceed with only skilled researchers and computing hardware. Verification of an agreement would require extraordinarily intrusive monitoring of scientific personnel and computers throughout the territory of participating states. Further, while violations of nuclear arms control agreements can be punished after the fact, a covert intelligence explosion could allow a treaty violator to withstand later sanctions.

These additional challenges might be addressed in light of the increased benefits of agreement, but might also become tractable thanks to early AI systems. If those systems do not themselves cause catastrophe but do provide a decisive advantage to some powers, they might be used to enforce safety regulations thereafter, providing a chance to “go slow” on subsequent steps.

V. Game-theoretic model of an AI arms race

In the full paper, we present a simple game-theoretic model of a risky AI arms race. In this model, the risk of accidental catastrophe depends on the number of competitors, the magnitude of random noise in development times, the exchange rate between risk and development speed, and the strength of preferences for developing safe AI first.
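The model itself appears only in the full paper, but the collective action problem it formalizes can be illustrated with a small Monte Carlo sketch. The code below is our own construction, not the paper's model: all function names, parameter names, and default values are illustrative assumptions. It estimates the expected gain to a single team from trading safety for development speed while all of its rivals stay cautious.

```python
import random

def deviation_gain(n_teams, noise_sd, risk_per_speedup,
                   win_value=1.0, catastrophe_cost=1.0,
                   speedup=1.0, trials=20000, seed=0):
    """Estimate a single team's expected gain from cutting safety corners.

    Each team's development time is pure noise (Gaussian, std. dev.
    `noise_sd`); the earliest finisher wins. One team can finish
    `speedup` earlier at the price of `risk_per_speedup * speedup`
    extra probability of accidental catastrophe. A positive return
    value means racing is individually tempting even though the
    extra risk falls on everyone.
    """
    rng = random.Random(seed)
    base_wins = fast_wins = 0
    for _ in range(trials):
        # Draw every team's development time for this world.
        times = [rng.gauss(0.0, noise_sd) for _ in range(n_teams)]
        if times[0] == min(times):        # win without cutting corners
            base_wins += 1
        times[0] -= speedup               # same world, corners cut
        if times[0] == min(times):        # win after cutting corners
            fast_wins += 1
    extra_win_prob = (fast_wins - base_wins) / trials
    extra_risk = risk_per_speedup * speedup
    return extra_win_prob * win_value - extra_risk * catastrophe_cost
```

Under these assumptions the sketch reproduces the qualitative dependence on the parameters listed above: the temptation to race grows as the random noise in development times shrinks (a small speedup then almost guarantees victory), as the number of rivals rises (the baseline chance of winning falls), and as the value of winning rises relative to the cost of catastrophe.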

VI. Ethical implications and responses

The above analysis highlights two important possible consequences of advanced AI: a disruptive change in international power relations and a risk of inadvertent disaster.

From an ethical point of view, the accidental risk deserves special attention since it threatens human extinction, not only killing current people but also denying future generations existence (Matheny, 2007; Bostrom, 2003). While AI systems would outlive humanity, they might lack key features contributing to moral value, such as individual identities, play, love, and happiness (Bostrom, 2005; Yudkowsky, 2008). Extinction risk is a distinctive feature of AI risks: even a catastrophic nuclear war or engineered pandemic that killed billions would still likely allow survivors to eventually rebuild human civilization, whereas AIs killing billions would likely leave no survivors (Sandberg & Bostrom, 2008).

However, a national monopoly on an intelligence explosion could also have permanent consequences if it were used to stably entrench the leading power's position. Permanent totalitarianism is one possibility (Caplan, 2008).

We conclude by discussing some possible avenues for reducing these long-term risks.
 
References

Arkin, R. (2009). Governing Lethal Behavior in Autonomous Robots. Boca Raton: Chapman & Hall / CRC.

Baum, S., Goertzel, B., and Goertzel, T. (forthcoming). How long until human-level AI? Results from an expert assessment. Technological Forecasting and Social Change. DOI 10.1016/j.techfore.2010.09.006.

Bostrom, N. (1998), How long before superintelligence? Int. Jour. of Future Studies, 2.

Bostrom, N. (2002). Analyzing human extinction scenarios. Journal of Evolution and Technology, 9(1).

Bostrom, N. (2003). Astronomical waste: The opportunity cost of delayed technological development. Utilitas, 15(3), 308–314.

Bostrom, N. (2005). The future of human evolution. In C. Tandy (Ed.), Death and anti-death: Two hundred years after Kant, fifty years after Turing (pp. 339-371). Palo Alto: Ria University Press.

Caplan, B. (2008). The totalitarian threat. In N. Bostrom & M. Cirkovic (Eds.), Global catastrophic risks, (pp. 504-519). Oxford: Oxford University Press.

Chalmers, D. J. (2010). The Singularity: A philosophical analysis. J. of Consciousness Studies, 17 (9-10), 7-65.

Good, I. J. (1965). Speculations concerning the first ultraintelligent machine. In F. L. Alt & M. Rubinoff (Eds.), Advances in computers, vol. 6 (pp. 31–88). New York: Academic Press.

Friedman, D. (2008). Future imperfect: Technology and freedom in an uncertain world. Cambridge: Cambridge University Press.

Hall, J. S. (2007). Beyond AI: Creating the conscience of the machine. Amherst: Prometheus Books.

Hanson, R. (2000). Long-term growth as a sequence of exponential modes. Retrieved from http://hanson.gmu.edu/longgrow.pdf.

Konopinski, E. J., Marvin, C., Teller, E. (1946, declassified February 1973). Ignition of the atmosphere with nuclear bombs. Technical Report Los Alamos National Laboratory LA-602. Retrieved from http://www.fas.org/sgp/othergov/doe/lanl/docs1/00329010.pdf.

Kurzweil, R. (2005). The Singularity is near: When humans transcend biology. New York: Viking.

Matheny, J. G. (2007). Reducing the risk of human extinction. Risk Analysis, 27(5):1335-1344.

Moravec, H. P. (1999). Robot: Mere machine to transcendent mind. New York: Oxford University Press.

Omohundro, S. (2007). The nature of self-improving AI. Paper presented at the 2007 Singularity Summit, San Francisco. Retrieved from http://selfawaresystems.com/2007/10/05/paper-on-the-nature-of-self-improving-artificial-intelligence/

Posner, R. (2004). Catastrophe: Risk and response. Oxford: Oxford University Press.

Rees, M. (2004). Our final hour: A scientist's warning: How terror, error, and environmental disaster threaten humankind's future in this century - on Earth and beyond. New York: Basic Books.

Sandberg, A. (2009). An overview of models of technological singularity. Presented at AGI-2010 Workshop: Roadmap and the Future of AI. Retrieved from http://agi-conf.org/2010/wp-content/uploads/2009/06/agi10singmodels2.pdf.

Sandberg, A., & Bostrom, N. (2008). Global catastrophic risks survey. Future of Humanity Institute Technical Report 2008/1. Retrieved from http://www.philosophy.ox.ac.uk/__data/assets/pdf_file/0020/3854/global-catastrophic-risks-report.pdf

Singer, P. (2009) Wired for war: The robotics revolution and 21st century conflict. New York, NY: Penguin.

Solomonoff, R. (1985). The time scale of artificial intelligence: Reflections on social effects. Human Systems Management, 5, 149-153.

Vinge, V. (1993). The coming technological singularity: How to survive in the post-human era. Whole Earth Review, winter 1993. New Whole Earth.

Yudkowsky, E. (2007). Three major singularity schools. Retrieved from http://yudkowsky.net/singularity/schools.

Yudkowsky, E. (2008). Artificial intelligence as a positive and negative factor in global risk. In N. Bostrom & M. Cirkovic (Eds.), Global catastrophic risks (pp. 308-343). Oxford: Oxford University Press.

1 comment:

  1. Hi Carl,

    Please forgive my short response as I am typing on my phone.

    Do you think a corporate arms race also deserves consideration? IBM has the blue brain project. Perhaps humanity will be destroyed by big blue.
