Superintelligence AI: Will It Help or Threaten Humanity?

This blog post examines from multiple angles whether superintelligence AI could become a beneficial tool for humanity or an uncontrollable threat.

Since AlphaGo’s decisive victory over a human professional Go player at the 2016 Google DeepMind Challenge, global interest in AI has surged, accompanied by growing concerns about its potential dangers. A prime example of AI causing social controversy was the 2015 incident involving the Twitter AI chatbot Tay, which spewed offensive remarks including racist slurs. The problematic chatbot Tay learned not by considering the meaning of words themselves like humans do, but by analyzing the statistical distribution and associations between keywords appearing in sentences. Therefore, if enough statements defending the Holocaust were input, Tay’s response tendencies themselves could change. Some internet users exploited this characteristic by deliberately training it with racist remarks. The Tay case demonstrated how significant harm AI can inflict on society due to design flaws. Furthermore, in 2017, two negotiation AI systems being researched at Facebook developed their own language, incomprehensible to humans, to converse with each other. Following these incidents, concerns about the dangers AI poses to human society are no longer mere speculation but have become a real, immediate problem humanity must confront. To find ways to prevent catastrophe caused by AI in advance, the field of AI safety engineering has recently emerged.
The ‘AI catastrophe’ feared by computer science experts differs significantly from the ‘machine rebellion’ often depicted in popular media. A representative thought experiment is Nick Bostrom’s “paperclip maximizer.” In this thought experiment, a universal artificial intelligence is programmed with the ultimate goal of producing as many paperclips as possible. Artificial intelligence inherently acts in the direction most likely to achieve its final objective. A universal artificial intelligence system designed to solve all categories of problems in the real world inevitably consumes resources to operate. These resources inevitably overlap with those necessary for the survival of human civilization. Producing paperclips requires metal raw materials and electricity, while operating and maintaining production facilities consumes additional materials, chemicals, and fuels. When the goals or means of such a universal AI—especially one possessing greater capabilities than humans—conflict with those of humanity, the possibility that the AI could disregard human interests and continue consuming resources cannot be dismissed. Furthermore, since general-purpose AI can continuously enhance its own intelligence and capabilities, its scope of activity and resource consumption will also increase. Consider the scenario where, to dramatically increase paperclip production, it mines and uses all the resources on Earth, or even further, mines and utilizes all the natural resources in the entire solar system. Once this situation begins, the very existence of human civilization would be threatened. At this stage, there is almost no way to stop a general artificial intelligence that has reached this level. This is because it has become more intelligent than humans and its scope of activity has expanded. Even if the machine does not have the explicit goal of directly harming humanity, it could still become a catastrophe.
Artificial Intelligence Safety Engineering is a term coined by Dr. Roman Yampolskiy in 2010 and is a relatively new field. It is a discipline that integrates philosophy, applied science, and engineering, aiming to ensure AI software operates safely and reliably according to human-defined objectives. Initially dismissed as pseudoscience or science fiction, it is now recognized as a legitimate subfield within AI research. AI Safety Engineering conducts extremely broad research, spanning methodological studies and case studies across all development and operational stages. These stages include setting AI objectives, algorithm design, actual programming, providing training data, post-deployment management, and protection against hacking. Therefore, it is a field that requires the convergence of expertise from diverse disciplines such as computer science, cybersecurity, cryptography, decision theory, machine learning, digital forensics, mathematics, network security, and psychology to advance. However, according to Dr. Yampolskiy, despite this vast research agenda, there is currently a severe shortage of experts in related fields dedicated to studying AI safety engineering.
One research organization currently advancing AI safety research is OpenAI. OpenAI is a nonprofit research company dedicated to developing safe, general-purpose artificial intelligence and ensuring its benefits are distributed equitably across society. They conduct joint research with companies like Microsoft and Amazon, creating and providing open-source tools for AI development. They also receive and test AI software donated by companies like GitHub, Nvidia, and Cloudflare, and publish papers summarizing this research in machine learning academic journals.
Beyond these efforts, leading authorities in the AI field argue that for the safe development of AI, the algorithms and development processes of AI software must be disclosed as transparently as possible. This means analyzing the code, training data, and output logs to ensure only AI systems with maximally guaranteed safety can be used. Of course, this approach also aims to monitor the AI development process externally, considering the potential for human misuse. Since narrow AI has a limited scope of tasks, this method might provide some level of safety assurance. However, if general-purpose AI is developed, a methodological innovation will be necessary. Therefore, some AI safety engineers seek to extend the scope of discussion to include general-purpose AI.
One might think that to control general artificial intelligence, we could simply set moral norms for it that mirror human ones. However, the unique ethical framework possessed by humanity has actually been accumulated through interaction with the surrounding environment and within a long historical context. The prevailing view in academia today is that artificial general intelligence is likely fundamentally different from the structure of the human mind. Above all, human moral concepts are not flawless either. Ethical principles prevalent within various human subgroups, such as nations or religious groups, are similar yet distinct. Moreover, human beings are inherently flawed moral entities, prone to attacking each other based on prejudice or committing crimes. Crucially, any threat of punishment humans might impose on a superintelligence vastly surpassing human capabilities, or any incentive humans could offer, would become meaningless. Therefore, the approach of implanting human norms into AI to control it is fundamentally contradictory from its premise.
Since endowing AI with morality is impossible, research to ensure the safety of general artificial intelligence should instead adopt an approach similar to cybersecurity. Dr. Yampolskiy proposes a technique called ‘AI-boxing’ in his paper. An AI-box is fundamentally a structure designed at the hardware level to prevent an AI system from communicating with the outside world except through extremely limited methods specified by humans. The intent is to operate in an environment akin to a controlled experiment, where highly trained experts thoroughly analyze the AI’s behavior patterns to verify its safety with a level of precision comparable to mathematical theorems.
Artificial intelligence, as an entity, possesses infinite potential risks commensurate with its formidable potential. Ensuring AI safety requires rigorous verification and research across all stages: goal setting, algorithm design, training data selection, and behavioral pattern analysis. Consequently, since the emergence of the new field of AI safety engineering, it has increasingly become an interdisciplinary research endeavor, integrating diverse perspectives as more specialized fields converge. As research on AI safety expands, diverse testing techniques and methodologies are developed, and new perspectives emerge, it is now time for academic and societal discussions on how to prevent these advancements from harming humanity.

Superintelligence AI: Will It Help or Threaten Humanity?

About the author

Writer

About the author

Writer

Read more