What is the AI control problem?

In the simplest terms, the AI Control Problem is the risk of creating an AI that humans cannot control.

In Pop Culture

There are many dramatised representations of the Artificial Intelligence (AI) Control Problem in which we create an AI that then proceeds to destroy humanity, such as in The Terminator and The Matrix film series. Many visions of the future with AIs tend towards dystopias where humans have lost control of the AIs that they built. The more optimistic versions often involve human-like AIs that are learning to be human, such as Star Trek’s Data, A.I. Artificial Intelligence’s David and Bicentennial Man’s Andrew Martin, where the control problem is less of a concern. However, the AIs we create are likely to be different from these fictional depictions, and therefore the AI control problem is also likely to be different from the versions envisioned in popular culture.

Control

AI Control Problem

There are multiple versions of the control problem; they often differ on what is meant by “control”. In the Ethigi context, a lack of control means that an authorised person is unable to shut down the AI because the AI is able to out-think that person.

This idea of control does not apply if shut-down authority is limited by design. For instance, if the end-user is prevented by design from shutting down an AI but a maintenance engineer can shut it down, then that AI can still be controlled. This scenario is already common, as manufacturers often restrict user access; a minimal sketch of the idea follows.
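To make this concrete, here is a minimal sketch of shut-down authority limited by design, assuming a simple role-based policy. The role names, policy and function are hypothetical and for illustration only:

```python
from enum import Enum, auto

class Role(Enum):
    END_USER = auto()
    MAINTENANCE_ENGINEER = auto()

# Hypothetical policy: only maintenance engineers hold shut-down authority.
SHUTDOWN_AUTHORISED = {Role.MAINTENANCE_ENGINEER}

def request_shutdown(requester: Role) -> bool:
    """Grant a shut-down request only to roles named in the policy."""
    if requester in SHUTDOWN_AUTHORISED:
        print("Shutting down.")
        return True
    print("Shut-down refused: requester lacks authority.")
    return False

request_shutdown(Role.END_USER)              # refused by design
request_shutdown(Role.MAINTENANCE_ENGINEER)  # permitted
```

An AI governed by a policy like this remains controlled in the sense used here, even though the end-user cannot stop it: an authorised person can.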

This idea of control also excludes the scenario where an AI cannot be switched off by design. For instance, if a rogue military organisation builds an AI that causes destruction and cannot be shut down then, again, that AI is controlled in the sense that it is acting as intended by its creators. This scenario is similar to weapons such as landmines, which cannot easily be deactivated once they are activated.

Finally, the idea of control in this context also excludes the scenario where an AI could be switched off but is not actually switched off because the AI accidentally disables or kills the person authorised to switch it off. This scenario is similar to an accidental release of a biological weapon that kills its creator.

So the key is that “control” in this context relates to the intelligence of the AI and its ability to out-think humans to prevent its shutdown.

One remaining grey area is when an AI could be shut down but its actions are not detected or predicted by its human creators due to its intelligence and capacity. Depending on the circumstances, such actions could be dire enough to fall within the scope of the AI Control Problem. One such scenario is highlighted in the Story of Gi.

All about AI

What is Artificial Intelligence (AI)?

The term “AI” is used in different contexts with slightly or widely differing definitions. These definitions also evolve over time. The aim of this post is to outline how AI will be used in the context of the Ethigi project. As this project is a cross between a Computer Science approach and a Philosophy approach, we should start with the definitions from those contexts.

Computer Science perspective

AI is a sub-field of computer science with the goal of enabling “the development of computers that are able to do things normally done by people — in particular, things associated with people acting intelligently.”1 There are three versions of the overarching goal that also slightly modify the definition:2

  1. Build computers that think exactly as humans do
  2. Just get the job done without caring if the computation has anything to do with human thought
  3. Use human reasoning as a model that can inform and inspire, but not as the final target for imitation

The bulk of AI currently in industry falls under the third goal. In contrast, the Turing Test arguably falls under the first goal, as it identifies AI in terms of the ability to mimic human responses.

Philosophy perspective

AI is “the field devoted to building artificial animals (or at least artificial creatures that – in suitable contexts – appear to be animals) and, for many, artificial persons (or at least artificial creatures that – in suitable contexts – appear to be persons).”3

The four possible goals of AI can be characterised as:4

                   Human-Based                      Ideal Rationality
  Reasoning-Based  Systems that think like humans   Systems that think rationally
  Behaviour-Based  Systems that act like humans     Systems that act rationally

By this definition, the Turing Test falls in the Human-Based, Behaviour-Based quadrant.

Weak AI, Strong AI, Narrow AI, AGI & Superintelligence

AI can be further categorised as either Weak AI or Strong AI.

Weak AI, or Narrow AI, focuses on a particular task and is by far the most commonly encountered form of AI. It falls within goals 2 and 3 in the Computer Science perspective and the Behaviour-Based row in the Philosophy perspective. Examples of Weak AI include Apple’s Siri, self-driving cars, spam filters, image recognition and Facebook’s advertising algorithm.5 A sketch of one such task-specific system follows.
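As an illustration of how narrow such systems are, here is a minimal sketch of a spam filter built with a naive Bayes classifier. The tiny training set is invented for illustration; a real filter would learn from thousands of labelled messages:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Invented toy training data: messages labelled spam or ham.
messages = [
    "win a free prize now",
    "cheap loans click here",
    "meeting moved to 3pm",
    "lunch tomorrow?",
]
labels = ["spam", "spam", "ham", "ham"]

# Turn each message into word counts, then fit the classifier.
vectoriser = CountVectorizer()
features = vectoriser.fit_transform(messages)

classifier = MultinomialNB()
classifier.fit(features, labels)

# Classify a new message.
print(classifier.predict(vectoriser.transform(["claim your free prize"])))
```

The classifier can sort messages into spam and ham but can do nothing else; that task-specificity is exactly what makes it Weak AI.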

Strong AI, or Artificial General Intelligence (AGI), can meet or exceed generalised human cognitive abilities. It can perform any intellectual task that a human could perform, without any human intervention. There are no known examples in practice, though there are common examples in fiction, such as 2001: A Space Odyssey’s HAL 9000, Star Trek’s Data and Westworld’s Hosts. Many experts doubt whether an AGI is possible, while others question whether it would be desirable.6

A Superintelligence is an AGI that surpasses the intelligence of the best human minds.

Ethigi is a response to AGIs, so that is the definition this project focuses on.


References