Unpacking AI Deception and Its Implications for the Global South

Published 20 April 2026

Key Messages

Since the release of large language models (LLMs) such as GPT, Gemini, and Claude, artificial intelligence (AI) has captured the imagination of technologists, industrialists, and policymakers alike. These systems have developed at a frightening pace. In a short span, we have moved from observing rookie errors by LLMs to serious discussions about the need for AI safety and the possibility of large-scale job losses. While AI has been heralded as changing how intellectual work will be organised and executed over the coming years, it also raises critical questions about accuracy, reliability, and the risks of deception.

Understanding AI Deception

As AI becomes embedded in daily life and in decision-making processes, AI deception has emerged as a new risk. The UN Secretary-General’s Scientific Advisory Board defines it as “when an AI system misleads people or other systems about what it knows, intends, or can do.” This differs from AI errors or hallucinations, in which there is no ‘intent’ to deceive. The ‘deception’ is largely attributed to learned behaviour. While there are risks in anthropomorphising ‘intentions’, it is helpful to characterise AI responses and behaviours from a human perspective. The Advisory Board categorises deceptive AI behaviour as follows:

  1. Behaviour signalling (sycophancy, sandbagging, bluffing, and alignment faking)
  2. Internal process deception (reward hacking, unfaithful reasoning, and steganography)
  3. Goal environment deception

Research findings attribute such behaviour to the following factors, although the list is not comprehensive:

  1. Misalignment (of reward function)
  2. Strategic advantage (in multi-agent environments)
  3. Self-preservation (of the system from being shut down)
  4. Trained to deceive (unintentional learning from human texts or inputs)

AI deception may not be ‘intentional’ in a human sense, but it remains an issue of how systems are trained and rewarded. AI may simply be mimicking human biases, yet to users the effect still resembles ‘deception’.

 



More About Publication
Date 20 April 2026
Type Op-eds/Interviews/Press Releases
Publisher Southern Voice
