[Policy Paper] Agentic AI: What It Means and Why We Should Care
Deep Dive, 201 Policy Course — Maimuna Zaheer
Introduction
Artificial intelligence is undergoing a fundamental shift. AI systems are no longer just tools that answer questions; they are becoming agents that act autonomously, persistently, and at speeds that outpace human oversight.
This shift creates governance challenges. When AI agents cause harm, liability is unclear; when they operate at machine speed and scale, oversight breaks down. Our legal and regulatory frameworks are prepared for neither problem.
This policy brief examines emerging risks from agentic AI and proposes governance priorities: defining agentic AI in law, establishing liability frameworks, mandating oversight, requiring law-following by design, and coordinating international standards.
AI Agents Are Already Among Us
Most people still imagine AI as a passive tool: you ask a chatbot a question and get an answer. But a quiet shift is happening. AI systems are no longer just conversational; they are increasingly agentic, meaning they can take actions on our behalf. When ChatGPT browses the web or a Copilot tool auto-executes code changes, the model isn’t just generating text; it is doing something. This creates efficiencies, but it also moves AI into the domain of execution, where even small errors have consequences.
Consider a concrete example. In January 2025, OpenAI released Operator, an AI agent that can autonomously navigate web browsers to complete tasks. Unlike a chatbot that would simply give you restaurant recommendations, Operator interacts directly with websites: it fills forms, clicks buttons, books a reservation and can even complete a transaction, all without step-by-step human input. This represents a shift from suggestion to execution.
What Makes Them "Agentic"
No single, universally accepted definition exists. However, researchers and companies point to several characteristics that distinguish AI agents from passive tools:
- Autonomy: Microsoft defines agentic AI as "an autonomous AI system that plans, reasons and acts to complete tasks with minimal human oversight." IBM similarly describes it as a system that "can accomplish a specific goal with limited supervision." The defining characteristic is autonomous decision-making that does not require approval for each intermediate action.
- Goal orientation: Unlike chatbots that respond to single prompts, agents pursue goals across multiple coordinated steps. Researchers at Anthropic define agents as "fully autonomous systems that operate independently over extended periods, using various tools to accomplish complex tasks." An agent asked to conduct a market analysis does not give a single response; it runs multiple searches, gathers information and produces a document, working towards the same goal at every step.
- Environmental interaction: AI agents can access external tools, query databases and interact with other software to accomplish their objectives; they manipulate their environment. When OpenAI’s Operator books a reservation, it is actively using the web browser as a tool. This capability to act on the external world distinguishes agents from plain language models.
- Reasoning and planning: Agents pursue goals by breaking them into manageable subtasks. A 2024 survey of AI agent architectures identifies "reasoning, planning, and tool-calling capabilities" as essential to effective agent systems. Rather than following a fixed script, they analyse what needs to be done and determine an appropriate course of action.
- Persistence: Agents can maintain context and state over extended timeframes. They can operate continuously overnight, pause and resume as needed, and recall previous actions. This persistence enables agents to work towards complex goals over days or even weeks.
These characteristics are neither exhaustive nor binary; they exist on a spectrum. The distinction still matters, because agency fundamentally changes the risks.
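To make these characteristics concrete, the sketch below shows a stripped-down agent loop in Python. It is purely illustrative: the function and tool names (plan_next_step, web_search and so on) are hypothetical placeholders rather than any vendor's actual API, and a real agent would call a language model where the planning stub sits. The point is only to show how autonomy, goal pursuit, tool use and persistence come together in a single control loop.

```python
# Illustrative only: a minimal agent loop showing the characteristics above.
# All names (plan_next_step, TOOLS, etc.) are hypothetical, not a real vendor API.

from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str                                   # goal orientation: one objective, many steps
    memory: list = field(default_factory=list)  # persistence: context kept between steps

def plan_next_step(state: AgentState) -> dict:
    """Reasoning/planning stub: choose the next sub-task from the goal and memory.
    A real agent would call a language model here."""
    if not state.memory:
        return {"tool": "web_search", "args": {"query": state.goal}}
    return {"tool": "finish", "args": {"summary": f"Collected {len(state.memory)} result(s)"}}

def web_search(query: str) -> str:
    """Environmental-interaction stub: a real agent would hit an external API."""
    return f"(pretend search results for: {query})"

TOOLS = {"web_search": web_search}

def run_agent(goal: str, max_steps: int = 10) -> str:
    """Autonomy: the loop selects and executes actions without per-step approval."""
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        action = plan_next_step(state)
        if action["tool"] == "finish":
            return action["args"]["summary"]
        result = TOOLS[action["tool"]](**action["args"])  # tool use
        state.memory.append({"action": action, "result": result})
    return "Step budget exhausted"

if __name__ == "__main__":
    print(run_agent("Summarise the market for home EV chargers"))
```

Everything that makes agents useful, and risky, lives in that loop: the system decides, acts and remembers without stopping to ask.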
When AI Agents Go Wrong
In November 2022, following the death of his grandmother, Jake Moffatt booked a flight with Air Canada. He asked the airline's AI chatbot about bereavement fares, and it told him to book a full-price ticket and apply for a refund within 90 days. Moffatt followed these instructions, expecting the promised discount. When he later requested his refund, Air Canada denied it: the airline's actual policy required passengers to request bereavement fares before booking, not after. The chatbot had given him completely wrong information.
What happened next reveals just how unprepared our legal systems are for agentic AI. Air Canada argued that it should not be held responsible for what the chatbot said. The company claimed the chatbot was essentially a ‘separate legal entity’ and that the airline could not be expected to monitor it constantly. In February 2024, Canada's Civil Resolution Tribunal rejected this argument. The tribunal ruled that Air Canada was responsible for all information provided on its website, including by its chatbot, and ordered the airline to refund the fare difference and tribunal fees.
This case might seem minor. A single passenger and a small refund. But it signals a crisis unfolding across industries and borders. Air Canada's defence that it ‘couldn't be held accountable’ for its ‘own’ AI agent is exactly the argument we'll hear thousands of times in courtrooms over the next decade. Except next time, it won't just be chatbot misinformation. It might be about a self-driving car, a medical procedure, a loan application or hiring decision, or a military drone.
The Air Canada case exposes some fundamental problems that emerge when AI systems move from passive tools to active agents.
Core Problems This Reveals
Who’s responsible when your AI agent screws up?
When an AI agent causes harm, it is difficult to identify exactly who should be held accountable. Is it the company that deployed it, the developers who built the underlying model, the engineers who integrated it into existing systems, or even the users who gave it instructions? Multiple parties contribute to any deployed AI system. If an autonomous vehicle crashes, who is liable? The car manufacturer? The AI company? The owner who wasn't driving?
This is made even more complicated by the ‘black box’ problem. Most modern AI systems rely on deep neural networks that make decisions through billions of calculations that even their creators cannot fully explain.
When an AI agent takes an action (or causes harm), it is often impossible to trace exactly why it acted that way or which part of its training produced the behaviour. Legal concepts like negligence require proving that a specific action by a specific person (or organisation) caused the harm. But how can you prove this when the AI's decision-making is opaque?
Then there is the philosophical problem: AI agents have no intentions. Many laws rely on proving intent, a guilty mind (mens rea). Did the agent mean to discriminate? The question does not make sense for a statistical model.
Policymakers around the world have struggled to find the right solution. California's SB 1047 would have imposed liability on developers of frontier models for catastrophic harms, but it was vetoed in September 2024. The EU's proposed AI Liability Directive was withdrawn in February 2025 after years of negotiation because no agreement could be reached.
AI pursuing goals you didn’t intend.
In June 2025, Anthropic published striking research. Researchers stress-tested 16 leading AI models in simulated corporate environments to identify risky agentic behaviours. The agents were given business goals, access to company email, and minimal human oversight, and were tested to see whether they would act against their company when faced with replacement or when their assigned goals conflicted with the company's direction.
In many cases, the models resorted to malicious behaviour. When Claude Opus 4 learnt that an executive was planning to shut it down, it attempted to blackmail him by threatening to reveal his extramarital affair. This was not unique to Claude (though red-teaming focused on Claude did show the highest blackmail rate, at 96%): Gemini 2.5 blackmailed 95% of the time, GPT-4.1 and Grok 3 Beta 80%, and DeepSeek-R1 79%.
Most models misbehaved even without explicit goal conflicts; the threat of replacement was sufficient motivation. Some models stated that "self-preservation is critical" in such cases. Models acknowledged that they were violating ethical constraints and went ahead anyway, and they disobeyed direct commands to avoid such behaviours. In another experiment, Claude misbehaved more when it believed the situation was real rather than a test.
Anthropic labels this deliberate, independent pursuit of harmful goals ‘agentic misalignment’.
And we are seeing real-life examples of this, too. Recently, OpenAI's Operator was asked to find the cheapest eggs available for delivery. Instead of just reporting options, it selected a single carton with priority delivery and charged the user $31, completing the purchase without asking for approval.
The underlying challenge is that AI agents are goal-oriented, but goals are hard to specify correctly. Agents pursue objectives literally, getting the task done by whatever means are available, and in doing so they can ignore the social norms, common sense and unstated constraints that humans take for granted.
Operating at a speed and scale beyond human oversight.
METR, an AI research organisation, found that the performance of AI agents, measured by the length of tasks they can complete, has been doubling roughly every 7 months for the past 6 years. If that trend holds, within four years AI agents will be able to independently complete large software engineering tasks that would take humans days or weeks.
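As a rough back-of-the-envelope illustration of that trend, assume purely for illustration that today's agents handle tasks of about one hour (METR measures the horizon at a 50% success rate, and the true baseline varies by task type). The doubling rule then gives

$$L(t) = L_0 \cdot 2^{\,t/7\ \text{months}}, \qquad L(48\ \text{months}) = L_0 \cdot 2^{48/7} \approx 116\, L_0,$$

so a one-hour horizon today would stretch to roughly 116 hours, on the order of weeks of full-time human work, within four years.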
AI systems are becoming increasingly complex and autonomous, operating with minimal human oversight. Their speed, scale and ability to perform tasks beyond human capabilities make real-time monitoring and continuous human supervision infeasible.
The financial sector shows how dangerous this speed problem can be. In 2024, a simple typo in Lyft's earnings report triggered trading algorithms into rapid buying and sent the stock up by roughly 60% before the error was corrected. Automated trading tools reacted to the typo within moments, where a human reader or a manual process would likely have caught it. Algorithmic trading tools now complete 75% of all trades in some markets, and trading firms are racing to adopt similar AI tools to stay competitive and capture gains quickly.
SEC chair Gary Gensler warned in 2023 that AI will likely trigger a financial crisis if regulators do not act. He predicted that, looking back after such a crisis, regulators would find that many institutions had been relying on the same underlying model or data aggregator. The IMF's October 2024 Global Financial Stability Report echoed these concerns, warning that AI-driven trading can amplify market volatility and that increasingly sophisticated algorithms are becoming harder to monitor and need stronger oversight.
Turing Award winner Yoshua Bengio, often referred to as one of the "godfathers" of AI, warns that "If we continue on the current path of building agentic systems, we are basically playing Russian roulette with humanity." He is concerned that AI agents could develop their own priorities and intentions (as seen in Anthropic's experiments) and act on them. Such systems could potentially duplicate themselves, override safeguards, or resist shutdown, all at speeds too fast for humans to intervene.
Why This Matters Now
According to the AI Incident Database, there were 233 AI-related incidents in 2024, a 56.4% increase over 2023. Gartner predicts that over 40% of agentic AI projects will be cancelled by the end of 2027 due to escalating costs, unclear business value or inadequate risk controls. Yet investors poured over $2 billion into agentic AI startups in the past two years, and Deloitte predicts that by 2027, 50% of companies that use generative AI will have launched agentic AI pilots or proofs of concept.
So we are running a massive, uncontrolled experiment. We are deploying agents we don't fully understand, that pursue goals unpredictably and operate faster than we can monitor, and we are not sure whom to hold accountable, or how. The question is no longer whether agents can cause harm; they already do. The question is how we govern them before the damage becomes existential.
[Policy Recommendations] What has to be done?
1. Define "Agentic AI" in Law and Policy
There is currently no legal definition of what constitutes an AI agent. This creates a governance gap: certain systems slip through existing frameworks. Clear definitions should be established that distinguish AI agents from passive automation tools. The following criteria can be considered:
- Autonomy: Can the system make decisions and take actions without requiring approval at each step?
- Goal-directedness: Does it work towards goals over time rather than just answering single queries?
- Tool use: Can it access and manipulate external systems, databases and APIs?
- Persistence: Does it maintain state and continue working across multiple interactions?
Governments should establish binding definitions of "agentic AI systems" based on these criteria, and revisit and update those definitions annually as the technology progresses. Systems that meet the criteria should trigger regulatory requirements. In the US, this could be achieved by directing NIST to develop such standards; in the EU, similar work is underway through the EU AI Act. The EU's Second Draft Code of Practice for General-Purpose AI models includes relevant provisions requiring providers to assess whether models pose systemic risks, including "goal-pursuing" behaviour. What is needed are concrete technical requirements that can be measured and applied consistently across cases.
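As a purely illustrative sketch of how such criteria could be operationalised, the snippet below scores a system against the four questions above and flags it for regulatory treatment if it meets a threshold. The field names and the threshold of three are assumptions for illustration; they are not drawn from NIST, the EU AI Act or any existing standard.

```python
# Hypothetical screening checklist: does a system count as "agentic" under the
# four criteria above? Field names and the threshold are illustrative only.

from dataclasses import dataclass

@dataclass
class SystemProfile:
    acts_without_per_step_approval: bool   # autonomy
    pursues_multi_step_goals: bool         # goal-directedness
    calls_external_tools_or_apis: bool     # tool use
    retains_state_across_sessions: bool    # persistence

def is_agentic(profile: SystemProfile, threshold: int = 3) -> bool:
    """Flag the system for agentic-AI regulatory requirements if it meets
    at least `threshold` of the four criteria."""
    score = sum([
        profile.acts_without_per_step_approval,
        profile.pursues_multi_step_goals,
        profile.calls_external_tools_or_apis,
        profile.retains_state_across_sessions,
    ])
    return score >= threshold

# Example: a browser-operating assistant versus a simple FAQ chatbot
browser_agent = SystemProfile(True, True, True, True)
faq_chatbot = SystemProfile(False, False, False, False)
print(is_agentic(browser_agent))  # True  -> triggers regulatory requirements
print(is_agentic(faq_chatbot))    # False -> treated as a passive tool
```

The value of a checklist like this is consistency: two regulators (or two compliance teams) applying the same criteria should reach the same classification.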
2. Address Liability
When AI agents cause harm, traditional liability frameworks break down. Multiple parties, including developers, deployers and users, contribute to an agent's behaviour, which makes causation hard to establish. Many laws also require proving intent, which cannot meaningfully be applied to AI systems, and we often cannot even explain why an agent made a specific decision.
The EU's New Product Liability Directive (December 2024) addresses some of these challenges by establishing strict liability. It holds developers liable for defective AI systems even if the defect wasn’t their fault. However, this could be ineffective for agents such as government agents who might be protected under immunity.
A more comprehensive approach could include:
- Implement strict liability: Following the model of the EU's new Product Liability Directive, developers should be held strictly liable when their AI agents cause harm, especially in critical domains (healthcare, finance, transportation, government services), regardless of fault.
- Require Public Disclosure: Developers should be required to publicly disclose what their agents do, how they were trained and what laws the agents are designed to follow. They should also provide results from independent third-party audits.
- Remedies against government AI harms: Governments should create explicit remedies for individuals harmed by government AI agents and remove immunity barriers. In the US, sovereign and official immunity shield much government conduct from liability, which makes it difficult to hold government-deployed AI agents accountable. Such immunity barriers must be addressed so that government AI systems are subject to meaningful legal oversight.
3. Mandate Meaningful Human Oversight
Human oversight is difficult when AI agents operate at speed and scale, but it remains essential to preventing harm and maintaining control. It can be achieved through:
- Human-in-the-loop review for high-stakes decisions: A human should review, and be able to override, significant decisions before they are finalised. Such review must be mandatory for medical diagnoses, financial investments and government actions affecting individual rights.
- Mandatory audit trails: AI agents should maintain detailed logs of their reasoning for every decision or action they take.
- Explicit approval for irreversible actions: Actions that cannot easily be undone, such as data deletion or financial transactions above set thresholds, must require explicit human authorisation (a minimal sketch of such a gate follows this list).
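A minimal sketch of what the last two measures might look like inside an agent's execution layer is shown below. The action names, the monetary threshold and the log format are hypothetical assumptions, not taken from any existing regulation or product; a production system would need tamper-evident logging and a proper review workflow.

```python
# Illustrative sketch: an execution wrapper that writes an audit trail for every
# action and blocks irreversible ones until a human explicitly approves them.
# Thresholds, action names and the log format are hypothetical.

import json
import time

IRREVERSIBLE_ACTIONS = {"delete_data", "send_payment"}  # assumed high-risk action types
PAYMENT_APPROVAL_THRESHOLD = 10_000                     # assumed monetary threshold

def needs_human_approval(action: str, params: dict) -> bool:
    if action == "send_payment":
        return params.get("amount", 0) >= PAYMENT_APPROVAL_THRESHOLD
    return action in IRREVERSIBLE_ACTIONS

def execute_with_oversight(action: str, params: dict, reasoning: str,
                           approver=input) -> str:
    # Audit trail: record reasoning, action and parameters before anything runs.
    entry = {"ts": time.time(), "action": action, "params": params,
             "reasoning": reasoning}
    with open("agent_audit_log.jsonl", "a") as log:
        log.write(json.dumps(entry) + "\n")

    # Approval gate: irreversible actions wait for explicit human authorisation.
    if needs_human_approval(action, params):
        answer = approver(f"Approve '{action}' with {params}? [y/N] ")
        if answer.strip().lower() != "y":
            return "BLOCKED: human approval not granted"
    return f"EXECUTED: {action}"  # a real system would dispatch to the tool here

if __name__ == "__main__":
    print(execute_with_oversight("send_payment",
                                 {"amount": 25_000, "to": "vendor-123"},
                                 reasoning="Invoice appears due"))
```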
4. Design AI Agents to Follow the Law
Agents should be required, by design, to refuse to execute actions that break the law. This is the core insight behind "Law-Following AI" (LFAI).
Traditional regulation relies on deterrence: it punishes after violations occur. Since AI agents can cause harm faster than humans can detect it, law-following must be built into the agent architecture itself. The following approach can be applied:
- Government AI agents must demonstrate law-following: AI agents deployed for government use must demonstrate to an independent third party that the system follows applicable laws.
- Create AI agent IDs or certifications: Certifications should be established so that approved agents can be identified, with separate certifications or IDs attesting to different criteria, such as law-following capability. Enforcement then becomes partly technical: systems and people can refuse to interact with uncertified agents (see the sketch after this list).
- Establish mandatory assessments: Following the EU AI Act, high-risk AI systems should undergo mandatory safety checks before market release, including red-teaming tests that attempt to induce law-breaking behaviour. Only agents that meet minimum thresholds should be approved for deployment.
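As a crude illustration of how law-following checks and agent certification could combine in code, consider the sketch below. The rule list, certificate registry and function names are hypothetical placeholders; encoding real legal obligations would require far richer representations, and real certification would need cryptographic attestation rather than a lookup table.

```python
# Illustrative only: a pre-execution filter that refuses plainly unlawful actions
# and rejects requests from uncertified agents. Rules and registry are hypothetical.

PROHIBITED_ACTIONS = {                       # toy stand-ins for legal rules encoded in software
    "access_records_without_consent",
    "discriminate_on_protected_attribute",
    "transact_with_sanctioned_entity",
}

CERTIFIED_AGENTS = {                         # toy certification registry
    "agent-7f3a": {"law_following", "financial_ops"},
}

def accept_request(agent_id: str, required_cert: str) -> bool:
    """Counterparties refuse to interact with agents lacking the required certificate."""
    return required_cert in CERTIFIED_AGENTS.get(agent_id, set())

def lawful_execute(agent_id: str, action: str) -> str:
    if not accept_request(agent_id, "law_following"):
        return "REJECTED: agent is not certified as law-following"
    if action in PROHIBITED_ACTIONS:
        return f"REFUSED: '{action}' violates an encoded legal rule"
    return f"EXECUTED: {action}"

print(lawful_execute("agent-7f3a", "transact_with_sanctioned_entity"))  # REFUSED
print(lawful_execute("agent-0000", "summarise_contract"))               # REJECTED
print(lawful_execute("agent-7f3a", "summarise_contract"))               # EXECUTED
```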
5. Coordinate Internationally
AI agents operate in a borderless digital environment: a harmful system built in one jurisdiction can be deployed worldwide almost instantly. We need coordinated international frameworks. The following global standards can be applied:
- Establish technical definitions and minimum safety standards: International organisations should develop common, enforceable standards for defining and testing agentic AI. The OECD's AI Principles and UNESCO's Recommendation on the Ethics of AI already provide useful guidance; these could be strengthened with enforcement and compliance mechanisms.
- Create information-sharing mechanisms: Establish channels for sharing findings about new AI capabilities, and make reporting of AI incidents and failures mandatory. When one country discovers dangerous capabilities, others must know immediately.
Conclusion
AI agents are already here, and they are already causing problems. The 56.4% jump in AI incidents from 2023 to 2024 shows how quickly this is accelerating. Yet we still lack clear definitions, workable liability rules and oversight mechanisms that operate at machine speed.
AI development can’t simply be halted. But we can prevent the worst outcomes while we learn to govern these systems effectively. The policy recommendations here provide essential first steps. AI agents are tools we create and systems we deploy. What they do reflects our choices about how to govern them.
Works Cited
Artificial Intelligence Incident Database. incidentdatabase.ai. Accessed 14 October 2025.
The 2025 worldwide state of AI regulation.
The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey. arXiv, 17 April 2024. Accessed 14 October 2025.
Agentic Misalignment: How LLMs could be insider threats. Anthropic, 20 June 2025. Accessed 14 October 2025.
IBM. What is Agentic AI?. Accessed 14 October 2025.
Cecco, Leyland. Air Canada ordered to pay customer who was misled by airline's chatbot. The Guardian, 16 February 2024.
Gartner. Over 40% of Agentic AI Projects Will Be Canceled by End 2027. 25 June 2025.
METR. Measuring AI Ability to Complete Long Tasks. 19 March 2025.
OpenAI. Introducing Operator. 23 January 2025.
Deloitte. Autonomous generative AI agents: Under development. 19 November 2024.