Notice & Comment

Determining the Reasonableness of Regulating with AI, by Gilbert Orbea & Emily Froude

This post is the fifth contribution to Notice & Comment’s symposium on AI and the APA. For other posts in the series, click here.

Among the operative principles of administrative law is the requirement that agencies “examine the relevant data and articulate a satisfactory explanation” for their actions, commonly known as the reasoned decisionmaking requirement. What this means in practice is that agencies must show their work. Agencies are obligated to provide a record, replete with explanations for their actions, that would allow a court to conclude after review under the Administrative Procedure Act (APA) that an agency did not act arbitrarily and capriciously. But how do recent advances in AI jibe with this decades-old standard? In these early days of government-by-AI, we preliminarily lay out three factors to guide courts when reviewing the use of AI in the regulatory process:

  1. To what degree Congress, in its statutory mandate to the agency, granted the agency broad latitude to engage in value-laden decisionmaking;
  2. Where and how the AI is put to use in formulating the agency action; and
  3. Whether the subject matter of the agency action touches on rights- or safety-impacting domains.

We posit that courts should review agencies’ use of AI in the regulatory process with greater scrutiny when agencies use it in implementing broad statutory mandates, substantively drafting or producing analysis, or regulating rights- or safety-impacting domains. Conversely, courts should be less skeptical of AI when used in implementing relatively narrow congressional mandates, in interstitial aspects of the regulatory process, or in non-rights- and non-safety-impacting domains. This test is holistic, and the factors are neither conjunctive nor disjunctive.

Additionally, a word of caution: these factors are not intended to be definitive or exhaustive. For example, what it means for an agency to “use” AI—given the plethora of ways that systems like large language models (“LLMs”) can be used at every stage of an agency’s policy formulation—defies comprehensive categorization. The field of AI is rapidly developing, and future AI advances could obviate some of these concerns. As the technology and the law surrounding AI usage continue to develop, how administrative law will adapt to AI use will depend on the facts of any particular case. We provide these three factors to spark conversation on what “reasoning” requires in the context of the current limitations of AI.

Before laying out the factors, we explain certain baseline factual characteristics of AI systems that undergird our analysis. Certain kinds of AI systems draw on patterns in training data comprising billions of words, performing trillions or more calculations simultaneously, to predict the most statistically probable answer to a given prompt. AI developers’ aims—in contrast to agencies serving the people—are not to discern truth or share facts, but to design systems that can predict the most statistically likely next word. This trait can lead to numerous types of so-called “hallucinations,” in which systems produce responses that contradict themselves, confidently assert unfounded or patently false information, or ignore accurate but rare data points in favor of the dominant patterns in their training data. Although they can access peer-reviewed journal articles and high-quality media sources, AI models also absorb and may spit out inaccurate or biased results contained in their training data. Crucially, AI systems that use machine learning refine themselves over time, enabling them to produce outputs that no programmer taught them. This is a fundamental difference from traditional software, which is created when humans “give computers explicit, step-by-step instructions.” These characteristics of AI models indicate that, while often impressively coherent and frequently accurate, many AI outputs currently cannot be comprehensively explained or fully trusted. This calls into question instances where the technology is relied upon to “reason” and, consequently, whether an agency using AI in the regulatory process is able to “show its work” under the APA. These characteristics also point toward our three factors to consider in evaluating the use of AI in the regulatory process.
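
To make the prediction point concrete, the toy sketch below builds a crude next-word predictor from a tiny, made-up training text. Nothing in it comes from any real model or agency system—real LLMs use neural networks with billions of learned parameters rather than simple word counts—but it illustrates the basic mechanic described above: the system continues text with whatever is statistically most common in its training data, with no check on whether the continuation is true.

```python
# Toy illustration only: a bigram "model" that continues text with whichever
# word most often followed the previous word in its (made-up) training data.
from collections import Counter, defaultdict

# Hypothetical training text; the model "knows" nothing beyond these patterns.
training_text = (
    "the rule is final the rule is proposed "
    "the comment period is open the comment period is closed"
)

# Count how often each word follows each other word.
following = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    following[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the statistically most common next word -- not a verified fact."""
    candidates = following.get(word)
    if not candidates:
        return "<no prediction>"
    return candidates.most_common(1)[0][0]

# The output reflects word frequencies alone, regardless of truth; that gap
# between "most probable" and "correct" is where hallucination-type errors live.
print(predict_next("rule"))    # -> "is"
print(predict_next("period"))  # -> "is"
```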

Under the first factor, courts should look to the breadth of the grant of statutory authority an agency is relying on for a particular action. When writing legislation, Congress sometimes provides an agency with a broad grant of discretionary authority, such as when it tasked the Food and Drug Administration with denying applications for tobacco products that are not “appropriate for the protection of the public health.” Or Congress can be much more specific, such as when it tasked the Securities and Exchange Commission in Section 1502 of the Dodd-Frank reform legislation with requiring disclosures by publicly listed companies of their use of “conflict minerals” sourced from the war-torn Democratic Republic of the Congo. Indeed, courts already look to whether a statute gives an agency broad authority when deciding whether to defer to what the agency did. And while agencies generally get greater deference where there is broad authority, in our view, they have a complementary duty to deploy that authority with greater care where AI is involved.

Agencies regularly engage in “value-laden decisionmaking” of the kind that appeared in Department of Commerce v. New York, where Chief Justice John Roberts described the Secretary of Commerce’s role under the Census Act as one of discretion, “mak[ing] policy choices within the range of reasonable options,” which reflect value judgments and the “weighing of incommensurables under conditions of uncertainty.” Agency action that involves an especially high degree of analysis and discretionary judgment also involves making a correspondingly greater number of value-laden policy choices throughout the process of regulating, and the process by which the agency makes each of these choices must be “logical and rational.” Given the technological limitations detailed above, the use of AI where many policy choices must be made to reach final agency action may fail to meet the reasoning bar (though not necessarily always). 

In other words, the more Congress tasks the agency with value-laden decisionmaking, like whether to regulate in the first instance, the more courts should scrutinize how much of that value-laden decisionmaking was made by AI. Conversely, the more Congress has already determined the need for regulation and laid out the bounds of the agency’s delegated authority, the less searching the court’s review needs to be where AI is involved.  

Second, courts should scrutinize when in the regulatory process AI was deployed, and what role it played. In our estimation, AI raises fewer concerns when, for instance, it is used by experts within the agency to generate summaries of documents in an administrative record, or to retrieve specific facts or technical information along with their sources. In these circumstances, AI operates like any tool that can, to a certain degree, be reviewed and controlled by the person using it, such as Google or Westlaw. Policymakers can ensure relevant information is not missed and still make the agency’s final decision on any given issue. In these cases, the judicial need to assess the underlying model’s technical specifications, auditing processes, and the like is allayed, though not eliminated entirely. But where AI is used for more substantive purposes, courts should ratchet up their review of the underlying model and, when necessary, reject its use altogether. For example, an agency in a rulemaking could provide an AI system with all comments, records of past similar rulemakings, and any other materials typically used when drafting a rule, and ask the AI to draft text for a new proposed rule. In these circumstances, the agency shifts its reasoned decisionmaking obligation to the algorithmic black box, asking a tool with all of the (literally) logical limitations of AI to “consider the evidence and give reasons for [its] chosen course of action,” which AI currently cannot do, despite advances in chain-of-thought reasoning.
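
As a rough illustration of why record-retrieval uses are easier to check, consider the sketch below: a hypothetical keyword lookup over a handful of invented record documents (the document names and text are ours, not drawn from any actual rulemaking). Each result carries a source identifier that a human reviewer can trace back to the record, whereas generated draft rule text carries no comparable line-by-line provenance.

```python
# Hypothetical administrative record (invented IDs and text, for illustration only).
ADMINISTRATIVE_RECORD = {
    "Comment-0001": "The proposed emissions threshold will burden small refiners.",
    "Comment-0002": "We support the threshold but request a longer phase-in period.",
    "Study-A": "Modeled health benefits exceed compliance costs at the proposed threshold.",
}

def retrieve(keyword: str) -> list[tuple[str, str]]:
    """Return (source ID, passage) pairs mentioning the keyword, for human review."""
    keyword = keyword.lower()
    return [
        (source, text)
        for source, text in ADMINISTRATIVE_RECORD.items()
        if keyword in text.lower()
    ]

# Every hit is traceable to an identified document, so a policymaker can verify
# it before relying on it -- unlike free-form generated text.
for source, text in retrieve("threshold"):
    print(f"{source}: {text}")
```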

Finally, courts should be especially mindful of AI where it is used in the process of formulating policies that are rights-impacting or safety-impacting. This emphasis aligns with guidance from both the Biden-Harris administration and the Trump-Vance administration. Both administrations concluded that the risks of AI in certain fields warranted even greater documentation, transparency, and public input, given that incorrect or rogue AI outputs in these realms have an enormous potential to harm human lives and national security. Rights-impacting regulatory action, in particular, is likely to involve value judgments. For instance, the Office of Refugee Resettlement implements policies regarding the placement of unaccompanied children. Imagine agency action setting the policy regarding shelter placements provided to pregnant, unaccompanied immigrant teenagers. That would require the agency—and all Executive Branch officials who may insert themselves into this value-laden decisionmaking—to decide between more or less restrictive care placements, supportive or punitive treatment of immigrants, expedited or lethargic vetting of and release to family members, and reproductive health care options. AI systems should not and likely cannot decide how to balance these competing priorities because they do not have human experiences, morals, or intrinsic beliefs to draw upon.

Each of these factors exists along a spectrum, and any arbitrary and capricious challenge is, “as always, of course, [a] question of sufficiency of an agency’s stated reasons under the arbitrary and capricious review of the APA, [which] is fact-specific and record specific.” AI is rapidly evolving, and much of the focus of current research efforts relates to the ability to explain how outputs are constructed and how to minimize hallucinations. The factors outlined above are meant merely as preliminary guideposts to courts—as well as lawmakers and the public—when there arises a question of whether an agency failed in its most fundamental obligation: to articulate a satisfactory explanation of its actions. 

Gilbert Orbea is Staff Attorney, Strategic Initiatives and Emily Froude is Research Analyst at Democracy Forward Foundation. Democracy Forward is a national legal organization that, as part of its work, spearheads the Legal Action Warning (LAW) Project, which provides contemporaneous insights into developments in the law.