Notice & Comment

How Should the Government’s Automated Legal Guidance Evolve?: A Response, by Joshua D. Blank & Leigh Osofsky

This post concludes Notice & Comment’s symposium on Joshua D. Blank and Leigh Osofsky’s Automated Agencies: The Transformation of Government Guidance. For other posts in the series, click here.

Thank you to the editors of JREG for organizing the wonderful symposium this week regarding our new book, Automated Agencies: The Transformation of Government Guidance, and thank you to all the symposium participants for thoughtfully engaging with our work. In our response, we would like to draw out some of the themes that arose from the symposium.

Professors Emily Cauble and Larry Zelenak both ask, in different ways, variations of the same question: What is automated agency guidance good for? Professor Cauble’s question is rooted in concerns regarding the limited reliability of automated agency guidance. Professor Zelenak’s question is motivated by a different kind of limit: the limited questions that agencies’ automated legal guidance tools address, relative to other types of artificial intelligence, such as ChatGPT and other large language models.

Turning first to disclaimers, as Professor Cauble notes, agencies sometimes have clear and broad disclaimers regarding the public’s ability to rely on informal guidance. At other times, no disclaimers are present or, as with the IRS’s Interactive Tax Assistant, the disclaimers are cryptic.[1] Adding to the confusion, Professor Cauble has documented in her own work that non-experts often fail to appreciate what the disclaimers mean: some people think they indicate more protection than they provide; others think the opposite. Whatever the actual, underlying reality, disclaimers leave many wondering why agencies would offer automated legal guidance that is not always a reliable source of information.

Instead of focusing on the limited reliability of the guidance, Professor Zelenak focuses on its limited scope. He provides a colorful hypothetical from the 1990s TV sitcom Roseanne to illustrate how the Interactive Tax Assistant (and, indeed, much of the IRS’s informal guidance) fails to provide answers to some of the most straightforward tax questions, much less the really hard ones. Professor Zelenak muses that, for many of these questions, taxpayers are likely to turn to ChatGPT and other free, generative forms of AI. Thus, while exploring a different issue than Professor Cauble, Professor Zelenak also leaves readers wondering what, exactly, the point of agency guidance like the Interactive Tax Assistant is.

Professor Clint Wallace’s contribution helps us understand, in part, the answer to this question. Agencies’ automated legal guidance may be far from optimal but, as it turns out, we live in a suboptimal world. As Professor Wallace describes, there may have been a moment when we could have hoped for extensive simplification of the tax law and ample IRS resources to apply it, but that moment seems to have passed. Now, more than ever, we live in a world of increasingly complex tax law (and federal law in general), with limited agency resources to explain how the laws apply to a confused public.

Professor Zelenak is surely right that, in this suboptimal world, many members of the public will turn to ChatGPT and other generative technology to try to understand even the basics of how federal law applies to their situations. But recent research suggests that people do so at their peril. For example, a 2025 Journal of Empirical Legal Studies article examined the reliability of leading AI legal research tools, including both general-purpose chatbots (such as GPT-4) and retrieval-augmented generation (RAG) systems offered by leading legal research providers (such as Lexis and Westlaw). The study found that even the RAG systems hallucinated a substantial amount of the time (though less frequently than the general-purpose chatbots), and there were additional problems with these systems’ reliability and accuracy.

The suboptimal, actual state of the world, then, leaves members of the public between a rock and a hard place. On the one hand, members of the public cannot understand the complex federal laws and regulations that apply to them. (Indeed, as one of us has identified in prior work, the government does not even expect members of the general public to be able to understand these sources of law.) On the other hand, the government offers only relatively limited and often unreliable explanations of the law to the public. And, while there is a rise in generative AI that purports to offer answers to all sorts of questions that the public may have, those answers often simply are not correct.

The temptation is strong, in this environment, for agencies to do what they can to make the public’s lives easier (and, not just incidentally, to ease the burden on the agencies that are trying to administer the law themselves). Recognizing the difficulties that agencies and the public face, in other words, helps us understand how we get forms of automated agency guidance that might otherwise be hard to understand. The IRS, for instance, answers millions of user questions a year with the Interactive Tax Assistant, a decision-tree system that appears almost purposely archaic. The Interactive Tax Assistant eschews more sophisticated generative AI and tightly limits the questions that it answers. But it does so precisely because the government is simultaneously under intense pressure to administer a complex legal system with scant resources and deeply concerned about getting it wrong. The government offers a confusing disclaimer so as not to mislead the public about the ability to rely on the system, but also so as not to dissuade people from using it entirely. Agencies, in other words, are self-consciously, and somewhat painfully, trying to have it both ways – to ease burdens on the agency and the public, without actually being able to make the real changes needed to ease those burdens.
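To make the design tradeoff concrete, here is a minimal sketch of how a decision-tree guidance tool works. It is purely illustrative (the IRS has not published the Interactive Tax Assistant’s code, and the questions, dollar threshold, and names here are our hypotheticals), but it captures the architecture’s defining property: every answer is scripted and pre-vetted.

```python
# Purely illustrative sketch of a decision-tree guidance tool.
# Every node is either a scripted question or a pre-approved answer;
# the tool can never say anything its designers did not vet in advance.

TREE = {
    "start": {
        "question": "Did you receive wage income this year? (y/n)",
        "y": "filing_threshold",
        "n": "out_of_scope",
    },
    "filing_threshold": {
        # The dollar figure is hypothetical, not an actual IRS threshold.
        "question": "Was your total income above $14,600? (y/n)",
        "y": "must_file",
        "n": "maybe_exempt",
    },
    "must_file": {"answer": "Based on your answers, you likely must file a return."},
    "maybe_exempt": {"answer": "Based on your answers, you may not need to file."},
    "out_of_scope": {"answer": "This tool cannot answer that question."},
}

def run(tree: dict, node: str = "start") -> None:
    """Walk the tree from `node` until reaching a terminal answer."""
    while "answer" not in tree[node]:
        reply = input(tree[node]["question"] + " ").strip().lower()
        if reply not in ("y", "n"):
            print("Please answer y or n.")
            continue
        node = tree[node][reply]
    print(tree[node]["answer"])

if __name__ == "__main__":
    run(TREE)
```

The tradeoff is visible in the out_of_scope node: unlike a generative model, such a tool cannot hallucinate, but it also cannot say anything about questions its designers did not anticipate.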

In so doing, the agency faces a broader rub that Professor Sarah Lawsky identifies. The more usable and likeable the agency makes these systems, the more problematic they become in some ways. Professor Lawsky describes the classic “ELIZA effect,” in which users project human characteristics onto automated systems and even forget that they are conversing with machines. As we have documented, and as Professor Lawsky highlights, agencies use evaluation methods like user surveys to determine whether their automated legal guidance systems are working. Agencies also try to make their advice to the public as concise as possible, in accord with the public’s preference to read as little as possible. Agencies give their automated systems human-sounding names like “Emma” and “Aidan” and sometimes even give these systems human-looking faces. All of this is designed to make the systems more palatable and to make humans like them more. It also helps obscure many of the systems’ weaknesses, including that they are not, in fact, human, and that people cannot actually rely upon what they say.

Professor Kristin Hickman crystallizes how agencies’ attempts to make their automated legal guidance usable (and used) by the public generate frictions in administrative law. Professor Hickman has long and carefully assessed how administrative law applies to agency communications, and how agencies (and especially the IRS) may be missing the mark in applying it. Focusing on automated legal guidance, Professor Hickman characterizes agencies as being “willfully blind” in ignoring how members of the public rely on this guidance to organize their affairs. She warns that, while agencies may find it expedient to generate reliance without underlying reliability, the long-run effect may be a decline in public faith. And yet, Professor Hickman acknowledges the difficult fit between traditional administrative law doctrines and automated legal guidance. Part of the problem, as Professor Hickman highlights, and as we examine in our book as well, is that administrative law is premised on the idea that sophisticated parties and lawyers will be the interpreters of agency guidance. Consequently, when tools like automation democratize access to agency guidance, some of the traditional guardrails of administrative law fail to protect the public.

In some ways, we might see the task agencies are facing as akin to that of a teacher who is struggling to convey material that her student does not understand, while still trying to preserve her teaching reputation among her students. This is a difficult needle to thread. If the teacher had all the resources in the world, she could work with the student at length to make sure the student understands the material. If the teacher had much greater control than she actually has, she would change the underlying material to make it easier to understand. Absent unlimited resources and control, the teacher does the best she can by helping the student gain an approximate understanding, in the process glossing over the fact that the student may not fully understand it. Viewed in this way, we can better appreciate the difficult task that agencies face and empathize with their efforts.

But we can do better than just empathizing with the difficult task federal agencies face. Former National Taxpayer Advocate Nina Olson’s trenchant comments, based on her government experience, illustrate well the ways in which automation can create serious problems that simply do not need to be created. One poignant example she provides is when the IRS, without any prior notification or publicity, directed its programmers to change the “attainment of age” rule in its processing systems, a change that had surprising and adverse consequences for thousands of taxpayers. Olson also highlights the same dynamic Professor Lawsky identified: these automated systems may contain systematic errors that deny taxpayers relief to which they are entitled, even as taxpayers report satisfaction with the programs. Especially as more and more people come to rely on automation without an adequate understanding of the underlying law, we face the risk of automated systems blissfully applying rules that are not actually faithful to that law.
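To see how much can turn on such a programming choice, consider a toy illustration. Under a common-law rule long used in tax administration, a person attains an age on the day before her birthday, which matters most for taxpayers born on January 1. The sketch below (ours, with a hypothetical age-65 test, and not a reconstruction of the particular change Olson describes) shows how silently switching conventions moves such a taxpayer’s eligibility by a full tax year:

```python
from datetime import date, timedelta

def year_age_attained(birthdate: date, age: int, day_before: bool) -> int:
    """Calendar year in which `age` is attained.

    day_before=True applies the common-law convention (age is attained
    the day before the birthday anniversary). Ignores Feb. 29 birthdays
    for simplicity; the age test itself is hypothetical.
    """
    anniversary = date(birthdate.year + age, birthdate.month, birthdate.day)
    attained = anniversary - timedelta(days=1) if day_before else anniversary
    return attained.year

# A taxpayer born on January 1, 2000, tested against a hypothetical
# age-65 eligibility rule:
birthdate = date(2000, 1, 1)
print(year_age_attained(birthdate, 65, day_before=True))   # 2064
print(year_age_attained(birthdate, 65, day_before=False))  # 2065
```

One flag, changed without notice, shifts the relevant tax year for every affected taxpayer, exactly the kind of unannounced, population-wide effect Olson warns about.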

Together, the contributions in this symposium help us home in on a tension at the heart of the government’s use of automated legal guidance: this guidance is both increasingly critical to keeping the public informed of how federal law applies and increasingly dangerous. The more agencies turn to this form of guidance, the greater the possibility of non-transparent changes that adversely affect the public, the greater the threat to norms of transparency and participation in agency implementation of the law, and the greater the risk of access-to-justice gaps between well-resourced individuals and the general public. The promise of automation, while great, is accompanied by perils.

We hope that our book and this symposium will help dispel the belief that federal agencies’ use of automated legal guidance is all good or all bad. It can be an important tool, it is almost certainly going to be an indispensable one, and it deserves our continuing attention and critique.

Joshua D. Blank is Professor of Law at the University of California, Irvine School of Law, and Leigh Osofsky is the William D. Spry III Distinguished Professor of Law at the University of North Carolina School of Law.


[1] The Interactive Tax Assistant disclaimer states, “Answers do not constitute written advice in response to a specific written request of the taxpayer within the meaning of section 6404(f) of the Internal Revenue Code.”