Transparency and Authority Concerns with Using AI to Make Ethical Recommendations in Clinical Settings

Jeffrey Byrnes and Michael Robinson

Nursing Ethics 32(6): 1749–1760, September 2025. DOI: 10.1177/09697330241307317

In response to recent proposals to use artificial intelligence (AI) to automate ethics consultations in healthcare, we raise two main problems with the prospect of having healthcare professionals rely on AI-driven programs for ethical guidance in clinical matters. The first cause for concern is that, because these programs would effectively function as black boxes, this approach seems to preclude the kind of transparency that would allow clinical staff to explain and justify treatment decisions to patients, fellow caregivers, and those tasked with providing oversight. The second is that the kind of authority these programs’ guidance would need to carry in order to do the work set out for it would leave clinical staff unable to provide meaningful safeguards in those cases where its recommendations are morally problematic.

Introduction

The power of artificial intelligence (AI) is increasingly being brought to bear on all manner of industries, reshaping them in ways we can only begin to anticipate. Healthcare is no exception. From reading mammograms for indications of cancer and discovering new biomarkers for fetal congenital heart defects to precision dosing and identifying patients at high risk for readmission, there are few aspects of the field that AI does not promise to affect in some way. Recently, some have proposed using AI to assist healthcare professionals in making ethical decisions as well. The basic idea is to use this emerging technology to develop programs that can perform the consulting role that clinical ethicists currently play in many large hospitals.

The benefits of using AI in medicine are increasingly apparent. The potential benefits of using AI in clinical ethics consultation are also clear, given that even a small ethical consultation program requires recruiting, funding, and managing specialized employees. Licensing a piece of software—which doesn’t get tired or need time off, or have preferences about where it is located—would be easier and less costly than hiring full-time professionals with advanced training in clinical ethics. There is little doubt, then, that this will be an attractive option to many healthcare administrators, especially those tasked with managing institutions that are rural or underfunded. Although this impulse is understandable, we want to take the present moment, while AI ethical consultants are still in the developmental stages and not quite ready for deployment, to sound a cautionary note.

Here we draw attention to two problems that warrant special consideration in connection with the prospect of having clinicians rely on ethical guidance from AI-driven programs when providing patient care. One has to do with the kinds of explanations that are and, more importantly, are not possible to have with AI-generated ethical recommendations. The other involves a dilemma concerning the kind of authority that should be bestowed upon recommendations of this sort and whether and when clinical staff should be empowered to contravene them. While there has been a good deal written about some of the potential pitfalls accompanying a range of AI applications in healthcare, proposals to employ AI for real-time ethical guidance in clinical settings are very new and little has been said about them. Moreover, for reasons we will explain, this particular usage of AI makes solving these problems more challenging than it is with other applications of AI. Given the high stakes in these cases, however, it is imperative that these worries are resolved prior to any serious consideration of implementing these proposals.

The role of clinical ethicists

To start with, then, what is it that clinical ethicists do that could potentially be accomplished by sophisticated AI? Clinical ethicists offer real-time ethical guidance for clinical professionals, patients, and families about ongoing care decisions. It is not unusual for healthcare professionals to face challenging patient situations in which ethical concerns lead to uncertainty, and even distress, about what they should do. Such cases can be particularly morally distressing for nurses, given both the subsidiary orientation of their professional role and the sustained and personal nature of their interactions with patients, as has been well documented. The fundamental reason this occurs is that healthcare professionals do not have a single main objective; they have a multiplicity of goals and obligations, which often pull in different directions. For instance, they have a duty to pursue the best health outcomes for their patients, but they also have an obligation to respect their patients’ autonomy, provide them with the relevant information, and honor their wishes. They must seek to prevent harm, but they are also bound to protect their patients’ privacy and to distribute resources in a way that is fair and just.

When these duties conflict—as they often do—it is necessary to determine which of the conflicting moral considerations outweigh the others in that particular case. Part of what makes this so challenging is that these duties do not admit of a lexical ordering of priority. Each of these duties is capable of being overridden by others in certain circumstances. Sometimes autonomy concerns are dominant. Other times autonomy concerns are outweighed by other moral considerations (either because the patient’s capacity is diminished, or because the likelihood of serious and irreversible harm is so great, or both). Figuring this out is often difficult—especially for clinical staff who lack special training in identifying, distinguishing, and evaluating ethical considerations. In such situations it is increasingly common for clinical staff to consult ethicists, when they are available, to assist them in thinking through the moral dimensions of the case and to help them determine the best way to proceed. This assistance often includes articulating staff feelings of moral distress as concrete ethical questions, identifying the moral costs and benefits of each of the possible paths forward, explaining how the case at hand is relevantly similar to (or distinct from) other cases that have morally clearer solutions, and offering ongoing analysis and explanations of the case as new questions continue to arise when new clinical teams engage with the patient’s care.

To illustrate, consider a relatively common kind of case involving an elderly patient with advanced dementia who is found to have cancer. Treating the patient with chemotherapy may well be effective in suppressing or even reducing the cancerous tumor, but this particular treatment would almost certainly take a considerable toll on the patient’s quality of life in order to achieve what is likely only a marginal to moderate benefit. Typically, of course, physicians allow patients to make these decisions for themselves. If the patient’s advanced dementia means that she lacks the capacity to make these determinations, though, how should they proceed? The team of healthcare providers must determine whether starting chemotherapy in the hopes of extending the patient’s life at the cost of some comfort and quality of life is the morally correct thing to do. In situations like this, a clinical ethicist can support the care team in weighing the potential benefits of this course of treatment with the potential harms.

Developing AI to play this role

So, how could AI be used to develop software that could play this role? According to the proposals on offer, this would be very similar to the way machine learning is being used to create programs to assist radiologists in evaluating X-ray images to determine whether they indicate tumors. While there are different versions of this general approach, all methods begin with a training phase wherein programs are fed descriptions of cases and proposed medical interventions together with a verdict (or “label”) indicating whether the proposed intervention would be morally permissible or problematic (as determined by a consensus of clinical ethicists). This is where AI comes into play. The program is then given the task of employing machine learning to search for patterns among the case descriptions and the ethical permissibility of the various interventions in order to generate a rule, or algorithm, about when a given medical intervention is permissible and when it is problematic.

After the program has developed an algorithm that yields the correct judgments about the first batch of cases inputted at the start of the training phase, the next step is to test it. To do this, new pairs of case descriptions and medical interventions are fed to the program—this time without labels indicating the ethical permissibility of the interventions in those circumstances—and the program is directed to indicate whether the proposed intervention would be morally problematic. In instances where the program’s verdicts are incorrect, the program can use this feedback to revise its algorithm to account for this new data.
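As a rough illustration of the train-and-test loop just described, the sketch below (in Python, using scikit-learn) trains a classifier on a handful of invented, ethicist-labeled case descriptions, tests it on an unlabeled case, and folds misclassified cases back into the training data. The features, cases, labels, and choice of model here are all assumptions made for the sake of illustration; none of them is drawn from the published proposals.

```python
# Minimal sketch of the training and testing phases described above, using
# scikit-learn. Every feature, case, and label is invented for illustration.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction import DictVectorizer

# Training phase: case descriptions plus a proposed intervention, each
# labeled "permissible" or "problematic" by a consensus of clinical ethicists.
training_cases = [
    ({"patient_has_capacity": 1, "harm_if_untreated": 0.9,
      "burden_of_intervention": 0.2, "patient_refuses": 1}, "permissible"),
    ({"patient_has_capacity": 0, "harm_if_untreated": 0.9,
      "burden_of_intervention": 0.8, "patient_refuses": 1}, "problematic"),
    ({"patient_has_capacity": 1, "harm_if_untreated": 0.3,
      "burden_of_intervention": 0.7, "patient_refuses": 1}, "problematic"),
    ({"patient_has_capacity": 0, "harm_if_untreated": 0.8,
      "burden_of_intervention": 0.1, "patient_refuses": 0}, "permissible"),
]
vectorizer = DictVectorizer(sparse=False)
X_train = vectorizer.fit_transform([case for case, _ in training_cases])
y_train = [label for _, label in training_cases]

# The "algorithm" the proposals refer to is whatever decision rule the model
# induces from these labeled examples during fitting.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Testing phase: new cases are scored without labels, the verdicts are then
# compared against ethicists' judgments, and misclassified cases are folded
# back into the training data before refitting.
test_cases = [
    ({"patient_has_capacity": 1, "harm_if_untreated": 0.6,
      "burden_of_intervention": 0.9, "patient_refuses": 1}, "problematic"),
]
X_test = vectorizer.transform([case for case, _ in test_cases])
predictions = model.predict(X_test)

for (case, expert_label), predicted in zip(test_cases, predictions):
    if predicted != expert_label:
        training_cases.append((case, expert_label))
        X_train = vectorizer.fit_transform([c for c, _ in training_cases])
        y_train = [lbl for _, lbl in training_cases]
        model.fit(X_train, y_train)  # revise the learned rule with feedback
```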

In theory, once the program has demonstrated itself to be sufficiently accurate through this process, it could be clinically deployed. At that point, clinical staff confronting ethically challenging patient situations could input case details together with the interventions they are considering and receive an ethical recommendation. As noted at the outset, however, there are at least two reasons that providers should think long and hard before pursuing this kind of program.

The transparency concern

The first major worry we want to highlight concerns transparency. Healthcare professionals handling morally complicated cases need more than a bare ethical recommendation; they need to understand the recommendation in a way that enables them to justify their decision and make it comprehensible, not only to patients and their families but also to colleagues. This is an important part of what clinical ethicists do. They dialogue with healthcare providers, asking questions about pertinent clinical factors, answering questions about salient ethical considerations, and explaining why certain treatment options are morally preferable while others might be morally problematic. Doing this in a way that adequately clarifies the moral rationale behind a given choice requires more than merely identifying the moral considerations that favor that option (and disfavor alternatives); it involves explaining why the moral considerations that favor that option (and disfavor alternatives) outweigh the moral considerations that favor alternative courses of action. AI programs are incapable of providing explanations of this sort. As indicated above, the way these programs work is by applying an algorithm they have generated, which provides a long, complicated rule for weighing certain kinds of values (representing certain moral considerations) against other values (representing other moral considerations) given specific background features of a case.

In practice, of course, hospital staff utilizing these programs will not have access to the algorithm being applied to their cases. For them, the software will function like a black box: they input details of the case and then receive a recommendation, with little to no understanding of why those details yielded that verdict. As a result, clinicians relying on AI for ethical recommendations in this way would not be in a position to adequately justify or explain their decisions.

Explicability is a well-known weakness of existing AI, especially in the most powerful and accurate models, those utilizing deep learning. In these models, even if one were able to gain direct access to the algorithm and discover the specific values assigned to each of the many nodes that constitute the deep neural network, there is no way to decipher what those values represent. As a result, there is no way to tell what kind of weight the algorithm has given to the various factors that have contributed to its output or why. Although there have been numerous efforts to increase transparency and explicability in different AI models, none has been capable of providing the kind of explanation required for the purposes at issue in this paper. The best that the various methods used in pursuit of explainable AI have been able to accomplish is to identify the factors that were particularly salient and played a significant role in generating the outputted verdict or recommendation. What they cannot do, though, is explain why those factors yielded the results they did.

The same is true of recent proposals to use AI to provide real-time ethical advice to clinicians. Those that propose utilizing fuzzy cognitive maps would allow programmers to designate specific nodes to represent particular ethical values, such as autonomy and beneficence. This would make it at least possible for clinical staff to learn how the algorithm has weighted the various conflicting moral considerations in a given situation and to see which it has deemed more salient. Similarly, proposals that direct AI to use symbolic methods would rely on rules that can be translated into natural language and that specify the conditions in which ethical considerations of certain sorts (e.g., beneficence) outweigh other sorts of ethical considerations (e.g., autonomy). This would make it possible, at least in principle, to identify the rule on which a given recommendation was based—even if that rule would have to be so complex as to render it nearly incomprehensible. The most important thing to note here, however, is that neither of these approaches is able to provide the kind of explanation needed, because neither is able to explain why certain ethical considerations have been deemed weightier, or more significant, than others.
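To make this concrete, the following toy sketch (in Python) mimics the kind of inspectability such approaches could offer. It is a drastically simplified, linear stand-in rather than an implementation of an actual fuzzy cognitive map or symbolic rule learner, and every node name, activation, and weight in it is invented for illustration: the learned weights can be printed and inspected, but nothing in the model records why one consideration ended up weighted more heavily than another.

```python
# Toy stand-in (not an actual fuzzy cognitive map) for the kind of
# inspectability these proposals offer. Every node name, activation, and
# weight below is invented for illustration.
from dataclasses import dataclass

@dataclass
class EthicsNode:
    name: str          # the moral consideration this node represents
    activation: float  # how strongly that consideration applies in this case
    weight: float      # the weight the training process happened to assign it

def recommend(nodes: list[EthicsNode]) -> str:
    # A positive weighted sum favors honoring the refusal; a negative one
    # favors overriding it. This linear rule is a placeholder for whatever
    # far more complex rule a trained system would apply.
    score = sum(n.activation * n.weight for n in nodes)
    return "honor the refusal" if score > 0 else "override the refusal"

# Hypothetical weights that training might have produced for cases like the
# antiretroviral refusal; in a real system they would be outputs of learning,
# not values anyone chose or argued for.
case_nodes = [
    EthicsNode("respect for autonomy", activation=0.8, weight=+0.71),
    EthicsNode("beneficence (health benefit of the drug)", activation=0.9, weight=-0.42),
    EthicsNode("nonmaleficence (burden of forced dosing)", activation=0.7, weight=+0.33),
]

print(recommend(case_nodes))
for node in case_nodes:
    print(f"{node.name}: weight {node.weight:+.2f}")
# The printout shows *how* the considerations were weighted against one
# another, but the model contains nothing further that could answer the
# question of *why* they were weighted that way.
```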

Even if users were given access to how the AI system has weighted various competing ethical considerations, there would be no access to why they have been weighted that way. Not even the programmers themselves would be capable of working out why the AI had assigned the relative weights to the various nodes (representing certain moral considerations) in the way that it had or why it had generated the rules it did. That is part of the independent work of the AI that remains inscrutable. Without knowing why certain ethical considerations have been deemed weightier than others, however, there can be no meaningful explanation of the sort required in the current context.

To help see why this is true, consider the following case, which is based on a real-life scenario encountered by one of the authors in their role as a clinical ethicist. A 17-year-old male patient has been HIV positive since birth and has been on a regimen of antiretroviral drugs his whole life. During a hospital admission for unrelated minor treatment, the burden of routine medical interventions became too much for him, and he began refusing his antiretroviral drugs. Given the serious health risks involved in curtailing the medication, the nursing staff experienced considerable distress over his refusal, even though he was a mature minor who had demonstrated a clear grasp of his situation over years of treatment. At the same time, the thought of forcibly administering the medication was deeply troubling to bedside caregivers. For this reason, the team was conflicted about what to do. In this case, the nursing staff, in communication with the treating physician, consulted the hospital’s on-call clinical ethicist, who was able to help the clinical team think through their conflicting duties. The ethicist affirmed the team’s sense that there are occasions when it would be ethically appropriate to override a distressed patient’s refusals and forcibly administer a life-saving medication, especially when a single dose is all that would be required. Yet in order to deliver the life-saving benefit to this HIV-positive patient, his refusal would have to be overridden and the medication forcibly administered twice a day, indefinitely. Such a course of action would substantially undermine the benefit of the drug, since its long-term benefits could be realized only with the patient’s cooperation. The discussion with the ethicist brought the clinical team to a clearer understanding of the morally significant differences between this case and other, somewhat similar cases in which forcibly administering life-saving medication would be the right thing to do. Equipped with this enhanced understanding of the relevant ethical considerations at play, the team was able to engage with the patient’s family and their colleagues with much greater clarity and confidence about how they ought to proceed and why.

Now imagine that, instead of an on-call ethicist, the nursing staff had consulted the kind of AI program that has recently been proposed. After inputting the required information characterizing the situation, including both the likelihood and severity of the potential harms and benefits of the different courses of action being considered, they would have received a recommendation. Suppose the program indicated that they should honor the patient’s wishes and discontinue the antiretrovirals. From this, the clinical team could infer that, according to the algorithm driving the AI program’s calculations, the ethical considerations of preventing harm and promoting health were outweighed by the considerations having to do with respecting patient autonomy in this particular case. Importantly, however, the clinical team would have no way to rationally infer why those considerations were outweighed by autonomy concerns in this case, nor would there be any explanation—of the sort that the clinical ethicist was able to provide above—of what distinguished it from other cases in which autonomy considerations were not dominant.

Depending on the software and the steps taken to try to increase the AI model’s transparency, it is possible that the nursing staff might have been able to see either the precise numerical values that the algorithm had assigned to the nodes representing the potential benefit to the patient’s health and the cost of overriding his autonomy or, in the case of models using symbolic methods, the complicated rule that the algorithm had developed, spelling out the various circumstances in which beneficence considerations outweigh autonomy considerations. But none of that would serve to clarify why the AI program determined that discontinuing the antiretrovirals was the morally right thing to do. Like most ethically challenging cases, this is a situation in which there are conflicting ethical values and duties. The duty of beneficence to act in ways that will best promote the patient’s health provides a reason to forcefully administer the drugs against the patient’s will. The duty to respect the patient’s autonomy provides a reason not to do so. The challenge of figuring out the morally correct thing to do here just is the challenge of figuring out which of these conflicting moral reasons outweighs the others. That is simply what it is for a course of action to be the right thing to do: it is for the moral reasons favoring that course of action to be stronger than the moral reasons favoring the alternative courses of action. As a result, any adequate explanation of why a given course of action is the morally right thing to do will necessarily involve an explanation of why the moral reasons favoring that course of action outweigh the conflicting moral reasons. And any substantive explanation of why a person (or program) has determined that a given course of action is the right thing to do is going to have to explain why that person (or program) deemed the moral reasons supporting that course of action stronger than the moral reasons opposing it. As we have seen, however, this is the very point on which even the most transparent AI models are unable to provide any kind of explanation.

If colleagues or the patient’s family members had asked the nursing staff to explain why they thought discontinuing the antiretrovirals was the right thing to do, what kind of answer could they have provided in this case, other than merely reporting that that is what the AI program said they should do? What could they say if asked to explain why the program made that recommendation or why it determined that was the morally right thing to do? They could say that the AI program determined that their duty to respect the patient’s autonomy was stronger than their duty to seek the best health outcome for the patient in this case (perhaps even accompanied by some concrete numerical values). But this is no more an informative explanation than if you were to ask us to explain why Team A beat Team B in the football game and we answered that it was because Team A scored more points than Team B. To say that the moral reasons favoring discontinuing the antiretrovirals outweigh the moral reasons favoring forcefully administering the drugs is merely to assert that discontinuing the antiretrovirals is the morally right thing to do; it does nothing to explain this. Similarly, knowing that the AI program has assigned greater value to the factors favoring discontinuing the antiretrovirals than it has to the factors favoring forceful administration does not do anything to explain the program’s recommendation or make it any more comprehensible. Given that this is the best that even the most transparent AI models being proposed are able to do, it seems that the kind of explanations that are needed in this particular context are not ones that AI is close to being able to provide.

The authority concern

The second worry we want to raise is about the kind of authority that would be vested in AI programs tasked with making ethical recommendations in clinical settings. One way to understand this worry is as a dilemma: either the ethical recommendations these programs issue would be treated as discretionary suggestions that clinical staff are free to disregard, or they would be taken as decisive rulings to be followed as a matter of course. The problem with the former is not merely that these issues lie beyond the scope of clinicians’ expertise but that, even by clinicians’ own lights, these are situations where they are conflicted and unsure what to do. That is why they are seeking ethical guidance in the first place. If clinicians are going to follow the AI’s recommendations only when those recommendations align with their own predispositions, and otherwise set them aside, then there is little point in soliciting them at all; consulting the program would merely add a superfluous step. The alternative horn of the dilemma is equally problematic. If clinical staff are not empowered to contravene the program’s guidance when it seems troubling, then, in cases where the program’s algorithm yields verdicts that are genuinely mistaken—and we have every reason to think this will continue to happen at least sometimes beyond the training phase—there will be no safeguards in place to prevent the morally problematic guidance from being put into action.

Consider how this dilemma might arise in the case of the clinical team caring for the HIV-positive 17-year-old who was refusing to continue taking his antiretrovirals. Feeling some moral distress about the case, the clinical team would seek guidance from the AI ethical consultant, inputting the relevant information into the program, which would return a recommendation. Suppose the AI program determined that the morally correct thing for the treatment team to do was to heed the patient’s refusals and discontinue the antiretrovirals until he could be brought back on board with his treatment plan. Now imagine that the hospital-wide training on the use of the AI ethical consultant program made very clear that the technology was available for clinical teams to consult but that they were not bound to follow its advice: users would retain the final say in ethical decision-making. In that case, with the program’s recommendation to heed the patient’s refusal in hand, but without any new or improved understanding of the ethics of the case, the clinical team would be in virtually the same position they were in when they initially consulted the program. If heeding the patient’s refusal seemed like the right thing for them to do, then the team would follow the program’s recommendation. If the team was not predisposed to think that allowing the patient to refuse the medications was the right thing to do, then it is unlikely that the team would follow the recommendation. In a context such as this, it is worth asking just what benefit the AI-driven program would be bringing to the hospital. Clearly, the recommendations it issued would not be shaping clinical practice beyond what clinicians were already inclined to do. Indeed, in such a case, it is likely that the treatment team itself would quickly learn that gathering and entering the relevant data into the consultant program was a wholly wasted step, because they were going to have to tackle their initial question all over again when deciding whether to follow the recommendation.

Alternatively, suppose hospital staff were told that the AI ethics consultant was being introduced into the hospital in order to provide ethics expertise at a higher level than the existing staff are able to provide, and so the AI program should be considered authoritative and all of its recommendations should be regarded as tantamount to directives. And now imagine that, when consulted about the 17-year-old refusing his antiretrovirals, the AI program recommended overriding the patient’s refusal and forcibly administering the medication twice a day. How should hospital staff respond to this recommendation—especially if they remain unconvinced, or even skeptical, that this is the right thing to do? It is certainly possible, after all, that this problematic recommendation merely reflects a gap in the AI’s training data, a kind of case that had not yet been addressed. Perhaps there is a minor error in the design of the program, of the sort that typically emerges when a new system is deployed. Or maybe the clinical team simply made a typo in the data they entered when consulting the program about this case. No matter: once the program has been deemed authoritative, and its use becomes standard practice, its recommendations must be followed. Once this occurs, it is highly unlikely that bad recommendations will be identified as such. (A further consequence is that both minor bugs and major gaps in the training data would be likely to become entrenched in the program and to keep morally problematic recommendations being put into action.)

One might think this danger could be alleviated by having clinical ethicists on staff to act as safeguards and help adjudicate these situations—but that undermines the primary motivation for adopting these programs in the first place, which is to serve as cost-saving alternatives to having clinical ethicists on staff. If a hospital has clinical ethicists on staff to double-check the recommendations issued by AI ethical consultants—which can be done only by conducting a review of the cases themselves—that obviates the need for the AI consultant. So, the most obvious way of blunting the second horn of the dilemma appears to be ruled out on pragmatic grounds. Whether there is some other way to neutralize the threat posed by this option without undercutting the very impetus for this approach remains to be seen.

At this point, some readers familiar with these issues might wonder how the authority concern we have raised here is related to another problem that has received a good deal of attention, which has to do with responsibility. As others have argued, one of the negative side effects of introducing AI-generated recommendations into high-stakes decision-making systems, such as those that exist within healthcare institutions, is that it obscures matters of responsibility and frustrates accountability. In situations involving adverse effects resulting from clinicians following an AI-prescribed course of action, who should be held accountable for the negative outcome? The nurse or doctor who followed the AI’s recommendation? The computer programmers, or perhaps the computer programming company, who oversaw the training and development of the AI-driven program? The hospital administrators who signed off on the implementation of these programs in their clinical settings? Figuring out how to properly assign responsibility in institutional environments is tricky enough as is, and there is a significant body of scholarship addressing this topic. Incorporating non-agential entities, such as AI, into the decisional structure further complicates an already challenging task.

How one addresses the authority concern that we have raised here will certainly have implications for the responsibility problem, but these are distinct issues. The matter concerning authority that we are focused on is, in a way, the more fundamental of the two, because determinations of responsibility are partly a function of facts about authority. Answering questions about who should be held accountable for an outcome in a given situation largely depends on figuring out who was in charge of causing or allowing that outcome and who had the authority to prevent it. The level of authority bestowed upon AI-generated recommendations will, ipso facto, have an inverse impact on the authority that clinical staff have to contravene those recommendations: the more authority granted to the AI’s recommendations, the less clinical staff are empowered to disregard them, even when those recommendations strike them as mistaken. So, in addition to the dilemma we set out above, the question of what kind of authority it would be prudent to accord AI ethical consultants will also have a significant impact on the responsibility clinicians ought to bear for carrying out those recommendations.

Objections and replies

Before concluding, two potential objections are worth briefly addressing. The first has to do with our claim that programs utilizing AI to offer ethical guidance would be incapable of providing the kind of explanation for their recommendations that transparency considerations seem to demand in clinical contexts. At this point, some readers might be thinking that, even if none of the proposals currently on offer involve producing explanations of this sort, it is hard to see what prevents developers from modifying these proposals to include this functionality. After all, large language models (LLMs), such as ChatGPT, are perfectly capable of providing explanations on demand. Of course, given the way that LLMs produce their responses—by calculating the most likely continuations of inputted text—one can see why we might not want to rely on something like ChatGPT to produce the recommendation itself. But we could easily imagine a hybrid approach, wherein clinicians would use the kinds of programs described earlier to provide the actual ethical recommendations and then turn to an LLM like ChatGPT to explain why the course of action recommended by the first program (the computerized ethical consultant) is, morally speaking, the best thing to do in the present situation, or why the alternatives are more morally problematic.
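The structure of such a hybrid pipeline can be sketched in a few lines. In the sketch below, `ethics_model` and `vectorizer` stand in for a trained recommendation program of the kind described earlier, and `query_llm` is a hypothetical placeholder for whatever LLM service an institution might license; none of these names corresponds to a real system or API. What the sketch makes visible is that the language model is handed only the case description and the final verdict, never the first program’s internal weighting.

```python
# Sketch of the hybrid approach floated in this objection. All names here are
# invented for illustration: `ethics_model` and `vectorizer` stand in for the
# trained recommendation program sketched earlier, and `query_llm` is a
# placeholder for an LLM service, not a real API.

def query_llm(prompt: str) -> str:
    """Placeholder for a call to a large language model (hypothetical)."""
    # A real deployment would send `prompt` to whichever model the
    # institution licenses and return its generated text.
    return "Respecting the patient's settled refusal best honors autonomy ..."

def hybrid_consult(ethics_model, vectorizer, case: dict) -> tuple[str, str]:
    # Step 1: the trained program issues a bare verdict for this case.
    recommendation = ethics_model.predict(vectorizer.transform([case]))[0]

    # Step 2: an LLM is asked, after the fact, to justify that verdict. The
    # prompt contains only the case description and the verdict; the LLM has
    # no access to the first program's learned weights or rules, so its
    # fluent answer is generated independently of whatever drove step 1.
    prompt = (
        "A clinical ethics support tool has recommended the following course "
        f"of action: {recommendation}. The case is: {case}. "
        "Explain why this is the most ethical option."
    )
    explanation = query_llm(prompt)
    return recommendation, explanation
```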

We certainly acknowledge that LLMs are, generally speaking, capable of producing explanations. For instance, we have no doubt that if asked to explain why the ocean tide goes in and out, or why smoking cigarettes can cause cancer, ChatGPT would provide explanations of these phenomena. Our contention, however, is that LLMs are incapable of providing the kind of explanation that is at issue here—namely, the same kind of explanation that a clinical ethicist could, and is reasonably expected to be able to, provide if asked to explain why they think the course of action they are recommending is the most ethical option (or, at least, why they think it is less morally problematic than the available alternatives).

To see this, it will be useful to consider an example. Imagine an obstetrician has a patient who is in the early stages of labor. After several hours, once it becomes clear to the obstetrician that things are not proceeding nearly as quickly as he hoped they would, he informs the patient that he thinks it would be best to perform a C-section, rather than continuing to pursue the original plan of a vaginal delivery. Suppose that the real reason the doctor makes this recommendation is that he desperately wants to squeeze in a round of golf before heading home, which he knows he can accomplish if they opt for a C-section but is increasingly unlikely if they stick with the patient’s original plan. Now, the obstetrician knows perfectly well that, if he is asked why he opted for a C-section in this case, he cannot say that it is because he wanted to play golf. So, imagine that he uses Google to search for a list of reasons that physicians might prefer a C-section to a vaginal delivery. The search results are going to provide him with reasons that some physicians in some cases would prefer a C-section over a vaginal delivery, and so there is a perfectly reasonable sense in which the search results would be providing an explanation of a sort. Indeed, they would provide explanations that physicians commonly offer for choosing a C-section. Consequently, if the imagined obstetrician in our example were to offer one of those reasons when asked to explain why he thought a C-section was the right way to go in this case, it is perhaps true that he would be offering a genuine explanation. The problem is that it would not be the right kind of explanation—or, rather, it would not be an explanation of the right thing. It would not be an explanation of why he had recommended a C-section in this case, because the reasons he stated would not be the reasons that actually caused him to make that recommendation. Instead, it would be something closer to a post hoc rationalization.

Our worry with the kind of hybrid approach sketched above is that something very similar to the obstetrician example would be occurring. Given the way LLMs operate, if an LLM, such as ChatGPT, were provided with a description of a clinical scenario and asked to explain why a certain course of action—the one recommended by a different AI-driven program, the computerized ethical consultant—is morally preferable to a range of alternatives, the answer we receive would be equivalent to a post hoc rationalization. It might be an explanation of why some people might recommend that particular course of action in a case like this, but it would not explain why the other AI-driven program produced that recommendation. (Note: This is the only point that the above example involving the obstetrician is meant to help illustrate. The example is not offered either as an instance of a genuine ethical conflict or as analogous to the typical case in which a clinical ethicist would be consulted.)

The second potential objection we want to consider focuses on what we have described as the authority problem: either clinical staff will be empowered to contravene the recommendations made by the AI-powered programs, in which case there seems to be no point in soliciting those recommendations to begin with, or clinical staff will not be empowered to second-guess the verdicts issued by these programs, in which case there are no safeguards against implementing those verdicts in the nearly inevitable situations in which they are incorrect. The same seems to be true, it might be thought, of clinicians’ relationship to the moral recommendations offered by human clinical ethicists. Clinicians who request ethical consultations are not bound by the consultants’ judgments in our current system, and yet no one takes this to render such consultations pointless; that appears to dull the sharpness of the first horn of the dilemma. Insofar as the prospect of empowering clinical staff to contravene the ethical recommendations made by AI is problematic, it would not obviously be any more problematic than our current situation with human clinical ethicists, and so would not provide a strong reason against adopting some of the AI proposals that have been put forward.

In our view, there are at least two significant differences between these scenarios which provide reasons for thinking that the authority concern applies much more forcefully when the recommendations are provided by computer programs rather than by human clinical ethicists. First, the clinical ethicist’s recommendations come with reasons (or justifications) offered in an attempt to persuade. This generates pressure to address those reasons if one is not going to follow them. Moreover, the responses a clinician might give for thinking the ethicist’s reasons are not compelling will themselves be open to rejoinder by the clinical ethicist, whether in the form of correcting some misunderstanding or of offering further considerations. Second, there are reasons clinicians could have for discounting the recommendations coming from AI programs that do not apply to the recommendations they would receive from clinical ethicists. The program’s verdicts depend on the clinician’s data entry: they are a function of the clinician’s characterization of the case at issue and, in that way, of the clinician’s own judgment. Additionally, the information included in the description of the case, and of the potential interventions, is going to be limited; the clinician knows much more about the case than will ever be provided to the program. The same is not true of the clinical ethicist.

In the same way that human clinical ethicists can, through dialogue with clinical staff, provide the considerations that underlie their recommendations in a way the AI-driven programs cannot, so too can clinical staff, through dialogue with the human ethicists, provide clarifications and subtle contextual information about the clinical situation that cannot be shoehorned into the program’s inputs. This is especially true, given the ways that these programs’ interfaces simplify and restrict the data to be entered in order to make working with them more user friendly. Taking these two differences together, we can see that the exchange between the human ethicist and the clinical staff is a process of two-way education. The clinician teaches the ethicist enough about the clinical situation for the ethicist to see and understand the weight of the problem, and the ethicist teaches the clinician enough about the morally salient features of the case that the clinician can understand and endorse the recommendation. As a result, the authority problem does not arise between the clinician and the human ethicist in anything close to the same way that it does with the AI-driven ethical consultant program because the discussion between the two human beings can lead to a decision about the best way forward that is understood to be a shared decision. Without that dialogue, the outputs from the AI program are, to the consulting clinician, externally imposed recommendations which either are or are not authoritative.

Conclusion

While seemingly ubiquitous in recent years, AI is still very new. Novel applications of this emerging technology are being discovered and put forward all the time, including in healthcare. These innovations carry the potential to significantly enhance healthcare providers’ ability to deliver quality care in a way that is more efficient and affordable, to the benefit of patients and clinicians alike. But there are potential dangers lurking here as well—dangers that will become harder to eliminate once these applications have been in use for a while, given the way institutional inertia resists opposition to, and reconsideration of, practices once they have been established. Recent proposals to utilize AI to provide real-time ethical guidance in clinical settings are still in the early stages. The purpose of this paper has been to seize this window of opportunity, before these proposals are in a position to be carried out, to highlight some serious concerns about the idea. In our view, the problems regarding transparency and authority that we articulate here provide strong prima facie moral reasons not to employ AI ethical consultants, so that the presumption should be against pursuing this approach. To be clear, we have not argued that these reasons are necessarily decisive. Perhaps there are ways of overcoming these challenges, though we are skeptical that this can be done. Nevertheless, given that the problems with transparency and authority are entirely foreseeable, institutional leaders have an obligation not to proceed down this path without first being able to adequately address these concerns.

Declaration of conflicting interests The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

  • 1. Grady D. A.I. is learning to read mammograms. The New York Times, 1 January 2020. https://www.nytimes.com/2020/01/01/health/breast-cancer-mammogram-artificial-intelligence.html
  • 2. Baird T, Eastman L, Auger E, et al. Reducing readmission risk through whole-person design. NEJM Catal Innov Care Deliv 2023; 4(1). https://catalyst.nejm.org/doi/full/10.1056/CAT.22.0237
  • 3. Poweleit E, Vinks A, Mizuno T. Artificial intelligence and machine learning approaches to facilitate therapeutic drug management and model-informed precision dosing. Ther Drug Monit 2023; 45(2): 143–150.
  • 4. Bahado-Singh R, Friedman P, Talbot C, et al. Cell-free DNA in maternal blood and artificial intelligence: accurate prenatal detection of fetal congenital heart defects. Am J Obstet Gynecol 2023; 228(76): 76.
  • 5. Meier L, Hein A, Diepold K, et al. Algorithms for ethical decision-making in the clinic: a proof of concept. Am J Bioeth 2022; 22(7): 4–20.
  • 6. Anderson M, Anderson S, Armen C. MedEthEx: a prototype medical ethics advisor. In: Proceedings of the Eighteenth Conference on Innovative Applications of Artificial Intelligence. Washington: AAAI Press, 2006.
  • 7. Anderson M, Anderson S. GenEth: a general ethical dilemma analyzer. Paladyn J Behav Robot 2018; 9: 337–357.
  • 8. La Puma J, Schiedermayer DL. Ethics consultation: skills, roles, and training. Ann Intern Med 1991; 114(2): 155–160.
  • 9. Burda M. Certifying clinical ethics consultants: who pays? J Clin Ethics 2011; 22(2): 194–199.
  • 10. Wasserman JA, Brummett A, Navin MC. It’s worth what you can sell it for: a survey of employment and compensation models for clinical ethicists. HEC Forum 2024; 36: 405–420.
  • 11. Kon AA, Rich B, Sadorra C, et al. Complex bioethics consultation in rural hospitals: using telemedicine to bring academic bioethicists into outlying communities. J Telemed Telecare 2009; 15(5): 264–267.
  • 12. Kon AA, Garcia M. Telemedicine as a tool to bring clinical ethics expertise to remote locations. HEC Forum 2015; 27(2): 189–199.
  • 13. Mittelstadt BD, Allo P, Taddeo M, et al. The ethics of algorithms: mapping the debate. Big Data Soc 2016; 3(2): 1–21.
  • 14. Morley J, et al. The ethics of AI in health care: a mapping review. Soc Sci Med 2020; 260: 113172.
  • 15. Murphy K, Di Ruggiero E, Upshur R, et al. Artificial intelligence for good health: a scoping review of the ethics literature. BMC Med Ethics 2021; 22(1): 14.
  • 16. Wilkinson JM. Moral distress in nursing practice: experience and effect. Nurs Forum 1987; 23(1): 16–29.
  • 17. Gaudine A, LeFort SM, Lamb M, et al. Clinical ethical conflicts of nurses and physicians. Nurs Ethics 2011; 18(1): 9–19.
  • 18. Oh Y, Gastmans C. Moral distress experienced by nurses: a quantitative literature review. Nurs Ethics 2015; 22(1): 15–31.
  • 19. Haahr A, Norlyk A, Martinsen B, et al. Nurses experiences of ethical dilemmas: a review. Nurs Ethics 2020; 27(1): 258–272.
  • 20. Beauchamp TL, Childress JF. Principles of biomedical ethics. 6th ed. New York: Oxford University Press, 2009.
  • 21. London AJ. Artificial intelligence and black-box medical decisions: accuracy versus explainability. Hastings Cent Rep 2019; 49(1): 15–21.
  • 22. Grote T, Berens P. On the ethics of algorithmic decision-making in healthcare. J Med Ethics 2019; 46(3): 205–211.
  • 23. Amann J, Blasimme A, Vayena E, et al. Explainability for artificial intelligence in healthcare: a multidisciplinary perspective. BMC Med Inform Decis Mak 2020; 20: 310.
  • 24. Kundu S. AI in medicine must be explainable. Nat Med 2021; 27: 1328.
  • 25. Burrell J. How the machine ‘thinks’: understanding opacity in machine learning algorithms. Big Data Soc 2016; 3(1): 2053951715622512.
  • 26. Zhang J, Zhang ZM. Ethics and governance of trustworthy medical artificial intelligence. BMC Med Inform Decis Mak 2023; 23: 7.
  • 27. Ghassemi M, Oakden-Rayner L, Beam AL. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit Health 2021; 3: e745–e750.
  • 28. Kitchin R. Thinking critically about and researching algorithms. Inf Commun Soc 2017; 20(1): 14–29.
  • 29. McKendrick J, Thurai A. AI isn’t ready to make unsupervised decisions. Harvard Business Review, 15 September 2022. https://hbr.org/2022/09/ai-isnt-ready-to-make-unsupervised-decisions (accessed 1 October 2024).
  • 30. Matthias A. The responsibility gap: ascribing responsibility for the actions of learning automata. Ethics Inf Technol 2004; 6: 175–183.
  • 31. Braun M, Hummel P, Beck S, et al. Primer on an ethics of AI-based decision support systems in the clinic. J Med Ethics 2021; 47: e3.
  • 32. Crootof R, Kaminski ME, Price WN. Humans in the loop. Vanderbilt Law Rev 2023; 76(2): 429–510.
  • 33. Noorman M. Computing and moral responsibility. In: Zalta EN, Nodelman U (eds) The Stanford Encyclopedia of Philosophy, Spring 2023 edn. https://plato.stanford.edu/archives/spr2023/entries/computing-responsibility
  • 34. Thompson DF. Moral responsibility of public officials: the problem of many hands. Am Polit Sci Rev 1980; 74: 905–916.
  • 35. Thompson DF. Responsibility for failures of government: the problem of many hands. Am Rev Public Adm 2014; 44: 259–273.
  • 36. Dixon-Woods M, Pronovost PJ. Patient safety and the problem of many hands. BMJ Qual Saf 2016; 25: 485–488.