A trust-based framework for the envelopment of medical AI

The notion of ‘enveloping’ describes the process of ‘surrounding’ the development and use of AI with an ‘envelope’ of regulations, laws, practices, norms and guidelines that allow the AI to be used successfully in a social (rather than merely technical) sense2,21. The precise nature of this envelope is determined by the specific social scenario and the social goals the relevant stakeholders wish to achieve. In the case of medical AI, we have identified (Trust in and Reliance on Medical AI) those goals as preserving and fostering the desired attitudes summarised in Fig. 1. In the following subsections, we outline the basic components of an envelopment of medical AI for each of the relevant attitudes.
It has generally been recognised that the construction of an envelopment for AI requires a higher level of technical transparency than is currently available for many AI applications22. Robbins describes this transparency problem as follows2:
The central idea of envelopment is that machines are successful when they are inside an ‘envelope’. This envelope constrains the system in a manner of speaking, allowing it to achieve a desired output given limited capacities. However, to create an envelope for any given AI-powered machine we must have some basic knowledge of that machine—knowledge that we often lack.
This is particularly true for deep-learning applications, which use highly variable algorithms that process enormous amounts of data. Tracking the intermediate computational steps of such an algorithm is therefore practically impossible; in other words, the computation takes place in a metaphorical ‘black box’. However, there are now several computational methods that make the intermediate steps of data-intensive, adaptive algorithms more accessible to developers and users23, i.e., that result in transparent AIs. A successful envelopment of any AI will therefore usually begin with the requirement that such methods are used in the development of the AI. In particular, Robbins requires that sufficient information is made available about the composition and potential biases of the algorithm’s training data2; the boundaries of reliably processed input data; the reliability of the algorithm’s success on any given task; and the space of possible outputs. With respect to medical AI, the amount of knowledge about the algorithm that is necessary to foster attitudes of reliance on medical AI in medical professionals and patients (Trust and Reliance in Healthcare and Medical AI) will differ for each of those groups and for each form of medical AI. In particular, greater knowledge is generally required for AIs involved in high-risk, life-affecting treatments. It should be noted, however, that reliance on the AI by medical professionals seems to exclude fully opaque ‘black box’ applications, as the professional needs to be able to make professional judgements in the knowledge that they can check the algorithm for errors and biases.
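To illustrate what such transparency methods can look like in practice, the following is a minimal sketch of one widely used technique, gradient-based saliency, applied to a hypothetical diagnostic image classifier; the network, input and library choices are illustrative assumptions rather than a prescription for any particular medical AI.

```python
# Minimal sketch (illustrative only): gradient-based saliency for a hypothetical
# diagnostic image classifier. The network and input are placeholders; a real
# medical AI would use validated explainability tooling and domain-expert review.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)  # stand-in for a diagnostic network
model.eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # hypothetical input image

scores = model(image)        # forward pass: one score per possible output class
scores.max().backward()      # gradients of the top class score w.r.t. the pixels

# The saliency map indicates which pixels most influenced the prediction and can
# be overlaid on the input image for inspection by a medical professional.
saliency = image.grad.abs().max(dim=1)[0]
print(saliency.shape)        # torch.Size([1, 224, 224])
```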
Making decision-guiding information about medical AI available to individual medical professionals and patients constitutes only part of the envelopment. In addition, this information should be used to develop general guidelines and regulations for the safe use of a given medical AI. Such guidelines and regulations are usually enacted at the level of healthcare organisations and systems. As we are focusing on the attitudes between healthcare professionals, patients and medical AI, we will primarily discuss the information that should be directly available to those parties. The provision of such information will likely be governed to some degree by regulations and policies at the organisational and national level. However, we assume that the exact formulation of such policies should be guided by what medical professionals and patients need to preserve the desired attitudes (Trust and Reliance in Healthcare and Medical AI) and should therefore supervene on the parts of the envelopment pertaining directly to the core triangle of attitudes (Fig. 1).
The successful envelopment of medical AI clearly faces challenges, in particular with respect to compliance and the degree to which the interest groups in each scenario are willing to engage with the recommended actions. To be fully successful, the enveloping therefore requires the buy-in of at least the majority of stakeholders, appropriate institutional support and, if necessary, regulatory oversight. In the envelopment components presented below, we have flagged areas where institutional time and resources should be made available to allow the components to be implemented and where regulatory reviews (usually by an oversight organisation like NICE) should be conducted. In order to foster buy-in and compliance, we recommend that all stakeholder groups are involved from the very early stages of the development of the medical AI and that great focus is placed on the accessibility of any information about the AI. During the former, it will be particularly important to take into account each stakeholder’s concerns about the practicality of the enveloping; e.g., enveloping components for a particular medical AI need to take into account the limited time physicians have to engage with documentation or the specific organisational capabilities of the targeted healthcare sector. For the latter, it is important to work with communication specialists and to elicit feedback from patients and medical professionals on whether they have been able to gain a sufficient understanding of the medical AI. Research on efficient communication about complex medical issues, and about medical AI in particular, is currently in its infancy. We envision that the envelopment will track state-of-the-art developments in this area and potentially even generate data on which communication strategies are effective, to feed back into this process.
The envelopment is not a fail-safe way of ensuring that every scenario of medical AI use turns out optimally with respect to preserving the attitudes summarised in Fig. 1. In particular, clinical settings are replete with complex and changing constraints, be that on the medical professionals’ time or the patients’ attention span in moments of crisis. Dealing with those constraints is a problem that exceeds the remit of the envelopment for medical AI and of this project. We propose that the best way of acknowledging and minimising these limitations is to require that the implementation of the envelopment complies with state-of-the-art empirical data on how to mitigate these factors, e.g., large amounts of ‘fine print’, as initially deployed under internet safety regulations, should be avoided (Implementation), and that, in turn, insights gained from the implementation of the envelopment components for different medical AIs are appropriately recorded and used to learn how to mitigate the difficulties of implementing envelopments in real-life clinical settings. With ongoing clinical implementation of medical AI, established tools such as clinical quality management systems, clinical incident reporting systems and root cause analysis techniques, like the London Protocol, will help to further shape the envelopment on the organisational level.
In the following, we will discuss the enveloping necessary to preserve each of the intended attitudes in the medical professionals-patients-AI triangle. In Framework for Enacting Medical AI Envelopment, we will then show how each component should be implemented during different stages of the product cycle of an AI application.
Envelopment for medical professionals’ reliance on medical AI
As discussed in Trust and Reliance in Healthcare and Medical AI, the acceptance of any medical technology by medical professionals is conditional on the technology passing the success requirements of evidence-based medicine, i.e., that the technology has been demonstrated to have acceptable success rates in rigorous clinical and experimental trials and that its workings can be adequately understood by the medical professional. The latter part of the requirement prescribes the implementation of suitably transparent algorithms and precludes the use of opaque black box AI applications. This is particularly important for diagnostic medical AI, which often uses deep-learning algorithms that are particularly difficult to make transparent. However, there now exists a growing range of tools and programming methods that can make intermediate data-processing steps visible23, and those should be routinely employed in the development of medical AI. This condition implies that some potential medical AIs, namely those that cannot be made explainable, are not suitable to be developed with the prospect of immediate employment in a medical setting. We consider this restriction an acceptable trade-off for making sure that medical professionals can gain an adequate understanding of the technology and can therefore develop an attitude of reliance on the technology while maintaining a trust relationship with their patients. The restriction also leaves open the possibility of developing ‘black box’ applications as technical prototypes, which can then be moved into the regulatory envelope once explainability tools become available.
The use of explainable, rather than opaque, medical AI will also aid the fulfilment of the second part of the requirement, namely the demonstration of the success and reliability of medical AI in rigorous clinical experiments and trials. With respect to medical AI, this requirement roughly corresponds to Robbins’ requirement that the reliability of the outcomes of any task-based AI needs to be comprehensively examined and quantified2. However, two specific problems arise in providing empirical evidence for the success of medical AI: (i) it can be difficult to fix appropriate success standards for different medical AIs; and (ii) success rates can differ widely for different patient populations.
With respect to aspect (i), success standards will necessarily differ for different kinds of AI. In cases where the AI replaces a medical professional in carrying out a given task, e.g., diagnostic AI, the success standard is usually that the error rate needs to be lower, and the success rate higher, than that of a human carrying out the same task. However, such standards can be more difficult to fix in cases where the AI carries out tasks that are not as easily quantified, e.g., health-monitoring AI or AI that provides more holistic healthcare advice to patients. It is evidently harder to quantify how well such AI performs in improving patients’ health outcomes, and the studies needed to provide empirical data on their success require long-term monitoring of patients. The envelopment of medical AI cannot reduce the difficulties in quantifying the success of such applications. However, it needs to guarantee that medical professionals can access data about the performance of medical AI in clinical trials, or in long-term monitoring, and are likewise made aware of any lack of such studies.
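To make the comparative success standard for diagnostic AI concrete, the following minimal sketch compares hypothetical sensitivity and specificity figures for an AI and a clinician baseline on the same evaluation set; all counts and the pass criterion are illustrative assumptions, and in practice such a comparison would be made within properly powered clinical trials with appropriate statistical testing rather than point estimates.

```python
# Minimal sketch: comparing a diagnostic AI's error profile against a clinician
# baseline on the same evaluation set. All figures are hypothetical.

def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts on a shared test set of pre-diagnosed cases.
ai_sens, ai_spec = sensitivity_specificity(tp=92, fn=8, tn=180, fp=20)
dr_sens, dr_spec = sensitivity_specificity(tp=88, fn=12, tn=176, fp=24)

# A simple (hypothetical) success standard: the AI must match or exceed the
# clinician baseline on both measures before deployment is considered.
meets_standard = ai_sens >= dr_sens and ai_spec >= dr_spec
print(f"AI sens={ai_sens:.2f} spec={ai_spec:.2f}; "
      f"clinician sens={dr_sens:.2f} spec={dr_spec:.2f}; "
      f"meets standard: {meets_standard}")
```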
With respect to aspect (ii), the problem of biases in healthcare provision to minorities and, in particular, of biases in the data upon which healthcare practices are based exists in many healthcare settings24. It can be amplified in the use of medical AI, which is heavily data-reliant and often shows extreme performance differences between mainstream and minority patient groups25. In particular, the performance of deep-learning AIs (a technology often used for diagnostic medical AI) is crucially dependent on the availability of training data, i.e., of pre-diagnosed samples, and will heavily reflect historical biases in data collection. An often-used example involving already existing technology is the use of deep learning in the detection of skin cancer from photographs of suspicious skin regions26. The algorithm is trained on existing images of potential lesions. However, skin cancer detection has been notoriously biased towards the detection of lesions on Caucasian skin, and the amount and quality of training data that exist for Caucasian patients are far higher than for other skin types. Accordingly, skin cancer detection AI will work less well for non-Caucasian patients. Furthermore, if a deep-learning application, like a skin cancer detection programme, produces markedly less accurate results for a particular group of patients, the ‘black box’ nature of the algorithm makes it impossible to pinpoint where in the algorithm the mistakes occur. Explainable AI methods can help to visualise how the images are processed through the algorithm and which features are extracted, and could thereby lead to a more targeted search for better input data for particular patient groups23.
Therefore, in order to judge whether medical AI can be relied upon, medical professionals need to be made aware of the composition of the training data as well as of any differences in the performance of the algorithm for different patient groups. To facilitate this, and the identification of biases that are amplified by the algorithm, medical AI should generally be explainable AI (see above). Furthermore, biases in the training data, known error amplification and differences in success rates for different patient groups should translate into the identification of constraints on the possible inputs into the AI, i.e., into guidance for the medical professional on the patient groups for which the AI is safe to use.
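As a minimal illustration of how such patient-group-specific performance data could be turned into explicit input restrictions, the following sketch computes stratified accuracy and flags groups that fall below a threshold; the group labels, figures and threshold are purely hypothetical.

```python
# Minimal sketch: stratified performance reporting and derived input restrictions.
# Group labels, counts and the 0.90 threshold are hypothetical illustrations.
from collections import defaultdict

def per_group_accuracy(records):
    """records: iterable of (group, prediction, label) tuples."""
    correct, total = defaultdict(int), defaultdict(int)
    for group, pred, label in records:
        total[group] += 1
        correct[group] += int(pred == label)
    return {g: correct[g] / total[g] for g in total}

# Hypothetical evaluation results for a skin-lesion classifier.
records = (
    [("type_I-II", 1, 1)] * 180 + [("type_I-II", 0, 1)] * 20
    + [("type_V-VI", 1, 1)] * 60 + [("type_V-VI", 0, 1)] * 40
)

accuracy = per_group_accuracy(records)
THRESHOLD = 0.90  # hypothetical minimum acceptable per-group accuracy

restricted = [g for g, acc in accuracy.items() if acc < THRESHOLD]
print(accuracy)                                    # {'type_I-II': 0.9, 'type_V-VI': 0.6}
print("Use not recommended for groups:", restricted)
```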
The accessibility of extensive, patient-group-specific data on the efficacy of the medical AI provides the first component of the envelopment aimed at fostering an attitude of reliance on medical AI by medical professionals. The second part is the preservation of a central maxim of modern medicine: First Do No Harm27. This often translates into a cautionary attitude towards any medical treatment: what needs to be considered is not just the success rate of the proposed treatment but also the risks it poses to the patient, and the avoidance of such risks generally takes precedence over the pursuit of success. In order to rely on medical AI, medical professionals need to be able to enact this precautionary principle in their use of the AI. This includes having data on performance and biases available but also the freedom to decide whether the use of a given medical AI is appropriate for an individual patient. Therefore, the envelopment needs to guarantee that the medical professional can appropriately oversee the employment of the AI and has the autonomy to override and/or eschew the use of the medical AI.
How can these abstract requirements of the envelopment be implemented in the practical development and use of medical AI? We propose three practical requirements that the development and use of each medical AI should include: (MR1) medical professionals should be involved in the development of the AI and be internally and externally responsible for the implementation of the envelopment framework; (MR2) each medical AI should come with a standardised factsheet addressing the information requirements about safe uses, biases and performance in clinical trials as laid out above; and (MR3) each medical AI should be subject to an ethical assessment at the institutional level, taking into account the information provided on the factsheet and the report of the medical officers. We provide justifications for each of these recommendations below.
With respect to enveloping component (MR1), the involvement of medical professionals during the development of the AI, this appears to be the most efficient way to ensure an in-depth understanding of the algorithm by the medical community: beyond the direct experience of the individual ‘liaison’ professionals, they can transmit this understanding to their professional community through targeted reports and by advising on accessible documentation of the medical AI. It also means that they can advise during the development phase on what features are needed to allow professionals to use the AI within the ‘First Do No Harm’ maxim as outlined above, e.g., the ability to override the medical AI as they see fit and a targeted use of the AI for particular patient groups only. The mandatory involvement of a medical professional in the early stages of the development process also establishes a position (e.g., medical officer, medical supervisor) whose main responsibility is to ensure the implementation and application of the envelopment framework, meaning that the position must be equipped with far-reaching rights and veto power regarding the medical AI’s development.
Enveloping component (MR2), a standardised, comprehensive factsheet for each medical AI, targeted towards the medical professionals who will be using the AI in a clinical setting, is the easiest way to transmit the information necessary for a safe use of medical AI, as outlined generally by Robbins (2020) and specified for medical AI above: it should contain comprehensive information on the composition of the training data, including biases; on safe input ranges and formats, taking into account training-data biases; on the general workings of the algorithm; on the expected outputs and their interpretation; and on the performance of the AI in field-standard clinical trials. While the kind of information included in the factsheet should be standardised, the presentation of technical details can be advised upon by the medical professionals involved in the medical AI’s development (MR1), thereby making sure it is accessible to the community of medical professionals.
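As an illustration of the kind of structured content such a factsheet could carry, the following is a minimal, machine-readable sketch; the field names and example values are illustrative assumptions, not a proposed standard or an existing format.

```python
# Minimal sketch of a machine-readable factsheet structure for a medical AI (MR2).
# Field names and example values are illustrative only, not a standard.
from dataclasses import dataclass, field

@dataclass
class MedicalAIFactsheet:
    name: str
    intended_use: str
    training_data_composition: dict   # e.g., sample counts per patient group
    known_biases: list
    safe_input_ranges: dict           # validated input formats and populations
    algorithm_summary: str            # plain-language description of the workings
    output_interpretation: str
    clinical_trial_performance: dict  # per-group results from field-standard trials
    unvalidated_populations: list = field(default_factory=list)

factsheet = MedicalAIFactsheet(
    name="Example lesion classifier",
    intended_use="Decision support for suspicious skin lesions",
    training_data_composition={"type_I-II": 18000, "type_V-VI": 1200},
    known_biases=["under-representation of darker skin types"],
    safe_input_ranges={"image": "dermatoscopic, at least 224x224 px"},
    algorithm_summary="Convolutional neural network with saliency-map outputs",
    output_interpretation="Probability of malignancy; not a standalone diagnosis",
    clinical_trial_performance={"type_I-II": {"sensitivity": 0.92}},
    unvalidated_populations=["paediatric patients"],
)
```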
Enveloping component (MR3), a comprehensive, independent ethical evaluation of the medical AI at an institutional level, will draw upon the other two enveloping components, i.e., the reports of the medical professionals involved in the development of the AI and the factsheet on the capabilities and limitations (broadly construed) of the AI. The ethical review will provide a further safeguard ensuring that the use of the AI complies with fundamental patient rights and patient autonomy, and should take into account whether the medical AI is right for the anticipated patient population; whether it can safely be integrated into existing clinical practice; and whether medical professionals feel that they have been provided with sufficient information to rely on the medical AI in the sense outlined in Trust and Reliance.
Enveloping for patients’ trust in medical professionals
As described in Trust and Reliance in Healthcare and Medical AI, preserving patients’ trust in medical professionals entails preserving the patients’ reliance on the medical professional’s specific care for their wellbeing (Trust and Reliance). In the context of the use of medical AI, it has been shown that this reliance is inhibited if patients feel either (i) that the medical professional’s decisions are based on the restrictions posed by the AI, e.g., if it assumes the position of a virtual decision-maker (“computer says no”) rather than being used in an assisting capacity, or (ii) that the introduction of the medical AI has reduced the possibilities for shared decision-making and dialogue28. Those concerns are consistent with the spelling out of trust as the ascription of a specific concern for one’s wellbeing: in this context, patients cannot trust a medical professional if they believe either (i) that decisions about their treatment are made by an entity that cannot exhibit this specific concern, i.e., the algorithms of the medical AI, or (ii) that they cannot transmit enough relevant information about their wellbeing to the medical professional.
Given the asymmetrical nature of the relationship between medical professional and patient, fostering justified reliance of the medical professional on the medical AI will also foster patients’ trust in the decision of the medical professional to use the AI: as with other medical technology, their trust relationship will include an assumption that the medical professional can make good judgements about which technology to rely on (Trust and Reliance in Healthcare and Medical AI). Therefore, enveloping components (MR1-3), as discussed in Envelopment for Medical Professionals’ Reliance on Medical AI, will also be crucial to preserving and fostering trust between the medical professional and the patient. In addition, we propose two new enveloping components to address concerns (i) and (ii): (PT1) the involvement of patients and patient advocacy groups at the development stage to identify and address trust-inhibiting treatment restrictions in the AI; and (PT2) the development of targeted, institutional guidelines for the integration of each medical AI into the medical consultation and decision-making process, which preserve the ability of patients and medical professionals to enter into a shared decision-making process.
Enveloping to address the first inhibitor of a trust-based relationship between the patient and medical professional should be aimed at providing the medical professional with the knowledge of, and control over, the medical AI that allows them to use the AI as a tool for (rather than a restriction on) the patient’s treatment. Therefore, the enveloping components aimed at allowing the medical professional to rely on the medical AI (Envelopment for Medical Professionals’ Reliance on Medical AI) are also crucial to preserving patients’ trust in the medical professional. In particular, enveloping component (MR1), the involvement of the medical professional at the development stage, needs to include considerations on how to avoid ‘computer says no’ situations, i.e., situations in which the limitations of the AI rather than the medical professionals’ opinions would determine the course of treatment. Medical professionals should proactively identify situations in which the AI might place undue restrictions on treatment options and should be involved in the development of individualisation options and overrides. If medical professionals are involved in the development of the AI, they will not only be able to advise on the necessary individualisation options for the AI, but also gain the technical mastery and insights to explain their use of the medical AI to patients, thereby enabling the kind of dialogue that fosters trust between patients and medical professionals.
In addition to making sure that the medical AI allows sufficient individualisation options and that concerns about any restrictions of treatment options are addressed (or, at least, recorded), we propose an additional enveloping component: namely, (PT1) the involvement of patients and patients’ advocacy groups in the development of the AI. Involving patients in the development of the AI means that potentially harmful restrictions in treatment options can be addressed before the AI is deployed and that the AI’s algorithm contains suitable individualisation options. Furthermore, patients’ advocacy groups are often best placed to raise concerns about data biases and historic neglect of specific patient-groups, which should be used to identify deficiencies in the training data and determine the limitations and restrictions of the AI. If those restrictions and limitations are recorded on the factsheet (enveloping component MR2), enveloping component (PT1) will also foster the reliance of medical professionals on the medical AI (Envelopment for Medical Professionals’ Reliance on Medical AI). In the case of patient-accessible AI, the input of patients at the development stage will also lead to better user interface design and to better guidelines on when patients should be encouraged to use the medical AI independently.
We also propose a second enveloping component (PT2) to foster and preserve patients’ trust in medical professionals: namely, the development of a comprehensive set of specific guidelines for each medical AI on how it is to be integrated into the diagnostic and treatment process. The development of those guidelines should involve representatives of all stakeholders: different patient groups, medical professionals and the relevant medical institutions. However, a particular emphasis should be put on keeping the interaction between patients and medical professionals as a key part of the treatment process and, therefore, on ensuring that sufficient opportunities for dialogue and shared decision-making are preserved. As with other medical technology, patients need to be given sufficient opportunities to discuss the results or recommendations of the medical AI with medical professionals, and institutional routes need to be developed to preserve patients’ ability to make decisions about their own treatment, including forgoing the use of medical AI altogether.
Enveloping for patients’ reliance on medical AI
It is a specific aspect of patient-accessible medical AI that some of its applications will involve use by patients with minimal intervention by medical professionals, e.g., in ‘smart’ health monitoring, through self-diagnostic tools or in the context of interactional therapies like ‘talking therapy bots’. However, both patients16 and medical professionals13 have expressed reluctance to rely (Trust and Reliance) on patient-facing medical AI. In particular, such concerns include (i) the low quality of the interaction with an AI, including the ascription of ‘spookiness’ or ‘uncanniness’ to such applications; (ii) the potential for unintentional misuse or misinterpretation of the medical AI’s recommendations by patients; (iii) the unclear attribution of legal responsibility in case of errors by the medical AI; and (iv) the storage, security and further legal use of patient data entered into the medical AI. The enveloping of the medical AI therefore needs to address those concerns in order to foster patients’ ability to rely on the medical AI.
The envelopment components introduced in the previous sections will already address some of those concerns: in particular, (MR3) and (PT2) will ensure that, within an institutional setting, medical AI is not used as a substitute for the interaction between medical professionals and patients but is embedded in a treatment process that fosters these interactions at the appropriate moments. Similarly, (MR1) and (PT1), which require the involvement of patients and medical professionals in the development of medical AI, will allow these groups to provide input on how misuse and misinterpretation of the medical AI can be prevented, and on which user interface components or additional information will reduce reluctance to interact with the medical AI. Envelopment component (PT1), i.e., the involvement of patients and patient advocacy groups in the development of the medical AI, will also empower advocacy groups for different patient groups to effectively communicate the benefits and drawbacks of the AI to their specific client groups, hence enabling patients to make informed decisions about the use of the medical AI.
However, while those existing envelopment components largely address concerns (i) and (ii), addressing concerns (iii) and (iv) requires the introduction of further, targeted envelopment components. We therefore propose three further envelopment components: (PR1) the development of comprehensive, specific documentation for each medical AI that allows patients to understand how data is handled by the AI and which security risks to their data exist; (PR2) a set of comprehensive, specific documentation for each medical AI specifying the legal responsibilities of the distributors, developers, medical professionals and institutions, including a clear identification of potential ‘grey’ or unregulated areas; and (PR3) the development of appropriate legally or institutionally binding safeguards that offer clearly signposted opportunities for patients to involve medical professionals at appropriate moments in the use of the AI.
Most data-based applications come with extensive, small-print documentation of their data usage, which most users ignore because they find it incomprehensible. Given the sensitivity of medical data, the need to make such documentation more accessible to the user is particularly pressing for medical AIs. Accordingly, envelopment component (PR1) should focus on developing documentation that is easily understandable and allows the user to engage actively with questions of data security and usage. This will require the cooperation of developers, distributing companies, medical professionals, communications specialists and patient advocacy groups. The inclusion of the latter two will ensure that the documentation contains information that is important to different patient groups and that it is presented in formats that are accessible to different users. Furthermore, patients’ advocacy groups and medical professionals can make use of such documentation to inform specific patient groups about the data usage and security of the medical AI specifically, and through different media. The aim of envelopment component (PR1) should therefore be that no patient uses the medical AI without having previously considered its data usage and security, thereby empowering patients to make truly informed decisions about the use of the medical AI.
The aims of (PR2) are similar to those of (PR1), but with a focus on the legal framework and on who is legally responsible for the different aspects of the AI, rather than on questions of data security. This documentation should also be developed through a comprehensive cooperation of different stakeholders: distributing companies, legal experts, medical professionals, patients’ advocacy groups and communication experts. As with envelopment component (PR1), a strong emphasis should be put on the accessibility of the conveyed information to different patient groups.
Beyond the difficulty of communicating legal information about the medical AI in an accessible and engaging format, the documentation needs to recognise that the legal framework for AI use in the European Union and the UK is currently still underdeveloped29. The enactment of such a framework is often a question of both national and international law and depends on a large number of institutions beyond the medical profession. As it is unlikely that the large-scale deployment of medical AI will be delayed until the legal framework has caught up, we view existing AI legislation as an external factor with which the enveloping must contend (rather than, e.g., making changes to this legislation part of the envelopment). This means that grey areas and under-regulation need to be accepted as fact. However, they should be clearly identified and flagged in the legal documentation that constitutes (PR2). Making patients and medical professionals aware of such legal lacunae will empower them to make informed decisions about each specific medical AI, including the choice to forgo use or advocacy of the AI if they consider the legal framework to be insufficient.
Lastly, enveloping component (PR3), legally or institutionally binding safeguards enforcing the involvement of medical professionals at appropriate moments in medical AI use, will bolster efforts to address concerns (i) and (ii). The precise nature of such mandated safeguards will depend on the specific nature of the medical AI and could range from prompts to contact a medical professional to automatic alerts if the AI registers certain results. However, we envision that a guiding factor in developing such safeguards will be the need to ensure that internationally recognised clinical pathways are still followed. Accordingly, it needs to be ensured that patient-accessible medical AI leads patients back onto those pathways. For example, it needs to be made clear, and suitably enacted in laws and guidelines, what the results of a medical AI enable patients to do in seeking further treatment, e.g., whether the result of a diagnostic AI would enable patients to order prescription medication directly from a pharmacy. Generally, this should only be the case if the recognised clinical pathway allows similarly self-directed action.
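As a minimal illustration of such a safeguard, the following sketch maps a hypothetical patient-facing AI result to an action that keeps the patient on a recognised clinical pathway; the thresholds, risk categories and messages are illustrative assumptions only.

```python
# Minimal sketch of a PR3-style safeguard: a patient-facing AI escalates to a
# medical professional instead of enabling unsupervised self-directed action.
# Thresholds, risk categories and messages are hypothetical.

def safeguard_response(risk_score: float, confidence: float) -> str:
    """Map an AI result to an action that keeps the patient on a clinical pathway."""
    if confidence < 0.7:
        return "Result uncertain: please book an appointment with your GP."
    if risk_score >= 0.5:
        return "Elevated risk detected: a referral request has been sent to your clinician."
    return "Low risk: continue routine monitoring; contact your GP if symptoms change."

print(safeguard_response(risk_score=0.8, confidence=0.9))
print(safeguard_response(risk_score=0.2, confidence=0.4))
```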