Artificial intelligence (AI) is a rapidly expanding field of research with a promising future. Its applications, which span all human activities, are already improving the quality of healthcare. AI is at the heart of tomorrow's medicine: assisted operations, remote patient monitoring, smart prostheses and treatments personalized by cross-referencing ever-larger volumes of data (big data), to name but a few.
With this in mind, researchers are developing multiple approaches and techniques, from natural language processing and the construction of ontologies to data-mining and machine learning. It is nonetheless essential that the general public understand how these systems work, in order to know what they do and – more importantly – what they do not do. The omniscient robot, symbolic of AI in the minds of many, is still a long way off!
Artificial intelligence emerged in the 1950s with the objective of having human tasks performed by machines that mimic the activity of the brain. After the setbacks of the early years, the field split into two branches.
The proponents of strong artificial intelligence aim to design a machine that is capable of reasoning like a human being, with the supposed risk of generating a machine superior to humans and with its own consciousness. This research path is still being explored today, even though many researchers in the domain consider that such an objective is impossible to attain.
The proponents of weak artificial intelligence deploy all available technologies to design machines capable of helping humans with their tasks. This field of research calls on many disciplines, from information technology and the cognitive sciences to mathematics, not to mention the specialist knowledge of the domains to which AI is applied. This approach – discussed throughout this report – has produced all of the specialized, efficient systems that populate our environment today: suggesting possible future "friends" on social media, identifying dates in texts in order to file agency dispatches, helping doctors make decisions, and so on. These systems, which vary greatly in complexity, share one limitation: they must be adapted manually in order to accomplish tasks other than those for which they were initially designed.
Some AI systems use logic…
The oldest approach is based on the idea that we apply logical rules (deduction, classification, hierarchical organization, etc.) in our reasoning. The systems designed on this principle apply various methods, based on the creation of models of interaction between automations or autonomous software (multi-agent systems), syntactic and linguistic models (natural language processing) or the creation of ontologies (knowledge representation). These models are then used by logical reasoning systems to produce new elements.
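As a hedged illustration of this rule-based style of reasoning, the sketch below applies "if premises then conclusion" rules to a set of facts until nothing new can be derived (forward chaining). The facts and rules are invented for the example and do not come from any real medical knowledge base:

```python
# Minimal forward-chaining sketch: rules fire whenever all their
# premises are among the known facts, adding their conclusion,
# until the set of facts stops growing.

def forward_chain(facts, rules):
    """Apply (premises, conclusion) rules until no new fact is derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Invented, purely illustrative rules.
rules = [
    (["fever", "stiff_neck"], "suspect_meningitis"),
    (["suspect_meningitis"], "order_lumbar_puncture"),
]

derived = forward_chain(["fever", "stiff_neck"], rules)
print(sorted(derived))
# → ['fever', 'order_lumbar_puncture', 'stiff_neck', 'suspect_meningitis']
```

Real systems add far richer machinery (uncertainty, priorities, ontological typing of the facts), but the core loop of chaining rules over a fact base is the same.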
In the 1980s, this symbolic approach enabled the development of tools capable of reproducing the cognitive mechanisms of an expert, hence their name: expert systems. The best known, MYCIN (for the identification of bacterial infections) and SPHINX (for the detection of jaundice), combine the entire body of medical knowledge in a given domain with a formalization of the reasoning specialists use to link this knowledge and produce a diagnosis.
The current systems, described as decision support, knowledge management or eHealth, are more sophisticated. They benefit from better reasoning models as well as better techniques for describing medical knowledge, patients and medical procedures. The algorithmic mechanism is essentially the same, but the description languages are more effective and the machines more powerful. The intention is no longer to replace doctors but to support their reasoning, grounded in the medical knowledge of their specialty.
Aiding breast cancer management
Teams from the Laboratory of Medical Informatics and Knowledge Engineering in eHealth (LIMICS, Inserm Unit 1142) and the Paris Public Hospitals (AP-HP) are participating in Desiree, a European project that takes a symbolic approach to help clinicians treat and monitor patients with breast cancer. This highly complex disease often requires the adaptation of standard protocols.
The Desiree platform incorporates good-practice recommendations and applies ontology-based reasoning to them. The system can also learn from previously resolved cases (reproducing decisions made on cases similar to the clinical case in question) or from experience (reusing decisions that departed from the recommendations, based on the criteria given to justify that departure). Ongoing enrichment of the case database improves the system's proposals for the therapeutic management of patients.
The principal difficulty of the symbolic approach is modeling the knowledge (describing the domain and the reasoning), which requires in-depth work with the specialists of the domain concerned.
…While others call on past experience
In contrast to the symbolic approach, the numerical approach reasons from data. The system looks for regularities in the available data in order to extract knowledge, without a pre-established model. This method, which emerged with connectionism and artificial neural networks in the 1980s, is flourishing today thanks to the increased power of computers and the accumulation of vast quantities of data, the famous big data.
Many current systems rely on machine learning, a method based on mathematical and computational representations of biological neurons, implemented in more or less complex architectures. The deep-learning algorithms, for example, whose use has boomed over the past decade, are inspired by the functioning of the brain: they simulate a network of neurons organized into layers that communicate with each other. The strength of this approach is that the algorithm learns its assigned task by trial and error, before performing it on its own.
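As a toy illustration of learning by trial and error, the sketch below trains a single artificial neuron – the elementary unit that deep networks stack into layers – on labelled examples, nudging its weights after each mistake (the classic perceptron rule). The data are invented:

```python
# A single artificial neuron learning the logical OR function by
# trial and error: whenever its guess is wrong, its weights are
# nudged toward the correct answer.

def train_perceptron(samples, epochs=20, lr=0.1):
    w = [0.0, 0.0]   # one weight per input
    b = 0.0          # bias term
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out            # zero when the guess is right
            w[0] += lr * err * x1         # nudge weights toward the target
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Labelled examples of the OR function: inputs and expected output.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_perceptron(data)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in data])  # → [0, 1, 1, 1]
```

Deep learning replaces this single unit with millions of them, organized in layers and trained by a more sophisticated update rule (gradient descent through backpropagation), but the trial-and-error principle is the same.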
Deep-learning applications exist in image processing, for example to detect potential melanomas in photos of skin, or diabetic retinopathy in images of retinas. Their development demands large training samples: 50,000 images were needed for melanomas, and 128,000 for retinopathies, to train the algorithms to identify pathological signs. For each of these images, the application is told whether or not pathological signs are present. At the end of the learning process, the algorithm can recognize, with an excellent level of performance, new images presenting an abnormality.
Robotics in full boom
Robotics is a specific subdomain of AI. The objective is to increase the autonomy of machines by equipping them with the ability to perceive, decide and act.
Computer-assisted surgery is without doubt one of the most well-known aspects, currently making it possible to improve precision or operate remotely.
Smart prostheses aim to repair or even augment the human body: artificial limbs and organs (arm, cochlea, heart, sphincter, etc.), cardiac pacemakers, and so on.
Companion robots, for example for the elderly or frail, represent a third highly publicized sector, which is in rapid development. These service robots aim to imitate living beings and interact with humans. They raise various ethical issues, relating notably to the protection of privacy and personal data, but also to the consequences of blurring the human-robot frontier – a line the user can cross very quickly.
Challenges facing research
AI is fast-growing, and many research avenues are being explored to improve both the technical performance of these systems and their suitability for the medical practices concerned. Their cost must also be justified by real added value for doctor or patient.
These research avenues particularly concern the processing of highly heterogeneous data, their structuring and anonymization, and the design of systems that are transparent for the user and well suited to the context of use.
A bottleneck: the quality of the data sample
The numerical approach can achieve great performance in medicine, but it requires data that are perfectly clean and well annotated, such as those used for the recognition of melanomas. However, since most medical data were not collected for the purpose envisaged by the software designer, using them raises numerous problems.
France has one of the world's biggest health databases: its national system of medical and administrative data, SNIIRAM, which covers the various health-insurance schemes. This database stores all medication prescriptions, descriptions of pathologies and hospital procedures. It is nevertheless tricky to use, because it was created for the economic analysis of healthcare services, not for medical analysis. As such, an individual hospitalized for a respiratory problem will be recorded as treated for that condition, without mention necessarily being made of the cancer from which he is also suffering. In some cases, a 30% error rate is observed in the description of the pathologies associated with patients. Checking the consistency of these data more or less automatically, and correcting them, is therefore a crucial research challenge.
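One simple form such automatic consistency checking can take is a set of rules that flag impossible combinations of coded fields. The sketch below uses invented codes and rules, not real SNIIRAM codings:

```python
# Consistency checking as rule-based flagging: each rule is a
# predicate that returns True when a record is inconsistent.

def find_inconsistencies(records, rules):
    """Return the ids of records that violate at least one rule."""
    return [r["id"] for r in records if any(rule(r) for rule in rules)]

# Invented rules: combinations that cannot occur in reality.
rules = [
    lambda r: r["sex"] == "M" and r["procedure"] == "delivery",
    lambda r: r["age"] < 15 and r["procedure"] == "geriatric_consult",
]

records = [
    {"id": 1, "sex": "M", "age": 40, "procedure": "delivery"},   # impossible
    {"id": 2, "sex": "F", "age": 30, "procedure": "delivery"},   # plausible
]
print(find_inconsistencies(records, rules))  # → [1]
```

In practice such checks are only a first pass; the harder research problem is deciding automatically how a flagged record should be corrected.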
Protecting personal data
A national healthcare platform grouping all the health data of the French population is an inestimable resource, not just for healthcare practitioners, but also for medical and pharmaceutical researchers. Nevertheless, it must be ensured that these data are used appropriately and in compliance with the law, particularly the General Data Protection Regulation (GDPR), which came into force in May 2018, and the 2016 Digital Republic Act.
In this context, personal data belong neither to the patient nor to the body collecting them. French citizens have the right to use their data but cannot sell them. Furthermore, processing these data requires the informed consent of the person concerned. In France, health data are anonymized before researchers can access them, and only for authorized projects.
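One building block of such protection is pseudonymization: direct identifiers are replaced by a salted hash, so that records about the same patient can still be linked without revealing who the patient is. The sketch below is illustrative only – true anonymization requires much more, such as generalizing rare values – and the salt and identifier are invented:

```python
# Pseudonymization sketch: a keyed one-way hash replaces the
# patient identifier with a stable, non-reversible token.
import hashlib

SECRET_SALT = b"kept-by-a-trusted-third-party"  # illustrative value

def pseudonymize(patient_id: str) -> str:
    """Map an identifier to a 16-hex-character token."""
    digest = hashlib.sha256(SECRET_SALT + patient_id.encode())
    return digest.hexdigest()[:16]

record = {"patient_id": "1850763123456", "diagnosis": "C50"}
record["patient_id"] = pseudonymize(record["patient_id"])
print(record)  # identifier replaced, diagnosis intact
```

Because the same identifier always yields the same token, two records about the same patient can still be joined by a researcher who never sees the real identity; without the salt, the token cannot be reversed.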
Crossing multiple patient-related textual data
Another problem raised by the use of medical data is that 80% of the information on patients is textual (hospitalization reports, imaging reports, and so on). Natural-language processing software therefore needs to be deployed to analyze these texts and extract information about the patient (data mining).
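At its simplest, such extraction can be sketched with hand-written patterns; real clinical NLP pipelines are far more elaborate (negation handling, abbreviations, section detection, and so on). The drug names and pattern below are invented for illustration:

```python
# Rule-based information extraction: a regular expression pulls
# drug-dose mentions out of free-text discharge reports.
import re

DOSE_PATTERN = re.compile(r"([A-Za-z]+)\s+(\d+(?:\.\d+)?)\s*(mg|g|mL)\b")

def extract_doses(report: str):
    """Return (drug, amount, unit) triples found in the text."""
    return [(d.lower(), float(a), u) for d, a, u in DOSE_PATTERN.findall(report)]

report = "Patient discharged on Metformin 500 mg twice daily and Ramipril 5 mg."
print(extract_doses(report))
# → [('metformin', 500.0, 'mg'), ('ramipril', 5.0, 'mg')]
```

Even this toy pattern shows why the problem is hard: the slightest variation in wording ("500mg of metformin") escapes the rule, which is precisely where learning-based approaches come in.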
These NLP tools can take a symbolic approach or rely on neural networks. Unsupervised-learning algorithms (which require no prior training on labelled samples) are generating hope in this field: they can rapidly cross-reference very large numbers of data points to uncover hidden structures and determine categories of interest for the task at hand. In this way, researchers hope to better identify risk factors, personalize treatments and verify their efficacy, forecast epidemics, and improve pharmacovigilance.
These algorithms can be very efficient but still require a lot of research before they can be used reliably.
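The idea of uncovering hidden groups without any prior labels can be sketched with one of the simplest unsupervised algorithms, k-means clustering, here reduced to one dimension and two clusters. The lab values are invented and the initialization assumes two well-separated groups:

```python
# One-dimensional, two-cluster k-means: alternately assign each
# value to its nearest center, then move each center to the mean
# of its group. No labels are ever provided.

def kmeans_1d(values, iters=20):
    c1, c2 = min(values), max(values)   # initialize centers at the extremes
    for _ in range(iters):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

# Invented fasting-glucose values (mmol/L): two groups emerge on their own.
values = [4.8, 5.1, 5.0, 4.9, 7.9, 8.3, 8.0, 7.7]
low, high = kmeans_1d(values)
print(low, high)  # → [4.8, 4.9, 5.0, 5.1] [7.7, 7.9, 8.0, 8.3]
```

Real applications run the same idea over thousands of dimensions at once, which is also why interpreting the discovered categories remains an open research problem.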
Providing information at the right time and at the right level
Today, projects with more targeted objectives are coming to fruition. For example, researchers from LIMICS participated in the design of natural-language processing software as part of the Lerudi project, which concerns the rapid reading of patient medical records in an emergency-room setting. They led the development of an ontology of emergencies, used in a prototype that performs text-based searches of patient medical records or of the future CNAM shared medical records. To be useful to emergency-room staff, the tool must meet their needs: surfacing essential information (such as medication prescriptions, from which pre-existing conditions can be identified) in the few minutes they have to make a decision.
Furthermore, OPPIO, a decision-support system for ectopic-pregnancy ultrasound analysis developed by LIMICS and Trousseau Hospital, will enter its test phase in 2019. It rests on an ontology providing a sign-centric model of the domain, relating signs to ectopic-pregnancy types, anatomical structures and technical elements. The system lets the doctor select an ectopic-pregnancy type and receive suggestions of the relevant signs to look for, along with the associated reference images.
Providing real aid to medical practice
For an application to be used by the doctor in daily practice, it is not enough for it to do what it is meant to: the system must also be easy to live with! For example, a system designed to alert to possible drug contraindications must not saturate the physician with alerts that are accurate in themselves but inappropriate to the patient's clinical context. Therefore, instead of issuing an alert every time a contraindication presents itself, the new interfaces first ask questions about the patient, in order to reduce the number of alerts and, with them, the doctor's temptation to unplug an "annoying" machine.
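A minimal sketch of this context-aware filtering, with invented drug names, rules and thresholds: an alert fires only when a predicate on the patient's clinical context holds, instead of firing for every contraindication on the list.

```python
# Context-aware alerting: each contraindication carries a predicate
# over the patient's context; alerts that do not apply are suppressed.

def relevant_alerts(prescriptions, contraindications, context):
    """Keep only the alerts whose context predicate holds for this patient."""
    alerts = []
    for drug in prescriptions:
        for rule in contraindications.get(drug, []):
            if rule["applies"](context):
                alerts.append(f"{drug}: {rule['message']}")
    return alerts

# Invented rule: the contraindication matters only if renal function is poor.
contraindications = {
    "ibuprofen": [
        {"message": "avoid in renal insufficiency",
         "applies": lambda ctx: ctx["creatinine_clearance"] < 30},
    ],
}

# Normal renal function: the alert is suppressed rather than fired blindly.
print(relevant_alerts(["ibuprofen"], contraindications,
                      {"creatinine_clearance": 90}))  # → []
```

The design choice is the one described above: the system gathers the relevant piece of context first, and only then decides whether the alert deserves the doctor's attention.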
Providing the means to understand decisions
To be accepted as legitimate, or ruled out as irrelevant, the decisions of the algorithm must be understandable, and therefore explainable. A major advantage of the symbolic approaches is that the reasoning process can be traced. But even here, the number of micro-reasonings performed by the machine is such that it is unthinkable to display them all. That is why researchers are currently working on how to describe these reasonings "as explicit classes", in order to emphasize the most important decisions. Only a good understanding of the solutions proposed by the application can enable the doctor to discuss them with the patient and formulate any possible alternatives.
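What "tracing the reasoning" can mean in practice is sketched below: each derived conclusion records the rule that produced it, and only the rules marked as key decisions are surfaced to the user, rather than every micro-step. Rules and facts are invented:

```python
# Rule chaining with an explanation trace: every fired rule is
# recorded, and a summary keeps only the key decisions.

def chain_with_trace(facts, rules):
    facts, trace = set(facts), []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if set(rule["if"]) <= facts and rule["then"] not in facts:
                facts.add(rule["then"])
                trace.append(rule)          # remember why this was derived
                changed = True
    return facts, trace

rules = [
    {"if": ["fever", "cough"], "then": "suspect_infection", "key": False},
    {"if": ["suspect_infection", "low_sats"], "then": "suggest_chest_xray",
     "key": True},
]

facts, trace = chain_with_trace(["fever", "cough", "low_sats"], rules)
summary = [r["then"] for r in trace if r["key"]]   # only the key decisions
print(summary)  # → ['suggest_chest_xray']
```

The full trace remains available when the doctor wants to drill down into a particular conclusion; only the summary is shown by default.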
The numerical approaches, however, resemble black boxes incapable of justifying their decisions: no one knows exactly what the algorithm is doing. How, then, can responsibility for the medical decision be assumed? The learning data, in particular, are biased by the prejudices of the time and of the designers. The algorithm therefore tends to reproduce, and even reinforce, those same prejudices. In the medical domain, the main biases stem from the overrepresentation of a category of people, such as the elderly or patients of a specific geographic origin.
The challenge of the future is to combine approaches
Some projects are attempting to combine the symbolic and learning approaches, in order to benefit from the reasoning of one and the performance of the other. In the aforementioned Lerudi project, for instance, ontologies (symbolic AI) are constructed using numerical text-search algorithms.
Another example: the interpretation of pediatric medical images is of major importance for diagnosis, patient follow-up and surgical planning. It involves detecting, segmenting and recognizing anatomically normal and pathological structures, and proposing 3D visualizations of them. Given the difficulty of these tasks, it is important to combine the numerical information extracted from the images, which is specific to the patient, with generic models representing anatomical knowledge in the form of knowledge bases, ontologies, graphs, etc. This is particularly crucial for pediatric images, which must be acquired over durations kept as short as possible and which show structures that are often small and highly variable from one patient to another.
This dual approach is also particularly pertinent for exploiting the varied patient data (genomic, clinical, imaging and lab-test data) that will be gathered on one platform in the framework of the French Plan for Genomic Medicine 2025. AI will make it possible to manage this considerable quantity of data by supplying classifications or ontologies describing patients' clinical characteristics. Machine learning will make it possible to identify patient profiles that take all of these data into account. It will then be possible to personalize treatments and improve their success rates, starting notably with cancer, rare diseases and diabetes.
Helping but not replacing doctors
Some see in the medical applications of AI the possibility of replacing doctors, whether to overcome medical deserts or to triage and orient patients. But the use of such software by the general public without medical supervision raises major ethical questions. The system reduces the relationship with the doctor to a technical act, and the patient is left to deal with their questions and worries alone.
Furthermore, the risk that the doctor yields to the machine "which knows better" is real. He may be required to shoulder a decision that is not his own, only to discover later that the machine got it wrong. To avoid this, the doctor, the only person authorized to make a diagnosis, must maintain his autonomy in the face of the machine. He must be in a position to understand the why and the how of the decisions produced, and to override them if needed.
With this in mind, Allistène’s Commission for the Ethics of Research in Information Sciences and Technologies (CERNA) emphasizes the need to design systems whose functioning is transparent, explained and traceable, and which perform the specified tasks while respecting specific constraints. For decision-support systems based on learning algorithms, ensuring such conformity is far from straightforward.
Cognitive sciences: source of inspiration and field of application
Despite the enormous computing power of current machines, no existing application can claim to be genuinely intelligent: for that, it would need to be multitasking and able to react correctly in unforeseen, non-preprogrammed situations. There is a long way to go yet.
To make headway, researchers are trying to understand the behavior of neurons and their connections, in order to mimic the brain. This work may one day make it possible to create robots that imitate human intelligence. In the meantime, it helps us better understand how the brain works and elucidate the causes of certain diseases of cerebral origin, such as Alzheimer's, Parkinson's and Charcot's disease (ALS). This objective motivates the European Union's participation, as part of its Future and Emerging Technologies flagship initiative, in the Human Brain Project, whose aim is to build a world-class information-technology infrastructure that the scientific community can use to simulate brain functioning under specific experimental conditions.