Artificial Intelligence (AI) is making a significant impact in the medical field, opening new and revolutionary possibilities for diagnosing, treating, and preventing diseases. The analysis of large-scale health data feeds increasingly sophisticated algorithms that can detect patterns, predict risks, optimise costs, personalise therapies, and more. However, this transition from a digital society to an algorithmic society must not come at the expense of a fundamental right: the protection of citizens' data.
The European Union is preparing to release the final version of the Artificial Intelligence Act (AI Act). This is a significant development, as it will provide a much-needed framework for the regulation of AI in the EU. The AI Act aims to ensure that AI is used safely and in a way that respects fundamental rights and the principles of a democratic society.
However, there are concerns about the impact of the AI Act on the processing of personal data, privacy, and the cybersecurity of businesses. These concerns are particularly relevant in the context of health-related data, which is considered to be sensitive and can be used to discriminate against individuals.
What are the data protection risks? How should we behave in this new scenario? What will be the implications of the AI Act for the future of medicine? How will the need for innovation be reconciled with privacy protection?
This article complements our series of articles on AI previously published on our Business Blog, providing an in-depth analysis of the use of AI in the processing of health data and its implications for the protection of the rights and freedoms of individuals.
What are the risks to privacy?
The use of AI systems in healthcare involves the processing of particularly sensitive data, such as diagnoses, reports, clinical images, and genetic information. The leakage or misuse of this data could have devastating consequences for the privacy and dignity of individuals.
For example, consider an AI system that analyses medical records to identify patients at risk of diabetes. If this information were to be stolen or used for commercial purposes, the consequences could be disastrous.
Health data is considered sensitive because it could be used to discriminate against people in various situations. There are two main aspects to consider:
- Misuse by third parties: This could lead to various types of discrimination and other serious consequences for people:
  - In employment: Employers could use medical records to discriminate against potential employees or to dismiss employees with certain health conditions.
  - Access to credit: Banks and financial institutions could use medical records to deny loans or mortgages to individuals with specific pathologies.
  - Exclusion from insurance coverage: Insurance companies could use health data to deny coverage or increase premiums for people with pre-existing conditions; for example, a life insurer might refuse to insure individuals with certain health problems, or a travel insurer might refuse cover for pre-existing conditions. In some cases, health data can also be used to limit access to certain health services.
- For extortion purposes: Medical data can be used to threaten people, for example by threatening to disclose sensitive information unless certain demands are met, or to take revenge on someone for a perceived wrong.
For example, imagine that a person has a genetic disease that increases the risk of a particular type of cancer. If this information were to be breached, it could be used against them in various ways: a health insurance company could deny coverage for cancer, an employer could dismiss them for fear of future illness, or an individual could pressure them by threatening to make their condition public.
In addition, health information is an attractive target for cybercriminals and hackers. The theft of such data could fuel a growing black market, with serious economic and psychological consequences for the people involved.
Another significant threat is that once health data is digitised, it becomes a valuable resource for large technology companies. There is a temptation to use this information for commercial purposes, such as advertising profiling or the development of new pharmaceutical products.
Other potential threats may arise from the design or configuration of the algorithms themselves, in particular:
- Health profiling: The large-scale analysis of health data could lead to the creation of personalised "health profiles" of individuals or population groups. This sensitive data, including diagnoses, treatments, genetic predispositions, and lifestyles, becomes a valuable resource for insurance companies, pharmaceutical industries, and other stakeholders in the sector. Discriminatory algorithms could perpetuate harmful stereotypes or limit access to crucial care and services: the processing of ever-larger data sets and/or errors made during the development and training phases of AI systems could lead to abuse and discrimination against people, denying them certain coverage, job opportunities, and so on. The quality of the predictive capabilities of an AI system depends on a plurality of elements, such as the number, quality, and accuracy of the training data. The Italian Data Protection Authority, in a provision of 24 January 2024, recalled that the GDPR establishes the principle of non-discrimination by algorithms (cf. recital 71). Based on this principle, the data controller must use appropriate mathematical or statistical procedures for profiling, implement technical and organisational measures to guarantee the security of personal data and minimise the risk of error, correct the factors that cause inaccuracies in the data, and prevent discriminatory effects against individuals (a minimal illustration of such a check is sketched after this list).
- Opacity of some AI systems: The lack of transparency in some artificial intelligence systems makes it difficult to understand the logic behind automated decisions. This opacity undermines citizens' trust and hinders their control over the use of their data. In Italy, there is a lively debate on the data protection impact of artificial intelligence in the medical sector. In this regard, we recall the ruling of the Council of State of 13 December 2019, according to which "everyone has the right to know the existence of automated decision-making processes concerning them and, in this case, to receive significant information on the logic used".
- The risk of de-anonymisation (so-called reverse engineering): Even when health data is anonymised, there is a fear that it can be linked back to the original individual, compromising the protection of personal data. Advanced algorithms and data cross-referencing techniques can correlate health information with other data from social networks, wearables, health apps, and other systems, thus revealing the identity of patients (see the linkage sketch below). In a recent decision of 8 February 2024, the Italian Data Protection Authority issued a provision recalling the need to adopt appropriate anonymisation techniques for the creation of an international database aimed at improving patient care through the collection and analysis of health data.
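As a concrete, deliberately simplified illustration of the non-discrimination checks recalled above, the following Python sketch compares the positive-decision rates of a hypothetical automated system across two groups (so-called demographic parity, one of many possible fairness measures). The data, group labels and choice of metric are all assumptions for illustration, not a prescribed compliance method.

```python
# Minimal sketch of one possible non-discrimination check: comparing an
# AI system's positive-decision rates across two groups (demographic
# parity). All decisions and group labels are hypothetical.
def positive_rate(decisions):
    """Share of positive (1) outcomes in a list of 0/1 decisions."""
    return sum(decisions) / len(decisions)

# Hypothetical automated decisions (1 = coverage granted) per group.
group_a = [1, 1, 0, 1, 1, 0, 1, 1]
group_b = [1, 0, 0, 1, 0, 0, 1, 0]

disparity = abs(positive_rate(group_a) - positive_rate(group_b))
print(f"Demographic parity difference: {disparity:.2f}")
# A large gap would prompt the controller to review training data and
# model features, in line with the corrective duties recalled by the GDPR.
```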
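To see how de-anonymisation can work in practice, here is a minimal, hypothetical linkage sketch in Python: an "anonymised" medical table still carries quasi-identifiers (postcode, birth date, sex), and joining it with a public dataset on those columns re-attaches names to diagnoses. All records and column names are invented.

```python
# Minimal sketch of a linkage (re-identification) attack on "anonymised"
# health data. All records and column names are made up for illustration.
import pandas as pd

# An "anonymised" medical dataset: direct identifiers removed,
# but quasi-identifiers (postcode, birth date, sex) retained.
health = pd.DataFrame([
    {"zip": "20121", "birth_date": "1980-03-14", "sex": "F", "diagnosis": "type 2 diabetes"},
    {"zip": "00184", "birth_date": "1975-11-02", "sex": "M", "diagnosis": "hypertension"},
])

# A public or leaked dataset (e.g. a register or profile dump)
# containing names alongside the same quasi-identifiers.
public = pd.DataFrame([
    {"name": "Maria Rossi", "zip": "20121", "birth_date": "1980-03-14", "sex": "F"},
    {"name": "Luca Bianchi", "zip": "00184", "birth_date": "1975-11-02", "sex": "M"},
])

# Joining on the quasi-identifiers re-attaches identities to diagnoses.
reidentified = health.merge(public, on=["zip", "birth_date", "sex"])
print(reidentified[["name", "diagnosis"]])
```

The lesson is that removing names is not enough: effective anonymisation also requires generalising or suppressing the quasi-identifiers that make such joins possible.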
How to reconcile innovation and privacy?
AI has the potential to revolutionise the Life Sciences sector, but only if privacy is considered an essential element for a safe and transparent future. Europe has equipped itself with a solid regulatory framework for the protection of personal data, the General Data Protection Regulation (GDPR). The GDPR sets out rigorous principles for the processing of health data, which must be:
- Lawful, fair and transparent,
- Carried out for specified, explicit and legitimate purposes,
- Limited to data that is adequate, relevant and necessary for the purposes pursued,
- Limited in time, with data stored for no longer than is necessary for the purposes of the processing, and
- Carried out in such a way as to ensure the security and confidentiality of the data.
In addition, the adoption of appropriate technical and organisational measures is essential to minimise the risks associated with the processing of health data. Pseudonymisation and anonymisation of data, encryption and the adoption of rigorous security protocols are just a few examples.
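By way of illustration only, a minimal pseudonymisation sketch in Python might look like the following, using a keyed HMAC so that pseudonyms are stable but cannot be reversed without the secret key. The key value and record fields are hypothetical.

```python
# Minimal sketch of keyed pseudonymisation: replacing a direct identifier
# with a stable pseudonym derived via HMAC. The secret key must be stored
# separately from the data; whoever holds it can re-link the pseudonyms.
import hmac
import hashlib

SECRET_KEY = b"keep-this-key-in-a-separate-vault"  # hypothetical key

def pseudonymise(identifier: str) -> str:
    """Derive a deterministic pseudonym from a patient identifier."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"patient_id": "IT-HC-000123", "diagnosis": "hypertension"}
record["patient_id"] = pseudonymise(record["patient_id"])
print(record)  # the diagnosis is kept, the direct identifier is not
```

Note that pseudonymised data remains personal data under the GDPR, since re-identification is possible for whoever holds the key; anonymisation requires stronger, irreversible techniques.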
Finally, the training of healthcare professionals and the awareness of citizens are two key aspects of an effective privacy culture. All actors involved must be aware of the risks and best practices for the protection of health data.
Europe, which is at the forefront of privacy protection, aims to develop reliable and responsible artificial intelligence that puts privacy and data security at the centre. The European institutions, as well as the individual Data Protection Authorities of the EU Member States, have published and are developing detailed opinions and guidelines on the application of artificial intelligence in the health sector.
The European Union has adopted a series of measures to protect the privacy of citizens, even in the age of artificial intelligence. At the European level, the GDPR represents the cornerstone of the privacy protection system and establishes rigorous principles for the processing of personal data, including informed consent and data minimisation. However, applying it to artificial intelligence in the healthcare sector poses complex challenges. How can a balance be struck between innovation and data protection? How can the transparency of algorithms and the right of access to information be ensured? How can cybersecurity be strengthened and data breaches prevented?
At the local level, the Italian Data Protection Authority has also presented a ten-point list on the use of artificial intelligence in the healthcare sector, offering practical guidelines for companies and public administrations that wish to use these technologies. Specifically, the "Decalogue of the Privacy Guarantor on the Use of Artificial Intelligence in Healthcare" (issued in September 2023) sets out ten principles for a responsible and GDPR-compliant use of artificial intelligence in the healthcare field:
- the identification of appropriate legal bases for data processing,
- the adoption of security measures by design and by default,
- the definition of the roles of the various parties involved,
- the guarantee of knowability, non-exclusivity and non-discrimination of algorithms,
- the performance of a data protection impact assessment (DPIA),
- the guarantee of data quality,
- the protection of the integrity and confidentiality of personal information,
- the guarantee of correctness and transparency in automated decision-making processes,
- the implementation of appropriate security measures, and
- the respect of the rights of data subjects.
The Decalogue represents a fundamental tool to promote a responsible and safe use of artificial intelligence in the healthcare sector with full respect for individual rights and freedoms.
In order to develop trustworthy AI systems that involve the processing of health data, some concrete solutions can be adopted, such as:
- Development of trustworthy AI systems and robust governance models: AI must be designed and used in a way that respects the ethical principles and privacy of citizens. During the development of AI systems, it is essential to promote transparency and control. Before their data can be processed by algorithms and/or entered into databases, individuals must have the right to be informed about how their data is used and what decisions are made based on it. In addition, it is useful to implement control and supervision systems to guarantee the ethics and reliability of AI systems in the healthcare field.
- Advanced pseudonymisation and anonymisation: These are de-identification techniques used to make health data unusable for individual identification while preserving its usefulness for analysis. Today, there are also solutions on the market for so-called Differential Privacy, which allow data to be anonymised at the time of collection by adding noise or an element of randomness (randomisation); a minimal sketch appears after this list.
- Use of synthetic data (artificial, computer-generated data that mimics real data): Synthetic data can be useful for privacy-preserving research and analysis, but it also presents re-identification risks. The current EU GDPR struggles to regulate synthetic data, especially fully synthetic data that is not considered personal data. Legal scholarship proposes a paradigm shift towards clearer guidelines for all types of synthetic data, prioritising transparency, accountability and fairness. Such guidelines can help mitigate potential risks and encourage responsible innovation in the use of synthetic data (see the second sketch after this list).
- Awareness and training: Educate citizens and healthcare professionals about the risks and opportunities of AI concerning privacy.
- Investments in cybersecurity: Health data must be protected from cyberattacks and security breaches.
- Public-private collaboration: The research and development of trustworthy AI systems requires collaboration between authorities, researchers, companies and citizens.
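For the Differential Privacy solutions mentioned above, a minimal sketch of the underlying idea, here the classic Laplace mechanism applied to a count query, could look as follows. The dataset, epsilon value and query are illustrative assumptions, not a production design.

```python
# Minimal sketch of the Laplace mechanism from differential privacy:
# a count query over health records is answered with calibrated noise,
# so no single individual's presence measurably changes the output.
import numpy as np

def dp_count(records, predicate, epsilon: float) -> float:
    """Differentially private count of records satisfying `predicate`."""
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0  # one person changes a count by at most 1
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Hypothetical cohort; only the noisy aggregate is released.
patients = [{"age": 54, "diabetic": True}, {"age": 41, "diabetic": False},
            {"age": 67, "diabetic": True}]
print(dp_count(patients, lambda r: r["diabetic"], epsilon=1.0))
```

Smaller epsilon values mean more noise and stronger privacy; real deployments also track a cumulative privacy budget across queries.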
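Likewise, the synthetic-data approach can be illustrated with a deliberately naive sketch: fit a simple distribution to real values and sample artificial records from it. Production tools (GAN-, copula- or agent-based generators) are far more sophisticated; all values below are invented.

```python
# Minimal sketch of fully synthetic data generation: fit a simple marginal
# distribution to real data, then sample new, artificial records from it.
import numpy as np

rng = np.random.default_rng(seed=42)

# "Real" cohort: systolic blood pressure readings (illustrative values).
real_bp = np.array([118, 132, 127, 145, 139, 122, 151, 130])

# Fit a normal distribution to the real data and sample synthetic patients.
mu, sigma = real_bp.mean(), real_bp.std()
synthetic_bp = rng.normal(mu, sigma, size=100)

# The synthetic values mimic the statistics of the originals without
# corresponding to any real patient.
print(round(synthetic_bp.mean(), 1), round(synthetic_bp.std(), 1))
```

Even fully synthetic data can leak information if the generative model memorises rare records, which is why the re-identification risks mentioned above remain relevant.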
Towards a future of care and respect for privacy
Innovation in the field of artificial intelligence should not compromise the fundamental rights and freedoms of citizens. The competent authorities, together with the scientific and industrial world, must ensure that the development and use of AI take place ethically and responsibly.
The AI Act represents an important step in this direction. The creation of a European AI Office within the European Commission, a scientific panel of independent experts, and a European Artificial Intelligence Board will ensure the monitoring of artificial intelligence models and the development of standards and control procedures.
In addition, the imposition of significant fines for violations, which can reach EUR 35 million or 7% of global annual turnover, will make companies more accountable for ensuring and demonstrating their compliance with the EU rules.
In conclusion, the challenge lies in finding a fair balance between the potential of artificial intelligence in the medical field and the protection of the fundamental rights of citizens. It is essential to develop artificial intelligence systems that are reliable, transparent and respectful of privacy. Thanks to the joint efforts of authorities, researchers, businesses and citizens, we can shape a tomorrow in which artificial intelligence is at the service of the health and well-being of all while preserving privacy.