What is it? Why is it important?
Data coding and anonymisation are processes by which Sensitive Information (SI) is removed or encrypted in order to comply with privacy protection laws.
Coding: Participant identifiers (e.g. name, date of birth, address) are replaced with an individualised ID-code. This code is then used throughout the study (e.g. in the CRF). A separate, access-protected log (e.g. the participant identification log) documents the match between ID-code and participant identifiers.
Anonymisation: Participant identifiers (e.g. name, date of birth, address) are irreversibly removed or altered in such a way that participants can no longer be identified. Anonymisation can be challenging to implement. Anonymisation methods must be explained in detail, traceable, and robust in order to prevent the re-identification of participants.
Sensitive Information includes:
- Evident identifiers such as name, date of birth, or personal address
- Less obvious identifiers, which when used in conjunction with other data can lead to the identification of a participant (e.g. date of visits, rare diseases or conditions, marital status, number of children, religion, and race)
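The coding step described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the field names in `DIRECT_IDENTIFIERS`, the function `code_participants`, and the sample record are all hypothetical and fictional, and a real study would implement this in its database together with the data manager.

```python
# Hypothetical field names; adapt them to the identifiers collected in the study.
DIRECT_IDENTIFIERS = ("name", "date_of_birth", "address")


def code_participants(records, id_codes):
    """Split raw records into coded study data and an identification log.

    records  -- list of dicts holding identifiers and study data
    id_codes -- list of ID-codes, one per record, in the same order
    """
    if len(records) != len(id_codes):
        raise ValueError("need exactly one ID-code per record")
    if len(set(id_codes)) != len(id_codes):
        raise ValueError("ID-codes must be unique")

    coded_data = []
    identification_log = {}
    for record, id_code in zip(records, id_codes):
        # The log keeps the identifiers; it must be stored separately
        # from the study data and be access protected.
        identification_log[id_code] = {
            key: record[key] for key in DIRECT_IDENTIFIERS if key in record
        }
        # The coded record keeps only the ID-code and non-identifying fields.
        coded = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        coded["id_code"] = id_code
        coded_data.append(coded)
    return coded_data, identification_log


# Fictional sample record, for illustration only.
raw = [
    {
        "name": "Jane Doe",
        "date_of_birth": "1980-02-29",
        "address": "1 Example Street",
        "systolic_bp": 128,
    }
]
coded_data, identification_log = code_participants(raw, ["TGF-2-14-05"])
```

The two return values are kept apart on purpose: only `coded_data` would be used in the study, while `identification_log` corresponds to the separate log that links ID-codes to participants.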
What do I need to do?
- Identify study data that qualifies as SI
- Identify and collect only the SI needed for the interpretation or evaluation of the study
- Describe planned coding/anonymisation procedures, such as how to:
  - Replace SI with individualised participant ID-codes
  - Encrypt SI. This requires a decryption program in order to retrieve the SI at a later date
  - Define additional data that could potentially lead to participant identification. Ensure this data is removed prior to data export (e.g. data sharing with other researchers, statisticians, or laboratories)
- Document the anonymisation procedures, which must be submitted to the Ethics Committee (EC) for approval
Consult with the data manager on how to best implement coding/anonymisation procedures in the study database.
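Removing potentially identifying data prior to export could look like the following sketch. The field names in `INDIRECT_IDENTIFIERS` and the function `prepare_for_export` are assumptions made for illustration; which fields count as identifying depends on the study and should be defined in the documented procedures.

```python
# Hypothetical field names for less obvious identifiers; the actual list
# must be defined per study.
INDIRECT_IDENTIFIERS = frozenset(
    {"visit_date", "marital_status", "number_of_children", "religion", "race"}
)


def prepare_for_export(coded_records, fields_to_remove=INDIRECT_IDENTIFIERS):
    """Return copies of the coded records without the listed fields."""
    return [
        {key: value for key, value in record.items() if key not in fields_to_remove}
        for record in coded_records
    ]


# Fictional coded record, for illustration only.
coded_records = [
    {
        "id_code": "TGF-2-14-05",
        "visit_date": "2019-04-01",
        "marital_status": "married",
        "systolic_bp": 128,
    }
]
export_records = prepare_for_export(coded_records)
```

The function returns new dictionaries and leaves the study data untouched, so the same coded dataset can be exported with different field lists for different recipients.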
Encrypted data is unreadable to anyone except a defined group of individuals. The process works as follows:
- The data is passed through a cipher, i.e. an algorithm that encodes data according to a key
- Only individuals who possess the key can decrypt the data and read its content
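The relationship between cipher, key, and readable content can be illustrated with a toy example. The sketch below uses only the Python standard library and a one-time-pad style XOR cipher, in which a fresh random key as long as the data is generated for each message. It is meant purely to show the principle; the function names are hypothetical, and a real study should rely on a vetted encryption tool chosen together with the data manager rather than on code like this.

```python
import secrets


def encrypt(plaintext: bytes):
    """Return (key, ciphertext) using a fresh random key as long as the data.

    Illustration only: the key must never be reused and must be stored
    separately from the ciphertext.
    """
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Recover the plaintext; this only works with the matching key."""
    if len(key) != len(ciphertext):
        raise ValueError("key and ciphertext must have the same length")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


# Fictional participant name, for illustration only.
key, ciphertext = encrypt("Jane Doe".encode("utf-8"))
recovered = decrypt(key, ciphertext).decode("utf-8")
```

Without `key`, the bytes in `ciphertext` carry no readable information; with it, `decrypt` returns the original text.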
Example of how to create participant ID-codes
- Define a prefix that represents the study
- In multicentre studies, define an individual code for each study site
- Define participant screening numbers that represent participants who were screened for the study. Screening does not always mean that participants are included in the study
- Add participant serial numbers (e.g. participant 01, 02, 03, …)
- Separate the numbers using a symbol, e.g. a dash (-) or an underscore (_)
- Example: Study_Site_Screening_Participant = TGF-2-14-05
A coding system can also be predefined or programmed in the study's CDMS.
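The steps above can be sketched as a small Python function. The function name `make_participant_id` and the two-digit zero padding of the serial number are assumptions made so that the output matches the example TGF-2-14-05; the actual format is defined per study.

```python
def make_participant_id(study_prefix, site_code, screening_number,
                        serial_number, separator="-"):
    """Build an ID-code of the form Study-Site-Screening-Participant."""
    if separator not in ("-", "_"):
        raise ValueError("separator must be a dash or an underscore")
    parts = [
        study_prefix,
        str(site_code),
        str(screening_number),
        f"{serial_number:02d}",  # assumed two-digit serial: 01, 02, 03, ...
    ]
    return separator.join(parts)


# Study TGF, site 2, screening number 14, participant serial number 5.
example_id = make_participant_id("TGF", 2, 14, 5)
```

Here `example_id` reproduces the ID-code from the example above, and passing `separator="_"` yields the underscore variant.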
Where can I get help?
Your local CTU can support you with experienced staff regarding this topic
Basel, Departement Klinische Forschung, CTU, dkf.unibas.ch
Lugano, Clinical Trials Unit, CTU-EOC, www.ctueoc.ch
Bern, Clinical Trials Unit, CTU, www.ctu.unibe.ch
Geneva, Clinical Research Center, CRC, crc.hug.ch
Lausanne, Clinical Research Center, CRC, www.chuv.ch
St. Gallen, Clinical Trials Unit, CTU, www.kssg.ch
Zürich, Clinical Trials Center, CTC, www.usz.ch
Swissethics – search for
- Coding of trial subjects accepted by swissethics
FADP – Federal Act on Data Protection
GCDMP – see in particular
- Chapter “Data Privacy”
SCTO Regulatory Affairs – see in particular
- RAW Issue 1, April 2019, Essential information on data protection
ICH GCP E6(R2) – see in particular guidelines
- 2.11 Confidentiality of records
- 4.9 Records and reports
- 5.5 Trial management, data handling, and record-keeping
HRA – see in particular articles
- Art. 3 Definition of coded and anonymised health related data and biological material
- Art. 56 Transparency and data protection