What is it? Why is it important?
Data Coding and Anonymisation is a process by which Sensitive Information (SI) is removed or encrypted in order to comply with privacy protection laws.
Coded Data
Participant SI (e.g. name, DOB, address) is replaced with an individualised ID-Code. This code is subsequently used throughout the study (e.g. in the CRF). A separate, access-protected log (e.g. a participant identification log) documents the match between ID-Code and participant identifiers.
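The separation between coded study data and the identification log can be sketched as follows. This is a minimal illustration, not a validated procedure: the function name `code_participants`, the field names, and the sample records are all hypothetical, and a real identification log would be stored separately under access protection rather than kept in memory next to the study data.

```python
def code_participants(records, study_prefix, si_fields=("name", "dob", "address")):
    """Split raw records into coded study data and a separate identification log.

    All names here are illustrative assumptions, not part of any specific CDMS.
    """
    coded_data = []
    identification_log = {}
    for number, record in enumerate(records, start=1):
        id_code = f"{study_prefix}-{number:02d}"
        # The log keeps the identifiers; in practice it is stored separately
        # from the study data and is access protected.
        identification_log[id_code] = {f: record[f] for f in si_fields if f in record}
        # The coded record keeps only the non-SI fields plus the ID-Code.
        coded = {k: v for k, v in record.items() if k not in si_fields}
        coded["id_code"] = id_code
        coded_data.append(coded)
    return coded_data, identification_log


# Fictitious sample data for illustration only.
records = [
    {"name": "Jane Example", "dob": "1980-01-31", "address": "1 Sample Street", "systolic_bp": 128},
    {"name": "John Sample", "dob": "1975-06-15", "address": "2 Test Road", "systolic_bp": 141},
]
coded_data, identification_log = code_participants(records, "TGF")
```

Only `coded_data` would be used in the study; `identification_log` is the only place where an ID-Code can be matched back to a participant.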
Anonymised Data
Participant SI (e.g. name, DOB, address) is irreversibly removed or altered in such a way that participants can no longer be identified. Anonymisation can be challenging to implement. Anonymisation methods must be explained in detail, and must be traceable and robust, in order to prevent the re-identification of participants.
Participant SI includes:
- Evident identifiers such as name, DOB, or personal address
- Less obvious identifiers, which can lead to the identification of a participant when used in conjunction with other data (e.g. date of visits, rare diseases or conditions, marital status, number of children, religion, race)
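To illustrate how less obvious identifiers can single out a participant when combined, the sketch below counts how often each combination of values occurs in a dataset. It is a simple check under stated assumptions: the function name `unique_combinations`, the field names, and the records are hypothetical, and a combination that occurs only once is treated here as a potential re-identification risk.

```python
from collections import Counter


def unique_combinations(records, fields):
    """Return the value combinations of `fields` that occur for exactly one record."""
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return [combo for combo, n in counts.items() if n == 1]


# Fictitious sample data for illustration only.
records = [
    {"marital_status": "married", "children": 2, "condition": "hypertension"},
    {"marital_status": "married", "children": 2, "condition": "hypertension"},
    {"marital_status": "single", "children": 5, "condition": "rare disease X"},
]
risky = unique_combinations(records, ["marital_status", "children", "condition"])
```

Here the third record is the only one with its combination of values, so it could point to a single participant even though no name or address is present.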
What do I need to do?
As a Site-INV:
- Identify study data that qualifies as SI
- Identify and collect only the SI needed for the interpretation / evaluation of the study
- Describe planned coding/anonymisation procedures that comply with privacy protection laws, such as how to:
  - Replace SI with individualised participant ID-Codes
  - Encrypt SI. This requires decryption procedures in the event that SI must be retrieved at a later date
  - Define data potentially leading to participant identification. Ensure this information is removed prior to data transfer (e.g. to other researchers, statisticians, laboratories)
- Submit coding / anonymisation procedures to EC for approval
Consult with the data manager on how to best implement coding/anonymisation procedures in the study database.
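Removing potentially identifying data before a transfer can be sketched as a simple filtering step. This is an illustration only: the function name `prepare_for_transfer`, the list of fields to remove, and the sample record are hypothetical, and which fields count as identifying has to be defined per study.

```python
def prepare_for_transfer(coded_records, fields_to_remove):
    """Return copies of the records without the listed fields.

    The input records are left unchanged, so the study database keeps its full content.
    """
    return [
        {key: value for key, value in record.items() if key not in fields_to_remove}
        for record in coded_records
    ]


# Hypothetical, study-specific list of fields that may lead to identification.
FIELDS_TO_REMOVE = {"visit_date", "marital_status", "religion"}

# Fictitious sample data for illustration only.
coded_records = [
    {"id_code": "TGF-2-05", "visit_date": "2019-04-02", "marital_status": "married", "hba1c": 6.1},
]
transfer_set = prepare_for_transfer(coded_records, FIELDS_TO_REMOVE)
```

Working on copies keeps the removal step traceable: the original coded data and the transferred subset can be compared at any time.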
More
Encrypted data is rendered unreadable to everyone except a defined group of individuals. In this process:
- The data is passed through a cipher, i.e. a secret, disguised way of writing (e.g. an algorithm that encodes data according to a key)
- Only individuals who possess the key needed to decrypt the data can read its content
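The relationship between data, cipher, and key can be shown with a deliberately simple toy cipher. The sketch below uses a one-time pad (XOR with a random key of the same length as the data), chosen only because it runs with the Python standard library. It is an assumption made for illustration and is not suitable for protecting real study data, where a vetted encryption tool agreed with the data manager would be used. The names `generate_key` and `xor_cipher` are hypothetical.

```python
import secrets


def generate_key(length):
    """Create a random key of `length` bytes; it must be stored securely and never reused."""
    return secrets.token_bytes(length)


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing each data byte with the matching key byte.

    Applying the function twice with the same key returns the original data.
    """
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Fictitious sample data for illustration only.
plaintext = "Jane Example, 1980-01-31".encode("utf-8")
key = generate_key(len(plaintext))
ciphertext = xor_cipher(plaintext, key)                  # unreadable without the key
recovered = xor_cipher(ciphertext, key).decode("utf-8")  # readable again with the key
```

Without `key`, `ciphertext` cannot be turned back into the original text, which is why documented decryption procedures and key custody matter if SI must be retrieved later.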
Example on how to create participant ID-Codes
- When managing multiple studies, define a prefix to easily identify the study (e.g. TGF)
- In multicentre studies, define a different code for each study site (e.g. S1, S2, S3, …)
- Add the participant inclusion number (e.g. 01, 02, 03, …)
- Separate the ID components using a symbol (e.g. dash (-) or underscore (_))
- Example participant ID-Code: Study_Site_Participant = TGF-2-05
A coding system can also be predefined in the CDMS of the study.
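The ID-Code scheme described above can be sketched as a small helper. The function name `make_id_code` and its parameters are hypothetical; the two-digit inclusion number and the choice of dash or underscore follow the example given above, and a real study may define a different format in its CDMS.

```python
def make_id_code(study_prefix, site_code, inclusion_number, separator="-"):
    """Build a participant ID-Code from study prefix, site code and inclusion number."""
    if separator not in ("-", "_"):
        raise ValueError("separator must be a dash or an underscore")
    # The inclusion number is zero-padded to two digits (e.g. 5 becomes "05").
    return separator.join([study_prefix, str(site_code), f"{inclusion_number:02d}"])


example_code = make_id_code("TGF", 2, 5)          # "TGF-2-05"
site_style_code = make_id_code("TGF", "S2", 5, "_")  # "TGF_S2_05"
```

Generating the codes in one place helps keep the format consistent across sites and avoids manual typing errors.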
Where can I get help?
Your local CTU has experienced staff who can support you on this topic:
Basel, Departement Klinische Forschung, CTU, dkf.unibas.ch
Lugano, Clinical Trials Unit, CTU-EOC, www.ctueoc.ch
Bern, Clinical Trials Unit, CTU, www.ctu.unibe.ch
Geneva, Clinical Research Center, CRC, crc.hug.ch
Lausanne, Clinical Research Center, CRC, www.chuv.ch
St. Gallen, Clinical Trials Unit, CTU, www.kssg.ch
Zürich, Clinical Trials Center, CTC, www.usz.ch
External Links
Swissethics – see in particular
- Topics / Other Topics / Coding of trial subjects accepted by swissethics
FADP – Federal Act on Data Protection
GCDMP – see in particular
- Chapter “Data Privacy”
SCTO Regulatory Affairs – see in particular
- RAW Issue 1, April 2019, Essential information on data protection
References
ICH GCP E6(R2) – see in particular guidelines
- 2.11 Confidentiality of records
- 4.9 Records and reports
- 5.5 Trial management, data handling, and record-keeping
Swiss Law
HRA – see in particular articles
- Art. 3 Definition of coded and anonymised health related data and biological material
- Art. 56 Transparency and data protection