Pseudonymisation

From Endeavour Knowledge Base
Latest revision as of 14:41, 15 May 2023

Discovery uses the NHS standard Open Pseudonymiser to create pseudonymised data. It takes one or more inputs (NHS number, date of birth) and a salt file, and generates a pseudo ID: a 64-character hexadecimal string that looks like "A541CAF13D376B9AD1072C3096AE141CFF1E67B027CEB632D194D3C6577AB8BF".

Details on the process can be viewed at https://www.openpseudonymiser.org/OpenPseudonymiser_Docs.aspx

By using different salt files you can generate different pseudo IDs from the same input data. Each research project uses a different salt, so each project has its own set of pseudo IDs for the patient records it uses.
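The effect of the salt can be illustrated with a minimal sketch. It assumes a salted SHA-256 digest, which matches the 64-character hexadecimal shape of the example ID, but the field ordering, separators and the `pseudo_id` helper are hypothetical and are not Open Pseudonymiser's actual input format. The patient values are made up.

```python
import hashlib


def pseudo_id(inputs: dict[str, str], salt: bytes) -> str:
    """Return an uppercase hex SHA-256 digest of the inputs plus the salt.

    Illustrative only: the "name=value|..." layout is an assumption,
    not the format Open Pseudonymiser uses.
    """
    # Sort by field name so the same inputs always produce the same digest.
    joined = "|".join(f"{name}={inputs[name]}" for name in sorted(inputs))
    return hashlib.sha256(joined.encode("utf-8") + salt).hexdigest().upper()


# Made-up patient details, not real data.
patient = {"nhs_number": "9999999999", "date_of_birth": "1980-01-31"}

# Two projects with different salts get unrelated pseudo IDs for the same patient.
id_project_a = pseudo_id(patient, b"salt-for-project-a")
id_project_b = pseudo_id(patient, b"salt-for-project-b")

print(id_project_a)
print(id_project_b)
```

The same inputs and the same salt always give the same pseudo ID, so records for one patient can still be linked within a project, while two projects cannot match their pseudo IDs against each other without sharing a salt.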

Pseudo IDs are generated in DDS subscriber databases according to subscriber-specific configuration, which derives them either from the NHS number only or from the NHS number and date of birth.
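A rough sketch of how such a per-subscriber choice of inputs might look is given below. The `SubscriberConfig` class, its `use_date_of_birth` flag and the hashing layout are all hypothetical, chosen for illustration; they do not describe the real DDS configuration or Open Pseudonymiser's input format.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubscriberConfig:
    """Hypothetical per-subscriber setting; not a real DDS structure."""
    name: str
    use_date_of_birth: bool


def pseudo_id(config: SubscriberConfig, salt: bytes, nhs_number: str,
              date_of_birth: Optional[str] = None) -> str:
    """Hash only the inputs that this subscriber's configuration selects."""
    parts = [nhs_number]
    if config.use_date_of_birth:
        if date_of_birth is None:
            raise ValueError("this configuration requires a date of birth")
        parts.append(date_of_birth)
    # Assumed layout: inputs joined with "|" followed by the salt bytes.
    data = "|".join(parts).encode("utf-8") + salt
    return hashlib.sha256(data).hexdigest().upper()


nhs_only = SubscriberConfig("subscriber_a", use_date_of_birth=False)
nhs_and_dob = SubscriberConfig("subscriber_b", use_date_of_birth=True)
salt = b"example-salt"  # stands in for the contents of a salt file

id_nhs_only = pseudo_id(nhs_only, salt, "9999999999", "1980-01-31")
id_nhs_and_dob = pseudo_id(nhs_and_dob, salt, "9999999999", "1980-01-31")
```

With the NHS-number-only setting the date of birth is ignored, so a corrected date of birth does not change the pseudo ID; with both inputs selected, it does.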

Salt files

A salt file is used to generate a pseudo ID from a given input such as an NHS number.

Each customer or project can supply its own salt files, and each subscriber database supports an unlimited number of pseudo IDs per patient. For example, the CEG database has 30 pseudo IDs for each patient, each generated from the NHS number and a separate salt file.
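Holding several pseudo IDs for one patient can be pictured as one digest per salt file. The sketch below assumes a salted SHA-256 over the NHS number only; the salt names, salt contents and the `pseudo_ids_for_patient` helper are hypothetical, and the hashing layout is not Open Pseudonymiser's actual format.

```python
import hashlib


def pseudo_ids_for_patient(nhs_number: str, salts: dict[str, bytes]) -> dict[str, str]:
    """Return one pseudo ID per named salt for a single NHS number."""
    return {
        salt_name: hashlib.sha256(nhs_number.encode("utf-8") + salt).hexdigest().upper()
        for salt_name, salt in salts.items()
    }


# Hypothetical salts, standing in for the contents of three separate salt files.
salts = {
    "project_1": b"first-salt",
    "project_2": b"second-salt",
    "project_3": b"third-salt",
}

ids = pseudo_ids_for_patient("9999999999", salts)  # made-up NHS number
for salt_name, value in ids.items():
    print(salt_name, value)
```

Adding another project then only means adding another salt and computing one more pseudo ID per patient; the existing pseudo IDs are unaffected.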

For more information see https://www.openpseudonymiser.org/FAQ.aspx

Key server

A key server is a website or API that securely stores and shares salt files.

Discovery doesn't currently provide a key server and instead uses a manual approach to managing salt files. A true automated key server is planned for the future.