Introduction to the anonymization of medical images in DICOM format

  • Marina Chane
  • 7/20/20
  • Updated on 6/6/22

Digital medical imaging has become a fundamental tool for medical diagnosis. Health data is highly sensitive, and it is essential to handle it with the greatest care. Anonymizing data is one of the simplest and most secure ways to protect personal data, but there are some nuances to consider when anonymizing DICOM files.

Images in the DICOM standard

“Digital Imaging and Communications in Medicine (DICOM) was developed to standardise medical image data and to easily share medical image data between computer systems. It is currently the global standard for handling, storing, printing and transmitting information in medical imaging. A DICOM image consists of a DICOM header and the viewable image. The DICOM header saves identifying information of patients and images which may include patient information, study information, institution information, etc.”

Anonymization of DICOMs for sharing and transmission of medical images

“The DICOM format is now used by most of the medical imaging community, not only for clinical practice but also for clinical research raising the possibility of data sharing or exchange. However, sharing sensitive medical image data to a third party demands protection of the data itself to ensure data safety and patient privacy”.1 

Anonymization is a processing operation that uses a set of techniques to make it impossible in practice, and irreversibly so, to identify the person by any means whatsoever. There are, however, exceptions to this irreversibility, for example in the context of clinical research.

The complexity lies in anonymizing while preserving the value of the DICOM dataset. 

DICOM defines de-identification as the protection of confidential attributes by deletion or encryption, and re-identification as the reverse process of this "protective removal". Note that this definition is imperfect: re-identification is not always possible if the original data has been cleaned or deleted.

The objective is therefore to strike a balance between anonymization, pseudonymization, de- and re-identification of DICOM.

This does not simply mean removing confidential information such as patient identifiers and study dates; care is also needed about which information is removed. Certain attributes are required, and removing them would make the DICOM file unusable.

DICOM - The types of data elements

Type 1: Required in the SOP (Service Object Pair) instance and must have a valid, non-empty value.
Type 2: Required in the SOP instance, but may have a length of 0 (no value) if the value is unknown.
Type 3: Optional. May or may not be included, and may have a length of 0.
Type 1C: Conditional. If a condition is met, it is treated as Type 1 (mandatory, cannot be null). If the condition is not met, the tag is not sent.
Type 2C: Conditional. If a condition is met, it is treated as Type 2 (required, zero length allowed). If the condition is not met, the tag is not sent.

This table shows the types of data elements and how they are processed.
SOP: Service Object Pair. This is the combination of an "Information Object" (e.g. an image) with a "Service" (e.g. the printing of this image). It is therefore the SOP classes that define the types of services to be performed and the information that the DICOM files should contain. Example of a tag: (0010,0010), Patient's Name.
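The rules for the five data element types can be sketched as a small validation function. This is a minimal illustration, not a DICOM parser: it assumes a hypothetical, simplified model in which a dataset is a plain Python dict mapping (group, element) tags to string values, and the name `check_element` is invented for this example.

```python
# Hypothetical, simplified model: a dataset is a dict mapping
# (group, element) tags to string values. Real files need a DICOM parser.

def check_element(dataset, tag, elem_type, condition_met=True):
    """Return True if `tag` satisfies the rules for its data element type."""
    present = tag in dataset
    has_value = present and dataset[tag] != ""

    if elem_type in ("1C", "2C"):
        if not condition_met:
            # Condition not met: the element must not be sent at all.
            return not present
        elem_type = elem_type[0]  # otherwise it behaves like Type 1 or 2

    if elem_type == "1":
        return has_value   # must be present with a non-empty value
    if elem_type == "2":
        return present     # must be present, zero length is allowed
    if elem_type == "3":
        return True        # optional: absent, empty, or filled are all fine
    raise ValueError(f"unknown data element type: {elem_type}")
```

For example, an empty Patient's Name (0010,0010) passes a Type 2 check but would fail a Type 1 check, which is why required attributes cannot simply be deleted during anonymization.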

How to deal with DICOM Type 1 "required attributes"?

As noted above, for a DICOM file to be valid, a "required attribute" must be present, but its value must be replaced with something meaningless and unrelated to the original identifying value (anonymization with pseudo-values is called pseudonymization).
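One possible way to produce such a pseudo-value is a keyed hash, so that the same patient always maps to the same pseudonym without the pseudonym revealing the original. The sketch below is an illustrative assumption, not a method prescribed by the DICOM standard or used by IMAIOS; the function name `pseudonym`, the `ANON` prefix, and the 8-character length are all invented for the example.

```python
import hashlib
import hmac

def pseudonym(original: str, secret_key: bytes, prefix: str = "ANON") -> str:
    """Derive a stable, meaningless replacement value from `original`.

    The secret key must be kept private: anyone holding it could recompute
    pseudonyms for candidate names and match them against the data.
    """
    digest = hmac.new(secret_key, original.encode("utf-8"), hashlib.sha256)
    # Keep only a short piece of the digest and format it like a
    # DICOM person name (family^given).
    return f"{prefix}^{digest.hexdigest()[:8].upper()}"
```

Because the mapping is deterministic for a given key, several studies of the same patient stay linked after pseudonymization, which is often what clinical research needs.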

Part PS3.15 of the DICOM standard2, published a few years ago, defined existing data encryption techniques for use with DICOM data. 
These techniques serve three important purposes: data encryption, verification of data origin, and verification of data integrity.

The Attribute Confidentiality Profiles (DICOM PS3.15, Annex E) provide a standard for image de-identification, reducing the complexity of secure de-identification of DICOM image data while retaining the flexibility to preserve certain information. They include a basic profile as well as a number of optional profiles, and provide instructions on how to securely clean DICOM elements that may contain personal data.

The Basic Application Level Confidentiality Profile is an extremely conservative approach that hides all confidential data related to the following attributes:

  • the identity and demographics of the patient (e.g. Patient's Name (0010,0010), Patient's Age (0010,1010), Patient's Birth Date (0010,0030)),
  • the identity of responsible parties or family members (e.g. observer organization name, code 121009),
  • the identity of the staff involved in the procedure (e.g. Human Performer's Name (0040,4037)),
  • the identity of the organizations involved in ordering or performing the procedure (e.g. Requesting Service (0032,1033)),
  • any additional information that could be used to match instances if access to the originals were available, such as UIDs, dates and times, and
  • private attributes.
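Two items in this list lend themselves to a short illustration: recognizing private attributes and replacing UIDs consistently. Private data elements are those with an odd group number. For UIDs, the sketch below derives a replacement under the `2.25.` root, which DICOM reserves for UIDs built from a 128-bit UUID value; deriving that value from a salted hash rather than a true UUID is an assumption of this example, as are the function names and the tuple representation of tags.

```python
import hashlib

def is_private(tag) -> bool:
    """Private data elements have an odd group number."""
    group, _element = tag
    return group % 2 == 1

def remap_uid(original_uid: str, salt: str) -> str:
    """Map a UID to a replacement that is stable for a given salt.

    The same input always yields the same output, so references between
    instances, series and studies stay consistent after de-identification.
    """
    digest = hashlib.sha256((salt + original_uid).encode("ascii")).digest()
    # Use 128 bits of the digest as the numeric component after "2.25.".
    return "2.25." + str(int.from_bytes(digest[:16], "big"))
```

The resulting UID is at most 44 characters long, within the 64-character limit for DICOM UIDs.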

To apply this profile, you must at least process all the attributes listed in the table below: encrypt their original values with a standard algorithm (encryption algorithms such as RSA, AES, and Triple-DES3 are accepted by the DICOM standard for converting original DICOM data into a protected format), store the result of the encryption in the modified attribute sequence, and replace the values in their original location with meaningless dummy values. For example, DROP^JACK could be replaced with GOLP^CE28: the dummy value would be stored in the Patient's Name attribute (0010,0010), while the original value DROP^JACK would be encrypted and placed in an attribute sequence.

Attribute Name                           Tag          Attribute Name                          Tag
Instance Creator UID                     (0008,0014)  Other Patient IDs                       (0010,1000)
SOP Instance UID                         (0008,0018)  Other Patient Names                     (0010,1001)
Accession Number                         (0008,0050)  Patient's Age                           (0010,1010)
Institution Name                         (0008,0080)  Patient's Size                          (0010,1020)
Institution Address                      (0008,0081)  Patient's Weight                        (0010,1030)
Referring Physician's Name               (0008,0090)  Medical Record Locator                  (0010,1090)
Referring Physician's Address            (0008,0092)  Ethnic Group                            (0010,2160)
Referring Physician's Telephone Numbers  (0008,0094)  Occupation                              (0010,2180)
Station Name                             (0008,1010)  Additional Patient's History            (0010,21B0)
Study Description                        (0008,1030)  Patient Comments                        (0010,4000)
Series Description                       (0008,103E)  Device Serial Number                    (0018,1000)
Institutional Department Name            (0008,1040)  Protocol Name                           (0018,1030)
Physician(s) of Record                   (0008,1048)  Study Instance UID                      (0020,000D)
Performing Physician's Name              (0008,1050)  Series Instance UID                     (0020,000E)
Name of Physician(s) Reading Study       (0008,1060)  Study ID                                (0020,0010)
Operator's Name                          (0008,1070)  Frame of Reference UID                  (0020,0052)
Admitting Diagnoses Description          (0008,1080)  Synchronization Frame of Reference UID  (0020,0200)
Referenced SOP Instance UID              (0008,1155)  Image Comments                          (0020,4000)
Derivation Description                   (0008,2111)  Request Attributes Sequence             (0040,0275)
Patient's Name                           (0010,0010)  UID                                     (0040,A124)
Patient ID                               (0010,0020)  Content Sequence                        (0040,A730)
Patient's Birth Date                     (0010,0030)  Storage Media File-set UID              (0088,0140)
Patient's Birth Time                     (0010,0032)  Referenced Frame of Reference UID       (3006,0024)
Patient's Sex                            (0010,0040)  Related Frame of Reference UID          (3006,00C2)
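The replace-and-encrypt procedure described before the table can be sketched as follows. This is only an outline under several assumptions: the dataset is modeled as a plain dict keyed by (group, element) tuples, the `encrypted_originals` key is a hypothetical stand-in for the attribute sequence that holds the encrypted values, and the cipher is supplied by the caller because the Python standard library has no RSA, AES, or Triple-DES implementation.

```python
import json

def protect(dataset, replacements, encrypt):
    """Return a copy of `dataset` with dummy values and an encrypted backup.

    `replacements` maps tags to dummy values; `encrypt` is any callable
    taking bytes and returning bytes (supplied by the caller).
    """
    originals = {}
    protected = dict(dataset)  # leave the caller's dataset untouched
    for tag, dummy in replacements.items():
        if tag in protected:
            # Remember the original under a "GGGG,EEEE" key, then overwrite.
            originals[f"{tag[0]:04X},{tag[1]:04X}"] = protected[tag]
            protected[tag] = dummy
    # Serialize the originals and pass them through the caller's cipher.
    payload = json.dumps(originals, sort_keys=True).encode("utf-8")
    protected["encrypted_originals"] = encrypt(payload)  # hypothetical key
    return protected
```

With the example from the text, calling `protect` on a dataset whose Patient's Name is DROP^JACK and a replacement of GOLP^CE28 leaves GOLP^CE28 in (0010,0010), while DROP^JACK survives only inside the encrypted payload.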

In addition to the Basic Application Level Confidentiality Profile, the DICOM Standards Committee has published a set of Basic Application Level Confidentiality Options that are applied on top of it. Some of these options require the deletion of additional information, and others require the retention of information that would otherwise be deleted.

The following is a list of options that require additional information to be removed:

  • Clean Pixel Data Option
  • Clean Recognizable Visual Features Option
  • Clean Graphics Option
  • Clean Structured Content Option
  • Clean Descriptors Option

The following is a list of options that require retention of information that would otherwise be removed but that is needed for specific uses:

  • Retain Longitudinal Temporal Information with Full Dates Option
  • Retain Longitudinal Temporal Information with Modified Dates Option
  • Retain Patient Characteristics Option
  • Retain Device Identity Option
  • Retain Institution Identity Option
  • Retain UIDs Option
  • Retain Safe Private Option
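As an illustration of how a retention option changes the outcome, the sketch below handles a date value in three ways: blanked by default, kept as-is when full dates are retained, or shifted by a fixed number of days when modified dates are retained. The option strings and function names are invented for this example, and blanking the value is a simplification of what the basic profile does with dates. DICOM dates (DA) use the YYYYMMDD format.

```python
from datetime import datetime, timedelta

def shift_date(da_value: str, offset_days: int) -> str:
    """Shift a DICOM DA value (YYYYMMDD) by `offset_days`."""
    shifted = datetime.strptime(da_value, "%Y%m%d") + timedelta(days=offset_days)
    return shifted.strftime("%Y%m%d")

def handle_date(da_value: str, option=None, offset_days: int = 0) -> str:
    """Apply a (hypothetically named) temporal option to one date value."""
    if option == "retain_full_dates":
        return da_value                            # keep the real date
    if option == "retain_modified_dates":
        return shift_date(da_value, offset_days)   # keep intervals only
    return ""                                      # default: blank the date

print(handle_date("20200720", "retain_modified_dates", offset_days=-30))
# prints 20200620
```

Using the same offset for every date belonging to one patient keeps the intervals between examinations intact, which is the point of retaining longitudinal temporal information.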

To ensure that the anonymization or pseudonymization of your DICOM files is compliant worldwide, IMAIOS recommends following the recommendations of the DICOM Standards Committee.

In a forthcoming article, we will discuss in detail how to proceed with anonymizing your DICOM using IMAIOS procedures as an example.

1 Kadek Y. E. Aryanto, André Broekema, Matthijs Oudkerk, Peter M. A. van Ooijen (2012) Implementation of an anonymisation tool for clinical trials using a clinical trial processor integrated with an existing trial patient data information system

2 http://dicom.nema.org/medical/dicom/current/output/html/part15.html

3 RSA stands for the public key algorithm named after its inventors (Rivest-Shamir-Adleman); AES for Advanced Encryption Standard, and DES for Data Encryption Standard. Triple-DES applies DES three times for stronger encryption - Oleg S. Pianykh (2011) DICOM Security
