DICOM File De-identifier

For The University of Alabama at Birmingham

Project Started in 2019

Project Description

Clinical imaging data, often stored as DICOM files, is the fuel needed to train new machine learning models. But before clinical images can be used, any identifying information must be removed. This process is called DICOM de-identification.

To provide their researchers easier access to clinical images, UAB’s Department of Radiology wanted to streamline its DICOM de-identification process. Their initial process involved manual steps that could be automated—speeding up the process while also avoiding human error.

UAB hired Innolitics to provide a DICOM de-identification solution.

Project Requirements

We worked with the Department of Radiology’s vice chair of clinical research to understand UAB’s requirements. They needed a solution that would:

Customized Solution

We examined the free DICOM de-identification tools available, in particular RSNA’s Clinical Trial Processor. No existing tool met all of UAB’s needs, so we built a tool using Python and the pydicom library.

The tool is data-driven:

  1. The researcher configures the project with an Excel sheet.
  2. Images are requested from the source PACS.
  3. A DICOM receiver accepts and saves the images.
  4. The software de-identifies the files.
  5. The files are sent to the destination—a research PACS or a file share.
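
As a rough illustration of what "data-driven" can mean for step 4, the sketch below applies a small table of per-attribute rules to a file header. It is not the tool's actual code. The rule layout (`keyword`, `action`, `value`), the three action names, and the `pseudonym` and `deidentify` helpers are all assumptions made for this example, and a plain dictionary stands in for the header that pydicom would parse from a real file so that the sketch runs without dependencies.

```python
import hashlib

# Hypothetical per-project rules, as they might be read from the project's
# configuration sheet. The column layout is an assumption for illustration.
RULES = [
    {"keyword": "PatientName", "action": "replace", "value": "ANONYMIZED"},
    {"keyword": "PatientID", "action": "hash"},
    {"keyword": "PatientBirthDate", "action": "remove"},
]


def pseudonym(value, salt):
    """Return a stable stand-in for an identifier (same input, same output)."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:16]


def deidentify(header, rules, salt):
    """Return a copy of `header` with each rule applied; the input is not mutated."""
    result = dict(header)
    for rule in rules:
        keyword = rule["keyword"]
        if keyword not in result:
            continue  # attribute absent from this file; nothing to do
        action = rule["action"]
        if action == "remove":
            del result[keyword]
        elif action == "replace":
            result[keyword] = rule["value"]
        elif action == "hash":
            result[keyword] = pseudonym(result[keyword], salt)
        else:
            raise ValueError(f"unknown action: {action!r}")
    return result


# A made-up header, keyed by DICOM attribute keyword.
header = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "PatientBirthDate": "19700101",
    "Modality": "CT",
}
clean = deidentify(header, RULES, salt="project-001")
```

Keeping the rules in a table rather than in code is what lets a researcher change a project's behavior without touching the software. A production rule set would need to cover far more attributes than the three shown here, including private tags and any identifiers burned into the pixel data.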

The system provides other useful outputs on a per-project basis:

Deployment Support

After developing and testing the tool, we worked directly with UAB’s IT department to install it. We continue to provide support and occasionally implement new feature requests.

The tool has been used successfully on several research projects and has not affected the clinical PACS.

Figure 1: DICOM De-identification Data Flow