---
AI Summary: "\\- Developed a custom DICOM de-identification pipeline for UAB\\'s Department\
  \ of Radiology.  \n- Utilized Python and the pydicom library.  \n- Automated the\
  \ process, enabling researchers to specify DICOM UID mappings and export de-identified\
  \ files efficiently.  \n- Supported multiple simultaneous projects and provided\
  \ deployment assistance."
Anonymous: false
Assignee:
- Yujan Shrestha
Last Edited Time: '2025-02-01T16:39:00+00:00'
client_logo:
- 18dbd5b7-a754-8069-a18b-e446fe71efc1
client_name: The University of Alabama at Birmingham
date: '2020-05-14'
featured: false
medical_panel: Tooling
name: DICOM De-Identification Pipeline for CRO
services:
- 9dc4f55a-b5cb-419f-8bae-5e11d6b11fee
summary: Custom software development with PACS performance requirements
tags:
- DICOM
testimonials: []
---

## The Problem

Clinical imaging data, often stored as DICOM files, is the fuel needed to train new machine learning models. But before clinical images can be used, any identifying information must be removed. This process is called *DICOM de-identification*. It is a challenging problem, and each institution tends to have its own custom requirements.

## The Outcome

We designed, developed, and have since maintained a custom DICOM de-identification pipeline for the institution.

## The Solution

To give its researchers easier access to clinical images, UAB's Department of Radiology wanted to streamline its DICOM de-identification process. The initial process involved manual steps that could be automated, which would speed up the work and reduce the risk of human error.

UAB hired Innolitics to provide a DICOM de-identification solution.

#### Project Requirements

We worked with the Department of Radiology's vice chair of clinical research to understand UAB's requirements. They needed a solution that would:

- Allow researchers to specify DICOM UID mappings with simple Excel files.
- Provide a database that allows DICOM files to be re-identified.
- Export de-identified files to a research PACS or the filesystem.
- Communicate with the Philips iSite PACS.
- Throttle requests to the clinical PACS.
- Support multiple simultaneous research projects.
- Support scheduling de-identification tasks during off-hours.
- Not require outside network access.
- Be straightforward for IT to install (we used Docker images).
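
The throttling and off-hours requirements above can be illustrated with a small sketch. This is not the code deployed at UAB: the names `Throttle` and `is_off_hours`, the two-second interval, and the 19:00 to 06:00 window are all assumptions chosen for the example.

```python
import datetime
import time


class Throttle:
    """Enforce a minimum interval between successive PACS requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None        # time of the previous request, if any

    def wait(self):
        """Block until min_interval_s has elapsed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def is_off_hours(now, start=datetime.time(19, 0), end=datetime.time(6, 0)):
    """Return True if `now` falls inside the window, which may wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


# Hypothetical usage: call wait() before each retrieval request.
throttle = Throttle(min_interval_s=2.0)
if is_off_hours(datetime.datetime.now().time()):
    throttle.wait()
```

A fixed minimum interval is the simplest policy that bounds the load on the clinical PACS; a token bucket would allow short bursts if that were ever needed.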

#### Customized Solution

We examined the [free DICOM de-identification tools available](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4636522/), and in particular RSNA's [Clinical Trial Processor](http://mircwiki.rsna.org/index.php?title=MIRC_CTP). No existing tool met all of UAB's needs, so we built a tool using Python and the [pydicom library](https://github.com/pydicom/pydicom).

<figure>
  <img src="/img/portfolio/DICOM_De-Identification_Pipeline_for_CRO-8c966d9388b04e4194102a66139a6c6d.png">
  <figcaption>
    Figure 1: DICOM De-Identification Data Flow Diagram
  </figcaption>
</figure>

The tool is data-driven:

1.  The researcher configures the project with an Excel sheet.
2.  Images are requested from the source PACS.
3.  A DICOM receiver accepts and saves the images.
4.  The software de-identifies the files.
5.  The files are sent to the destination, which is either a research PACS or a file share.
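
To make step 4 more concrete, here is a minimal sketch of a de-identification function. It is an illustration only, not the tool's implementation: a plain dictionary stands in for a parsed DICOM dataset (the real tool uses pydicom), and `PROJECT_NAMESPACE`, `REMOVE_KEYS`, `remap_uid`, and `deidentify` are hypothetical names. A real de-identification profile, such as the one in DICOM PS3.15 Annex E, covers far more attributes than the three removed here.

```python
import uuid

# Hypothetical per-project namespace, so the same source UID maps to
# different replacement UIDs in different research projects.
PROJECT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "project-a.example.org")

UID_KEYS = ("StudyInstanceUID", "SeriesInstanceUID", "SOPInstanceUID")

# Deliberately short, illustrative list of attributes to drop.
REMOVE_KEYS = ("PatientBirthDate", "PatientAddress", "ReferringPhysicianName")


def remap_uid(original_uid, namespace=PROJECT_NAMESPACE):
    """Derive a stable replacement UID under the UUID-based "2.25" root."""
    return "2.25." + str(uuid.uuid5(namespace, original_uid).int)


def deidentify(attrs, subject_id):
    """Return a de-identified copy of `attrs`; the input is left unchanged."""
    out = dict(attrs)
    for key in REMOVE_KEYS:
        out.pop(key, None)
    # Replace the patient identity with the project's subject code.
    out["PatientName"] = subject_id
    out["PatientID"] = subject_id
    # Remap UIDs deterministically so files of one study stay linked.
    for key in UID_KEYS:
        if key in out:
            out[key] = remap_uid(out[key])
    return out
```

Deriving each replacement UID from the original means that every file in a study receives the same new study UID without any shared state between workers. The trade-off is that re-identification still needs a stored mapping, because the derivation cannot be reversed.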

The system provides other useful outputs on a per-project basis:

- A CSV file maps the image, series, study, and patient identifiers of the source images to those of the de-identified destination images.
- Email notifications inform the user when a job is complete and where to find the files.
- Log files document any errors in the process.
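
As a rough sketch of what the per-project CSV mapping could look like, the example below writes one row per image and looks up a source row from a de-identified UID. The column names and the helpers `write_uid_map` and `find_source` are assumptions made for illustration; the actual file layout used at UAB is not described here.

```python
import csv
import io

# Hypothetical column layout: one source/destination pair per level.
LEVELS = ("patient_id", "study_uid", "series_uid", "sop_uid")
FIELDS = [f"{side}_{level}" for level in LEVELS for side in ("source", "dest")]


def write_uid_map(rows, stream):
    """Write the mapping rows, preceded by a header row, to a text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def find_source(stream, dest_sop_uid):
    """Return the row whose de-identified image UID matches, or None."""
    for row in csv.DictReader(stream):
        if row["dest_sop_uid"] == dest_sop_uid:
            return row
    return None


# Hypothetical usage with an in-memory buffer in place of a file on disk.
buffer = io.StringIO()
write_uid_map(
    [{
        "source_patient_id": "12345", "dest_patient_id": "SUBJ-001",
        "source_study_uid": "1.2.3", "dest_study_uid": "2.25.1",
        "source_series_uid": "1.2.3.4", "dest_series_uid": "2.25.2",
        "source_sop_uid": "1.2.3.4.5", "dest_sop_uid": "2.25.3",
    }],
    buffer,
)
```

Because the file links de-identified images back to patients, it should be stored with the same access controls as the clinical data itself.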

#### Deployment Support

After developing and testing the tool, we worked directly with UAB's IT department to install it. We continue to provide support and occasionally implement new feature requests.

The tool has been used successfully on several research projects without disrupting the clinical PACS.
