DICOM in Python: Importing medical image data into NumPy with PyDICOM and VTK

I’ll be showing how to use the pydicom package and/or VTK to read a series of DICOM images into a NumPy array. This will involve reading metadata from the DICOM files and the pixel-data itself.

Introduction: The DICOM standard

Anyone in the medical image processing or diagnostic imaging field, will have undoubtedly dealt with the infamous Digital Imaging and Communications in Medicine (DICOM) standard the de-facto solution to storing and exchanging medical image-data.

Applications such as RadiAnt or MicroDicom for Windows and OsiriX for Mac, do a great job of dealing with DICOM files. However, there are as many flavors of DICOM as there are of ice-cream. Thus, when it comes to programmatically reading and processing DICOM files things get a little hairier depending on whether the files store the pixel data in a compressed form or not.

In this post I will show how to read in uncompressed DICOM file series through either the pydicom package or VTK. Take a look at the Resources section for tips on how to tackle compressed DICOM files.

DICOM Datasets

There’s a wealth of freely available DICOM datasets online but here’s a few that should get you started:

Despite the fact that its easy to get DICOM datasets, most databases forbid their re-distribution by third parties. Therefore, for the purposes of these posts I decided to use a dataset from an MR examination of my own fat head. You can find a .zip file with with .dcm files on the sagittal plane here. In order to follow this post extract the contents of this file alongside the IPython Notebooks, the contents of which I’ll be presenting in the pydicom Usage section and the VTK Usage section.

The pydicom package

In this, first of two posts I will show how to use the pydicom package, which consists of pure-python code, is hosted on pypi, and can be easily installed through pip as such:

pip install pydicom

As is often the case with many Python packages, while this package is called pydicom it simply goes by dicom within Python and needs to be imported with import dicom.


In this example I’m gonna use the MR dataset of my own head, discussed in the DICOM Datasets section, and the pydicom package, to load the entire series of DICOM data into a 3D NumPy array and visualize two slices through matplotlib. You can find the entire IPython Notebook here.

Obviously we’ll start with importing the packages we’ll need:

import dicom
import os
import numpy
from matplotlib import pyplot, cm

The only point of interest here is, as I mentioned in the The pydicom package section, that the pydicom package is imported as dicom so be careful with that. Next we use os.path.walk to traverse the MyHead directory, and collect all .dcm files into a list named lstFilesDCM:

PathDicom = "./MyHead/"
lstFilesDCM = []  # create an empty list
for dirName, subdirList, fileList in os.walk(PathDicom):
    for filename in fileList:
        if ".dcm" in filename.lower():  # check whether the file's DICOM

If you check the MyHead folder you’ll see that the .dcm files are named MR000000.dcm, MR000001.dcm, etc. Therefore, the walk function will return them in order since they’re sorted lexicographically by the OS. However, in many cases DICOM files don’t have all those leading zeros, with names like MR1, MR2, etc which would result in lstFilesDCM having the filenames ordered in a fashion such as MR1, MR10, MR100, etc. Since, the typical sorting functions in python, such as sorted and the sort method of list objects (docs here), are lexicographical as well (unless dealing with pure numbers), I strongly suggest using the very useful natsort package which can be found on PyPI here (and can be installed with a simple pip install natsort).

Now, lets get into the pydicom part of the code. A notable aspect of this package is that upon reading a DICOM file, it creates a dicom.dataset.FileDataset object where the different metadata are assigned to object attributes with the same name. We’ll see this below:

# Get ref file
RefDs = dicom.read_file(lstFilesDCM[0])

# Load dimensions based on the number of rows, columns, and slices (along the Z axis)
ConstPixelDims = (int(RefDs.Rows), int(RefDs.Columns), len(lstFilesDCM))

# Load spacing values (in mm)
ConstPixelSpacing = (float(RefDs.PixelSpacing[0]), float(RefDs.PixelSpacing[1]), float(RefDs.SliceThickness))

In the first line we load the 1st DICOM file, which we’re gonna use as a reference named RefDs, to extract metadata and whose filename is first in the lstFilesDCM list. We then calculate the total dimensions of the 3D NumPy array which are equal to (Number of pixel rows in a slice) x (Number of pixel columns in a slice) x (Number of slices) along the x, y, and z cartesian axes. Lastly, we use the PixelSpacing and SliceThickness attributes to calculate the spacing between pixels in the three axes. We store the array dimensions in ConstPixelDims and the spacing in ConstPixelSpacing.

If you were to open one of the DICOM files with an application such as the ones mentioned in the Intro section and checked the metadata you’d see that Rows, Columns, PixelSpacing, and SliceThickness are all metadata entries. pydicom simply creates attributes with the same names and assigns appropriate values to those, making them easily accessible.

The next chunk of code is:

x = numpy.arange(0.0, (ConstPixelDims[0]+1)*ConstPixelSpacing[0], ConstPixelSpacing[0])
y = numpy.arange(0.0, (ConstPixelDims[1]+1)*ConstPixelSpacing[1], ConstPixelSpacing[1])
z = numpy.arange(0.0, (ConstPixelDims[2]+1)*ConstPixelSpacing[2], ConstPixelSpacing[2])

where we simply use numpy.arange, ConstPixelDims, and ConstPixelSpacing to calculate axes for this array. Next, comes the last pydicom part:

# The array is sized based on 'ConstPixelDims'
ArrayDicom = numpy.zeros(ConstPixelDims, dtype=RefDs.pixel_array.dtype)

# loop through all the DICOM files
for filenameDCM in lstFilesDCM:
    # read the file
    ds = dicom.read_file(filenameDCM)
    # store the raw image data
    ArrayDicom[:, :, lstFilesDCM.index(filenameDCM)] = ds.pixel_array  

As you can see, what we do here is first create a NumPy array named ArrayDicom with the dimensions specified in ConstPixelDims calculated before. The dtype of this array is the same as the dtype of the pixel_array of the reference-dataset RefDs which we originally used to extract metadata. The point of interest here is that the pixel_array object is a pure NumPy array containing the pixel-data for the particular DICOM slice/image. Therefore, what we do next is loop through the collected DICOM filenames and use the dicom.read_file function to read each file into a dicom.dataset.FileDataset object. We then use the pixel_array attribute of that object, and toss it into ArrayDicom, stacking them along the z axis.

And that’s it! Using the pyplot module in matplotlib we can create a nice lil’ plot as such:

pyplot.axes().set_aspect('equal', 'datalim')
pyplot.pcolormesh(x, y, numpy.flipud(ArrayDicom[:, :, 80]))

which results in the following image:

A plot of my heads DICOM data

Reading DICOM through VTK

Now while skimming the previous section you might have thought ‘pfff that’s way too easy, why did we bother reading your rants?’. Well, in the interest of keeping you interested, I decided – against my better judgement – to provide the VTK approach to the above process.


You can find a separate notebook here, while I’ll be using the same dataset. Make sure to check that notebook cause I’ll only be detailing the VTK parts of the code here. You will find that the VTK solution is quite a bit more succinct. Now let’s start with reading in the series of .dcm files:

PathDicom = "./MyHead/"
reader = vtk.vtkDICOMImageReader()

As you can see, unlike the approach in the previous section, here we saved ourselves the two loops, namely populating the filename list, and reading in the data slice-by-slice. We first create a vtkDICOMImageReader object (docs here), and pass the path to the directory where all the .dcm files are through the SetDirectoryName method. After that, its just a matter of calling the Update method which does all the reading. If you were dealing with huge datasets, you’d be surprised how much faster than pydicom VTK does that.

Don’t let the above approach fool you entirely. It worked only cause the OS sorted the files correctly by itself. As I mentioned in the previous section, if the files weren’t properly named, lexicographical sorting would have given you a messed up array. In that case you would either need to loop and pass each file to a separate reader through the SetFileName method, or you’d have to create a vtkStringArray, push the sorted filenames, and use the vtkDICOMImageReader.SetFileNames method. Keep your eyes open! VTK is not forgiving 🙂

Next we need access to the metadata in order to calculate those ConstPixelDims and ConstPixelSpacing variables:

# Load dimensions using `GetDataExtent`
_extent = reader.GetDataExtent()
ConstPixelDims = [_extent[1]-_extent[0]+1, _extent[3]-_extent[2]+1, _extent[5]-_extent[4]+1]

# Load spacing values
ConstPixelSpacing = reader.GetPixelSpacing()

As you can see, the vtkDICOMImageReader class comes with a few useful methods that provide the metadata straight-up. However, only a few of those values are available in a straightforward manner. Thankfully, by using the GetDataExtent method of the reader we get a 6-value tuple with the starting and stopping indices of the resulting array on all three axes. Obviously, that’s all we need to calculate the size of the array. Getting the pixel-spacing is even easier than with pydicom and simply accomplished with the GetPixelSpacing method.

Now, onto the fun part :). You might have read my previous post on how to convert arrays between NumPy and VTK. You might have thought that we can use that functionality and get a nice NumPy array with a one-liner. Well, I hate to disappoint you but it’s not that straightforward (remember, VTK).

If you dig a little into the vtkDICOMImageReader docs, you will see that it inherits vtkImageReader2, which in turn inherits vtkImageAlgorithm. The latter, sports a GetOutput method returning a vtkImageData object pointer. However, the numpy_support.vtk_to_numpy function only works on vtkArray objects so we need to dig into the vtkImageData object till we get that type of array. Here’s how we do that:

# Get the 'vtkImageData' object from the reader
imageData = reader.GetOutput()
# Get the 'vtkPointData' object from the 'vtkImageData' object
pointData = imageData.GetPointData()
# Ensure that only one array exists within the 'vtkPointData' object
assert (pointData.GetNumberOfArrays()==1)
# Get the `vtkArray` (or whatever derived type) which is needed for the `numpy_support.vtk_to_numpy` function
arrayData = pointData.GetArray(0)

# Convert the `vtkArray` to a NumPy array
ArrayDicom = numpy_support.vtk_to_numpy(arrayData)
# Reshape the NumPy array to 3D using 'ConstPixelDims' as a 'shape'
ArrayDicom = ArrayDicom.reshape(ConstPixelDims, order='F')

As you can see, we initially use reader.GetOutput() to get a vtkImageData object pointer into imageData. We then use the GetPointData method of that object to ‘extract’ a vtkPointData object pointer into pointData. Now, these vtkPointData may hold multiple arrays but we should only have one in there (being the entirety of the DICOM data). Since these ‘internal’ arrays are numerically indexed, we get this vtkArray through index 0 so the array we were looking for can be retrieved through arrayData = pointData.GetArray(0). We can now finally ‘convert’ that pesky array to NumPy through vtk_to_numpy and store it in ArrayDicom. As a final step, we reshape that array using ConstPixelDims et voila!

From that point, we use our lovely NumPy array and get the same plots we got with the previous approach.

Note that we reshape ArrayDicom with a ‘Fortran’ order. Don’t ask me why, but when I tried to reshape in C order I got misaligned rubbish so there. Trial-n-error.


Should you want to learn more about pydicom do check the project’s official website, its wiki, and its user guide.

‘Whoa’ you might say though, what about those JPEG-based compressed DICOM files you mentioned in the intro?. Well unfortunately, neither pydicom, nor the admittedly convenient vtkDICOMImageReader class can handle those. At least, the pydicom package will warn you and return a NotImplementedError upon reading such a DICOM file, while VTK will just return an array full of 0s and leave you wondering. At this point I can only think of two viable solutions to this:

  • Go the hardcore route, install the GDCM library with Python bindings on your system, and use the mudicom package to handle it in Python. GDCM is a breeze to install on Windows and Linux systems as it provides pre-compiled binaries for those. You should also be able to install it on Mac using Homebrew and this recipe but I haven’t tested it yet.
  • Go the cheater’s route, or as I like to call it the lazy engineer’s route. Simply open that data in one of the applications mentioned in the Intro, such as OsiriX, and save it as uncompressed DICOM (yes you can do that).

Anyway, enough for today, hope you’ve learned enough to start butchering medical image data on your own, as there are a million awesome things you can do with those. We’ll see a few nifty things in subsequent posts. Next up, I’m going to regale you with bone-chilling tales of marching-cubes and surface extraction :). Thanks for reading!

118 thoughts on “DICOM in Python: Importing medical image data into NumPy with PyDICOM and VTK

  1. Pingback: Surface Extraction: Creating a mesh from pixel-data using Python and VTK | PyScience

  2. Thanks for this document.

    FYI: in an attempt to work with compressed DICOM, I found that the GDCM recipe for Homebrew was lacking an option for the python bindings. I ended up contributing it myself, and am waiting for Homebrew/homebrew-science to accept the pull-request.

    In the meantime, this should get it installed:

    brew install https://raw.githubusercontent.com/bollig/homebrew-science/0e20fd89055dc23be36ed745ce918b2d5cfb3a00/gdcm.rb –with-python3

    Test with:

    python3 -c ‘import gdcm’



    • Hey Evan,

      thanks a lot for the link, GDCM is definitely the answer to all the funky DICOM files out there.

      In my upcoming post I’m going to talk about SimpleITK which comes with GDCM out-of-the-box and allows for easy reading of compressed DICOM (tested it on Osirix datasets that wouldn’t play well with PyDICOM or VTK). Post is coming out in a few days 🙂

      Thanks for reading!


      • Hello Adam,

        I would really appreciate your post about reading and also displaying compressed DICOM images using SimpleITK in Python. I am new in Python and I am on the crossroad whether to dive in Python and start to use it as a main programming langue instead of Matlab or not. Main issue for me is an easy way to work with DICOM images. So far I came to problem with compressed DICOM images, so I am waiting for your post:-)




      • Hey Petr! I’m not going to start the age-old Python vs Matlab flame war cause I’m terribly biased towards the former (Python rules!). The next post is coming out in a couple days (Monday morning, Melbourne time) cause I still need to revise the text a bit :). Just follow the blog via email and you’ll get a notification when it’s out :). Thanks for reading! I hope I’ll manage to win you over to our side 😀


  3. Pingback: Image Segmentation with Python and SimpleITK | PyScience

  4. Pingback: Multi-Modal Image Segmentation with Python & SimpleITK | PyScience

  5. Nice post, again 🙂

    Only, to display the slice with pyplot, a pyplot.show() is missing below.

    Thanks for your crystal-clear explanations!


    • Actually by using `@matplotlib inline` you should get the plot directly as the cell output. However, that depends on the version/behavior of the IPython Notebook version. I can’t quite vouch for the behavior of the 3.0.x version that came out recently

      Thanks for reading 🙂


    • Haha I’m really glad you’ve found it useful! The reason I wrote posts like this was cause I wished someone had done so back in the day when I was struggling with stupid DICOM :). Thanks for reading!


  6. Hi Adam,

    Thanks for such informative. It’s really good starter for freshers like me.

    Well my scenario something different.
    I’m going to create a search for this dicom images (like when you type specific ‘word’ it will retentive all the dicom(using metadata) images which has this word).

    Here i need to create a (.xml or .json) file for each .dcm file and store all (.xml or .json) files in one directory. Later i will index entire directory using solr to create.

    Can you please help me how i can convert each (.dcm file metadata into .xml or .json) and store them.

    Appreciate your help.

    Karthik Vadla


    • Sounds like an interesting startup idea 🙂

      Well you can easily use PyDicom to read the metadata and then simply toss them into a Python dictionary and subsequently JSON file. As you can see under https://code.google.com/p/pydicom/wiki/PydicomUserGuide all DICOM attributes are read as key-value pairs.

      Also you shouldn’t have to read all DICOM files in a series as they typically share metadata between slices. If not I would simply concatenate the results and have lists of dictionaries of nested dictionaries.

      However, I wonder how much information you think you can draw from DICOM metadata. Most doctors are very lazy and don’t input much information in the imaging systems. Thus, I haven’t seen many DICOM sets where you could search for “hypothalamic carcinoma” and get back any matches. Good luck nonetheless though 🙂


  7. Great tutorial.
    I’m have been trying to generate a new set of DICOM files after manipulating the original dataset with pydicom. I noticed a compatibility problem with Python3. I wonder if you have faced the same problem.

    pydicom was initially written for Python2.6-7. In Python2.6-7 has a string datatype str which represents both 8-bit text and binary data. In Python3 str only represents the 8-bit text.
    The module filewriter in pydicom has a lot of writing statements for the metadata and the preamble which mixes the two types and produces lots of error statements.
    I have not found a work around for this problem yet. If you have run into this problem before, please let me know how did you get it to work.



    • Hi there, unfortunately I haven’t used pydicom in Python3. Actually, I do 99% of my work in Python 2 predominantly due to compatibility issues like the one you’ve described. The errors you’re describing sound entirely reasonable given that Python3 entirely revamped the way it handles strings.

      however, it would seem that the newest version of pydicom supports Python3 as stated under https://github.com/darcymason/pydicom/wiki/Porting-to-pydicom-1.0. If you’re using the PyPI version (current version being v0.9.9) it still says that a lot of Python3 issues have been resolved. Are you on the newest version? If you are and still getting said issues you may want to use an up-to-date dev version straight off the Git repo (https://github.com/darcymason/pydicom)

      Hope you work it out 🙂


      • Hi,

        Thanks for the quick response. I’m actually using the latest dev version of pydicom. I think I should report it to the development team.

        Also, I have tried converting Numpy array to binary before passing the array to pydicom-write routine but it failed to write the array to the output file. I will post here if I find a solution to this problem.


        Liked by 1 person

  8. Hi Somada,

    I finally got it to work and I thought I should put this here. I have been using conda for python package handling. The current conda/pip does not supply pydicom1.0. I worked around it and after getting conda to install the latest dev version of pydicom1.0 it works fine.



    • Oh awesome that’s good to know. I’m a little surprised it took fiddling with conda to get it to install, I mean conda doesn’t have compiled versions of all latest packages but it still supports pip, but at least you fixed it. Kudos!


  9. Hi, many thanks for your blog post.

    Couldn’t you just have done this to display an image (requires import pylab)?

    RefDs = dicom.read_file(lstFilesDCM[0])
    pylab.imshow(RefDs.pixel_array, cmap=pylab.cm.bone)


    • Yeah probably to an extent but you’re missing the axes so if you have different spacing along different directions the image would appear squished/stretched. Nonetheless I’m sure there’s a dozen ways to display it 🙂


      • Sorry, in hindsight, my comment seems depreciatory. I am not playing on your level at all but I would consider myself a novice. I just wanted to know the rationale behind it since I am playing with the current Kaggle competition which is based on DICOM images.


      • Haha no worries, no offense taken whatsoever. I’m not trying to come across like a guru anyway so I don’t mind constructive criticism :). Good luck on the Kaggle competition btw, I’ll check out the topic but I’m too lazy to commit to competing :p


  10. It is really helpful. Thank you.

    I have one question, I’ve wrote a code that convert dicom images to jpg and then encrypt the jpg file. The decryption part will save the output as jpg. is there any methods to save it back as dcm?


  11. Hi, what is actually 80 in “pyplot.pcolormesh(x, y, numpy.flipud(ArrayDicom[:, :, 80]))”. When I use it I get an error: IndexError: index 80 is out of bounds for axis 2 with size 1. Is it the size of photos you have? Should it be len(lstFilesDCM)-1?



    • I meant number of photos, not size. By the way, thank you for the post. I don’t fully understand how to manipulate the dcm format in python, but this helped me in getting the png format for some further visualizations.



    • hey irma, the `80` is merely the 81st slice (along the Z-axis) of the DICOM dataset I used and loaded into `ArrayDicom`. Obviously is you used your dataset and didn’t account for the different dimensions, i.e., the number of pixels along the x-y axes and number of slices along the z-axis you’ll get the error you did get. Check your dimensions with `ArrayDicom.shape` and that should get you on the right path 🙂


  12. Good morning, nice article. I’ve been working only with PNG/JPEG images for Head CT, but at this time I need to work with DICOM files and it has been a pain. I’ve been studying the pydicom module for some weeks and I had a really mess with this all. Until now, I used the DicomLibrary platform for opening DICOM files.

    After a file is open, you can switch between two windowing: default and auto. Automatic windowing is the result I get from the algorithm you wrote. But I would like to get the result by applying the Default windowing. I’ve gone to the matplot lib colormap reference, but that couldn’t help.

    Do you know how what that works? Better, how can I get the results I can see searching “Head CT” images in the web?

    Another thing, I would like to save the result I get from the pyplot (I use pyplot.savefig(PathDicom + “foo.png”) after your last code), but I always get the axes, a few values in the axes, a big white border and at the middle the result I want to save. Do you know how can you get only the result?


    • hey there,

      re the figure saving what you’re doing there isn’t saving the figure you plotted but creating and saving an empty figure. The `savefig` method needs to be applied to the figure you created and plotted upon, e.g., http://matplotlib.org/faq/howto_faq.html#save-transparent-figures

      Now for the windowing I honestly have no idea what you’re trying to achieve, are we talking about a different colormap or an actual windowing of the voxel intensity values?


    • I use this method to save the image based on the code above (without axes). Mind that I here also rescale them to a new_size, img2 is the orginal image but cropped to the new size. So you can just use your original image, and new_size equal to the size of your image. Sorry I don’t have much time to modify, but some .axis(‘off’) might be important for your problem of having axes.

      def save_image(output_image_path,im_dcm,img2,new_size,label):
      ConstPixelDims = (int(im_dcm.Rows), int(im_dcm.Columns), 1)
      ConstPixelSpacing = (float(im_dcm.PixelSpacing[0]), float(im_dcm.PixelSpacing[1]), float(im_dcm.SliceThickness))
      x = numpy.arange(0.0, (new_size+1)*spacing_x, spacing_x)
      y = numpy.arange(0.0, (new_size+1)*spacing_y, spacing_y)

      fig = pyplot.figure(figsize=(new_size*0.01,new_size*0.01))

      pyplot.subplots_adjust(top = 1, bottom = 0, right = 1, left = 0, hspace = 0, wspace = 0)

      res=pyplot.pcolormesh(x, y, numpy.flipud(array_drawn[:,:]))
      print “Saving: “,output_image_path



    • Good morning, I’ve already solved my problem. Basically is what user irma wrote: the code somada141 wrote before the sagittal plane of the head, I put:
      pyplot.figure(dpi=2000) #2250
      pyplot.axes().set_aspect(‘equal’, ‘datalim’)

      for dicomIndex in range(0, len(lstFilesDCM)):
      pyplot.pcolormesh(x, y, numpy.flipud(ArrayDicom[:, :, dicomIndex]))
      pyplot.pcolormesh(x, y, vec)

      # Get the created image
      file1 = PathDicom + “tmp.png”
      file2 = PathDicom + “foo.png”
      pyplot.savefig(file1, bbox_inches=’tight’, pad_inches=0)
      im = Image.open(file1)
      # remove pyplot borders
      bg = Image.new(im.mode, im.size, im.getpixel((0,0)))
      diff = ImageChops.difference(im, bg)
      diff = ImageChops.add(diff, diff, 2.0, -100)
      bbox = diff.getbbox()
      if bbox:
      # Create and initialize the auto windowing
      img2 = im.crop(bbox)

      final = numpy.array(img2)
      But then I wasn’t getting the results I wanted. I used RadiAnt and DicomLibrary for opening DICOM files and the results I get from both are a lot different. The solution I found for this problem was here: https://github.com/pieper/pydicom/blob/master/source/dicom/contrib/pydicom_Tkinter.py

      With the code wrote there, I can get the image I need in the default window. Hope this helps another users with the same problem as I.


  13. Hi,

    I have to make a GUI for Image registration, where there will be one window for reference image, another is moving image window. The image files are DICOM or NIFTI. I will be putting one slider to view different slices and another window to show the results after registration. The registration code, I have written in Python. Please suggest what to use for this GUI. I have tries pyQt, but there is no api for reading images and display it to window.



    • Ok so I’m no means a subject-matter-expert when it comes to GUIs, (1) cause I hate them and (2) cause I never learned how to make them properly which leads me to hate them. That being said I know that there’s a way to integrate Qt and VTK so that’d be the way I’d go about it. There’s already several projects out there integrating the two for pretty much what you’re doing now. Googs’ your friend 😀


  14. When I print RefDs for running the same code above,
    The terminal outputs that “(0020, 1040) Position Reference Indicator LO: ”
    (0028, 0002) Samples per Pixel US: 1
    (0028, 0004) Photometric Interpretation CS: ‘MONOCHROME2’
    (0028, 0010) Rows US: 512
    (0028, 0011) Columns US: 512
    (0028, 0030) Pixel Spacing DS: [‘0.859375’, ‘0.859375’]
    (0028, 0100) Bits Allocated US: 16
    (0028, 0101) Bits Stored US: 16
    (0028, 0102) High Bit US: 15
    (0028, 0103) Pixel Representation US: 1
    (0028, 1052) Rescale Intercept DS: ‘0’
    (0028, 1053) Rescale Slope DS: ‘1’
    (7fe0, 0010) Pixel Data OB or OW: Array of 524288 bytes”
    It seems that RefDs does not has the attribute “SliceThickness”??



  15. Pingback: Error Function Python Numpy

  16. Pingback: Medical Image Analysis with Deep Learning  | Tutor01

  17. My problem when reading DICOM using your code is that the ds.pixel_array changes from slide to slide. Therefore I caught an error like:

    Traceback (most recent call last):
    File “D:\OneDrive\Temp\Dicom_read\Yasaka – read.py”, line 48, in
    ArrayDicom[:, :, lstFilesDCM.index(filenameDCM)] = ds.pixel_array
    ValueError: could not broadcast input array from shape (485,685) into shape (685,685)

    At the beginning I thought the pixel array shape would be the same with each slide. Any solution for this ? Thank you!


    • I’d suggest you open the series in a dicom viewer and verify. Then if you really have images of different dimensions you could always interpolate the outliers to a given size


  18. Pingback: Computer Vision Feature Extraction 101 on Medical Images — Part 1: Edge Detection / Sharpening / Blurring / Emboss / Super Pixel – WELCOME TO THE TF WEB

  19. Hi, I wonder how to display axial image instead of Sagittal view?
    I try to change the slicing, such as (:, :, 80) to (: , 20, 🙂 but that doesn’t work


  20. Hello, thanks for posting this helpful tuto 🙂
    Actually I have an error in “pyplot.pcolormesh(x, y, numpy.flipud(ArrayDicom[:, :, 80]))”
    my data base contained 28 images so normally I put “pyplot.pcolormesh(x, y, numpy.flipud(ArrayDicom[:, :, 27]))” : isn’t it? but I get an error in this case : TypeError: Dimensions of C (1439, 1442) are incompatible with X (1440) and/or Y (1443); see help(pcolormesh)


  21. Hi! Thank you for this!

    Was wondering though, if you know of any way to open the DICOM images with structured contour sets? It is stored as a ‘RS’ file with .dcm extension.


    Liked by 1 person

  22. sir
    i did some segmentation work on simpleitk and saves the volume as .mhd. pl tell me how to convert each image to jpeg as i want to show some images in doc file. pl tell me the way as i am stuck


    • Just open the MHD file in 3DSlicer or Paraview, render the volume, and export an image out of there. Alternatively the volume-rendering post shows you how to do exactly what you’re talking about but it’d be harder to get the right angles and rendering parameters.


  23. Hello Sir..
    After implementation I got this error. How can I solve this or is there something wrong with my data? I have dicom data which i want to read and view.

    ValueError Traceback (most recent call last)
    in ()
    32 for filenameDCM in lstFilesDCM:
    33 ds = dicom.read_file(filenameDCM)
    —> 34 ArrayDicom[:, :, lstFilesDCM.index(filenameDCM)] = ds.pixel_array

    ValueError: could not broadcast input array from shape (128,128) into shape (512,512)


    • Hey there, its tricky telling without looking at your full code but I’d say that `ArrayDicom` has X and Y dimensions of 512 but `ds.pixel_array` is 128×128. Are you initializing your arrays correctly?


  24. Hi,
    Its a very good article. Could you please tell be is there any way to output the data (arraydicom) in vtkrenderwindow instead of using matplotlib


  25. Hi,
    That’s a very good tutorial. I have been following this and trying to process dicom images. However, the files which I have are ‘CR’ modality and do not have SliceThickness attribute. How can I convert the image to a numpy array in this case?


      • So, should I do it this way?
        ConstPixelSpacing = (float(RefDs.PixelSpacing[0]), float(RefDs.PixelSpacing[1]))

        Is it correct?
        x = numpy.arange(0.0, (ConstPixelDims[0]+1)*ConstPixelSpacing[0], ConstPixelSpacing[0])
        y = numpy.arange(0.0, (ConstPixelDims[1]+1)*ConstPixelSpacing[1], ConstPixelSpacing[1])
        If that is the case, what should I use for the z axis?


  26. Bonjour Somada,
    j’ai projet sur la création d’un logiciel d’analyse de texture des images médicales dans une zone d’intérêt limité . je me suis calé sur comment faire le traitement sur la dite zone en Dicom.
    est ce que vous pouvez m’aider s’il vous plais ? merci d’avance


    • hey there, dunno why you assumed I speak French :). There isn’t a single comment or post in French on the whole blog. All I got from Google Translate was:

      I have project on the creation of a texture analysis software for medical images in an area of ​​limited interest. I stalled on how to do the treatment on the said zone in Dicom.
      can you help me please? thank you in advance

      Which honestly doesn’t give me much to go on with :). Can you elaborate as to what specific issues you’re having?


  27. Really nice!!! Although I can not understand all for this moment.
    I learned python for a few months, I know numpy, pyplot, os, But I do not have imaging background. Would you kindly recommend some sources for the imaging basic and pydicom (except its official website). Thank you again!


    • Honestly, I haven’t come across much online material on medical imaging processing and I think apart from the official documentation on PyDICOM my post is your best bet :D. The lack of material is why I started this blog.


  28. Hi How to Select Attributes for loading spacing. By going thru various dicom directories what I could find was that the attributes in RefIDs will differ. Few directories doesn’t have an attribute named slice thickness and few others directories don’t have pixel spacing etc. So my question is how to choose the attributes from refid tags to load the spacing values?


    • Hi Anju, its normal for the attributes to change between directories, those are typically different imaging sequences/series so they shouldn’t be loaded in the same data-object. Normally all images in a single directory/series should have the same spacing/origin but if they’re not present there’s not much you can do. However, they may be called something else so I’d open the images in a DICOM viewer application like Radiant or OsiriX to see what attributes it reads in. Then once you identify the right attributes you can use those in your code.


    • Hi Ramon, I don’t believe you can crop a DICOM file directly. You could potentially load the contents of a DICOM file into an array, crop it, and write it into another DICOM file but I’ve never tried it. Depending on the processing you wanna do I’d suggest you load the dicom and crop it in the array format you would like to process as shown in this post.


  29. Hello Sir, I got the same error like other over the broadcast array.
    27 ds = dicom.read_file(filenameDCM)
    28 # store the raw image data
    —> 29 ArrayDicom[:, :, lstFilesDCM.index(filenameDCM)] = ds.pixel_array

    ValueError: could not broadcast input array from shape (47,49) into shape (47,62)

    Above you suggested “………. if you really have images of different dimensions you could always interpolate the outliers to a given size.”

    Can you help me with how t interpolate.

    I’m using python 3.5 and my DCM images have differents sizes I wish to asign a default size of 40×40.


  30. Hello.

    I really appreciate your comments.

    I extracted the vti file with reference to the above code. And I saw it again in the paraview. I felt slightly different vti configuration.

    In my previous vti file, the value of [cell / point Array Status] was ‘DICOMImage’. So I implemented volume rendering in vtk.js.

    However, for vti files converted to Python code, the value of [cell / point Array Status] is ‘ArrayDicom’. Therefore, I am experiencing errors and not implementing volume rendering.

    Is there a way to change the value of [cell / point Array Status] from ArrayDicom to DICOMImage?

    I desperately want to let you know.
    Here is the code :

    import vtk
    from vtk.util import numpy_support
    import os
    import numpy

    def vtkImageToNumPy(image, pixelDims):
    imageData = reader.GetOutput()
    pointData = image.GetPointData()
    arrayData = pointData.GetArray(0)
    ArrayDicom = numpy_support.vtk_to_numpy(arrayData)
    ArrayDicom = ArrayDicom.reshape(pixelDims, order=’F’)

    return ArrayDicom

    dicomFolder = “./lung_imgs”
    reader = vtk.vtkDICOMImageReader()

    # Load dimensions using `GetDataExtent`
    my_extent = reader.GetDataExtent()
    ConstPixelDims = [my_extent[1]-my_extent[0]+1, my_extent[3]-my_extent[2]+1, my_extent[5]-my_extent[4]+1]

    # Load spacing values
    #ConstPixelSpacing = reader.GetPixelSpacing()

    dicom_array = vtkImageToNumPy(reader.GetOutput(), ConstPixelDims)

    from pyevtk.hl import imageToVTK

    imageToVTK(“./lungResult3”, pointData={“ArrayDicom” : dicom_array})


  31. Hi,

    From dicom_numpy i can get two-tuple containing the 3D-ndarray (voxel) and the affine matrix.
    With the function dicom_numpy.combine_slices

    I would like to see a volume-rendering using VTK but it accept only 2d array or at least save my NumPy array in VTK file that could be read by Slicer 3D for exemple.

    how can i do ?

    ps:sorry for my english and i love your articles


  32. Thanks for the great article Adam!

    I’m new to python and wanting to move a whole bunch of complex ImageJ (*cringe*) dicom processing scripts over to python.

    I saw you mentioned both the vtk and pydicom modules, and in the comments someone mentioned SimpleITK (recipes? …I don’t even know what python people’s correct term is lol).

    Would you be able to comment on what you think is the best one to use nowadays? (I’m using python3, and then doing numpy processing, and wanting to open all different modalities, not just CT/MR cross sectional)


    • …I should mention apart from all modality types, I’ll also possibly need to support compressed dicom, as well as un-compressed – that’s why I mentioned SimpleITK.


      • Oh man ImageJ, blast from the past :D. You actually hit a sore-spot there Richard, Python2 is nearly dead (reaching end-of-life by 2020) so Python3 is the way to go. That’s largely why I haven’t written any new posts in years cause I’d have to essentially go over the entire thing and port to Py3, new Jupyter notebooks, new versions of VTK etc. I just don’t have the time for it anymore. Been hoping someone would sweep in and do it for me but its 5 years later and it hasn’t happened :D.

        Depending on what your scripts do you might get away with Py3, NumPy, SciPy, and a touch of scikit-image or OpenCV. Now if you’re doing medical-grade segmentation stuff you can’t go wrong with SimpleITK. OpenCV/scikit are top-notch but were never specialised in medical-image processing.

        Now as far as compressed DICOM goes I’m pretty confident that PyDICOM supports it (https://pydicom.github.io/pydicom/stable/image_data_handlers.html) ideally through GDCM (http://gdcm.sourceforge.net/wiki/index.php/Main_Page). It seems a little trickier with ITK/SimpleITK though I’ll come clean and say I haven’t tried it there.

        All in all I’d say use the simplest packages that fit the bill for you. As you might have seen ITK/VTK can be rather complex so if you can do your stuff without them I’d suggest avoiding them. With great processing power come great hassles in installation :D.


  33. Hey Adam, thanks for your thoughts again.

    …yeah – using ImageJ was getting just too cumbersome and painful, so decided to move to python and something you can actually program in.

    I checked out those ideas of yours for libraries and they look good.

    I’d just like to pick your brain on one more thing I’m wondering about the best way to handle: in ImageJ right now I use a lot of image registration for DICOM images. This is only translation/rotation generally …nothing needing warping type registration. I’ve found a pretty good plugin for IJ – turboreg – that does what I need reasonably well.

    So what’s your thoughts on the best (ie simplest to use, fastest, most accurate) option for python registration – given I only need translation/rotation?

    ..love to hear your (or anyone else) ideas on that



    • By ‘frame’ do you mean a slice in a 3D set or are you referring to 4D MRI? AFAIK each frame (be it in space or time) in DICOM is a single file in a series. The instructions in this post pertain to a series but the same methods can be used to read an individual file.


      • multiframe means one dicom file have multiple scans. For example, all scans of a particular series number will be represented by one dicom file.

        How to extract each frame in that dicom file?


  34. Hi Adam,

    Thank you for writing this piece. It really clarified a few areas in DICOM files that I was unsure about.

    I am about to begin a research project which involves extracting and editing data within DICOM files (particularly DICOM-RT files).

    Just wondering if you have come across any useful papers/articles/books which explain the basics behind DICOM files? I am aware that a similar question has already been posted but hoping new information is available now.



  35. Having read this I thought it was rather informative. I appreciate you taking the time and energy to put this information together. I once again find myself personally spending a lot of time both reading and commenting. But so what, it was still worth it!|

    Liked by 1 person

  36. Hey Somada, Thanks for this blog, and all informations.

    I am wondering if you can help me about a little pb that I have to face.

    I am trying to have differents volumes (CT DICOM, RT dicom, independant matrix) in the same scale to make some analyses and it looks not simple at all.
    I am able to read all of them using pydicom or dicompyler but I have no idea how to fix the origin pb, the pixels resolution and the rescale of different matrix.

    I was thinking the RT dicom must have the same spaital resolution, but it s not.

    Thank for your help ?



    • Hi Oli, what you’re describing really isn’t a simple problem. Depending on the data you really have two options: (1) interpolation (2) registration. If the imaged anatomy/object was somehow identically placed during the different imaging modalities (translation, rotation, scale) then interpolation would suffice as you would process all volumes and ‘map’ them unto the same grid and be able to analyze them concurrently.

      However, the above is typically not the case. When it comes to patients there’s both translation and rotation (patient moves) so you’d have to ‘pattern match’ the different images and apply transformations to all of them to match them. Have a look at my post under https://pyscience.wordpress.com/2014/10/19/image-segmentation-with-python-and-simpleitk/ and have a look at the comments where I’ve mentioned resources on how to do this.

      Good luck!


      • Thank Somada,

        Yes, there is not so simple :/
        Thank you for the link.

        I started to do that, but I was thinking it could be other ways. especially with RT dicom and CT, which normally are supposed to be quite close (I expected superposition beetwen RT dicom and CT must be simple from nowaday).
        I will post my code and more explanations tomorow.

        See you, and thk again.


        Liked by 1 person

  37. hello I used the same code above but I get this error
    IndexError Traceback (most recent call last)
    14 # Get ref file
    —> 15 RefDs = dicom.read_file(lstFilesDCM[0])
    17 # Load dimensions based on the number of rows, columns, and slices (along the Z axis)

    IndexError: list index out of range


    • Hey Hamdalla, based on the error it seems that `lstFilesDCM` doesn’t have anything in it, i.e., its an empty list. I’d double check that the path to the files is correct and print out `lstFilesDCM` to make sure it’s found the files.


  38. Hi Somada,
    I am working on DICOM database which consist of CT images in .dcm format and corresponding labels in .PNG format. I would like to train CNN .Kindly help me I am new to Dicom as well as python.
    In above tutorial I come to know about how to read dicom data into numpy but how to convert corresponding labels into numpy so that cnn can be train.
    I am using CHAOS 2019 database for my project .
    I hope you will help me out in this regard.
    Your help really make me move ahead.
    Thanks in advance.


      • Thanks Somada ,
        I have gone through scipy and converted into numpy array. now I got dicom images and corresponding labels into numpy array.
        Again there is one problem …In my dataset there are few dicom images which has extra contour with almost same intensity of useful intensities …I want to remove that extra contour from images and preserve the useful information with same image size.
        Please help me how to do it which image operation will help me.
        Hope you will reply soon.
        Thanks in advance


      • Hi Devidas, if by ‘contour’ you’re referring to an isocontour, i.e., a single value, then you just need to ‘filter out’ that one contour. In NumPy it’d be as simple as `yourarray[yourarray == isocontour_value] = 0` so that you 0 out the unwanted values.


  39. Pingback: Denosing Lung CT Scans using Neural Networks with Interactive Code — Part 1, Vanilla Auto Encoder Model – Data Science Repository

  40. Good Morning Community, i’m a Brazilian researcher my project consist basicaly in reading Dicom image and using its pixels to get the average CT numbers

    I need to make a interface using PySimpleGUI and link it to Pydicom image data to read and save Dicom images, i would love to see how can i do this in Python, i need to get the Pixel data of the image and then show it to the user though the PySimpleGUI interface.

    Thanks for reading, and if u know some tips please step ahead and write a reply!


    • Hey there! I’ve never heard of PySimpleGUI before but from what I can see it supports different backends like Qt in which case I believe it is possible to integrate VTK with QT and visualise the DICOM values much like I’ve done in this blog. Alternatively you could opt to visualise the DICOM data through matplotlib which should work perfectly fine but may not perform as quickly.


  41. Hi there!
    I just wounder, how to deal with the gantry tilt in CT-DICOMs. This tilt makes the data nearly unusable for any processing, since it varies from patient to patient. If you reconstruct a volume from the different slices, in the sagittal view, a human head looks like an alien. Has anybody seen a solution for this (in Pydicom)?

    Kind regards


    • Well this is way over my head but I’d think you can rotate the entire volume prior to slicing it. The DICOM is what it is so I don’t know if PyDICOM can help but both VTK and NumPy have rotation functionality and since this doesn’t require warping but only rotation it shouldn’t be too hard. Any chance rotation correction meta exist in the DICOM?


      • Thanks for your replay. Rotation is not the problem, that could easily be fixed. Here we are talking about real shearing, which distorts the proportions. And that is crucial, because diagnosis is partially! based on ratio of different portions of the brain. I can only imagine a shear matrix as solution, but I hoped there is an already made function. I am just wondering, why this problem is not heavily discussed in different forums, because it is absolutely essential for every CNN to work properly.

        Liked by 1 person

      • Aaah ok sorry I had no idea, never encountered that issue. Well as you suggested a transformation ought to help with that but of course you’d have to know by how much to transform. Does the DICOM dataset give metadata on how much shearing has been applied to the image (I assume not)? The vtkTransform can apply such affine transformations (https://vtk.org/doc/nightly/html/classvtkTransform.html#details) but you may be better served by the advanced transforms available under SimpleITK (https://simpleitk.readthedocs.io/en/master/registrationOverview.html). Of course unless you have some sort of reference so that you’d know how much shearing to correct i have no clue as to how that can be fixed.


      • Yes, if dicom is your DICOM file, dicom.GantryDetectorTilt, this will give you the tilt. Thanks for your help, actually I do not know simpleitk, so another library to learn.


      • Ah fantastic was hoping that’d be the case. I’m sure there’ll be some trial-n-error involved into translating the tilt provided in the DICOM to one that makes sense in SimpleITK but it should be entirely feasible. Also, if you’re dealing with medical image-data I reckon learning SimpleITK is a worthwhile investment.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s