A Gentle Introduction to Channels First and Channels Last Image Formats for Deep Learning

Color images have height, width, and color channel dimensions.

When represented as three-dimensional arrays, the channel dimension for the image data is last by default, but may be moved to be the first dimension, often for performance-tuning reasons.

The use of these two “channel ordering formats” and preparing data to meet a specific preferred channel ordering can be confusing to beginners.

In this tutorial, you will discover channel ordering formats, how to prepare and manipulate image data to meet formats, and how to configure the Keras deep learning library for different channel orderings.

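After completing this tutorial, you will know how images are represented as channels last and channels first arrays, how to change the channel ordering of image data with NumPy, and how to configure Keras for either ordering.

Let’s get started.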

This tutorial is divided into three parts: images as three-dimensional arrays, manipulating image channels, and channel ordering in Keras.

An image can be stored as a three-dimensional array in memory.

Typically, the image format has one dimension for rows (height), one for columns (width) and one for channels.

If the image is black and white (grayscale), the channels dimension may not be explicitly present, e.g. there is one unsigned integer pixel value for each (row, column) coordinate in the image.

Color images typically have three channels, giving the red, green, and blue components of the pixel at each (row, column) coordinate.

Deep learning neural networks require that image data be provided as three-dimensional arrays.

This applies even if your image is grayscale.

In this case, the additional dimension for the single color channel must be added.

There are two ways to represent the image data as a three-dimensional array.

The first involves having the channels as the last or third dimension in the array.

This is called “channels last”.

The second involves having the channels as the first dimension in the array, called “channels first”.
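The difference between the two layouts can be sketched with plain NumPy arrays. This is only an illustration: the 424 by 640 size and the zero-filled arrays are stand-ins chosen for the example, not data from a real image.

```python
import numpy as np

# Illustrative image size (an assumption for this sketch)
rows, cols, channels = 424, 640, 3

# Channels last: [rows][cols][channels]
image_last = np.zeros((rows, cols, channels), dtype=np.uint8)
# Channels first: [channels][rows][cols]
image_first = np.zeros((channels, rows, cols), dtype=np.uint8)

# Set the red component of the pixel at row 10, column 20 in both layouts
image_last[10, 20, 0] = 255
image_first[0, 10, 20] = 255

print(image_last.shape)   # (424, 640, 3)
print(image_first.shape)  # (3, 424, 640)

# The same RGB pixel is read with a different index pattern in each layout
print(image_last[10, 20])
print(image_first[:, 10, 20])
```

In both cases the same values are stored; only the position of the channel index changes.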

Some image processing and deep learning libraries prefer channels first ordering, and some prefer channels last.

As such, it is important to be familiar with the two approaches to representing images.

You may need to change or manipulate the image channels or channel ordering.

This can be achieved easily using the NumPy Python library.

Let’s look at some examples.

In this tutorial, we will use a photograph taken by Larry Koester, some rights reserved, of the Phillip Island Penguin Parade.

Phillip Island Penguin Parade. Photo by Larry Koester, some rights reserved.

Download the image and place it in your current working directory with the filename “penguin_parade.jpg”.

The code examples in this tutorial assume that the Pillow library is installed.

Grayscale images are loaded as a two-dimensional array.

Before they can be used for modeling, you may have to add an explicit channel dimension to the image.

This does not add new data; instead, it changes the array data structure to have an additional third axis of size one to hold the grayscale pixel values.

For example, a grayscale image with the dimensions [rows][cols] can be changed to [rows][cols][channels] or [channels][rows][cols] where the new [channels] axis has a size of one.

This can be achieved using the expand_dims() NumPy function.

The “axis” argument allows you to specify where the new dimension will be added, e.g. first for channels first or last for channels last.

The example below loads the Penguin Parade photograph using the Pillow library as a grayscale image and demonstrates how to add a channel dimension.

Running the example first loads the photograph using the Pillow library, then converts it to a grayscale image.

The image object is converted to a NumPy array and we confirm the shape of the array is two-dimensional, specifically (424, 640).

The expand_dims() function is then used to add a channel via axis=0 to the front of the array and the change is confirmed with the shape (1, 424, 640).

The same function is then used to add a channel to the end or third dimension of the array with axis=2 and the change is confirmed with the shape (424, 640, 1).
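The steps described above can be sketched as follows. This is a minimal version that assumes Pillow and NumPy are installed. The fallback to a blank 640 by 424 image is an addition for this sketch, so that it still runs when the photograph has not been downloaded.

```python
import os

from numpy import asarray, expand_dims
from PIL import Image

filename = 'penguin_parade.jpg'
if os.path.exists(filename):
    img = Image.open(filename)
else:
    # Stand-in (assumption): a blank RGB image, 640 wide by 424 high
    img = Image.new('RGB', (640, 424))

# Convert to grayscale ('L' mode), then to a two-dimensional NumPy array
img = img.convert(mode='L')
data = asarray(img)
print(data.shape)

# Add a channel axis at the front of the array (channels first)
data_first = expand_dims(data, axis=0)
print(data_first.shape)

# Add a channel axis at the end of the array (channels last)
data_last = expand_dims(data, axis=2)
print(data_last.shape)
```

With a 640 by 424 image, the three printed shapes are (424, 640), (1, 424, 640), and (424, 640, 1).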

Another popular alternative to expanding the dimensions of an array is to use the reshape() NumPy function and specify a tuple with the new shape; for example, data.reshape((424, 640, 1)).

After a color image is loaded as a three-dimensional array, the channel ordering can be changed.

This can be achieved using the moveaxis() NumPy function.

It allows you to specify the index of the source axis and the destination axis.

This function can be used to change an array in channels last format, such as [rows][cols][channels], to channels first format, such as [channels][rows][cols], or the reverse.

The example below loads the Penguin Parade photograph in channels last format and uses the moveaxis() function to change it to channels first format.

Running the example first loads the photograph using the Pillow library and converts it to a NumPy array, confirming that the image was loaded in channels last format with the shape (424, 640, 3).

The moveaxis() function is then used to move the channels axis from position 2 to position 0 and the result is confirmed showing channels first format (3, 424, 640).

This is then reversed, moving the channels in position 0 to position 2 again.
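The round trip described above can be sketched with NumPy alone. A random array with the same shape as the photograph stands in for the loaded image here (an assumption made so the sketch runs without the file); the moveaxis() calls are the same either way.

```python
import numpy as np

# Stand-in for the loaded photo: a channels last array of shape (424, 640, 3).
# With Pillow installed, the real image could be loaded instead with
# np.asarray(Image.open('penguin_parade.jpg')).
rng = np.random.default_rng(1)
data = rng.integers(0, 256, size=(424, 640, 3), dtype=np.uint8)
print(data.shape)  # (424, 640, 3)

# Channels last -> channels first: move axis 2 to position 0
data_first = np.moveaxis(data, 2, 0)
print(data_first.shape)  # (3, 424, 640)

# Channels first -> channels last: move axis 0 back to position 2
data_last = np.moveaxis(data_first, 0, 2)
print(data_last.shape)  # (424, 640, 3)

# The round trip leaves the pixel values unchanged
print(np.array_equal(data, data_last))  # True
```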

The Keras deep learning library is agnostic to how you wish to represent images in either channel first or last format, but the preference must be specified and adhered to when using the library.

Keras wraps a number of mathematical libraries, and each has a preferred channel ordering.

The three main libraries that Keras may wrap each have a preferred channel ordering; for example, TensorFlow prefers channels last and Theano prefers channels first.

By default, Keras is configured to use TensorFlow and the channel ordering is also by default channels last.

You can use either channel ordering with any of these libraries through Keras.

Some libraries claim that using their preferred channel ordering can result in a large difference in performance.

For example, the documentation for using the MXNet mathematical library as a Keras backend recommends the channels first ordering for better performance.

“We strongly recommend changing the image_data_format to channels_first. MXNet is significantly faster on channels_first data.”

(Performance Tuning Keras with MXNet Backend, Apache MXNet)

The library and preferred channel ordering are listed in the Keras configuration file, stored in your home directory under ~/.keras/keras.json.

The preferred channel ordering is stored in the “image_data_format” configuration setting and can be set as either “channels_last” or “channels_first”.

For example, below are the contents of a keras.json configuration file.

In it, you can see that the system is configured to use tensorflow and channels_last order.
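As a hedged illustration, the snippet below embeds typical contents of a keras.json file as a string and reads the two relevant settings with Python's json module. The "backend" and "image_data_format" values match the configuration described above; the "epsilon" and "floatx" entries are the usual defaults and are included as an assumption about what your file contains.

```python
import json

# Typical contents of ~/.keras/keras.json (illustrative; your file may differ)
KERAS_JSON = """
{
    "image_data_format": "channels_last",
    "backend": "tensorflow",
    "epsilon": 1e-07,
    "floatx": "float32"
}
"""

config = json.loads(KERAS_JSON)

# The two settings that determine the library and the channel ordering
print(config["backend"])            # tensorflow
print(config["image_data_format"])  # channels_last
```

Changing the "image_data_format" value in the real file to "channels_first" changes the default ordering for every program that uses that configuration.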

You will have to prepare your image data to match your preferred channel ordering.

Specifically, this will include tasks such as reshaping or expanding the dimensions of your datasets and specifying the matching input shape when defining a model.

In addition, those neural network layers that are designed to work with images, such as Conv2D, also provide an argument called “data_format” that allows you to specify the channel ordering, for example data_format='channels_first'.

By default, this will use the preferred ordering specified in the “image_data_format” value of the Keras configuration file.

Nevertheless, you can change the channel order for a given model, and in turn, the datasets and input shape would also have to be changed to use the new channel ordering for the model.

This can be useful when loading a model used for transfer learning that has a channel ordering different to your preferred channel ordering.
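Changing a dataset to suit such a model can be sketched with NumPy. The batch size and image size below are hypothetical; the point is that a batch of images has a leading samples axis, so the channels axis moves to position 1 rather than position 0.

```python
import numpy as np

# Hypothetical batch of 8 RGB images in channels last format:
# [samples][rows][cols][channels]
batch_last = np.zeros((8, 32, 32, 3), dtype=np.float32)

# Move the channels axis (last) to sit just after the samples axis
batch_first = np.moveaxis(batch_last, -1, 1)

print(batch_last.shape)   # (8, 32, 32, 3)
print(batch_first.shape)  # (8, 3, 32, 32)

# The per-sample input shape given to the model changes to match
input_shape = batch_first.shape[1:]
print(input_shape)  # (3, 32, 32)
```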

You can confirm your current preferred channel ordering by printing the result of the image_data_format() function.

The example below demonstrates.

Running the example prints your preferred channel ordering as configured in your Keras configuration file.

In this case, the channels last format is used.
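The value returned by image_data_format() is a plain string, either 'channels_first' or 'channels_last', so a program can branch on it. The sketch below is one possible way to do that. The helper names input_shape_for() and add_channel_axis() are hypothetical, and the format string is passed in as an argument so the code runs without Keras; in a Keras program it would come from image_data_format().

```python
import numpy as np


def input_shape_for(data_format, rows, cols, channels):
    """Return the per-image input shape for the given channel ordering."""
    if data_format == 'channels_first':
        return (channels, rows, cols)
    if data_format == 'channels_last':
        return (rows, cols, channels)
    raise ValueError('unknown data format: %r' % (data_format,))


def add_channel_axis(gray, data_format):
    """Add a size-one channel axis to a 2D grayscale array.

    Any value other than 'channels_first' is treated as channels last.
    """
    axis = 0 if data_format == 'channels_first' else -1
    return np.expand_dims(gray, axis=axis)


# Hypothetical 28x28 grayscale image prepared for each ordering in turn
gray = np.zeros((28, 28), dtype=np.uint8)
for fmt in ('channels_first', 'channels_last'):
    shape = input_shape_for(fmt, 28, 28, 1)
    prepared = add_channel_axis(gray, fmt)
    print(fmt, shape, prepared.shape)
```

For each ordering, the input shape and the shape of the prepared array agree, which is what a model defined with that input shape would expect.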

Accessing this property can be helpful if you want to automatically construct models or prepare data differently depending on the system's preferred channel ordering.

Finally, the channel ordering can be forced for a specific program.

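This can be achieved by calling the set_image_dim_ordering() function on the Keras backend with either ‘th’ (theano) for channel-first ordering, or ‘tf’ (tensorflow) for channel-last ordering.

Note that set_image_dim_ordering() is a legacy function; more recent Keras releases provide set_image_data_format() instead, which takes ‘channels_first’ or ‘channels_last’.

Forcing the ordering can be useful if you want a program or model to operate consistently regardless of the default Keras channel ordering configuration.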

Running the example first forces channels-first ordering, then channels-last ordering, confirming each configuration by printing the channel ordering mode after the change.

This section provides more resources on the topic if you are looking to go deeper.

In this tutorial, you discovered channel ordering formats, how to prepare and manipulate image data to meet formats, and how to configure the Keras deep learning library for different channel orderings.

Do you have any questions? Ask your questions in the comments below and I will do my best to answer.
