Detecting Retina Damage from OCT-Retinal Images

Badreesh Shetty, Apr 6

Optical coherence tomography (OCT) is an imaging technique that uses coherent light to capture high-resolution images of biological tissues.

OCT is heavily used by ophthalmologists to obtain high-resolution images of the retina.

The retina functions much like the film in a camera.

OCT images can be used to diagnose many retina-related eye diseases.

OCT is an emerging biomedical imaging technology that offers non-invasive, real-time, high-resolution imaging of highly scattering tissues.

It is widely used by ophthalmologists to perform diagnostic imaging of the structure of the anterior eye and the retina.

In this article, we’ll see how to classify OCT images in order to detect retinal disease.

We will discuss methods for classifying retinal OCT images automatically with convolutional neural networks (CNNs): two pretrained models used via transfer learning, and custom CNN models.

About the Dataset

Obtained from: Retinal-OCT Images

1. Choroidal neovascularization (CNV)

Choroidal neovascularization is the creation of new blood vessels in the choroid layer of the eye.

Choroidal neovascularization is a common cause of neovascular degenerative maculopathy (i.e. ‘wet’ macular degeneration) [1], commonly exacerbated by extreme myopia, malignant myopic degeneration, or age-related developments.

Figure: Choroidal neovascularization (CNV)

2. Diabetic Macular Edema (DME)

DME is a complication of diabetes caused by fluid accumulation in the macula that can affect the fovea.

The macula is the central portion of the retina, at the back of the eye, where vision is the sharpest.

Vision loss from DME can progress over a period of months and make it impossible to focus clearly.

Figure: Diabetic Macular Edema (DME)

3. Drusen

Drusen are yellow deposits under the retina.

Drusen are made up of lipids and proteins.

Drusen likely do not cause age-related macular degeneration (AMD).

But having drusen increases a person’s risk of developing AMD.

Drusen are made up of protein and calcium salts and generally appear in both eyes.

4. Normal

Normal vision occurs when light is focused directly on the retina rather than in front of or behind it.

A person with normal vision can see objects clearly both near and far away.

Figure: Normal

Transfer Learning

Fine-Tuning our Model

Fine-tuning consists of truncating the last layer (the softmax layer) of the pre-trained network and replacing it with a new softmax layer that is relevant to our problem.

1) MobileNet

MobileNet is an architecture well suited to mobile and embedded vision applications where compute power is limited.

This architecture was proposed by Google.

We load MobileNet with pretrained ImageNet weights and exclude the top layers, i.e. the last three layers, by setting include_top=False. The fully connected layers are replaced with new trainable layers, while the other layers of the model remain exposed.

Freezing a layer or set of layers means preventing their weights from being updated during training.

If you don’t do this, then the representations that were previously learned by the convolutional base will be modified during training.

All the convolutional layers are pre-trained, so we freeze them while training the full model: their weights stay fixed while the new fully connected layers learn. To do this, we set trainable to False on those layers.
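The effect of freezing can be shown without any deep learning library. The sketch below is only an illustration under stated assumptions: the two-layer "model", its weights, the gradients and the helper `sgd_step` are all made up, and it is not the notebook's code. It shows that a gradient step leaves a frozen layer untouched while the new head is updated.

```python
def sgd_step(layers, grads, lr=0.1):
    """Apply one gradient-descent update, skipping frozen layers."""
    for layer, grad in zip(layers, grads):
        if not layer["trainable"]:
            continue  # frozen: weights stay exactly as pretrained
        layer["weights"] = [w - lr * g for w, g in zip(layer["weights"], grad)]


# Hypothetical model: a frozen "convolutional base" and a new trainable head.
layers = [
    {"name": "conv_base", "weights": [0.5, -0.2], "trainable": False},
    {"name": "new_head", "weights": [0.1, 0.3], "trainable": True},
]
# Made-up gradients, one list per layer.
grads = [[1.0, 1.0], [1.0, -1.0]]

sgd_step(layers, grads)

for layer in layers:
    print(layer["name"], layer["weights"])
```

Only the `new_head` weights move; the `conv_base` weights keep their pretrained values.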

GlobalAveragePooling2D is used to convert the input to the correct shape for the Dense layer to handle.

It reduces the 3D output of the previous layer to a 1D vector by averaging each feature map over its spatial dimensions, which makes it suitable for our fully connected layer.
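To make the averaging concrete, here is a minimal pure-Python sketch of global average pooling on a toy feature map. The function name `global_average_pool` and the values are made up for illustration; Keras does this internally on tensors.

```python
def global_average_pool(feature_maps):
    """Average each channel of an H x W x C nested list into a single value."""
    height = len(feature_maps)
    width = len(feature_maps[0])
    channels = len(feature_maps[0][0])
    pooled = []
    for c in range(channels):
        total = sum(
            feature_maps[i][j][c] for i in range(height) for j in range(width)
        )
        pooled.append(total / (height * width))
    return pooled


# Toy 2 x 2 feature map with 3 channels (values are made up).
toy = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]],
]

pooled = global_average_pool(toy)
print(pooled)  # one number per channel
```

The 2 x 2 x 3 input becomes a vector of length 3, one average per channel, which a Dense layer can consume directly.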

With num_classes=4 and activation="softmax", the final layer assigns a single class to each sample by computing a probability for each of the four possible classes.

We can then select the class with the highest probability to give the final classification.
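The softmax step and the final selection can be sketched in a few lines. The class order in `CLASS_NAMES` and the raw scores are assumptions made for this example, not values taken from the notebook.

```python
import math

# Assumed class order; the real order depends on how the data was loaded.
CLASS_NAMES = ["CNV", "DME", "DRUSEN", "NORMAL"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Return the most probable class name together with all probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs


# Made-up raw scores for one image.
label, probs = predict_class([0.2, 1.5, -0.3, 3.1])
print(label, [round(p, 3) for p in probs])
```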

We then stack the new fully connected layers on top of the MobileNet convolutional base for training.

Model Summary

13 Conv Layers (MobileNet)

Trainable params (weights) are all the parameters that get updated during the backward pass of backpropagation.

These weights contain the information learned by the network from the training data.
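Parameter counts like those in a model summary follow from simple arithmetic. The sketch below uses the usual formulas for a standard 2D convolution and a dense layer; the layer sizes are hypothetical and chosen only to illustrate the calculation. MobileNet itself uses depthwise separable convolutions, whose counts are computed differently.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


# Hypothetical layers, used only to illustrate the arithmetic.
first_conv = conv2d_params(3, 3, 3, 64)  # 3 x 3 kernel on an RGB image, 64 filters
new_head = dense_params(1024, 4)         # 1024 pooled features to 4 classes

print(first_conv, new_head)
```

When a layer is frozen, its count moves from the trainable total to the non-trainable total in the summary.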

Image Augmentation

The idea behind image augmentation is that we follow a set process of taking existing images from our training dataset and applying image transformation operations to them, such as rotation, shearing, translation and zooming, to produce new, altered versions of those images.
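As a rough illustration of the idea, the sketch below applies two simple transformations, a horizontal flip and a one-pixel translation, to a tiny made-up "image". The helper names are hypothetical; the actual notebooks presumably rely on a library's augmentation utilities rather than code like this.

```python
def horizontal_flip(image):
    """Mirror each row of a 2D image (nested list)."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels, fill=0):
    """Translate the image to the right, padding the left edge with `fill`."""
    width = len(image[0])
    pixels = min(pixels, width)
    return [[fill] * pixels + row[: width - pixels] for row in image]


def augment(image):
    """Return the original image plus two altered versions of it."""
    return [image, horizontal_flip(image), shift_right(image, 1)]


# Tiny made-up grayscale image.
toy_image = [
    [1, 2, 3],
    [4, 5, 6],
]

variants = augment(toy_image)
for variant in variants:
    print(variant)
```

Each variant keeps the label of the original image, so one labelled scan yields several training examples.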

The augmentation generates 34464 training images and 3223 validation images in 4 classes.

Training Model

The callbacks are invoked during training, here at the end of each epoch.

Checkpointing is a process that saves a snapshot of the application’s state at regular intervals, so the application can be restarted from the last saved state in case of failure.

This is useful during training of deep learning models, which can often be a time-consuming task.

One way to reduce overfitting in neural networks is to use early stopping.

Early stopping prevents overtraining of your model by terminating the training process when the model is no longer improving.

The val_loss did not improve on 0.70965 during the last 3 epochs, which is the patience set in the early stopping callback, so training stopped.
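The patience logic can be sketched as follows. This is a simplified stand-in for what an early stopping callback does, not the library's implementation, and every loss value in `history` except 0.70965 is made up.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0  # improvement resets the counter
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Validation losses per epoch; only 0.70965 comes from the article.
history = [0.95, 0.80, 0.70965, 0.72, 0.75, 0.71]

stop_epoch = early_stopping_epoch(history, patience=3)
print(stop_epoch)
```

Combined with checkpointing, the weights saved at the best epoch can be restored after training stops.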

The test accuracy is 71.2%.

Confusion Matrix

Normal images can be identified easily.

We can improve on this by trying other models.
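For readers unfamiliar with confusion matrices, here is a minimal sketch of how one is built and how accuracy is read off its diagonal. The labels and predictions are made up and do not reproduce the results above; the class order is also an assumption.

```python
CLASS_NAMES = ["CNV", "DME", "DRUSEN", "NORMAL"]  # assumed order


def confusion_matrix(true_labels, predicted_labels, class_names):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(class_names)}
    matrix = [[0] * len(class_names) for _ in class_names]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up ground truth and predictions for six images.
y_true = ["NORMAL", "NORMAL", "CNV", "DME", "DRUSEN", "CNV"]
y_pred = ["NORMAL", "NORMAL", "CNV", "CNV", "DRUSEN", "DME"]

matrix = confusion_matrix(y_true, y_pred, CLASS_NAMES)
for name, row in zip(CLASS_NAMES, matrix):
    print(name, row)
print(round(accuracy(matrix), 3))
```

Off-diagonal entries show which classes are confused with each other, which a single accuracy figure hides.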

Github Code: Retinal-MobileNet.ipynb

2) VGG16

Researchers from the Oxford Visual Geometry Group, or VGG for short, developed the VGG network, which is characterized by its simplicity, using only 3 x 3 convolutional layers stacked on top of each other in increasing depth.

Reducing the volume size is handled by max pooling.

At the end, two fully connected layers, each with 4,096 nodes, are then followed by a softmax layer.

Pooling is carried out by max pooling layers, which follow some of the convolution layers.

Not all the convolution layers are followed by max pooling.

Max pooling is performed over a 2 x 2 pixel window, with a stride of 2.
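The 2 x 2 window with stride 2 can be sketched on a toy feature map like this. The function name `max_pool_2x2` and the values are made up for illustration.

```python
def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 on a 2D nested list.

    A trailing odd row or column is dropped, as with 'valid' padding.
    """
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            window = (
                feature_map[i][j],
                feature_map[i][j + 1],
                feature_map[i + 1][j],
                feature_map[i + 1][j + 1],
            )
            row.append(max(window))
        pooled.append(row)
    return pooled


# Toy 4 x 4 feature map (values are made up).
toy = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool_2x2(toy)
print(pooled)
```

Each pooling layer halves the height and width, which is how the volume size shrinks as the network gets deeper.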

ReLU activation is used in each of the hidden layers.

The number of filters increases with depth in most VGG variants.

The 16-layered architecture VGG-16 is shown in the following diagram.

The same steps as for MobileNet are carried out, with a few changes.

The Flatten layer deals with the dimensions.

Because the output of the convolutional base is three-dimensional, we use Flatten to turn it into a long, single-dimensional string of numbers.
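In contrast to global average pooling, flattening keeps every value and only changes the shape. A minimal sketch on a made-up 2 x 2 x 2 volume, using a hypothetical helper `flatten`:

```python
def flatten(volume):
    """Unroll an H x W x C nested list into one flat list, row by row."""
    return [value for row in volume for pixel in row for value in pixel]


# Toy 2 x 2 x 2 volume (values are made up).
toy_volume = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

flat = flatten(toy_volume)
print(flat, len(flat))
```

The flattened length is height x width x channels, so the Dense layer that follows has many more inputs than it would after average pooling.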

13 Conv Layers (VGG16)

There are more trainable params than in MobileNet.

This means the model has more weights with which to learn features.

Confusion Matrix

NORMAL and DRUSEN are identified correctly more often than the other retinal diseases.

Github Code: RetinalVGG16.ipynb

3) 5-Hidden Layer CNN Model

We apply 5 convolutional layers with 3 x 3 kernels and “relu” activations.

Model

Trainable Params

There are fewer trainable params than in either of the pretrained models.

Even so, it performs better than the pretrained models.

Test Accuracy

Confusion Matrix

The confusion matrix shows that the model is better at predicting Normal, Drusen and DME.

Github Code: Retinal5CNN.ipynb

4) Custom VGG Model

Trainable Params

There are more trainable params than in the other models.

Non-trainable params are also significantly fewer than in the other models compared.

Confusion Matrix

The model has predicted the majority of OCT retinal diseases correctly.

The above inference can be seen below in the predictions.

Predictions

Figures: Normal Prediction, Drusen Prediction, CNV Prediction, DME Prediction

As we can see above, all the retinal diseases have been classified correctly.

Github Code: OCT-Retinal-VGG.ipynb

Conclusion

The CNN model with 16 hidden layers, i.e. similar to the VGG model, gives the best result of 96% test accuracy.

References: https://www.[…]com/cell/fulltext/S0092-8674(18)30154-5