This report explores object localization using bounding box regression in Keras and interactively visualizes the model’s predictions in Weights & Biases.

Photo by Nick Hillier on Unsplash


Object localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. Object detection, by contrast, is the task of locating every instance of every target object category in the image.

Object localization is also called “classification with localization”, because an architecture that performs image classification can be slightly modified to also predict the bounding box coordinates. For background, see Andrew Ng’s lecture on object localization, or Object detection: Bounding box regression with Keras, TensorFlow, and Deep Learning by Adrian Rosebrock.
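To make the "slightly modified" idea concrete, here is a minimal sketch in plain Python rather than Keras. It assumes a shared feature vector (standing in for the output of a convolutional backbone) that feeds two heads: a softmax class head and a four-unit sigmoid head for normalized box coordinates. The function names, weights, and the `(x_min, y_min, x_max, y_max)` ordering are illustrative assumptions, not taken from the report.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Squash a value into (0, 1), matching coordinates normalized by image size."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(features, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def localize(features, class_w, class_b, box_w, box_b):
    """Hypothetical two-head model: class probabilities plus a 4-value box."""
    class_probs = softmax(dense(features, class_w, class_b))
    box = [sigmoid(v) for v in dense(features, box_w, box_b)]
    return class_probs, box


# Made-up feature vector and weights, for illustration only.
features = [0.5, -1.0, 2.0]
class_w = [[0.1, 0.2, 0.3], [-0.3, 0.1, 0.0]]
class_b = [0.0, 0.0]
box_w = [[0.2, 0.0, 0.1], [0.0, 0.1, 0.0], [0.3, 0.3, 0.3], [-0.1, 0.0, 0.2]]
box_b = [0.0, 0.0, 0.0, 0.0]

class_probs, box = localize(features, class_w, class_b, box_w, box_b)
print(class_probs)  # two class probabilities
print(box)          # assumed order: x_min, y_min, x_max, y_max, each in (0, 1)
```

In a Keras model the box head would typically be a four-unit `Dense` layer trained with a regression loss such as mean squared error, though the exact setup used in the report may differ.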

This report explores semantic segmentation with a U-Net-like architecture in Keras and interactively visualizes the model’s predictions in Weights & Biases.


Are you interested in knowing where an object is in an image? What shape it has? Which pixels belong to it? To answer these questions, we need to segment the image, i.e., classify each pixel according to the object it belongs to. In other words, we give every pixel its own label rather than giving a single label to the whole image.

Thus, image segmentation is the task of learning a pixel-wise mask for each object in the image.
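As a small illustration of what a pixel-wise mask means, the sketch below assumes a model has already produced one score per class for every pixel and picks the highest-scoring class at each position. The `predict_mask` helper and the score values are hypothetical and only show the shape of the output.

```python
def predict_mask(scores):
    """Return a 2D mask where each entry is the index of the best-scoring class.

    scores[r][c] is a list of per-class scores for the pixel at row r, column c.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# A made-up 2x3 "image" with two classes: 0 = background, 1 = object.
scores = [
    [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]],
    [[0.6, 0.4], [0.2, 0.8], [0.1, 0.9]],
]

mask = predict_mask(scores)
print(mask)  # [[0, 0, 1], [0, 1, 1]]
```

A classifier would return a single label for this whole image; the mask instead has one label per pixel, at the same height and width as the input.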

Machine Learning in Production

This report will review how Grad-CAM counters the common criticism that neural networks are not interpretable.

Training a classification model is interesting, but have you ever wondered how your model makes its predictions? Is it actually looking at the dog in the image before classifying it as a dog with 98% confidence? In today’s report, we will explore why deep learning models need to be interpretable, and look at some methods for peeking under the hood of a deep learning model. Deep learning interpretability is an active area of research, and much progress has already been made.

So why should you care about interpretability? After all, the success…

Machine Learning in Production

Tips and tricks to debug your neural network

Photo by Efe Kurnaz on Unsplash

In this post, we’ll see what makes a neural network underperform and how to debug it by visualizing the gradients and other parameters associated with model training. We’ll also discuss the problem of vanishing and exploding gradients and methods to overcome them.

Finally, we’ll see why proper weight initialization is useful, how to do it correctly, and dive into how regularization methods like dropout and batch normalization affect model performance.
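The vanishing and exploding gradient problem mentioned above can be shown with a few lines of arithmetic. The sketch below is a deliberately simplified model, not the post's own code: it assumes a chain of single-unit sigmoid layers that all share the same weight and pre-activation, and multiplies the per-layer factors that backpropagation would apply. The `gradient_scale` function is a hypothetical helper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_scale(num_layers, pre_activation=0.0, weight=1.0):
    """Factor by which a gradient is scaled after passing back through the chain.

    Each layer contributes weight * sigmoid'(pre_activation) by the chain rule.
    """
    scale = 1.0
    for _ in range(num_layers):
        scale *= weight * sigmoid_grad(pre_activation)
    return scale


for depth in (1, 5, 10):
    print(depth, gradient_scale(depth), gradient_scale(depth, weight=5.0))
```

With a weight of 1 the factor per layer is at most 0.25, so the gradient shrinks geometrically with depth. With a weight of 5 the factor per layer is 1.25 and the gradient grows instead. This sensitivity to weight scale is one reason initialization matters.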

Where do neural network bugs come from?

As shown in this piece, neural network bugs are really hard to catch because:

1. The code never crashes, raises an exception, or even slows down.
2. The network…

Photo by NeONBRAND on Unsplash


Technology is remarkable when it inspires other technologies, and the field of computer vision has done this time and again. The simple idea of convolving 2D data, such as an image, with another 2D matrix called a kernel, using some stride and padding, changed computer vision forever and opened up many new possibilities.

One such possibility is for computers to understand what is in an image, scene, or frame. CNNs revolutionized the image classification task, but their power lies in the fact that, unlike a dense neural network, they preserve spatial information. …
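The convolution with stride and padding described above can be sketched in plain Python. This is a minimal, unoptimized illustration with a hypothetical `conv2d` helper. It slides the kernel without flipping it (technically cross-correlation, which is what deep learning libraries usually compute) and assumes zero padding.

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Slide a 2D kernel over a 2D image and return the resulting feature map."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])

    # Surround the image with `padding` rows and columns of zeros.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for r in range(h):
        for c in range(w):
            padded[r + padding][c + padding] = image[r][c]

    # Output size: floor((n + 2p - k) / s) + 1 along each dimension.
    out_h = (h + 2 * padding - kh) // stride + 1
    out_w = (w + 2 * padding - kw) // stride + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    acc += padded[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(acc)
        output.append(row)
    return output


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
sum_kernel = [[1, 1], [1, 1]]

result = conv2d(image, sum_kernel, stride=2)
print(result)  # [[14.0, 22.0], [46.0, 54.0]]
```

Each output value depends only on a small neighborhood of the input, which is how spatial structure is carried through to the feature map.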

Photo by Andrei Lazarev on Unsplash

Sign languages are languages that use the visual-manual modality to convey meaning. They are expressed through manual articulations in combination with non-manual elements. American Sign Language (ASL) is one such language, and it serves as the predominant sign language of deaf communities in the United States and most of Anglophone Canada. Sign languages are full-fledged natural languages with their own grammar and lexicon. But because sign languages are not universal, their use is limited to a small population of signers and trained professionals.

Defining The Problem

Sign language can be understood by the masses if an accurate translation of the…

Ayush Thakur

Deep Learning :D Twitter: ayushthakur0 Website:
