Aditi Shanmugam | Computer Vision Engineer | IISc | Fellowship.ai | BMSIT&M

Machine Learning Engineer (Computer Vision), Ex-Intern

April '22 - Present

As a Machine Learning Engineer at Inferigence Quotient, I led the development and deployment of cutting-edge computer vision solutions, focusing on real-time object detection and recognition systems. My work involved spearheading a cross-functional team in the creation of an Automatic Number Plate Recognition (ANPR) system, which significantly improved detection accuracy and reduced manual tracking errors across multiple deployment sites. I also engineered a high-performance object recognition pipeline for UAVs, which enhanced operational efficiency in surveillance missions by 30% and reduced processing time by 50%. Additionally, I developed a real-time georeferencing system for UAV-captured images, achieving 95% accuracy in aligning images with satellite imagery, thereby improving geolocation accuracy by 40%. My contributions not only advanced the technical capabilities of our systems but also directly impacted client operations by providing reliable, scalable solutions.

Primary Tasks and Responsibilities:

1. Automatic Number Plate Recognition - ANPR

Directed a cross-functional team of 8 data engineers and ML engineers in the deployment of a scalable number plate tracking solution, incorporating custom OCR correction logic specifically tailored for Indian number plates.
Orchestrated the development and integration of advanced detection algorithms that elevated detection accuracy to over 95%, significantly enhancing data integrity; this initiative led to a reduction in error rates to under 2% and greatly improved service reliability for law enforcement and civilian applications.
Deployed the software at 10 toll plazas, resulting in a 40% reduction in manual tracking errors and enabling law enforcement agencies to respond to traffic violations in real time.
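For illustration only, here is a minimal sketch of what position-aware OCR correction for number plates can look like. It is not the deployed logic: the plate layout (two-letter state code, one or two district digits, one to three series letters, four-digit number), the confusion tables, and the function name `correct_plate` are all assumptions made for this example.

```python
import re

# Assumed layout: state (2 letters), district (1-2 digits),
# series (1-3 letters), number (4 digits), e.g. KA01AB1234.
PLATE_RE = re.compile(r"^[A-Z]{2}[0-9]{1,2}[A-Z]{1,3}[0-9]{4}$")

# Hypothetical tables of characters that OCR commonly confuses.
TO_DIGIT = {"O": "0", "Q": "0", "I": "1", "L": "1", "Z": "2", "S": "5", "B": "8"}
TO_LETTER = {"0": "O", "1": "I", "2": "Z", "5": "S", "8": "B"}


def correct_plate(raw):
    """Return a corrected plate string, or None if it still looks invalid."""
    # Normalise: upper-case and drop spaces, hyphens and other separators.
    text = re.sub(r"[^A-Z0-9]", "", raw.upper())
    if len(text) < 8:  # shortest plate under the assumed layout
        return None

    head, middle, tail = text[:2], text[2:-4], text[-4:]
    # The state code must be letters, the trailing number must be digits.
    head = "".join(TO_LETTER.get(c, c) for c in head)
    tail = "".join(TO_DIGIT.get(c, c) for c in tail)
    # The district code starts with a digit; the rest of the middle is
    # ambiguous (digit or letter), so it is left untouched.
    middle = TO_DIGIT.get(middle[0], middle[0]) + middle[1:]

    candidate = head + middle + tail
    return candidate if PLATE_RE.match(candidate) else None


print(correct_plate("ka-O1 AB 12S4"))
```

The idea is that the expected character class at each position decides which way an ambiguous glyph is resolved, and anything that still fails the pattern is rejected rather than guessed.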

2. System for Tracking And Recognition of Targets - iSTART

Engineered a robust object recognition and tracking pipeline for deployment within Unmanned Aerial Vehicles (UAVs), implementing state-of-the-art tracking algorithms such as DeepSort alongside custom-trained YOLOv7 models, optimized for real-time performance.
Utilized frameworks and libraries including OpenCV, TensorRT, and onnxruntime in C++ to design and develop the system, ensuring it was fully optimized for deployment on Jetson devices, thus achieving high efficiency and low latency in aerial surveillance applications.
Led the development of an in-house image annotation tool and oversaw the curation of custom datasets tailored for deep learning projects, managing and mentoring a team of 3 members to ensure the successful delivery of high-quality data for model training.

3. Real-time Georeferencing of Aerial Infrared (IR) Video - GeoAIR

Leveraged Python and OpenCV to develop a precise frame registration pipeline, aligning UAV-captured infrared images with satellite imagery to achieve accurate geo-location for real-time aerial surveillance applications.
Incorporated advanced template matching algorithms combined with sparse and dense optical flow techniques, achieving close to 90% frame registration accuracy and delivering performance on HD videos with latency under 500 ms and throughput of 25 fps on a moderate-capacity GPU, meeting stringent real-time operational requirements.

Visual Computing Lab, Indian Institute of Science (IISc)

Research Intern (Deep Learning)

May '21 - April '22

I was a Research Intern under Professor Anirban Chakraborty at the Visual Computing Laboratory, part of the Department of Computational and Data Sciences at the Indian Institute of Science (IISc). There, I contributed to the development of advanced deep learning models for domain adaptation and image inpainting. I collaborated on integrating the Divide-Mix algorithm into the Source-Free Multi-Label Domain Adaptation (SF-MLDA) framework, which led to a 7% increase in model accuracy by mitigating data noise. Additionally, I worked on developing neural networks that utilized Generative Adversarial Networks (GANs) and Autoencoders, resulting in a 30% improvement in anomaly detection efficiency. My research focused on advancing the state of the art in these areas, with outcomes that were not only academically significant but also had practical implications for improving the reliability and robustness of machine learning models.

Primary Tasks and Responsibilities:

1. Source Free Multi-Label Domain Adaptation - SF-MLDA

Played a pivotal role in the development of an advanced framework for performing Source-Free Multi-Label Domain Adaptation (SF-MLDA), leveraging cutting-edge techniques to address domain shift challenges in machine learning.
Spearheaded the integration of a co-teaching algorithm called Divide-Mix to effectively mitigate noise in training data within the SF-MLDA framework, resulting in a measurable 7.0% improvement in model accuracy and robustness.

2. Superpixel Masking and Image Inpainting - SMAI

Led the development and optimization of two neural networks inspired by Generative Adversarial Networks (GANs) and Autoencoders for advanced anomaly detection, localization, and correction, significantly improving the system’s effectiveness.
Conducted in-depth experiments with structural and reconstruction loss functions to establish a strong correlation between image inpainting quality and reconstruction accuracy, contributing to the refinement of the overall pipeline.
Enhanced the anomaly detection pipeline by incorporating multi-exposure fusion techniques for synthetic image regeneration, achieving a remarkable 80.0% overall accuracy rate, setting a new standard for image inpainting applications.

Fellowship.ai

Data Science Fellow

January '21 - April '21

I was part of the four-month Machine Learning Fellowship program offered by Fellowship.ai, a subsidiary of launchpad.ai. During my tenure as a Data Science Fellow at Fellowship.ai, I developed a zero-shot object detection web application tailored for culinary environments. Using the Contrastive Language-Image Pre-training (CLIP) model, I achieved a Top-1 accuracy of 97.22% and a perfect Top-3 accuracy of 100% on a custom dataset of approximately 16 images per class across over 100 classes. This web application became integral to daily operations, enabling real-time ingredient recognition and significantly streamlining kitchen processes. My work involved not only model refinement and deployment but also ensuring that the application could perform effectively in a production environment, delivering practical, real-world benefits to the users.

Primary Tasks and Responsibilities:

1. Novel Food Type Detection

Developed an end-to-end functional web application using Streamlit to perform zero-shot object detection for food items in an in-oven setting, enabling real-time identification and monitoring within culinary environments.
Established baseline results by leveraging ResNet50 and ResNet101 networks through transfer learning and training from scratch on custom datasets, effectively setting the foundation for advanced model improvements.
Utilized OpenAI's state-of-the-art Contrastive Language-Image Pre-training model, CLIP, to achieve remarkable performance, attaining a Top-1 accuracy of 97.22% and a perfect Top-3 accuracy of 100.0% on a custom dataset containing approximately 16 images per class.
Developed and implemented web scrapers using Scrapy and Selenium to generate custom datasets by extracting images from food blogs and Instagram. Enhanced the dataset size and diversity through data augmentation techniques, significantly improving model robustness.
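For readers unfamiliar with the Top-1 and Top-3 figures quoted above, the metric can be sketched as follows. This is a generic illustration, not the project's evaluation code: the function name `top_k_accuracy` and the toy scores are made up for the example, and in a CLIP setting each row would hold the image-to-text similarity scores for one image.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one list per sample.
    labels: list of true class indices, one per sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with three samples and three classes (values are made up).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_labels = [1, 1, 2]

print(top_k_accuracy(toy_scores, toy_labels, k=1))
print(top_k_accuracy(toy_scores, toy_labels, k=2))
```

In the toy data the second sample is ranked wrongly at k=1 but recovered at k=2, which is the same effect that makes a Top-3 score higher than a Top-1 score.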

Projects

Multi-Modal Machine Learning for Object Detection.

This work was conducted as my undergraduate thesis under the guidance of Prof. Pratibha N, Assistant Professor at the Department of Electronics and Telecommunications Engineering at BMS Institute of Technology and Management. The main aim of this project is to demonstrate the significant improvements in Computer Vision that come from integrating the contextual understanding used in Natural Language Processing tasks. Contrastive Language-Image Pre-training (CLIP) uses multimodal data in the form of image-text pairs, in zero-shot or few-shot settings. This work has been accepted to the 6th International Joint Conference on Advances in Computational Intelligence (IJCACI 2022).

Paper | Code

Source Free - Multi Label Domain Adaptation

This work introduces a novel concept of Source-Free Multi-Label Domain Adaptation (SF-MLDA) using graph convolution networks (GCN) and a co-teaching based method to tackle the problem of noisy labels. In summary, this research work aims to improve the task of adapting a deep learning model to a new domain (distribution of data) where instances have multiple labels, in the absence of a labeled source domain to aid in the adaptation process. My work was carried out in collaboration with Vikash Kumar, a Master's student, and his advisor Prof. Anirban Chakraborty at the Indian Institute of Science, during my internship at the Visual Computing Lab.

Superpixel Masking and Image Inpainting with Multi Exposure Fusion

The aim of this research was to improve Anomaly Detection and Correction using Superpixel Masking and Inpainting. To enhance existing methods, the network was designed by employing a mask-based curriculum learning approach and incorporating multi-image exposure. Two variants of the network were developed, taking inspiration from Generative Adversarial Networks (GAN) and Autoencoder-based architectures. This work aims to improve applications such as image restoration, where damaged or missing parts of an image need to be reconstructed while maintaining the overall image quality. I collaborated with Aditya Kumar Pal, a former Master's student, and his advisor Prof. Anirban Chakraborty at the Indian Institute of Science, during my internship at the Visual Computing Lab.

Dry Waste Classification Using Machine Learning and IoT

The primary objective of this project was to develop a simple yet efficient machine learning-powered device to classify waste as organic or inorganic. A paper about the work was submitted to the 2nd International Conference On Intelligent Engineering And Management, 2021. The project also acquired small-scale funding of INR 20,000 to develop a fully functional prototype to scale. This project was carried out under Dr. Mallikarjuna Gowda C.P, Associate Professor at the Department of Electronics and Telecommunications Engineering at BMS Institute of Technology and Management.

Paper | Code
