
Research Profile

Background

I am currently working as a Professor at the School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan. I am also an Adjunct Senior Lecturer at the School of Computer Science and Software Engineering at The University of Western Australia in Perth, Australia. Previously, I was a Senior Researcher at the German Research Center for Artificial Intelligence (DFKI) as well as an Adjunct Lecturer at Kaiserslautern University of Technology (TUKL), Germany. I received my PhD with the highest distinction in computer engineering from TUKL in 2008. My research interests include machine learning and pattern recognition with a special emphasis on applications in document image analysis. I have co-authored over 100 publications in international peer-reviewed conferences and journals in this area. My Google Scholar Profile shows over 5,500 citations of my papers.

Professional Activities



Recent Publications

Logical Layout Analysis using Deep Learning

Logical layout analysis plays an important part in document understanding. It can become a challenging task due to varying formats and layouts. Researchers have proposed different ways to solve this problem, mostly using visual information in some way and a complex pipeline. In this paper, we present a simple technique for labelling the logical structures in document images. We use visual and textual features from the document images to label zones. We use a Recurrent Neural Network, specifically two LSTM layers, which takes as input the text of the zone to be classified as a sequence of words, together with the normalized position of each word with respect to the page width and height. The image under test is compared with known layouts, and labels are assigned to zones accordingly. The labels are abstract, title, author names, and affiliation; however, the text also contains very important information for the task at hand. The presented approach achieved an overall accuracy of 96.21% on the publicly available MARG dataset.
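The position normalization mentioned in the abstract above can be illustrated with a minimal sketch. The function name, the (text, x, y) tuple layout, and the choice of pixel coordinates are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical input format: (word text, x in pixels, y in pixels).
Word = Tuple[str, float, float]


def normalize_word_positions(
    words: List[Word], page_width: float, page_height: float
) -> List[Word]:
    """Scale each word's position into the range [0, 1] relative to the page size."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    return [(text, x / page_width, y / page_height) for text, x, y in words]


# Example: two words from a zone on a 600 x 800 pixel page.
zone_words = [("Logical", 300.0, 80.0), ("Layout", 450.0, 80.0)]
normalized = normalize_word_positions(zone_words, page_width=600.0, page_height=800.0)
```

Normalizing by page size makes the position features comparable across pages scanned at different resolutions, which is presumably why a relative rather than absolute position would be fed to a sequence model.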

Feature Engineering meets Deep Learning: A Case Study on Table Detection in Documents

Traditional computer vision approaches relied heavily on hand-crafted features for tasks such as visual object detection and recognition. The recent success of deep learning in automatically extracting representative and powerful features from images has brought a paradigm shift in this area. As a side effect, decades of research into hand-crafted features are considered outdated. In this paper, we present an approach for table detection in which we combine a deep learning based table detection model with hand-crafted features from a classical table detection method. We demonstrate that by using a suitable encoding of hand-crafted features, the deep learning model is able to perform better at the detection task. Experiments on the publicly available UNLV dataset show that the presented method achieves an accuracy comparable with state-of-the-art deep learning methods without the need for extensive hyper-parameter tuning.

Hyperspectral Image Analysis For Writer Identification Using Deep Learning

Handwriting is a behavioral characteristic of human beings that is one of the common idiosyncrasies utilized for litigation purposes. Writer identification is commonly used for forensic examination of questioned and specimen documents. Recent advancements in imaging and machine learning technologies have empowered the development of automated, intelligent and robust writer identification methods. Most of the existing methods based on human defined features and color imaging have limited performance in terms of accuracy and robustness. However, rich spectral information content obtained from hyperspectral imaging (HSI) and suitable spatio-spectral features extracted using deep learning can significantly enhance the performance of writer identification in terms of accuracy and robustness. In this paper, we propose a novel writer identification method in which spectral responses of text pixels in a hyperspectral document image are extracted and are fed to a Convolutional Neural Network (CNN) for writer classification. Different CNN architectures, hyperparameters, spatio-spectral formats, train-test ratios and inks are used to evaluate the performance of the proposed system on the UWA Writing Inks Hyperspectral Images (WIHSI) database and to select the most suitable set of parameters for writer identification. The findings of this work have opened a new arena in forensic document analysis for writer identification using HSI and deep learning.

Runway Detection And Localization In Aerial Images Using Deep Learning

Landing is the most difficult phase of the flight for any airborne platform. Due to the lack of efficient systems, there have been numerous landing accidents resulting in damage to onboard hardware. Vision based systems provide a low-cost solution for detecting landing sites by providing rich textual information. To this end, this research focuses on accurate detection and localization of runways in aerial images with untidy terrains, which would consequently help aerial platforms, especially Unmanned Aerial Vehicles (commonly referred to as drones), to detect landing targets (i.e., runways) to aid automatic landing. Most of the prior work on runway detection is based on simple image processing algorithms with many assumptions and constraints about the precise position of the runway in a particular image. The first part of this research develops a runway detection algorithm based on state-of-the-art deep learning architectures, while the second part addresses runway localization using both deep learning and non-deep learning based methods. The proposed runway detection approach is modular and has two stages: in the first stage, the aerial image is classified to determine whether a runway exists in that particular image. In the second stage, the identified runways are localized using both conventional line detection algorithms and more recent deep learning models. Runway classification has been achieved with an accuracy of around 97%, whereas the runways have been localized with a mean Intersection-over-Union (IoU) score of 0.8.
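For readers unfamiliar with the IoU score reported above, the following is a minimal sketch of how Intersection-over-Union is commonly computed for two axis-aligned boxes. The (x1, y1, x2, y2) box format and the function name are assumptions for illustration; the paper may localize runways with a different region representation.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the area of overlap divided by the area of union of two boxes."""
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Example: a predicted box shifted halfway across a ground-truth box.
score = iou((0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 15.0, 10.0))
```

A mean IoU is then simply the average of this score over all evaluated images.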

FFD: Figure and Formula Detection from Document Images

In this work, we present a novel and generic approach, Figure and Formula Detector (FFD), to detect formulas and figures in document images. Our proposed method employs traditional computer vision approaches in addition to deep models, which helps the deep models improve their performance in comparison to their conventional counterparts. We transform input images by applying connected component analysis (CC), distance transform, and colour transform, which are stacked together to generate an input image for the network. The best results produced by FFD for figure and formula detection are F1-scores of 0.906 and 0.905, respectively. We also propose a new dataset for figure and formula detection to aid future research in this direction.

Two-stage framework for optic disc localization and glaucoma classification in retinal fundus images using deep learning

Background: With the advancement of powerful image processing and machine learning techniques, Computer Aided Diagnosis has become ever more prevalent in all fields of medicine including ophthalmology. These methods continue to provide reliable and standardized large scale screening of various image modalities to assist clinicians in identifying diseases. Since the optic disc is the most important part of a retinal fundus image for glaucoma detection, this paper proposes a two-stage framework that first detects and localizes the optic disc and then classifies it as healthy or glaucomatous. Methods: The first stage is based on Regions with Convolutional Neural Network (RCNN) and is responsible for localizing and extracting the optic disc from a retinal fundus image, while the second stage uses a Deep Convolutional Neural Network to classify the extracted disc as healthy or glaucomatous. Unfortunately, none of the publicly available retinal fundus image datasets provides the bounding box ground truth required for disc localization. Therefore, in addition to the proposed solution, we also developed a rule-based semi-automatic ground truth generation method that provides the necessary annotations for training an RCNN based model for automated disc localization. Results: The proposed method is evaluated on seven publicly available datasets for disc localization and on the ORIGA dataset, which is the largest publicly available dataset with healthy and glaucoma labels, for glaucoma classification. The results of automatic localization mark a new state-of-the-art on six datasets, with accuracy reaching 100% on four of them. For glaucoma classification we achieved an Area Under the Receiver Operating Characteristic Curve of 0.874, which is a 2.7% relative improvement over the state-of-the-art results previously obtained for classification on the ORIGA dataset.
Conclusion: Once trained on carefully annotated data, Deep Learning based methods for optic disc detection and localization are not only robust, accurate and fully automated but also eliminate the need for dataset-dependent heuristic algorithms. Our empirical evaluation of glaucoma classification on ORIGA reveals that reporting only Area Under the Curve, for datasets with class imbalance and without pre-defined train and test splits, does not portray a true picture of the classifier’s performance and calls for additional performance metrics to substantiate the results. Keywords: Computer aided diagnosis, Deep learning, Glaucoma detection, Machine learning, Medical image analysis, Optic disc localization.

Converting a Common Low-Cost Document Scanner into a Multispectral Scanner

Forged documents and counterfeit currency can be better detected with multispectral imaging in multiple color channels instead of the usual red, green and blue. However, multispectral cameras and scanners are expensive. We propose the construction of a low-cost scanner designed to capture multispectral images of documents. A standard sheet-feed scanner was modified by disconnecting its internal light source and connecting an external multispectral light source comprising narrow band light emitting diodes (LEDs). A document was scanned by illuminating the scanner light guide successively with different LEDs and capturing a scan of the document under each. The system costs less than a hundred dollars and is portable. It can potentially be used for applications in verification of questioned documents, checks, receipts and bank notes.

Automatic fish detection in underwater videos by a deep neural network-based hybrid motion learning system

It is interesting to develop effective fish sampling techniques using underwater videos and image processing to automatically estimate and consequently monitor the fish biomass and assemblage in water bodies. Such approaches should be robust against substantial variations in scenes due to poor luminosity, orientation of fish, seabed structures, movement of aquatic plants in the background and image diversity in the shape and texture among fish of different species. Keeping this challenge in mind, we propose a unified approach to detect freely moving fish in unconstrained underwater environments using a Region-Based Convolutional Neural Network, a state-of-the-art machine learning technique used to solve generic object detection and localization problems. To train the neural network, we employ a novel approach that utilizes motion information of fish in videos via background subtraction and optical flow, and subsequently combines the outcomes with the raw image to generate fish-dependent candidate regions. To validate the effectiveness of our hybrid approach, we use two benchmark datasets extracted from the large Fish4Knowledge underwater video repository: the Complex Scenes dataset and the LifeCLEF 2015 fish dataset. We achieve a detection accuracy (F-Score) of 87.44% and 80.02% respectively on these datasets, which supports the use of our approach for the fish detection task.
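As a brief illustration of the F-Score metric quoted above, the sketch below computes it from detection counts. The function name and the example counts are hypothetical and are not results from the paper; the formula is the standard harmonic mean of precision and recall.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F-Score (harmonic mean of precision and recall) from detection counts."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No detections and no ground-truth objects: define the score as 0.
        return 0.0
    # Equivalent to 2 * precision * recall / (precision + recall).
    return 2 * true_positives / denominator


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 missed fish.
example = f_score(true_positives=8, false_positives=2, false_negatives=2)
```

Because the score penalizes both false alarms and missed detections, it is a common single-number summary for detection tasks in which background frames dominate.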

Journal Papers

2010 – Today
2000 – 2009

Conference Publications

2010 – Today
2000 – 2009
