Seminar on Connecting Language and Vision: Hierarchical Image Classification

Multi-level hierarchical classification addresses the problem of classifying items into a multi-level hierarchy of classes. For example, an image of a cat can be classified as "biological organism", "animal", and "cat", depending on the taxonomy used. Although several methods have been proposed to solve this problem, they still suffer from drawbacks. In particular, most existing methods: (i) do not embed the taxonomy structure used, (ii) use a complex backbone neural network with n disjoint output layers that do not constrain each other, and (iii) consequently, may output predictions that are inconsistent with the taxonomy in place. This lecture addresses these deficiencies by introducing a novel mask-based output layer for multi-level hierarchical classification. Specifically, it will cover a model-agnostic output layer that embeds the taxonomy and can be combined with any model.
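To make the idea of a taxonomy-aware mask concrete, here is a minimal sketch in plain Python. It is not the speaker's method: the two-level taxonomy, the names PARENTS, LEAVES, LEAF_TO_PARENT, and the function masked_predict are all hypothetical, chosen only to illustrate how masking leaf scores by the predicted parent can rule out predictions that contradict the hierarchy.

```python
import math

# Hypothetical two-level taxonomy (an assumption for illustration only).
PARENTS = ["animal", "vehicle"]
LEAVES = ["cat", "dog", "car"]
LEAF_TO_PARENT = {"cat": "animal", "dog": "animal", "car": "vehicle"}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_predict(parent_logits, leaf_logits):
    """Pick a parent class, then mask out leaves that are not its children."""
    parent_probs = softmax(parent_logits)
    parent_idx = max(range(len(PARENTS)), key=lambda i: parent_probs[i])
    parent = PARENTS[parent_idx]

    # Binary mask derived from the taxonomy: 1 for children of the chosen parent.
    mask = [1 if LEAF_TO_PARENT[leaf] == parent else 0 for leaf in LEAVES]

    # Masked-out leaves get -inf, so they receive zero probability after softmax.
    masked = [logit if keep else float("-inf")
              for logit, keep in zip(leaf_logits, mask)]
    leaf_probs = softmax(masked)
    leaf_idx = max(range(len(LEAVES)), key=lambda i: leaf_probs[i])
    return parent, LEAVES[leaf_idx], leaf_probs


# The raw leaf scores favour "car", but the predicted parent is "animal",
# so the mask restricts the leaf prediction to the children of "animal".
parent, leaf, leaf_probs = masked_predict([2.0, 1.0], [0.5, 0.1, 3.0])
print(parent, leaf)
```

With two disjoint, unconstrained output layers, the same scores would yield the inconsistent pair "animal" and "car". The mask removes that possibility by construction, which is the kind of consistency the abstract refers to.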

Event Info:

23rd December, 2022
9:30 – 10:30 PKT
SEECS, NUST, H-12
Organizer(s): Dr. Faisal Shafait and Dr. Imran Malik

Speaker

Dr. Imran Razzak

Imran Razzak is a Senior Lecturer in Human-Centered Machine Learning and a Postgraduate Research Coordinator at the School of Computer Science and Engineering at the University of New South Wales, Sydney, Australia. Previously, he was a Senior Lecturer in Computer Science at Deakin University, Geelong, Australia. He is an Associate Editor of IEEE TNNLS, IEEE TCSS, and IEEE JBHI. He has attracted research grants of more than 1.5 million AUD. His research focuses on connecting language and vision for better interpretation. It spans three broad areas: Machine Learning, Computer Vision, and Natural Language Processing, with particular emphasis on healthcare.