Cross-modal Retrieval: Retrieval Across Different Content Modalities

Nikhil Rasiwasia (Yahoo Labs!)

NICTA SML SEMINAR

DATE: 2013-12-04
TIME: 12:00:00 - 13:00:00
LOCATION: NICTA - 7 London Circuit

ABSTRACT:
Multimedia data such as images, web pages, videos, and music are now available in abundance. This increasing availability demands novel representations that address the unique challenges posed by multimedia content. The primary challenge is the heterogeneous nature of the content, that is, data with multiple information modalities: web pages contain both images and text, videos contain both images and audio, songs have associated lyrics, and so on. In almost all of these situations, different representations are adopted for different modalities, which makes it nearly impossible to operate across them using traditional retrieval approaches. In this talk, the problem of cross-modal retrieval from multimedia repositories is considered. This problem addresses the design of retrieval systems that support queries across content modalities, e.g., using an image to search for texts. Two hypotheses are investigated. The first is that low-level cross-modal correlations should be accounted for. The second is that the joint space should enable semantic abstraction. Three new solutions to the cross-modal retrieval problem are derived from these hypotheses: correlation matching (CM), an unsupervised method that models cross-modal correlations; semantic matching (SM), a supervised technique that relies on semantic representation; and semantic correlation matching (SCM), which combines both. It is concluded that both hypotheses hold, in a complementary form, although the evidence in favor of the abstraction hypothesis is stronger than that for correlation.

BIO:
Nikhil Rasiwasia received the B.Tech degree in electrical engineering from the Indian Institute of Technology Kanpur (India) in 2005. He received the MS and PhD degrees from the University of California, San Diego in 2007 and 2011, respectively, where he was a graduate student researcher at the Statistical Visual Computing Laboratory in the ECE department. He currently works as a scientist at Yahoo Labs! Bangalore, India. In 2008, he was recognized as an "Emerging Leader in Multimedia" by IBM T. J. Watson Research. He also received the best student paper award at the ACM Multimedia conference in 2010. His research interests are in the areas of computer vision and machine learning, in particular applying machine learning solutions to computer vision problems.

Updated: 3 December 2013