PART 1: Efficient Semantic Segmentation and Application for Scene Understanding
PART 2: Learning a Deformable Part-based Model in the Virtual World and Adapting it to the Real World
PART 1: Sebastian Ramos
PART 2: Jiaolong Xu
NICTA SML SEMINAR
DATE: 2013-11-28
TIME: 16:00:00 - 17:00:00
LOCATION: NICTA - 7 London Circuit
ABSTRACT:
PART 1: State-of-the-art semantic segmentation pipelines often contain conditional random fields (CRFs), in which inference is performed by maximum a posteriori (MAP) algorithms that optimize an energy function given all the potentials. In the first part of this talk, we will focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than that of MAP inference, as is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We will describe a novel technique for addressing this problem through Active MAP inference, achieving similar levels of accuracy but with major efficiency gains. In the second part of the talk, we will show an application of semantic segmentation to scene understanding in the context of urban scenarios and autonomous vehicles. We will show results of this approach, which is capable of producing dense, semantic 3D maps that are usable in real time.
PART 2: Detecting pedestrians with on-board vision systems is of paramount interest for assisting drivers in preventing vehicle-to-pedestrian accidents. During the last few years, the so-called (deformable) part-based classifiers, including those with multi-view modeling, have usually been top ranked in accuracy. Training such classifiers is not trivial, since proper view clustering and spatial part alignment of the pedestrian training samples are crucial for obtaining an accurate classifier. In this talk, we first introduce our work on learning a deformable part-based model (DPM) with virtual-world data. We showcase how automatic view clustering and part alignment can be performed using virtual-world pedestrians, i.e., human annotations are not required. Based on this, we discuss how part-level supervision can be used to improve the accuracy of the DPM and, furthermore, how a mixture-of-parts model can be learned to achieve even better accuracy. As is very common in machine learning and computer vision, the accuracy of a classifier can drop significantly when the training data (source domain) and the application scenario (target domain) have inherent differences. This can also happen to a virtual-world trained classifier. Therefore, adapting virtual-world trained classifiers to the specific real-world scenario in which the detectors must operate is of paramount importance. In the second part of this talk, we will present several domain adaptation (DA) methods for adapting a virtual-world trained classifier to the real world, including model-transform-based, feature-transform-based, and ensemble methods. Most of them focus on adapting the state-of-the-art deformable part-based model.
BIO:
PART 1: Sebastian Ramos is a first-year PhD student under the supervision of Prof. Antonio M. López at the Computer Vision Center (CVC) in Barcelona and an active member of the Advanced Driver Assistance Systems group at the CVC.
PART 2: Jiaolong Xu is a second-year PhD student under the supervision of Prof. Antonio M. López at the Computer Vision Center (CVC) in Barcelona and an active member of the Advanced Driver Assistance Systems group at the CVC.





