IPI team seminar – Speakers: Nina S. T. Hirata (University of São Paulo) and Zhaohui Che
July 2, 2019 @ 2:00 pm - 4:00 pm
The next IPI seminar will take place on Tuesday, July 2, 2019 at 2 pm, in room D005 on the Polytech site.
The team will host two speakers:
- Nina S. T. Hirata: associate professor in the Computer Science Department of the Institute of Mathematics and Statistics at the University of São Paulo.
- Zhaohui Che: PhD student at the Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, currently hosted by the IPI team.
Presentation 1:
Title: Spatial and Hierarchical relations in image understanding
Abstract: Understanding image content is a challenging task that requires identifying image components and their spatial and hierarchical relationships. We start by presenting some image decomposition examples and reviewing existing approaches used to model spatial and/or hierarchical relationships. Our ultimate aim is to develop an end-to-end approach that transforms an input image into its structural decomposition, containing information on its constituent components as well as the spatial and hierarchical relationships among them. We describe some ongoing efforts, the current status, and the expected results.
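For illustration only (this is not the speaker's formalism), such a structural decomposition could be represented as a simple tree of labelled components, each carrying a bounding box, its sub-components, and spatial relations to its siblings:

from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical representation of a structural decomposition: a tree whose
# nodes are image components with spatial relations to sibling components.
@dataclass
class Component:
    label: str                          # e.g. "x_axis", "legend", "plot_area"
    bbox: Tuple[int, int, int, int]     # (x, y, width, height) in pixels
    children: List["Component"] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)  # e.g. ("left_of", "plot_area")

# Toy decomposition of a chart image into its constituent parts.
chart = Component("chart", (0, 0, 640, 480), children=[
    Component("y_axis", (10, 20, 30, 420), relations=[("left_of", "plot_area")]),
    Component("x_axis", (40, 440, 560, 20), relations=[("below", "plot_area")]),
    Component("plot_area", (40, 20, 560, 420)),
])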
Short bio: Nina S. T. Hirata holds a PhD degree in Computer Science from the University of São Paulo, Brazil. She is currently an associate professor in the Computer Science Department of the Institute of Mathematics and Statistics at the same university. Her main research interest is the development and application of machine learning-based techniques for image analysis and understanding.
Presentation 2:
Title: The robustness of deep saliency models
Abstract: Currently, a plethora of saliency models based on deep neural networks have led to great breakthroughs in many complex high-level vision tasks. The robustness of these models, however, has not yet been studied. In this talk, we explore the robustness of deep saliency models when subjected to two types of perturbations: random transformations and deliberate perturbations.
On the one hand, most current studies on human attention and saliency modeling have used high-quality, stereotyped stimuli. In the real world, however, captured images undergo various types of transformations. Can we use these transformations to augment existing saliency datasets? Here, we first create a novel saliency dataset including fixations of 10 observers over 1900 images degraded by 19 types of transformations. Second, by analyzing eye movements, we find that observers look at different locations over transformed versus original images. Third, we use the new data over transformed images, called data augmentation transformations (DATs), to train deep saliency models. We find that label-preserving DATs with negligible impact on human gaze boost saliency prediction, whereas other DATs that severely impact human gaze degrade performance. These label-preserving, valid augmentation transformations provide a way to enlarge existing saliency datasets.
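As a rough sketch of the augmentation idea (file names and transformation choices below are hypothetical, not the dataset's actual 19 transformation types): a label-preserving photometric change can reuse the original fixation map as ground truth, whereas a geometric change must be applied consistently to both the image and its fixation map.

from PIL import Image, ImageEnhance, ImageOps

def augment_pair(image_path, fixmap_path, out_prefix):
    img = Image.open(image_path).convert("RGB")
    fix = Image.open(fixmap_path).convert("L")

    # Photometric change assumed to be label-preserving: the original
    # fixation map is reused as the label.
    ImageEnhance.Brightness(img).enhance(1.3).save(f"{out_prefix}_bright.png")
    fix.save(f"{out_prefix}_bright_fix.png")

    # Geometric change: image and fixation map are transformed together,
    # otherwise the label would no longer match the stimulus.
    ImageOps.mirror(img).save(f"{out_prefix}_flip.png")
    ImageOps.mirror(fix).save(f"{out_prefix}_flip_fix.png")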
On the other hand, we propose a sparse feature-space adversarial attack method against deep saliency models for the first time. The proposed attack only requires part of the model information, and is able to generate a sparser and more insidious adversarial perturbation compared to traditional image-space attacks. These adversarial perturbations are so subtle that a human observer cannot notice their presence, yet the model outputs are drastically altered. This phenomenon raises security threats to deep saliency models in practical applications. We also explore some intriguing properties of the feature-space attack, e.g. 1) hidden layers with larger receptive fields generate sparser perturbations, 2) deeper hidden layers achieve higher attack success rates, and 3) different loss functions and different attacked layers result in diverse perturbations.
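A minimal sketch of what such a feature-space attack could look like, assuming a differentiable PyTorch saliency model and one of its hidden layers (both placeholders, not the speaker's implementation): the perturbation is optimized to push the chosen layer's activations away from their clean values while an L1 term keeps it sparse.

import torch
import torch.nn.functional as F

def feature_space_attack(model, layer, image, steps=100, lr=0.01, sparsity=1e-3):
    feats = {}
    # Capture the chosen hidden layer's activations on every forward pass.
    handle = layer.register_forward_hook(
        lambda module, inputs, output: feats.__setitem__("cur", output))

    with torch.no_grad():
        model(image)                     # activations on the clean image
    clean = feats["cur"].detach()

    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        model((image + delta).clamp(0, 1))
        # Maximize the distortion of the hidden features while keeping the
        # image-space perturbation sparse (L1 penalty).
        loss = -F.mse_loss(feats["cur"], clean) + sparsity * delta.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()

    handle.remove()
    return (image + delta).clamp(0, 1).detach()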
We also propose a novel saliency model based on a generative adversarial network (dubbed GazeGAN), which achieves the best performance in terms of popular saliency evaluation metrics and is more robust to various perturbations.
Short bio: Zhaohui Che received the B.E. degree from the School of Electronic Engineering, Xidian University, Xi’an, China, in 2015. He is currently working toward the Ph.D. degree at the Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China. His research interests include image saliency detection, image quality assessment, deep learning, and deep model robustness.