Periodic Reporting for period 2 - RED (Robust, Explainable Deep Networks in Computer Vision)
Reporting period: 2022-03-01 to 2023-08-31
In this project, we aim to significantly advance deep neural networks in computer vision toward improved robustness and explainability. To that end, we will investigate structured network architectures, probabilistic methods, explainable AI techniques, and hybrid generative/discriminative models. This is accompanied by research on how to assess robustness and aspects of explainability via appropriate datasets and metrics. While we aim to develop a toolbox that is as independent of specific image and video analysis tasks as possible, the work program is grounded in concrete vision problems, e.g., scene understanding and motion estimation, to monitor progress. We expect the project to have significant impact in applications of computer vision where robustness is key, data is limited, and user trust is paramount.
Second, we have advanced the explainability of DNNs in computer vision in various ways. On the one hand, the novel structured neural architectures discussed above improve the inherent explainability of these models. On the other hand, we have developed practical tools that estimate uncertainties in DNNs highly efficiently, allowing us both to quantify the uncertainty of a prediction and to better understand the inner workings of existing neural architectures. Additionally, we have developed highly practical approaches for obtaining post-hoc explanations from deep neural networks; these not only help us better understand existing networks but also make it possible to train new networks that suppress undesirable behavior, such as predictions that are poorly aligned with human values or human comprehension.
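To make the uncertainty estimation concrete, below is a minimal sketch of one standard technique for quantifying predictive uncertainty, Monte Carlo dropout. The report does not specify which methods the project actually uses, so the function name `mc_dropout_uncertainty`, the sample count, and the PyTorch classifier interface are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def mc_dropout_uncertainty(model, x, n_samples=20):
    """Estimate predictive uncertainty via Monte Carlo dropout:
    run several stochastic forward passes with dropout enabled and
    summarize the spread of the resulting class probabilities.
    (Illustrative sketch; not the project's actual method.)"""
    model.train()  # keep dropout layers active at inference time
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    mean_probs = probs.mean(dim=0)  # averaged prediction over samples
    # Predictive entropy: higher values signal higher uncertainty.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy
```

A high predictive entropy flags inputs on which the dropout samples disagree, which is one simple way to surface predictions that should not be trusted blindly.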
Third, we have worked on comprehensive benchmarking methodologies and created novel datasets that make it possible to quantitatively assess the quality of explainable AI algorithms, as well as of certain inherently explainable DNNs, in computer vision.
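As an illustration of what such a quantitative assessment can look like, the following is a minimal sketch of a deletion-style faithfulness metric, a common way to score attribution maps. The report does not name the project's actual benchmarks or metrics, so this example and the helper `deletion_score` are assumptions for illustration.

```python
import torch

def deletion_score(model, image, attribution, target_class, steps=20):
    """Deletion-style faithfulness metric: progressively zero out the
    pixels an attribution map ranks as most important and record how
    quickly the target-class probability drops. A faster drop (lower
    area under the curve) indicates a more faithful explanation.
    (Illustrative sketch; not the project's actual benchmark.)"""
    b, c, h, w = image.shape  # expects a single image, b == 1
    order = attribution.flatten().argsort(descending=True)  # most important first
    per_step = max(1, order.numel() // steps)
    masked = image.clone()
    scores = []
    for i in range(steps):
        idx = order[i * per_step:(i + 1) * per_step]
        masked.view(b, c, h * w)[..., idx] = 0.0  # delete the next pixel batch
        with torch.no_grad():
            prob = torch.softmax(model(masked), dim=-1)[0, target_class]
        scores.append(prob.item())
    return sum(scores) / len(scores)  # mean score approximates the AUC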
In the direction of technical foundations, we have, for example, developed algorithms that estimate the uncertainty both of the predictions and of the “inner workings” of a deep neural network in highly practical ways. We have also devised highly efficient algorithms for computing so-called feature attributions, visual maps that highlight for a human user which parts of the input image were primarily responsible for a particular prediction of a deep neural network. Together, these advances make estimating uncertainties and feature attributions much more practical than before, shedding significantly more light on otherwise opaque deep neural networks across a variety of image and video analysis tasks.
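To illustrate what a feature attribution is, here is a minimal sketch of a simple gradient-based saliency map in PyTorch. The project's own, more efficient attribution algorithms are not detailed in the report, so this standard baseline (and the hypothetical helper `gradient_saliency`) stands in purely as an example.

```python
import torch

def gradient_saliency(model, image, target_class):
    """Vanilla gradient saliency: the magnitude of the gradient of the
    target logit w.r.t. the input highlights the pixels that most
    influence the prediction. (Illustrative baseline, not the project's
    own attribution algorithm.)"""
    model.eval()
    image = image.clone().requires_grad_(True)  # shape (1, C, H, W)
    logits = model(image)
    logits[0, target_class].backward()
    # Collapse channel-wise gradient magnitudes into one heat map.
    saliency = image.grad.abs().max(dim=1).values  # shape (1, H, W)
    return saliency / saliency.max().clamp_min(1e-12)  # normalize to [0, 1]
```

The resulting map can be overlaid on the input image so that a human user sees at a glance which regions drove the network's decision.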
In the remainder of the project, we will specifically address more complex scene analysis scenarios, such as joint reconstruction and semantic analysis of 3D scenes. Moreover, we will leverage our previous work on uncertainty estimation in deep neural networks to reduce their dependence on large amounts of training data, for example by improving the effectiveness of transfer learning. Another direction will be to significantly expand the scope of our efforts toward an in-depth understanding of explainable AI techniques in computer vision. In a similar vein, we will investigate how explainable AI techniques can be applied to much larger families of computer vision models, with the expectation of helping to significantly increase user trust in critical applications of computer vision.