## Periodic Reporting for period 1 - HOMOVIS (High-level Prior Models for Computer Vision)

**Reporting period:** 2015-06-01 to 2016-11-30

## Summary of the context and overall objectives of the project

In recent years, computer vision as a technology has found its way into smartphones, autonomous cars, medical image processing, industrial inspection and surveillance tasks. Due to the ever-increasing amount of available visual data, computer vision is expected to play an even more important role in society.

The space of "natural" images is extremely large. For example, considering only images of $64\times 64$ pixels with $256$ different gray values, one can already generate $256^{4096} \approx 10^{9864}$ different images, a number far larger than the (estimated) number of atoms in the universe. In this space, natural (or plausible) images lie only on a thin manifold of much smaller dimension. However, the structure of this manifold turns out to be extremely complicated, because it has to reflect many different transformations of the images, such as translations, rotations and brightness changes. A major aim of computer vision and image processing is to develop mathematical models that describe these manifolds, for example for image reconstruction, image classification or stereo. While most existing mathematical models are limited to local pixel interactions and hence only have a sense of edges, the aim of this research project is to develop high-level prior models that should eventually be able to describe the manifolds of objects such as "faces", "dogs" or "cars".
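As a quick check of the count above: the number of images is $256^{64\cdot 64}$, so its base-10 exponent is $4096\,\log_{10} 256$. The short Python sketch below evaluates this exponent; the helper name `log10_image_count` is ours and purely illustrative.

```python
import math


def log10_image_count(width: int, height: int, gray_levels: int) -> float:
    """Return log10 of the number of distinct width x height images."""
    # Each of the width*height pixels independently takes one of gray_levels values.
    return width * height * math.log10(gray_levels)


exponent = log10_image_count(64, 64, 256)
print(f"number of 64x64 images with 256 gray values: about 10^{exponent:.0f}")
```

The exponent comes out at about 9864, compared with an exponent of roughly 80 for commonly quoted estimates of the number of atoms in the observable universe.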

Inspired by the findings of the Nobel prize winners Hubel and Wiesel on the structure of the visual cortex, we propose in this project to represent images in the so-called roto-translation space, which decomposes the local image gradient into its magnitude and its orientation. The roto-translation space serves as a (simplified) mathematical model of the organization of the cells in the visual cortex. A major advantage of the representation in the roto-translation space is that one can easily get a sense of curvature, and hence of the continuity of object boundaries, which is known to be a very strong prior of the human visual system.
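To make the decomposition into magnitude and orientation concrete, here is a minimal sketch of one possible discrete lifting of a gray image into an $(x, y, \theta)$ volume. It is not the project's implementation: the function name, the number of orientation bins, the use of finite differences, and the choice to take orientations modulo $\pi$ are all assumptions made for illustration.

```python
import numpy as np


def lift_to_roto_translation(image: np.ndarray, n_orientations: int = 8) -> np.ndarray:
    """Lift a 2-D gray image to a (height, width, n_orientations) volume.

    The gradient magnitude of each pixel is stored in the bin of its gradient
    orientation; all other orientation bins of that pixel stay zero.
    """
    gy, gx = np.gradient(image.astype(float))  # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    # Assumption: orientations are taken modulo pi, so opposite gradient
    # directions fall into the same bin.
    theta = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.floor(theta / np.pi * n_orientations).astype(int) % n_orientations
    lifted = np.zeros(image.shape + (n_orientations,))
    rows, cols = np.indices(image.shape)
    lifted[rows, cols, bins] = magnitude
    return lifted


# Hypothetical test image: a vertical step edge.
step = np.zeros((8, 8))
step[:, 4:] = 1.0
lifted = lift_to_roto_translation(step)
```

For the step edge, all gradient energy lands in a single orientation slice, and summing the volume over the orientation axis recovers the gradient magnitude image. Curves in the image thus become curves in the lifted volume, where a change of slice along the curve corresponds to curvature.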

The ultimate goal of the project is to develop high-level prior models in the roto-translation space that can go beyond the notion of continuity of object boundaries and hence provide a better understanding of the structure of natural images.

## Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

In this section, we give a short chronological list of the work performed since the start of the project:

* Together with A. Chambolle, we were invited to write a review paper for the journal Acta Numerica, which was a great honor for us. The paper gives a broad overview of continuous optimization approaches for image processing and computer vision. It was published at the beginning of 2016.

* In the paper "Total Variation on a Tree", we have worked on efficient solvers for total variation minimization. The idea is to combine fast non-iterative solvers based on dynamic programming on tree-like graphs (e.g. chains) with continuous first-order primal-dual algorithms. The resulting algorithms appear to be extremely efficient and can be applied to a variety of convex and non-convex total-variation-based imaging problems.

* In the paper "Inertial Proximal Alternating Linearized Minimization (iPALM) for Nonconvex and Nonsmooth Problems", we have developed an inertial algorithm for non-smooth and non-convex optimization. The algorithm is particularly suited to dictionary learning problems, so we expect it to become important when learning high-level prior models in the roto-translation space.

* In the paper "Acceleration of PDHG on partially strongly convex functions", we have developed a principle to accelerate first-order primal-dual algorithms for partially strongly convex functions. The work provides a theoretical explanation of why existing primal-dual algorithms often perform much better on partially strongly convex functions than their theoretical worst-case complexity suggests.

## Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

In continuous optimization, we have proposed several algorithms that go significantly beyond the state of the art.

For total variation minimization, our proposed hybrid algorithms based on dynamic programming and continuous optimization appear to be the fastest algorithms currently available for such problems. They are of particular interest when only approximate solutions are needed, because they deliver good solutions after a very small number of iterations.
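The dynamic-programming ingredient can be illustrated on a single chain. The sketch below is a deliberately simplified stand-in, not the solver from the paper: it restricts the unknowns to a finite set of labels (an assumption made here) and minimizes $\tfrac12\sum_i (u_i - f_i)^2 + \lambda\sum_i |u_{i+1} - u_i|$ exactly over that set with a Viterbi-style recursion, in one forward and one backward pass. The names `tv_chain_dp` and `tv_energy` are hypothetical.

```python
import numpy as np


def tv_energy(u, f, lam):
    """Chain energy: quadratic data term plus lam times total variation."""
    return 0.5 * np.sum((u - f) ** 2) + lam * np.sum(np.abs(np.diff(u)))


def tv_chain_dp(f, lam, labels):
    """Minimize tv_energy over u with every u_i restricted to `labels`."""
    f = np.asarray(f, dtype=float)
    labels = np.asarray(labels, dtype=float)
    unary = 0.5 * (labels[None, :] - f[:, None]) ** 2            # (n, K) data costs
    pairwise = lam * np.abs(labels[:, None] - labels[None, :])   # (K, K) jump costs
    cost = unary[0].copy()                    # best cost of a prefix ending in each label
    back = np.zeros((len(f), len(labels)), dtype=int)
    for i in range(1, len(f)):
        total = cost[:, None] + pairwise      # total[j, k]: end in label j, then move to k
        back[i] = np.argmin(total, axis=0)    # best predecessor for each label k
        cost = total.min(axis=0) + unary[i]
    # Backtrack from the cheapest final label.
    idx = np.empty(len(f), dtype=int)
    idx[-1] = np.argmin(cost)
    for i in range(len(f) - 1, 0, -1):
        idx[i - 1] = back[i, idx[i]]
    return labels[idx]


# Hypothetical noisy step signal and label grid.
f_chain = np.array([0.1, -0.1, 0.0, 1.1, 0.9, 1.0])
label_grid = np.linspace(-0.5, 1.5, 41)
u_chain = tv_chain_dp(f_chain, 0.2, label_grid)
```

The recursion costs $O(nK^2)$ for $n$ nodes and $K$ labels and needs no iterations on the chain itself. In a hybrid scheme of the kind described above, such chain solves would be coupled by an outer continuous algorithm; that coupling is not shown here.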

For non-smooth and non-convex optimization, we have proposed an inertial variant of the proximal alternating linearized minimization (PALM) method. We have proven convergence of the sequence of iterates for semi-algebraic functions. The inertial PALM algorithm (iPALM) appears to be significantly faster than the original PALM algorithm and shows a certain ability to overcome spurious stationary points.
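The structure of an inertial PALM-type iteration can be sketched on a small non-negative matrix factorization problem, $\min_{U\ge 0,\,V\ge 0} \tfrac12\|A - UV\|_F^2$. Each block is extrapolated using its previous iterate, takes a gradient step on the smooth coupling term, and is then projected (the proximal map of the non-negativity constraint). The test problem, the function name `ipalm_nmf`, the constant inertia parameters and the plain $1/L$ step size are assumptions made here; the step-size rules that the convergence proof requires are not reproduced.

```python
import numpy as np


def ipalm_nmf(A, rank, n_iter=200, alpha=0.2, beta=0.2, seed=0):
    """Inertial PALM-style sketch for min 0.5*||A - U V||_F^2 with U, V >= 0."""
    rng = np.random.default_rng(seed)
    U = rng.random((A.shape[0], rank))
    V = rng.random((rank, A.shape[1]))
    U_old, V_old = U.copy(), V.copy()
    history = [0.5 * np.linalg.norm(A - U @ V) ** 2]
    for _ in range(n_iter):
        # Block U: two extrapolated points, one for the prox centre (y),
        # one for the gradient evaluation (z).
        y = U + alpha * (U - U_old)
        z = U + beta * (U - U_old)
        L_u = max(np.linalg.norm(V @ V.T, 2), 1e-12)   # Lipschitz constant of grad_U
        U_old, U = U, np.maximum(y - (z @ V - A) @ V.T / L_u, 0.0)
        # Block V: same pattern, using the freshly updated U.
        y = V + alpha * (V - V_old)
        z = V + beta * (V - V_old)
        L_v = max(np.linalg.norm(U.T @ U, 2), 1e-12)   # Lipschitz constant of grad_V
        V_old, V = V, np.maximum(y - U.T @ (U @ z - A) / L_v, 0.0)
        history.append(0.5 * np.linalg.norm(A - U @ V) ** 2)
    return U, V, history


# Hypothetical data: an exactly rank-3 non-negative matrix.
rng_demo = np.random.default_rng(1)
A_demo = rng_demo.random((20, 3)) @ rng_demo.random((3, 15))
U_fit, V_fit, objective_history = ipalm_nmf(A_demo, rank=3)
```

Setting `alpha = beta = 0` removes the inertia and leaves a plain PALM-style iteration, which makes it easy to compare the two on the recorded objective values.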

It has been known for some time that first-order primal-dual algorithms perform significantly better than their theoretical worst-case performance on problems that are only partially strongly convex in either the primal or the dual variable. Examples include total generalized variation minimization and image reconstruction involving a linear operator, e.g. image deconvolution or MRI reconstruction. We have shown that first-order primal-dual algorithms can be partially accelerated, so that fast convergence is obtained at least for the block of variables corresponding to the strongly convex part of the problem.
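For orientation, the sketch below shows the well-known accelerated primal-dual iteration for the simpler case in which the whole primal objective is strongly convex, applied to 1-D total variation denoising, $\min_u \tfrac12\|u - f\|^2 + \lambda\|Du\|_1$. It is not the partially accelerated scheme described above, which applies the acceleration only to the strongly convex block. The function names, the test signal and the initial step sizes are assumptions made for illustration.

```python
import numpy as np


def rof_energy(u, f, lam):
    """Primal objective 0.5*||u - f||^2 + lam*||Du||_1 on a 1-D signal."""
    return 0.5 * np.sum((u - f) ** 2) + lam * np.sum(np.abs(np.diff(u)))


def accelerated_pdhg_rof_1d(f, lam, n_iter=300):
    """Accelerated primal-dual iteration for 1-D TV denoising."""
    f = np.asarray(f, dtype=float)
    u = f.copy()
    u_bar = u.copy()
    p = np.zeros(len(f) - 1)       # dual variable, one entry per finite difference
    gamma = 1.0                    # strong convexity modulus of 0.5*||u - f||^2
    tau = sigma = 0.5              # tau*sigma*||D||^2 <= 1 because ||D||^2 <= 4 in 1-D
    for _ in range(n_iter):
        # Dual ascent step followed by projection onto [-lam, lam].
        p = np.clip(p + sigma * np.diff(u_bar), -lam, lam)
        # Primal descent step; the prox of the quadratic data term is explicit.
        Dt_p = -np.diff(p, prepend=0.0, append=0.0)    # adjoint D^T applied to p
        u_new = (u - tau * Dt_p + tau * f) / (1.0 + tau)
        # Step-size update that exploits strong convexity.
        theta = 1.0 / np.sqrt(1.0 + 2.0 * gamma * tau)
        tau, sigma = theta * tau, sigma / theta
        u_bar = u_new + theta * (u_new - u)
        u = u_new
    return u


# Hypothetical noisy step signal.
rng_sig = np.random.default_rng(0)
f_noisy = np.concatenate([np.zeros(20), np.ones(20)]) + 0.1 * rng_sig.standard_normal(40)
u_denoised = accelerated_pdhg_rof_1d(f_noisy, lam=0.5)
```

The varying step sizes are what distinguish this from the basic iteration with fixed `tau`, `sigma` and `theta = 1`: the primal step shrinks while the dual step grows, with their product held constant.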
