
Content-Aware Wireless Networks: Fundamental Limits, Algorithms, and Architectures

Periodic Reporting for period 2 - CARENET (Content-Aware Wireless Networks: Fundamental Limits, Algorithms, and Architectures)

Reporting period: 2020-04-01 to 2021-09-30

The ERC Advanced Grant project CARENET focuses on massive on-demand content distribution based on coded caching.
The project has a strong information-theoretic and coding-theoretic focus, and considers the fundamental limits of on-demand data delivery through a communication network, where the data is stored at one or more servers and must be delivered on demand to several users. Since the data is available in advance (e.g., a library of multimedia content files) and the users have locally accessible cache memory, the data can be pre-stored strategically in the network during off-peak times in order to facilitate delivery at peak times.
Unlike conventional content distribution networks, coded caching exploits the power of network coding and has the potential to achieve a much better scaling of the network load with the number of demanding users.

To give a concrete and simple example, consider a system with one wireless server and two users. User 1 demands file A but has cached file B; user 2 demands file B but has file A in its cache. The server can then broadcast the XOR A+B of the two files, satisfying both demands with a single (coded) transmission. Conventional caching without coding, in contrast, would need to transmit both files A and B, doubling the load on the broadcast downlink.
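The two-user example above can be sketched in a few lines of Python (an illustration we wrote for this report, not the project's implementation): the server broadcasts a single XOR, and each user decodes with its cached file.

```python
# Toy sketch of the two-user coded caching example: one coded broadcast
# serves both demands, each user decoding with its cached side information.

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))

# Library of two equal-length files.
file_a = b"AAAAAAAA"
file_b = b"BBBBBBBB"

# Placement (off-peak): user 1 caches B, user 2 caches A.
cache_user1 = file_b
cache_user2 = file_a

# Delivery (peak): user 1 demands A, user 2 demands B.
# A single coded broadcast serves both demands at once.
broadcast = xor_bytes(file_a, file_b)

# Each user XORs the broadcast with its cached file to decode its demand.
decoded_user1 = xor_bytes(broadcast, cache_user1)  # recovers A
decoded_user2 = xor_bytes(broadcast, cache_user2)  # recovers B

assert decoded_user1 == file_a
assert decoded_user2 == file_b
```

Uncoded delivery would transmit 2 files' worth of bits; the coded broadcast transmits 1.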

Of course, this concept applies in full generality to a variety of network topologies; the investigation of the theoretical limits, the practical coding algorithms, and the demonstration and implementation on an actual wireless network are the subjects of the research carried out in CARENET.

The societal impact of this research lies in the improvement of communication networks, and wireless networks in particular, which have proven to be critical infrastructure, especially during the last year and a half of the Covid pandemic. If the economy and society in the EU and worldwide did not come to a complete halt, it is largely because citizens have widespread access to broadband internet, which allowed massive segments of society to move from the physical world (offices, shops, factories) to the virtual online world.
The pandemic should serve as a reminder that efficient and reliable internet access is a critical strategic factor for societal and economic resilience. Improving this critical infrastructure therefore has a clear societal impact on the readiness and resilience of our society.
The main results achieved so far are summarized as follows:

1) We have fully characterized the fundamental limits of Device-to-Device (D2D) coded caching, establishing conditions for the exact optimality of an improved scheme with respect to our previously proposed scheme (which was order-optimal but not exactly optimal) under uncoded prefetching, i.e., under the condition that the users directly store segments of the library files, and not functions thereof.
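To make the D2D setting with uncoded prefetching concrete, the following sketch (our illustration, using the standard subset-indexed placement; not the exact scheme from our papers) runs the case of K = 3 users, N = 3 files, and cache size M = 2 files: each user transmits one coded message to its peers, and all demands are served with a total load of half a file.

```python
# D2D coded caching sketch: K = 3 users, N = 3 files, M = 2 files per cache.
# Each file is split into 3 subfiles indexed by 2-subsets of {1,2,3};
# user i caches every subfile whose index contains i (uncoded prefetching).

from itertools import combinations

def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

users = (1, 2, 3)
files = {"A": b"aaaaaa", "B": b"bbbbbb", "C": b"cccccc"}  # lengths divisible by 6
subsets = list(combinations(users, 2))                    # (1,2), (1,3), (2,3)

def subfile(name, S):
    """Slice of file `name` indexed by 2-subset S; cached by the users in S."""
    idx = subsets.index(S)
    piece = len(files[name]) // 3
    return files[name][idx * piece:(idx + 1) * piece]

def half(data, j):
    """Half j (0 or 1) of a subfile; each holder delivers one half."""
    m = len(data) // 2
    return data[j * m:(j + 1) * m]

demands = {1: "A", 2: "B", 3: "C"}

# Delivery: user i XORs, over its two peers j, the half (assigned to i) of
# the one subfile of file d_j that j is missing -- user i holds it, since
# that subfile's index contains i.
transmissions = {}
for i in users:
    parts = []
    for j in (k for k in users if k != i):
        missing = tuple(k for k in users if k != j)   # the subfile j lacks
        parts.append(half(subfile(demands[j], missing), sorted(missing).index(i)))
    transmissions[i] = xor_bytes(*parts)

# Decoding: user j XORs out of each peer's transmission the part intended
# for the other peer (which j has cached), recovering both missing halves.
for j in users:
    missing = tuple(k for k in users if k != j)
    recovered = []
    for i in sorted(missing):
        other = next(k for k in users if k not in (i, j))
        known = tuple(k for k in users if k != other)  # subfile `other` lacks
        cached = half(subfile(demands[other], known), sorted(known).index(i))
        recovered.append(xor_bytes(transmissions[i], cached))
    assert b"".join(recovered) == subfile(demands[j], missing)
```

The total load is three half-subfiles, i.e., 1/2 of a file, versus 1 full file for uncoded D2D exchange.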

2) We have formulated a new problem of coded caching with privacy, where the privacy of each user's demand must be preserved with respect to the other users in the system. In fact, in the original coded caching schemes, every user could easily determine which files the other users had demanded, which would represent a significant privacy breach in any practical system. In contrast, we have proposed a novel scheme that achieves information-theoretic privacy of the demands and characterized its order-optimality, both for the single-server topology and for the D2D topology. In the latter case, privacy is achieved by using a ``trusted server'' that collects the user demands and sends back encrypted messages to the users, instructing them on what to send to their peers.
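One ingredient of the trusted-server mechanism can be sketched as follows (a toy fragment of ours, with hypothetical instruction strings; it illustrates only the encryption step, not the full private D2D scheme): the server one-time-pad-encrypts each user's delivery instruction with a pad shared with that user alone, so other users observing the ciphertexts learn nothing about the demands.

```python
# Toy sketch: the trusted server encrypts each user's delivery instruction
# with a one-time pad shared only with that user, giving information-
# theoretic secrecy of the instructions (and hence of the demands they
# encode) against the other users. Instruction strings are hypothetical.

import secrets

def otp(msg: bytes, key: bytes) -> bytes:
    """One-time-pad encryption/decryption (XOR); key length must match msg."""
    return bytes(a ^ b for a, b in zip(msg, key))

users = [1, 2]
instructions = {1: b"send coded piece X to peers",
                2: b"send coded piece Y to peers"}

# Each user shares a fresh uniformly random pad with the trusted server.
pads = {u: secrets.token_bytes(len(instructions[u])) for u in users}

# Server -> users: only ciphertexts travel over the shared medium.
ciphertexts = {u: otp(instructions[u], pads[u]) for u in users}

# Each user decrypts its own instruction with its own pad; without the pad,
# a ciphertext is uniformly random, so every equal-length plaintext is
# equally likely (perfect secrecy of the one-time pad).
assert otp(ciphertexts[1], pads[1]) == instructions[1]
assert otp(ciphertexts[2], pads[2]) == instructions[2]
```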

3) We have significantly expanded the set of network topologies for which coded caching has been studied, including novel ``Fog'' topologies with partially shared caches.

4) We have considered the implementation of coded caching over practical ``routing networks'', which serves as a guideline for implementing coded caching at the application layer, i.e., ``above IP''. In this work, we have addressed practically all the open problems listed in a previous publication as stumbling blocks for the practical adoption of coded caching. In particular, these were: the inability of intermediate nodes, such as wireless routers and base stations, to store cached content; the fact that coded caching should run at the server-client level, operated by some third-party content distribution entity, while the core network and the wireless access are fixed by some protocol (e.g., 5G) and run by another entity (a wireless network operator); the problem of subpacketization, which must be limited for finite-length files; the problem of decentralized operation, whereby each user should be able to cache independently of the other users; and the problem of asynchronous streaming sessions, where each user starts and stops at arbitrary times.
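The subpacketization problem mentioned above is easy to quantify: in the standard centralized scheme, each file is split into C(K, t) subfiles with t = K*M/N, which grows exponentially in the number of users K. The following short sketch (our illustration) tabulates the count and shows why it must be limited for finite-length files.

```python
# Subpacketization in the standard centralized coded caching scheme:
# each file is split into C(K, t) subfiles, t = K*M/N. For a finite-length
# file this count quickly exceeds any realistic number of packets.

from math import comb

def subpacketization(K: int, M: int, N: int) -> int:
    """Subfiles per file in the centralized scheme (assumes t = K*M/N integer)."""
    t = K * M // N
    return comb(K, t)

# Cache fraction M/N = 1/2 for the illustration.
for K in (8, 16, 32, 64):
    print(K, subpacketization(K, 1, 2))
# K = 8 already needs 70 subfiles per file; K = 16 needs 12870.
```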

5) We have started to build our testbed demonstration at TU Berlin, in collaboration with CADAMI, a startup company based in Munich. We plan to keep collaborating with CADAMI beyond the duration of the ERC project.
This collaboration will bring the theoretical ideas developed in the project to practice, and represents a potential for technology transfer.

6) We have extended the coded caching scenarios from just file delivery to computation, i.e., the delivery of functions of the data files. For example, we have determined the fundamental information-theoretic limits of coded caching for linear function computation, where each user demands an arbitrary linear combination of the data files.
We have also considered the problem of data shuffling, where users must exchange data such that, at the end of the exchange, each user holds a different data block according to an assigned permutation. This scheme can be useful in distributed implementations of deep learning, where the training of a large network is performed in a distributed fashion, and workers (computation nodes) exchange data blocks in the form of ``mini-batches'' for stochastic gradient computation.
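The coded shuffling idea can be sketched in its simplest form (our toy illustration, not the general scheme): each worker's currently held mini-batch acts as side information, so a single coded broadcast from the master realizes a two-worker swap that would otherwise require two unicast transfers.

```python
# Toy coded data shuffling: each worker already holds its current
# mini-batch, which serves as side information. One XOR broadcast from
# the master swaps the two batches for the next epoch.

def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

batch_1 = b"minibatch-one!!!"   # currently processed by worker 1
batch_2 = b"minibatch-two!!!"   # currently processed by worker 2
cache = {1: batch_1, 2: batch_2}

# Next-epoch permutation: worker 1 -> batch_2, worker 2 -> batch_1.
# One coded broadcast replaces two uncoded unicast transfers.
broadcast = xor_bytes(batch_1, batch_2)
next_batch = {w: xor_bytes(broadcast, cache[w]) for w in (1, 2)}

assert next_batch[1] == batch_2
assert next_batch[2] == batch_1
```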
All the achievements stated above are beyond the state of the art, where by ``state of the art'' we mean the status of theoretical research at the starting date of the CARENET project, i.e., 01/10/2018.

In our personal view, the main advances are:

1) the formulation and solution of the private coded caching problem, which has already generated many tens of papers from other groups and raised enormous interest in the network coding and information theory community.

2) the practical solution of all stumbling blocks mentioned in the paper:

G. Paschos, E. Bastug, I. Land, G. Caire, and M. Debbah, ``Wireless caching: Technical misconceptions and business barriers,'' IEEE Communications Magazine, vol. 54, no. 8, pp. 16-22, Aug. 2016,

which we have solved in the paper

M. Bayat, K. Wan, and G. Caire, ``Coded Caching over Multicast Routing Networks,'' IEEE Transactions on Communications, early access (available on IEEE Xplore), 2021.

3) the testbed that we are building at TU Berlin in collaboration with CADAMI, which will demonstrate unprecedented efficiency of on-demand data delivery over a WLAN with multiple access points.
Figure: Concept of coded caching over a wireless routing network (above IP, and transparent to the network).
Figure: Connectivity and interference graph representing the network, used for optimization of the delivery phase.