Final Activity Report Summary - FILD (Providing Real-time Feedback on Internet Packet Loss and Delay)
During the second period of the project, we continued along three directions:
(1) We improved AudIt: As originally designed, AudIt had one disadvantage: it required that each ISP maintain per-flow state. Under normal circumstances, an ISP should have the resources to maintain and process such state. However, under extraordinary circumstances (e.g. denial-of-service attacks that generate an unusual rate of new flows), maintaining per-flow state may be too expensive. Hence, we designed an improved version of AudIt, which allows each participating ISP to arbitrarily reduce the amount of state it maintains at the cost of also reducing the quality of the feedback it provides. Our main result was to show that (like the original AudIt) our improved version prevents ISPs from lying about their performance, yet (unlike the original AudIt) it allows each ISP to choose its own trade-off between the resources it uses and the quality of the feedback it provides.
(2) We investigated deployment options: The main criticism of AudIt that we received from the academic community is that it requires adding (modest) functionality to ISP network equipment, in particular to border routers. Unfortunately, the data plane of modern high-speed routers is typically built in hardware, which makes it very hard to add any functionality to it, however modest. This motivated us to develop Routebricks, a router that achieves performance comparable to that of hardware routers, yet is built entirely in software and hence is fully programmable. If ISPs used such software routers, then deploying AudIt (or any new protocol) would be significantly easier. Routebricks is essentially a cluster of commodity servers running Linux and Click. We implemented a prototype that achieves an aggregate throughput of 35 Gbps. Routebricks was developed in collaboration with Intel Research, Berkeley. Intel has shown considerable interest in our work, as software routers of this kind would enable it to enter the high-speed router market.
(3) We started to explore an alternative approach: In AudIt, ISPs voluntarily report their performance to end-systems. The alternative is to enable end-systems to collaboratively infer ISP performance by performing end-to-end measurements. This approach is called 'network tomography' and has already been thoroughly studied by other researchers, but at a theoretical level (to our knowledge, there exists no real tomography-based system that measures Internet performance). Our contribution was to study tomography from a practical point of view. First, we proved limits on what tomography can achieve in real networks, i.e., the maximum number of failures it can detect. Second, we started to develop Netscope, an actual tomography-based tool that runs on PlanetLab nodes and infers the loss rates of Internet links. Netscope is being developed in collaboration with Swisscom, the biggest ISP in Switzerland, which has expressed interest in using it to monitor the status of its neighbouring networks so that it can choose good routes for its customers.
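
To make the idea of inferring link loss rates from end-to-end measurements concrete, the following is a minimal sketch of the textbook formulation of loss tomography. It is not Netscope's algorithm. It assumes that links drop packets independently, so that the delivery rate of a path is the product of the delivery rates of its links; taking logarithms turns this into a linear system. The function names, the three-link topology, and the loss values are all hypothetical and chosen only for illustration.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination.

    Raises ValueError when the system is singular, i.e. when the chosen
    paths do not determine every link uniquely.
    """
    n = len(b)
    # Augmented matrix [a | b]; rows are copied so the inputs stay untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("link loss rates are not identifiable from these paths")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def infer_link_loss(routing, path_loss):
    """Infer per-link loss rates from per-path loss rates.

    routing[p][l] is 1 if path p traverses link l, else 0 (hypothetical input).
    Assumes independent link losses: log(path delivery) = sum of log(link delivery).
    """
    log_path_delivery = [math.log(1.0 - loss) for loss in path_loss]
    log_link_delivery = solve_linear(routing, log_path_delivery)
    return [1.0 - math.exp(v) for v in log_link_delivery]


# Hypothetical topology: three links, three measured paths of two links each.
routing = [
    [1, 1, 0],  # path 0 uses links 0 and 1
    [0, 1, 1],  # path 1 uses links 1 and 2
    [1, 0, 1],  # path 2 uses links 0 and 2
]
true_link_loss = [0.01, 0.05, 0.02]  # made-up values, used to simulate measurements

# Simulate the end-to-end loss rate that would be observed on each path.
path_loss = []
for row in routing:
    delivery = 1.0
    for uses_link, loss in zip(row, true_link_loss):
        if uses_link:
            delivery *= 1.0 - loss
    path_loss.append(1.0 - delivery)

estimated = infer_link_loss(routing, path_loss)
```

In this toy setting the estimates match the simulated link loss rates up to floating-point error. If the measured paths do not distinguish two links (for example, two paths that traverse exactly the same links), the system is singular and the sketch raises an error, which is one simple instance of tomography being unable to attribute loss to individual links.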