Protecting and Preserving Human Knowledge for Posterity

Final Report Summary - PPP (Protecting and Preserving Human Knowledge for Posterity)

The amount and variety of content being published online is growing at an
explosive rate. Online publishing enables content to reach a much larger
audience than paper publishing but offers no guarantee of long-term access
to the content. The PPP Project investigates techniques for building a
large, reliable peer-to-peer system for the preservation of online
published material. The system consists of a large number of low-cost,
persistent web caches (peers) that cooperate to detect and repair damage
by voting in "opinion polls" on the content of their cached documents.
The peers are autonomous and mutually suspicious. Project activities
include 1) investigating defenses against adversaries whose goal is to
attack the preservation process; 2) performing a foundational study of the
interconnections between identity, trust, and reputation models in
peer-to-peer systems; 3) investigating the use of estimates of peer
diversity to increase the fault and attack tolerance of peer-to-peer
systems; and 4) developing, analyzing, implementing, and testing new
protocols that address a broad spectrum of data preservation challenges,
including the high frequency of updates to online government documents,
the large volumes of scientific data, and the privacy concerns surrounding
sensitive user data.
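
The "opinion poll" idea described above can be illustrated with a minimal sketch. This is not the project's actual protocol: the function names (`digest`, `run_poll`), the `quorum` parameter, the use of SHA-256, and the strict-majority rule are all assumptions chosen for illustration, and the sketch ignores the adversarial defenses, identity, and reputation questions the project studies. It shows only the basic loop: a peer compares a hash of its cached copy with the hashes of other peers' copies and, if it finds itself outvoted, repairs its copy from a peer in the majority.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a cached document."""
    return hashlib.sha256(content).hexdigest()


def run_poll(own_copy: bytes, peer_copies: list[bytes], quorum: int = 3):
    """Compare one peer's copy against the copies held by other peers.

    Returns (verdict, copy), where verdict is "inconclusive", "agree",
    or "repaired". The quorum and majority rule are illustrative
    assumptions, not values taken from the project.
    """
    # Too few voters: do not trust the outcome either way.
    if len(peer_copies) < quorum:
        return "inconclusive", own_copy

    # Each voter's "opinion" is the hash of its cached copy.
    votes = Counter(digest(copy) for copy in peer_copies)
    winner, count = votes.most_common(1)[0]

    # Require a strict majority of voters before acting on the result.
    if count * 2 <= len(peer_copies):
        return "inconclusive", own_copy

    if digest(own_copy) == winner:
        return "agree", own_copy

    # The local copy is outvoted: fetch a copy that matches the majority.
    repaired = next(c for c in peer_copies if digest(c) == winner)
    return "repaired", repaired


if __name__ == "__main__":
    good = b"article v1"
    damaged = b"artXcle v1"
    verdict, copy = run_poll(damaged, [good, good, good, damaged])
    print(verdict, copy == good)
```

In this sketch a peer never overwrites its copy on a tied or under-attended poll, which reflects the mutual suspicion between peers in only the crudest way; a real design would also have to consider voters that lie or collude.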

This work is being evaluated using a real testbed of over 200 libraries
around the world with the support of publishers representing thousands of
titles. The broader impact of the work is that all electronic material
preserved through the system, including academic journals, government
documents, web articles, and scientific data, will remain accessible to
generations of citizens for both research and education.