Living Web Archives

Project description

Digital libraries and technology-enhanced learning

LiWA developed web archiving tools able to capture content from a wide variety of sources, improve archive fidelity and authenticity, and ensure long-term interpretability of web content.

Interest in Web content preservation is growing strongly, not only in traditional library and archival organisations, but also in sectors such as industry and services. However, the typical characteristics of Web content - variety of formats, high dynamics, volatility, interactivity and context-dependency - make adequate Web archiving a particular challenge. With the LiWA project, Web archiving has been established as a new topic for scientific research and development within the digital preservation domain.

At the centre of the project was the concept of 'Living Web Archives', as opposed to the current practice of producing periodic snapshots of pages. 'Living' here refers to:

  • long-term interpretability as the archive evolves and adapts over time,
  • improved archive fidelity and authenticity by filtering out irrelevant information,
  • content captured from a wide variety of sources.

To enhance archive fidelity and authenticity, LiWA has developed and tested new methods based on content interpretation and intelligent pattern detection of traps and Web spam. These methods reduce the amount of fake content and help prioritise crawls by automatically detecting content of value.

To improve the integrity and the temporal, structural and semantic coherence of Web archives, part of the work was dedicated to temporal Web archive construction. This work aims to significantly improve content positioning in time and (topic) space and will lay the foundations for fast and effective access to evolving Web content.

To facilitate archive interpretability, LiWA applied methods for semantic and terminology extraction that are able to detect and handle evolving semantics, interpretations of domain concepts and terminology. This contributes to preserving the usefulness, quality and accessibility of Web archives over time.

To validate the LiWA approach, two demonstrator applications have been built on top of the LiWA services. The applications focus on the social Web and on the special challenge of archiving audio-visual content.

The potential benefit of this research is twofold: Archiving institutions will be able to automatically archive higher volumes of dynamic and volatile digital content, resulting in a significant increase of preserved digital content. Archive users will benefit from the higher quality of archive content and improved search services.

Field of science

  • /social sciences/economics and business/business and management/commerce
  • /natural sciences/computer and information sciences/internet
  • /social sciences/sociology/governance/public services
  • /humanities/history and archaeology/history
  • /social sciences/media and communications/library science/archives

Call for proposal

FP7-ICT-2007-1

Funding Scheme

CP - Collaborative project (generic)

Coordinator

GOTTFRIED WILHELM LEIBNIZ UNIVERSITAET HANNOVER
Address
Welfengarten 1
30167 Hannover
Germany
Activity type
Higher or Secondary Education Establishments
EU contribution
€ 655 623
Administrative Contact
Thomas Risse (Dr.)

Participants (7)

NARODNI KNIHOVNA CESKE REPUBLIKY
Czechia
EU contribution
€ 53 738
Address
Klementinum 190
110 01 Praha 1
Activity type
Public bodies (excluding Research Organisations and Secondary or Higher Education Establishments)
Administrative Contact
Libor Coufal (Mr.)
MORAVSKA ZEMSKA KNIHOVNA V BRNE
Czechia
EU contribution
€ 52 120
Address
Kounicova 65A
601 87 Brno
Activity type
Other
Administrative Contact
Petr Žabička (Ing.)
MAX-PLANCK-GESELLSCHAFT ZUR FORDERUNG DER WISSENSCHAFTEN EV
Germany
EU contribution
€ 415 350
Address
Hofgartenstrasse 8
80539 Munich
Activity type
Other
Administrative Contact
Gerhard Weikum (Prof.)
STICHTING INTERNET MEMORY FOUNDATION
Netherlands
EU contribution
€ 629 000
Address
Keizersgracht 62-64
1015 CS Amsterdam
Activity type
Research Organisations
Administrative Contact
JULIEN MASANES (Mr.)
SZAMITASTECHNIKAI ES AUTOMATIZALASI KUTATOINTEZET
Hungary
EU contribution
€ 298 400
Address
Kende Utca 13-17
1111 Budapest
Activity type
Research Organisations
Administrative Contact
Takacs Gabriella (Mrs.)
STICHTING NEDERLANDS INSTITUUT VOOR BEELD EN GELUID
Netherlands
EU contribution
€ 152 860
Address
Media Parkboulevard 1
1200 BB Hilversum
Activity type
Other
Administrative Contact
Harm Post (-)
HANZO ARCHIVES LIMITED
United Kingdom
EU contribution
€ 425 280
Address
Clifton Street 64
EC2A 4HB London
Activity type
Private for-profit entities (excluding Higher or Secondary Education Establishments)
Administrative Contact
Mark Middleton (Mr.)