Objective
FTMPS aims to develop techniques and system software that tolerate component failures in massively parallel computers, so that application code can run for extremely long periods where a real-time response is not required.
A transputer-based system featuring redundant processor nodes, a fault-tolerant communications network architecture, and an independent network of control processors provides the environment for the work.
The project examines:
concurrent failure detection on a node and system basis
checkpointing and restart of applications
post-failure recovery behaviour
quantitative failure modelling.
Fields of science
Topic(s)
Data not available
Call for proposal
Data not available
Funding Scheme
Data not available
Coordinator
52072 Aachen
Germany
Participants (5)
PR1 1RE Preston
91058 Erlangen
3000 Leuven
3000 Coimbra
33098 Paderborn