In this paper we consider the rollback propagation and the performance of a fault-tolerant multiprocessor with a rollback recovery mechanism (FTMR2M), which was designed to be tolerant of hardware failure with minimum time overhead. Rollback propagation between cooperating processes is usually required to ensure correct recovery from failure. To minimize the waste of processor time and storage overhead required for handling sophisticated rollback propagations, the FTMR2M always keeps one recoverable state. Approaches for evaluating the recovery overhead and analyzing the performance of FTMR2M are presented. Two methods for detecting rollback propagations and multi-step rollbacks between cooperating processes are also proposed.
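
As a generic illustration of why rollback can propagate between cooperating processes, the following minimal sketch computes which processes are dragged into a rollback when one of them fails. It is not one of the two detection methods proposed in the paper. The function name `rollback_set`, the message format, the single global clock, and instantaneous message delivery are all assumptions made for this toy model. Each process keeps exactly one checkpoint; a process whose checkpoint already includes an undone message is flagged as needing a multi-step rollback.

```python
from collections import deque


def rollback_set(failed, checkpoint, messages):
    """Toy model of rollback propagation (hypothetical, not the paper's method).

    failed:     name of the process that detects a failure
    checkpoint: dict mapping process name -> time of its single saved state
    messages:   list of (sender, time, receiver) tuples on one global clock
    Returns (must_roll, multi_step) as two sets of process names.
    """
    must_roll = {failed}   # processes that have to restore their saved state
    multi_step = set()     # processes whose saved state is itself tainted
    queue = deque([failed])
    while queue:
        p = queue.popleft()
        for sender, t, receiver in messages:
            # Only messages p sent after its checkpoint are undone by its rollback.
            if sender != p or t <= checkpoint[p]:
                continue
            # The receiver saved its state after consuming the undone message,
            # so restoring that one state is not enough.
            if t <= checkpoint[receiver]:
                multi_step.add(receiver)
            if receiver not in must_roll:
                must_roll.add(receiver)
                queue.append(receiver)
    return must_roll, multi_step


if __name__ == "__main__":
    checkpoint = {"A": 5, "B": 3, "C": 4}
    messages = [("A", 6, "B"), ("B", 7, "C"), ("B", 2, "C"), ("C", 1, "A")]
    print(rollback_set("A", checkpoint, messages))
```

In this example a failure in A undoes the message it sent to B at time 6, and B's rollback in turn undoes its message to C at time 7, so all three processes roll back. A failure in C alone affects no one else, because C sent nothing after its checkpoint.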
Proceedings - International Symposium on Computer Architecture
Published - Apr 26 1982
9th Annual Symposium on Computer Architecture, ISCA 1982 - Austin, United States
Duration: Apr 26 1982 → Apr 29 1982
ASJC Scopus subject areas
- Hardware and Architecture