Authors
Vimal K Reddy, Eric Rotenberg, Sailashri Parthasarathy
Publication date
2006/10/20
Journal
ACM SIGARCH Computer Architecture News
Volume
34
Issue
5
Pages
83-94
Publisher
ACM
Description
Redundant threading architectures duplicate all instructions to detect and possibly recover from transient faults. Several lighter weight Partial Redundant Threading (PRT) architectures have been proposed recently. (i) Opportunistic Fault Tolerance duplicates instructions only during periods of poor single-thread performance. (ii) ReStore does not explicitly duplicate instructions and instead exploits mispredictions among highly confident branch predictions as symptoms of faults. (iii) Slipstream creates a reduced alternate thread by replacing many instructions with highly confident predictions. We explore PRT as a possible direction for achieving the fault tolerance of full duplication with the performance of single-thread execution. Opportunistic and ReStore yield partial coverage since they are restricted to using only partial duplication or only confident predictions, respectively. Previous analysis of Slipstream fault …
Total citations
[Chart: citations per year, 2007 to 2023]