Thursday, 28 August 2014

The History Of Fault Tolerance System

- Over the past half century, binary computing machines have seen many changes and have exponentially grown in complexity and speed. Early computers functioned effectively without the aid of an incorporated fault tolerance system and relied solely on programmers to detect the erroneous compilation of code. The first computers simply failed after executing flawed code and would only then be inspected and repaired. Early attempts to address faults in machines included the "N modular redundancy" and "M of N majority voting". In the "N modular redundancy" technique the system automatically detects and fixes the faulty module or circuit card and then notifies the operator of the error. The "M of N majority voting" technique uses three or more of any replicable hardware component (essentially three identical machines) and executes commands simultaneously checking for disagreements in the system. In the example of three parallel machines, the system will function as follows:

1) If all three machines agree, the system resumes execution.

2) If two of the machines agree but one is divergent, the execution resumes using the two agreeing machines and the one in disagreement is reported as faulty.

3) If all three machines disagree, the system halts and ends execution.

Both techniques are still in wide-use today but are slowly being replaced by new methods that seek to provide reliability for systems with growing complexity.

No comments:

Post a Comment