United States Patent
4914657
Walter et al.
April 3, 1990
Title
Operations controller for a fault tolerant multiple node processing system
Abstract
An operations controller for a multiple node fault tolerant processing system having a transmitter for transmitting inter-node messages, a plurality of receivers, each receiving inter-node messages from only one of the nodes and a message checker for checking each received message for physical and logical errors. A fault tolerator assembles all of the errors detected and decides which nodes are faulty based on the number and severity of the detected errors. A voter generates a voted value for each value which is received from the other nodes which is stored in a data memory by a task communicator. A scheduler selects the tasks to be executed by an applications processor which is passed to the task communicator. The task communicator passes the selected task and the data required for the execution of that task to the applications processor and transmits the data resulting from that task to all of the nodes in the system. A synchronizer synchronizes the operation of its own node with all of the other nodes in the system.
Inventors:
Walter; Chris J. (Columbia, MD)
Kieckhafer; Roger M. (Ellicott City, MD)
Finn; Alan M. (Amston, CT)
Assignee:
Allied-Signal Inc. (Morris Township, Morris County)
Appl. No.:
038813
Filed:
April 15, 1987
Current U.S. Class:
714/4
714/797
709/248
714/12
Field of Search:
371/9,11,36 364/200
U.S. Patent Documents
4392199, July 1983, Schmitter et al.
4438494, March 1984, Budde et al.
4503534, March 1985, Budde et al.
4503535, March 1985, Budde et al.
Primary Examiner:
Atkinson; Charles E.
Attorney, Agent or Firm:
Massung; Howard G.
Claims
What is claimed is:
1. In a fault tolerant multiple node processing system wherein each node has an applications processor for executing a predetermined set of tasks, wherein each task in said predetermined set of tasks is included in the predetermined set of tasks of at least one other node in the processing system and an operations controller for establishing and maintaining its own node in synchronization with every other node in the system, for controlling the operation of its own node, and for selecting from an active task list the tasks to be executed by its own application processor in coordination with all of the other nodes in the system through the exchange of inter-node messages with all of the other nodes in the system, said active task list containing a selected subset of said predetermined set of tasks, the operations controller comprising:
a transmitter for transmitting all of the inter-node messages generated by its own operations controller to all the nodes in the system including its own node over a private communication link, said transmitter having an arbitrator for deciding the order in which said inter-node messages are to be transmitted when two or more messages are ready for transmission at the same time;
a plurality of receivers, each receiver associated with a respective one of said multiple nodes and only receiving messages from that node;
a message checker connected to said plurality of receivers for checking each received message for physical and logical errors to generate an internal error report containing an error status byte identifying each detected error, said message checker polling each of said receivers to unload the received messages in a repetitive sequence;
a voter subsystem having a voter for voting on the content of all error free messages having a value produced by the execution of the same task in said at least one other node to generate a voted value and a deviance checker for generating an internal error report containing a deviance vector identifying each node which sent a message used in the generation of said voted value whose value differed from the voted value by more than a predetermined deviance value;
a fault tolerator connected to said message checker, said voter subsystem and said transmitter for passing all error free messages received from said message checker to said voter subsystem, for generating an inter-node error message containing all of said error reports accumulated by all the subsystems which is sent to all of the nodes in the system by said transmitter, for generating a base penalty count message containing a base penalty count for each node in the system based on the number of errors detected and the severity of the detected errors identified in said internal error reports which is sent to all of the nodes in the system by said transmitter, for globally verifying the base penalty count for each node through the exchange of inter-node base penalty count messages, and for generating a system state vector identifying each node whose base penalty count exceeds a predetermined exclusion threshold;
a task scheduler connected to said fault tolerator for selecting the next task to be executed by the node's own applications processor from said active task list, for replicating the scheduling of other nodes in the system, for maintaining a global data base in the scheduling and execution of tasks by each node through the exchange of task completed/started messages received from the fault tolerator, and for generating an error report identifying each node whose scheduling process differs from the scheduling process replicated for that node, said task scheduler further having means to reconfigure said active task list in response to said system state vector received from the fault tolerator indicating a change in the number of non-excluded nodes;
a data memory;
a task communicator connected to said voter subsystem, said data memory, said task scheduler, the transmitter and the applications processor for storing said voted values received from said voter subsystem in said data memory, for passing the identity of the task selected by the task scheduler to the applications processor, for extracting from said data memory the voted values required for the execution of the selected task and passing them to the applications processor, for generating said task completed/started messages identifying the task just completed and the new task started by the applications processor which is transmitted to all the nodes by said transmitter, and for generating inter-node data value messages containing the data values generated by the applications processor in the execution of the selected tasks which are also transmitted to all the nodes by said transmitter; and
a synchronizer connected to said message checker, said task scheduler and said transmitter for synchronizing the operation of its own node with all of the other non-faulty nodes in the system through the exchange of inter-node time-dependent messages, said synchronizer generating a time-dependent message which is transmitted by said transmitter to all the nodes in the system, storing a time stamp signifying the local time at which each time-dependent message is received from said message checker, and correcting the synchronization of said task scheduler of its own node based on the difference between the time stamp of its own time-dependent message and a voted time stamp derived from the time stamps for all the time-dependent messages received within a predetermined time window.
2. The operations controller of claim 1 wherein said transmitter comprises:
a first interface for receiving the inter-node messages generated by said fault tolerator;
a second interface for receiving the inter-node messages generated by said task communicator;
a synchronizer interface for receiving said inter-node time-dependent messages generated by said synchronizer;
an arbitrator connected to said first, second, and synchronizer interfaces responsive to said first, second, and synchronizer interfaces having received messages to be transmitted for arbitrating the priorities of these messages to generate a transmit signal identifying which message is to be transmitted, said arbitrator delaying the transmission of the inter-node messages received by said first and second interfaces if their transmission will interfere with the transmission of said time dependent messages;
a parallel-to-serial converter for converting said inter-node messages to a serial format for transmission over said private communication link to all of the nodes in said processing system; and
a first multiplexer connected to said first, said second, and said synchronizer interfaces, said arbitrator and said parallel-to-serial converter for passing the inter-node message stored in one of said first, second, and synchronizer interfaces to said parallel-to-serial converter in response to said transmit signal generated by said arbitrator.
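The arbitration rule recited in claim 2 can be illustrated with a minimal Python sketch. It assumes that a pending time-dependent message always wins, and that other messages are held back when they would not finish before the next time-dependent message is due. The function name, the tick-based timing parameters, and the relative priority of the fault tolerator over the task communicator are illustrative assumptions, not details taken from the claim.

```python
from typing import Optional


def arbitrate(sync_ready: bool,
              fault_tolerator_ready: bool,
              task_communicator_ready: bool,
              ticks_until_sync: int,
              message_ticks: int) -> Optional[str]:
    """Return the interface allowed to transmit next, or None to stay idle.

    ticks_until_sync: time left before the next time-dependent message is due.
    message_ticks:    time needed to send one ordinary message (assumed fixed).
    """
    # Time-dependent (synchronizer) messages are never delayed.
    if sync_ready:
        return "synchronizer"
    # Hold back ordinary messages that would overlap the upcoming sync slot.
    if message_ticks > ticks_until_sync:
        return None
    # Assumed ordering between the two remaining interfaces.
    if fault_tolerator_ready:
        return "fault_tolerator"
    if task_communicator_ready:
        return "task_communicator"
    return None


# A data message is ready, but the sync slot is too close, so nothing is sent.
held = arbitrate(False, False, True, ticks_until_sync=3, message_ticks=5)
# With enough time left, the same message is allowed through.
sent = arbitrate(False, False, True, ticks_until_sync=10, message_ticks=5)
```

Here `held` is `None` and `sent` is `"task_communicator"`; the returned name plays the role of the transmit signal that steers the multiplexer.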
3. The operations controller of claim 2 wherein each of said inter-node messages has a message type code identifying the type of information contained in the message and a data identification code which uniquely identifies the particular data value contained in the message, said message checker comprising:
sequencer means connected to said plurality of receivers for context switching among said plurality of receivers in a predetermined sequence to unload the received inter-node messages from said plurality of receivers;
a context storage connected to said plurality of receivers and said sequencer means for storing the relevant information pertaining to the message being processed, said context storage having one entry for each node, each of said entries storing at least the message type code, the data identification code, a byte count which identifies the specific byte being processed and an error status byte;
error check logic means connected to said plurality of receivers, said sequencer means and said context storage for checking the node identification code contained in the message with reference to the expected node identification code associated with the receiver which received the message, for checking the message type code, for checking the data identification code against a maximum data identification code, and for checking the number of bytes contained in the message, said error check logic means recording all detected errors in said error status byte stored in said context storage;
between limits checker means connected to said plurality of receivers and said error check logic means for checking the data value contained in the messages against predetermined maximum and minimum limit values and for reporting an exceeds limits error to said error check logic means whenever the data value contained in a message is not within said maximum and minimum limit values, said error check logic means recording said exceeds limits error in said error status byte; and
multiplexer means connected to said sequencer, context storage and said fault tolerator for passing on each received message to said fault tolerator for further processing, said multiplexer means appending to each message, as it is passed on, an error report containing said error status byte currently stored in said context storage.
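The checks recited in claim 3 can be pictured as a function that accumulates flags in an error status byte. The following Python sketch is an illustration only: the bit assignments, the `Message` fields, and the constant tables are hypothetical (the actual format of the error status byte is the subject of FIG. 13), and the per-node context storage and sequencing are not modeled.

```python
from dataclasses import dataclass
from enum import IntFlag


class ErrorStatus(IntFlag):
    """Hypothetical bit assignments for the error status byte."""
    NONE = 0
    WRONG_NODE_ID = 1
    BAD_MESSAGE_TYPE = 2
    BAD_DATA_ID = 4
    BAD_BYTE_COUNT = 8
    EXCEEDS_LIMITS = 16


@dataclass(frozen=True)
class Message:
    node_id: int      # sender identification carried in the message
    msg_type: int     # message type code
    data_id: int      # data identification code
    byte_count: int   # number of bytes actually received
    value: int        # data value carried by the message


# Illustrative constants, not taken from the patent.
EXPECTED_BYTES = {0: 4, 1: 6}   # message type code -> required byte count
MAX_DATA_ID = 255
LIMITS = {7: (0, 1000)}         # data id -> (minimum, maximum)


def check_message(msg: Message, receiver_node: int) -> ErrorStatus:
    """Run every check and record all detected errors in one status value."""
    status = ErrorStatus.NONE
    if msg.node_id != receiver_node:
        status |= ErrorStatus.WRONG_NODE_ID
    if msg.msg_type not in EXPECTED_BYTES:
        status |= ErrorStatus.BAD_MESSAGE_TYPE
    elif msg.byte_count != EXPECTED_BYTES[msg.msg_type]:
        status |= ErrorStatus.BAD_BYTE_COUNT
    if msg.data_id > MAX_DATA_ID:
        status |= ErrorStatus.BAD_DATA_ID
    low, high = LIMITS.get(msg.data_id, (None, None))
    if low is not None and not (low <= msg.value <= high):
        status |= ErrorStatus.EXCEEDS_LIMITS
    return status


good = check_message(Message(2, 0, 7, 4, 500), receiver_node=2)
bad = check_message(Message(3, 0, 7, 4, 5000), receiver_node=2)
```

In this sketch `good` carries no flags, while `bad` has both the node identification and the limits flags set, which mirrors how several errors can be recorded in a single status byte before the message is passed on.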
4. The operations controller of claim 3 wherein said fault tolerator comprises:
a data memory connected to said message checker for storing the content of all error free inter-node messages received from said message checker;
an error file for storing the content of all of the received error reports;
error handler means connected to said message checker, said synchronizer, said task scheduler and said voter subsystem for storing the error reports received from said message checker, said synchronizer, said task scheduler and said voter subsystem in said error file and for generating a base penalty count for each node from the content of said error file, said base penalty count being indicative of the operational status of that particular node, said error handler means further having means for determining which nodes are faulty and for excluding such faulty nodes, in coordination with all of the other nodes in the system, from participating in the operation of said multiple node processing system through the exchange of said inter-node messages, said inter-node messages including error messages, each containing the content of said error file for a particular node and a base penalty count message containing the base penalty count of each node; and
interface means for storing in said data memory all of the error free messages passed by said message checker, for passing the identities of the faulty nodes to said task scheduler and said synchronizer, and for passing all error reports to said error handler.
5. The operations controller of claim 4 wherein said voter subsystem comprises:
an upper medial value sorter for sorting the data values received from said fault tolerator to generate an upper medial value;
a lower medial value sorter for sorting the data values received from said fault tolerator to generate a lower medial value;
averaging means connected to said upper and lower medial value sorters for averaging said upper and lower medial values to generate said voted value;
deviance checker means connected to said upper and lower medial value sorters for comparing in parallel the content of each received message with said upper and lower medial values to generate said deviance vector; and
loader means connected to said fault tolerator for loading the content of the messages received from said fault tolerator into said upper and lower medial value sorters and said deviance checker means.
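The voting arithmetic of claim 5 can be sketched in a few lines of Python. The sketch assumes that the lower and upper medial values are the two middle elements of the sorted values (which coincide when the count is odd), and, following the wording of claim 1, it measures deviance against the voted value; the parallel comparison against the medial values recited above is not modeled. The function name and the dictionary input are illustrative.

```python
from typing import Dict, List, Tuple


def vote(values: Dict[int, float], max_deviance: float) -> Tuple[float, List[int]]:
    """Return (voted_value, deviant_node_ids) for one data item.

    values maps a node id to the error-free value that node reported.
    """
    if not values:
        raise ValueError("no error-free values to vote on")
    ordered = sorted(values.values())
    n = len(ordered)
    lower_medial = ordered[(n - 1) // 2]   # assumed definition
    upper_medial = ordered[n // 2]         # assumed definition
    voted = (lower_medial + upper_medial) / 2
    # Deviance vector, represented here as a sorted list of node ids.
    deviant = sorted(node for node, v in values.items()
                     if abs(v - voted) > max_deviance)
    return voted, deviant


voted, deviant = vote({0: 100, 1: 102, 2: 99, 3: 420}, max_deviance=5)
```

With these four inputs the medial values are 100 and 102, so the voted value is 101.0 and only node 3 is flagged; the single wild value does not move the result, which is the point of voting on medial values rather than on a plain mean.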
6. The operations controller of claim 5 wherein said task scheduler comprises:
a task activity list containing an entry for each active task in said multiple node processing system, each entry containing an execution periodicity and a node allocation for that task;
a priority scan list containing a selected portion of the tasks in said task activity list which are available for execution, said selected portion of said tasks being stored in their preferred order of execution;
a completion status list storing the same selected portion of said tasks stored in said priority scan list;
a selection queue storing for each node the tasks ready for selection in their preferred order of execution;
a period counter for counting fundamental timing periods to generate a period count corresponding to the number of fundamental periods which have expired since the beginning of a new master period;
wake-up sequencer means connected to said task activity list, said priority scan list and said completion status list for interrogating said task activity list to transfer to said priority scan list and said completion status list all of the tasks contained in said task activity list whose periodicity is greater than said period count;
priority scan means connected to said priority scan list and said selection queue for transferring to said selection queue for each node entry three tasks stored in said priority scan list which are ready for execution by that node in their preferred order of execution;
task selector means connected to said selection queue for selecting in said preferred order of execution a task currently stored in said selection queue for its own node as the next task scheduled for execution by its own applications processor; and
a task interactive consistency handler connected to said fault tolerator for updating the status of each task in said task activity list, said priority scan list, said completion status list and said selection queue which are identified in inter-node messages reporting the completion of a task.
7. The operations controller of claim 6 wherein said inter-node messages have a data identification code identifying the type of data contained in that inter-node message and a message type code identifying the type of the inter-node message, said task communicator comprising:
a data memory for storing said voted values, said data memory having at least two partitions identified by a context bit, each partition having a plurality of entries for storing said voted values;
a context bit memory connected to said data memory for storing a context bit for each data identification code, said context bit identifying said voted values stored in said data memory which are ready for use in the execution of tasks by the applications processor;
a task terminated recorder connected to said task scheduler for complementing the context bit in said context bit memory in response to said task terminating signal generated by said task scheduler;
a store data control connected to said voter subsystem for storing said voted values in said data memory using said message type code, said data identification code and the complement of said context bit associated with the voted value as an address for the appropriate entry in said data memory;
a next task register connected to said task scheduler for storing the task identification code of the task selected by the task scheduler for execution by the applications processor;
an input FIFO register accessible by the applications processor for storing the identity of the next task to be executed by the applications processor and the voted values required for the execution of said next task;
an input handler connected to said next task register, said data memory and said transmitter, responsive to the applications processor completing the preceding task to generate said task completed/started message sent to said transmitter to transmit to all of the nodes in the processing system, to transfer the task identification code of said next task stored in said next task register to said input FIFO register, and to access said data memory for the voted values required for the execution of said next task, said input handler using said context bits to identify which voted values in said data memory are to be used in the execution of said next task;
an output FIFO register for receiving from said applications processor the data values resulting from the execution of each task; and
an output handler connected to said output FIFO register for generating data value messages sent to said transmitter for transmission to all the nodes in said multiple processing system, said data value messages containing the data values stored in the said output FIFO register and the identification code for the data values.
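The context bit arrangement of claim 7 behaves like a double buffer: new voted values are written to the partition selected by the complement of the context bit, and become visible to the applications processor only after the bit is complemented. The Python sketch below illustrates that idea under stated assumptions. The class and method names are hypothetical, the address is reduced to the data identification code alone (the message type code is omitted), and the choice of which context bits a terminating task complements is supplied by the caller.

```python
from typing import Iterable, List, Optional


class ContextBufferedMemory:
    """Two-partition data memory selected per data id by a context bit."""

    def __init__(self, num_data_ids: int) -> None:
        self._partitions: List[List[Optional[int]]] = [
            [None] * num_data_ids,
            [None] * num_data_ids,
        ]
        self._context: List[int] = [0] * num_data_ids

    def store_voted(self, data_id: int, value: int) -> None:
        # Write to the partition NOT currently in use (complement of the bit).
        self._partitions[1 - self._context[data_id]][data_id] = value

    def task_terminated(self, data_ids: Iterable[int]) -> None:
        # Complement the context bits so the stored values become readable.
        for data_id in data_ids:
            self._context[data_id] ^= 1

    def read(self, data_id: int) -> Optional[int]:
        # The input handler reads from the partition the context bit selects.
        return self._partitions[self._context[data_id]][data_id]


memory = ContextBufferedMemory(num_data_ids=8)
memory.store_voted(3, 7)
before = memory.read(3)        # value not yet released
memory.task_terminated([3])
after = memory.read(3)         # value now visible
memory.store_voted(3, 9)
still = memory.read(3)         # a newer write does not disturb the reader
```

The design choice this illustrates is that a task in progress always sees a stable set of inputs, because writes and reads for the same data identification code never target the same partition at the same time.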
8. The operations controller of claim 7 wherein said time-dependent messages include alternating sync and pre-sync time-dependent messages, said synchronizer comprising:
a message interface connected to said message checker for receiving said sync and pre-sync time-dependent messages;
counter means for generating a local time;
a time stamp memory having one entry for each node in the system, each entry storing a time stamp for the most recent time-dependent message received from the associated node;
a time stamper connected to said message interface, and said counter means, said time stamper responsive to receiving a time-dependent message from said message interface for each node for generating a time stamp corresponding to the local time at which said time-dependent message is received and for storing said time stamp in said time stamp memory in the entry associated with the node from which the time-dependent message was received;
a time stamp voter connected to said time stamp memory for generating a voted time stamp corresponding to a medial value of the time stamps stored in said time stamp memory for said pre-sync time-dependent messages;
a sync correction generator connected to said time stamp memory and said time stamp voter for generating sync delta having a value corresponding to the difference between said voted time stamp and the time stamp of its own pre-sync time-dependent message;
means connected to said sync correction generator for adding said sync delta to a nominal transmission timing interval to generate an actual transmission timing interval, said nominal transmission timing interval corresponding to a nominal timing interval between the end of the transmission of the sync time-dependent message and the passing of the pre-sync time-dependent message to said transmitter; and
message generator means connected to said means for adding and said transmitter for generating said sync and pre-sync time-dependent messages, said message generator means passing said pre-sync time-dependent messages to said transmitter at the end of said nominal transmission timing interval and passing said sync time-dependent messages to the transmitter at the end of said actual transmission timing interval.
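The correction computed by the synchronizer of claim 8 reduces to a short calculation, sketched here in Python. Two points are assumptions rather than statements from the claim: the voted time stamp is taken as the lower median of the stored pre-sync time stamps (the claim says only "a medial value"), and the sync delta is signed as the voted time stamp minus the node's own time stamp. The function and parameter names are illustrative.

```python
from statistics import median_low
from typing import Dict, Tuple


def sync_correction(pre_sync_stamps: Dict[int, int],
                    own_node: int,
                    nominal_interval: int) -> Tuple[int, int]:
    """Return (sync_delta, actual_interval) in local clock ticks.

    pre_sync_stamps maps each node id to the local time at which that
    node's pre-sync message was received.
    """
    voted_stamp = median_low(list(pre_sync_stamps.values()))  # assumed choice
    # Positive delta: this node sent early relative to the voted time.
    sync_delta = voted_stamp - pre_sync_stamps[own_node]
    return sync_delta, nominal_interval + sync_delta


stamps = {0: 1000, 1: 1004, 2: 998, 3: 1003}
early = sync_correction(stamps, own_node=2, nominal_interval=500)
late = sync_correction(stamps, own_node=1, nominal_interval=500)
```

In this example the voted time stamp is 1000. Node 2, whose own message was stamped at 998, lengthens its interval to 502 ticks, while node 1, stamped at 1004, shortens its interval to 496 ticks, so both move toward the voted time.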
Description
CROSS REFERENCE
This invention is related to commonly assigned, copending patent application Ser. Nos. 038,818 and 039,190 filed concurrently on Apr. 15, 1987.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention is related to the field of multiple node processing systems and in particular to an operations controller, one for each node in the multiple node processing system, each operations controller controlling the operation of its own node in a fault tolerant manner.
2. Description of the Prior Art
The earliest attempts to produce fault tolerant computer systems provided redundant computers in which each computer simultaneously executed every task required for the control operation. Voting circuits monitoring the outputs of the multiple computers determined a majority output which was assumed to be the correct output for the system. In this type of system, a faulty computer may or may not be detected and the faulty computer may or may not be turned off.
The redundant computer concept, although highly successful, is expensive because it requires multiple computers of equivalent capabilities. These systems require powerful computers because each computer has to perform every task required for the operation of the system. As an alternative, the master-slave concept was introduced in which the operation of several computers was controlled and coordinated by a master control. The master control designated which tasks were to be executed by the individual computers. This reduced the execution time of the control operation because all the computers were no longer required to execute every task, and many of the tasks could be executed in parallel. In this type of system, when a computer is detected as faulty, the master could remove it from active participation in the system by assigning the tasks that would normally have been assigned to the faulty computer to the other computers. The problem encountered in the master-slave concept is that the system is totally dependent upon the health of the master, and if the master fails then the system fails. This defect may be rectified by using redundant master controls; however, the increased cost of redundant masters limits the applicability of these systems to situations where the user is willing to pay for the added reliability. Typical of such situations are the controls of nuclear power plants, space exploration and other situations where failure of the control system would endanger lives.
Recent improvements to the master-slave and redundant execution fault tolerant computer systems discussed above are exemplified in the October 1978 Proceedings of the IEEE, Volume 66, No. 10, which is dedicated to fault tolerant computer systems. Of particular interest are the papers entitled "Pluribus: An Operational Fault Tolerant Multiprocessor" by D. Katsuki et al., Pages 1146-1159 and "SIFT: The Design and Analysis of a Fault Tolerant Computer for Aircraft Control" by J. H. Wensley et al., Pages 1240-1255. The SIFT system uses redundant execution of each system task and of the master control functions. The Pluribus system has a master copy of the most current information which can be lost if certain types of faults occur.
More recently a new fault tolerant multiple computer architecture has been disclosed by Whiteside et al, in U.S. Pat. No. 4,356,546, in which each of the individual task execution nodes has an applications processor and an operations controller which functions as a master for its own node.
The present invention is an operations controller for a fault tolerant multiple node processing system based on the system taught by Whiteside et al in U.S. Pat. No. 4,323,966 which has improved fault tolerance and control capabilities. A predecessor of this operations controller has been described by C. J. Walter et al in their paper "MAFT: A Multicomputer Architecture for Fault-Tolerance in Real-Time Control Systems" published in the proceedings of the Real-Time System Symposium, San Diego, Dec. 3-6, 1985.
SUMMARY OF THE INVENTION
The invention is an operations controller for each node in a fault tolerant multiple node processing system. Each node has an applications processor for executing a predetermined set of tasks and an operations controller for establishing and maintaining its own node in synchronization with every other node in the system, for controlling the operation of its own node, and for selecting the task to be executed by its own applications processor in coordination with all of the other nodes in the system through the exchange of inter-node messages.
The operations controller has a transmitter for transmitting all of the inter-node messages generated by its own operations controller to all the other nodes in the system. The transmitter has an arbitrator for deciding the order in which the inter-node messages are to be transmitted when two or more messages are ready for transmission. The operations controller further has a plurality of receivers, each receiver associated with a respective one node and only receiving messages from that node and a message checker for checking each received message for physical and logical errors to generate an inter-node error report containing an error status byte identifying each detected error. The message checker polls each of the receivers to unload the received messages in a repetitive sequence. A voter subsystem has a voter for voting on the content of all error free messages containing the same information to generate a voted value and has a deviance checker for generating an inter-node error report identifying each node which sent a message used in the generation of the voted value whose content differed from the voted value by more than a predetermined amount.
The operations controller also has a fault tolerator for passing all error free messages received from the message checker to the voter subsystem, for generating an inter-node error message containing all of the error reports accumulated by all of the subsystems of its own operations controller, for generating a base penalty count for each node in the system based on the number of detected errors and the severity of the detected errors identified in such inter-node error reports, for globally verifying the base penalty count for each node through the exchange of inter-node base penalty count messages, and for generating a system state vector identifying each node whose base penalty count exceeds a predetermined exclusion threshold. The operations controller further includes a task scheduler for selecting the next task to be executed by its own applications processor from an active task list, for maintaining a global data base on the scheduling and execution of tasks by each node through the exchange of task completed/started messages, and for generating an error report identifying each node whose scheduling process differs from the scheduling process replicated for that node.
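The relationship between error reports, the base penalty count, and the system state vector can be illustrated with a minimal Python sketch. The error names and penalty weights below are hypothetical, chosen only to show that more severe errors add more to a node's count, and the global verification of the counts through exchanged messages is not modeled.

```python
from typing import Iterable, List, Tuple

# Hypothetical severity weights; the patent stores its own values in the
# penalty weight partition of the fault tolerator memory.
PENALTY_WEIGHT = {
    "bad_byte_count": 1,
    "exceeds_limits": 2,
    "wrong_node_id": 4,
}


def base_penalty_counts(error_reports: Iterable[Tuple[int, str]],
                        num_nodes: int) -> List[int]:
    """Sum the weighted errors charged to each node."""
    counts = [0] * num_nodes
    for node, error in error_reports:
        counts[node] += PENALTY_WEIGHT[error]
    return counts


def system_state_vector(counts: List[int],
                        exclusion_threshold: int) -> List[bool]:
    """True marks a node whose count exceeds the exclusion threshold."""
    return [count > exclusion_threshold for count in counts]


reports = [(1, "bad_byte_count"), (1, "wrong_node_id"), (2, "exceeds_limits")]
counts = base_penalty_counts(reports, num_nodes=4)
excluded = system_state_vector(counts, exclusion_threshold=4)
```

For these reports the counts are `[0, 5, 2, 0]`, so with a threshold of 4 only node 1 is marked for exclusion; node 2 has been charged with an error but remains in the operating set.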
The operations controller also has a data memory and a task communicator for storing the voted values in the data memory. The task communicator further has means for passing the identity of the task selected by the scheduler to the applications processor, means for extracting the voted values required for the execution of the selected task and passing them to the applications processor, and means for generating the task completed/started messages identifying the task just completed and the new task started by the applications processor and for generating inter-node data value messages containing the data values generated by the applications processor in the execution of the selected tasks.
The operations controller further includes a synchronizer for synchronizing the operation of its own node with all of the other non-faulty nodes in the system through the exchange of inter-node time-dependent messages.
The object of the invention is an architecture for a multiple node fault tolerant processing system based on the functional and physical partitioning of the application task and the overhead functions.
Another object of the invention is a distributed multiple node processing system in which no one node is required to execute every task of the applications task and in which failure of one or more nodes need not prevent execution of any applications task.
Another object of the invention is a multiple node computer architecture in which task selection and fault detection are globally verified.
Another object of the invention is a fault tolerant computer architecture in which the exclusion or readmittance of a node into the active set of nodes is made on a global basis.
These and other objects of the invention will become more apparent from a reading of the specification in conjunction with the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of the multi-computer architecture;
FIG. 2 is a block diagram of the Operations Controller;
FIG. 3 is the master/atomic period timing diagram;
FIG. 4 is the atomic/subatomic period timing diagram;
FIG. 5 is a block diagram of the Transmitter;
FIG. 6 is a circuit diagram of one of the interfaces;
FIG. 7 is a block diagram of the Arbitrator;
FIG. 8 shows waveforms for the Self-Test Arbitration Logic;
FIG. 9 is a block diagram of the Longitudinal Redundancy Code Generator;
FIG. 10 is a block diagram of a Receiver;
FIG. 11 is a block diagram of the Message Checker;
FIG. 12 is a block diagram of the decision logic for the Between Limits Checker;
FIG. 13 is the format for the error status byte generated by the Message Checker;
FIG. 14 is a block diagram of the Fault Tolerator;
FIG. 15 shows the partitioning of the Fault Tolerator RAM;
FIG. 16 shows the format of the Message partition of the Fault Tolerator RAM;
FIG. 17 shows the format of the Error Code Files partition of the Fault Tolerator RAM;
FIG. 18 shows the format of the Group Mapping partition of the Fault Tolerator RAM;
FIG. 19 shows the format of the Error Code Files partition of the Fault Tolerator RAM;
FIG. 20 shows the format of the Penalty Weight partition of the Fault Tolerator RAM;
FIG. 21 is a block diagram of the Fault Tolerator's Message Checker Interface;
FIG. 22 is a block diagram of the Fault Tolerator's Error Handler;
FIG. 23 is a block diagram of the Error Handler's Error Consistency Checker;
FIG. 24 is a block diagram of the Error Handler's Validity Checker;
FIG. 25 illustrates the format of the error byte in an error message;
FIG. 26 is a timing diagram of the reconfiguration sequence;
FIG. 27 is a block diagram of the Voter Subsystem;
FIG. 28 is a flow diagram for the Upper and Lower Medial Value Sorters;
FIG. 29 is a circuit diagram of the Lower Medial Value Sorter;
FIG. 30 is a flow diagram for the Averaging Circuit;
FIG. 31 is a circuit diagram of the Averaging Circuit;
FIG. 32 is a flow diagram of the Deviance Checker;
FIG. 33 is a circuit diagram of a Deviance Checker;
FIG. 34 is a block diagram of the Scheduler;
FIG. 35 shows the data format of the Scheduler RAM;
FIG. 36 shows the data format of the Scheduler ROM;
FIG. 37 is a block diagram of the Scheduler's Task Selector Module;
FIG. 38 is a flow diagram of the Wake-Up Sequencer's operation;
FIG. 39 is a flow diagram of the Execution Timer's operation;
FIG. 40 is a flow diagram of the TIC Handler's operation;
FIG. 41 is a flow diagram of the TIC Handler's Selection Queue Update sub-process;
FIG. 42 is a flow diagram of the TIC Handler's Completion/Termination sub-process;
FIG. 43 is a flow diagram of the TIC Handler's Execution Timer Reset sub-process;
FIG. 44 is a flow diagram of the TIC Handler's Priority Scan List Update sub-process;
FIG. 45 is a flow diagram of the Priority Scanner's operation;
FIG. 46 is a flow diagram of the Next Task Selector's operation;
FIG. 47 is a block diagram of the Reconfigure Module;
FIG. 48 is a flow diagram for the Task Swapper's operation in response to a Node being excluded from the operating set;
FIG. 49 is a flow diagram of the Task Swapper's operation in response to a Node being readmitted to the operating set;
FIG. 50 is a flow diagram of the Task Reallocator's operation in response to a Node being excluded from the operating set;
FIG. 51 is a flow diagram of the Task Status Matcher's operation;
FIG. 52 is a block diagram of the Task Communicator;
FIG. 53 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Store Data Control;
FIG. 54 is a flow diagram of the Store Data Control's operation;
FIG. 55 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the DID Request Handler;
FIG. 56 is a flow diagram of the DID Request Handler's operation;
FIG. 57 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Task Terminated Recorder;
FIG. 58 is a flow diagram of the Task Terminated Recorder's operation;
FIG. 59 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the Task Started Recorder;
FIG. 60 is a flow diagram of the Task Started Recorder's operation;
FIG. 61 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the AP Input Handler;
FIG. 62 is a flow diagram of the AP Input Handler's operation;
FIG. 63 is a partial block diagram of the Task Communicator showing the elements associated with the operation of the AP Output Handler;
FIG. 64 is a flow diagram showing the AP Output Handler's operation;
FIG. 65 shows the format of the DID information as stored in the DID List;
FIG. 66 shows the format of the DID information with the NUDAT bit appended;
FIG. 67 is a partial block diagram of the Task Communicator showing the subsystems involved in "reconfiguration";
FIG. 68 is a flow diagram showing the operation of the Reconfigure Control during reconfiguration;
FIG. 69 is a partial block diagram of the Task Communicator showing the subsystems involved in "reset";
FIG. 70 is a flow diagram of the Reset Control during reset;
FIG. 71 is a block diagram of the Synchronizer;
FIG. 72 shows the format of the Synchronizer Memory;
FIG. 73 shows the format of the Message Memory;
FIG. 74 shows the format of the Time Stamp Memory;
FIG. 75 shows the format of the Scratch Pad Memory;
FIG. 76 shows the waveforms of the signals generated by the Timing Signal Generator;
FIG. 77 is a block diagram of the Synchronizer Control;
FIG. 78 is a flow diagram showing the operation of the Data Handler and Expected Message Checker;
FIG. 79 is a flow diagram showing the operation of the Within Hard Error Window and Soft Error Window Checker and the Time Stamper;
FIG. 80 is a flow diagram for the operation of the "HEW to warning count";
FIG. 81 is a partial block diagram of the Synchronizer showing the elements associated with the operation of the Message Generator;
FIG. 82 is a flow diagram of the operation of the Message Generator and the Transmitter Interface;
FIG. 83 shows the waveforms of the timing signals for generating a TIC message;
FIG. 84 shows the waveforms of the timing signals for generating a sync System State message;
FIG. 85 shows the format of the "cold start" pre-sync message;
FIG. 86 is a flow diagram showing the operation of the Synchronizer during a "cold start";
FIGS. 87 and 87a are flow diagrams showing the generation of the HEW to warning signal during "cold start";
FIG. 88 is a flow diagram showing the storing of data during a "cold start";
FIG. 89 is a flow diagram showing the operation of the Operating Condition Detector during a "cold start";
FIG. 90 is a timing diagram used in the description of the "cold start";
FIG. 91 is a flow diagram of the operation of the Synchronizer during a "warm start";
FIG. 92 is a timing diagram used in the description of a "warm start";
FIG. 93 is a flow diagram of the operation of the Byzantine Voter to generate Byzantine voted task completed vector and Byzantine voted branch condition bits for the Scheduler;
FIG. 94 is a perspective of the Byzantine Voter's three-dimensional memory;
FIG. 95 shows the two-dimensional format of ISW vectors resulting from the first Byzantine vote on the three-dimensional ISW matrices; and
FIG. 96 is a functional circuit diagram of the Byzantine Voter.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
The multi-computer architecture for fault tolerance is a distributed multi-computer system based on the functional and physical partitioning of the application tasks and the overhead functions, such as fault tolerance and systems operations. As shown in FIG. 1, the multi-computer architecture consists of a plurality of Nodes 10a through 10n, each having an Operations Controller 12 for performing the overhead functions and an Applications Processor 14 for executing the application tasks.
For each application, the multi-computer architecture is required to execute a predetermined set of tasks, collectively called application tasks. Each Node is allocated an active task set which is a subset of the application tasks. Each Node in coordination with all of the other Nodes is capable of selecting tasks from its active task set and executing them in a proper sequence. The active task set for each Node may be different from the active task set allocated to the other Nodes and each task in the application tasks may be included in the active task set of two or more Nodes depending upon how many Nodes are in the system and the importance of the task to the particular application. In this way, the multi-computer architecture defines a distributed multi-computer system in which no one Node 10 is required to execute every one of the application tasks, yet the failure of one or more Nodes need not prevent the execution of any application task. As shall be more fully explained later on, the active task set in each Node is static for any given system configuration or system state and will change as the system state changes with an increase or decrease in the number of active Nodes. This change in the active task set, called "reconfiguration," takes place automatically and assures that every one of the important or critical application tasks will be included in the active task set of at least one of the remaining active Nodes in the system.
Each Node 10a through 10n is connected to every other Node in the multi-computer architecture through its Operations Controller 12 by means of a private communication link 16. For example, the Operations Controller "A" is the only Operations Controller capable of transmitting on communication link 16a. All of the other Nodes are connected to the communication link 16a and will receive every message transmitted by the Operations Controller "A" over communication link 16a. In a like manner, the Operations Controller "B" of Node 10b is the only Operations Controller capable of transmitting messages on communication link 16b, and Operations Controller "N" of the Node 10n is the only Operations Controller capable of transmitting messages on communication link 16n.
External information from sensors and manually operated devices, collectively identified as Input Devices 20, is transmitted directly to the Applications Processor 14 of each Node through an input line 18. It is not necessary that every Applications Processor receive information from every sensor and/or Input Device; however, each Applications Processor 14 will receive the information from every sensor and/or Input Device which it needs in the execution of the applications task.
In a like manner, the Applications Processor 14 in each Node will transmit data and control signals resulting from the execution of the applications task to one or more actuators and/or display devices collectively identified as Output Devices 22. The data and/or control signals generated by the Applications Processor 14 in the individual Nodes 10a through 10n may be combined by a Combiner/Voter Network 24 before they are transmitted to the Output Devices 22. Further, when multiple values of the same data and/or control signals are generated by two or more of the Nodes, the Combiner/Voter Network 24 may also be used to generate a single voted value which is transmitted to the Output Devices 22. The use or omission of a Combiner/Voter Network 24 is optional. It is not necessary that every actuator or display receive the output generated by every Node in the system. The specific actuator or display only needs to be connected to the Node or Nodes whose Applications Processor 14 is capable of generating the data or command signals it requires.
The network of Operations Controllers 12 is the heart of the system and is responsible for the inter-node communications, system synchronization, data voting, error detection, error handling, task scheduling, and reconfiguration. The Applications Processors 14 are responsible for the execution of the application tasks and for communications with the Input Devices 20 and Output Devices 22. In the multi-computer architecture, the overhead functions performed by the Operations Controllers 12 are transparent to the operations of the Applications Processor 14. Therefore, the structure of the Applications Processor 14 may be based solely upon the application requirements. Because of this, dissimilar Applications Processors 14 may be used in different Nodes without destroying the symmetry of the multi-computer architecture.
The structural details of the Operations Controller 12 in each Node 10a through 10n are shown in FIG. 2. Each Operations Controller 12 has a Transmitter 30 for serially transmitting messages on the Node's private communication link 16. For discussion purposes, it will be assumed that the Operations Controller illustrated in FIG. 2 is the Operations Controller "A" as shown in FIG. 1. In this case, the Transmitter 30 will transmit messages on the private communication link 16a. Each Operations Controller also has a plurality of Receivers 32a through 32n, each of which is connected to a different private communication link. In the preferred embodiment, the number of Receivers 32a through 32n is equal to the number of Nodes in the multi-computer architecture. In this way, each Operations Controller 12 will receive all of the messages transmitted by every Node in the system including its own. Each Receiver 32a through 32n will convert each message received over the private communication link to which it is connected from a serial format to a parallel format and then forward it to a Message Checker 34. Each Receiver 32a through 32n will also check the vertical parity and the longitudinal redundancy codes appended to each of the received messages and will generate an error signal identifying any errors detected.
The Message Checker 34 monitors the Receivers 32a through 32n and subjects each received message to a variety of physical and logical checks. After completion of these physical and logical checks, the messages are sent to a Fault Tolerator 36. Upon the detection of any errors in any message, the Message Checker 34 will generate an error status byte which is also transmitted to the Fault Tolerator 36.
The Fault Tolerator 36 performs five basic functions. First, the Fault Tolerator performs further logical checks on the messages received from the Message Checker 34 to detect certain other errors that were not capable of being detected by the Message Checker 34. Second, the Fault Tolerator passes error free messages to a Voter 38 which votes on the content of all messages containing the same information to generate a voted value. Third, it passes selected fields from the error free messages to other subsystems as required. Fourth, the Fault Tolerator aggregates the internal error reports from the various error detection mechanisms in the Operations Controller and generates Error messages which are transmitted to all of the other Nodes in the system by the Transmitter 30. Finally, the Fault Tolerator 36 monitors the health status of each Node in the system and will initiate a local reconfiguration when a Node is added or excluded from the current number of operating Nodes. The Fault Tolerator 36 maintains a base penalty count table which stores the current base penalty counts accumulated for each Node in the system. Each time a Node transmits a message containing an error, every Node in the system, including the one that generated the message, should detect this error and generate an Error message identifying the Node that sent the message containing the error, the type of error detected, and a penalty count for the detected error or errors. Each Fault Tolerator 36 will receive these Error messages from every other Node and will increment the base penalty count for that Node which is currently being stored in the base penalty count table, if the detection of the error is supported by Error messages received from a majority of the Nodes. The magnitude of the penalty count increment is predetermined and is proportional to the severity of the error. 
If the incremented base penalty count exceeds an exclusion threshold, as shall be discussed later, the Fault Tolerator initiates a Node exclusion and a reconfiguration process in which the Faulty Node is excluded from active participation in the system and the active task sets for the remaining Nodes are changed to accommodate for the reduction in the number of active Nodes.
The Fault Tolerator 36 will also periodically decrement the base penalty count for each Node in the system so that a Node which was previously excluded may be readmitted into the active system. When a previously excluded Node continues to operate in an error free manner for a sufficient period of time, its base penalty count will be decremented below a readmittance threshold which will initiate a Node readmittance and reconfiguration process in which the previously excluded Node is readmitted into the active system. When the previously excluded Node is readmitted into the system the active task set for each Node is readjusted to accommodate for the increase in the number of active Nodes in the system.
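The penalty bookkeeping described above can be illustrated with a minimal sketch. It is not the circuit of the preferred embodiment. The class name `PenaltyTable`, its method names, the threshold values, the decrement step, and the choice to measure a majority against all Nodes in the table are assumptions made only for illustration.

```python
class PenaltyTable:
    """Toy base penalty count table kept by one Node (hypothetical names)."""

    def __init__(self, node_ids, exclusion_threshold, readmittance_threshold):
        self.counts = {n: 0 for n in node_ids}
        self.excluded = set()
        self.exclusion_threshold = exclusion_threshold
        self.readmittance_threshold = readmittance_threshold

    def report_error(self, accused, increment, accusers):
        """Add the increment only when a majority of Nodes reported the error.

        ASSUMPTION: the majority is taken over every Node in the table.
        """
        if len(set(accusers)) * 2 <= len(self.counts):
            return False  # not supported by a majority, so no penalty
        self.counts[accused] += increment
        if self.counts[accused] > self.exclusion_threshold:
            self.excluded.add(accused)  # exclusion would start a reconfiguration
        return True

    def periodic_decrement(self, amount=1):
        """Decay every count and readmit Nodes that fall below the threshold."""
        for node in self.counts:
            self.counts[node] = max(0, self.counts[node] - amount)
            if (node in self.excluded
                    and self.counts[node] < self.readmittance_threshold):
                self.excluded.discard(node)  # readmittance, also a reconfiguration
```

In this sketch a Node is excluded as soon as its count exceeds the exclusion threshold and is readmitted once the periodic decrement brings the count below the lower readmittance threshold, which gives the hysteresis the text describes.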
The Voter 38 performs an "on-the-fly" vote using all of the current copies of the data values received from the Fault Tolerator 36. The voted data value and all copies of the received data are passed to a Task Communicator 44 which stores them in a Data Memory 42. The Voter will select a voted data value using an appropriate algorithm as shall be discussed relative to the Voter 38 itself. Each time a new copy of a data value is received, a new voted data value is generated which is written over the prior voted data value stored in the Data Memory 42. In this manner, the Data Memory 42 always stores the most current voted data value, assuring that a voted data value is always available for subsequent processing even if one or more copies of the data value fail to be generated or a "hang" causes a late arrival. The Voter 38 will also perform a deviance check between the voted data value and each copy of the received data value, and will generate an error vector to the Fault Tolerator identifying each Node which generated a data value which differed from the voted data value by more than a predetermined amount. This arrangement will support both exact and approximate agreement between the copies of the data values. The Voter 38 supports several data types, including packed Boolean values, fixed point formats, and the IEEE standard 32-bit floating point format.
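The "on-the-fly" vote and the deviance check can be sketched as follows. The median is used here only as a stand-in for the voting algorithm, which is discussed with the Voter 38 itself; the class name `OnTheFlyVoter`, its methods, and the `max_deviance` parameter are hypothetical.

```python
from statistics import median


class OnTheFlyVoter:
    """Recomputes a voted value every time a new copy arrives (illustrative)."""

    def __init__(self, max_deviance):
        self.max_deviance = max_deviance  # permissible deviance from the voted value
        self.copies = {}                  # node id -> latest copy of the data value
        self.voted = None                 # most recent voted value

    def receive(self, node_id, value):
        """Store the new copy and overwrite the prior voted value."""
        self.copies[node_id] = value
        # ASSUMPTION: median stands in for the Voter's actual algorithm.
        self.voted = median(self.copies.values())
        return self.voted

    def deviance_errors(self):
        """Return the Nodes whose copy differs from the voted value too much."""
        return sorted(n for n, v in self.copies.items()
                      if abs(v - self.voted) > self.max_deviance)
```

Because a voted value exists after the first copy arrives, a consumer never has to wait for a late or missing copy. Setting `max_deviance` to zero corresponds to exact agreement and a positive value to approximate agreement.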
A Scheduler 40 has two modes of operation, normal and reconfiguration. In the normal mode of operation the Scheduler 40 is an event driven, priority based, globally verified scheduling system which selects from its active task set the next task to be executed by its associated Applications Processor 14. For a given system configuration (set of active Nodes) the active task set assigned to each Node is static. Each time the associated Applications Processor begins a task, the Scheduler 40 selects the next task to be executed. The Applications Processor will immediately begin the execution of the selected task and the Task Communicator 44 will immediately initiate the generation of a message informing all of the other Nodes of the identity of the selected task, the identity of the preceding task finished by the Applications Processor 14, and the branch conditions of the preceding task. Conditional branching is controlled by the Applications Processor 14 and is determined by conditions in the application environment. The precedence relationship between a task and its successor task may include conditional branches, concurrent forks, and join operations implemented at task boundaries.
Conditional branching provides an efficient means of switching operational modes and avoids the necessity of scheduling tasks not required by the current conditions. An interactive consistency voting process guarantees agreement on the branch conditions generated by the other Nodes which executed the same task.
The Scheduler 40 in each Node replicates the scheduling process for every other Node in the system and maintains a global data base on the scheduling and execution of tasks by each Node. Upon the receipt of a message from another Node identifying the task completed and the task started, the Scheduler 40 will compare the task completed with the task previously reported as started and generate a scheduling error signal if they are not the same. The Scheduler 40 will also compare the task reported as started with a task it has scheduled to be started by that Node. If they are different, the Scheduler will also generate a scheduling error signal. The Scheduler 40 will pass all scheduling error signals to the Fault Tolerator 36. All of the Scheduler's error detection mechanisms are globally verified and have been designed to ensure that failure of one or more copies of a task does not upset scheduling.
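The two comparisons made on each received Task Completed/Started message can be sketched as below. The class `SchedulingMonitor`, its method names, and the error labels are hypothetical and stand in for the Scheduler's global data base only loosely.

```python
class SchedulingMonitor:
    """Tracks, per Node, the last started task and the locally scheduled task."""

    def __init__(self):
        self.last_started = {}   # node id -> task last reported as started
        self.expected_next = {}  # node id -> task this Node scheduled for that Node

    def expect(self, node_id, task_id):
        """Record the task the local replica of the schedule expects next."""
        self.expected_next[node_id] = task_id

    def on_task_message(self, node_id, completed, started):
        """Return the scheduling errors raised by one completed/started report."""
        errors = []
        previous = self.last_started.get(node_id)
        if previous is not None and completed != previous:
            errors.append("completed-mismatch")  # finished a task it never started
        expected = self.expected_next.get(node_id)
        if expected is not None and started != expected:
            errors.append("started-mismatch")    # started a task not scheduled for it
        self.last_started[node_id] = started
        return errors
```

Any non-empty result would be forwarded to the Fault Tolerator 36 as a scheduling error signal.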
In the reconfiguration mode of operation, a reversible path independent reconfiguration algorithm provides graceful degradation of the workload as faulty Nodes are excluded from the operating system. Because the algorithm is reversible it also supports graceful restoration of the workload as previously excluded Nodes are readmitted following an extended period of error free operation.
In reconfiguration, the active task set allocated to each Node is altered to compensate for the change in the number of active Nodes. During reconfiguration after the exclusion of a faulty Node, the active task set, or at least the critical tasks of the faulty Node's active task set, may be reallocated and included in the active task set of the other Nodes. In other instances, individual tasks may be globally disabled and replaced with simpler tasks, and some noncritical tasks may be disabled with no replacement. The reconfiguration process readjusts the active task set for the active Nodes to accommodate the system capabilities. The algorithm supports true distributed processing, rather than just a replication of uniprocessor task loads on redundant Nodes.
The Task Communicator 44 functions as an input/output (I/O) interface between the Operations Controller 12 and the Applications Processor 14. The Applications Processor 14 signals the Task Communicator 44 when it is ready for the next task. A simple handshaking protocol is employed to synchronize communications between the Applications Processor 14 and the Task Communicator 44. Upon receipt of this signal the Task Communicator 44 reads the selected task from the Scheduler 40 and transfers it to the Applications Processor 14. Concurrently, the Task Communicator 44 will initiate the transmission of the task completed/task started message identifying the task completed by the Applications Processor 14, the task being started by the Applications Processor and the branch conditions of the completed task. The Task Communicator 44 will then fetch the data required for the execution of the started task from the Data Memory 42 and temporarily store it in a buffer in the order in which it is required for the execution of the started task. The Task Communicator will pass these data values to the Applications Processor as they are requested. Effectively, the Task Communicator 44 looks like an input file to the Applications Processor 14.
The Task Communicator 44 also receives the data values generated by the Applications Processor 14 in the execution of the selected task and generates Data Value messages which are broadcast by the Transmitter 30 to all of the other Nodes in the system. The Task Communicator will also append to the Data Value message a data identification (DID) code and a message type (MT) code which uniquely identifies the message as a Data Value message.
A Synchronizer 46 provides two independent functions in the operation of the multi-computer architecture. The first function pertains to the synchronization of the operation of the Nodes 10a through 10n during steady state operation; the second function pertains to the synchronization of the Nodes on start up. During steady state operation, the Synchronizer 46 effects a loose frame-based synchronization of the Nodes by the exchange of messages which implicitly denote local clock times. The Synchronizer 46 in each Node counts at its own clock rate up to a "nominal sync count," then issues a presynchronization System State message which is immediately broadcast by the Transmitter 30 to all of the other Nodes in the system. As the presynchronization System State messages from all the Nodes in the system, including its own, are received at each Node, they are time stamped in the Synchronizer as to their time of arrival from the Message Checker 34. The time stamps are voted on to determine a voted value for the arrival time of the presynchronization System State messages from all the Nodes. The difference between the voted time stamp value and the time stamp of the Node's own presynchronization System State message is an error estimate which is used to compute a corrected sync count. The error estimate includes any accumulated skew from previous synchronization rounds and the effects of clock drift. The Synchronizer 46 will then count up to the corrected sync count and issue a synchronization System State message which is immediately transmitted by the Transmitter 30 to all of the other Nodes in the system. The synchronization System State messages will also be time stamped as to their arrival in the Synchronizers in each Node in the system.
The time stamps of all presynchronization and synchronization System State messages are compared with the voted time stamp value to determine which Nodes are in synchronization with its own Node and which are not. When the difference in the time stamps exceeds a first magnitude, a soft error signal is generated signifying a potential synchronization error. However, if the time stamp difference exceeds a second magnitude, larger than the first magnitude, a hard error signal is generated signifying that a synchronization error has definitely occurred. The soft and hard error signals are transmitted to the Fault Tolerator 36 and are handled in the same manner as any other detected error.
Start up is defined as a process for creating a functional configuration of Nodes called an "operating set." If an "operating set" is in existence, and the functional configuration is changed by the admittance or readmittance of one or more Nodes, the process is called a "warm start." If no "operating set" is in existence, it is called a "cold start." In a warm start, the Synchronizer 46 will recognize the existence of an operating set and will attempt to achieve synchronization with the operating set. A cold start is initiated by a power on reset (POREST) signal generated in response to the initial application of electrical power to the system. Each Synchronizer 46 will attempt to achieve point-to-point synchronization with all the Nodes until an operating set is formed. Once an operating set is formed, those Nodes not included in the operating set will switch to the warm start process and will attempt to achieve synchronization with the operating set.
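The steady state synchronization arithmetic described above (voted time stamp, error estimate, corrected sync count, and the soft and hard error windows) can be sketched as follows. The median stands in for the time stamp vote, and the sign convention of the correction, the function names, and the window parameters are all assumptions made for illustration.

```python
from statistics import median


def sync_error_estimate(arrival_stamps, own_node):
    """Return (voted time stamp, error estimate) for one synchronization round.

    arrival_stamps maps node id -> local arrival time of that Node's
    presynchronization message. ASSUMPTION: the vote is a median.
    """
    voted = median(arrival_stamps.values())
    return voted, voted - arrival_stamps[own_node]


def corrected_sync_count(nominal_sync_count, error_estimate):
    """ASSUMPTION: a Node that ran early (positive estimate) counts longer."""
    return nominal_sync_count + error_estimate


def classify_stamps(arrival_stamps, voted, soft_window, hard_window):
    """Label each Node 'ok', 'soft', or 'hard' by its distance from the vote."""
    result = {}
    for node, stamp in arrival_stamps.items():
        diff = abs(stamp - voted)
        if diff > hard_window:
            result[node] = "hard"   # synchronization error has definitely occurred
        elif diff > soft_window:
            result[node] = "soft"   # potential synchronization error
        else:
            result[node] = "ok"
    return result
```

With five arrival stamps of 100, 103, 104, 109, and 130 counts and windows of 3 and 10 counts, the voted stamp is 104, the Node that stamped 100 gets an error estimate of 4, and the Node that stamped 130 is flagged with a hard error.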
INTER-NODE MESSAGES
The operation of the multi-computer architecture depends upon the exchange of data and operational information by means of inter-node messages. These inter-node messages are data-flow instructions which indicate to each individual Operations Controller how they should be processed.
The various inter-node messages and their information content are listed in Table I.
TABLE I
Inter-Node Message Formats
______________________________________________________________________
Message Type   Description/                   Byte
Number         Abbreviation                   Number   Context
______________________________________________________________________
MT0            One Byte Data Value            1        NID/Message Type
                                              2        Data I.D.
                                              3        Data Value
                                              4        Block Check
MT1            Two Byte Data Value            1        NID/Message Type
                                              2        Data I.D.
                                              3-4      Data Value
                                              5        Block Check
MT1            Task Interactive               1        NID/Message Type
               Consistency (TIC)              2        Data I.D. = 0
                                              3        Task Completed Vector
                                              4        Task Branch Condition Bits
                                              5        Block Check
MT2            Four Byte Data Value (D4B)     1        NID/Message Type
                                              2        Data I.D.
                                              3-6      Data Value
                                              7        Block Check
MT3            Four Byte Data Value (D4B2)    1        NID/Message Type
                                              2        Data I.D.
                                              3-6      Data Value
                                              7        Block Check
MT4            Base Penalty Count (BPC)       1        NID/Message Type
                                              2        Base Count 0
                                              3        Base Count 1
                                              4        Base Count 2
                                              5        Base Count 3
                                              6        Base Count 4
                                              7        Base Count 5
                                              8        Base Count 6
                                              9        Base Count 7
                                              10       Block Check
MT5            System State (SS)              1        NID/Message Type
                                              2        Function Bits
                                              3        Task Completed Vector
                                              4        Task Branch Condition Bits
                                              5        Current System State
                                              6        New System State
                                              7        Period Counter (High)
                                              8        Period Counter (Low)
                                              9        ISW Byte
                                              10       Reserved
                                              11       Block Check
MT6            Task Completed/Started (TC/S)  1        NID/Message Type
                                              2        Completed Task ID
                                              3        Started Task ID
                                              4        Branch Condition/ECC
                                              5        Block Check
MT7            Error (ERR)                    1        NID/Message Type
                                              2        Faulty Node ID
                                              3        Error Byte 1
                                              4        Error Byte 2
                                              5        Error Byte 3
                                              6        Error Byte 4
                                              7        Penalty Base Count
                                              8        Penalty Increment Count
                                              9        Block Check
______________________________________________________________________
The inter-node messages all have the same basic format so as to simplify their handling in the receiving mode. The first byte of each inter-node message contains the Node identification (NID) code of the Node from which the message originated and a message type (MT) code identifying the message type. The last byte in each inter-node message is always a block check byte which is checked by the Receivers 32a through 32n to detect transmission errors.
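The common framing of the inter-node messages can be sketched as below. The message lengths are read off Table I. The packing of the NID and MT codes within the first byte (three bits each) and the XOR block check are assumptions made only for illustration, since neither is specified in this part of the description; all function names are hypothetical.

```python
# Total length in bytes of each message type, read off Table I.
MESSAGE_LENGTHS = {0: 4, 1: 5, 2: 7, 3: 7, 4: 10, 5: 11, 6: 5, 7: 9}


def split_header(first_byte):
    """ASSUMPTION: NID occupies bits 5-3 and MT occupies bits 2-0."""
    return (first_byte >> 3) & 0x7, first_byte & 0x7


def xor_block_check(body):
    """ASSUMPTION: the block check byte is the XOR of all preceding bytes."""
    check = 0
    for byte in body:
        check ^= byte
    return check


def parse_message(raw):
    """Return (nid, mt, payload) or raise ValueError on a framing error."""
    nid, mt = split_header(raw[0])
    if len(raw) != MESSAGE_LENGTHS[mt]:
        raise ValueError("length does not match message type")
    if xor_block_check(raw[:-1]) != raw[-1]:
        raise ValueError("block check failed")
    return nid, mt, bytes(raw[1:-1])
```

Because the first byte always carries the NID and MT codes and the last byte is always the block check, a receiver can frame and verify any message before it looks at the type-specific payload.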
There are four different Data Value messages which range from a one byte Data Value message to a four byte Data Value message. These Data Value messages are identified as message types MT0 through MT3. The second byte of a Data Value message is a data identification (DID) code which, when combined with the message type code, uniquely distinguishes that particular data value from the other data values used in the system. The data identification (DID) code is used by the Message Checker 34 to define the types of checks that are to be performed. The MT/DID codes are used by the Message Checker 34 to identify which limits will be used, by the Voter 38 to define the permissible deviance of each actual data value from the voted values, and by the Task Communicator 44 to identify the data value to be supplied to the Applications Processor 14 in the execution of the current task. The bytes following the data identification byte are the data values themselves, with the last byte being the block check byte as previously indicated.
A Task Interactive Consistency (TIC) message is a special case of the two byte Data Value message which is identified by the DID being set to zero (0). The Task Interactive Consistency message, message type MT1, is a rebroadcast of the task completed vector and branch condition data contained in Task Completed/Started (TC/S) messages received from the other Nodes and is transmitted at the end of each Subatomic period (SAP), as shall be explained in the discussion of the timing sequence. The information content of the Task Interactive Consistency messages is voted on by each Node and the voted values are used by the Scheduler 40 in the task selection and scheduling process.
A Base Penalty Count (BPC) message, message type MT4, contains the base penalty count that the individual Node is storing for each Node in the system including itself. Each Node will use this information to generate a voted base penalty count for each Node in the system. Thereafter, each Node will store the voted base penalty count as the current base penalty count for each Node. This assures that at the beginning of each Master period each Node is storing the same base penalty counts for every other Node in the system. The Base Penalty Count message is transmitted by each Node at the beginning of each Master period timing interval.
A System State (SS) message, message type MT5, is sent at the end of each Atomic period timing interval and is used for the point-to-point synchronization of the Nodes and to globally affirm reconfiguration when a majority of the Nodes conclude that reconfiguration is required. The transmission of the System State message is timed so that the end of its transmission coincides with the end of the preceding Atomic period and the beginning of the next Atomic period. The first byte of the System State message contains the node identification (NID) code of the originating Node and the message type (MT) code. The second byte contains three function bits. The first two bits are the synchronization and presynchronization bits which are used in the synchronization process described above. The third bit identifies whether the Node is operating or excluded. The third and fourth bytes of the System State message are the task completed vector and the branch condition vector, respectively. Byte five contains the current system state vector and byte six contains the new system state vector. When the sending Node has concluded reconfiguration is necessary, the new system state vector will be different from the current system state vector. Bytes seven and eight contain the higher and lower order bits of the Node's own period counter. Byte nine is an "in sync with" (ISW) vector which defines which Nodes that particular Node determines it is synchronized with, and byte ten is reserved for future use. Byte eleven is the conventional block check byte at the end of the message. The Synchronizer uses the time stamp of the pre-synchronization System State messages, identified by the pre-synchronization bit in the second byte being set, to generate an error estimate used to compute a correction to the time duration of the last Subatomic period. This correction synchronizes the beginning of the next Atomic period in that Node with the Atomic period being generated by the other Nodes.
The period counter bytes are used to align the Master periods of all the Nodes in the system. The period counter counts the number of Atomic periods from the beginning of each period and is reset when it counts up to the fixed number of Atomic periods in each Master period. Byte nine is used only during an automatic cold start as shall also be explained in more detail in the discussion of the Synchronizer 46.
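The byte layout of the System State message can be illustrated with a small decoder. This is a sketch only: the names `SystemState` and `decode_system_state` are hypothetical, the function bits are kept as a raw byte because their bit positions are not given here, and the block check is not verified.

```python
from dataclasses import dataclass


@dataclass
class SystemState:
    function_bits: int          # byte 2, bit positions not assumed here
    task_completed_vector: int  # byte 3
    branch_condition_bits: int  # byte 4
    current_state: int          # byte 5
    new_state: int              # byte 6
    period_counter: int         # bytes 7 (high) and 8 (low) combined
    isw_vector: int             # byte 9, "in sync with" vector

    @property
    def requests_reconfiguration(self):
        """True when the sender's new system state differs from the current one."""
        return self.new_state != self.current_state


def decode_system_state(message):
    """Decode an 11 byte System State message (byte 1 is at index 0)."""
    if len(message) != 11:
        raise ValueError("a System State message is 11 bytes long")
    period_counter = (message[6] << 8) | message[7]
    return SystemState(message[1], message[2], message[3], message[4],
                       message[5], period_counter, message[8])
```

A receiver could compare `period_counter` across Nodes to align Master periods and count how many senders have `requests_reconfiguration` set before affirming a reconfiguration.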
The Task Completed/Started (TC/S) message, message type MT6, is generated by the Task Communicator 44 each time the Applications Processor 14 starts a new task. The second and third bytes of the Task Completed/Started message contain the task identification (TID) codes of the task completed and new task started by the Node's Applications Processor 14. The fourth byte of this message contains the branch condition of the completed task, and an error correction code (ECC).
The last inter-node message is the Error message, message type MT7, which is sent whenever the Transmitter 30 is free during an Atomic period. Only one Error message reporting the errors attributed to a particular Node can be sent in an Atomic period. The second byte of the Error message is the Node identification (NID) code of the Node accused of being faulty. The following four bytes contain error flags identifying each error detected. The seventh and eighth bytes of the Error message contain the base penalty count of the identified Node and the increment penalty count which is to be added to the base penalty count if the errors are supported by Error messages received from other Nodes. The increment penalty count is based on the number of errors detected and the severity of these errors. This information is used by the other Nodes to generate a new voted base penalty count for the Node identified in the Error message. A separate Error message is sent for each Node which generates a message having a detected error.
TIMING PERIODS
The overall control system of the multi-computer architecture contains a number of concurrently operating control loops with different time cycles. The system imposes the constraint that each cycle time be an integer power of two times a fundamental time interval called an Atomic period. This greatly simplifies the implementation of the Operations Controller 12 and facilitates the verification of correct task scheduling. The length of the Atomic period is selected within broad limits by the system designer for each particular application. The System State messages which are used for synchronization are sent at the end of each Atomic period.
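The power-of-two constraint on cycle times can be checked with a few lines of code. The sketch below is illustrative only; it assumes times are expressed as integer ticks and that the exponent is a non-negative integer, and the function name `is_valid_cycle` is hypothetical.

```python
def is_valid_cycle(cycle_time: int, atomic_period: int) -> bool:
    """True when cycle_time equals atomic_period * 2**k for some integer k >= 0."""
    if atomic_period <= 0 or cycle_time <= 0:
        return False
    if cycle_time % atomic_period != 0:
        return False
    ratio = cycle_time // atomic_period
    # A positive integer is a power of two when it has exactly one bit set.
    return (ratio & (ratio - 1)) == 0
```

With an Atomic period of 10 ticks, cycle times of 10, 20, 40 and 80 ticks would be accepted and a cycle time of 30 ticks rejected.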
The longest control loop employed by the system is the Master period. Each Master period contains a fixed number of Atomic periods, as shown in FIG. 3. All task scheduling parameters are reinitialized at the beginning of each Master period to prevent the propagation of any scheduling errors. The Nodes will also exchange Base Penalty Count messages immediately following the beginning of each Master period.
The shortest time period used in the system is the Subatomic (SAP) period, as shown in FIG. 4, which defines the shortest execution time recognized by the Operations Controller 12 for any one task. For example, if the execution time of a task is less than a Subatomic period, the Operations Controller 12 will not forward the next scheduled task to the Applications Processor 14 until the beginning of the next Subatomic period. However, when the execution time of a task is longer than a Subatomic period, the Operations Controller 12 will forward the next scheduled task to the Applications Processor as soon as it is ready for it. There are an integer number of Subatomic periods in each Atomic period which are selectable by the systems designer to customize the multi-computer architecture to the particular application. As shown in FIG. 4, each Subatomic period is delineated by a Task Interactive Consistency message as previously described.
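The forwarding rule for the next scheduled task can be illustrated as follows. The sketch assumes integer tick times with Subatomic period boundaries at multiples of the period length, and it ignores any delay in the Applications Processor signalling that it is ready; the function name `next_dispatch_time` is hypothetical.

```python
def next_dispatch_time(task_start: int, exec_time: int, sap: int) -> int:
    """Earliest tick at which the next scheduled task is forwarded.

    task_start: tick at which the current task started
    exec_time:  execution time of the current task, in ticks
    sap:        length of one Subatomic period, in ticks
    """
    finished = task_start + exec_time
    if exec_time < sap:
        # Short task: hold the next task until the next Subatomic boundary
        # at or after the completion time (ceiling division).
        return -(-finished // sap) * sap
    # Long task: forward the next task as soon as the processor is ready.
    return finished
```

For a Subatomic period of 10 ticks, a task started at tick 0 that runs for 3 ticks releases the next task at tick 10, while one that runs for 25 ticks releases it at tick 25.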
TRANSMITTER
FIG. 5 is a block diagram of the Transmitter 30 embodied in each of the Operations Controllers 12. The Transmitter 30 has three interfaces, a Synchronizer Interface 50 receiving Task Interactive Consistency messages and System State messages generated by the Synchronizer 46, a Fault Tolerator Interface 52 receiving the Error and Base Penalty Count messages generated by the Fault Tolerator 36, and a Task Communicator Interface 54 receiving Data Value and Task Completed/Started messages generated by the Task Communicator 44. The three interfaces are connected to a Message Arbitrator 56 and a Longitudinal Redundancy Code Generator 58. The Message Arbitrator 56 determines the order in which the messages ready for transmission are to be sent. The Longitudinal Redundancy Code Generator 58 generates a longitudinal redundancy code byte which is appended as the last byte to each transmitted message. The message bytes are individually transferred to a Parallel-to-Serial Converter 60 where they are framed between a start bit and two stop bits, then transmitted in a serial format on communication link 16.
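The serial framing performed by the Parallel-to-Serial Converter 60 can be sketched as below. Each message byte travels with a parity bit generated by its interface, so a framed byte occupies twelve bit times: one start bit, eight data bits, one parity bit and two stop bits. The bit order, the parity sense and the logic levels of the start and stop bits are not stated in this description and are assumptions of the example, as are the names `frame_byte`, `FRAME_BITS` and `QUIET_PERIOD_BITS`.

```python
FRAME_BITS = 12                     # start + 8 data + parity + 2 stop
QUIET_PERIOD_BITS = 2 * FRAME_BITS  # two null bytes between messages


def frame_byte(data: int) -> list:
    """Return the twelve serial bits for one message byte."""
    if not 0 <= data <= 0xFF:
        raise ValueError("data must fit in eight bits")
    data_bits = [(data >> i) & 1 for i in range(8)]  # assumed: LSB first
    parity = (sum(data_bits) + 1) % 2                # assumed: odd parity
    return [0] + data_bits + [parity] + [1, 1]       # assumed: start=0, stop=1
```

The constant `QUIET_PERIOD_BITS` evaluates to 24, the length of the quiet period in bit times.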
The Transmitter 30 also includes a Self-Test Interface 62 which upon command retrieves a predetermined self-test message from an external ROM (not shown) which is input into the Longitudinal Redundancy Code Generator 58 and transmitted to the communication link by the Parallel-to-Serial Converter 60. The Transmitter 30 also has an Initial Parameter Load Module 64 which will load into the Transmitter various predetermined parameters, such as the length of the minimum synchronization period between messages, the length of a warning period for Interactive Consistency and System State messages and the starting address in the ROM where the self-test messages are stored.
As shown in FIG. 6, each of the three interfaces has an eight bit input register 66 which receives the messages to be transmitted from its associated message source through a multiplexer 68. The multiplexer 68 also receives the three bit Node identification (NID) code which identifies the Node which is generating the message.
Whenever the associated message source has a message to be transmitted, it will hold the message until a buffer available signal is present signifying the input register 66 is empty. The message source will then transmit the first byte of the message to the input register 66. A bit counter 70 will count the strobe pulses clocking the message into the input register 66 and will, in coordination with a flip flop 72 and an AND gate 74, actuate the multiplexer 68 to clock the three bit Node identification code into the input register 66 as the three most significant bits of the first byte. The flip flop 72 is responsive to the signal "transmit quiet period" (TQP) generated at the end of the preceding message to generate a first byte signal at its Q output which enables AND gates 74 and 76. The AND gate 74 will transmit the three most significant bits generated by the bit counter 70 in response to the strobe signals loading the first byte into the input register 66 and will actuate the multiplexer 68 to load the three bit Node identification code into the three most significant bit places of the input register 66.
The AND gate 76 will respond to the loading of the eighth bit into input register 66 and will generate an output which will actuate the flip flop 78 to a set state. In the set state, the flip flop 78 will generate a message available signal at its Q output and will terminate the buffer available signal at its Q output. The message available (MA) signal will reset the flip flop 72 terminating the first byte signal which in turn disables the AND gates 74 and 76. The message available (MA) signal is also transmitted to the Message Arbitrator 56 signifying a message is ready for transmission.
Termination of the buffer available (BA) signal when the flip flop 78 is put in the set state inhibits the message source from transmitting the remaining bytes of the message to the Transmitter 30. The three least significant bits of the first byte, which are the message type code, are communicated directly to the Message Arbitrator 56 and are used in the arbitration process to determine which message is to be sent if more than one message is available for transmission, and whether the sending of that message will interfere with the transmission of a time critical message generated by the Synchronizer 46.
The Message Arbitrator 56 will generate a transmit (Txxx) signal identifying the next message to be sent when there is more than one message ready for transmission. This signal will actuate the Longitudinal Redundancy Code Generator 58 to pass the selected message to the Parallel-to-Serial Converter 60 for transmission. The transmit signal will also reset the flip flop 78 in the appropriate interface which reasserts the buffer available (BA) signal, actuating the associated message source to transmit the remaining bytes of the message to the interface. These are then transmitted directly to the Longitudinal Redundancy Code Generator 58 as they are received. When all of the bytes of the message are transmitted, the Message Arbitrator 56 will generate a transmit quiet period (TQP) signal which actuates the Parallel-to-Serial Converter 60 to transmit a null (synchronization) signal for a predetermined period of time following the transmission of each message. In the preferred embodiment, the quiet period is the time required for the transmission of 24 bits or two (2) null bytes. The transmit quiet period (TQP) signal will also set the flip flop 72 indicating that the preceding message has been sent and that the next byte received from the associated message source will be the first byte of the next message.
The details of the Message Arbitrator 56 are shown in FIG. 7. Under normal operation when no time critical messages, such as Task Interactive Consistency (TIC) and System State (SS) messages, are to be sent, a Fault Tolerator (FLT)-Task Communicator (TSC) Arbitration Logic 82 will generate, in an alternating manner, PFLT and PTSC polling signals which are received at the inputs of AND gates 84 and 86, respectively. The AND gate 84 will also receive the Fault Tolerator message available (FLTMA) signal generated by the Fault Tolerator Interface 52 while AND gate 86 will receive a Task Communicator message available (TSCMA) signal generated by the Task Communicator Interface 54 after the Task Communicator 44 has completed the loading of the first byte of the message ready for transmission. The outputs of the AND gates 84 and 86 are transmit Fault Tolerator (TFLT) and transmit Task Communicator (TTSC) signals which are applied to AND gates 88 and 90, respectively. The alternate inputs to AND gates 88 and 90 are received from a Time Remaining-Message Length Comparator 92 which produces an enabling signal whenever the transmission of the selected message will not interfere with the transmission of a time dependent message as shall be explained hereinafter. If the AND gate 88 is enabled it will pass the transmit Fault Tolerator (TFLT) signal to the Fault Tolerator Interface 52 to reassert the buffer available signal, enabling it to receive the remaining bytes of the message from the Fault Tolerator 36, and to the Longitudinal Redundancy Code Generator 58, enabling it to pass the message, byte-by-byte, from the Fault Tolerator Interface 52 to the Parallel-to-Serial Converter 60 for transmission on the communication link 16.
In a like manner, when the AND gate 90 is enabled, and the polling of the Task Communicator Interface 54 indicates that the Task Communicator 44 has a message ready for transmission, then the AND gate 86 will generate a transmit Task Communicator (TTSC) signal which, if passed by the AND gate 90, will result in the transmission of the Task Communicator's message. The TFLT and the TTSC signals, when generated, are fed back to lock the FLT-TSC Arbitration Logic 82 in its current state until after the message is sent.
The message arbitration between the Fault Tolerator's and Task Communicator's messages is primarily dependent upon the type of the message currently being transmitted. The logic performed by the FLT-TSC Arbitration Logic 82 is summarized on Table II.
TABLE II
______________________________________
FLT-TSC Arbitration Logic Table
                                Poll Next Then      Poll Next Then
Current Message                 Alternate           Wait For Message
______________________________________
Fault Tolerator                 Task Communicator   --
Task Communicator               Fault Tolerator     --
System State (Master Period)    --                  Fault Tolerator
System State (Atomic Period)    Task Communicator   --
Interactive Consistency         Task Communicator   --
Self Test                       --                  Task Communicator
______________________________________
Normally the FLT-TSC Arbitration Logic 82 will poll the Fault Tolerator Interface 52 and the Task Communicator Interface 54 in an alternating sequence. However, at the beginning of each Atomic period, the FLT-TSC Arbitration Logic 82 will first poll the Task Communicator Interface 54 for a Task Completed/Started message which will identify the task being started by that Node. If the Task Completed/Started message is not available it will then poll the Fault Tolerator Interface 52.
At the beginning of each Master period, all of the Nodes should transmit a Base Penalty Count message which is used for global verification of the health of each Node in the system. Therefore, after each System State message which is coincident with the beginning of a Master period, the FLT-TSC Arbitration Logic will first poll the Fault Tolerator Interface 52 and wait until it receives the Base Penalty Count message from the Fault Tolerator 36. After the transmission of the Base Penalty Count message, it will then poll the Task Communicator Interface 54 and transmit a Task Completed/Started message identifying the task scheduled to be started by the Applications Processor. If the Fault Tolerator 36 does not generate a Base Penalty Count message within a predetermined period of time, the FLT-TSC Arbitration Logic 82 will resume polling of the Fault Tolerator Interface 52 and the Task Communicator Interface 54 in an alternating sequence. In a like manner, after a self-test message, the FLT-TSC Arbitration Logic 82 will poll the Task Communicator Interface 54 and wait for a Task Completed/Started message.
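The polling order described for the FLT-TSC Arbitration Logic 82 can be sketched as a lookup from the message just transmitted to the interface polled next, together with a flag saying whether the logic waits for that interface or falls back to the other one. This is a simplified illustration: the timeout that ends the wait for a Base Penalty Count message is omitted, the entry for Interactive Consistency messages is assumed to follow the alternating rule, and the names `Interface`, `NEXT_POLL` and `select_next` are hypothetical.

```python
from enum import Enum


class Interface(Enum):
    FLT = "Fault Tolerator"
    TSC = "Task Communicator"


# message just sent -> (interface polled next, wait for its message?)
NEXT_POLL = {
    "fault_tolerator": (Interface.TSC, False),
    "task_communicator": (Interface.FLT, False),
    "system_state_master": (Interface.FLT, True),      # wait for Base Penalty Count
    "system_state_atomic": (Interface.TSC, False),
    "interactive_consistency": (Interface.TSC, False),  # assumed: alternate
    "self_test": (Interface.TSC, True),                 # wait for Task Completed/Started
}


def select_next(last_message: str, flt_ready: bool, tsc_ready: bool):
    """Return the interface allowed to transmit next, or None if none may."""
    first, wait = NEXT_POLL[last_message]
    other = Interface.TSC if first is Interface.FLT else Interface.FLT
    ready = {Interface.FLT: flt_ready, Interface.TSC: tsc_ready}
    if ready[first]:
        return first
    if not wait and ready[other]:
        return other  # alternate to the other interface
    return None
```

After a System State message at the start of a Master period, for example, `select_next("system_state_master", False, True)` returns `None` because the logic is waiting on the Fault Tolerator Interface 52.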
The Synchronizer 46 will load the first byte of either a Task Interactive Consistency or System State message in the Synchronizer Interface 50 a predetermined period of time before the beginning of the next Subatomic or Atomic period. A Warning Period Generator 94 will load a warning period counter with a number corresponding to the number of bits that are capable of being transmitted before the Task Interactive Consistency or System State message is to be transmitted. As described previously, the transmission of the final bit of either of these messages marks the end of the previous Subatomic or Atomic period, respectively; therefore, their transmission will begin a predetermined time (bit count) before the end of the period. Since the Task Interactive Consistency and System State messages are of different bit lengths, the number loaded into the warning period counter will be different. The Warning Period Generator 94 will decode the message type code contained in the first byte of the message stored in the Synchronizer Interface 50 and will load the warning period counter with a number indicative of the length of the warning period for that particular type of time critical message. The warning period counter will be counted down at the bit transmission rate of the Parallel-to-Serial Converter 60 to generate a number indicative of the time remaining before the transmission of a time critical message. The number of counts remaining in the warning period counter is communicated to a Synchronizer Transmission Control 96 and the Time Remaining-Message Length Comparator 92. When the warning period counter is counted down to zero the Synchronizer Transmission Control 96 will generate a transmit synchronizer (TSYN) signal which will actuate the Synchronizer Interface 50 to reassert the buffer available signal and will actuate the Longitudinal Redundancy Code Generator 58 to pass the message from the Synchronizer Interface 50 to the Parallel-to-Serial Converter 60 for transmission on the Node's own communication link 16.
The Time Remaining-Message Length Comparator 92 will decode the message type of a message selected for transmission by the FLT-TSC Arbitration Logic 82 and determine the number of bits that have to be transmitted for that message. To this number the Time Remaining-Message Length Comparator 92 will add a number equal to the number of bits corresponding to the quiet period between the messages and compare the sum of the message and the quiet period with the count remaining in the warning period counter to determine if the transmission of the selected message will or will not interfere with the transmission of the time critical message from the Synchronizer Interface 50. If the transmission of the selected message will not interfere with the sending of the time critical message from the Synchronizer 46, the Time Remaining-Message Length Comparator 92 will generate a signal enabling AND gates 88 and 90 to pass the TFLT or TTSC signals, otherwise the Time Remaining-Message Length Comparator 92 will generate a signal disabling AND gates 88 and 90, inhibiting the transmission of the selected message from either the Fault Tolerator Interface 52 or the Task Communicator Interface 54. This signal will also toggle the FLT-TSC Arbitration Logic 82 to poll the nonselected interface to determine if it has a message to transmit. If the nonselected interface has a message ready for transmission, the Time Remaining-Message Length Comparator 92 will determine if there is sufficient time to transmit the message from the nonselected interface before the transmission of the time critical message from the Synchronizer Interface 50. If there is sufficient time, the message from the nonselected interface will be transmitted, otherwise the AND gates 88 and 90 will remain disabled.
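The comparison made by the Time Remaining-Message Length Comparator 92 reduces to a single inequality, sketched below. The example assumes that a message exactly filling the remaining time is still allowed (the text does not say whether the comparison is strict), takes the quiet period as 24 bit times, and uses the hypothetical names `may_transmit` and `choose_interface`.

```python
QUIET_PERIOD_BITS = 24  # quiet period following each message, in bit times


def may_transmit(message_bits: int, remaining_bits: int,
                 quiet_bits: int = QUIET_PERIOD_BITS) -> bool:
    """True when the message plus its quiet period fits in the time remaining."""
    return message_bits + quiet_bits <= remaining_bits  # assumed: non-strict


def choose_interface(selected_bits, other_bits, remaining_bits):
    """Try the selected interface first, then the nonselected one.

    other_bits is None when the nonselected interface has no message ready.
    Returns "selected", "other" or None (AND gates 88 and 90 stay disabled).
    """
    if may_transmit(selected_bits, remaining_bits):
        return "selected"
    if other_bits is not None and may_transmit(other_bits, remaining_bits):
        return "other"
    return None
```

With 72 bit times left in the warning period counter, a 60 bit message is held back while a 36 bit message from the other interface may still be sent.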
The Message Arbitrator 56 also has a Byte Counter 100 which counts the number of bytes transmitted by the Parallel-to-Serial Converter 60. The output of the Byte Counter 100 is received by a Message Byte Logic 102. The Message Byte Logic 102 decodes the message type code of the message being transmitted and determines the number of bytes in that message. After the last byte of the message is transmitted, the Message Byte Logic 102 will first generate a transmit longitudinal redundancy code (TLRC) signal which enables the Longitudinal Redundancy Code Generator 58 to transmit the generated longitudinal redundancy code as the final byte of the message. The Message Byte Logic 102 will then generate a transmit quiet period (TQP) signal enabling the Parallel-to-Serial Converter 60 to transmit the null signal for a predetermined number of bytes which is used for message synchronization. The transmit quiet period (TQP) signal is also transmitted to the Synchronizer Transmission Control 96 where it is used to terminate the transmit synchronizer (TSYN) signal. At the end of the quiet period, the Message Byte Logic 102 will generate an end of quiet period (EQP) signal which will reset the Byte Counter 100 and unlatch the FLT-TSC Arbitration Logic 82 for selection of the next message for transmission.
A Self-Test Arbitration Logic 104 recognizes a request for a self-test in response to a transmitted Task Completed/Started message in which the task identification (TID) code is the same as the Node identification (NID) code. After the transmission of a self-test request message, the Self-Test Arbitration Logic 104 will inhibit a Fault Tolerator Enable (FLTE) signal and a Task Communicator Enable (TSCE) signal as shown in FIG. 8 which, when applied to AND gates 84 and 86, respectively, inhibits all transmissions from the Fault Tolerator Interface 52 or the Task Communicator Interface 54. Immediately following the next Task Interactive Consistency or System State message, the Self-Test Arbitration Logic 104 will generate a transmit self-test (TSLT) signal which will actuate the Self-Test Interface 62 to read the self-test message from an associated off board read only memory (ROM). The transmit self-test (TSLT) signal will also enable the Longitudinal Redundancy Code Generator 58 to pass the self-test message from the Self-Test Interface 62 to the Parallel-to-Serial Converter 60 for transmission. After transmission of the self-test message, the Self-Test Arbitration Logic 104 will restore the Task Communicator Enable (TSCE) signal to permit the transmission of a Task Completed/Started message signifying the completion of the self-test. As indicated in Table II, the FLT-TSC Arbitration Logic 82 will automatically select the message from the Task Communicator Interface 54 as the next message to be transmitted following the transmission of the self-test message. After the transmission of the Task Completed/Started message the Self-Test Arbitration Logic 104 will terminate the Task Communicator Enable (TSCE) signal until after the next Task Interactive Consistency or System State message is transmitted as indicated in FIG. 8.
The Self-Test Interface 62 serves to transfer the self-test message from the off board ROM (not shown) to the Longitudinal Redundancy Code Generator 58. The off board ROM will store a plurality of self-test messages which are transmitted one at a time, one each time a self-test is requested. The first byte of each self-test message is a number indicative of the number of bytes in the self-test message, which is passed back to the Message Byte Logic 102 to identify the completion of the self-test. The last byte in each self-test message stored in the off board ROM is the starting address for the next self-test message. The starting address is not transmitted, but rather is stored in the Self-Test Interface 62 to locate the next self-test message in the off board ROM to be transmitted. The last byte of the last self-test message stored in the off board ROM contains the starting address of the first self-test message, so that the self-test message sequence is repeated. The starting address for the first self-test message is loaded into the Self-Test Interface 62 by the Initial Parameter Load Module 64 in response to an initial load command generated by the Synchronizer 46 in response to the electrical power being turned on.
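The chained storage of self-test messages in the off board ROM behaves like a circular linked list, which the following sketch illustrates. It assumes single byte addresses, that the leading count byte gives the number of message bytes that follow it (excluding the trailing address byte), and that the count byte itself is not part of the returned message. The function name `read_self_test` is hypothetical.

```python
def read_self_test(rom: bytes, start: int):
    """Return (message bytes, starting address of the next self-test message)."""
    length = rom[start]                   # first byte: number of message bytes (assumed meaning)
    body = rom[start + 1:start + 1 + length]
    next_start = rom[start + 1 + length]  # last stored byte: link, never transmitted
    return bytes(body), next_start


# Two messages chained in a circle: the second links back to address 0.
EXAMPLE_ROM = bytes([2, 0xAA, 0xBB, 4,   # message at address 0, next at 4
                     1, 0xCC, 0])        # message at address 4, next at 0
```

Calling `read_self_test` repeatedly, each time with the address returned by the previous call, walks the messages in order and then repeats the sequence.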
As illustrated in FIG. 9, the Longitudinal Redundancy Code Generator 58 has a 4:1 Input Multiplexer 110 which receives the message bytes from the Synchronizer Interface 50, Fault Tolerator Interface 52, Task Communicator Interface 54, and Self-Test Interface 62. The Input Multiplexer 110 controls which message will be transmitted to the Parallel-to-Serial Converter 60 in response to the transmit (TFLT, TTSC, TSYN, and TSLT) signals generated by the Message Arbitrator 56, as previously described. Each byte of a message selected for transmission by the Message Arbitrator 56 is transmitted to an Output Multiplexer 112 by means of nine parallel lines, one for each bit in the received byte plus the parity bit generated by the associated interface. A Longitudinal Redundancy (LR) Bit Generator 114 is connected to each of the nine parallel bit lines, and the LR Bit Generators 114 collectively generate a nine bit longitudinal redundancy code. Each bit in the longitudinal redundancy code is a function of the bit values in the same bit locations in the preceding bytes. The outputs of all the LR Bit Generators 114 are also received by the Output Multiplexer 112. The Output Multiplexer 112 is responsive to the transmit longitudinal redundancy code (TLRC) signal generated by the Message Arbitrator 56 to output the last bit generated by each of the LR Bit Generators 114 as the last byte of the message being transmitted. The output of the Output Multiplexer 112 is connected directly to the Parallel-to-Serial Converter 60 which frames each received byte between predetermined start and stop bits before it is transmitted on the Node's communication link.
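A nine bit longitudinal redundancy code of the kind described can be sketched as a column-wise combination of the nine bit words of a message. The description states only that each code bit is a function of the same bit position in the preceding bytes; the use of exclusive-OR (even column parity) below is an assumption of the example, and the function name `longitudinal_code` is hypothetical.

```python
from functools import reduce


def longitudinal_code(words) -> int:
    """Combine nine bit words column by column using XOR (assumed function)."""
    return reduce(lambda acc, word: acc ^ word, words, 0) & 0x1FF
```

With this choice, a receiver that runs the same computation over the message words followed by the appended code obtains zero when no bit has been corrupted.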
RECEIVERS
The structures of the Receivers 32a through 32n are identical, therefore, only the structure of the Receiver 32a will be discussed in detail. Referring to FIG. 10, the messages from Node A transmitted on communication link 16a are received by a Noise Filter and Sync Detector 116. The synchronization portion of the Noise Filter and Sync Detector 116 requires that a proper synchronization interval exists prior to the reception of a message. As described relative to the Transmitter 30, the synchronization interval preferably is the time required for the Transmitter 30 to transmit two complete null bytes after each transmitted message.
The low pass portion of the Noise Filter and Sync Detector 116 prevents false sensing of the "start" and "stop" bits by the Receiver 32a due to noise which may be present on the communication link 16a. The low pass filter portion requires that the signal on the communication link 16a be present for four (4) consecutive system clock cycles before it is interpreted as a start or a stop bit. The Noise Filter and Sync Detector 116 will generate a new message signal in response to receiving a start bit after a proper synchronization interval.
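The behavior of the low pass portion can be illustrated with a simple sample filter in which the output level changes only after the input has held a new level for four consecutive clock cycles. The sketch assumes one line sample per system clock cycle and a line that idles high; the function name `filter_line` is hypothetical.

```python
def filter_line(samples, hold: int = 4, idle: int = 1):
    """Return the filtered line level for each clock cycle.

    The output follows the input only after `hold` consecutive identical
    samples, so shorter noise pulses are suppressed.
    """
    level = idle          # assumed: the line is stable at the idle level at start
    candidate = idle
    run = hold
    filtered = []
    for sample in samples:
        if sample == candidate:
            run += 1
        else:
            candidate, run = sample, 1
        if run >= hold:
            level = candidate
        filtered.append(level)
    return filtered
```

A single low sample between high samples leaves the output unchanged, while four low samples in a row pull the output low on the fourth.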
After passing through the Noise Filter and Sync Detector 116 the message, byte-by-byte, is converted from a serial to a parallel format in a Serial-to-Parallel Converter 118. The Serial-to-Parallel Converter 118 also determines when a complete 12-bit byte has been received. If the 12-bit byte is not properly framed by a "start" and two "stop" bits, a new bit is added, the bit first received is discarded and the framing is rechecked. Framing errors are not flagged by the Receiver 32a since this fault will manifest itself during a vertical parity check. After conversion to a parallel format, the start and stop bits are stripped from each byte and the remaining 9-bit byte is transferred to a Longitudinal Redundancy Code and Vertical Parity Code (LRC and VPC) Checker 122 to check for parity errors. The error checking logic outputs the current combinational value of the vertical parity and the longitudinal redundancy codes. The vertical parity check portion checks the parity vertically across the received message while the longitudinal redundancy code checker portion performs a longitudinal redundancy code check on each byte received from the Serial-to-Parallel Converter 118. The Message Checker 34 decodes the message type information contained in the first byte of the message and determines which byte is the last byte in the message and, therefore, for which byte the longitudinal redundancy code check is valid. The Message Checker 34 will ignore all other LRC error signals generated by the LRC and VPC Checker 122.
In parallel with the vertical parity and longitudinal redundancy checks, the 8-bit message byte is transferred to a Buffer 120 which interfaces with the Message Checker 34. The Buffer 120 temporarily stores each 8-bit message byte until the Message Checker is ready to check it. Upon receipt of a message byte, the Buffer will set a byte ready flag signifying to the Message Checker 34 that it has a message byte ready for transfer. The Message Checker 34 will unload the message bytes from the Buffer 120 independent of the loading of new message bytes by the Serial-to-Parallel Converter 118. The 8-bit message bytes are transferred to the Message Checker 34 via a common bus 124 which is shared with all of the Receivers 32a through 32n in the Operations Controller 12. The transfer of the message between the Receivers 32 and the Message Checker 34 is on a byte-by-byte basis in response to a polling signal generated by the Message Checker. The Message Checker 34 will systematically poll each Receiver one at a time in a repetitious sequence.
MESSAGE CHECKER
The details of the Message Checker 34 are shown in FIG. 11. The Message Checker 34 processes the messages received by the Receivers 32a through 32n and verifies their logical content, records any errors detected, and forwards the messages to the Fault Tolerator 36. The operation of the Message Checker 34 is controlled by a Sequencer 126 which context switches among the multiple Receivers 32a through 32n in order to prevent overrun of the Buffers 120 in each Receiver. Each Receiver 32a through 32n is polled in a token fashion to determine if it has a message byte ready for processing. If the message byte is ready for processing when it is polled by the Sequencer 126 the byte will be processed immediately by the Message Checker 34. Otherwise the Sequencer 126 will advance and poll the next Receiver in the polling sequence. The Sequencer 126 stores the Node identification (NID) code of the Node 10 associated with each Receiver. The Sequencer 126 also has a Byte Counter associated with each Receiver 32a through 32n which is indexed each time the Sequencer 126 unloads a byte from that particular Receiver. The byte count uniquely identifies the particular byte being processed by the Message Checker 34.
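The polling performed by the Sequencer 126 can be sketched as a round-robin loop over per-Receiver buffers with a byte counter for each Receiver. The sketch is a simplification: resetting a byte counter at the end of a message is not modelled, and the class name `Sequencer` and its attributes are hypothetical.

```python
from collections import deque


class Sequencer:
    """Round-robin poller over one byte buffer per Receiver."""

    def __init__(self, node_ids):
        self.node_ids = list(node_ids)
        self.buffers = {nid: deque() for nid in self.node_ids}
        self.byte_count = {nid: 0 for nid in self.node_ids}
        self._next = 0  # index of the Receiver to poll next

    def poll(self):
        """Poll one Receiver.

        Returns (nid, byte_count, byte) when a byte is ready, otherwise None.
        Either way the polling position advances to the next Receiver.
        """
        nid = self.node_ids[self._next]
        self._next = (self._next + 1) % len(self.node_ids)
        if not self.buffers[nid]:
            return None
        byte = self.buffers[nid].popleft()
        count = self.byte_count[nid]
        self.byte_count[nid] += 1  # index the counter for this Receiver
        return nid, count, byte
```

The returned Node identification code and byte count correspond to the tag attached to each message byte as it is passed on for checking.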
The Sequencer 126 will transfer the Node identification code and the byte count to a Data Multiplexer 128 to tag the message byte as it is transferred to the Fault Tolerator 36. The Node identification code and the byte count are also transmitted to an Error Check Logic 130 and a Context Storage 132. The Error Check Logic 130 will check the Node identification code expected by the Sequencer 126 against the Node identification code contained in the first byte of the message being checked to determine if they are the same. When they are different, the Error Check Logic 130 will generate an error signal which is recorded in an error status byte being generated in the Context Storage 132. The Node identification code is also used as an address into the Context Storage 132 where the relevant information pertaining to the message being processed is stored. The Context Storage 132 has a separate storage location for each Node 10 in the system which is addressed by the Node identification code contained in the message.
The Context Storage 132 stores the message type (MT) code, the data identification (DID) code, the byte count, an error status byte, a data value mask, and an intermediate error signal for each message as it is being processed. As each byte is unloaded from the Receivers, the information in the Context Storage 132 will be used by an Address Generator 134 with the message type (MT) code, the data identification (DID) code, and the byte count which identifies the specific byte to be processed. In response to this information, the Address Generator 134 will output an address where the required processing information is stored in a Message Checker ROM 136. The Message Checker ROM 136 stores the maximum and minimum values for the data contained in the message, the valid data identification numbers for each message type, and a data mask which identifies how many data values are contained in the message being processed and the number of bytes in each data value.
The maximum and minimum data values are transmitted to a Between Limits Checker 138 which will check the data contained in each data byte against these maximum and minimum values. The Between Limits Checker 138 will generate four different error signals as a result of the between limits checks. The first two are the maximum value (MXER) and minimum value (MNER) error signals, signifying the data value exceeded the maximum value or was less than the minimum value. The other two error signals are the equal to maximum value (MXEQ) and equal to minimum value (MNEQ) signals. These latter error signals are transmitted to the Error Check Logic 130 which will store them in the Context Storage 132 as intermediate error signals.
The Error Check Logic 130 will OR the vertical parity code and the longitudinal redundancy code error signals generated by the Receiver and generate a parity error signal which is recorded in the error status byte being generated in the Context Storage 132. As previously described, the Error Check Logic 130 will check the expected Node identification (NID) code against the Node identification code contained in the first byte of the message and will check the message type (MT) code by checking to see if the bits in bit positions 1, 3, and 4 of the first byte are identical. As previously described in the detailed description of the Transmitter 30, the middle bit of the 3-bit message type code is repeated in bit positions 3 and 4 for message type error detection. The Error Check Logic 130 will also check the validity of the data identification (DID) code contained in the second byte of the message against the maximum value for a data identification (DID) code received from the Message Checker ROM 136 and will generate an error signal if the data identification code has a value greater than the maximum value. The Error Check Logic 130 will further check the two's complement range of the appropriate data byte and generate a range error (RNGER) signal when a two's complement range error is detected. It will also record in the Context Storage 132 the maximum (MXER) and the minimum (MNER) error signals generated by the Between Limits Checker 138.
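The first byte checks can be illustrated with a short sketch. It numbers bit positions from 0 at the least significant bit, which places the message type code in positions 0 through 2, its repeated middle bit in positions 3 and 4, and the Node identification code in positions 5 through 7; this numbering is an assumption of the example, and the names `make_first_byte` and `check_first_byte` are hypothetical.

```python
def make_first_byte(nid: int, mt: int) -> int:
    """Build a first byte: NID (3 bits), MT middle bit twice, MT (3 bits)."""
    middle = (mt >> 1) & 1
    return ((nid & 0x07) << 5) | (middle << 4) | (middle << 3) | (mt & 0x07)


def check_first_byte(first: int, expected_nid: int) -> dict:
    """Return the NID and message type error flags for a received first byte."""
    nid = (first >> 5) & 0x07
    bit1 = (first >> 1) & 1
    bit3 = (first >> 3) & 1
    bit4 = (first >> 4) & 1
    return {
        "nid_error": nid != expected_nid,       # sender is not the expected Node
        "mt_error": not (bit1 == bit3 == bit4),  # repeated middle bit disagrees
    }
```

A single bit error in position 1, 3 or 4 makes the three copies disagree and raises the message type error flag.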
With regard to the Between Limits Checker 138, it can often be determined from the first byte of a multi-byte data value whether the data value is within or outside the maximum or minimum values received from the Message Checker ROM 136, in which case checking of the remaining bytes is no longer necessary. However, when the Between Limits Checker 138 generates an MXEQ or MNEQ signal signifying that the data value of the byte being checked is equal to either the maximum or minimum limit value, it will be necessary to check the next byte against a maximum or a minimum value to make a factual determination of whether or not the received data value is within or outside the predetermined limits. The Error Check Logic 130 in response to an MXEQ or an MNEQ signal from the Between Limits Checker 138 will store in the Context Storage 132 an intermediate value signal which signifies to the Context Storage 132 that the between limits check is to be continued on the next byte containing that data value. This process will be repeated with the next subsequent byte if necessary to make a final determination. During the checking of the next byte of the particular data value, the Context Storage 132 will supply to the Error Check Logic 130 the stored intermediate value which identifies to which limit, maximum or minimum, the data value of the preceding data byte was equal. From this information, the existence or non-existence of a between the limits error can readily be determined by relatively simple logic as shown on FIG. 12. A Decoder 140 responsive to the intermediate value stored in the Context Storage 132 will enable AND gates 142 and 144 if the preceding between limits check generated a signal signifying the data value contained in the preceding byte was equal to the maximum value. Alternatively, the intermediate value will enable AND gates 146 and 148 signifying that the data value contained in the preceding byte was equal to the minimum value. If on the second byte the Between Limits Checker 138 detects a maximum limit error (MXER) and AND gate 142 is enabled, the maximum limit error MXER will be recorded in the error status byte being generated in the Context Storage 132. In a like manner, if a minimum limit error (MNER) is detected on the second byte and the AND gate 146 is enabled, the minimum limit error (MNER) will be stored in the error status byte. If the second byte applies an equal to maximum (MXEQ) or equal to minimum (MNEQ) signal to the inputs of the AND gates 144 and 148, respectively, an intermediate value will again be stored in the Context Storage 132 and the final decision delayed to the next byte. The data value mask received by the Context Storage 132 from the Message Checker ROM 136 identifies the number of individual data values that are in the Data Value message being processed and which data bytes belong to each data value. This mask is used by the Error Check Logic 130 to identify the last byte in each data value. On the last byte of any data value, only maximum or minimum limit errors will be recorded in the Context Storage error status byte. The MXEQ and MNEQ signals will be ignored.
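The byte-serial between limits check described above may be illustrated by the following minimal sketch. It is not the circuit of FIG. 12; the function name `between_limits` is hypothetical, the bytes are assumed to arrive most significant byte first, and they are compared as unsigned magnitudes for simplicity. The two booleans `eq_max` and `eq_min` stand in for the intermediate value held in the Context Storage 132.

```python
def between_limits(data: bytes, lo: bytes, hi: bytes) -> dict:
    """Byte-serial limit check, most significant byte first (assumed order)."""
    # Before the first byte the value is still "tied" with both limits.
    eq_max = eq_min = True
    flags = {"MXER": False, "MNER": False}
    for d, mn, mx in zip(data, lo, hi):
        if eq_max and d > mx:      # role of AND gate 142: MXER counts only while tied with max
            flags["MXER"] = True
            break
        if eq_min and d < mn:      # role of AND gate 146: MNER counts only while tied with min
            flags["MNER"] = True
            break
        eq_max = eq_max and d == mx    # MXEQ: decision deferred to the next byte
        eq_min = eq_min and d == mn    # MNEQ: decision deferred to the next byte
        if not (eq_max or eq_min):     # strictly inside: remaining bytes need no check
            break
    # On the last byte a remaining MXEQ/MNEQ is ignored: equal to a limit is in range.
    return flags
```

For example, with limits 0x1000 and 0x1234, the value 0x1235 ties with the maximum on its first byte and raises MXER only on the second.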
The Error Check Logic 130 will also detect if the message contained the correct number of bytes. The Context Storage 132 stores the message type (MT) code for each message being processed. In response to a message signal received with a message byte from a particular Receiver 32, the Error Check Logic 130 will decode the message type code stored in the Context Storage 132 and generate a number corresponding to the number of bytes that type of message should have. It will then compare this number with the byte count generated by the Sequencer 126 prior to receiving a new message signal from the Receiver 32 and will generate a message length error (LENER) signal when they are not the same. Because the length error (LENER) signal may not be generated until after the error status byte has been sent to the Fault Tolerator 36, the message length error signal will be passed to the Fault Tolerator 36 in the error status byte for the next message received from that Node.
The format of the error status byte formed in the Context Storage 132 is shown in FIG. 13. In ascending order of bit positions, starting with the least significant or zero bit position, the error status byte contains a flag bit for the parity error (PARER), a flag bit for the length error (LENER) of the preceding message, a flag bit for the Node identification (NID) error, a flag bit for the data identification (DID) error, a flag bit for the message type (MT) error, a flag bit for the two's complement range error (RNGER), and flag bits for the maximum and minimum limit (MXER and MNER) errors.
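By way of illustration only, the bit order given above can be written as a small decoding sketch. The names `ERROR_STATUS_BITS` and `decode_error_status` are hypothetical, and placing MXER in bit 6 and MNER in bit 7 is an assumption taken from the order in which the text lists them; FIG. 13 governs.

```python
# Flag names in ascending bit position, bit 0 first.
# Assumption: MXER occupies bit 6 and MNER bit 7, following the order in the text.
ERROR_STATUS_BITS = ("PARER", "LENER", "NID", "DID", "MT", "RNGER", "MXER", "MNER")

def decode_error_status(status: int) -> list:
    """Return the names of the flags that are set in an error status byte."""
    return [name for bit, name in enumerate(ERROR_STATUS_BITS)
            if (status >> bit) & 1]
```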
Returning to FIG. 11, the Data Multiplexer 128 transmits each message byte directly to the Fault Tolerator 36 as it is processed by the Message Checker 34. The Data Multiplexer 128 will append to each message byte a descriptor byte which contains the Node identification code (NID) and the byte count (BYTC) received from the Sequencer 126 for that particular byte of the message. At the end of the message, independent of its length, the Data Multiplexer 128 will transmit the error status byte stored in the Context Storage 132 as the last byte. The last byte is identified by a byte count "15" so that it can readily be identified by the Fault Tolerator 36 for fault analysis.
FAULT TOLERATOR
The details of the Fault Tolerator 36 are shown on FIG. 14. The Fault Tolerator 36 has a Message Checker Interface 150 which receives the messages byte-by-byte after being checked by the Message Checker 34. Upon receipt of an error free Task Completed/Started message, the Message Checker Interface 150 will forward the identity (NID) of the Node which sent the message to a Synchronizer Interface 152, and will forward the identity (TID) of the new task started and the branch condition contained in the message to the Scheduler Interface 154. The Message Checker Interface 150 will also send the Node identification (NID) code and the message type (MT) code to a Voter Interface 158 and the data along with a partition bit to a Fault Tolerator RAM Interface 160. The Message Checker Interface 150 will also forward the error status byte (byte=15) generated by the Message Checker 34 to an Error Handler 164 for processing.
The Synchronizer 46 will report to the Error Handler 164 through the Synchronizer Interface 152 any errors it has detected in the Task Interactive Consistency (TIC) and System State (SS) messages. The Scheduler Interface 154 will forward to the Scheduler 40 the task identification (TID) code of the task started and the Node identity (NID) of each received Task Completed/Started message. In return, the Scheduler 40 will transmit to the Error Handler 164 through the Scheduler Interface 154 any errors it has detected.
The Transmitter Interface 156 will forward to the Transmitter 30 the Base Penalty Count and Error messages generated by the Error Handler 164. As previously described, the Transmitter Interface 156 will load the first byte of the message to be transferred into the Transmitter's Input Register to signify it has a message ready for transmission. It will then await the reassertion of the buffer available (BAB) signal by the Transmitter 30 before forwarding the remainder of the message to the Transmitter 30 for transmission.
A Reset Generator 157 is responsive to a reset signal generated by the Error Handler 164 when it determines its own Node is faulty and to a power on reset (POR) signal generated when electrical power is first applied to the Node to generate an Operations Controller reset (OCRES) signal and an initial parameter load (IPL) signal which are transmitted to the other subsystems effecting a reset of the Operations Controller 12.
The Fault Tolerator RAM Interface 160 will store in a Fault Tolerator RAM 162 the data contained in the message bytes as they are received from the Message Checker Interface 150. The Fault Tolerator RAM 162 is a random access memory partitioned as shown in FIG. 15. A message partition section 166, as shown on FIG. 15, stores in predetermined locations the messages received from each Node. In the message partition section 166 the messages are reassembled to their original format using the descriptor byte appended to the message bytes by the Message Checker 34. A double buffering or double partitioning scheme is used to prevent overwriting of the data that is still being used by the Voter 38. A context bit generated by the Message Checker Interface 150 determines into which of the two partitions the new data is to be written. Separate context bits are kept for each Node and are toggled only when the error status byte indicates the current message is error free. As previously discussed relative to the Message Checker 34, the message length (LENER) bit of the error status byte signifies that the preceding message had a message length error and, therefore, is ignored in the determination of an error free condition for the current message.
The format for a single message in the message partition section 166 is illustrated in FIG. 16. As shown, the message is reconstructed in its original format in the Fault Tolerator RAM 162 using the Node identification (NID) code and the byte count appended to each message byte in the Message Checker 34 as a portion of the address. The context bit generated by the Message Checker Interface 150, along with the message partition code (bits 8 through 11) generated by the Fault Tolerator RAM Interface 160, completes the address and identifies in which of the two locations in the message partition section 166 the message from each Node is to be stored.
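The double partitioning scheme may be sketched as follows, under stated assumptions. The class `MessagePartition` and its methods are hypothetical; the address layout (context bit, then a 3-bit NID, then a 4-bit byte count) is an assumed reading of FIG. 16, and the message partition code in bits 8 through 11 is omitted. The LENER flag is taken to be bit 1 of the error status byte.

```python
LENER = 1 << 1  # length-error flag; it refers to the preceding message

class MessagePartition:
    """Two partitions per Node, selected by a per-Node context bit."""

    def __init__(self, nodes=8):
        self.context = [0] * nodes   # one context bit per Node
        self.ram = {}                # address -> stored message byte

    def address(self, nid, byte_count):
        # Assumed layout: context bit | 3-bit NID | 4-bit byte count.
        return (self.context[nid] << 7) | ((nid & 0b111) << 4) | (byte_count & 0x0F)

    def store_byte(self, nid, byte_count, value):
        self.ram[self.address(nid, byte_count)] = value

    def end_of_message(self, nid, error_status):
        # Toggle only when every flag other than LENER is clear, so that a
        # faulty message is overwritten by the next message from that Node.
        if (error_status & 0xFF & ~LENER) == 0:
            self.context[nid] ^= 1
```

In this sketch an error free message leaves its bytes intact in one partition while the next message from the same Node is written into the other.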
The Fault Tolerator RAM 162 has three sections used by the Error Handler 164 for generating the Base Penalty Count and Error messages.
An error code file section 170 stores the error codes used to generate the Error messages transmitted immediately after the beginning of each Atomic period and to generate the increment penalty count which is included in the Error message.
Since there are thirty-five different error detection mechanisms in each Operations Controller 12, two to the thirty-fifth power different error combinations may result from each message transmitted in the system. In order to reduce the number of combinations of errors to a reasonable number, compatible with the state of the art storage capabilities of the Fault Tolerator RAM 162, the error reports from the various subsystems are formatted into special error codes as they are received. The formatted error codes, as shown on FIG. 17, include an identification of the subsystem which reported the error plus a flag indication of the errors detected. For example, the error status byte received from the Message Checker 34 is formatted into two separate error codes. The first error code contains the subsystem code 0000, identifying the subsystem which reported the errors, and the error flags from the four least significant bits of the error status byte. The second error code contains the subsystem code 0001 and the error flags from the four most significant bits of the error status byte. These error codes are stored in the error code file section 170 at an address defined by the faulty Node's identification (NID) code and report number as shown in FIG. 19. The error code file section 170 is double partitioned the same as the message partition section 166 so that two error files are stored for each Node. The context bit generated by the Message Checker Interface 150 identifies in which of the two error files for that Node the error code will be reported.
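The splitting of the error status byte into two error codes may be sketched as below. The function name is hypothetical, and placing the 4-bit subsystem code in the upper half of each error code with the four flags in the lower half is an assumption; the actual format is that of FIG. 17.

```python
def format_message_checker_codes(status: int) -> tuple:
    """Split an 8-bit error status byte into two 8-bit error codes.

    Assumed layout of each code: subsystem code (4 bits) | error flags (4 bits).
    """
    low_flags = status & 0x0F           # four least significant flags
    high_flags = (status >> 4) & 0x0F   # four most significant flags
    first = (0b0000 << 4) | low_flags   # subsystem code 0000
    second = (0b0001 << 4) | high_flags # subsystem code 0001
    return first, second
```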
Each error code is used to address a group mapping section 168 of the Fault Tolerator RAM 162. The error code addresses a penalty weight pointer, as shown in FIG. 18, which addresses a penalty weight section 172 of the Fault Tolerator RAM 162. As shown in FIG. 20, the penalty weight pointer addresses a specific penalty weight which is assigned to the specific combination of reported errors contained in the formatted error code. The penalty weights resulting from each error code stored in the error file for that Node are summed in the Error Handler 164 and appended to the Error message as an increment penalty count (byte-8) for that Node. As previously indicated, the Error Handler 164 will generate only one Error message in each Atomic period for each Node which transmitted a message which contained an error.
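The two-level lookup described above (error code to penalty weight pointer, pointer to penalty weight, weights summed per Node) may be sketched as follows. The function name and the table contents used with it are purely hypothetical; no actual pointer or weight values are given in this description.

```python
def increment_penalty_count(error_codes, group_mapping, penalty_weights):
    """Sum the penalty weights selected by each error code in a Node's error file.

    group_mapping:   error code -> penalty weight pointer (group mapping section 168)
    penalty_weights: pointer -> penalty weight (penalty weight section 172)
    """
    return sum(penalty_weights[group_mapping[code]] for code in error_codes)
```

With a hypothetical mapping `{0x03: 0, 0x1A: 1}` and hypothetical weights `[2, 5]`, an error file holding the codes 0x03 and 0x1A would yield an increment penalty count of 7.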
The Fault Tolerator RAM 162 will also store the deviance limits for the one-byte (MT0), two-byte (MT1), and four-byte (MT2 and MT3) Data Value messages in four separate sections, 174, 176, 178 and 180, which are used by the Voter 38, as shall be explained with reference to the Voter hereinafter.
The details of the Message Checker Interface 150 are illustrated in FIG. 21. A Store Message Module 182 receives the message bytes directly from the Message Checker 34 and stores them in the message partition section 166 of the Fault Tolerator RAM 162. The Store Message Module 182 will add the context bits stored in a Message Checker Interface Context Storage 190 to the descriptor (NID plus byte count) appended to the message byte by the Message Checker 34 to generate a partition address (PID). The partition address identifies the location in the message partition section 166 where the particular message byte is to be stored. As previously discussed, at the beginning of each Master period, each Node will first transmit a Base Penalty Count message followed by a Task Completed/Started message. The Store Message Module 182 stores for each Node a first flag signifying the receipt of the Base Penalty Count message and a second flag signifying the receipt of the subsequent Task Completed/Started message. These flags are set to false at the beginning of each Master period and are set to true when the Base Penalty Count and the Task Completed/Started messages are received for that Node. Unless both of these flags are set to true, the Store Message Module 182 will disable the writing of the address of any subsequently received messages from that Node in a Voter Interface Buffer 184. As a result, the subsequently received data from that Node will not be processed by the Voter 38 and will be ignored during any subsequent processing. The Voter Interface Buffer 184 is an 8×7 first in-first out buffer in which the four most significant bits are the four most significant bits of the partition address (context bits plus NID) for the received message in the message partition section 166 of the Fault Tolerator RAM 162. The remaining three bits are the message type code contained in the first byte of the message.
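The gating of the Voter Interface Buffer 184 and the packing of its 7-bit entries may be sketched as follows. The names `voter_entry` and `StoreMessageGate` are hypothetical, a single context bit followed by a 3-bit NID is assumed for the four most significant bits, and the eight-entry depth of the buffer is not enforced in this sketch.

```python
from collections import deque

def voter_entry(context_bit, nid, mt_code):
    """Pack a 7-bit entry: context bit | 3-bit NID | 3-bit message type code."""
    return ((context_bit & 1) << 6) | ((nid & 0b111) << 3) | (mt_code & 0b111)

class StoreMessageGate:
    """Per-Node flags that gate writes to the Voter Interface Buffer."""

    def __init__(self, nodes=8):
        self.nodes = nodes
        self.voter_fifo = deque()    # first in-first out buffer of 7-bit entries
        self.start_master_period()

    def start_master_period(self):
        # Both flags are false at the beginning of each Master period.
        self.bpc_seen = [False] * self.nodes   # Base Penalty Count received
        self.tcs_seen = [False] * self.nodes   # Task Completed/Started received

    def offer(self, context_bit, nid, mt_code):
        # Messages from a Node are hidden from the Voter until both flags are true.
        if not (self.bpc_seen[nid] and self.tcs_seen[nid]):
            return False
        self.voter_fifo.append(voter_entry(context_bit, nid, mt_code))
        return True
```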
An Error Status Byte Detector 186 listens to the messages being transmitted from the Message Checker 34 to the Fault Tolerator 36 and will detect the receipt of each error status byte (byte 15) generated by the Message Checker 34. If the bits of the error status byte, with the exception of the length error (LENER) bit, are all zeros, the Error Status Byte Detector 186 will enable the Message Checker Interface Context Storage 190 to load the Voter Interface Buffer 184 through the Store Message Module 182, or to load a Task Completed Register 202 or to load a Branch Condition Register 200 as required. Otherwise the Error Status Byte Detector 186 will load each non-zero error status byte in an Error Status Buffer 188 for subsequent processing by the Error Handler 164. The Error Status Byte Detector 186 will also detect if a message is a self-test message (TID=NID) and set a self-test flag in the Error Status Buffer 188. The Error Status Buffer 188 is an 8×12 first in-first out buffer in which the most significant bit is a self-test flag, the next three bits are the Node's identification (NID) code and the remaining 8 bits are the received error status byte.
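The error free test and the 12-bit Error Status Buffer entry may be sketched as follows. The function names are hypothetical, and the LENER flag is taken to be bit 1 of the error status byte per the format of FIG. 13.

```python
LENER = 1 << 1  # length error of the preceding message; ignored for the current one

def is_error_free(status: int) -> bool:
    """True when every flag other than LENER is zero."""
    return (status & 0xFF & ~LENER) == 0

def error_status_entry(self_test: bool, nid: int, status: int) -> int:
    """Pack a 12-bit entry: self-test flag | 3-bit NID | 8-bit error status byte."""
    return (int(self_test) << 11) | ((nid & 0b111) << 8) | (status & 0xFF)
```

A status byte of 0x02, in which only LENER is set, is therefore treated as error free for the current message.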
The Message Checker Interface Context Storage 190 temporarily stores for each Node the information contained in Table III. This information is temporarily stored since it is not known if the message is error free until the error status byte is received.
TABLE III
Message Checker Interface Context Storage
______________________________________
Bit     Description             When Written
______________________________________
13      TIC Flag                MT1, Byte Count = 2 (DID = 0)
12      Partition Context Bit   Byte Count = 15
11-9    Message Type Code       Byte Count = 1
8       Branch Condition Bit    MT6, Byte Count = 4
7-0     Started TID             MT6, Byte Count = 3
______________________________________
The most significant bit, bit 13, signifies that the received message is a Task Interactive Consistency (TIC) message which is processed by the Synchronizer 46. This flag is set by a Task Interactive Consistency Message Detector 192 in response to a message type MT1 having a data identification code which is all zeros (DID=0), and will inhibit the loading of the address of this message in the Voter Interface Buffer 184 since it is only used by the Synchronizer 46 and no other subsystem of the Operations Controller 12. The twelfth bit is the partition context bit which identifies in which partition of the message partition section 166 the message will be stored. The context bit is toggled when the Error Status Byte Detector 186 indicates the prior message was error free. If the message is not error free, the context bit is not toggled and the next message received from that Node is written over the prior message in the Fault Tolerator RAM 162.
The message type code bits are received directly from the first byte of the message. The branch condition bit, bit-8, is received from a Branch Condition Detector 194 which detects the branch condition contained in the fourth byte of the Task Completed/Started (MT6) message. The identification of the started task (TID) is obtained from a Task Started Detector 196 which loads the TID of the started task into the eight least significant bit locations (bits 7 through 0) of the Message Checker Interface Context Storage 190.
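The 14-bit word held for each Node in the Message Checker Interface Context Storage 190 may be illustrated by the following packing sketch, which follows the bit positions of Table III. The function and field names are hypothetical.

```python
def pack_context(tic, context_bit, mt_code, branch, tid):
    """Pack the per-Node context word using the bit positions of Table III."""
    return (((tic & 1) << 13)            # bit 13: TIC flag
            | ((context_bit & 1) << 12)  # bit 12: partition context bit
            | ((mt_code & 0b111) << 9)   # bits 11-9: message type code
            | ((branch & 1) << 8)        # bit 8: branch condition bit
            | (tid & 0xFF))              # bits 7-0: started TID

def unpack_context(word):
    """Recover the fields of a context word as a dictionary."""
    return {
        "tic": (word >> 13) & 1,
        "context_bit": (word >> 12) & 1,
        "mt_code": (word >> 9) & 0b111,
        "branch": (word >> 8) & 1,
        "tid": word & 0xFF,
    }
```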
Upon the receipt of an error status byte which signifies that the received message was error free and if the message is not a Task Interactive Consistency message, the Message Checker Interface Context Storage 190 will transfer the context bit and the message type to the Store Message Module 182. In the Store Message Module 182, the context bit is added to the Node identifi