United States Patent
6000020
Chin et al.
December 7, 1999
Title
Hierarchical storage management from a mirrored file system on a storage network segmented by a bridge
Abstract
A system for hierarchical data storage management and transparent data backup in a high speed, high volume Fibre Channel Arbitrated Loop environment comprising first and second Fibre Channel Arbitrated Loops, each coupling a Transaction Server and backup HSM server to high speed disk drives and mirrored high speed disk drives respectively. The two loops are coupled by a Bridge compatible with the Fibre Channel Arbitrated Loop protocol which forwards write transactions directed to the mirrored disk drives from the first loop to the second but keeps read transactions from the Transaction Server to the high speed disk drives on the first loop isolated from backup and HSM transactions occurring on the second loop between the backup HSM server, the mirrored disk drives and backup storage devices coupled to the backup HSM server.
Inventors:
Chin; Howey Q. (San Jose, CA), Chan; Kurt (Roseville, CA)
Assignee:
Gadzoox Networks, Inc.
Appl. No.:
825683
Filed:
April 1, 1997
Current U.S. Class:
711/162
370/401
370/403
370/404
370/405
709/214
709/216
709/249
711/111
711/112
711/114
Current International Class:
G06F 11/14 (20060101)
Field of Search:
395/200.44,200.45,200.46,200.79,182.02,182.03,182.04 370/401-409 711/111-114,162 709/214,216,249
U.S. Patent Documents
5179555
January 1993
Videlock et al.
5212784
May 1993
Sparks
5432907
July 1995
Picazo, Jr. et al.
5488731
January 1996
Mendelsohn
5673382
September 1997
Cannon et al.
5694615
December 1997
Thapar et al.
5757642
May 1998
Jones
5771349
June 1998
Picazo, Jr. et al.
5831985
November 1998
Sandorfi
5848251
December 1998
Lomelino et al.
Foreign Patent Documents
A2 0 359 471
Sep., 1989
EP
WO 94/25919
Nov., 1994
WO
WO 96/42019
Dec., 1996
WO
Primary Examiner:
Thai; Tuan V.
Attorney, Agent or Firm:
Fish; Ronald C. Falk & Fish
Claims
What is claimed is:
1. An apparatus comprising:
a primary memory which is a Fibre Channel Arbitrated Loop node;
a mirrored memory which is a Fibre Channel Arbitrated Loop node;
a backup/archival storage device;
means for carrying out read and write transactions with said primary memory over a first Fibre Channel Arbitrated Loop local area network and, for every write transaction to said primary memory, carrying out a write transaction of the same data to said mirrored memory via a second Fibre Channel Arbitrated Loop (hereafter FCAL) local area network and an FCAL bridge which is capable of understanding the Fibre Channel Arbitrated Loop protocols on said first and second (FCAL) local area networks and establishing loop tenancies between said first and second (FCAL) local area networks when a source node is on said first FCAL local area network and a destination node is on said second FCAL local area network, or vice versa, but when said source and destination nodes are both on said first FCAL local area network, keeping the loop tenancy confined to said first FCAL local area network without tying up said second FCAL local area network, and when said source and destination nodes are both on said second FCAL local area network, keeping the loop tenancy confined to said second FCAL local area network without tying up said first FCAL local area network, and further comprising means including said bridge for carrying out read and write transactions between said mirrored memory and said backup/archival storage device via said second FCAL local area network without tying up said first FCAL local area network.
2. An apparatus comprising:
a primary memory which is a Fibre Channel Arbitrated Loop (FCAL) node;
a mirrored memory which is a Fibre Channel Arbitrated Loop node;
a backup/archival storage device;
a primary FCAL local area network which is a Fibre Channel Arbitrated Loop coupled to said primary memory;
a secondary FCAL local area network which is a Fibre Channel Arbitrated Loop coupled to said mirrored memory and said backup/archival storage device;
a bridge means which is a Fibre Channel Arbitrated Loop node capable of bridging FCAL loop tenancies between said primary and secondary FCAL local area networks, and specifically for selectively coupling said primary FCAL local area network to said secondary FCAL local area network when open (OPN) primitives arrive from one Fibre Channel Arbitrated Loop which are addressed to a node on the other Fibre Channel Arbitrated Loop, and wherein said bridge means includes means to resolve conflicting OPN situations;
a transaction processor coupled by said primary FCAL local area network to said bridge means and said primary memory and programmed to carry out read and write transactions with said primary memory over said primary FCAL local area network and, for every write transaction to said primary memory, programmed to carry out another write transaction of the same data to said mirrored memory via said bridge means and said secondary FCAL local area network if said secondary FCAL local area network is available, and, if not available, to try the write transaction one or more times again later until the write transaction to said mirrored memory is completed;
a backup and hierarchical storage management processor coupled to said backup/archival storage device and said mirrored memory and said bridge means via said secondary FCAL local area network, and programmed to carry out read and write transactions between said mirrored memory and said backup/archival storage device via said secondary FCAL local area network without involving said primary FCAL local area network because of the presence of said bridge means.
3. An apparatus comprising:
a primary memory which is a Fibre Channel Arbitrated Loop (FCAL) Node;
a mirrored memory which is a Fibre Channel Arbitrated Loop Node;
a backup/archival storage device which is a node for a SCSI bus;
a primary local area network (LAN) which is a Fibre Channel Arbitrated Loop and is coupled to said primary memory;
a secondary local area network which is a Fibre Channel Arbitrated Loop and is coupled to said mirrored memory;
said SCSI bus coupled to said backup/archival storage device;
an FCAL bridge selectively coupling said primary and secondary FCAL local area networks as one larger FCAL LAN when an open (OPN) primitive arrives from one FCAL LAN which is addressed to a node on said other FCAL LAN or vice versa, and which includes one or more state machines structured, or microprocessors programmed, to implement necessary switching rules to perform bridging and to resolve conflicting OPN situations;
a backup and hierarchical storage management processor coupled to said FCAL bridge via said secondary local area network, and coupled to said backup/archival storage device via said SCSI bus and coupled to said mirrored memory via said secondary local area network, and including a computer or microprocessor programmed to carry out backup and/or hierarchical storage management read and write transactions between said mirrored memory and said backup/archival storage device via said secondary FCAL local area network and said SCSI bus without involving said primary FCAL local area network such that said primary memory is free to carry out read and write transactions during said backup and/or hierarchical storage management operations by virtue of said primary FCAL local area network being isolated by said FCAL bridge from loop tenancies on said secondary FCAL local area network necessary to carry out said backup and/or hierarchical storage management functions;
a transaction processor which is an FCAL node coupled by said primary FCAL local area network to said primary memory and coupled to said backup and hierarchical storage management processor and said mirrored memory via said FCAL bridge and said secondary FCAL local area network, and programmed to carry out read and write transactions with said primary memory over said primary FCAL local area network and, for every write transaction to said primary memory, said transaction processor is programmed to carry out a mirrored write transaction of the same data to said mirrored memory by sending the same data written on said primary memory to said backup and hierarchical storage management processor via said FCAL bridge and said secondary FCAL local area network if said secondary FCAL local area network is available, and, if not available, to try the write transaction one or more times again later until the write transaction to said mirrored memory is completed; and
wherein said backup and hierarchical storage management processor is programmed to receive data transmissions of said mirrored write transactions and write said data to said mirrored memory via said secondary FCAL local area network.
4. An apparatus comprising:
a Transaction Server computer, said Transaction Server computer being a Fibre Channel Arbitrated Loop node;
a first array of one or more disk drives or other storage media, said first array being a Fibre Channel Arbitrated Loop node;
a first Fibre Channel Arbitrated Loop coupling said Transaction Server computer to said one or more disk drives;
a backup server computer, said backup server computer being a Fibre Channel Arbitrated Loop node;
a second array of one or more disk drives or other storage media which mirror the storage capacity of said first array of one or more disk drives or other storage media, said second array being a Fibre Channel Arbitrated Loop node;
a second Fibre Channel Arbitrated Loop coupling said backup server computer to said second array of one or more disk drives or other storage media;
a Bridge compatible with the Fibre Channel Arbitrated Loop protocol which couples said first Fibre Channel Arbitrated Loop to said second Fibre Channel Arbitrated Loop, said Bridge including means for receiving open (OPN) primitives originating from a node on said first Fibre Channel Arbitrated Loop and forwarding said OPN primitive onto said second Fibre Channel Arbitrated Loop if the destination address of said OPN primitive is a node on said second Fibre Channel Arbitrated Loop but not otherwise, and for receiving OPN primitives originating from said node on said second Fibre Channel Arbitrated Loop and forwarding said OPN primitive onto said first Fibre Channel Arbitrated Loop if the destination address of said OPN primitive is said node on said first Fibre Channel Arbitrated Loop but not otherwise.
5. The apparatus of claim 4 wherein said Transaction Server computer is programmed to initiate a mirrored write transaction to said second array of one or more disk drives or other storage media automatically each time said Transaction Server computer initiates a write transaction to said first array of one or more disk drives or other storage media, said mirrored write transaction being the same data that was written to said first array of one or more disk drives, said mirrored write transaction being initiated by transmitting said OPN primitive on said first Fibre Channel Arbitrated Loop having a destination address which is the node address of said second array of one or more disk drives or other storage media, said mirrored write transaction being carried out if said secondary FCAL local area network is available, and, if not available, said Transaction Server being programmed to try the write transaction one or more times again later until the write transaction to said mirrored memory is completed.
6. The apparatus of claim 4 further comprising one or more workstation computers and wherein said Transaction Server computer is coupled to said one or more workstation computers by a local area network.
7. The apparatus of claim 4 further comprising one or more backup data storage devices and wherein said backup server computer is coupled to said one or more backup data storage devices by a local area network or a SCSI bus or by said second Fibre Channel Arbitrated Loop.
8. The apparatus of claim 7 wherein said backup server computer is programmed to perform data backup operations by initiating read transactions addressed to said second array of one or more disk drives or other storage media and taking the data received as a result of said read transactions and writing said data to said backup data storage devices.
9. The apparatus of claim 7 wherein said backup server computer is programmed to perform hierarchical storage management operations by scanning the file structures of data stored on said one or more disk drives or other storage media of said second array and initiating selective read transactions for predetermined data stored on said one or more disk drives or other storage media of said second array, said selective read transactions addressed to said second array of one or more disk drives or other storage media and taking the data received as a result of said read transactions and writing said data to a predetermined one of said backup data storage devices.
10. An apparatus comprising:
first and second Fibre Channel Arbitrated Loops;
a first array of disk drives coupled as a node on said first Fibre Channel Arbitrated Loop;
a second array of disk drives coupled as a node on said second Fibre Channel Arbitrated Loop;
a first transaction processing means for performing read transactions with said first array of disk drives via said first Fibre Channel Arbitrated Loop, and for performing write transactions with said first and second arrays of disk drives via said first and second Fibre Channel Arbitrated Loops;
bridge means coupled to said first and second Fibre Channel Arbitrated Loops for selectively coupling said first and second Fibre Channel Arbitrated Loops together as one bigger Fibre Channel Arbitrated Loop when said first transaction processing means is performing a write transaction with said second array of disk drives but for isolating said first and second Fibre Channel Arbitrated Loops at other times, said bridge means including means for resolving conflicting open (OPN) situations; and
further comprising storage management means coupled to said second Fibre Channel Arbitrated Loop for performing hierarchical storage management functions and data backup transactions with said second array of disk drives using only said second Fibre Channel Arbitrated Loop during intervals when said first transaction processing means is not performing write transactions with said second array of disk drives.
11. A method of reading and writing data, comprising:
writing data to and reading data from a first memory using a first Fibre Channel Arbitrated Loop (FCAL) local area network (LAN), a transaction processor coupled to said first Fibre Channel Arbitrated Loop local area network and one or more workstations coupled to said transaction processor;
whenever data is written to said first memory using said first Fibre Channel Arbitrated Loop local area network, writing the same data to a second memory located on a second Fibre Channel Arbitrated Loop local area network using a bridge which has multiple ports coupled to said first and second Fibre Channel Arbitrated Loop local area networks and which is capable of learning node addresses from watching flow of Fibre Channel Arbitrated Loop primitives arriving at said ports of said bridge coupled to said first and second Fibre Channel Arbitrated Loop local area networks and which is capable of performing switching to selectively couple said first Fibre Channel Arbitrated Loop local area network to said second Fibre Channel Arbitrated Loop local area network when an open (OPN) primitive arrives from a source node on either of said first or second Fibre Channel Arbitrated Loop local area networks with a destination address which indicates destination node is on the other Fibre Channel Arbitrated Loop local area network, and said bridge being capable of resolving conflicting OPN situations and preemptively closing losing source node; and
performing backup write transactions between said second memory and a third memory using said second Fibre Channel Arbitrated Loop local area network while isolating said second Fibre Channel Arbitrated Loop local area network from said first Fibre Channel Arbitrated Loop local area network using said bridge.
12. An apparatus comprising:
a primary memory which is a node on a Fibre Channel Arbitrated Loop;
a mirrored memory which is a node on a Fibre Channel Arbitrated Loop;
a backup/archival storage device;
a primary local area network which is a Fibre Channel Arbitrated Loop and which is coupled to said primary memory;
a secondary local area network which is a Fibre Channel Arbitrated Loop and which is coupled to said mirrored memory;
a bus coupled to said backup/archival storage device;
third local area network which is a Fibre Channel Arbitrated Loop;
backup and hierarchical storage management processor which is a Fibre Channel Arbitrated Loop node and which is coupled to said third local area network, and is also coupled to said backup/archival storage device via said bus and coupled to said mirrored memory via said secondary local area network, and programmed to carry out read and write transactions between said mirrored memory and said backup/archival storage device via said secondary local area network and said bus without involving said primary local area network;
a transaction processor which is a Fibre Channel Arbitrated Loop node which is coupled to said primary memory via said primary local area network and coupled to said backup and hierarchical storage management processor via said third local area network, and programmed to carry out read and write transactions with said primary memory over said primary local area network and, for every write transaction to said primary memory, said transaction processor is programmed to carry out a mirrored write transaction of the same data to said mirrored memory by sending the same data written on said primary memory to said backup and hierarchical storage management processor via said third local area memory via means for carrying Fibre Channel Arbitrated bridging functions to learn node addresses by watching FCAL primitive traffic on said primary and secondary local area networks and to transmit OPNs and RRDYs and other primitives between said primary and secondary local area networks when the source and destination nodes are on different local area networks, and to resolve conflicting OPN situations; and
wherein said backup and hierarchical storage management processor is programmed to receive data transmissions of said mirrored write transactions and write said data to said mirrored memory via said secondary local area network.
13. An apparatus comprising:
a primary memory which is a Fibre Channel Arbitrated Loop node (FCAL);
mirrored memory which is a Fibre Channel Arbitrated Loop node;
backup/archival storage device;
primary FCAL local area network coupled to said primary memory which is a Fibre Channel Arbitrated Loop;
a secondary FCAL local area network coupled to said mirrored memory which is a Fibre Channel Arbitrated Loop;
bus coupled to said backup/archival storage device;
transaction processor which has first and second FCAL interface circuits which are Fibre Channel Arbitrated Loop nodes and which are coupled, respectively, by said primary FCAL local area network to said primary memory, and via said secondary FCAL local area network to said mirrored memory, and having a first bus interface circuit capable of understanding protocol used on said bus coupled to said backup/archival storage device, said transaction processor programmed to carry out read and write transactions with said primary memory over said primary FCAL local area network and, for every write transaction to said primary memory, said transaction processor being programmed to carry out a mirrored write transaction of the same data to said mirrored memory by sending the same data written on said primary memory to said mirrored memory via said secondary FCAL local area network via a bridging function implemented by the program of said transaction processor, said bridging function controlling said transaction processor to recognize OPN primitives arriving from said primary FCAL local area network directed to a node on said secondary FCAL local area network and connect said primary and secondary FCAL local area networks together as one bigger FCAL local area network for a duration of any loop tenancy started by said OPN primitives, said bridging function also programmed to control said transaction processor to resolve conflicting OPN situations, said transaction processor further programmed to carry out read and write transactions of backup and hierarchical storage management functions between said mirrored memory and said backup/archival storage device using said secondary local area network and said bus without involving said primary local area network.
Description
BACKGROUND OF THE INVENTION
The invention pertains to the field of backup systems in high transaction performance, high data availability networks such as Fibre Channel local area networks in large data storage and processing facilities such as credit card transaction processing centers etc.
Typical high transaction performance, high data availability networks comprise one or more Transaction Servers coupled by a Fibre Channel Arbitrated Loop (hereafter FC-AL or the main loop) to huge banks of on-line storage disk drives called JBODs. Transactions are processed and data is written to the disk drives and read from the disk drives. Typically such read and write transactions keep the main loop busy all the time. However, because of the frequency of failures of disk drives and the uneven frequency of need for read transactions on some data as compared to other data, there is a need in such systems for hierarchical storage management functions such as data aging storage and automatic, nonintrusive backup.
One mirrored system of which the applicants are aware is a mirrored storage system marketed by Vinca. In this system, one Transaction Server is coupled by a SCSI bus to a JBOD, RAID or other hard disk arrangement. A JBOD, like a RAID, is a disk drive arrangement with multiple disk drives all coupled to the same bus. JBOD stands for "just a bunch of disks", believe it or not. A JBOD is an enclosure containing a bunch of electromechanical disk drives, power supplies, and a backplane which is a 4-wire Fibre Channel extension with one differential pair for transmit and one differential pair for receive in the case of a JBOD FCAL node. The primary difference between a JBOD and a RAID is that a RAID has a disk controller sitting between the FCAL and the drives whereas a JBOD does not. RAID stands for Redundant Array of Inexpensive Disks and is also known as a "disk array". RAIDs provide fault tolerance over conventional JBODs and give greater performance in some applications by accessing multiple disk drives in parallel. RAIDs provide fault tolerance by adding parity in the sense that data is "striped" across the first N drives and the N+1st drive contains the parity. If any one drive fails, knowing the parity scheme allows the data to be recreated by using the parity drive. A controller sits on a RAID between the drives and a bus, so RAIDs are hot-pluggable: any one drive can be removed and the controller can recreate its data on the fly. FCALs make JBODs more hot pluggable than SCSI bus connections to JBODs since FCAL was designed to be a network. A SCSI bus is a bus with a particular bus protocol referred to as the SCSI protocol. This server is connected by a proprietary high speed link to another server. The second server is connected by another SCSI bus to a mirrored JBOD, RAID or other hard disk arrangement.
In this arrangement, each time the Transaction Server writes data to its hard drive array, the same data needs to be converted from SCSI format, packetized for transmission over the proprietary high speed link and transmitted to the second server using whatever protocol is in use on the proprietary high speed link. When the second server receives the packets, it must depacketize them and convert the payload data to SCSI format data and then initiate a SCSI write transaction to write the data to the hard disk array coupled to the second server. Thus, the overhead of two SCSI write transactions, packetizing and depacketizing to and from the packet format on the proprietary high speed link, and a transmission over the proprietary high speed link must be incurred to get the data mirrored. The increased overhead increases the server processing loads and thereby increases the cost in CPU time of each transaction. The protocol conversion between SCSI and the protocol on the proprietary high speed link between the servers forces the servers to double as pseudorouters.
Other backup products that are commercially available are software products like the Replica backup software from Stac Electronics in Carlsbad, Calif. This product can do nothing to solve the problem solved by the invention in speeding up read transactions on a FC-AL since it cannot isolate backup devices from fast main loop devices and there is no mirrored storage. This means that transactions to the main storage devices cannot be started until the backup transactions from the main storage devices to the backup storage devices are completed.
The requirement for hierarchical storage management creates a conflict in system performance and cost considerations. Hierarchical storage management functions are typically implemented with low performance, low cost devices whereas online transaction processing systems are implemented with high performance, high cost devices. The mixing of hierarchical storage management devices on the same FC-AL with high speed, high performance on-line transaction processing devices results in a significant reduction in overall on-line transaction processing performance. This is because the high performance devices must wait for the low performance hierarchical storage management devices to complete their tasks before the high speed transaction processing devices can continue their work, since only one pair of devices can have control of the Fibre Channel Arbitrated Loop at any particular time. Thus, when a hierarchical storage management server has control of the FC-AL to carry out a write transaction to a backup disk drive, no high speed transaction processor can simultaneously be using the FC-AL to do either read or write transactions to the JBOD drives. This becomes an intractable problem as on-line transaction processing systems expand over terabytes of storage and will become worse over time as huge data structures like image, voice and video files are added to these already large data structures.
Therefore, a need has arisen for a way to implement hierarchical storage management functions in such high performance, online transaction processing systems without severely negatively impacting the performance of such systems.
SUMMARY OF THE INVENTION
A solution to this problem is provided according to the teachings of the invention by mirroring the data over a Bridge. The Bridge segments the FC-AL into two FC-ALs or two LANs of any other type. One of these LANs carries high speed, high data availability online transaction processing traffic between a Transaction Server and a primary memory, typically comprised of a high performance JBOD or other high speed storage devices. This first LAN or local area network will hereafter be called the Primary Loop or primary FC-AL. The second LAN or FC-AL will be called the Secondary Loop or secondary FC-AL. It carries hierarchical storage management (hereafter HSM) traffic between an HSM backup server and a mirrored memory, typically also a high performance JBOD or other high speed storage device. The Secondary Loop also carries data between the mirrored memory and backup/archival storage devices such as tape drives or WORM drives. The Primary Loop and Secondary Loop referred to herein are preferably FC-AL, but they may be any other type of local area network such as Ethernet, fast Ethernet, ATM etc.
The way this system works is as follows. For any write transaction from a high speed transaction processor to a high performance JBOD over the Primary Loop, an additional write operation is carried out over the Bridge and the Secondary Loop to the mirrored-storage HSM disk drives. This write transaction to the mirrored disk drive device or devices on the Secondary Loop, which mirrors the data stored on the disk drives in the Primary Loop, does not slow down processing on the Primary Loop because of the presence of the Bridge. If the Bridge were not present, a write transaction to a backup storage device such as a mirrored backup disk drive would stall all further transactions with the fast storage devices in the Primary Loop until the transaction with the backup storage device was completed. However, the mirrored disk drives are fast devices so write transactions to them are very fast even though the data must traverse the Bridge. Thus processing of further transactions on the Primary Loop is not appreciably slowed down by a write transaction to a mirrored disk drive on the Secondary Loop. Later, data on the mirrored disk drives in the Secondary Loop is moved by an HSM server on the Secondary Loop to slower backup devices like streaming tape backup drives, WORM drives etc. Since all nodes involved in this secondary backup operation are on the Secondary Loop, which is insulated from the Primary Loop by the Bridge, the secondary backup HSM transactions do not tie up the Primary Loop at all.
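The write path just described can be illustrated with a small Python model. This is only a sketch under stated assumptions: the `Loop` and `Store` classes, the `transfer`, `mirrored_write` and `hsm_backup` helpers, and the simplification of charging one loop tenancy per transfer to each loop involved are hypothetical and introduced here for illustration; they are not part of the disclosed apparatus.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Loop:
    """One arbitrated loop; counts the loop tenancies it has carried."""
    name: str
    tenancies: int = 0


@dataclass(eq=False)
class Store:
    """A storage node (JBOD, mirrored JBOD, tape) attached to one loop."""
    loop: Loop
    blocks: dict = field(default_factory=dict)


def transfer(initiator_loop: Loop, store: Store) -> None:
    """Charge one tenancy to every loop the transfer has to occupy."""
    initiator_loop.tenancies += 1
    if store.loop is not initiator_loop:
        # Assumed Bridge behaviour: couple the loops only for this transfer.
        store.loop.tenancies += 1


def mirrored_write(primary: Store, mirror: Store, block: int, data: bytes) -> None:
    """Transaction Server write: once to primary, once across the Bridge."""
    transfer(primary.loop, primary)
    primary.blocks[block] = data
    transfer(primary.loop, mirror)
    mirror.blocks[block] = data


def hsm_backup(mirror: Store, archive: Store) -> None:
    """HSM server copies mirror to archive using the Secondary Loop only."""
    for block, data in list(mirror.blocks.items()):
        transfer(mirror.loop, mirror)    # read from the mirrored drives
        transfer(mirror.loop, archive)   # write to tape or WORM
        archive.blocks[block] = data


primary_loop, secondary_loop = Loop("primary"), Loop("secondary")
primary, mirror, tape = Store(primary_loop), Store(secondary_loop), Store(secondary_loop)

mirrored_write(primary, mirror, 0, b"txn-0")
mirrored_write(primary, mirror, 1, b"txn-1")
before = primary_loop.tenancies
hsm_backup(mirror, tape)
print(primary_loop.tenancies - before)  # tenancies the backup added to the Primary Loop
```

In this model only the mirrored write touches both loops; the backup pass runs entirely on the Secondary Loop, which is the isolation property the text attributes to the Bridge.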
In contrast, in a prior art FC-AL network without a Bridge, whenever an HSM server started moving data from the mirrored storage to the backup devices such as streaming tape or WORM drives, transactions between the transaction processor and the main disk drives could not occur because the single FC-AL loop to which all nodes were coupled was tied up with HSM transactions to backup devices. With the architecture according to the teachings of the invention, the presence of the Bridge isolates the secondary HSM transactions between the backup devices on the Secondary Loop so that the Primary Loop is not tied up in a loop tenancy and is free to carry out a concurrent transaction between the Transaction Server and the Primary Loop disk drives.
Read operations on the high performance FC-AL segment are unaffected since the Bridge keeps the read operation confined locally to the Primary Loop so no loop tenancy on the Secondary Loop results from a read transaction on the Primary Loop. This provides a major advantage since read operations can outnumber write transactions on the Primary Loop by factors of 9 to 1 in many installations. Thus, these read transactions between the transaction processor and the disk drives on the Primary Loop can proceed at a very fast pace on the Primary Loop and need not be slowed down by slower HSM transactions occurring on the Secondary Loop. That is, HSM transactions on the HSM FC-AL segment are performed between the backup storage devices, the HSM server and the mirrored disk storage all of which are coupled to the Secondary Loop, and are kept out of the high speed primary FC-AL segment by the Bridge. In this way, 16 to 24 hour or overnight backups no longer have any impact on system operation on the Primary Loop. The only time penalty that is traded off for the acceleration of read transactions is that there is a small latency in the Bridge for each write transaction between the Transaction Server and a destination node on the Primary Loop while the Bridge looks up the destination address in the forwarding table or learns the location of the destination node and decides to keep the traffic on the Primary Loop. This small time penalty is more than offset by the great increase in the read transaction processing rate and the greater security of being able to do data backup and other HSM functions more frequently without slowing down processing on the Primary Loop.
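The forwarding decision mentioned above can be sketched as a lookup in a learned address table. The following Python fragment is a minimal illustration, not the Bridge's actual logic: the `Bridge` class, its `learn` and `on_opn` methods, the port and node names, and the policy of forwarding an OPN whose destination has not yet been learned are all assumptions made for this example.

```python
class Bridge:
    """Toy two-port bridge that learns which loop each node address lives on."""

    def __init__(self):
        self.forwarding_table = {}  # node address -> port name

    def learn(self, address, port):
        self.forwarding_table[address] = port

    def on_opn(self, arrival_port, source, destination):
        """Return 'local' to keep the tenancy on one loop, else 'forward'."""
        self.learn(source, arrival_port)
        known_port = self.forwarding_table.get(destination)
        if known_port == arrival_port:
            return "local"
        # Destination is on the other loop, or not yet learned (assumed
        # policy: forward unknown destinations so they can be discovered).
        return "forward"


bridge = Bridge()
bridge.learn("primary_jbod", "primary")
bridge.learn("mirror_jbod", "secondary")

# Nine reads and one mirrored write, matching the 9-to-1 ratio cited above.
decisions = [bridge.on_opn("primary", "txn_server", "primary_jbod") for _ in range(9)]
decisions.append(bridge.on_opn("primary", "txn_server", "mirror_jbod"))
print(decisions.count("local"), decisions.count("forward"))
```

With a 9-to-1 read-to-write mix, only the single mirrored write in this toy workload needs a tenancy on the Secondary Loop; the reads stay on the Primary Loop.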
Further, if the Bridge were not present, every primitive and data frame of every transaction on the Primary Loop would have to pass through all the nodes located on the Secondary Loop of the architecture according to the teachings of the invention because they would all be located on the Primary Loop if the Bridge were not present. Thus, each primitive and data frame in FC-AL protocol transmissions would suffer the 6 word delay in each of these nodes even though none of the nodes of the backup machinery is directly involved in the primary transaction.
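As a rough illustration of the delay described above, the following sketch estimates how much latency one primitive or frame would pick up by passing through the backup-side nodes if they shared the Primary Loop. The 6 word per-node delay comes from the text. The 40 bit transmission word, the 1.0625 Gbaud line rate and the node count are assumptions chosen for illustration only.

```python
WORD_BITS = 40              # assumption: 4 characters x 10 bits each under 8b/10b
LINE_RATE_BAUD = 1.0625e9   # assumption: a 1 Gbit/s class Fibre Channel link
NODE_DELAY_WORDS = 6        # per-node delay stated in the text


def added_latency_seconds(extra_nodes: int) -> float:
    """Delay one primitive or frame accumulates passing through extra_nodes nodes."""
    word_time = WORD_BITS / LINE_RATE_BAUD
    return extra_nodes * NODE_DELAY_WORDS * word_time


if __name__ == "__main__":
    # Hypothetical example: 10 nodes that the Bridge would otherwise isolate.
    print(f"{added_latency_seconds(10) * 1e6:.2f} microseconds per pass")
```

Under these assumptions the penalty is paid by every primitive and every frame, so it compounds over a multi-frame transaction.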
The architecture and protocol according to the teachings of the invention provide an overall performance boost over nonsegmented, high speed online transaction processing FC-AL networks with integrated HSM components since, in the typical online transaction processing center, read operations dominate write transactions by a large factor according to the information available to the inventors. FC-AL networks in general provide a performance boost as FC-AL networks can be faster than SCSI bus networks (often as much as 5 times faster). SCSI bus networks have a performance curve expressed in I/O transactions per second that rises linearly with increasing numbers of disk drives coupled to the Transaction Server by SCSI buses until a certain level is reached. At that level, performance levels off with increasing numbers of devices and ultimately performance falls off with still larger numbers of devices coupled to the Transaction Server by SCSI buses. The reason for this falloff is that SCSI bus transactions involve a fair amount of overhead processing to set up every I/O transaction. As the number of devices rises, the amount of overhead processing also rises with the number of drives and the number of buses. Ultimately, the overhead of setting up the transaction with the various drives begins to catch up with and overtake the performance increase with increased numbers of drives. It is this phenomenon which causes performance to fall off.
Because Fibre Channel Arbitrated Loops are substantially faster than SCSI buses, there is a movement in the data storage industry to move away from SCSI buses and toward FC-AL networks. In FC-AL networks, more drives can be added than is the case in SCSI bus networks before the performance begins to fall off. This is because of the higher speed of FC-AL networks. Thus, it is easier to attain one terabyte of storage in an FC-AL network without sacrificing performance than in SCSI bus networks.
Further, most Transaction Servers today use PCI buses internally (PCI buses are internal buses of a computer or workstation with a particular bus protocol called PCI). A maximum of 3 SCSI buses can be coupled to a PCI bus. This is because more than three SCSI buses per PCI bus couples too much parasitic capacitance load to the PCI bus, thereby slowing it down. Thus, attaining a one terabyte storage system with SCSI buses also requires the use of more Transaction Servers than is the case in an FC-AL network because of the limit on the number of SCSI buses that can be coupled to a PCI bus. The large number of SCSI buses to the JBOD and RAID arrays to reach one terabyte of storage therefore dictates that more Transaction Servers must be used, thereby increasing the overall cost and complexity of the system.
Further, since there is no protocol conversion in a network constructed according to the teachings of the invention, the overhead cost in CPU time to get all write data copied to mirrored storage disk drives on the Secondary Loop is much less than for the Vinca prior art where a protocol conversion from the SCSI protocol to the protocol of the proprietary high speed data link between the primary and HSM servers is necessary. Further, the Transaction Server and HSM server in a system according to the teachings of the invention need not have software that can perform a pseudorouter function or the type of protocol conversion needed in the Vinca prior art.
Additional hardware enhancements to support simultaneous writes to the Bridge and the online storage JBODs and enhancements to the Fibre Channel Arbitrated Loop protocol supporting multicast to a Bridge and one or more other nodes can further enhance performance without loss of HSM functionality.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a system employing the teachings of the invention.
FIG. 2 is a diagram showing the structure of one embodiment of a half Bridge according to teachings of the invention.
FIG. 3 is a diagram showing how two half Bridges according to the first embodiment of the teachings of the invention can be cross connected internally to make a full Bridge.
FIG. 4 is a diagram showing how two half Bridges constructed according to any embodiment of the teachings of the invention may be connected together to link two Fibre Channel Arbitrated Loops and achieve acceleration over a single FC-AL having the same number of nodes as the two smaller FC-ALs linked by the two half Bridges.
FIG. 5 is a diagram showing the internal construction of a half Bridge according to the preferred embodiment of the invention.
FIG. 6 is a diagram showing the internal construction of a TX port of a half Bridge according to the preferred embodiment of the invention.
FIG. 7 is a diagram showing the internal construction of an RX port of a half Bridge according to the preferred embodiment of the invention.
FIG. 8, comprised of FIGS. 8A through 8G, is a flow chart showing processing by the TX and RX ports of a half Bridge according to the preferred embodiment of the invention to carry out bridging, learning and conflicting OPN preemption decisions.
FIG. 9 is a block diagram of a full Bridge using the preferred TX and RX port structures of FIGS. 6 and 7.
FIG. 10 is a flowchart of the portion of the software that controls processing by the Transaction Server to perform mirrored write transactions.
FIG. 11 is a block diagram of an alternative embodiment of the invention which does not use a separate Bridge.
FIG. 12 is a block diagram of an alternative embodiment of the invention that uses a single Transaction Processor.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Referring to FIG. 1, there is shown a block diagram of a system employing the teachings of the invention. A high speed primary transaction processing Fibre Channel Arbitrated Loop (FC-AL) 10 couples a Primary Storage Bank of Hard Disk Drives comprised of a first bank of hard disk drives 12 and a second bank of hard disk drives 14 to a high speed Transaction Server 16. A secondary FC-AL 26 is coupled to the Primary Loop through Bridge 28 and is also coupled to mirrored disk drives 32 and 34 and a backup & HSM server 30. The hard disk drives 12 and 14 on the Primary Loop 10 as well as their mirrored counterparts 32 and 34 on the Secondary Loop 26 and both servers need to be compatible with the Fibre Channel Arbitrated Loop protocol if Fibre Channel Arbitrated Loops are used for the Primary Loop 10 and the Secondary Loop 26. The Seagate Barracuda is one example of an FC-AL compatible disk drive.
The Transaction Server 16 is coupled via a local area network link such as a 10Base-T twisted pair 18 to a network hub 20 such as an Ethernet.RTM. hub. The hub 20 is coupled to a plurality of workstations such as workstations 22 and 24 by individual network drop lines 21 and 23. These workstations carry out processing of transactions which sometimes require that data be read from the banks of hard disk drives 12 and 14 and sometimes require that data be written thereto. These I/O transactions are carried out by the Transaction Server 16 by carrying out read and write transactions in accordance with the Fibre Channel Arbitrated Loop protocol. In some embodiments, the high speed FC-AL 10 may be implemented using an accelerated protocol utilizing the methods and apparatus disclosed in a co-pending U.S. patent application entitled, "ACCELERATED FIBRE CHANNEL HUB AND PROTOCOL", U.S. Ser. No. 08/695,290, filed Aug. 8, 1996, now U.S. Pat. No. 5,751,715, which is hereby incorporated by reference. In such an embodiment, the FC-AL 10 would be implemented through an accelerated hub of the type described in the "Accelerated Fibre Channel Hub And Protocol" patent application referenced above.
The high speed transaction processing FC-AL 10 (hereafter referred to as the high speed loop) is coupled to a secondary hierarchical storage management Fibre Channel Arbitrated Loop 26 by a learning Bridge 28. The learning Bridge 28 may have the design of the Bridge disclosed in co-pending U.S. patent application "FIBRE CHANNEL LEARNING BRIDGE, LEARNING HALF BRIDGE, AND PROTOCOL", U.S. Ser. No. 08/786,891 Jan. 23, 1999, filed Jan. 23, 1997, now pending, which is hereby incorporated by reference and the essence of which is described further below. As mentioned above, the Secondary Loop 26 couples the Bridge to a backup and hierarchical storage management server 30 (hereafter HSM server) and two banks 32 and 34 of mirrored storage hard disk drives. The HSM server 30 is also coupled via a SCSI bus 36 to two online low speed data storage devices 38 and 40 used for storing archival backups and data which is used infrequently or which has not been used recently. In the embodiment shown in FIG. 1, low speed data storage device 38 is a CD-based write-once, read-many (WORM) drive and device 40 is a tape backup system.
The way the system of FIG. 1 works is by mirroring write transactions in the high speed loop across the Bridge. That is, every time Transaction Server 16 does a write transaction, it arbitrates for the high speed loop, and when it wins control thereof (more than one transaction processing server may be on high speed loop 10), it transmits an open (hereafter OPN) primitive to the particular disk drive in bank 12 or bank 14 to which it wishes to store data. The acronyms OPN, CLS, ARB, RRDY and LIP will be used to designate certain primitives used for signalling other nodes in the Fibre Channel Arbitrated Loop protocol. The OPN primitive is a unique set of bits defined in the Fibre Channel Arbitrated Loop specifications (publicly available at http://www.t11.org; the standard which covers the ARB/OPN/CLS protocol is "Fibre Channel Arbitrated Loop", which is publicly available at ftp://ftp.t11.org/t11/pub/fc/al-2/98-170v5.pdf under document number X3.272-199x revision 6.6 under T11/Project 1133D/Rev 6.6). The OPN primitive is transmitted followed by two node addresses and tells all other nodes that it passes through that there is a desire to begin a loop tenancy between a source node and a destination node whose destination address is one of the addresses that follow the OPN primitive. The CLS primitive is a unique set of bits defined in the Fibre Channel Arbitrated Loop specifications which is transmitted by either the source node or a destination node in a loop tenancy to inform the other node in the loop tenancy that the tenancy is to be terminated. The ARB primitive is a unique set of bits defined in the Fibre Channel Arbitrated Loop specifications which is transmitted, followed by a priority code, by a node which wants to assume control of the loop and establish a loop tenancy.
The RRDY primitive is a unique set of bits defined in the Fibre Channel Arbitrated Loop specifications which is transmitted by a node which receives an OPN primitive to indicate that the node which transmitted the RRDY has space available in its buffer to receive one frame of data. The LIP primitive is a unique set of bits defined in the Fibre Channel Arbitrated Loop specifications which is transmitted by a node to tell all other nodes to reinitialize their software and restart their FC-AL protocols. It is in effect a reset or reboot command. The disk drive on the high speed loop so addressed replies with an RRDY primitive and then the Transaction Server writes a data frame to that disk drive. This process continues with RRDY and data frames being exchanged until either all the data to be stored has been transferred, or one or the other of the Transaction Server 16 or the disk drive transmits a close (CLS) primitive indicating that no more data can be transmitted or accepted during this transaction. This can happen when a buffer in the disk drive becomes full or the Transaction Server for some reason cannot currently complete sending all the data. If the transaction is cut short before all data is transferred, the Transaction Server reopens the connection later and completes the data transfer.
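The exchange described above can be sketched as a small simulation: the Transaction Server opens the drive, sends one data frame per RRDY, and reopens the connection later if a CLS cuts the tenancy short. This is a minimal sketch, not the protocol itself. The function name and the `credits_per_tenancy` parameter (a stand-in for how many frames the drive accepts before its buffer fills and it sends CLS) are hypothetical.

```python
from collections import deque


def write_with_reopen(frames, credits_per_tenancy):
    """Return the sequence of primitives and frames for one write of `frames`.

    One data frame is sent per RRDY. When the drive runs out of buffer space
    (modelled by credits_per_tenancy, a hypothetical parameter), a CLS ends the
    tenancy and the sender reopens the connection to finish the transfer.
    """
    if credits_per_tenancy < 1:
        raise ValueError("the drive must accept at least one frame per tenancy")
    pending = deque(frames)
    log = []
    while pending:
        log.append("OPN")                        # server won arbitration and opens the drive
        for _ in range(credits_per_tenancy):
            if not pending:
                break
            log.append("RRDY")                   # drive has room for one frame
            log.append(f"DATA:{pending.popleft()}")
        log.append("CLS")                        # tenancy ends, complete or cut short
    return log


if __name__ == "__main__":
    for event in write_with_reopen(["a", "b", "c"], credits_per_tenancy=2):
        print(event)
```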
Assuming that the data storage transaction has been completed, the Transaction Server next begins the process of doing a mirrored write transaction of the same data to the mirrored storage hard disk drives 32 and 34. To do this, the Transaction Server 16 generates an OPN primitive addressed to either the backup and HSM server 30 or directly to one of the mirrored storage hard disk drives in banks 32 and 34. In the preferred embodiment, there is a mirrored hard disk drive on the Secondary Loop for every hard disk drive on the Primary Loop and the counterpart drives on the Primary and Secondary Loops have identical capacities. Therefore, the hard disk drives on the Secondary Loop have identical content to the hard disk drives on the Primary Loop at substantially all times, and recovery after a disk crash on the Primary Loop can be made by copying the data from the mirrored drives 32 and 34 onto replacement drives on the Primary Loop. When the mirrored drives 32 and 34 have the same capacity as the primary drives, the Transaction Server 16 can address OPN primitives directly to the mirrored drives. In some species within the genus of the invention, the mirrored drives 32 and 34 can have smaller capacity than the primary drives 12 and 14 to save on costs for the system. In these species, the OPN of the mirrored write transaction will be addressed by the Transaction Server 16 to the HSM server 30 which then finds open space on the "mirrored", i.e., backup drives 32 and 34 and forwards the data to the appropriate disk drive on the Secondary Loop. In these embodiments, the HSM server writes the data to an available portion of the backup disk drives 32 and 34 and later archives the data off these backup drives and onto the secondary backup devices 38 and 40 so as to free up space on the backup drives for more writes from the Transaction Server. In all embodiments, the Transaction Server 16 includes some mechanism to retry a write to either the HSM server 30 or the backup disk drives 32 and 34 later if a write to either destination is attempted but fails because the Secondary Loop is busy with another local transaction such as copying data from the backup drives to the secondary backup devices 38 or 40. The details of this "try again later" mechanism in the Transaction Server are not critical to the invention, and any way of detecting that the Secondary Loop is busy and scheduling another attempt to write the data to the backup drives later will suffice for practicing the invention.
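Since the text leaves the "try again later" mechanism open, the following is only one possible sketch of it: a time-ordered queue of mirrored writes that failed because the Secondary Loop was busy. The class name, the fixed retry delay and the `try_write` callback (which should return False when the write is refused) are all hypothetical.

```python
import heapq


class MirrorRetryQueue:
    """Holds mirrored writes that must be retried later (illustrative only)."""

    def __init__(self, retry_delay=1.0):
        self.retry_delay = retry_delay   # hypothetical fixed delay between attempts
        self._heap = []                  # entries are (due_time, sequence, payload)
        self._seq = 0                    # tie breaker so payloads are never compared

    def schedule(self, payload, now):
        """Queue a failed mirrored write for another attempt after retry_delay."""
        heapq.heappush(self._heap, (now + self.retry_delay, self._seq, payload))
        self._seq += 1

    def run_due(self, now, try_write):
        """Attempt every write that is due; reschedule the ones that fail again."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        completed = []
        for payload in due:
            if try_write(payload):       # True means the mirrored write went through
                completed.append(payload)
            else:                        # Secondary Loop still busy
                self.schedule(payload, now)
        return completed
```

A caller would invoke `schedule` when a mirrored write is refused and call `run_due` periodically with the current time.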
Hereafter, the discussion assumes a species where the OPN of the mirrored write transaction is directed to the mirrored disk drives directly. When the OPN primitive reaches Bridge 28, its destination address is determined by the Bridge to be on the Secondary Loop, so the OPN primitive is forwarded to the Secondary Loop, unless the Secondary Loop is busy with a local transaction. If there are conflicting OPNs on the Primary and Secondary Loops 10 and 26, the Bridge 28 resolves the conflict in the manner described below, which may involve sending a preemptive CLS back to the Transaction Server 16. If the Transaction Server receives a CLS before it completes its write to the Secondary Loop, the "try again later" software routine in the Transaction Server is triggered to reschedule another try later.
Assuming no conflicting OPN situation has arisen, and the OPN from the Transaction Server 16 is forwarded onto the Secondary Loop, the following things happen. On the Secondary Loop, the OPN primitive propagates to the destination disk drive. The destination disk drive then replies with an RRDY or CLS primitive which propagates around the Secondary Loop 26 until it reaches the Bridge 28. When the Bridge 28 sees the RRDY or CLS primitive returned from the Secondary Loop, it concludes that the source is on the Primary Loop and the destination is on the Secondary Loop. Therefore, the Bridge connects the primary and Secondary Loops together as one big loop. The RRDY or CLS primitive thus propagates back to the Transaction Server 16 and the data write transaction is completed in the same fashion as the original write transaction to the disk drive on the Primary Loop was completed. Thus, the same data that was written to the disk drive on the Primary Loop is also written to the mirrored disk drive on the Secondary Loop. After this transaction is completed, the loop tenancy is relinquished, and the Bridge 28 separates the Primary Loop and Secondary Loop again so as to keep all purely local traffic on each loop so two purely local concurrent loop tenancies on the primary and Secondary Loops can coexist.
The backup and HSM server 30 is free to use the Secondary Loop 26 whenever the Secondary Loop is not tied up in a loop tenancy with the Transaction Server 16. The HSM server 30 uses the Secondary Loop to carry out its hierarchical storage management duties without slowing down the Primary Loop by virtue of the presence of Bridge 28. Arbitration for the Secondary Loop will always be won by the Transaction Server 16 as between it and the HSM server 30 since the Transaction Server 16 will have a higher priority. Thus, any time both the HSM server and the Transaction Server are arbitrating for control of the Secondary Loop, the Transaction Server 16 will win and be able to complete its transaction before the HSM server 30 can continue any storage management function involving use of the Secondary Loop 26. Typical HSM transactions will be to move data that has not been used for a predetermined period or which is older than a predetermined age to archival storage. This is typically done by reading the data to be archived from the mirrored storage drives and then doing write transactions to the CD WORM drive 38 or the tape drive 40.
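The priority rule above can be sketched in a few lines. The text says only that the Transaction Server has the higher priority, not how that priority is assigned. This sketch assumes the usual FC-AL convention that a numerically lower arbitrated loop physical address wins arbitration, and the two address values are hypothetical.

```python
TRANSACTION_SERVER = 0x01   # hypothetical address, chosen low so that it wins
HSM_SERVER = 0xE8           # hypothetical address, chosen high so that it loses


def arbitration_winner(arbitrating_addresses):
    """Return the address that wins arbitration, or None if nobody is arbitrating.

    Assumption: a numerically lower address means higher priority.
    """
    if not arbitrating_addresses:
        return None
    return min(arbitrating_addresses)
```

With these values the HSM server gains the Secondary Loop only when the Transaction Server is not arbitrating for it.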
The Bridge 28 works generally as follows. The OPN primitive issued by the Transaction Server has a destination address which designates the address of the particular hard disk drive to be opened. This destination address is part of the data structure of the OPN primitive. The Fibre Channel Arbitrated Loop learning Bridge 28 monitors loop traffic for the destination addresses of any OPN primitives and monitors other primitives that result from the OPN primitive so as to draw conclusions as to the locations of each node in terms of whether they are on the Primary Loop 10 or the Secondary Loop 26. This process is explained more fully below. The locations of the nodes are stored in a forwarding table stored in memory in the Bridge 28. Each destination address in an OPN primitive issued from either the Transaction Server 16 or the HSM server 30 (or any other node on either loop) is checked against any entry for that destination address in the forwarding table. When the destination address in an OPN received from the Secondary Loop is found in the table and the table indicates that the destination address is located on the Primary Loop 10, the Bridge forwards the OPN primitive to the Primary Loop 10. Likewise, when an OPN primitive arrives from the Primary Loop, and the forwarding table indicates that the node is located on the Secondary Loop, the OPN primitive is forwarded to the Secondary Loop side of the Bridge. After forwarding of an OPN primitive, the destination device sees its destination address in the OPN primitive and replies either with an RRDY or a CLS primitive which causes the Bridge to conclude that the source and destination nodes are on opposite sides of the Bridge and to switch so as to make connections to couple the Primary Loop and Secondary Loop together as one loop. The data transfer process is then completed in accordance with ordinary Fibre Channel protocol rules. 
If the Bridge 28 determines that the destination address of an OPN is located on the same loop from which it was received, the OPN is not forwarded and the transaction is completed on a purely local basis such that the nodes on the loop having neither the destination node nor the source node are bypassed. There follows a more detailed discussion of the construction and operation of the full Bridge 28 and the half Bridges of which it is constructed.
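The forwarding decision just described can be sketched as a table lookup. This is a simplified illustration rather than the Bridge's actual logic: the names are hypothetical, and treating an unknown destination as being on the other loop is an assumption borrowed from the half Bridge initialization rule, under which every destination is initially presumed remote.

```python
from enum import Enum


class Loop(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


def should_forward(forwarding_table, arrived_from, dest_addr):
    """Decide whether an OPN arriving from `arrived_from` crosses the Bridge.

    forwarding_table maps a destination address to the Loop it was learned on.
    """
    location = forwarding_table.get(dest_addr)
    if location is None:
        # Assumption: an unlearned destination is presumed to be on the other loop.
        return True
    # Forward only when the destination is not on the loop the OPN came from.
    return location is not arrived_from


if __name__ == "__main__":
    table = {0x01: Loop.PRIMARY, 0x02: Loop.SECONDARY}   # hypothetical addresses
    print(should_forward(table, Loop.PRIMARY, 0x02))      # mirrored write crosses
    print(should_forward(table, Loop.PRIMARY, 0x01))      # local read stays local
```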
Half Bridge Operation
Referring to FIG. 2, there is shown the structure of a half Bridge according to one alternative embodiment of the invention wherein the TX port sets data in the memory during the learning process. FIG. 2 shows how a half Bridge can be used to divide a Fibre Channel Arbitrated Loop into two loop segments--a local loop segment 52 and a remote loop segment 54, although half Bridges are never used alone to do this and are always coupled to another half Bridge. The remote loop segment is comprised of two segments: a TX or transmit segment coupling the RX port of the left half Bridge to the TX port of the right half Bridge; and an RX or receive segment coupling the TX port of the left half Bridge to the RX port of the right half Bridge. For purposes of illustrating generally how a half Bridge according to the teachings of the invention works, FIG. 2 does not show the other half Bridge, but represents it by remote loop segment 54. The half Bridge, when used in conjunction with another half Bridge to form a full Bridge, prevents any local loop traffic from being transmitted into the remote loop segment across the full Bridge as the half Bridges learn the addresses of the nodes on the local loop segment.
The half Bridge has a TX port 56 having internal terminals 1, 2 and 3 which are coupled to a state machine (not shown) which implements part of the switching rules of the half Bridge. Likewise, the half Bridge has an RX port 58 which also has terminals 1, 2 and 3 which are also coupled to a state machine (not shown) which implements the remaining switching rules of the half Bridge. The details of the internal construction of the TX and RX ports are not critical to the invention. For example, the structure described below for the preferred embodiment can be used with suitable modification of the control signals that flow between the TX and RX ports. Alternatively, other constructions for the TX and RX ports may also be used such as control by a programmable machine of off-the-shelf FC-AL chipsets to implement the learning and switching rules described herein. The TX and RX ports of FIG. 2 work generally the same as the TX and RX ports 100 and 102 of FIGS. 6 and 7 in terms of switching rules, learning, handling ARB fill words and RRDY primitives, and resolving concurrent OPN situations. The main difference between the embodiment of FIG. 2 and the preferred embodiment of FIG. 5 is in how the TX and RX ports control and use memory 78. Accordingly, much of the discussion of the specifics of these common operations is deferred until the discussion of FIGS. 6 and 7. The description of the general operation of the half Bridge of FIG. 2 will be useful to understand the basic principles of FC-AL bridging before the detailed discussion of implementation of the TX and RX ports of FIGS. 6 and 7 swamps the reader in detail.
Terminal 3 of the TX port 56 is coupled to the outbound segment of local loop segment 52 of the loop and terminal 2 is coupled to the inbound segment of the remote loop segment 54 of the FC-AL. Terminal 1 of the TX port is coupled to a local return segment line 60 which is also coupled to terminal 3 of the RX port 58. The local return segment line 60 is used to route primitives and data which originated from a node on the local segment 52 and which are directed to another node on the local segment from the RX port directly back to the TX port so that they need not propagate around the remote segment 54. Also, the RX port 58 contains a latch (not shown but shown in FIG. 5 as latch 200) which is coupled so as to be able to store the source and destination addresses of OPN primitives received at a Local RX Port, i.e., pin 1 of RX port 58 coupled to the inbound portion of the local loop segment 52. Likewise, the TX port 56 includes a latch 202 (not shown but shown in FIG. 6 as latch 202) which is coupled so as to be able to store OPN primitives received at the Remote RX Port, i.e., pin 2 of the TX port 56 coupled to the inbound portion of the remote loop segment 54. The operation of these latches to support resolution of various concurrent OPN scenarios will be explained further below.
The half Bridge 50 is a learning Bridge in that it monitors traffic on the loop segments to which it is coupled and uses the destination addresses of OPN primitives in conjunction with the ensuing RRDY and CLS primitives and the terminal on which they arrive to draw conclusions regarding whether a node having a particular address is on the local segment or the remote segment. How this is done will be explained in more detail below, but for completeness here, the half Bridge keeps a forwarding table in memory 78 which is consulted by the half Bridge when each new OPN primitive arrives. The half Bridge uses the destination address of the OPN primitive as an index into a forwarding table stored in a memory 78 to obtain data indicating whether the node having the destination address in the OPN primitive is on the local segment 52 or the remote segment 54.
The half Bridge can only short circuit data frames and primitives through the local segment return 60 for the local segment 52 and not for the remote segment 54. Data frames and primitives propagating on the remote segment 54 arrive at TX port 56 and propagate around the local segment and through the RX port 58 as if the half Bridge was not really there. However, primitives and commands that are propagating on the remote segment 54 cannot be short circuited by the half Bridge and must propagate around the local segment 52 even if the destination node is on the remote segment.
The reason for existence of the half Bridge is that it makes local loop segmentation and acceleration possible and is particularly useful in situations where there are two separate concentrations of nodes, each concentration separated from the other by some appreciable distance. In such a situation, symbolized by FIG. 4, the first concentration of nodes is coupled by a segment 1 FC-AL designated 62 which is coupled to a first half Bridge 66. The second concentration of nodes is coupled by a segment 2 FC-AL 64 which is also coupled to a second half Bridge 68. This situation requires the use of a long fiber optic link represented by fibers 70 and 72. Fiber 70 couples the RX port of half Bridge 66 to the TX port of half Bridge 68. Fiber 72 couples the RX port of half Bridge 68 to the TX port of half Bridge 66. The two half Bridges in this configuration cooperate to implement a full Bridge which learns the locations of each node on the first and second segment FC-ALs by watching the traffic on the loop. As the half Bridge 66 becomes smarter, more traffic which is between nodes on the first segment 62 gets short circuited in the half Bridge 66 through that Bridge's local return segment. This prevents purely local traffic on loop segment 62 from having to incur the propagation delays inherent in propagating across the fiber optic links 70 and 72 and the latency of each node on the second segment 64 in order to get back to the destination node on the first segment. Likewise, as the half Bridge 68 becomes smarter, more traffic which is between nodes on the second segment 64 gets short circuited in the half Bridge 68 through that Bridge's local return segment, thereby accelerating purely local operations on the second segment 64 just as half Bridge 66 accelerates purely local operations on the first segment 62.
The way the half Bridge of FIG. 2 functions to carry out its learning process and perform switching to accelerate transactions where possible is as follows. Memory 78 stores a forwarding table. The forwarding table is a 1024×1 memory which has 1 memory location for each of the 1024 different possible destination addresses that could be generated using a 10 bit AL_PD destination address in an OPN primitive. The reader will note an apparent discrepancy in that the Fibre Channel Arbitrated Loop protocol only permits 126 different destination addresses with destination address numbers 0, F0 (hex), F7 (hex) and F8 (hex) reserved whereas 1024 different destination addresses can be generated using 10 bits. The 126 different possible destination addresses which are actually useable represent 128 addresses possible with a 7 bit address after reserved addresses F0 and 0 are removed. The reserved addresses F7 and F8 are used as a preemption tie breaking method in the event of duplicate half-duplex OPNs. Fibre Channel Arbitrated Loops use 8b/10b encoding and decoding.
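A 1024 by 1 forwarding table of the kind described above can be modelled as a bit array indexed by the 10 bit destination address. The sketch below is illustrative only. The class and method names are hypothetical, and the encoding (1 = local, 0 = remote, with a reset that clears every location to 0) follows the convention stated for memory 78 in the description of the Reset All signal.

```python
class ForwardingTable:
    """One bit per possible 10 bit destination address: 1 = local, 0 = remote."""

    SIZE = 1024

    def __init__(self):
        self._bits = bytearray(self.SIZE // 8)   # 128 bytes hold 1024 one-bit entries

    def _locate(self, addr):
        if not 0 <= addr < self.SIZE:
            raise ValueError("address must fit in 10 bits")
        return divmod(addr, 8)                    # (byte index, bit position)

    def mark(self, addr, local):
        """Record whether the node with this destination address is local."""
        byte, bit = self._locate(addr)
        if local:
            self._bits[byte] |= 1 << bit
        else:
            self._bits[byte] &= ~(1 << bit) & 0xFF

    def is_local(self, addr):
        byte, bit = self._locate(addr)
        return bool((self._bits[byte] >> bit) & 1)

    def reset_all(self):
        """Clear every entry to 0, so every destination is presumed remote."""
        self._bits = bytearray(self.SIZE // 8)
```

Indexing directly by the encoded address trades a sparsely used table for a lookup that needs no decoding step.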
In operation, there are several different possibilities for the states entered by the TX and RX Port State Machines and the switching and other processing that will be carried out thereby. The switching and other processing which is actually carried out by the state machines in any situation depends upon the relative locations of the source node and destination node on the local and remote loop segments.
The state transitions, switching rules and other processing carried out by the transmit and RX ports 56 and 58 in FIG. 2 for situations other than handling of conflicting OPNs are as follows. Switching rules for handling conflicting OPNs are as given below in the discussion of FIGS. 8A through 8G. The details of the circuitry and/or programming to implement these rules to handle conflicting OPNs and do the switching described herein are not critical to the invention.
The TX port and RX port each initialize such that the TX port makes a 2-3 connection between terminals 2 and 3 therein and the RX port makes a 1-2 connection such that the remote loop segment 54 and the local loop segment 52 are coupled together as one big loop in coordination with the initialization condition that every destination node is assumed to be on the remote loop segment. Memory 78 is cleared to all 0s using the Reset All signal line 500 (1=local, 0=remote). The 2-3 connection terminology will be used herein as a shorthand expression meaning making a connection between terminals 2 and 3 and likewise for any other terminal pair. Suppose now the TX port 56 receives an OPN primitive on terminal 2. The TX port of the half Bridge latches the AL_PD (destination address) contained in the OPN primitive in an internal latch (not shown but like latch 202 in FIG. 6), places this destination address on address bus 108 and forwards the OPN primitive out onto the local loop segment 52. The TX port 56 then arms the RX port 58 by activating the Arm signal on line 65. This sets the RX port to a certain state that will cause certain processing to occur if a CLS or RRDY primitive arrives at terminal 1 of the RX port from the local loop segment 52. Arming of the RX port is symbolized in FIG. 2 by activating a signal on line 65 in FIG. 2, although, in the preferred embodiment, the state machines within the TX port 56 and the RX port 58 are the same state machine so all the control signals between the transmit and RX ports shown in FIG. 2 would not be necessary as the data conveyed by these control signals would be known by the state machine by virtue of the state it is in and the transitions between states. In alternative embodiments, the TX Port State Machine can be a separate state machine from the RX Port State Machine. Such embodiments typically use the control signals shown in either FIG. 2 or FIG. 5 flowing between the transmit and RX ports to keep the two different state machines apprised of the conditions encountered by the transmit and RX ports.
If the TX port 56 receives an OPN primitive on terminal 1 from the local segment return 60, it automatically makes a 1-3 connection to forward the OPN primitive out on the local loop segment 52 in case the destination node is there.
If the RX port 58 receives an OPN primitive on terminal 1 from the local loop segment 52, it decides what to do by first determining whether it has been previously armed. If the RX port 58 has been previously armed, it knows the OPN came from the remote loop segment and its destination node is not on the local loop segment 52 because if it had been local, the destination node would have converted the OPN to an RRDY or CLS for transmission to terminal 1 of the RX port. Thus, the RX port knows that the OPN must be forwarded back to the remote loop segment, so the RX port makes a 1-2 connection to forward the OPN primitive out on the remote loop segment 54. The RX port then writes a logic 0 into the forwarding table address currently latched on the address bus 108 indicating that this particular destination is not on its local segment.
If the RX port 58 in FIG. 2 was not previously armed when the OPN primitive arrives on terminal 1, it knows the OPN was generated by one of its local nodes, but it does not know whether the location of the destination node has been previously stored in the forwarding table. In this situation, the RX port latches the destination address of the OPN primitive and places the destination address on the address bus 108. The RX port then reads the memory location mapped to that destination address by activating the Read signal on line 67. Whatever data is stored at that memory location is then output by memory 78 on data bus 112 and causes the RX Port State Machine to enter one of two possible states. If the data returned by the memory is a logic 1, it means that the destination node is local. In this event, the RX port makes a 1-3 connection to transmit the OPN primitive out on the local segment return 60. The RX port also then activates a Local signal on line 71 thereby informing the TX Port State Machine to make a 1-3 connection to keep the traffic local and bypass the delays imposed by the nodes on the remote loop segment. If the data returned from the forwarding table was a logic 0, the destination node is not local, so the RX port 58 makes a 1-2 connection to forward the OPN primitive out on the remote loop segment 54.
The RX port then deactivates the Local signal on line 71 thereby causing the TX port 56 to resume the 2-3 connection of its initialization state.
If the RX port 58 does not receive an OPN primitive on terminal 1 but observes either a CLS or RRDY primitive arriving from the local segment on terminal 1, it knows that the destination node is on the local loop segment 52. Accordingly, the RX port activates the Write One signal on line 73 thereby causing the TX port to activate the Set signal on line 75 so as to set to a logic 1 the memory location mapped to the destination address of the OPN primitive previously received by the TX port from the remote loop segment. Next, the RX port determines whether it has been previously armed. If armed, the RX port knows that an OPN primitive previously received by the TX port from the remote loop segment has been forwarded thereby out onto the local loop segment 52. Therefore, the RX port also knows that the OPN initiator is not local, so it makes a 1-2 connection to forward the RRDY or CLS primitives received at terminal 1 to the source node on the remote loop segment. The RX port 58 then also deactivates the Local signal on line 71 to cause the TX port 56 to make or maintain a 2-3 connection. This connection by the TX port 56 permits any data frames arriving on terminal 2 from the source node to propagate out to the local destination node on the local loop segment 52 so as to allow the data transfer to be completed. In embodiments where the TX port and RX Port State Machines are combined into one machine, the single state machine will only have one state which is assumed in this source remote, destination local situation and will automatically assume the 1-2 connection for the RX port and the 2-3 connection for the TX port.
If the RX port 58 had not been previously armed when it received either an RRDY or CLS primitive at terminal 1, it means both the source node and the destination node are on the local loop segment but the destination node is closer to the RX port than the source node such that the OPN primitive issued by the source node arrived at the destination node before it arrived at the RX port 58. The destination node then replied with either an RRDY or CLS primitive which the RX port sees in an unarmed state. In this situation, RX port 58 makes a 1-3 connection and activates the Local signal on line 71 thereby causing the TX port to make a 1-3 connection. In embodiments where the TX port and RX Port State Machines are combined into one machine, the single state machine will only have one state which is assumed in the source local, destination local situation and will automatically assume the 1-3 connection for the RX port and the 1-3 connection for the TX port. Special rules for handling the fatal embrace case of simultaneous or near simultaneous OPNs arriving at a half Bridge from both the remote and local loop segments will be described in more detail below in the section heading SIMULTANEOUS OPNs.
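The learning and switching rules described above for the FIG. 2 half Bridge can be summarized in a short software sketch. This is only an illustrative model, not the hardware state machine of the specification: the class name `HalfBridgeModel` and its method names are hypothetical, connections are represented as strings such as "1-2", and the model does not show when the RX port is disarmed, since that is not stated in this passage. Writing a logic 1 only when an address is currently latched is likewise a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

LOCAL, REMOTE = 1, 0  # forwarding-table encoding from the text (1=local, 0=remote)


@dataclass
class HalfBridgeModel:
    """Hypothetical software model of the FIG. 2 learning rules (not the hardware)."""
    table: Dict[int, int] = field(default_factory=dict)  # AL_PD -> LOCAL/REMOTE; missing = 0
    armed: bool = False
    latched_dest: Optional[int] = None
    rx_conn: str = "1-2"  # initialization defaults
    tx_conn: str = "2-3"

    def tx_remote_opn(self, al_pd: int) -> None:
        """OPN arrives at TX terminal 2: latch AL_PD, forward locally, arm the RX port."""
        self.latched_dest = al_pd
        self.tx_conn = "2-3"
        self.armed = True

    def rx_local_opn(self, al_pd: int) -> None:
        """OPN arrives at RX terminal 1 from the local loop segment."""
        if self.armed:
            # The remote OPN went around the local loop unanswered: destination is remote.
            self.table[self.latched_dest] = REMOTE
            self.rx_conn, self.tx_conn = "1-2", "2-3"
        elif self.table.get(al_pd, REMOTE) == LOCAL:
            self.rx_conn, self.tx_conn = "1-3", "1-3"  # keep traffic local
        else:
            self.rx_conn, self.tx_conn = "1-2", "2-3"  # forward to the remote segment

    def rx_local_response(self) -> None:
        """CLS or RRDY arrives at RX terminal 1: the destination answered locally."""
        if self.latched_dest is not None:  # assumption: write only when an address is latched
            self.table[self.latched_dest] = LOCAL
        if self.armed:
            self.rx_conn, self.tx_conn = "1-2", "2-3"  # source remote, destination local
        else:
            self.rx_conn, self.tx_conn = "1-3", "1-3"  # source and destination both local
```

For example, calling `tx_remote_opn(0x23)` followed by `rx_local_response()` marks address 0x23 as local and leaves the RX port on 1-2 and the TX port on 2-3, which corresponds to the source remote, destination local situation.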
PREFERRED HALF BRIDGE, TX PORT AND RX PORT STRUCTURES
Referring to FIG. 5, there is shown a block diagram of the preferred embodiment for a half Bridge. In the embodiment of FIG. 5, the RX port 100 does all the setting and resetting of the memory. This differs from the embodiment of FIG. 2 where the TX port sets the forwarding table memory locations to logic 1 and the RX port resets them to logic 0 on system initialization or the reception of a LIP initialization primitive. As is the case for the embodiment of FIG. 2, the LIP primitive will also cause the RX port 100 and TX port 102 to assume their default connections, 1-2 and 2-3, respectively. In the embodiment of FIG. 5, elements having like reference numerals to elements in FIG. 2 are the same and serve the same purpose in the combination.
As in the case of FIG. 2, the preferred half Bridge is comprised of a TX port 102, and an RX port 100 coupled by a local segment return 60 and coupled to a memory 78 which stores location data for each FC-AL node address as the location for the node is learned to be local or remote. The half Bridge of FIG. 5 includes a comparator 402 which functions to compare source and destination addresses of OPNs to resolve concurrent OPN situations. All of these circuits are coupled together by a plurality of different control signals, the function and activation conditions of each which will become clear during the course of the discussion of the flow charts which describe the switching and learning operations carried out by the TX port and RX port.
FIG. 6 is a block diagram of the preferred internal structure of the half Bridge TX port 102. The TX port comprises: a state machine 602 (which can be a programmed microprocessor) which functions to provide the logic which implements the switching and concurrent OPN resolution rules, count RRDYs and controls the switch 608 and insertion/substitution/deletion FIFO circuit 610 to do the operations detailed in the flow charts; an AL.sub.-- PA latch 202 which functions to store the source and destination addresses of OPNs and to automatically substitute hex F7 or hex F8 into the source address field of any half-duplex OPNs; a remote decode circuit 604 which functions to recognize the primitives that arrive at pin 2 from the remote half Bridge and advise the state machine 602; a local decode circuit 606 which functions to recognize primitives that arrive at pin 1 via the local bypass and advise the state machine 602; a switch 608 which controls routing of primitives and data arriving at either pin 2 from the remote half Bridge or pin 1 via the local bypass onto the local loop segment 52 connected to pin 3; and insertion/substitution/deletion FIFO circuit 610 which functions to block OPNs and trailing RRDYS until concurrent OPN situations can be resolved, regenerate and forward winning OPNs, regenerate and forward RRDYs that trailed OPNs, generate and transmit preemptive CLSs, generate fill words such as ARB(0) to arbitrate for control of the local loop and substitute ARB(0) for any incoming fill words during the arbitration process or forward them unchanged when the TX port is not arbitrating, and enter a transparent mode where incoming data frames and primitives are passed therethrough unaltered.
FIG. 7 is a block diagram of the preferred structure for an RX port 100. The RX port is comprised of a state machine 612 (which can be a programmed microprocessor) which functions to control memory 78 and a switch 614 and an insertion/substitution FIFO circuit 616 as well as receive and send various control signals to implement the switching, learning and concurrent OPN resolution rules detailed below; an AL.sub.-- PA latch 200 which latches the source and destination addresses of incoming local OPNs from local loop segment 52 and supplies the AL.sub.-- PA address via bus 201 to the data input of an Insertion/Substitution/Deletion FIFO circuit when an OPN is generated; a decoder circuit 620 which functions to recognize various primitives arriving on the local loop segment 52 and advise the state machine as well as recognizing when a half-duplex OPN arrives thereby causing the state machine to cause conversion thereof in latch 200 to a pseudo-full-duplex OPN; an insertion/substitution FIFO circuit 616 which serves to block OPNs and trailing RRDYs until possible preemption situations can be resolved, generate and transmit OPNs, preemptive CLSs and RRDYs out to the remote half Bridge via pin 2 under control of the state machine 612 and enter a transparent mode where incoming data frames and primitives are passed therethrough unaltered; and switch 614 which serves to route incoming primitives and data at pin 1 and primitives generated by the insertion and substitution FIFO circuit 616 out on either pin 2 to the remote half Bridge or pin 3 to the local bypass.
Before discussing the detailed operation of the preferred embodiment for the half Bridge, a short discussion of full Bridge operation is in order so as to set the context for the discussion of the half Bridge process flow charts.
Full Bridge Operation
Referring to FIG. 3, there is shown a full Bridge formed using two half Bridges according to the embodiment of FIG. 2. FIG. 9 shows a full Bridge formed using two half Bridges according to the embodiment of FIG. 5. This full Bridge is comprised of two half Bridges which have the structure of the half Bridge of FIG. 5 and function the way described in the flow charts of FIGS. 8A through 8G. The full Bridge of FIG. 3 is two half Bridges connected together to filter traffic from two loop segments
52A and 52B and allow two concurrent loop tenancies, i.e., two local transactions may be occurring simultaneously. The two half Bridges 149 and 151 shown in the embodiment of FIG. 3 have identical structures and work identically to the above description of the half Bridge embodiment of FIG. 2.
In the Bridges of either FIG. 3 or FIG. 9, traffic arriving on terminal 2 of the TX port of either half Bridge is minimized. In other words, after the learning process, only OPNs destined to nodes not on a local segment connected to one of the half Bridges get forwarded to the other half Bridge. For example, in the embodiment of FIG. 9, if the source node is node 153 and the destination node is node 155, all data frames traveling between these two nodes are shunted across local return segment
60B and never reach left half Bridge 161. Likewise, if the source node is node 104 and the destination node is node 106, all data frames traveling between these nodes are shunted across local return segment 60A and never reach right half Bridge 163. An advantage of this structure is that concurrent conversations can simultaneously occur between nodes 153 and 155 as one conversation and nodes 106 and 104 as another concurrent conversation. However, if the source node is 153 and the destination node is
106, the OPN primitive from node 153 will be forwarded on line 54B from terminal 1 of RX port 100B to terminal 2 of TX port 102A in accordance with the switching rules defined above. After TX port 102A wins arbitration with ARB(0), the OPN propagates to destination node 106 which responds with an RRDY or CLS primitive. The response primitive arrives at terminal 1 of RX port 100A and is forwarded via a 1-2 connection and line 54A from terminal 2 of RX port 100A to terminal 2 of TX port 102B where it propagates to the source node 153.
Similarly, as shown in FIG. 4, two half Bridges can be used to connect two groups of nodes that are physically separated by long distances. The two half Bridges 66 and 68 can be of the design of FIG. 2 or the design of FIG. 5 or any other design within the teachings of the invention. The two half Bridges are connected together by long fiber-optic segments 70 and 72. Since the propagation delay of light through a fiber optic cable is roughly 5 ns/meter, a 200 meter full-duplex link like fibers 70 and 72 between two loop segments results in 5.times.10.sup.-9 .times.(2.times.200) seconds=2 microseconds of additional latency over a short link, which is the equivalent of about 9 nodes on the link (40 bits per FC word, 6 FC words per node, 941 picoseconds per bit at 1.0625 Gb/sec, or about 226 nanoseconds per node). On a conventional loop, all transfers would incur this additional 9 node equivalent delay. With additional 200 m fiber optic links, additional delay is incurred. For example, with two 200 m links the additional delay would be equivalent to 18 nodes, with three links 27 nodes, etc. On a Bridged segment such as is shown in FIG. 4, only the traffic flowing across the Bridge would incur these delays.
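The latency figures above can be checked with a few lines of arithmetic. The sketch below uses only the constants quoted in the text (5 ns/meter, 40 bits per FC word, 6 FC words per node, 941 picoseconds per bit); the function names are hypothetical and chosen for illustration.

```python
DELAY_PER_METER_S = 5e-9   # propagation delay in fiber, from the text
BIT_TIME_S = 941e-12       # seconds per bit at 1.0625 Gb/sec
BITS_PER_FC_WORD = 40
FC_WORDS_PER_NODE = 6


def link_latency_s(length_m: float, links: int = 1) -> float:
    """Added latency of full-duplex links: each link contributes both fiber directions."""
    return DELAY_PER_METER_S * 2 * length_m * links


def node_equivalents(latency_s: float) -> float:
    """Express a latency as the number of loop nodes that would add the same delay."""
    per_node_s = BITS_PER_FC_WORD * FC_WORDS_PER_NODE * BIT_TIME_S
    return latency_s / per_node_s


if __name__ == "__main__":
    for links in (1, 2, 3):
        latency = link_latency_s(200, links)
        print(links, latency, round(node_equivalents(latency)))
```

With these constants one 200 meter link gives 2 microseconds, which rounds to 9 node equivalents, and two and three links round to 18 and 27, consistent with the figures stated above.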
The operation of the half Bridge embodiment shown in FIG. 5 and the TX and RX ports shown in FIGS. 6 and 7 will be described by reference to the flow charts of FIG. 8 comprised of FIGS. 8A through 8G.
Concurrent half-duplex OPNs present a difficult concurrent OPN resolution problem since the normal preemption rules described below do not work since a half-duplex OPN only has two destination addresses and no source address so an ambiguity is created as to whether they are the same OPN (same source and destination address) or not. This is resolved as follows. First, a rule is adopted that no half duplex OPN can be forwarded from one half Bridge to another. Instead, any half duplex OPN received at the RX port is recognized and automatically converted to a pseudo-full-duplex OPN when it is stored in latch 200 as follows. In a half-duplex OPN where no source address is provided, the half Bridge RX port decode circuit 620 will recognize the arrival of a half duplex OPN at pin 1 and activate the Half Duplex control signal on line 658 of FIG. 7 to alert state machine 612 of the event. The state machine 612 then asserts RX Convert on line 660 to cause latch circuit 200 to insert either hex F7 or hex F8 for the source address in the AL.sub.-- PA latch 200 in place of the destination address that was originally there. The choice between F7 and F8 depends upon how the Bridge was configured upon installation. This permits easy preemption resolution using the full duplex rules described herein in the event of concurrent OPNs with the same AL.sub.-- PD. The OPNs are then compared by the preemption process described below and one or the other of them is preempted. One of the full or pseudo-full-duplex OPNs will survive and will be forwarded by the RX port of the half Bridge which received it to the other half Bridge or over the local bypass path. When a winning OPN is forwarded by the RX port to the remote half Bridge, if it was originally a half-duplex OPN, it is not converted back to its original half duplex form. 
When a winning OPN is forwarded by the RX port over the local bypass, it is converted back to its original half-duplex form by assertion of the RX Convert Back signal on line 778 of FIG. 7 by state machine 612. This causes the Insertion/Substitution circuit 616 to substitute the AL.sub.-- PD of the OPN for its F7 or F8 in the source address field.
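The half-duplex conversion just described amounts to a substitution in the source address field. The following sketch is illustrative only: it represents an OPN as a hypothetical (destination, source) tuple, and the function names are not taken from the specification. The choice of F7 as the default is an assumption standing in for the installation-time configuration.

```python
from typing import Tuple

PSEUDO_SOURCES = (0xF7, 0xF8)  # substitute source addresses named in the text

Opn = Tuple[int, int]  # hypothetical representation: (destination AL_PD, source field)


def to_pseudo_full_duplex(opn: Opn, pseudo_src: int = 0xF7) -> Opn:
    """A half-duplex OPN repeats the destination; replace the repeat with F7 or F8."""
    dest, src = opn
    if dest == src:
        return (dest, pseudo_src)
    return opn  # already full duplex, leave unchanged


def to_half_duplex(opn: Opn) -> Opn:
    """Restore the original form before forwarding over the local bypass."""
    dest, src = opn
    if src in PSEUDO_SOURCES:
        return (dest, dest)
    return opn
```

In this model a half-duplex OPN addressed to 0x23 becomes (0x23, 0xF7) for the address comparison and is restored to (0x23, 0x23) only when it is sent over the local bypass; when forwarded to the remote half Bridge it stays in pseudo-full-duplex form.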
SWITCHING RULES AND STATES FOR THE TRANSMIT AND RX PORTS OF THE PREFERRED EMBODIMENT OF THE HALF BRIDGE OF FIG. 5
There is given below the port state machine state transition and switching rules implemented by the state machines in the RX and TX ports, 100 and 102, respectively, in the embodiment of FIG. 5. The rules are given in the form of flow charts. The section of this specification entitled RULES FOR HANDLING OF ARB PRIMITIVES BY BOTH HALF BRIDGE AND FULL BRIDGE contains arbitration primitive handling rules for both the RX ports and the TX ports of the embodiments of FIGS. 2 and 5, and is incorporated by reference into the following port state transition and switching rules.
RULES FOR HANDLING OF ARB PRIMITIVES BY BOTH HALF BRIDGE AND FULL BRIDGE
ARB primitives are output by any node or TX port or RX port desiring to obtain control of a loop segment prior to transmission. The ARB primitive includes a priority designator therein which indicates the priority of the node or TX port or RX port that issued it. Each node or TX port or RX port which receives an ARB primitive fill word examines the priority thereof. If that node or TX port or RX port desires to transmit, the ARB priority designator of the incoming ARB is changed to the priority designator of the node or TX port or RX port which desired to transmit if the priority thereof is higher than the priority of the incoming ARB primitive and the ARB is forwarded. If the priority of the incoming ARB is higher than the priority of the node, TX port or RX port which desires to transmit, the incoming ARB is forwarded without change. If a TX port, RX port or node sees its own ARB come back to it, it knows it has won the arbitration and has permission to transmit. After a node or a TX port or an RX port sees its own ARB come back to it, it thereafter "swallows" all incoming ARBs, regardless of their priority, by substituting the highest priority address, F0, for the priority designator of all incoming ARBs until the node, TX port or RX port has relinquished the loop. At that time, the ARB handling rules defined above for the time before the node saw its own ARB come back are resumed.
In the case where the TX port 102 receives a remote OPN on pin 2, it begins to arbitrate for control of the local loop segment. It can happen that a node on the local loop segment decides that it needs to transmit. That local node will also begin to arbitrate for the local loop segment. If the local node's ARB comes back to it before the remote OPN arrives, the TX port's ARBs will all be swallowed, and the TX port will block the remote OPN from transmission onto the local loop and latch the source and destination addresses thereof. Meanwhile, the local node will generate an OPN which arrives at the Local RX Port, pin 1 of the RX port 100. This will trigger the preemption rules described below to decide which OPN to preempt. If the local node begins to arbitrate for the local loop segment after the TX port has arbitrated for and won control of the local loop, no preemption situation arises. This is because the local node's ARBs will be swallowed by the TX port so it will never gain control of the local loop segment and transmit a local OPN to the RX port that would cause a concurrent OPN situation to arise.
On a half Bridge, the ARBs from the local nodes arriving at the Local RX Port all remain local, i.e., are transmitted out pin 3 of the RX port to pin 1 of the TX port. The TX port then just forwards the ARB primitive out on pin 3, the Local TX Port. On a full Bridge, the same thing happens. Specifically, the ARB primitives arriving at the Local RX Port of the RX port are not forwarded to the other half Bridge but are sent only to the TX port of the half Bridge which received the ARB primitive. Consequently, on a full Bridge, no ARBs flow between the local and remote loop segments coupled to the two half Bridges. Instead, ARBs arrive at the Local RX Port and are forwarded to terminal 1 of the local TX port without ever being transmitted on terminal 2 of the local RX port (see FIG. 3) to the remote loop segment.
Half Bridges cannot stand alone. They must be connected to another half Bridge either locally to make a full Bridge or remotely as in the configuration of FIG. 4.
Referring to FIG. 8, which is comprised of multiple pages of flow charts labelled FIG. 8A, 8B etc., there is shown a flow chart graphically illustrating the half-to-pseudo-full-duplex conversion processing, learning, preemption and switching rules which are followed concurrently by the TX port 102 and the RX port 100 in each half Bridge of a full Bridge. The reader should read the flowcharts of FIG. 8A et seq. in conjunction with study of FIGS. 5, 6 and 7 for a full understanding of the functions and relationships between the circuit elements and control signal activations. Below there are reiterated 6 possible scenarios some of which require preemption and some of which do not. Processing of the various cases on the flowcharts of FIGS. 8A et seq. is indicated by labels at appropriate branches of the flowchart. The various conflicting OPN situations where preemption of one of the OPNs is necessary can be categorized into 6 different scenarios. Those six scenarios and the preemption rules that are followed in each case are described next as cases 1 through 6.
Case 1: a local OPN at the left half Bridge is forwarded to the right half Bridge followed by receipt at the left half Bridge of a different remote OPN. In this case, conflicting OPNs make preemption of one of them necessary. The preemption decision is based upon a comparison of the addresses of the conflicting OPNs. Half duplex OPNs are converted to pseudo-full duplex OPNs before the preemption address comparison is done. If the address comparison indicates that the local OPN of the left half Bridge is higher priority than the remote OPN received at the left half Bridge, the remote OPN is discarded by the left half Bridge and the right half Bridge sends out a preemptive CLS to close the source node which generated the lower priority remote OPN. If the remote OPN at the left half Bridge is higher priority, the left half Bridge transmits a preemptive CLS on its local loop to close the node which generated the lower priority local OPN. The left half Bridge then arbitrates for control of the left half Bridge local loop using ARB(0). Any ARB fill words coming in from the right half Bridge are converted to ARB(0) by the left half Bridge during this process. Arbitration is won by the left half Bridge when it sees ARB(0) return to it. When arbitration is won by the left half Bridge, the remote OPN is transmitted by the left half Bridge out onto the left half Bridge local loop. The right half Bridge, which independently does its own comparison, concludes that the lower priority OPN forwarded to it by the left half Bridge must be discarded.
Case 2: a local OPN for the left half Bridge has been forwarded across the local bypass to the left half Bridge TX port when, later, a different remote OPN arrives at the TX port of the left half Bridge. This situation is resolved by an automatic preemption of the remote OPN by the left half Bridge sending out a preemptive CLS to the right half Bridge since the local loop is busy.
Case 3: a remote OPN is received at the left half Bridge simultaneously with receipt at the left half Bridge of a local OPN. This situation is resolved by examining the addresses of the two OPNs and sending a preemptive CLS to close the source node of the lower priority OPN, the preemptive CLS being transmitted by the half Bridge coupled to the local loop coupled to the source node of the lower priority OPN. Following the preemption, if the local OPN is higher priority, the memory is accessed to determine whether the destination address of the winning local OPN is local or remote, and the local OPN is forwarded via the appropriate path to its destination.
Case 4: a remote OPN is received by the left half Bridge followed by receipt of a different local OPN. When the remote OPN is received, the TX port starts to arbitrate for the local loop. If a local OPN is received before arbitration is won by the TX port, the remote OPN is too late because the local loop is considered to be busy as soon as arbitration is won by the source node which generated the local OPN. Therefore, the remote OPN is preempted by transmission of a preemptive CLS from the left half Bridge to the right half Bridge, and the left half Bridge then discards the remote OPN.
Case 5 (no preemption necessary): a local OPN is received at the left half Bridge and is forwarded to the right half Bridge, whereupon it is returned to the left half Bridge. When a local OPN is returned from the right half Bridge, the left half Bridge must identify the fact that the remote OPN just received is the same OPN as the local OPN previously forwarded to the right half Bridge. This is done by comparing addresses. When case 5 arises, the returning remote OPN is forwarded transparently onto the local loop without arbitration and all subsequent traffic is forwarded by the left half Bridge transparently onto the local loop until the loop tenancy is done, and a new local or remote OPN is detected.
Case 6 (no preemption necessary): a remote OPN is received, forwarded locally and returns. Upon detection of the identical OPN, the arbitration process started by the left half Bridge TX port is stopped, and the left half Bridge goes into a transparent mode where all fill words, data and primitives coming from the right half Bridge are passed transparently through the local loop and forwarded back to the right half Bridge where they either reach a destination node or are forwarded to the source node. If the destination node is present on the local loop of the right half Bridge, it responds with an RRDY or CLS which is forwarded to the source node on the same local loop. If an RRDY was sent, the source node responds with a frame of data which is then forwarded to the left half Bridge and transparently passed therethrough, through the left half Bridge local loop and back to the right half Bridge where it is transparently forwarded to the destination node. This process is continued until the loop tenancy is completed. The right half Bridge learns the location of the destination node, so the next time a right half Bridge tries to open the same destination node, the local bypass path will be used and the left half Bridge will be omitted from the loop tenancy.
Processing starts in FIG. 8A with block 101 wherein the system is initialized by carrying out the following operations. When the loop initializes (any Loop Initialization Primitive or LIP detected by the TX or RX ports of the half Bridge), the half Bridges connect both local loop segments together as one large loop. Accordingly, TX port 102 makes a 2-3 connection, and RX port 100 makes a 1-2 connection. This is done by TX Port State Machine 602 setting the state of a Switch Control signal on line 620 so as to set switch 608 to connect pins 2 and 3, and similarly for the Switch Control signal on line 622 generated by RX Port State Machine 612. Also, the memory 78 is cleared to all logic 0s indicating there are no local ports yet and all traffic is to flow through both half Bridges. This is done by the RX Port State Machine 612 asserting the Clear All signal on line 120.
After initialization, test 103 is performed to determine if the TX port has received a remote OPN primitive at the Remote RX Port, terminal 2. If not, processing proceeds to step 818 to determine if the RX port has received a local OPN on pin 1. If not, processing proceeds back to Start in block 99.
If the RX port has received a local OPN, processing proceeds to step 822 wherein the Decode circuit 620 in FIG. 7 activates an OPN signal on line 654 to inform the RX Port State Machine that an OPN has arrived. The Decode circuit 620 also activates a TX Arm signal on line 644 to tell the TX port that it has received an OPN. The Decode circuit 620 also activates a Latch [0:1] signal on line 656 causing the AL.sub.-- PA addresses of the local OPN to be stored in latch 200. In response to the activation of the OPN signal on line 654, the RX Port State Machine activates a Del OPN signal on line 727 which causes the Insertion/Substitution/Deletion FIFO 616 to delete the first 20 bits of the header and OPN from the FIFO pipeline thereby blocking the local OPN from being forwarded to the remote half Bridge or on the local segment return 60 until such time as the state machine allows it to be forwarded. If any following RRDYs were received, the Decode Circuit 620 in FIG. 7 also activates the RRDY signal on line 730 once for each received RRDY. This causes an RRDY counter in RX Port State Machine 612 to be incremented once for each received RRDY. It also causes the state machine 612 to activate a DEL RRDY signal on line 858 in FIG. 7 once for each received RRDY. This causes the FIFO circuit 616 to delete the RRDYs from the FIFO pipeline to block their transmission. Both the OPN and RRDYs can be reconstructed later for forwarding by the state machine by activation of the Insert OPN and Insert RRDY signals on lines 774 and 786, respectively. These processes of storing the AL.sub.-- PA and blocking the OPN and any following RRDYs from being forwarded caused by activation of the various signals in step 822 are symbolized by step 824. The RX Port State Machine also activates an Enable Compare signal on line 638 to compare the AL.sub.-- PA address of the local OPN with the Default AL.sub.-- PA Address stored in latch 202 of the TX port (the default AL.sub.-- PA will be stored in both latches 200 and 202 whenever a CLR is received by either RX or TX port).
Step 822 also represents the process of detecting if the incoming local OPN was half-duplex and converting it to pseudo-full-duplex if it was. This process is carried out by the RX port if the Decode circuit 620 activates the Half Duplex signal on line 658 when the OPN signal on line 654 is activated. This causes the RX Port State Machine to activate an RX Convert signal on line 660 in FIG. 7 to cause the AL.sub.-- PA latch circuitry 200 to replace the source address with either hex F7 or hex F8 depending upon the configuration data for the half Bridge.
If a remote OPN arrives later after the local OPN, the address comparison and preemption processing is described below in the description of the processing for preemption cases 1, 2 and 3. The purpose of steps 818 and 822 et seq. is to simply describe the processing which follows when a local OPN arrives alone at the RX port.
Finally, after step 824, processing proceeds along path 820 to step 816 in FIG. 8E.
Returning to the consideration of step 103 on FIG. 8A, the arrival of a remote OPN is detected by the Remote Decode circuit 604 which is coupled by line 630 to pin 2 in FIG. 6. As soon as the TX port has received a remote OPN, the Remote Decode circuit 604 activates the OPN control signal on line 632. This tells the state machine that a remote OPN has arrived. When this happens, step 111 is performed to arm the RX port to aid in the learning process, as will be explained below. The TX Port State Machine arms the RX port by activating the RX Arm signal on line 634. Step 111 also represents the process whereby the Remote Decode circuit 604 activates the OPN signal on line 632 which causes the TX Port State Machine to activate the Del OPN signal on line 691 in FIG. 6. This causes the blocking of further transmission of the OPN as symbolized by block 105 by causing the Insertion/Substitution/Deletion FIFO circuit 610 to remove the OPN from the FIFO pipeline. The Remote Decode circuit 604
also activates the Latch [0:1] signal on line 636 which causes the AL.sub.-- PA latch 202 to latch the AL.sub.-- PA address of the OPN just received as symbolized by block 105. If the OPN was followed by any RRDYs, the Remote Decode circuit 604
activates the RRDY signal on line 696 for each one. Each such activation causes the TX Port State Machine 602 to increment an RRDY count, and to activate a Del RRDY signal on line 856 for each activation of the RRDY signal on line 696. This removes the RRDYs from the FIFO pipeline to block their transmission. These RRDYs can be later regenerated and transmitted by activation of the Insert RRDY signal on line 686 if the OPN just received wins the AL.sub.-- PA comparison. This process of blocking the OPN and any following RRDYs from transmission through the FIFO is performed by each TX and RX port in both the left half and right half Bridges whenever an OPN is received alone or in combination with following RRDYs, and should be understood as the process performed whenever blocking of further transmission of OPNs and RRDYs is required by the flow charts of FIG. 8A et seq.
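The block-count-regenerate behavior described above can be sketched in Python. This is only an illustrative model, not the circuit itself: the class name `BlockingFifo`, the representation of primitives as strings and tuples, and the `release` method (standing in for the Insert OPN and Insert RRDY signals) are assumptions made for the example.

```python
class BlockingFifo:
    """Toy model (hypothetical names) of deleting an OPN and its trailing
    RRDYs from a pipeline and regenerating them later."""

    def __init__(self):
        self.held_opn = None   # addresses of the blocked OPN, if any
        self.rrdy_count = 0    # number of RRDYs deleted after that OPN

    def receive(self, words):
        """Return the words that pass through; hold back the OPN and RRDYs."""
        passed = []
        for word in words:
            if isinstance(word, tuple) and word[0] == "OPN":
                self.held_opn = word          # analogous to Del OPN plus latching
            elif word == "RRDY" and self.held_opn is not None:
                self.rrdy_count += 1          # analogous to Del RRDY plus counting
            else:
                passed.append(word)
        return passed

    def release(self):
        """Regenerate the held OPN followed by one RRDY per counted RRDY."""
        out = []
        if self.held_opn is not None:
            out.append(self.held_opn)
            out.extend(["RRDY"] * self.rrdy_count)
        self.held_opn = None
        self.rrdy_count = 0
        return out
```

Keeping only a count of the RRDYs, rather than the primitives themselves, mirrors the text: each RRDY is identical, so the count is all that is needed to regenerate them.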
Also, if the received OPN was pseudo-full-duplex, the Pseudo Full Duplex signal on line 710 in FIG. 6 is activated by the Remote Decode circuit 604. This informs the state machine 602 to activate the Convert Back signal on line 712 if the OPN is forwarded onto the local loop. This causes the Insertion/Substitution/Deletion FIFO to substitute a copy of the destination address for the hex F7 or F8 source address to convert the pseudo-full-duplex OPN back to half duplex before forwarding out onto the local loop segment.
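The two conversions, from half-duplex to pseudo-full-duplex on the way to the other half Bridge and back again before forwarding onto the local loop, can be sketched as follows. This is a minimal sketch under stated assumptions: the `Opn` tuple and the function names are hypothetical, a half-duplex OPN is modeled as one whose source field equals its destination field, and the choice between hex F7 and hex F8 is passed in as a parameter because this passage does not say which value goes with which configuration.

```python
from typing import NamedTuple

PSEUDO_SOURCES = (0xF7, 0xF8)  # the two substitute source values named in the text


class Opn(NamedTuple):
    dest: int    # destination address (AL_PD)
    source: int  # source address (AL_PS); equals dest in a half-duplex OPN


def is_half_duplex(opn: Opn) -> bool:
    return opn.dest == opn.source


def to_pseudo_full_duplex(opn: Opn, pseudo_source: int) -> Opn:
    """Replace the source of a half-duplex OPN with F7 or F8 (assumed parameter)."""
    if pseudo_source not in PSEUDO_SOURCES:
        raise ValueError("pseudo_source must be 0xF7 or 0xF8")
    return opn._replace(source=pseudo_source) if is_half_duplex(opn) else opn


def to_half_duplex(opn: Opn) -> Opn:
    """Substitute a copy of the destination for an F7/F8 source address."""
    return opn._replace(source=opn.dest) if opn.source in PSEUDO_SOURCES else opn
```

A true full-duplex OPN passes through both functions unchanged, so only OPNs that began as half-duplex are ever rewritten.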
Activation of Enable Compare on line 638 by Remote Decode circuit 604 causes an address comparison of the AL.sub.-- PA of the remote OPN from latch 202 with whatever AL.sub.-- PA address is latched in latch 200 of the RX port. If there is no conflicting OPN, latch 200 will store a default AL.sub.-- PA which will always lose against the AL.sub.-- PA of the remote OPN. Note that in the steps of the process symbolized by FIG. 8A et seq., any step that requires activation of Enable Compare enables this signal only if it is not already enabled.
After arming the RX port, step 105 is performed to latch the source and destination addresses of the remote OPN. Step 105 is accomplished as a result of the activation by the Remote Decode circuit 604 of the Latch [0:1] control signal on line
636 successively when the destination and source addresses arrive at pin 2. This signal is activated once as the destination address arrives to latch the 10 bit AL.sub.-- PD destination address in latch 202 and then is activated again as the AL.sub.-- PS address arrives to latch the 10 bit source address in latch 202. Step 105 also blocks transmission of the remote OPN until its destination can be determined and it is clear there are no conflicting OPNs. This blocking of the remote OPN is accomplished automatically as the Remote Decode circuit 604 in FIG. 6 recognizes the OPN and activates an OPN signal on line 632. As noted above, this causes TX Port State Machine 602 to activate a Del OPN signal on line 691 which causes the Insertion/Substitution/Deletion FIFO circuit 610 to strip off the 20 bits that comprise the header and OPN primitive. The OPN can be regenerated later if it needs to be forwarded onto the local loop by asserting an Insert OPN signal on line 694.
Test 103 essentially can be a continuous sampling of the OPN signal on line 632, or it can simply be a change of state by TX Port State Machine 602 when the OPN signal on line 632 changes states.
If test 103 determines that a remote OPN has been received, it may mean that the source node is somewhere on the remote loop segment 54. However, it may also mean that the source node and destination node are both on the local loop segment 52
but the source node is closer to the RX port 100 than the destination node, and the location of the destination node is as yet unlearned, resulting in the OPN being forwarded by the RX port 100 out onto the remote loop segment 54, whereupon it eventually returns to the Remote RX Port, terminal 2 of the TX port, via remote loop segment 167. This is the reason the RX port is armed in step 111. By arming the RX port, the half Bridge can learn the location of the destination node of the remote OPN by watching the traffic that results at the local RX port as a result of receipt of the remote OPN. For example, if the source node was node 104 in FIG. 5 and the destination node was node 106, source node 104 could transmit an OPN primitive to terminal 1
of the RX port 100 which would then forward the OPN primitive by making a 1-2 connection if it had not yet learned that node 106 was on its local loop segment 52. This OPN would propagate around the remote loop segment and return to terminal 2 of the TX port. The TX port 102 would react to the arrival of the OPN primitive by storing the OPN addresses in latch 202, and comparing the address fields of the remote OPN latched in latch 202 to the address fields of the latched local OPN stored in latch 200
in the RX Port. This comparison is done by activation of the Enable Compare signal on line 638 in FIG. 5 and transmission of the AL.sub.-- PA address fields of the OPN latched in the RX port via a 20 bit data bus 108 and the AL.sub.-- PA address field of the OPN latched in the TX port 102 via a 20 bit bus 109 to a comparator 402. If the remote OPN was the same as the local OPN, the comparator 402 would activate the Equal signal on line 640 which would cause both the TX and RX ports to go into transparent mode wherein all primitives and data are passed therethrough without change. Since any local half-duplex OPNs (characterized by the third and fourth address characters being identical) are converted to pseudo full-duplex OPNs in the RX port latch prior to transmission to the other half Bridge, if the above situation arose where a local half-duplex OPN was forwarded to the remote half Bridge and came back therefrom, the comparison would occur on the pseudo full-duplex OPNs and the Equal signal would be activated causing transparent mode to be entered. The conversion of any local half-duplex OPNs to pseudo full-duplex OPNs is done by setting the AL.sub.-- PS (source address) field to F7 or F8, depending on whether the half Bridge was configured to be the high priority or low priority half Bridge at the time of initial configuration of the system. This conversion happens either before the address fields are latched in the internal OPN latches in the TX or RX ports or after the latching process and before any process of shipping the OPN on bus 108 from the RX port to the comparator 402 and before any comparison process. Note that if both the local OPN received by the RX port and the remote OPN received by the TX port of the same half Bridge were pseudo-full-duplex, the source address of each of the local OPN and the remote OPN would both have been changed to F7 or F8, depending upon which way the RX port of the half Bridge was configured.
In the normal case, however, when a remote OPN arrives which is pseudo full-duplex, the F7 or F8 source address is stripped and the OPN is converted back to half duplex before transmission out on the local loop. If a remote pseudo full-duplex OPN is involved in a preemption comparison and the winning OPN is the remote pseudo-full-duplex OPN, it is converted back to half-duplex before being forwarded to the local loop segment. If the winning OPN is either a local full duplex OPN or a local pseudo full-duplex OPN, no change is made to the full-duplex or pseudo full-duplex OPN prior to transmission to the remote half Bridge.
After latching the source and destination addresses in step 105, processing proceeds via path 642 to test 652 on FIG. 8B. That test checks the TX Arm signal on line 644 in FIG. 5 to determine if it has been recently activated. The TX Arm signal is set active by the RX port whenever decoder 620 in FIG. 7 detects a local OPN at pin 1. The TX Port State Machine notes this fact and sets an internal flag or enters an internal state that can be checked when step 662 is reached to determine which preemptive processing state is necessary.
When a local OPN has been received at pin 1 in FIG. 7, decoder 620 activates the OPN signal on line 654 which causes the RX Port State Machine to activate the TX Arm signal for the one clock cycle needed to access memory and then resets TX Arm on the next clock cycle. Activation of OPN on line 654 also causes the RX Port State Machine to activate the Latch [0:1] signal on line 656 which causes the addresses of the local OPN to be stored in latch 200. If the local OPN was half-duplex, Decoder
620 detects this and activates the Half Duplex signal on line 658. This causes the RX Port State Machine to activate the RX Convert signal on line 660 which causes the AL.sub.-- PA latch circuit 200 to convert the source address of the local OPN to hex F7 or F8 depending upon the RX port's configuration data.
Activation of TX Arm is done so that if a concurrent remote OPN arrives during the time the local OPN is being dealt with or just after it is forwarded, the conflict can be resolved. The TX Arm signal is activated as soon as the local OPN is detected, and remains active for the single clock cycle during which the memory 78 is checked to determine from the local OPN's destination address where to send the local OPN. During the clock cycle when the local OPN is received, its addresses are latched, its destination address is used to access memory, and, if it is half-duplex, its source address is converted to hex F7 or F8 in the address latch 200.
If test 652 in FIG. 8B determines that TX Arm is active when the remote OPN arrives, it means a possible conflicting OPN situation has arisen. Cases 1-4 in the preemption rule processing detailed above can result depending upon where, if anywhere, the local OPN has been sent. To determine which preemption case processing is necessary, step 662 is performed to determine where the local OPN was sent or if simultaneous local and remote OPNs have been detected. This is done by determining if TX Arm is still active or is false. If TX Arm is false, but has been recently activated, step 662 checks the status of the RX Switch Pos. signal on line 664 in FIG. 5. This signal is set by the RX Port State Machine to a logic state which corresponds with the position of switch 614 when the TX Arm signal is deactivated after the memory access has been completed and the switch moved to the position necessary to forward the local OPN to its destination.
Case 2 arises if step 662 determines that the local OPN has been previously forwarded on the local bypass to the TX port prior to the time the remote OPN arrives. The TX port determines where the local OPN was sent on the local bypass by determining if the TX Arm signal is still true, and, if not, by checking the status of an RX Switch Pos. signal on line 664 to see if switch 614 in FIG. 7 is set to the 1-3 position. When TX Arm is false, it means that the memory access has been completed and the RX Switch Pos. signal will indicate whether the local OPN was sent on the local bypass or forwarded to the remote half Bridge.
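The decision made in test 652 and step 662 can be summarized as a small classification function. This is a hedged sketch only: the enum and function names are hypothetical, the "recently activated" flag is modeled as a plain boolean, and the simultaneous-arrival outcome is left unnumbered because this passage does not say which preemption case it corresponds to.

```python
from enum import Enum
from typing import Optional


class SwitchPos(Enum):
    TO_REMOTE = "1-2"      # local OPN was forwarded to the remote half Bridge
    LOCAL_BYPASS = "1-3"   # local OPN was forwarded on the local bypass


class Outcome(Enum):
    NO_CONFLICT = "no recent local OPN"
    SIMULTANEOUS = "local and remote OPNs detected together"
    CASE_1_COMPARE = "case 1: address comparison (step 668)"
    CASE_2_AUTO_PREEMPT = "case 2: automatic preemption (step 666)"


def classify_remote_opn(tx_arm_active: bool,
                        tx_arm_recent: bool,
                        rx_switch_pos: Optional[SwitchPos]) -> Outcome:
    """Classify the situation when a remote OPN arrives at the TX port."""
    if tx_arm_active:
        # Memory access for the local OPN is still in progress.
        return Outcome.SIMULTANEOUS
    if not tx_arm_recent:
        return Outcome.NO_CONFLICT
    if rx_switch_pos is SwitchPos.LOCAL_BYPASS:
        return Outcome.CASE_2_AUTO_PREEMPT
    if rx_switch_pos is SwitchPos.TO_REMOTE:
        return Outcome.CASE_1_COMPARE
    raise ValueError("switch position is required once TX Arm has been reset")
```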
When a case 2 preemption situation arises, step 666 is performed to carry out an automatic first-come, first-served preemption. Case 2 involves the local loop being busy at the time the remote OPN arrives. This will be the case when step 662
discovers that the local OPN has been previously forwarded on the local bypass when the remote OPN arrives. Step 666 carries out this automatic preemption by the TX Port State Machine activating the Auto Preemption signal on line 646 in FIG. 5. This causes the RX Port State Machine to set the Switch Control signal on line 622 in FIG. 7 to set the switch 614 to a 1-2 connection and to activate the Insert CLS signal on line 650 in FIG. 7. This causes the Insertion/Substitution FIFO circuit 616 to generate a CLS primitive and send it out to the remote half Bridge where it is forwarded to the source node which generated the remote OPN thereby closing it.
If step 662 determines that the local OPN just received has been previously forwarded to the remote half Bridge at the time the remote OPN arrived, a case 1 preemption situation has arisen. In this case, step 668 is performed wherein the TX Port State Machine activates the Enable Compare signal on line 638 in FIG. 5. This causes the Comparator 402 to compare the AL.sub.-- PA addresses of the local and remote OPNs latched in latches 200 and 202 and activate either the Preempt Local Source signal on line 670 or the Preempt Remote Source signal on line 672.
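The comparator's three possible results can be illustrated with a short sketch. The priority rule used here is an assumption: the sketch treats the numerically lower AL_PA as the higher priority, following the usual Fibre Channel Arbitrated Loop arbitration convention, and this passage does not state the rule or which address field is compared. The function name and the string results are likewise hypothetical stand-ins for the signals on lines 640, 670 and 672.

```python
def compare_al_pa(local_al_pa: int, remote_al_pa: int) -> str:
    """Return the name of the result signal for a local/remote OPN conflict.

    ASSUMPTION: a lower numeric AL_PA means higher priority.
    """
    if local_al_pa == remote_al_pa:
        return "EQUAL"                      # same OPN came back around
    if remote_al_pa < local_al_pa:
        return "PREEMPT_LOCAL_SOURCE"       # remote OPN wins; close local source
    return "PREEMPT_REMOTE_SOURCE"          # local OPN wins; close remote source


# Each half Bridge sees the same pair of OPNs with the local and remote roles
# swapped, so the two comparators reach complementary, consistent results.
left = compare_al_pa(local_al_pa=0x01, remote_al_pa=0x02)
right = compare_al_pa(local_al_pa=0x02, remote_al_pa=0x01)
```

Under the assumed rule, `left` is `"PREEMPT_REMOTE_SOURCE"` and `right` is `"PREEMPT_LOCAL_SOURCE"`: both half Bridges agree that the node with AL_PA 0x02 is the one to be closed.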
Next, step 674 is performed to determine which of these two result signals from the comparator has been activated. The two half Bridges that together comprise the full Bridge each simultaneously perform the processes depicted in the flow chart of FIGS. 8A et seq. However, the flow charts have been written from the viewpoint that the process depicted is occurring in the left half Bridge as the local half Bridge such that the right half Bridge may be referred to as the remote half Bridge. That being said, path 676 out of step 674 represents the branch taken if the remote OPN at pin 2 of the left half Bridge TX port won the address priority comparison. Path 678 represents the branch taken if the left half Bridge local OPN won.
Referring to FIG. 8C, the processing along path 678 will be described. When the left half Bridge local OPN won and has already been forwarded to the right half Bridge (the reader is also referred to FIG. 3 for context), the right half Bridge must close the source node on its local loop segment before transmitting the local OPN received from the left half Bridge onto its local loop. This is done by step 688 in FIG. 8C. In this step, the right half Bridge first blocks the remote OPN received at pin 2 (the forwarded local OPN received from the left half Bridge). This blocking is done when the Remote Decode circuit 604 (see FIG. 6) in the right half Bridge (identical circuits to circuits in the left half Bridge are referred to by the same reference numbers) detects a remote OPN at pin 2 and activates the OPN signal on line 632. This causes the Latch [0:1] signal on line 636 to be activated which causes the AL.sub.-- PA latch to latch the source and destination addresses. Activation of the OPN signal on line 632 also causes the TX Port State Machine to follow the same processing that the left half Bridge TX port does when it receives an OPN, said processing being described by the flow chart of FIGS. 8A et seq. In this case, since step 652
performed by the right half Bridge would find that TX Arm had been recently activated when the right half Bridge local OPN was received and forwarded to the left half Bridge, processing would proceed through step 662 to step 668 and following. This processing would result in activation of the Enable Compare signal on line 638 to start an address comparison and would cause the state machine to simultaneously activate the Del OPN signal on line 691 in FIG. 6. Activation of Del OPN causes the Insertion/Substitution/Deletion FIFO circuit 610 to strip off the 20 bits of the OPN primitive.
The comparison in the right half Bridge comparator 402 occurs between the address of the right half Bridge local OPN previously transmitted to the left half Bridge (which was a remote OPN there and found to be of lower priority) on bus 108 and the address of the remote OPN received from the left half Bridge on bus 109. This comparison will yield the same result as it did in the left half Bridge, i.e., the right half Bridge remote OPN will be indicated as higher priority meaning the local source that generated the remote OPN received by the left half Bridge must be closed. The right half Bridge comparator 402 will activate the Preempt Local Source signal on line 670. This will cause the right half Bridge TX Port State Machine to activate the Insert CLS signal on line 684 to generate and send out a CLS on the local loop segment. This closes the local loop source node which generated the losing remote OPN at the left half Bridge.
Next, the right half Bridge must forward or unblock the remote OPN and any RRDYs, but before it can do that, it must arbitrate for and win control of the local loop segment. Step 690 represents this process. First, the right half Bridge TX Port State Machine activates the Start ARB signal on line 700. This causes the FIFO circuit 610 to change any incoming CFW to ARB(0) and send them out onto the local loop segment. The local node which generated the losing OPN will have been closed so it will no longer be swallowing ARBs. When the ARB(0) arrives at each node, it is forwarded as the highest priority ARB by each node. Eventually the ARB(0) arrives at pin 1 of the right half Bridge RX port 100 and is recognized by Decode circuit 620 which then activates the ARB(0) control signal on line 702 of FIG. 7 which tells the RX Port State Machine 612 that the TX port has just won arbitration. The RX Port State Machine then activates the ARB Won signal on line 704. This fact is detected by the TX Port State Machine 602 which then activates the Stop ARB signal on line 706 in FIG. 6. This causes the Insertion/Substitution/Deletion FIFO 610 to stop substituting ARB(0) for incoming ARB CFWs. Then the right half Bridge TX Port State Machine activates the Insert OPN signal on line 694 which causes the FIFO circuitry 610 to generate and transmit an OPN primitive. This is followed by transmission by the Insertion/Substitution/Deletion FIFO of the addresses latched in AL.sub.-- PA latch 202. These addresses are always available to the FIFO circuit via bus 109 and are automatically transmitted out pin 3 in sequence following generation of an OPN.
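The arbitrate-then-forward sequence of step 690 can be sketched as a toy simulation of the TX port's output stream. Everything here is an illustrative assumption rather than the actual circuit: the function name, the word strings, and in particular the `loop_delay` parameter, which stands in for the number of word times the first ARB(0) takes to travel around the local loop and be recognized at the RX port (the ARB Won event).

```python
def tx_port_output(incoming, loop_delay, latch_202, rrdy_count=0):
    """Model the TX port output while arbitrating and then forwarding an OPN.

    incoming   -- words arriving at the TX port, one per tick ("CFW" = fill word)
    loop_delay -- ASSUMED ticks until the first ARB(0) returns to the RX port
    latch_202  -- (destination, source) addresses of the blocked remote OPN
    rrdy_count -- number of RRDYs that were deleted after the OPN
    """
    out = []
    first_arb_tick = None
    won = False
    for tick, word in enumerate(incoming):
        if (not won and first_arb_tick is not None
                and tick - first_arb_tick >= loop_delay):
            won = True                       # ARB Won, so Stop ARB
            out.append("OPN")                # Insert OPN
            out.extend(latch_202)            # addresses follow the OPN
            out.extend(["RRDY"] * rrdy_count)
        if not won and word == "CFW":
            out.append("ARB(0)")             # Start ARB: substitute fill words
            if first_arb_tick is None:
                first_arb_tick = tick
        else:
            out.append(word)                 # pass through unchanged
    return out
```

With five incoming fill words, a loop delay of three ticks, and one held RRDY, the output is three ARB(0) words, then the regenerated OPN with its two addresses and the RRDY, then the remaining fill words unchanged.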
If the remote OPN at pin 2 of the right half Bridge had been followed by any RRDYs indicating the source node on the local loop of the left half Bridge was issuing buffer credit, each RRDY would be detected by Remote Decode circuit 604 and would cause activation of the RRDY signal on line 696. The number of times RRDY