United States Patent Application
20020103943
Kind Code
A1
Lo, Horatio ; et al.
August 1, 2002
Distributed storage management platform architecture
Abstract
A distributed storage management platform (DSMP) architecture is disclosed. Such a DSMP architecture includes a number of storage routers. Each one of the storage routers comprises a number of interface controllers. One of the interface controllers of each one of the storage routers is communicatively coupled to one of the interface controllers of at least one other of the storage routers.
Inventors:
Lo; Horatio (Milpitas, CA)
Tam; Sam (Belmont, CA)
Lee; David (San Jose, CA)
Kurpanek; Dietmar M. (Emerald Hills, CA)
Correspondence Name and Address:
Skjerven Morrill MacPherson LLP Suite 700 25 Metro Drive
Samuel G. Campbell III
San Jose
CA
95110
US
Series Code:
904824
Filed:
July 12, 2001
U.S. Current Class:
710/2
U.S. Class at Publication:
710/2
Intern'l Class:
G06F 003/00
Claims
What is claimed is:
1. A distributed storage management platform architecture comprising: a plurality of storage routers, wherein each one of said storage routers comprises a plurality of interface controllers, and a one of said interface controllers of each one of said storage routers is communicatively coupled to a one of said interface controllers of at least one other of said storage routers.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation-in-part of patent application Ser. No. 09/501881, entitled "A MULTI-PORT FIBRE CHANNEL CONTROLLER", referred to herein as MPFCC patent application, and having D. Kurpanek as the inventor, which is assigned to Vicom Systems, Inc., the assignee of the present invention, and which is hereby incorporated by reference herein, in its entirety and for all purposes.
[0002] This application also claims priority to Provisional Patent Application Serial No. 60/217,867, entitled "A DISTRIBUTED STORAGE MANAGEMENT PLATFORM ARCHITECTURE," and having H. Lo and S. Tam as inventors; and to Provisional Patent Application Serial No. 60/268,777, entitled "A DISTRIBUTED STORAGE MANAGEMENT PLATFORM ARCHITECTURE," also having H. Lo and S. Tam as inventors, which provisional patent applications are assigned to Vicom Systems, Inc., the assignee of the present application, and are hereby incorporated by reference herein, in their entirety and for all purposes.
BACKGROUND OF THE INVENTION
[0003] 1. Field of the Art
[0004] The present invention relates to computer subsystem communications, and, more particularly, to a method and apparatus for creating an extensible storage area network (SAN) architecture by interconnecting multiple storage router type devices--which may be viewed as the nodes in a framework which is the SAN--providing coupling of host computers to storage drive arrays or subsystems.
[0005] 2. Description of the Related Art
[0006] Although most of the concepts of networked storage are rooted in technology that has existed for several decades, there are recent developments in computing which have led to drastically increased demands for storage space. This is due in large part to the advent of massive unstructured data flows, observed to stem from the mass acceptance of the internet and its related methods of communications and dissemination of information, as opposed to those associated with structured data flows, typical examples of which are Enterprise Resource Planning (ERP) systems, and Electronic Document Interchange (EDI) systems. There is a growing need to effectively control and manage data storage activities related to the former, the latter being more easily able to be controlled and managed. As such, new challenges are brought forth related to facilitating server and storage consolidation, non-disruptive back-up procedures, and minimizing the distance limitations of technologies preceding storage area networking.
[0007] This need has led to the development of what is commonly referred to as a `Storage Area Network` (SAN). Within a SAN, host computers provide access to arrays of storage devices that can be either local or remotely located, and can be either centralized in one location or distributed over many. This variability and the complexity of such storage subsystems mandates that the host computers be coupled to devices that can route requests to the storage devices and make their actual configuration transparent to the end-user (e.g., `storage routers`). This added network (the SAN) between the hosts and the storage devices, when properly set up and configured by an administrator, releases the end-user from the need to be concerned with the actual physical layout of the storage.
[0008] Traditional approaches to storage and storage management emphasize the need to control expense and restrict access. Prior technology has generally only enabled a given storage device to be accessible to a single server, so the latter goal is managed with relative ease though apparently working against achieving the former where there is such a one-to-one ratio of dependency. While one cost factor, that of the physical hardware performing the storage role, has exponentially decreased over the time since the advent of computing as a viable business tool, the cost associated with management now has continued to increase at an increasing rate, offsetting any benefits the former brings.
[0009] The important function of managing storage in a networked environment has proven to be generally difficult, and when comparing the approaches of management through a server-provided interface with storage-specific direct connect-based management, a definite trend for success has been correlated with the latter. Just as data has been seen to come to be valued as an independent strategic asset from the computers that access it, storage networking products and architectures, as platforms for data protection and storage management are just now being elevated to the same level of concern.
[0010] To ensure reliability, redundant access is often supported, employing multiple routing devices, some of which may be distributed across geographically distant locations. Although a comparatively new technology, common existing implementations of SANs have to date been observed as failing in a critical area, that of not readily supporting extensibility. An essential characteristic for a SAN is that it must be scalable if it is to support the increasing rate of growth of demand for storage space.
[0011] It is therefore desirable to introduce greater simplicity into the hardware used to communicate between a host system and storage array, while meeting the prerequisites of redundancy and reliability (collectively termed as high availability). Preferably, a suitable SAN architecture also provides improved performance and reduces the running cost of the SAN, ideally maintaining transparency to the user. Also most preferably, such an architecture is extensible, allowing easy insertion to and removal from the SAN of hosts, storage drive arrays or subsystems, and any appliances that are introduced into the SAN to form part of that architecture.
SUMMARY OF THE INVENTION
[0012] In one embodiment of the present invention, a distributed storage management platform (DSMP) architecture is disclosed. Such a DSMP architecture includes a number of storage routers. Each one of the storage routers comprises a number of interface controllers. One of the interface controllers of each one of the storage routers is communicatively coupled to one of the interface controllers of at least one other of the storage routers.
[0013] The foregoing is a summary and thus contains, by necessity, simplifications, generalizations and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. As will also be apparent to one of skill in the art, the operations disclosed herein may be implemented in a number of ways, and such changes and modifications may be made without departing from this invention and its broader aspects. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.
[0015] FIG. 1 is a schematic diagram illustrating the basic philosophy underlying a SAN employing a DSMP Architecture. The SAN is shown by a `cloud` symbol (a common representation of the potentially complex coupling of links), which can for example incorporate storage networking devices that may exist in a storage area network. The implication is that the details of connectivity therein can be temporarily overlooked while examining the hosts and storage drives array/subsystems attached separately external to the SAN. Within the cloud is a collage of subsystem interface devices, each containing two different interface controllers, without specific manner of connectivity being shown. This indicates that a SAN employing a DSMP Architecture is dependent on such an architecture to couple the attached equipment, and the present invention will provide a specific means of achieving this, demonstrated by various embodiments disclosed subsequently.
[0016] FIG. 2 is a block diagram illustrating the essentials of a host (i.e., host computer) in the context of this DSMPA invention.
[0017] FIG. 3 is a block diagram illustrating an example of connectivity between hosts and storage drives array/subsystems through subsystem interface devices, each containing at least one embodiment of an interface controller, a derivative of the multi-port controller invention referred to in the MPFCC patent application. In an embodiment, each such subsystem interface device connects one host to a storage array/subsystem, and these devices are shown coupled in a loop on the host side. The hosts are also connected into this loop, each through a host I/O interface controller. Details of connections between the subsystem interface devices and the storage drive array/subsystem are not raised at this intermediate point of developing the explanation, this being the simplest open construct implementation of the DSMP Architecture (i.e. subsystem interface device consisting of, amongst other non-specific components, at least a single interface controller).
[0018] FIG. 4 is a block diagram illustrating an example of connectivity which is an extension of that shown in FIG. 3. In addition to the elements described there, the subsystem interface devices are each also shown to be connected to a sub-network, also to which the hosts are connected. Each element is linked to the sub-network through a network interface card of a type specific to that element family. This is an implementation of the invention which capitalizes on the ability that DSMP Architecture provides for inter-device communication across either of separate independent channels. Thus, this embodiment of the invention being a next stage open construct implementation (i.e., subsystem interface device consisting of, amongst other non-specific components, a single interface controller plus one network interface card), enables an extra level of management flexibility and redundancy beyond the one shown in FIG. 3.
[0019] FIG. 5 is a block diagram illustrating an example of connectivity between hosts and storage drives array/subsystems through storage router type devices, each containing a two stage embodiment of the multi-port fibre channel controller invention (per MPFCC patent application). This embodiment is a developed construct implementation of the most basic open construct of DSMP Architecture shown in FIG. 3. Each such device, described in earlier figures as a subsystem interface device, now becomes a storage router type device (or storage router). Each storage router connects one host to a storage array/subsystem, and these storage router devices are shown coupled in two separate loops, one each on the host side and the device side. The hosts are connected into the former loop, each through a fibre channel interface controller (mounted internally to each), whilst the storage array/subsystem elements are connected into the latter loop, each through a non-specific but nevertheless compatible interface means. From the perspective of the storage router device, each storage router is connected to each one of these two loops separately, through a host side multi-port fibre channel controller, and a device side multi-port fibre channel controller respectively.
[0020] FIG. 6 is a block diagram illustrating an example of connectivity which is an extension of that shown in FIG. 5. In addition to the elements described there, the items equivalent to those subsystem interface devices as per FIG. 4, are now described as SVE devices, and are each shown to be connected to a sub-network, also to which the hosts are connected. Each element is linked to the network through a network interface card of a type specific to that element family. This is yet a further implementation of the invention which capitalizes on the ability that DSMP Architecture provides for inter-device communication across any of separate independent channels. Thus, this embodiment of the invention, which is a complete construct implementation of DSMP Architecture applied to a SAN, enables an extra level of management flexibility and redundancy beyond the one shown in FIG. 5.
[0021] FIG. 7 is a block diagram illustrating what constitutes an SVE device, a daughtercard and a motherboard, the first embracing a collective of components that make up a dual port fibre channel controller, the second encompassing an identical group of components but also accompanied by various others fundamental to the processing operations that must take place to permit the DSMPA to function.
[0022] FIG. 8 is a schematic diagram which demonstrates how a DSMP Architecture may be employed in a SAN which is constructed using other components in addition to storage router devices and their links, such as switches and hubs.
[0023] FIG. 9 is a schematic diagram illustrating a comparison between three different types of architecture which can be employed in a SAN, in such a way that their contrasting aspects are emphasized. The first two represent existing art, the third shows the DSMP Architecture.
[0024] The use of the same reference symbol in different drawings indicates similar or identical items. The use of the same label suffixes (i.e., digits beyond the first one that coincides with FIG. #) in different drawings also indicates similar or identical items.
DETAILED DESCRIPTION OF THE INVENTION
[0025] The following is intended to provide a detailed description of an example of the invention and should not be taken to be limiting of the invention itself. Rather, any number of variations may fall within the scope of the invention which is defined in the claims following the description.
[0026] Introduction
[0027] The present invention provides for a Distributed Storage Management Platform (DSMP) Architecture, a variety of embodiments of which provide a framework for an extensible storage area network (SAN). Such a DSMP Architecture has been conceived to be employed in a SAN to capably address the variety of problem issues confronting the technology, and provides a wealth of benefits, amongst them the most straight-forward being the easy attachment and de-attachment of hosts and storage equipment to and from a given SAN.
[0028] Each such embodiment typically includes a number of common elements throughout, and a number of hosts and a set of storage drive arrays or subsystems, coupled together through interface devices (also referred to herein as storage router type devices, or simply, storage routers)--which themselves embody interface controllers--each a key building block element of the DSMPA.
[0029] Such a DSMPA includes a number of interface controllers, which reside in storage network router type devices, at least two interface controllers per storage network router type device. In a storage area network, employing such a DSMPA, each one of the storage network router type devices is communicatively coupled, through at least one of the interface controllers contained within, to at least one other neighboring storage network router type device, through at least one of the interface controllers contained within that neighboring storage network router type device, via at least one of a number of links. The storage network router type devices incorporating interface controllers, and links between them, enable the sharing of management information, and thus form a distributed management platform, which is the basis of an intelligent SAN.
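The coupling rule described above can be sketched as a small data model. This is an illustrative sketch only, not part of the disclosure: the names `Port`, `Dsmp`, `add_router`, `couple` and `uncoupled_routers` are hypothetical, and the model assumes only what the paragraph states, namely that each storage router holds at least two interface controllers and is coupled, through one of them, to an interface controller of at least one other storage router.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    """One interface controller on one storage router (hypothetical naming)."""
    router: str
    controller: int


@dataclass
class Dsmp:
    # router name -> number of interface controllers it contains
    routers: dict[str, int] = field(default_factory=dict)
    # undirected links between interface controllers of different routers
    links: set[frozenset[Port]] = field(default_factory=set)

    def add_router(self, name: str, controllers: int = 2) -> None:
        if controllers < 2:
            raise ValueError("each router needs at least two interface controllers")
        self.routers[name] = controllers

    def couple(self, a: Port, b: Port) -> None:
        for p in (a, b):
            known = p.router in self.routers
            if not known or not 0 <= p.controller < self.routers[p.router]:
                raise ValueError(f"unknown port {p}")
        if a.router == b.router:
            raise ValueError("a link must join two different routers")
        self.links.add(frozenset((a, b)))

    def uncoupled_routers(self) -> list[str]:
        """Routers that violate the rule of being coupled to at least one other router."""
        coupled = {p.router for link in self.links for p in link}
        return sorted(set(self.routers) - coupled)
```

An empty result from `uncoupled_routers()` indicates that every router in the model is coupled to at least one neighbor; the sketch says nothing about link type or topology.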
[0030] In such a storage area network, each one of the interface controllers is communicatively coupled to at least one other of the interface controllers, as well as to the hosts and to the storage arrays/subsystems. The interconnection of these storage router type devices is a pre-eminent factor in the operation of a SAN employing the DSMP Architecture. This in-band connectivity is a key focus aspect of the invention and permits sharing of databases held in the memory of separate storage router type devices housing the controllers--databases to which these interface controllers have direct access. An absence of this feature would otherwise be an obstacle to obtaining any of the several advantages over the analogous situation, where similar functionality could be obtained without this distribution of such in-band-coupled devices, particularly with regards to performance and scalability of the entire installation.
[0031] A SAN designed using the DSMP Architecture consists of an arrangement of such storage routers in between a set of hosts and the available storage, to provide a number of advantages over the traditional set-up in which the hosts are directly connected to the storage elements without the storage routers being coupled there between.
[0032] The most critical of these benefits is the enabling of so-called "any-to-any" connectivity between hosts and routers, where hosts from vendors that were previously incompatible with particular types of storage units can now be employed. The architecture provides coding mechanisms which present the storage elements to a host in an orchestrated manner. This ensures that the signals transmitted to and from the host fibre channel interface controller are in a sequence and format which leads to proper interpretation by the host of the data I/O being exchanged.
[0033] This "any-to-any" concept also includes the ability to implement such connectivity over a variety of different network types, any one of which may already be in place handling communications between hosts and storage units prior to the introduction of the elements of the DSMP Architecture.
[0034] In several embodiments of the present invention, describing a specific loop configuration, each one of the interface controllers is a multi-port controller. Specifically, they can be multi-port fibre channel controllers, and in one such embodiment, each of these multi-port controllers is a dual-port fibre channel controller. Although fibre channel is the implied accepted standard transport layer at the present stage of technological development in this field, various other transport layers are commonly found as the means of connectivity between hosts and storage elements, including SCSI, SSA, TCP/IP and others, and so are contemplated by this disclosure. Elaboration upon this is subsequently provided.
[0035] Features and Framework of the DSMP Architecture
[0036] A DSMP Architecture can be configured to consist of differing storage router types each having one of a number of combinations of interface controller sub-elements which subscribe directly to the different transport layer technologies. Correspondingly, references made throughout this document to the fibre channel transport layer in the context of the host interface controller, or even that of the interface controller integral to the storage elements, may in general be substituted for by any of these other transport layer technologies. The suitability of such substitution may not necessarily be recommended due to the performance decrement in comparison to fibre channel that such a change may bring (as defined by the current governing standards for these areas of transport layer technology), but the invention nevertheless accommodates for this.
[0037] The process associated with providing these and other capabilities is referred to herein as "storage virtualization". A key concept of storage virtualization is that the physical elements of the storage are subsumed and, after several stages of mapping that take place within the router, represented to the hosts in a simple logical view, the routers themselves remaining transparent. This is achieved through manipulation under the control of software, some of which is embedded in the hardware, and some of which is resident in one or more hosts, specifically configured to be able to be applied in a management role.
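One stage of such mapping might be pictured as follows. The sketch is an assumption for illustration and does not reproduce the mapping actually used by the routers: it models a logical volume built by concatenating physical drives, and a mirrored volume whose writes go to every mirror member. The names `Extent`, `concat_lookup` and `mirror_write_targets` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    drive: str   # hypothetical physical drive identifier
    blocks: int  # number of blocks this drive contributes to the volume


def concat_lookup(extents: list[Extent], lba: int) -> tuple[str, int]:
    """Map a logical block address to (drive, physical block) across concatenated drives."""
    if lba < 0:
        raise IndexError("negative logical block address")
    offset = lba
    for ext in extents:
        if offset < ext.blocks:
            return ext.drive, offset
        offset -= ext.blocks
    raise IndexError("logical block address beyond end of volume")


def mirror_write_targets(mirrors: list[list[Extent]], lba: int) -> list[tuple[str, int]]:
    """A write to a mirrored volume lands on the same logical block of every member."""
    return [concat_lookup(member, lba) for member in mirrors]
```

The host sees only the logical block address; which drive holds the block is resolved inside the mapping function, which is the sense in which the physical layout stays hidden.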
[0038] Other features and advantages, which shall become part of what is termed herein as the DSMP Architecture paradigm set, include the following:
[0039] 1) Ease of Reconfiguration
[0040] The basic drive units of the storage elements of the SAN can be configured in various ways involving any one or a combination of concatenation, mirroring and one of several instant-of-time copying processes, and this may be accomplished and retained as well as abandoned and re-configured with relative ease.
[0041] 2) Extensibility of the SAN
[0042] The need to expand the amount of available storage is a fundamental requirement faced by any commercial enterprise, which inevitably occurs in the course of business, and is a capability which is poorly addressed by host/storage infrastructures architected via the existing technology. With the present invention, storage elements can be added to an extent limited only by parameters of the software, from either the same or a different vendor, or of a kind subscribing to the same, or a different transport layer technology, to that of the initially resident storage.
[0043] 3) Ease of Replacing Failed Drives
[0044] The basic drive units of the storage elements of the SAN are prone to failure from time to time, and a DSMP Architecture according to the embodiments of the present invention provides mechanisms which insulate the host from the effects of failed storage components and allow for their replacement and for the regeneration of previously stored data, as well as the safe handling of I/O signals which are in transit at the time of drive failure.
[0045] 4) Ease of Replacing Failed Storage Routers
[0046] The storage router units which comprise a suitably configured DSMP Architecture are themselves occasionally vulnerable to failure, and so the architecture provides mechanisms which isolate failed storage router units and allow for their replacement without disruption to the I/O exchange taking place between hosts and the storage elements. Again, of foremost concern is the safe handling of I/O signals which are in transit at the time of router failure, and a ready way of automatically reconfiguring the substitute router unit to present the storage elements to the hosts in the same virtualization configuration that the unit which failed was presenting those storage elements.
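The reconfiguration of a substitute router can be illustrated with a minimal sketch. It assumes, purely for illustration, that each router keeps a replicated copy of every router's virtualization configuration, so that a surviving peer can hand the failed unit's configuration to the substitute. The names `Router`, `publish` and `replace_failed` are hypothetical, and the handling of I/O in transit is not modeled.

```python
class Router:
    def __init__(self, name: str) -> None:
        self.name = name
        # router name -> that router's virtualization configuration (assumed layout)
        self.shared_db: dict[str, dict[str, str]] = {}


def publish(routers: list[Router], owner: str, config: dict[str, str]) -> None:
    """Replicate one router's configuration into every router's database."""
    for router in routers:
        router.shared_db[owner] = dict(config)


def replace_failed(survivors: list[Router], failed: str, spare: Router) -> dict[str, str]:
    """Give the spare the failed router's configuration, read from a surviving peer."""
    if not survivors:
        raise RuntimeError("no surviving peer holds a copy of the configuration")
    config = dict(survivors[0].shared_db[failed])
    for router in survivors:
        del router.shared_db[failed]
    # The spare starts from a copy of a survivor's view of the remaining routers.
    spare.shared_db = {name: dict(cfg) for name, cfg in survivors[0].shared_db.items()}
    publish(survivors + [spare], spare.name, config)
    return config
```

After `replace_failed` returns, the spare presents the configuration that the failed unit had been presenting, and the survivors record it under the spare's name.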
[0047] The present invention, namely a Distributed Storage Management Platform (DSMP) Architecture, can be employed as the framework of a Storage Area Network (SAN). An interface controller such as that described in the MPFCC patent application provides multi-port coupling capability to a storage router which plays a key role as a node in a FC-backboned SAN, empowering the SAN with the DSMP Architecture paradigm set. Flexibility in the manner of connection allows the attached host and storage devices to be placed in a number of different configurations, supporting any of the various topologies which are permitted for the fibre channel (or other) transport layer, including most notably loop, but also point-to-point, and switched (also termed fabric) amongst others, as discussed in more detail subsequently herein.
[0048] When such storage routers are employed to interconnect hosts with storage subsystems and other such devices, such storage routers can consist of interface controllers configured, for example, in pairs (together also with other associated hardware). A specific embodiment is such that each member of an interface controller pair is dual-port fibre channel, and that there is only one such pair to be found in each router. One controller in each pair is assigned to be coupled to the host side, and the other is assigned to be coupled to the storage side. However, there is nothing to preclude controllers consisting each of a multiplicity of ports beyond two, and router units being built with several or many such pairs of controllers, with certain controllers assigned to be coupled to the host side, and others to be coupled to the device side. Note that a one-to-one correspondence of host side to device side allocation ratio need not necessarily be maintained.
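The allocation of controllers to sides can be written down as a simple configuration check. The sketch below is illustrative only; `Side`, `Controller` and `ports_per_side` are hypothetical names, and the only constraints encoded are those stated above: controllers have two or more ports, a router has controllers on both the host side and the device side, and the two sides need not be balanced.

```python
from dataclasses import dataclass
from enum import Enum


class Side(Enum):
    HOST = "host"
    DEVICE = "device"


@dataclass(frozen=True)
class Controller:
    ports: int  # 2 for the dual-port embodiment, more for other multi-port variants
    side: Side


def ports_per_side(controllers: list[Controller]) -> dict[Side, int]:
    """Total the ports a router offers on each side, rejecting one-sided routers."""
    totals = {Side.HOST: 0, Side.DEVICE: 0}
    for c in controllers:
        if c.ports < 2:
            raise ValueError("a multi-port controller has at least two ports")
        totals[c.side] += c.ports
    if 0 in totals.values():
        raise ValueError("router needs controllers on both host and device sides")
    return totals
```

For example, the single-pair dual-port embodiment yields two ports per side, while a router with two host-side controllers and one device-side controller is equally valid, since a one-to-one allocation ratio is not required.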
[0049] A SAN architecture according to the present invention that employs dual-port fibre channel controller equipped storage area routers, such as those described previously, allows multiple hosts from an array of different vendor sources, running any one of a range of different operating systems (from the Open Systems--UNIX--genre, and several from the PC platform space) to be interconnected with multiple storage drive arrays/subsystems, for example, in fibre channel loop topologies, respectively, on each side.
[0050] DSMP Architecture in a Heterogeneous Environment
[0051] Such an architecture can provide coupling among multiple hosts and multiple storage media, employing multiple storage routers, where some of the hosts and storage media can be of varied constructs to that defined earlier (namely that of paired dual-port fibre channel controllers), in a heterogeneous environment. As opposed to a homogeneous coupling environment, consisting only of a single primarily chosen transport layer (or physical interconnect)--namely fibre channel in this example--a heterogeneous coupling environment is comprised of different transport layers (other than that of primary choice) in portions which may be sub-loops (or branches) to the main loop (or other topology).
[0052] Routers of these variant kinds can include (amongst other supporting hardware) one dual-port fibre channel controller mating with another multi-port controller (or other type of interface controller) that subscribes directly to a different physical interconnect, such as SCSI or SSA. Examining these two commonly sought-after variant router constructs:
[0053] In the case of a multi-port SCSI controller, this may be assigned to either the host side or to the storage side, whilst the multi-port fibre channel controller is then assigned to whichever host or storage device side remains in this arrangement.
[0054] For the case of a multi-port SSA controller combined with a multi-port fibre channel controller, although in theory the SSA controller may be assigned to act within the particular topological coupling of either the host or storage device side, in practicality, a storage device side allocation is generally observed, so for this case, the MPFCC is usually relegated to the host side.
[0055] As described earlier, a SAN employing a DSMP in an arrangement of this kind inherits all of the advantages associated with the DSMP paradigm, whether the coupling on host or storage device sides is of a loop, or other topology. Within such a SAN, coupling and successful interaction is permitted amongst a wide heterogeneous variety of host and storage equipment elements, from an array of competing vendors running different host OS platforms, or in the case of the storage units, perhaps incorporating degrees of their own intelligent capability for I/O protection ranging from nothing to high-level RAID protection. The availability of such capabilities is unprecedented in the independent distributed routing space.
[0056] The benefits of a DSMP Architecture, according to embodiments of the present invention are many, and include:
[0057] the ability to perform storage virtualization;
[0058] the ability to easily reconfigure the processes associated with virtualization including those related to maintaining redundancy levels either with regard to hosts or with regard to the storage media;
[0059] extensibility--of either hosts or the units of storage media--easily inserting/removing into/out of a SAN to meet any needs for expansion;
[0060] maintainability--relating to any of the host computers, units of storage media, or even the router units themselves--easily allowing for removal/replacement of unserviceable elements while permitting the functioning of the remainder of the SAN to continue with minimal disruption, protecting stored data from loss, and providing a high degree of protection against loss of I/O signals in transit at the time of failure.
[0061] [Note that in the case of substituting for failed storage drive units, the prescribing of spare drives, and the way to invoke such spare drives, is one of the standard automated features supplementing the aforementioned storage virtualization functionality of the routers--provided that the storage drive array/subsystem supports such functionality--which is generally the case.]
[0062] The one common denominator in all of this heterogeneity is that each one of the elements should subscribe to understanding (receiving and sending) data I/Os via an upper-level protocol such as SCSI, or whatever other means/modes of data communications between host servers and storage is considered acceptable (or is an industry standard).
[0063] Relationship of DSMPA With Switching
[0064] The multi-port fibre channel controller described in the MPFCC patent application allows topologies such as loop, point-to-point, and fabric--the latter otherwise referred to as switched. Although the details of any one particular topology are not significant to the present invention, the concept of switching should be discussed, and, in particular, how a switched portion of a storage network facilitates the building of a SAN framework.
[0065] SAN devices which perform switching (i.e., switches) are able to route data traffic across multiple ports according to a predefined fixed table of connectivity and can be implemented within the framework of a SAN wherein a DSMPA is employed and effectively complement the functionality of the routers. It will be noted that switches are generally discussed in the context of fibre channel interconnects, and may not apply in other of the transport layer technologies.
[0066] Even if storage routers were to have a multiplicity of ports comprising one or more of their internally mounted controllers, the volume could be insufficient to be able to deal with:
[0067] an excessive number of different host components seeking to be interconnected with the storage, or alternatively,
[0068] the multiplicity of separate storage media units available to be connected with.
[0069] In overcoming one of these problem situations if it occurs, it is important to note that the limiting factor in the degree of distributed routing independence and complexity is merely the port multiplicity in the storage routers, which define the extent of the DSMP Architecture. Although not strictly a router in the same sense as one acting in the context of data networks, (whereupon such a data router makes algorithm-governed decisions of determining the path rerouting of data packets, where the details of the source and destination are in a constant dynamic state), the storage router, in one embodiment, acts as an intelligent bridging conduit. Leaving the details of the processes which take place within such an architecture for subsequent discussion herein, it can be stated now that a storage router acts both as a SCSI initiator and as a SCSI target, as data is channeled through such a storage router between hosts and storage.
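The dual role described above, target toward the host and initiator toward the storage, can be sketched in a few lines. This is a simplified illustration rather than the disclosed implementation: `Request`, `StorageUnit` and `StorageRouter` are hypothetical names, no real SCSI command set is modeled, and the logical block address is passed through unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    initiator: str  # who issued the request
    target: str     # who it is addressed to
    op: str         # "read" or "write"
    lba: int


class StorageUnit:
    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: dict[int, bytes] = {}
        self.seen_initiators: list[str] = []

    def handle(self, req: Request, data: Optional[bytes] = None) -> Optional[bytes]:
        self.seen_initiators.append(req.initiator)
        if req.op == "write":
            self.blocks[req.lba] = data or b""
            return None
        return self.blocks.get(req.lba, b"")


class StorageRouter:
    def __init__(self, name: str, backend: StorageUnit) -> None:
        self.name = name
        self.backend = backend

    def handle(self, req: Request, data: Optional[bytes] = None) -> Optional[bytes]:
        # Target role: the host addressed this router, not the storage unit.
        if req.target != self.name:
            raise ValueError("request not addressed to this router")
        # Initiator role: reissue the request toward storage under the router's own name.
        forwarded = Request(self.name, self.backend.name, req.op, req.lba)
        return self.backend.handle(forwarded, data)
```

In this model the storage unit only ever sees the router as its initiator, and the host only ever addresses the router, which is what keeps each end unaware of the other.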
[0070] However, if a switch is implemented within the SAN framework, that switch merely acts as a director of data flow in both directions between nodes and devices, typically, without any capability of acting as an initiator or target. To give a simple analogy in terms of common household electrical supply for appliances and the like, the switch merely acts in much the same way that a plain power strip works in extending the number of AC power outlets available for distribution, though the power strip does nothing to the electricity in the way of metering, conditioning or amplifying.
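The distinction drawn above between a storage router and a switch can be illustrated with a minimal sketch. It is not taken from the specification: the class names, the tuple command format, and the `lba_offset` remapping (standing in for whatever processing a router might apply) are all hypothetical. The sketch only shows that the router terminates the host's command in a target role and originates a new command in an initiator role, whereas the switch forwards the command unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Hypothetical storage device: a target that holds blocks by address."""
    blocks: dict = field(default_factory=dict)

    def handle(self, command):
        op, lba, data = command
        if op == "WRITE":
            self.blocks[lba] = data
            return "GOOD"
        return self.blocks.get(lba)


@dataclass
class Switch:
    """Directs a command onward unchanged; neither initiator nor target."""
    downstream: object

    def handle(self, command):
        return self.downstream.handle(command)


@dataclass
class StorageRouter:
    """Accepts the host's command (target role), then issues its own
    command toward storage (initiator role)."""
    downstream: object
    lba_offset: int = 0  # assumed remapping, for illustration only
    issued: list = field(default_factory=list)

    def handle(self, command):
        op, lba, data = command                   # target role
        new_command = (op, lba + self.lba_offset, data)
        self.issued.append(new_command)           # initiator role
        return self.downstream.handle(new_command)


disk = Storage()
router = StorageRouter(downstream=Switch(downstream=disk), lba_offset=100)
router.handle(("WRITE", 5, b"abc"))
print(router.issued)   # commands the router itself originated
print(disk.blocks)     # what actually reached the storage device
```

In this sketch the switch could be removed from the chain without changing any result, which mirrors the power-strip analogy: it extends connectivity but does nothing to the traffic.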
[0071] The ratio of router units to host interface controller units is an over-riding parameter in determining the level of redundancy within the SAN. However, no such relationship exists between the number of router units and physical storage drive units, since the processes of storage virtualization are used to manipulate precisely that relationship. A switch may need to be implemented in a practical sense as a convenient and inexpensive means of providing sufficient paths for data channeling, though (theoretically), a DSMP Architecture, according to the embodiments of the present invention, embodied by a sufficient multiplicity of fibre channel ports on-board its integral controllers, obviates the role of the switch.
[0072] Example Interface Controller
[0073] A DSMP Architecture, according to embodiments of the present invention, preferably employs storage routers that provide the functionality and features necessary to support the full range of benefits that the architecture promotes. For example, while a SAN employing a DSMP Architecture according to embodiments of the present invention can use any one of several transport layer protocols suitable for SANs, an interface controller such as that described in the MPFCC patent application can be, and is most desirably, employed as the building block of such a SAN, thereby making fibre channel the lower level protocol of choice.
[0074] Each of the ports resident on a controller can provide access to its own bi-directional I/O data link. These I/O data links provide the mechanism by which host requests for access to the resources of the SAN can be received and passed on to the storage devices. Such communication is handled through the storage routers in a manner essentially transparent to elements both at source and destination, and with no significant detriment upon data throughput performance.
[0075] As mentioned, an important variable of such a multi-port controller is the very multiplicity of the ports available. Dual ports represent a minimum requirement for each controller to function effectively within the topology of the network on the particular side being considered: either the side closer to the hosts or the side closer to the storage devices.
[0076] As explained earlier, the controllers may be paired in each router, one of each pair allocated to each such side. Consider one of the ports of one of these two controllers, say that controller which is allocated to coupling with the host side. One such port accepts the link between a router and the adjacent element (be it another router or a different device), located say counter-clockwise (for the case of a loop)--or left (for the case of point-to-point topology), while the other port continues the link to the subsequent next adjacent element, say clockwise, or, towards the right (respectively per whatever the topology).
[0077] Increasing the multiplicity of fibre channel ports mounted upon controllers is advantageous for a number of reasons, the most obvious one being an improved potential for scalability, and the related economy of such scalability reducing cost. The ability to circumvent a need for separate switch devices adds to this benefit.
[0078] There is, however, a cross-over point at which the value in multiplying the number of ports per controller begins to diminish, owing to technical design limitations relating to:
[0079] provision of adequate electrical power supply components, both in terms of capacity, and providing for redundancy mechanisms to provide uninterrupted operation in the event of single line failure;
[0080] increasing complexity of circuit paths and design case contingencies, as well as heightened demands on individual component performance at very high frequencies, leading to increased chance for intermittent bugs and outright failure of electronic components, and then of the equipment itself in service;
[0081] as well as commercial disadvantages of:
[0082] excessive product entry-level price, and
[0083] acceptance difficulties in marketing a product whose physical packaging size exceeds the bounds of equipment which can be conveniently mounted in conventional unitary computer equipment racking space.
[0084] Although these concerns relate to successful practical implementation of the invention, they really fall outside the scope of describing the technology, so they will not be used to define any particular parameter limits.
[0085] The use of interface controllers (an example being multi-port controllers) simplifies the SAN architecture (e.g., a DSMP Architecture according to the present invention) and provides other economies. Moreover, a multiplicity of ports allows for a variety of topological options, allowing such a SAN to be architected in any number of ways, based on criteria such as throughput, reliability, fault-tolerance (both with regard to routers and other elements of the storage network) and similar criteria.
[0086] Basic SAN Employing DSMPA Architecture
[0087] FIG. 1 is a schematic diagram illustrating the basic philosophy underlying a SAN employing a DSMP Architecture. There exists a multiplicity of host systems which must have some external means of access, by both read and write input/output operations, to permanent data storage media. There is also available a separate multiplicity of storage drive array/subsystems. According to embodiments of the present invention, the hosts may be coupled to the storage by attaching this equipment to a SAN, relying on that SAN to also lend itself as a distributed management platform and so additionally provide an array of desirable characteristics beyond that of mere connectivity.
[0088] The SAN is shown by a `cloud` symbol, which is a common representation of the potentially complex coupling of links, perhaps incorporating storage networking devices that may exist in a storage area network. Within the `cloud` are introduced a series of subsystem interface devices which are generic embodiments of storage networking devices containing one or more embodiments of an interface controller, one specific example being that described in the MPFCC patent application. The subsystem interface devices are simply shown in a `heap`--their specific manner of connectivity amongst themselves and between these devices and the externally attached equipment is not detailed at this point in the discussion.
[0089] As will be apparent to one of skill in the art, one or more of the controllers in a SAN employing DSMP Architecture can be coupled to one or more such host systems. A subsystem interface device 100 is configured to operate in a DSMP Architecture such as that depicted herein. Subsystem interface device 100[i] includes an interface controller 101[i], through which connection is made to each of the host systems 120([1]-[N]), each of which is a machine that may comprise a variety of computer components, some of which may be in addition to the minimal set defining a suitable host, as described in the subsequent section and shown schematically in FIG. 2.
[0090] Subsystem interface device 100 is also depicted as being coupled (indirectly) to a set of storage drive array/subsystems 130([1]-[N]). Storage drive array/subsystem 130[i] can be, for example, a storage subsystem such as a hard disk drive array, tape storage unit, or such other storage media as may be necessary and appropriate to the given storage application. This equipment employs an upper level protocol, such as SCSI or some variant protocol (based on a similarly complex structured data format), preferably an accepted industry standard in host-to-storage communications. This enables participation in receiving and sending the I/O data signals (which may or may not be in a packetized format) being transmitted amongst the SAN elements coupled together by the network topology. Variation of protocol in the lower transport layer (from that of FC as the preferred means) is widely anticipated, and a DSMP Architecture according to embodiments of the present invention provides for this.
[0091] Having been presented an explanation in the preceding discussion, those skilled in the art will understand now that the next most immediate portions of this coupling are internal to a subsystem interface device containing, amongst various other necessary components, at least one other interface controller, and then further, this coupling may be along a path passing through other elements of the SAN. However, the details relating to these intermediate portions of the SAN element coupling between subsystem interface devices 100([1]-[N]) and storage drive array/subsystems 130([1]-[N]) are immaterial for the purposes of this initial discussion, and so are not shown in FIG. 1.
[0092] In the SAN architecture depicted in FIG. 1, port 102[i,1] of interface controller 101[i] couples subsystem interface device 100[i] to host system 120[i] via one of a series of host I/O interface controllers 125([1]-[N]) installed in host systems 120([1]-[N]). The other I/O ports (ports 102([i,2]-[i,N])) on interface controller 101[i] can then be coupled to other elements of the DSMP Architecture, including, but not necessarily limited to, other host systems and even other storage routers. Aligned with the earlier discussion, the implication is that this controller is coupled to a host side loop (or other) topology.
[0093] On the other hand, each subsystem interface device 100[i] may also be seen to contain an interface controller 105[i] on the storage device side (simply referred to subsequently as the device side), which may or may not resemble the interface controller 101[i] on the host side. Each interface controller 105[i] has at least one port 106[i,1], which is employed to couple subsystem interface device 100[i] to one of a set of storage drive arrays/subsystems 130[i], via one of a series of port interfaces that might be integral to such third party storage equipment. Any other I/O ports (ports 106([i,2]-[i,N])) on the storage side, on interface controller 105[i], can then be coupled to other elements of the DSMP Architecture, including, but not necessarily limited to, other storage drive arrays/subsystems and even other storage routers. Aligned with the earlier discussion, the implication is that this interface controller 105[i] is coupled to a storage side loop (or other topology).
[0094] It is more common for neighboring subsystem interface devices 100[i] (later referred to as storage routers) to be communicatively coupled via their respective storage device-side interface controllers 105[i], rather than via those interface controllers 101[i] on their host sides. There are certain circumstances in which a SAN 110 configuration will be considered to consist of more than a single SAN, if the latter means of host side controller inter-linking of subsystem interface devices 100[i] is applied.
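The indexing scheme used for FIG. 1 (device 100[i], host-side controller 101[i] with ports 102[i,j], device-side controller 105[i] with ports 106[i,j]) can be captured in a small data model. The following is only a sketch under stated assumptions: the class names, the `make_device` helper, and the default of two ports per controller (reflecting the dual-port minimum discussed earlier) are illustrative and are not part of the described architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    numeral: int  # 102 for host-side ports, 106 for device-side ports
    device: int   # index i of the subsystem interface device
    index: int    # index j of the port on its controller

    def label(self):
        return f"{self.numeral}[{self.device},{self.index}]"


@dataclass
class InterfaceController:
    numeral: int  # 101 on the host side, 105 on the device side
    device: int
    ports: list

    def label(self):
        return f"{self.numeral}[{self.device}]"


def make_device(i, ports_per_controller=2):
    """Build a hypothetical model of subsystem interface device 100[i]."""
    if ports_per_controller < 2:
        raise ValueError("each controller needs at least two ports")
    js = range(1, ports_per_controller + 1)
    host_side = InterfaceController(101, i, [Port(102, i, j) for j in js])
    device_side = InterfaceController(105, i, [Port(106, i, j) for j in js])
    return {"label": f"100[{i}]", "host_side": host_side, "device_side": device_side}


device = make_device(3)
print(device["label"], device["host_side"].label(), device["device_side"].label())
print([p.label() for p in device["host_side"].ports])
print([p.label() for p in device["device_side"].ports])
```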
[0095] In any case, this implication is that a SAN 110 employing DSMP Architecture will somehow be dependent on such an architecture to couple the attached equipment, and the invention will provide a specific means of achieving this, demonstrated by various embodiments which remain to be disclosed, as the description progresses throughout this document.
[0096] Moreover, regarding the signals transmission described herein, those skilled in the art will recognize that a signal may be directly transmitted from a first element to a second element, or a signal may be modified (e.g., amplified, attenuated, delayed, latched, buffered, inverted, filtered or otherwise modified) between the elements shown in the diagram by interstitial elements whose details are omitted for brevity. Although the signals of the above described embodiment are characterized as transmitted from one element to the next, other embodiments of the present invention may include modified signals in place of such directly transmitted signals as long as the informational and/or functional aspect of the signal is successfully transmitted between elements.
[0097] To some extent, a signal input at a second element may be conceptualized as a second signal derived from a first signal output from a first element due to physical limitations of the circuitry involved (e.g., there will inevitably be some attenuation and delay). Therefore, as used herein, a second signal derived from a first signal includes the first signal or any modifications to the first signal, whether due to circuit limitations, or due to passage through other circuit elements which do not change the informational and/or functional aspect of the first signal.
[0098] With the foregoing described embodiment wherein the different components are contained within different other components (e.g., the various elements shown as components of host 220), it is to be understood that such depicted architectures are merely examples, and that in fact, there may be other architectures that can be implemented which achieve the same functionality. This statement will also apply to subsequent descriptions.
[0099] In an abstract, but still definite sense, any arrangement of components to achieve the same functionality is effectively "associated" such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as "associated with" each other such that the desired functionality is achieved, irrespective of architectures or intermediate components. Likewise, any two components so associated can also be viewed as being "operably connected", or "operably coupled", to each other to achieve the desired functionality.
[0100] Example Host System
[0101] As has been noted, a computer system such as host system 120 may be one of a variety of types, though most commonly these fall into categories ranging from machines used as personal workstations to those used as network servers. Suitable host systems, designed to provide computing power to one or more users, either locally or remotely, are identifiable in that they comprise a certain common set of essential elements, as listed below:
[0102] motherboard;
[0103] PCI bus;
[0104] central processor unit (CPU) & chipset;
[0105] system read-only memory (ROM) and random access memory (RAM);
[0106] input/output (I/O) interface controller;
[0107] network interface controller;
[0108] power supply and associated electrical hardware.
[0109] FIG. 2 depicts a schematic diagram of a host server 220 suitable for attachment to embodiments of the present invention, and is an example of a computer that can serve as host in FIG. 1, as well as subsequent figures where a host is involved. Host computer 210 includes a PCI bus 270, which interconnects major components of host server 210 mounted on a motherboard 221, such as a CPU 222 and a chipset 223. Also present are a system memory (ROM and RAM) chipset 224, an input/output interface controller 225, and a network interface controller 228 (the latter peripherals often being coupled to CPU 222 and chipset 223 via a PCI bus).
[0110] There may however be variations to this where, for example, input/output interface controller 225, and network interface controller 228 can instead be coupled via a separate expansion bus (not shown). Network interface 228 may provide a direct connection to a remote server via a direct network link to the Internet. All of these discrete components are powered by electrical signals provided through a power supply and other electrical hardware 290. Within such a host, there is not necessarily contained any component of storage media, as the purpose of a SAN is to provide coupling between hosts of any kind to such storage media.
[0111] Depending on the processor (CPU) type powering the host, if, for example, it is an Intel x86.RTM. or a competitively similar CPU chip, the operating system (OS) provided on such a system 210 may be a suitable form of one of MS-DOS.RTM., MS-Windows NT.RTM. or 2000, though this same host may also run any of the Intel x86.RTM. UNIX.RTM. family OS such as Linux.RTM.. Alternatively, if the CPU is one of Sun Sparc.RTM., HP PA-RISC.RTM., DEC Alpha.RTM., or Motorola-IBM Power PC.RTM. based (or similar), the host will almost exclusively be a platform employing only one of the UNIX.RTM. variants, including Solaris and Linux.RTM., or one of any other known open systems type OS.
[0112] It will be noted that the variable identifier "N" is used in several instances throughout, particularly in the discussions which follow, to more simply designate the final element. Consider, for example, the ports 110([1]-[N]) of a series of related or similar elements (e.g., ports 110). Furthermore, these N ports can be seen to be mounted on a series of MPFC Controllers (100[1]-[N]). The repeated use of such variable identifiers is not meant to imply a correlation between the sizes of such separate series of different elements, although such correlation may exist. The use of such variable identifiers does not require that each series of elements has the same number of elements as another series delimited by the same variable identifier. Rather, in each instance of use, the variable identified by "N" may hold the same or a different value than other instances of the same variable identifier.
[0113] Constructs of a Storage Area Network Employing a DSMP Architecture
[0114] FIG. 3 is a schematic diagram illustrating some specific details of one side of a basic implementation of a DSMP Architecture, according to embodiments of the present invention. Depicted is an example of connectivity between a number of hosts 320([1]-[N]) and a number of storage drives array/subsystems 330([1]-[N]) through certain devices (subsystem interface devices 300([1]-[N])), each containing a generic embodiment of the interface controller 301([1]-[N]). A number of these devices, each generically described as subsystem interface device 300([1]-[N]), are coupled to a path set (preferably a loop) coupling hosts 320([1]-[N]) to a storage array/sub-system 330([1]-[N]). Such subsystem interface devices are shown coupled via ports 302([1,1]-[N,N]) mounted on interface controllers 301([1]-[N]) in the loop, which is associated with the host side. Hosts 320([1]-[N]) (each notably a simplified version of the collective group of elements defined to comprise a host 220 in FIG. 2) are each connected into this loop through one of host I/O storage network interface controllers 325([1]-[N]).
[0115] The resulting SAN (SAN 310), based upon this open construct of the DSMPA invention, in general terms includes three distinguishable types of physical elements, namely:
[0116] ports,
[0117] networking devices, and
[0118] link cabling.
Thus, clearly defined within its bounds are subsystem interface devices 300[i] containing interface controllers 301[i], and mounted thereon ports 302[i]. It should be understood that storage drive array/subsystem media 330[i] elements, and hosts 320[i], are not part of SAN 310, but rather are connected to SAN 310. This is in accordance with the understanding of those of skill in the art that a SAN is the total sum of all of the components located between the host I/O controllers and the subsystems.
[0119] Each subsystem interface device 300[i] is only shown to consist of, among other components, a single interface controller 301[i], depicted as being coupled with the host-side topology of SAN 310, (though as discussed subsequently, a minimal practical requirement should be two such controllers, the other being for coupling to the device side). From this perspective, details of the connection between subsystem interface devices 300([1]-[N]) and storage drive array/subsystems 330([1]-[N]) will, for FIG. 3 and FIG. 4, be apparent to one of skill in the art. In other words, details of varying connectivities for the storage device side coupling the storage elements to the SAN 310, are discussed subsequently.
[0120] The flexible nature of the DSMPA in accommodating host systems which have a different protocol/topology from that of the storage arrays/subsystems is a powerful advantage over preceding efforts in related technology. Using this open construct, where the storage media connectivity is decoupled in this manner, the significance of not requiring any one kind of protocol/topology to match that of the hosts, which is considered as the SAN backbone, is emphasized. In the embodiments of FIGS. 5 and 6 that follow, examples are given of the more common practical applications, where FC protocol in a loop topology is the backbone, which transmits through to the storage media. The DSMPA readily accommodates, at the transport layer, SCSI protocol (as opposed to SCSI upper-layer protocol), and so too the proprietary SSA protocol, as examples, each of which may not be obvious, and may otherwise go unrecognized.
[0121] Moreover, it will also be noted that any appropriate transport layer and upper-layer protocol combinations may be employed in the backbone, exemplified by SAN 310, (although SCSI upper-layer protocol over fibre channel is preferred), including any of a variety of existing alternatives, examples being:
[0122] TCP/IP over 10base-T/100base-TX Ethernet, as pertaining to
[0123] a LAN,
[0124] a proprietary WAN, or
[0125] the Internet;
[0126] modified TCP/IP (or other similar protocol) over Gigabit Ethernet transport layer;
[0127] Infiniband transport protocol compatible with IP routing, over Ethernet based networks;
[0128] iSCSI (SCSI upper-layer protocol over internet protocol), over Gigabit Ethernet based networks;
[0129] token-passing in a proprietary or other protocol--over a suitable medium.
[0130] Further, the topology of this SAN 310 can be one of a variety of topologies, including, but not limited to:
[0131] ring,
[0132] mesh,
[0133] bus,
[0134] tree.
[0135] Thus, while certain of the discussions herein are oriented towards a loop topology employing a fibre channel protocol, a DSMP Architecture according to embodiments of the present invention may be discussed in terms of other topologies and protocols without loss of generality. It will be noted that, for the sake of simplicity, the term protocol, as used herein, encompasses all layers of communication protocols, implicating both hardware and software.
[0136] An important facet of the invention is the links path connecting hosts 320([1]-[N]) to interface controllers. The organization of hosts and subsystem interface devices depicted herein as being connected generically in a path set (preferably a loop) is a defining feature of the DSMPA (where the transport layer carries data in a unidirectional manner, as will be seen in more complete embodiments which follow that of FIG. 3). Note that for the case of a transport layer where there is bi-directional data carriage, a similar connection strategy will apply, although the direction of data flow can invert dynamically.
[0137] Commencing with link 311, which begins at terminal 327[1,out] of port 326[1] of the host I/O interface controller 325[1] for host 320[1], signals carrying data I/O (or similar) go to terminal 303[1,1,in] of the first port (port 302[1,1]), contained as part of interface controller 301[1] of subsystem interface device 300[1]. From there, the signal can be followed by tracing the links sequentially numbered 311-315, several of which consist of multiple segments. Signals are internally routed across the interface controller 301[1] and emerge at terminal 303[1,N,out], where the first segment of multiple segment link 312 commences. This link 312 segment joins with terminal 303[2,1,in] of port 302[2,1] of interface controller 301[2] of the next subsystem interface device (subsystem interface device 300[2]), being re-routed internally within interface controller 301[2], then emerging via terminal 303[2,N,out] of last port (port 302[2,N]) of interface controller 301[2].
[0138] Signals continue along subsequent segments of multiple segment link 312 passing through each of any interstitial interface controllers between that of 301[2] of second subsystem interface device (subsystem interface device 300[2]), and that of 301[N] of the last subsystem interface device (subsystem interface device 300[N]) in the same fashion as just described. Coming in through one terminal of the first port of the interface controller of each subsystem interface device, the signals are then internally re-routed to emerge via the out terminal of the last port of the same controller, and so on, until the final interface controller 301[N] is encountered. There, the signals pass along the final segment of link 312 into terminal 303[N,1,in] of the first port (port 302[N,1]) of interface controller 301[N]. Instead of being internally routed within interface controller 301[N] through to the last port (port 302[N,N]) (as per the foregoing pattern), the signals never reach the last port (port 302[N,N]), but are instead diverted to exit via terminal 303[N,1,out] of port 302[N,1], to then begin a return journey across the array of subsystem interface devices (subsystem interface devices 300([N]-[1])), along a path made up of various segments of multiple segment link 313.
[0139] The signals follow a path which is a loop-back traversing each of the same ports encountered by link segments 312, though in the reverse order, and in each case, via the terminal of each port respectively not listed thus far in the description regarding FIG. 3.
[0140] Typically, as shown for the subsystem interface device 300[2], the signals pass via a link segment 313 into terminal 303[2,N,in] of the last port (port 302[2,N]) of interface controller 301[2], whereupon they are re-routed internally within interface controller 301[2] to emerge at terminal 303[2,1,out] of the first port (port 302[2,1]) of this same controller.
[0141] Next, the signals reach the first subsystem interface device (subsystem interface device 300[1]), thus making this segment of incoming link 313 its final segment. Entering via terminal 303[1,N,in] of the last port (port 302[1,N]) of interface controller 301[1], the signals are internally routed within interface controller 301[1] to exit via terminal 303[1,1,out] of its first port (port 302[1,1]). The signals then cross over via link 314 where the loop continues in segments of a multiple segment link 315, sequentially coupling hosts 320([N]-[2]), eventually having the circuit completed at host 320[1].
[0142] Incoming via terminal 327[N,in] of port 326[N] of host I/O interface controller of the final host (host 320[N]), the signals re-emerge via terminal 327[N,out], whereupon the signals continue along a path made up of series of successive link segments 315 which will sequentially traverse each I/O interface controller 325([N]-[2]) of the array of hosts 320([N]-[2]), finally reaching I/O interface controller 325[1] of the first host (host 320[1]), corresponding to that from which tracing of the signal path commenced.
[0143] Typically, as shown for host 320[2], the signals being carried along segments of multiple segment link 315 enter host I/O interface controller (e.g., host I/O interface controller 325[2]) via a terminal (e.g., terminal 327[2,in] of port 326[2]), and are then passed out via a terminal (e.g., terminal 327[2,out] of that same port). From here, the signals are returned to host interface controller (e.g., host interface controller 325[1] contained in the first host 320[1]), entering via a terminal (e.g., terminal 327[1,in] of port 326[1]), thus making this incoming segment of link 315 its final segment, and also completing the host-side network loop.
[0144] As will be apparent to one of skill in the art, each one of the multiple segments of link 312 through to 313 as well as link 311, is typically substantially similar, if not identical, to other of multiple segments of link 315, as well as link 314. As coupled in FIG. 3, subsystem interface devices 300([1]-[N]), containing the interface controllers 301([1]-[N]), and their associated links, form a SAN in a loop configuration (i.e., SAN 310).
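The host-side loop traced above for FIG. 3 can be summarized as an ordered list of terminals. The sketch below is an illustration only, not part of the disclosure: the function name and its three separate parameters are assumptions, introduced because the variable identifier "N" in the text may hold a different value for devices, ports, and hosts. Terminal labels follow the 303[device,port,direction] and 327[host,direction] conventions used in the description.

```python
def trace_host_side_loop(devices, ports, hosts):
    """Return the terminals visited, in order, by a signal leaving host 1.

    devices: number of subsystem interface devices 300[1..devices]
    ports:   number of ports on each interface controller 301[i]
    hosts:   number of hosts 320[1..hosts]
    """
    if devices < 1 or hosts < 1 or ports < 2:
        raise ValueError("need >= 1 device, >= 1 host and >= 2 ports")

    path = ["327[1,out]"]  # link 311 starts here

    # Outbound (links 311 and 312): in at the first port, out at the last.
    for d in range(1, devices):
        path += [f"303[{d},1,in]", f"303[{d},{ports},out]"]

    # Last device: the signal is diverted back out of the same first port.
    path += [f"303[{devices},1,in]", f"303[{devices},1,out]"]

    # Return (link 313): in at the last port, out at the first, reverse order.
    for d in range(devices - 1, 0, -1):
        path += [f"303[{d},{ports},in]", f"303[{d},1,out]"]

    # Links 314 and 315: through hosts N down to 2, then back to host 1.
    for h in range(hosts, 1, -1):
        path += [f"327[{h},in]", f"327[{h},out]"]
    path.append("327[1,in]")
    return path


for terminal in trace_host_side_loop(devices=3, ports=2, hosts=2):
    print(terminal)
```

Each terminal appears exactly once in the returned list, which is consistent with the statement that the return path uses, at each port, the terminal not used on the outbound path.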
[0145] FIG. 4 is a schematic diagram illustrating an example of connectivity which is an extension of that shown in FIG. 3.
[0146] Depicted is another example of connectivity between a number of host computers and a number of storage drives array/subsystems through certain devices. Simultaneously, a number of these devices, each generically described as a subsystem interface device 400[i] of a SAN 410, can be joined in the following ways:
[0147] a) By links 411 to 415 in a loop forming part of the primary network backbone of SAN 410, the purpose of which is to provide for coupling of hosts 420([1]-[N]) to one of storage array/subsystems 430([1]-[N]). Subsystem interface devices 400([1]-[N]) are shown coupled in this loop, which is associated with the host side, via ports 402([1,1]-[N,N]) mounted on interface controllers 401([1]-[N]), an embodiment of which subsystem interface devices 400([1]-[N]) contain. The hosts 420([1]-[N]) are coupled into this loop, each through one of host I/O storage network interface controllers 425([1]-[N]).
[0148] b) Via links 441 within SAN 410, to allow independent coupling of the elements with a separate secondary (not necessarily loop topology) or sub-network (a network 440). Each one of hosts 420([1]-[N]), and the subsystem interface devices 400([1]-[N]) is coupled via links 441 to a separate independent network (again, network 440), each respectively through a particular type of network interface card (NIC), with NICs 428([1]-[N]) selected for compatibility with particular hosts 420([1]-[N]), and NICs 408([1]-[N]) integrally designed and installed to match with subsystem interface devices 400([1]-[N]).
[0149] In the case of the primary network backbone coupling, employment of fibre channel as the lower-layer protocol of the SAN 410 is suggested as a particular embodiment, although the invention is not limited to such a configuration. An example of the kind of network which may be implied in the latter case is that of TCP-IP over a local area network (LAN), although a different kind of network protocol could easily be employed instead to provide for this alternative secondary network connectivity.
[0150] This implementation capitalizes on the abilities of the DSMP Architecture that provide for inter-device communication across any of several separate independent channels. Thus, in such embodiment of the invention as drawn (i.e., subsystem interface device consisting of, amongst other non-specific components, the first of a minimum of two interface controllers, plus one or more NICs), an extra level of management flexibility and operational redundancy beyond the one shown in FIG. 3 is enabled.
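One way to picture the operational redundancy offered by separate independent channels is a sender that tries the primary backbone first and falls back to the secondary network. The specification does not prescribe any such selection policy; the function name, the ordered-fallback rule, and the stand-in channel functions below are all assumptions made purely for illustration.

```python
def send_management_message(message, channels):
    """Try each (name, send) pair in order; return the name that succeeded.

    `channels` is a hypothetical ordered list such as
    [("backbone loop", send_fn), ("network 440", send_fn)].
    """
    errors = []
    for name, send in channels:
        try:
            send(message)
            return name
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all channels failed: " + "; ".join(errors))


def broken_backbone(message):
    # Stand-in for a failed primary link; always raises.
    raise ConnectionError("loop is down")


delivered = []
used = send_management_message(
    "status?",
    [("backbone loop", broken_backbone), ("network 440", delivered.append)],
)
print(used, delivered)
```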
[0151] As established in regard to FIG. 3, the generic pattern of the looped path set of links continues in FIG. 4, connecting an array of hosts 420([1]-[N]) to an array of subsystem interface devices 400([1]-[N]), each containing one of interface controllers 401 ([1]-[N]) reappears, being a defining feature of the DSMPA involving the transport layer carrying data either in a unidirectional or bi-directional manner.
[0152] Commencing with link 411, which begins at terminal 427[1,out] of port 426[1] of host I/O interface controller 425[1] for host 420[1], the signals go to terminal 403[1,1,in] of the first port (port 402[1,1]), contained as part of interface controller 401[1] of subsystem interface device 400[1]. From there, the signals can be followed by tracing the links 411-415 sequentially. Signals are internally routed across interface controller 401[1] to emerge at terminal 403[1,N,out] of port 402[1,N], where the first segment of multiple segment link 412 commences. This segment of link 412 joins with terminal 403[2,1,in] of the first port (port 402[2,1]) of interface controller 401[2] of the next subsystem interface device (subsystem interface device 400[2]), being re-routed internally within interface controller 401[2], then emerging via terminal 403[2,N,out] of the last port (port 402[2,N]) of interface controller 401[2].
[0153] Signals continue along subsequent segments of link 412 by passing through each of any intermediate interface controllers 401([3]-[N-1]) between that of 401[2] mounted in the second subsystem interface device (subsystem interface device 400[2]), and that of interface controller 401[N] mounted in the last subsystem interface device (subsystem interface device 400[N]) in the same fashion as just described. Coming in through one terminal of the first port of the given interface controller of each subsystem interface device, the signals are then internally re-routed to emerge via the out terminal of the last port of the same controller, and so on, until the final interface controller (interface controller 401[N]) is encountered. There, the signals will pass along the final segment of link 412 into terminal 403[N,1,in] of the first port (port 402[N,1]) of interface controller 401[N]. Rather than being internally routed within interface controller 401[N] through to the last port therein (port 402[N,N]), as per the foregoing pattern, the signals are diverted to exit via terminal 403[N,1,out] of the same port (port 402[N,1]), to then begin a return journey across the array of subsystem interface devices (subsystem interface devices 400([N]-[1])), along a path made up of various segments of multiple segment link 413.
[0154] The signals follow a path which is a loop-back traversing each of the same ports encountered by link segments 412, though in the reverse order, and in each case, via the terminal of each port respectively not listed thus far in the description regarding FIG. 4.
[0155] Typically, as shown for the subsystem interface device 400[2], the signals pass via a link segment 413 into terminal 403[2,N,in] of the last port (port 402[2,N]) of interface controller 401[2], whereupon they are re-routed internally within interface controller 401[2] to emerge at terminal 403[2,1,out] of the first port (port 402[2,1]) of this same controller.
[0156] Next, the signals reach the first subsystem interface device (subsystem interface device 400[1]), thus making this segment of incoming link 413 its final segment. Entering via terminal 403[1,N,in] of the last port (port 402[1,N]) of interface controller 401[1], the signals are internally routed within interface controller 401[1] to exit via terminal 403[1,1,out] of its first port (port 402[1,1]). The signals then cross over via link 414 where the loop continues in segments of a multiple segment link 415, sequentially coupling hosts 420([N]-[2]), eventually having the circuit completed at host 420[1].
[0157] Incoming via terminal 427[N,in] of port 426[N] of host I/O interface controller 425[N] of the final host (host 420[N]), the signals re-emerge via terminal 427[N,out], whereupon the signals continue along a path made up of series of successive link segments 415 which will sequentially traverse each I/O interface controller 425([N]-[2]) of the array of hosts 420([N]-[2]), finally reaching I/O interface controller 425[1] of the first host (host 420[1]), corresponding to that from which tracing of the signal path commenced.
[0158] Typically, as shown for host 420[2], the signals being carried along segments of multiple segment link 415 enter host I/O interface controller (e.g., host I/O interface controller 425[2]) via a terminal (e.g., terminal 427[2,in] of port 426[2]), and are then passed out via a terminal (e.g., terminal 427[2,out] of that same port). From here, the signals are returned to host interface controller (e.g., host interface controller 425[1] contained in the first host 420[1]), entering via a terminal (e.g., terminal 427[1,in] of port 426[1]), thus making this incoming segment of link 415 its final segment, and also completing the host-side network loop.
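The traversal of links 411-415 traced above can be summarized as a short sketch. The function name `trace_host_side_loop` and its parameters are hypothetical, introduced only for illustration; the terminal labels follow the 403[device,port,direction] and 427[host,direction] numbering of FIG. 4, and the sketch assumes the last port of each controller is numbered by the port count.

```python
from typing import List


def trace_host_side_loop(num_hosts: int, num_devices: int,
                         ports_per_controller: int) -> List[str]:
    """Return, in order, the terminals visited around the FIG. 4 loop.

    Hypothetical helper: labels mirror the reference numerals in the text.
    """
    last = ports_per_controller
    path = ["427[1,out]"]  # link 411 leaves the first host

    # Outbound over links 411/412: in at the first port, out at the last port.
    for d in range(1, num_devices):
        path.append(f"403[{d},1,in]")
        path.append(f"403[{d},{last},out]")

    # Final device: the signal turns around on the first port.
    path.append(f"403[{num_devices},1,in]")
    path.append(f"403[{num_devices},1,out]")

    # Return over link 413: in at the last port, out at the first port.
    for d in range(num_devices - 1, 0, -1):
        path.append(f"403[{d},{last},in]")
        path.append(f"403[{d},1,out]")

    # Link 414, then segments of link 415 through hosts N down to 2.
    for h in range(num_hosts, 1, -1):
        path.append(f"427[{h},in]")
        path.append(f"427[{h},out]")

    path.append("427[1,in]")  # loop closes at the first host
    return path


if __name__ == "__main__":
    for terminal in trace_host_side_loop(2, 2, 2):
        print(terminal)
```

Each port is visited twice, once per terminal, except the first port of the final device, where both terminals are visited back to back.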
[0159] As will be apparent to one of skill in the art, each one of the multiple segments of link 412 through to 413 as well as link 411, is typically substantially similar, if not identical, to other of multiple segments of link 415, as well as link 414. As coupled in FIG. 4, subsystem interface devices 400([1]-[N]), containing the interface controllers 401([1]-[N]), and their associated links form a SAN with a primary loop configuration (i.e., SAN 410). As mentioned earlier, while employment of the fibre channel transport layer protocol is implied here, the invention should not be considered to be limited only to such an embodiment. In any case, these links could be collectively considered to be an in-band channel of communications, specifically between the subsystem interface devices, as any communications data shares the available bandwidth with the distinct I/O signals data traveling in both directions between the hosts and the storage arrays/subsystems.
[0160] Meanwhile, the set of link branches 441, being of a different type than links 411-415, maintain a coupling between elements on the host side within a sub-network, a network 440 of the SAN 410. Any one of link branches 441 joining network 440 to a host element 420([1]-[N]) can be seen to pass via one of ports 429([1]-[N]) that reside on one of host NICs 428([1]-[N]). In the case of joining the network 440 to subsystem interface devices 400([1]-[N]), any link can be seen to pass in via one of ports 409([1]-[N]) residing on one of subsystem interface device NICs 408([1]-[N]).
[0161] In this embodiment, the set of link branches 441 of this sub-network, network 440, exist simultaneously with, and independently of, the coupling of the primary loop carrying the in-band channel. This set of sub-network link branches (i.e., set of links 441) of network 440 within SAN 410 can be collectively considered the out-of-band communications channel between subsystem interface devices 400([1]-[N]). Other embodiments, however, exist in which the two different sets of links (links 411-415 compared with link branches 441) could have their roles transposed. Alternatively, a loss of connectivity across the links of one network type can be substituted for by the links of the other network, which transparently usurp the role of the former.
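The substitution of one channel for the other upon loss of connectivity might be sketched as follows. The source does not specify a selection mechanism, so the `Channel` enumeration and `select_channel` function are purely illustrative assumptions.

```python
from enum import Enum
from typing import Dict, Optional


class Channel(Enum):
    # Hypothetical names for the two link sets described for FIG. 4.
    IN_BAND = "links 411-415"
    OUT_OF_BAND = "link branches 441"


def select_channel(preferred: Channel,
                   is_up: Dict[Channel, bool]) -> Optional[Channel]:
    """Use the preferred channel if it is up; otherwise fall back to the other.

    Returns None when neither channel currently has connectivity.
    """
    if is_up.get(preferred, False):
        return preferred
    for channel in Channel:
        if channel is not preferred and is_up.get(channel, False):
            return channel
    return None


if __name__ == "__main__":
    status = {Channel.IN_BAND: True, Channel.OUT_OF_BAND: False}
    print(select_channel(Channel.OUT_OF_BAND, status))
```

Because the fallback is symmetric, the same function also covers the case in which the roles of the two link sets are transposed.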
[0162] FIG. 5 is a schematic diagram illustrating an example SAN that elaborates on the implementation of single primary network connectivity beyond that open construct shown in FIG. 3. Links (preferably of fibre channel type) are used to create an extensible SAN, described in terms of two separate path set configurations, one on the host side, and another on the device side.
[0163] Depicted is another example of a DSMP Architecture that is the basis of the SAN 510 infrastructure, establishing connectivity between a number of host computers through to a number of storage drives array/subsystems via certain devices.
[0164] As established in FIG. 3, FIG. 5 similarly shows a generic pattern of a path set of links coupling an array of hosts 520([1]-[N]) to an array of subsystem interface devices 500([1]-[N])--this being a defining feature of a DSMPA according to embodiments of the present invention--involving the transport layer carrying data in either a uni-directional or bi-directional manner. However, each such device is now shown to contain interface controllers which can specifically be multi-port controllers; thus, in addition to a first multi-port controller 501[i], there is a second multi-port controller 505[i]. The remainder of the connectivity is now revealed in detail--that of another path set of links--coupling each of the subsystem interface devices 500([1]-[N]), via one of the second multi-port controllers 505([1]-[N]), through to the storage arrays/subsystems 530([1]-[N]).
[0165] Each of such subsystem interface devices is now referred to herein as a storage router type device (or simply storage router), this being a key aspect of a generic embodiment of the present invention. However, to maintain generality and not limit the scope of the invention, it should be understood that this device is one type of storage router, so there may be other combinations of numbers of controllers, supporting other protocols, which are equally well able to be successfully implemented in practice.
[0166] Simultaneously, a number of these storage router type devices (e.g., storage routers 500([1]-[N])) may be joined in the following ways.
[0167] a) On the host side, by links 511 through to 515, in a path set forming part of the primary network backbone of SAN 510, whose purpose is to provide for one portion of the coupling of hosts 520([1]-[N]) to a storage array/subsystem 530([1]-[N]). Storage router type devices 500([1]-[N]) are shown coupled in this path set (preferably a loop), which is associated with the host side, via ports 502([1,1]-[N,N]) of multi-port controllers 501([1]-[N]) contained therein. The hosts 520([1]-[N]) are coupled into this loop, each through one of the host I/O storage network interface controllers 525([1]-[N]).
[0168] b) On the device side, via links 516-519 as well as link 521, in a path set forming the second part of the primary network backbone of SAN 510, the purpose of which is to provide the remaining portion of the coupling of hosts 520([1]-[N]) to storage array/subsystems 530([1]-[N]). Storage router type devices 500([1]-[N]) are shown coupled in this path set (preferably a loop), which is associated with the device side, via ports 506([1,1]-[N,N]) of multi-port controllers 505([1]-[N]) contained therein. Storage drive arrays/subsystems 530([1]-[N]) are coupled into this loop, each through storage subsystem controllers 531([1]-[N])--these latter items however, as obvious to one of skill in the art, not being a part of the invention.
[0169] For this type of primary network coupling backbone, employment of SCSI upper-layer protocol over fibre channel transport layer of SAN 510 is suggested as one preferred embodiment, although the invention is not limited to such a configuration. Moreover, it will be noted that any appropriate transport layer and upper-layer protocol combinations may conceivably be employed for the SAN, including any of a variety of existing and potentially suitable alternatives (as listed earlier regarding FIG. 3, but restated here):
[0170] TCP/IP over 10base-T/100base-TX Ethernet, as pertaining to
[0171] a LAN,
[0172] a proprietary WAN, or
[0173] the Internet;
[0174] modified TCP/IP (or other similar protocol) over Gigabit Ethernet transport layer;
[0175] Infiniband transport protocol compatible with IP routing, over Ethernet based networks;
[0176] iSCSI (SCSI upper-layer protocol over internet protocol), over Gigabit Ethernet based networks;
[0177] token-passing in a proprietary or other protocol--over a suitable medium.
[0178] Further, the topology of this SAN 510 can be one of a variety of topologies, including, but not limited to:
[0179] ring,
[0180] mesh,
[0181] bus,
[0182] tree.
[0183] Thus, while certain of the discussions herein are oriented towards a loop topology employing a fibre channel protocol, a DSMP Architecture according to embodiments of the present invention may be discussed in terms of other topologies and protocols without loss of generality. It will be noted that, for the sake of simplicity, the term protocol, as used herein, encompasses all layers of communication protocols, implicating both hardware and software.
[0184] Commencing with link 511, which begins at terminal 527[1,out] of port 526[1] of host I/O interface controller 525[1] for host 520[1], the signals go to terminal 503[1,1,in] of the first port (port 502[1,1]), which is part of multi-port controller 501[1] of storage router type device 500[1].
[0185] From there, the signal can be followed by tracing links 511-515 sequentially. Signals are internally routed across multi-port controller 501[1], and emerge at terminal 503[1,N,out] of port 502[1,N], where the first segment of multiple segment link 512 commences. This segment of link 512 joins with terminal 503[2,1,in] of the first port (port 502[2,1]) of the multi-port controller 501[2] of the next storage router type device (storage router type device 500[2]), being re-routed internally within multi-port controller 501[2], then emerging via terminal 503[2,N,out] of last port 502[2,N].
[0186] Signals continue along subsequent segments of multiple segment link 512 by passing through each of any intermediate multi-port controllers 501([3]-[N-1]) between that of 501[2] mounted in the second storage router type device (storage router type device 500[2]), and that of multi-port controller 501[N] of the last storage router type device (storage router type device 500[N]) in the same fashion as just described. Coming in through one terminal of the first port of the given multi-port controller of each storage router type device, the signals are then internally re-routed to emerge via the out terminal of the last port of the same controller, and so on, until the final multi-port controller (multi-port controller 501[N]) is encountered. There, the signals will pass along the final segment of link 512 into terminal 503[N,1,in] of the first port (port 502[N,1]) of multi-port controller 501[N]. Rather than being internally routed within multi-port controller 501[N] through to the last port therein (port 502[N,N]), as per the foregoing pattern, the signals are diverted to exit via terminal 503[N,1,out] of the same port (port 502[N,1]), to then begin a return journey across the array of storage router type devices (storage router type devices 500([N]-[1])), along a path made up of various segments of multiple segment link 513.
[0187] The signals follow a path which is a loop-back traversing each of the same ports encountered by link segments 512, though in the reverse order, and in each case, via the terminal of each port respectively not listed thus far in the description regarding FIG. 5.
[0188] Typically, as shown for storage router type device 500[2], the signals pass via a link segment 513 into terminal 503[2,N,in] of the last port (port 502[2,N]) of multi-port controller 501[2], whereupon they are re-routed internally within multi-port controller 501[2] to emerge at terminal 503[2,1,out] of the first port (port 502[2,1]) of this same controller.
[0189] Next, the signals reach the first storage router type device (storage router type device 500[1]), thus making this segment of incoming link 513 its final segment. Entering via terminal 503[1,N,in] of the last port (port 502[1,N]) of multi-port controller 501[1], the signals are internally routed within multi-port controller 501[1] to exit via terminal 503[1,1,out] of its first port (port 502[1,1]). The signals then cross over via link 514 where the loop continues in segments of a multiple segment link 515, sequentially coupling hosts 520([N]-[2]), eventually having the circuit completed at host 520[1].
[0190] Incoming via terminal 527[N,in] of port 526[N] of host I/O interface controller 525[N] of the final host (host 520[N]), the signals re-emerge via terminal 527[N,out], whereupon the signals continue along a path made up of a series of successive link segments 515 which will sequentially traverse each I/O interface controller 525([N]-[2]) of the array of hosts 520([N]-[2]), finally reaching I/O interface controller 525[1] of the first host (host 520[1]), corresponding to that from which tracing of the signal path commenced.
[0191] Typically, as shown for host 520[2], the signals being carried along segments of multiple segment link 515 enter host I/O interface controller (e.g., host I/O interface controller 525[2]) via a terminal (e.g., terminal 527[2,in] of port 526[2]), and are then passed out via a terminal (e.g., terminal 527[2,out] of that same port). From here, the signals are returned to host interface controller (e.g., host interface controller 525[1] contained in the first host 520[1]), entering via a terminal (e.g., terminal 527[1,in] of port 526[1]), thus making this incoming segment of link 515 its final segment, and also completing the host-side network loop.
[0192] As will be apparent to one of skill in the art, each one of the multiple segments of backbone links 512-513 as well as link 511, is typically substantially similar, if not identical, to other of multiple segments of link 515, as well as link 514. In this particular embodiment, the identity may be extended to cover each and all of the links 516 to 519 and then 521 also. (However, as discussed in several paragraphs subsequent, such is not necessarily the case in a heterogeneous environment.) As coupled in FIG. 5, storage router type devices 500([1]-[N]), containing multi-port controllers 501([1]-[N]) and 505([1]-[N]), and their associated links, form a SAN with a primary loop configuration (i.e., SAN 510).
[0193] As mentioned earlier, while employment of the fibre channel transport layer protocol is implied here, the invention should not be considered to be limited only to such an embodiment. In any case, these links could be collectively considered to be an in-band channel of communications, specifically between the storage router type devices, as any communications data shares the available bandwidth with the distinct I/O signals data traveling in both directions between the hosts and the storage arrays/subsystems.
[0194] Not mentioned thus far however, but readily supported by a DSMP Architecture of a kind similar to that depicted in FIG. 5, is the provision for accommodating hosts and storage equipment subscribing to mixed protocols, and connecting them into the same SAN. By having storage router type devices with interchangeable multi-port controller modules, these devices may be fitted with different combinations of modules designed with any one of several common types of port hardware (also implementing appropriately matching embedded firmware).
[0195] Demonstrating a need for such embodiments of DSMPA are several common heterogeneous commercial/industrial IT configuration requirements, among them being:
[0196] a combination of a dual-port FC controller on the host side, with dual-port SSA controller on the storage side, to support the IBM proprietary ring-based Serial Storage Architecture;
[0197] a combination of a dual-port FC controller, with a dual terminal SCSI controller, which can be used in either host-side versus storage-side orientation; namely linking SCSI-based hosts to an FC-based SAN, or alternatively, coupling SCSI-based JBOD style storage equipment to an FC-based SAN.
[0198] The implementation of such equipment resolves some previously untenable problems. Such equipment not only capably fulfills a critical storage support role--that of providing essential bridging functionality across otherwise incompatible transport protocols--but also provides the wide range of extra advantages for a SAN associated with employing the DSMP Architecture, as discussed in prior sections of this document.
[0199] By arranging appropriate coupling of the storage router type devices, the interconnection of environments of heterogeneous storage arrays/subsystems, of heterogeneous hosts, or of a mixture of both is supported. Provided that adjacent storage router type devices are coupled by controllers of matching transport layer protocol (port interconnect hardware), such storage router type devices can be interconnected in cascade, or alternatively daisy chain, configurations with other storage router type devices, as necessary to establish a common loop (or other topology) coupling, which will form the backbone of the SAN overall. As implied by the discussion regarding FIGS. 3 to 5, the preferable, though not exclusive, protocol for this purpose is SCSI upper layer over fibre channel transport (lower) layer.
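The requirement that adjacent storage router type devices be coupled by controllers of matching transport layer protocol can be checked with a short sketch. The representation of each device as a set of protocol names, and the function name `first_uncoupled_pair`, are assumptions made for illustration only.

```python
from typing import List, Optional, Set, Tuple


def first_uncoupled_pair(chain: List[Set[str]]) -> Optional[Tuple[int, int]]:
    """Return the indices of the first adjacent pair of devices that share
    no transport protocol, or None if the whole chain can be coupled.

    Each element of `chain` is the (hypothetical) set of protocols offered
    by the controller modules fitted to one storage router type device.
    """
    for i in range(len(chain) - 1):
        if not (chain[i] & chain[i + 1]):
            return (i, i + 1)
    return None


if __name__ == "__main__":
    # FC/SSA and FC/SCSI routers cascaded over a common FC backbone.
    chain = [{"FC", "SSA"}, {"FC", "SCSI"}, {"FC"}]
    print(first_uncoupled_pair(chain))
```

A result of `None` indicates that every adjacent pair has at least one protocol in common; it says nothing about which protocol is chosen for each coupling.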
[0200] Despite preference for employing the fibre channel transport layer protocol, the invention should not be considered as being limited thereto. In any case, these links can be collectively considered to comprise the in-band channel of communications, specifically amongst the storage router type devices, as any communications data is sharing the available bandwidth with the distinct I/O signals data traveling in both directions between the hosts and the storage arrays/subsystems.
[0201] It will be noted that coupling to network 510 provides more generic flexibility in this configuration, when compared to that depicted in FIG. 3, because each host and subsystem is coupled into the network by one of the multi-port controllers contained within the storage router type device. Thus, addition or removal of any type of element, in relation to the SAN, becomes a relatively straight-forward task, with predictable results, thus eliminating uncertainties from the sphere of concern for administrative tasks via the management platform. These tasks are associated with prescribed repeatable methods, depending upon whether such an element is a host, or a storage array/subsystem, or even some other device.
[0202] Each storage router type device thus becomes a stand-alone building block, from which SANs of various topologies can easily be configured. This extensibility is desirable, for example, because such extensibility simplifies tasks associated with building, maintaining and administering a storage network, providing powerful capabilities with regard to structuring a SAN for optimal performance, reliability, ease of maintenance, and other such advantages.
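As a rough illustration of why extension is straightforward in a loop topology, the sketch below derives the links of a ring from an ordered list of elements and shows that inserting one element disturbs only the links adjacent to it. The function `ring_links` and the element names are hypothetical and are not taken from the source.

```python
from typing import List, Tuple


def ring_links(nodes: List[str]) -> List[Tuple[str, str]]:
    """Return the directed links of a ring through `nodes`, in order.

    Fewer than two nodes cannot form a loop, so no links are returned.
    """
    if len(nodes) < 2:
        return []
    count = len(nodes)
    return [(nodes[i], nodes[(i + 1) % count]) for i in range(count)]


if __name__ == "__main__":
    before = ring_links(["router1", "router2", "router3"])
    after = ring_links(["router1", "router2", "router3", "router4"])
    # Links present both before and after the insertion are untouched.
    print(sorted(set(before) & set(after)))
```

In this model, adding a fourth router replaces the single closing link with two new links and leaves the remaining links as they were.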
[0203] FIG. 6 is a schematic diagram which cumulatively combines various DSMPA facets introduced progressively in prior figures, that might make up a SAN 610. The storage router type device is referred to herein as a Storage Virtualization Engine (SVE) device, this being a key aspect of a preferable embodiment of a DSMP Architecture according to the present invention. Furthermore, another feature of this embodiment is that the controllers, rather than having a non-specific multiplicity of ports, are shown as dual-port fibre channel controllers. However with an intent not to limit the scope of the invention, it should be understood that this device is essentially a type of storage router, so there may be other combinations of numbers of controllers, and ports per controller, supporting other protocols, which are equally well able to be successfully implemented in practice, to which the SVE term may be transferred.
[0204] Depicted is another example of connectivity between a number of host computers and a number of storage drives array/subsystems through SVE devices 600([1]-[N]) of a SAN 610, which may be coupled in the following ways.
[0205] a) On the host side, by links 611-615, in a path set forming part of the primary network backbone of SAN 610, the purpose of which is to provide for one portion of the coupling of hosts 620([1]-[N]) to a storage array/subsystem 630([1]-[N]). These SVE devices are shown coupled in this path set (preferably a loop), which is associated with the host side, via ports 602([1,1]-[N,2]) of dual-port controllers 601([1]-[N]), contained therein. Hosts 620([1]-[N]) are coupled into this loop each through one of host I/O storage network interface controllers 625([1]-[N]), which may be otherwise referred to as host bus adapters (HBAs).
[0206] b) On the device side, via links 616-619 as well as link 621, in a path set forming the second part of the primary network backbone of SAN 610, the purpose of which is to provide the remaining portion of the coupling of hosts 620([1]-[N]) to a storage array/subsystem 630([1]-[N]). These SVE devices are shown coupled in this path set (preferably a loop), which is associated with the device side, via ports 606([1,1]-[N,2]) of multi-port controllers 605([1]-[N]), contained therein. The storage drive arrays/subsystems are coupled into this loop each through storage subsystem controllers 631([1]-[N])--this latter item however, as obvious to one of skill in the art, not being a part of the DSMPA invention.
[0207] c) Via links 641 within SAN 610, to allow independent coupling of the elements with a separate secondary (not necessarily loop topology) or sub-network (a network 640). Each one of hosts 620([1]-[N]), and the subsystem interface devices 600([1]-[N]) is coupled via links 641 to a separate independent network (again, network 640), each respectively through a particular type of network I/F card (NIC), with NICs 628([1]-[N]) selected for compatibility with particular hosts 620([1]-[N]), and NICs 608([1]-[N]) integrally designed and installed to match with SVE devices 600([1]-[N]).
[0208] In the case of the primary network backbone coupling regime, employment of fibre channel as the lower-layer protocol of SAN 610 is suggested as a particular embodiment, although the invention is not limited to such configurations. An example of a network that can be implied in the latter case is that of TCP-IP via a LAN, although a different kind of network protocol could easily be employed instead to provide for this alternative secondary network connectivity.
[0209] This embodiment of the invention capitalizes on the ability provided by the DSMP Architecture for inter-device communication across separate, independent channels (and elaborates on the scheme of FIG. 4, which shows an intermediate construct embodiment used to assist with preliminarily explaining the DSMP Architecture concept). Hence, this embodiment of the invention as shown in FIG. 6 (i.e., an SVE device consisting of, amongst other non-specific components, a single dual-port fibre channel controller plus a network I/F card) enables an extra level of management flexibility and operational redundancy beyond the one shown in FIG. 5.
[0210] As established in FIG. 5, the generic pattern of the looped path of links connecting an array of hosts 620([1]-[N]) to an array of SVE devices 600([1]-[N]) reappears, though now each specifically contains a dual-port controller 601([1]-[N]), these various facets being defining features of the DSMP Architecture utilizing a fibre channel transport layer carrying data in either a uni-directional or bi-directional manner.
[0211] Commencing with link 611, which begins at terminal 627[1,out] of port 626[1] of host I/O interface controller 625[1] for host 620[1], the signals go to terminal 603[1,1,in] of the first port (port 602[1,1]), contained as part of dual-port controller 601[1] of SVE device 600[1]. From there, the signal can be followed by tracing the links 611-615 sequentially. Signals are internally routed across dual-port controller 601[1] to emerge at terminal 603[1,2,out] of port 602[1,2], where the first segment of multiple segment link 612 commences. This segment of link 612 joins with terminal 603[2,1,in] of the first port (port 602[2,1]) of the dual-port controller 601[2] of the next SVE device (SVE device 600[2]), being re-routed internally within dual-port controller 601[2], then emerging via terminal 603[2,2,out] of second port 602[2,2].
[0212] Signals continue along subsequent segments of multiple segment link 612 by passing through each of any intermediate dual-port controllers 601([3]-[N-1]) between that of 601[2] mounted in the second SVE device (SVE device 600[2]), and that of dual-port controller 601[N] of the last SVE device (SVE device 600[N]) in the fashion just described. Coming in through one terminal of the first port of the given dual-port controller of each SVE device, the signals are then internally re-routed to emerge via the out terminal of the second port of the same controller, and so on, until the final dual-port controller (dual-port controller 601[N]) is encountered. There, the signals will pass along the final segment of link 612 into terminal 603[N,1,in] of the first port (port 602[N,1]) of dual-port controller 601[N]. Rather than being internally routed within dual-port controller 601[N] through to the second port therein (port 602[N,2]), as per the foregoing pattern, the signals are diverted to exit via terminal 603[N,1,out] of the same port (port 602[N,1]), to then begin a return journey across the array of SVE devices (SVE devices 600([N]-[1])), along a path made up of various segments of multiple segment link 613.
[0213] The signals follow a path which is a loop-back traversing each of the same ports encountered by link segments 612, though in the reverse order, and in each case, via the terminal of each port respectively not listed thus far in the description regarding FIG. 6.
[0214] Typically, as shown for SVE device 600[2], the signals pass via a link segment 613 into terminal 603[2,2,in] of the second port (port 602[2,2]) of dual-port controller 601[2], whereupon they are re-routed internally within dual-port controller 601[2] to emerge at terminal 603[2,1,out] of the first port (port 602[2,1]) of this same controller.
[0215] Next, the signals reach the first SVE device (SVE device 600[1]), thus making this segment of incoming link 613 its final segment. Entering via terminal 603[1,2,in] of the second port (port 602[1,2]) of dual-port controller 601[1], the signals are internally routed within dual-port controller 601[1] to exit via terminal 603[1,1,out] of its first port (port 602[1,1]). The signals then cross over via link 614 where the loop continues in segments of a multiple segment link 615, sequentially coupling hosts 620([N]-[2]), eventually having the circuit completed at host 620[1].
[0216] Incoming via terminal 627[N,in] of port 626[N] of host I/O interface controller 625[N] of the final host (host 620[N]), the signals re-emerge via terminal 627[N,out], whereupon the signals continue along a path made up of series of successive link segments 615 which will sequentially traverse each I/O interface controller 625([N]-[2]) of the array of hosts 620([N]-[2]), finally reaching I/O interface controller 625[1] of the first host (host 620[1]), corresponding to that from which tracing of the signal path commenced.
[0217] Typically, as shown for host 620[2], the signals being carried along segments of multiple segment link 615 enter host I/O interface controller (e.g., host I/O interface controller 625[2]) via a terminal (e.g., terminal 627[2,in] of port 626[2]), and are then passed out via a terminal (e.g., terminal 627[2,out] of that same port). From there, signals are returned to host interface controller (e.g., host interface controller 625[1] contained in the first host 620[1]), entering via a terminal (e.g., terminal 627[1,in] of port 626[1]), thus making this incoming segment of link 615 its final segment, and also completing the host-side network loop.
[0218] As will be apparent to one of skill in the art, each one of the multiple segments of backbone links 612-613, as well as link 611, is typically substantially similar, if not identical, to other of multiple segments of link 615, as well as link 614. In this particular embodiment, the identity can be extended to cover links 616 to 619, as well as 621. (However, as discussed with regard to FIG. 5, such is not necessarily the case in a heterogeneous environment, where there may be other protocols employed in various controller module combinations within SVE devices, and SVE devices can be coupled in cascade type or daisy chain type arrangements as necessary to successfully establish complete backbone loop (or other topology) connectivity.)
[0219] As coupled in FIG. 6, SVE devices 600([1]-[N]), containing the dual-port controllers 601([1]-[N]) and 605([1]-[N]), and their associated links, form a SAN with a primary loop configuration (i.e., SAN 610). As mentioned earlier, employment of the fibre channel transport layer protocol is implied here, though the invention should not be considered to be limited to such a protocol only. In any case, these links could be collectively considered to be the in-band channel of communications, specifically between the SVE devices, as any communications data is sharing the available bandwidth with the distinct I/O signals data traveling in both directions between the hosts and the storage arrays/subsystems.
[0220] Meanwhile, the set of link branches 641, being a different type than links 611-621, maintain a coupling between elements on the host side within a sub-network, a network 640 of SAN 610. Any one of link branches 641 joining network 640 to a host element 620([1]-[N]) can be seen to pass via one of ports 629([1]-[N]) that reside on one of host NICs 628([1]-[N]). In the case of joining the network 640 to SVE devices 600([1]-[N]), any link can be seen to pass in via one of ports 609([1]-[N]) residing on one of subsystem interface device NICs 608([1]-[N]).
[0221] In this embodiment, the set of link branches of this sub-network (network 640) exists simultaneously with, and independently of, the coupling of the primary loop carrying the in-band channel. This set of sub-network link branches (i.e., set of links 641) of network 640 within SAN 610 could be collectively considered to be the out-of-band channel of communications between the SVE devices. However, other embodiments exist in which the two different sets of links (links 611-621 compared with link branches 641) could have their roles transposed. Alternatively, a loss of connectivity across the links of one network type can be compensated for by the links of the other network, which transparently assume the role of the former.
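The substitution of one channel for the other might be sketched as follows. This is an illustration under stated assumptions, not the mechanism of the described embodiment: the `Channel` enumeration, the `select_channel` function and the `link_up` status map are all hypothetical, and the text does not say how loss of connectivity is detected.

```python
from enum import Enum
from typing import Dict, Optional


class Channel(Enum):
    IN_BAND = "in-band"          # primary loop, links 611-621
    OUT_OF_BAND = "out-of-band"  # sub-network 640, link branches 641


def select_channel(preferred: Channel,
                   link_up: Dict[Channel, bool]) -> Optional[Channel]:
    """Pick the preferred channel if it is up, else fall over to the other.

    Returns None when neither channel has connectivity.
    """
    if link_up.get(preferred, False):
        return preferred
    for channel in Channel:
        if channel is not preferred and link_up.get(channel, False):
            return channel
    return None
```

Because the preferred channel is a parameter, the same function also covers the embodiments in which the roles of the two sets of links are transposed.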
[0222] It will be noted that the combined network couplings within SAN 610 provide complete generic flexibility in this configuration, fully developed upon that of SAN 410 depicted in FIG. 4. Each host and storage array/subsystem is coupled into the network by one of the dual-port controllers contained within the SVE device. Addition to or removal from the SAN of any type of element (a basic facet of SAN scalability/extensibility) becomes a relatively straightforward task, the associated network interruption being well-managed (results predictable--seamless and with rapid ramp-up or system restoration). Thus uncertainties are eliminated from the sphere of concern for administrative tasks via this management platform. These tasks are associated with prescribed repeatable methods, depending upon whether such an element is a host, or a storage array/subsystem, or can even be some other device that resides within or outside the SAN boundary.
[0223] Each SVE device thus becomes a stand-alone building block, from which SANs of various topologies can easily be configured. This extensibility is desirable, for example, because such extensibility simplifies tasks associated with building and maintaining and administering a storage network, providing powerful capabilities with regard to structuring a SAN for optimal performance, reliability, ease of maintenance and other such advantages.
[0224] Component Level Detail of a Multi-port Fibre Channel Controller Embodiment
[0225] In discussing these links, between controllers, ports and ultimately the terminals, it is necessary to elaborate further by discussing some of the internal electronic mechanisms. Amongst other benefits, this will clarify some fundamental issues, which are a potential source of confusion, due to common variations in nomenclature usage disseminated by, and amongst, those skilled in the art. An issue central to the invention being properly understood is recognition of the distinction between what is a port in the physical sense, when discussing a connector consisting of a terminal pair, and what is a port in the logical context, when discussing SCSI upper protocol over a fibre channel transport layer.
[0226] Useful for these purposes is FIG. 7, which is a block diagram outlining the hardware that makes up an SVE device (being a type of storage router), which plays an important role as a building block in a SAN that employs the DSMP Architecture.
[0227] Typically, an SVE device of the form described in the embodiments of FIG. 6 may consist of several circuit boards, with perhaps one being subservient to the governing or processing functionality of the other. Given this relationship, the former might be considered to be a daughterboard, and the latter a motherboard. In those earlier illustrated embodiments, the daughterboard may be dedicated to connectivity with the host-side topology of the SAN, for example, while the motherboard may be associated with the links to elements in the storage side of the SAN. However, it may be common for some versions of such an SVE device to be installed in the opposite manner, such that the motherboard is coupled with the host side, and the daughterboard with the device side.
[0228] Turning to FIG. 7, a daughterboard 750 embodies the hardware essential to a multi-port fibre channel controller (MPFCC), in a basic dual-port form embodiment. Such an MPFCC consists of a fibre channel controller chip 751 (of which a typically preferable commercial example is an IC referred to as a Tachyon--manufactured by HP/Agilent Technologies of Palo Alto, Calif., as part no.: TL5000D), adjoined by a synchronous static random access memory (SS RAM) chip 752, coupled through to a pair of FC port connectors 755[1] and 755[2], each via a FC port bypass circuit chip 754 (one to each connector 755[1] and 755[2] respectively), through a transceiver chip 753. It will be noted that these two FC port connectors can correspond to any of the pairs of ports 602[i,1] and 602[i,2], as shown in each of the host side dual port fibre channel controllers of FIG. 6.
[0229] These components are found repeated collectively as a subset portion of a motherboard 760. Here, the principal component, being a second fibre channel controller chip 761 (indistinguishable from chip 751 found on daughterboard 750), is adjoined by another SS RAM chip 762, and coupled through to another pair of FC port connectors 765[1] and 765[2], each via a FC port bypass circuit chip 764 (one to each of connectors 765[1] and 765[2], respectively), through another transceiver chip 763. It will be noted that these two FC port connectors can correspond to any of the pairs of ports 606[i,1] and 606[i,2], as shown in each of the storage side dual port fibre channel controllers of FIG. 6.
[0230] The intelligent decision-making characteristics of the DSMPA emanate from these and several other components shown here, the critical ones being a local bus field programmable gate array (LB FPGA) peripheral component interconnect (PCI) arbiter 772, and a central processor unit (CPU) 773. Motherboard 760 supports two PCI buses--a primary PCI bus 770a, and a secondary PCI bus 770b, allowing electronic signals to pass between the various components.
[0231] There is a noteworthy distinction between components 750-754 of the daughterboard (supporting the FC port connector pair 755[i]), and those of the collective components group 760-764 (feeding FC connector port pair 765[i]). Those components residing on daughterboard 750 are coupled through fibre channel controller chip 751 via secondary PCI bus 770b and a corresponding secondary PCI FPGA 771b, whilst the latter group mounted on motherboard 760 are coupled through fibre channel controller chip 761 via primary PCI bus 770a, and a corresponding primary PCI FPGA 771a.
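The two controller groups and their PCI attachments can be summarized as a small data structure. The sketch below is purely illustrative: the `FcControllerGroup` class and `group_for_connector` function are hypothetical names, while the reference numerals are taken from the description of FIG. 7.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FcControllerGroup:
    """One fibre channel controller and its supporting components."""
    board: str
    controller_chip: int
    ss_ram: int
    transceiver: int
    bypass_circuit: int
    connectors: Tuple[str, str]
    pci_bus: str
    pci_fpga: str


# Reference numerals as given in the description of FIG. 7.
DAUGHTERBOARD = FcControllerGroup(
    "daughterboard 750", 751, 752, 753, 754,
    ("755[1]", "755[2]"), "secondary PCI bus 770b", "771b")
MOTHERBOARD = FcControllerGroup(
    "motherboard 760", 761, 762, 763, 764,
    ("765[1]", "765[2]"), "primary PCI bus 770a", "771a")


def group_for_connector(connector: str) -> FcControllerGroup:
    """Return the controller group that serves a given FC port connector."""
    for group in (DAUGHTERBOARD, MOTHERBOARD):
        if connector in group.connectors:
            return group
    raise KeyError(connector)
```

Looking up connector 765[2], for example, yields the motherboard group, whose controller chip 761 is reached via primary PCI bus 770a and primary PCI FPGA 771a.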
[0232] Both PCI buses feed out from the LB FPGA PCI arbiter and terminate in a synchronous dynamic memory (SD RAM) chip 774. The other noteworthy elements are a programmable read only memory (PROM) chip 775 through which instructions for LB FPGA PCI arbiter 772 are set, and several other memory components--an SS RAM chip 776, branching off the link between LB FPGA PCI arbiter 772 and CPU 773, as well as a flash memory chip 777, and a non-volatile memory (NV RAM) chip 778, both of these connected in parallel to the path through LB FPGA PCI arbiter 772, bridging secondary PCI bus 770b to CPU 773.
[0233] Finally, there are also two different ports separate from those of the fibre channel (or whichever other primary in-band channel of communications is established between hosts and storage), each providing an independent facility for management or maintenance functions of the SVE device. One of these is an Ethernet port 790, the participation of which has already been discussed in each of FIG. 4 and FIG. 6 (this being embodied therein by port 409[i] and port 609[i], respectively). The other is a serial port 780, the role of which is important in initially establishing some aspects of configuration in a SAN. However, the serial port 780 has a less significant role in the ongoing operations of the SVE device in a SAN employing a DSMPA, and so, lacking inventive novelty in the present invention context, does not warrant specific discussion beyond this mention.
[0234] As can be seen, motherboard 760 and daughterboard 750 each consist of a variety of interconnected elements of electronic hardware. The focus (from the perspective of defining the multi-port fibre channel controller, a key element of the SVE device, which is itself a key device of the DSMP Architecture) is nevertheless the manner of harnessing the powerful functionality of the single fibre channel controller chips 751 and 761 on each board, respectively, and their possible means of facilitating capable interaction with the other components mounted thereon.
[0235] There is software--which may be more precisely referred to as firmware, (to distinguish it from the code which usually resides on read/write storage media on a computer)--embedded in flash memory chip 777. This firmware prescribes instructions for the manner in which signals are processed by controllers 761 and 751, also describing how the signals may be transmitted and/or distributed amongst other components residing on motherboard 760, and on daughterboard 750. These instructions provide the underlying basis for defining the operation of the DSMP Architecture.
[0236] Hosts and storage connected to a SAN employing a DSMP Architecture according to embodiments of the present invention, with SVE devices as the nodes of its framework, may only be observed to behave and display the inherently intelligent, powerful and flexible characteristics associated with a DSMPA when the electronic components supporting each FC controller (the controller in a collective sense, including either those components on the daughterboard 750 (items 751-755), or the group of components on the motherboard 760 (items 761-765) as per FIG. 7) work together as prescribed by the firmware.
[0237] Flowing from host to storage along the fibre channel path through any one SVE device is signal data, which may or may not be in a packetized form. This data, which can flow at rates of the order of 100 megabytes per second or more, is received and, with minimal latency, retransmitted onwards, perhaps even redirected along multiple duplicated streams. In the meantime, though, a copy of certain bit portions of the data packet may need to be collected and saved to one of the memory chips (for example SD RAM chip 774, SS RAM chip 776, or NV RAM chip 778), the governing decision for which is made by a component such as the LB FPGA PCI arbiter 772, based on deciphering of certain other bit portions of a data packet, and comparing those deciphered bit portions with information stored in primary PCI FPGA 771a, or in secondary PCI FPGA 771b.
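The forward-and-selectively-copy behavior described above might be sketched as follows. Everything about the packet layout here is an assumption made for illustration: the text does not specify which bit portions are deciphered or saved, so the sketch treats the first byte as the deciphered key. The `CaptureRule` type, the `handle_frame` function and the rule table (standing in for the information held in PCI FPGA 771a or 771b) are hypothetical.

```python
from typing import Dict, List, NamedTuple


class CaptureRule(NamedTuple):
    """Hypothetical rule: which slice of a packet to save, and where."""
    offset: int
    length: int
    memory: str


def handle_frame(frame: bytes,
                 rules: Dict[int, CaptureRule],
                 memories: Dict[str, List[bytes]]) -> bytes:
    """Forward the frame unchanged; save a slice if its key matches a rule.

    Assumption: the first byte of the frame is the deciphered key that is
    compared against the stored rule table.
    """
    if frame:
        rule = rules.get(frame[0])
        if rule is not None:
            # Copy only the selected portion; the frame itself is untouched.
            portion = frame[rule.offset:rule.offset + rule.length]
            memories[rule.memory].append(portion)
    return frame
```

The frame is always returned unmodified, reflecting that retransmission proceeds regardless of whether a copy is taken.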
[0238] These storage router type device-based intelligent data handling and decision making processes depend on communications signals flowing between adjacent FC controllers coupled in the loop (or other topology) link paths described earlier, which share the link bandwidth simultaneously and in an uninhibiting manner with the I/O data also passing along this channel. Alternatively, if links of the secondary network (or sub-network) are additionally established, then the management communications signals can proceed exclusively, or concurrently, via these links, a regime with the potential to provide complete redundancy to protect against equipment or individual channel failure.
[0239] An important feature of the invention is the manner in which adjacent fibre channel controllers (residing within neighboring SVE devices), interact with each other, not only acting as conduit nodes for the unrestricted passage of I/O flowing from servers to storage media, but also sharing between them the storage virtualization information stored within databases on the memory chips internal to each of them. Such information, relating to management of the storage media and management of the SVE devices themselves, can be termed meta-data, describing customized configurations defined by an administrative user, including mapping and SCSI states. This characteristic is mentioned again subsequently in a comparison between various forms of architecture which may be employed in a SAN, distinguishing how DSMP Architecture differs from those existing.
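One way to picture the sharing of meta-data between neighboring SVE devices is the sketch below. It is an illustration only: the text does not describe the exchange protocol, so the use of per-entry version numbers and a newest-version-wins merge is an assumption, and `merge_metadata` and `sync_neighbors` are hypothetical names.

```python
from typing import Dict, Tuple

# Hypothetical store: key -> (version, value), e.g. a mapping entry
# or a SCSI state held in a device's internal database.
MetaStore = Dict[str, Tuple[int, str]]


def merge_metadata(local: MetaStore, remote: MetaStore) -> MetaStore:
    """Return a merged copy, keeping the higher-versioned entry per key.

    Assumption: entries carry a version number; on a tie the local
    entry is kept.
    """
    merged = dict(local)
    for key, (version, value) in remote.items():
        if key not in merged or version > merged[key][0]:
            merged[key] = (version, value)
    return merged


def sync_neighbors(a: MetaStore, b: MetaStore) -> None:
    """Bring two neighboring stores to the same merged contents in place."""
    merged = merge_metadata(a, b)
    a.clear()
    a.update(merged)
    b.clear()
    b.update(merged)
```

After `sync_neighbors` runs, both stores hold identical entries, which loosely mirrors each device holding the shared virtualization information rather than relying on a single central copy.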
[0240] Within the regime of optical signal connectivity, there are two separate fibre channel connector ports mounted on each controller board (enveloping either ports 755[i] or ports 765[i]--corresponding to either daughterboard 750 or motherboard 760 respectively), each of these itself having a terminal pair--one an incoming (RX) optical terminal, and the other an outgoing (TX) optical terminal. While such components are shown drawn by outline in FIG. 7, they are not identified by label ther