United States Patent 5187787
Skeen, et al. February 16, 1993

Title

Apparatus and method for providing decoupling of data exchange details for providing high performance communication between software processes

Abstract

A communication interface for decoupling one software application from another software application such that communications between applications are facilitated and applications may be developed in modularized fashion. The communication interface is comprised of two libraries of programs. One library manages self-describing forms which contain actual data to be exchanged as well as type information regarding data format and class definitions that contain semantic information. Another library manages communications and includes a subject mapper to receive subscription requests regarding a particular subject and map them to particular communication disciplines and to particular services supplying this information. A number of communication disciplines also cooperate with the subject mapper or directly with client applications to manage communications with various other applications using the communication protocols used by those other applications.


Inventors: Skeen; Marion D. (San Francisco, CA), Bowles; Mark (Woodside, CA)
Assignee: Teknekron Software Systems, Inc. (Palo Alto, CA)
Appl. No.: 386584
Filed: July 27, 1989

Current U.S. Class: 719/314
Field of Search: 364/DIG.1, DIG.2; 395/600

U.S. Patent Documents
4363093    December 1982    Davis et al.
4688170    August 1987      Waite et al.
4718005    January 1988     Feigenbaum et al.
4815030    March 1989       Cross et al.
4851988    July 1989        Trottier et al.
4914583    April 1990       Weisshaar et al.
4937784    June 1990        Masai et al.
4975830    December 1990    Gerpheide et al.
4992972    February 1991    Brooks et al.
4999771    March 1991       Ralph et al.
5062037    October 1991     Shorter et al.
5073852    December 1991    Siegel et al.
5101406    March 1992       Messenger
Other References
TIB Reference Manual, "The Teknekron Information Bus™: Programmer's Reference Manual," Version 1.1, Sep. 7, 1989, pp. 1-46.
"BASIS Application Programming Interface (AIP)," pp. 1-82.
"BASIS Objectives, Environments, Concepts Functions, Value for Business Partners and Customers," IBM Confidential.
DataTrade R1, "Lans Lans/Wans," Aug. 23, 1990, pp. 1-4.
DataTrade R1, "Lans DT R1 Software Components," Aug. 23, 1990, pp. 1-7.
DataTrade R1, "Lans DT R1 Network Architecture," Aug. 23, 1990, pp. 1-14.
DataTrade R1, "Lans Broadcast Concepts," Aug. 23, 1990, pp. 1-9.
DataTrade R1, "Lans Broadcast Performance," Aug. 23, 1990, pp. 1-3.
DataTrade R1, "Lans Point-Point Concepts," Aug. 23, 1990, pp. 1-4.
DataTrade R1, "Lans Security," Aug. 23, 1990, pp. 1-4.
DataTrade R1, "API Overview," Jun. 6, 1990, pp. 1-11.
DataTrade R1, "API Datatrade API Verbs," Jun. 6, 1990, pp. 1-14.
DataTrade R1, "DataTrade Using DataTrade: APs," Aug. 23, 1990, pp. 1-14.
"Delivering Integrated Solutions," 6 pages.
Digital, "PAMS Message Bus for VAX/VMS," May 11, 1990, pp. 1-3.
Howard Kilman and Glen Macko, "An Architectural Perspective of a Common Distributed Heterogeneous Message Bus," 1987, pp. 171-184.
Glen Macko, "Developing a Message Bus for Integrating VMS High Speed Task to Task Communications," Fall 1986, pp. 339-347.
Steven G. Judd, "A Practical Approach to Developing Client-Server Applications Among VAX/VMS, CICS/VS, and IMS/VS LU6.2 Applications Made Easy," Spring 1990, pp. 95-112.
Product Insight, "Don't Miss the Latest Message Bus, VAXPAMSV2.5," Jun. 1989, pp. 18-21.
Digital Equipment Corporation, "Digital Packaged Application Software Description PASD PASD Name: VAX-PAMS PASD: US.002.002," Version 2.5, Dec. 5, 1989, pp. 1-8.
Digital Equipment Corporation, "PAMS Basic Call Set PAMS Message BUS Efficient Task-to-Task Communication," Jul. 1989, pp. 1-25.
Digital Equipment Corporation, "Package Application Software Description for ULTRIX-PAMS," Version 1.2, Dec. 5, 1989, pp. 1-7.
Digital Equipment Corporation, "Package Application Software Description for PC-PAMS," Version 1.2, Dec. 5, 1989, pp. 1-7.
Digital Equipment Corporation, "PAMS Self-Maintenance Service Description," Apr. 3, 1990, pp. 1-3.
Digital Equipment Corporation, "LU6.2 PAMS Self-Maintenance Service Description," Apr. 3, 1990, pp. 1-3.
Digital Equipment Corporation, "PAMS Installation and Orientation Service Description," Jan. 31, 1989, pp. 1-3.
Digital Equipment Corporation, "PAMS LU6.2 Installation and Orientation Service Description," Apr. 19, 1990, pp. 1-3.
Digital Equipment Corporation, "Package Application Software Description for PAMS LU6.2," Version 2.1, Apr. 19, 1990, pp. 1-18.
David Gelernter, "The Metamorphosis of Information Management," Scientific American, Aug. 1989, pp. 66-73.
Primary Examiner: Heckler; Thomas M.
Attorney, Agent or Firm: Fish; Ronald Craig

Claims


What is claimed is:
1. An apparatus for facilitating data exchange between one or more data consuming and data publishing processes running on one or more computers coupled by any data exchange medium, comprising:
a first software layer for execution on said one or more computers and coupled to all said processes for implementing data distribution decoupling between said processes such that no data consuming process need include any software routine to determine the address of any particular process which is publishing data desired by said data consuming process, but may simply request desired data by subject and said subject will then be automatically mapped to the appropriate one or more data publishing processes publishing data on that subject and the appropriate one or more communication protocols necessary to communicate with said data publishing processes;
a second software layer for execution on said one or more computers and coupled to all said processes and to said first software layer for implementing service protocol decoupling between said processes such that no data consuming or data publishing process need include any software routine to implement a communication protocol necessary to communicate with any other data consuming or data publishing process, all said communication protocols being encoded in service discipline programs forming part of said second software layer, said second software layer including means for receiving information from said first software layer regarding the subject upon which data is requested, and including one or more software routines for establishing communication through said data exchange medium such that only data on said requested subject is received by said data consuming process which requested data on said subject; and
a third software layer coupled to all said processes and to said second software layer for implementing data format decoupling between said processes such that processes communicating with each other need not include software routines to translate between differing data representation formats and data record organizations used in data objects being exchanged by said communicating processes, all necessary translation from the data representation used by a data publishing process to a data representation used by the data consuming process which requested data on the subject being carried out by said third software layer.

2. The apparatus of claim 1 further comprising communication layer software coupled to said second software layer for providing reliable communications between said processes such that no data consuming or data producing process need include any software routine to perform reliable communications between processes, and wherein said third software layer includes means for receiving requests to retrieve data from a particular named field of data from a data record published by a data publishing process and for determining the location of said field within said data record and retrieving the data from said field and returning it to the process which requested it.

3. An apparatus for coupling data between processes running in a computing environment having
a first central processing unit;
a second central processing unit;
one or more application processes which are data consumers running on said first and/or second central processing units and having at least one software routine to generate a subscription request for data on a particular subject;
one or more service processes running on one or more of said first and/or second central processing units, each having an access protocol and each supplying data on a particular subject or group of subjects;
one or more data exchange media such as shared memory, and/or shared distributed memory, and/or local area networks and/or wide area networks coupling said first and second central processing units together, each data exchange media having a communication protocol, said apparatus for coupling data comprising:
one or more subject-based addressing programs coupled at least to said application processes for execution on said first and/or second central processing units for receiving subscription requests from said application processes, each subscription request requesting data on a particular subject and including means for mapping the subject of each said subscription request to the network address and/or identity of one or more of said service processes which supply data on said subject, and for generating a request that at least one communication link per subscription request to the appropriate service process be established on each said subject with one or more of said service processes which provides data on said subject;
one or more service discipline programs for execution on said first and/or second central processing units which receive said one or more requests to establish a communication link on each said subject, for sending a request to one or more of said service process or processes using the access protocols for traversing the appropriate data exchange media to establish a communication link and for communicating with the selected service process or processes so as to establish a subscription for data on each said subject with the appropriate service process or processes, said service discipline programs for continuing to assist in passing data on each said subject to the appropriate application process that originated the subscription request until said subscription request on said subject is cancelled.

4. The apparatus of claim 3 wherein said subject-based addressing programs are also coupled to each service process and include software routines which maintain records of all active subscription subjects and which support subject-based addressing by filtering data messages to be transmitted over said data exchange media by subject such that data on subjects for which there are no active subscriptions is never transmitted on said media thereby conserving computing resources, and wherein said application processes include software routines to generate cancel requests naming a subject for which a subscription is to be cancelled when data on that particular subject are no longer desired by the data consuming process which issued the subscription request, and wherein said subject-based addressing programs include routines to compare the subject of each said cancel request to the list of active subscriptions and, if a match is found, to send a cancel request to the appropriate one or more service discipline programs managing communication links on that particular subject, and wherein said service discipline programs include routines to send one or more cancel requests to all service processes which are currently supplying data on a particular subject for which the subscription was cancelled.
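
As an illustration only, the record-keeping, publisher-side filtering, and cancel handling recited in claim 4 might be sketched as follows. The class name `SubscriptionRegistry`, the function `filter_outgoing`, and the subject strings are hypothetical and do not come from the patent; exact string matching on subjects is an assumption.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Hypothetical record of active subscriptions, keyed by subject."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # subject -> subscriber ids

    def subscribe(self, subject, subscriber_id):
        self._subscribers[subject].add(subscriber_id)

    def cancel(self, subject, subscriber_id):
        """Compare a cancel request against the active subscriptions.

        Returns True when a match was found and removed. In the claimed
        apparatus this is the point at which a cancel request would be
        forwarded to the service discipline managing the link.
        """
        subscribers = self._subscribers.get(subject)
        if not subscribers or subscriber_id not in subscribers:
            return False
        subscribers.discard(subscriber_id)
        if not subscribers:
            del self._subscribers[subject]  # no one left on this subject
        return True

    def should_transmit(self, subject):
        """True only if at least one subscription is active on the subject."""
        return subject in self._subscribers


def filter_outgoing(registry, messages):
    """Drop (subject, payload) messages nobody has subscribed to."""
    return [(s, p) for s, p in messages if registry.should_transmit(s)]
```

With one subscription on a single subject, `filter_outgoing` passes only messages on that subject; after the subscription is cancelled, nothing is passed to the medium.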

5. The apparatus of claim 4 wherein said mapping means includes means for sending each said subscription request and the subject thereof to a directory services program which maintains service records matching subjects to the addresses of service processes which supply data on said subjects, and wherein said directory services program includes a routine to compare the subject in each said subscription request to the subjects in said service records and to pass all service records for which there is a subject match back to said means for mapping, and wherein said means for mapping includes a routine to invoke all the service discipline programs identified by said service records passed back from said directory services program so as to set up communication links on the pertinent subject with all the service processes identified in said service records.

6. The apparatus of claim 5 wherein said service record for each said service process includes data identifying the address where said service process can be accessed through said data exchange media and the appropriate service discipline program used to communicate with said service process.

7. The apparatus of claim 6 wherein said service records also include data identifying the central processing unit upon which each service process is in execution and the address on said data exchange media of said central processing unit.
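
The directory lookup described in claims 5 through 7 can be pictured with a minimal sketch. It assumes that subjects match by exact string comparison and that each service discipline is a callable looked up by name; `ServiceRecord`, `DirectoryServices`, `map_subscription`, and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    """Hypothetical service record; field names are illustrative only."""
    subject: str
    service_address: str  # where the service process can be reached
    discipline: str       # name of the service discipline program to use
    host: str             # CPU on which the service process executes
    host_address: str     # address of that CPU on the data exchange medium


class DirectoryServices:
    def __init__(self, records):
        self._records = list(records)

    def lookup(self, subject):
        """Return every service record whose subject matches the request."""
        return [r for r in self._records if r.subject == subject]


def map_subscription(directory, disciplines, subject):
    """Invoke the discipline named in each matching record.

    `disciplines` maps a discipline name to a callable that sets up a link
    and returns some handle for it (an assumption of this sketch).
    """
    links = []
    for record in directory.lookup(subject):
        connect = disciplines[record.discipline]
        links.append(connect(record, subject))
    return links
```

Because the mapping returns one link per matching record, a subject supplied by several service processes yields several links, each set up by whichever discipline its record names.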

8. The apparatus of claim 3 wherein any service discipline program which has established an active subscription communication link with a service process which is broadcasting data on more subjects than just the subject of said active subscription includes a software routine to filter out data not pertinent to the subject of said active subscription and pass the remaining data to the subscribing application process.

9. The apparatus of claim 8 wherein said communication daemons include programs and/or software subroutines to provide low-level system support by filtering data messages to be transmitted via said data exchange media by subject.

10. The apparatus of claim 3 wherein said subject-based addressing programs include a directory services program which maintains service records which contain information on each said service process indicating which subjects on which each service process can supply data, and the address of each said service process, and the particular service discipline program encapsulating an appropriate communication protocol to access said service process, said means for mapping including means for invoking said directory services program and passing to it the subject of said subscription request, said directory services program including means for comparing the subject of said subscription request to the subjects listed in said service records and for returning all said service records for which there is a subject match to said subject based addressing program which received the subscription request, said subject based addressing program including means for examining said service record or records and identifying the appropriate one or more service discipline programs that are capable of communicating with the service processes identified in said service records and for passing said service records to the selected service discipline program or programs with said request to establish said communication link to aid said service discipline program or programs in establishing one or more subscription communication links on the requested subject.

11. The apparatus of claim 3 further comprising one or more failure recovery programs for switching between service processes supplying data on the same subject, or switching between available alternative paths if any through the networks or one or more data exchange media coupling an application process which requested data on a subject to a service process supplying data on that subject upon failure to receive the requested data so as to maintain the flow of data on the subject if possible.
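
One simple way to picture the failure recovery of claim 11 is an ordered list of alternative sources tried in turn. This is a sketch under assumptions: the patent does not specify the selection policy, and `ServiceUnavailable` and `fetch_with_failover` are hypothetical names.

```python
class ServiceUnavailable(Exception):
    """Raised by a source (service process or network path) that has failed."""


def fetch_with_failover(sources, subject):
    """Try each (name, fetch) pair in order and return the first success.

    `sources` stands in for alternative service processes or alternative
    paths that can supply data on the same subject.
    """
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch(subject)
        except ServiceUnavailable as exc:
            errors.append((name, str(exc)))  # remember why this source failed
    raise ServiceUnavailable(f"no source could supply {subject!r}: {errors}")
```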

12. The apparatus of claim 3 wherein each service process is coupled to said one or more data exchange media through a communications daemon which includes routines to filter data output by said service process in accordance with active subscriptions such that only data pertaining to the subject or subjects of one or more active subscriptions are transmitted over said data exchange media to service discipline programs managing one or more active subscription communication links.

13. The apparatus of claim 3 wherein said one or more subject-based addressing programs and said one or more service discipline programs together comprise a communication library of programs, and wherein each said application process and each said service process is linked to its own copy of said communication library programs, and further comprising a communication daemon coupling each said central processing unit to said one or more data exchange media, each daemon comprising one or more protocol engine programs linking said communication library or libraries to said one or more data exchange media, each said protocol engine implementing a communication protocol to interface a service discipline program to said data exchange media, each said protocol engine linked to an application process with an active subscription for cooperating with another protocol engine linked to a service process supplying the requested data to ensure reliable communication of data on the requested subject over said data exchange media.

14. The apparatus of claim 3 wherein at least some of said data exchange media have transport layer protocols and wherein said one or more subject-based addressing programs and said one or more service discipline programs together comprise a communication library of programs, and wherein each said application process and each said service process is linked to its own copy of said communication library programs, and further comprising one or more communication daemons, each coupling one of said central processing units to said data exchange media, and each daemon comprising one or more protocol engine programs, each said protocol engine encapsulating a communication protocol suitable for communication over at least one of said data exchange media, said protocol engines for cooperating with each other and with said transport layer protocols of said data exchange media to ensure reliable communications of data between application processes and/or said service processes, and further comprising one or more data exchange components linked to said application and service processes, each data exchange component comprised of a library of programs for execution on said first and/or second central processing units for managing data exchange between application processes and service processes in execution on either of said central processing units which use different data record formats and/or data representations by automatically performing data format conversion services such that an application process can request and receive data on a subject from a service process in a format and/or data representation which is useable by the requesting process and for freeing the requesting process of the need to convert said data request into a data format used by said server process such that the formalities of data format conversion necessary for effective data communication between application and service processes is transparent to these processes in that neither application nor service processes communicating with each other need contain software routines capable of performing said data format conversions.

15. In a computing environment having a first computer, a second computer, one or more data consuming application processes executing on said first and/or second computers for requesting and using data on a subject specified in a subscription request, and one or more service processes executing on said first or second computers, each capable of supplying data on particular subjects, and one or more data exchange media coupling said data consuming process to said service process, an apparatus for facilitating data exchange between said data consuming process and said service process, comprising:
one or more service discipline programs which encode communication protocols to communicate with particular ones of said service processes, each for receiving a request to establish communications on a particular subject and for establishing communications with a service process so as to receive data from said service process and pass only data on the requested subject to said data consuming process which issued said subscription request;
a subject based addressing program for receiving a subscription request on a particular subject and for mapping the subject to a particular service discipline and for issuing a request to establish communications on the requested subject to said service discipline.

16. An apparatus for assisting in the communication of data between computer processes in a computer network having one or more host CPU's or workstations and/or servers coupled together by a data communication structure comprising one or more local area networks and/or wide area networks and/or shared local or distributed memory, each said host CPU, server and workstation running software processes some of which are publisher processes which publish data, some of which are subscriber processes which consume data and some of which may do both, said publication of data carried out by sending data over said communication structure, comprising:
a library of communication programs a copy of which is linked to every software process, for receiving subscription requests from data consuming processes requesting data on particular subjects, and for mapping each of said subjects to an appropriate communication program which is capable of establishing communications with a publisher process which supplies data on the requested subjects, and for invoking said appropriate communication program for each said subject in said communication library to establish and maintain a communication link with at least one of said publisher processes supplying data on said subject, said communication program encapsulating all of the software communication protocols needed to access said publisher process and to provide a programmatic interface by which subscription requests to said communication library are entered by said subscriber processes; and
one or more libraries of data format decoupling programs a copy of which is linked to each said communication library and to each said software process for performing data format translation services for publisher and subscriber processes between which data exchange takes place such that any subscriber process may receive data in a format useable by said subscriber process even though the publisher process from which the data originated may use a different data record format or data representation, and for providing semantic data retrieval services whereby subscriber processes may extract desired data from specific fields of specific data records from a publisher process despite the fact that the specified data record from which data is to be retrieved may have a different data record organization without the need for any software routines in the subscriber process or the publisher process to process or translate differences in the organization of data records or differences in the field names in said data records.
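
The format translation service of claim 16 can be illustrated by converting a record between two hosts with different byte orders. Routing the conversion through a common network-order representation is an assumption of this sketch, as are the record layout `FIELDS` and the function names `to_wire` and `from_wire`.

```python
import struct

# Hypothetical record layout: a float64 price followed by an int32 volume.
FIELDS = "di"


def to_wire(host_bytes, host_order):
    """Convert a record from the sender's byte order to network order.

    `host_order` is a struct byte-order prefix: "<" for little-endian,
    ">" for big-endian.
    """
    values = struct.unpack(host_order + FIELDS, host_bytes)
    return struct.pack("!" + FIELDS, *values)


def from_wire(wire_bytes, host_order):
    """Convert a network-order record to the receiver's byte order."""
    values = struct.unpack("!" + FIELDS, wire_bytes)
    return struct.pack(host_order + FIELDS, *values)
```

Neither the publisher nor the subscriber needs to know the other's representation; each side only converts between its own format and the shared one.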

17. The apparatus of claim 16 wherein each said library of programs for data format decoupling includes one or more forms-manager programs for manipulating instances of self-describing data records each of which contains both actual data and format data, said format data being defined in one or more form class definitions, said form class definition defining the names and formats or data representation type of each field of data records in the class of data records corresponding to said form class definition and defining the organization of said fields within each instance of a data record of that class, and further includes one or more forms-class manager programs coupled to said forms-manager program or programs for manipulation of said form class definitions including a program for receiving a get-field call from a subscriber or data publishing process or from a program in said library of communication programs, said get-field call including the address of a particular data record and the name of a field therein, and for retrieving the form class definition of the class of the data record of interest and searching said form class definition for a field having the same name or a synonym of the name of the field name given in said get-field call, and for returning a relative offset pointer to the calling program, and including a program for receiving a get-data call from a subscriber or data publishing process or from a program in said library of communication programs including the relative address pointer for the location of the desired data field in the desired data record identified by said get-field call and for extracting the data in the field of the desired data record as identified by said relative address pointer and returning said data to the calling program, and including one or more format conversion programs for converting data records to be sent to a subscribing process from the data record format of the transmitting process to a data record format necessary for transmission over said communication structure, and, upon arrival at the receiving host CPU or workstation or server, for converting the data record format from the data record format used for transmission over said communication structure to the data record format used by the receiving process.
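
The get-field and get-data calls of claim 17 might look roughly like the sketch below. It assumes fixed-size fields packed in network byte order and represents the "relative offset pointer" as an (offset, type code) pair; `FormClass`, `get_field`, `get_data`, and the field names are hypothetical.

```python
import struct


class FormClass:
    """Hypothetical form class definition: ordered fields with struct codes."""

    def __init__(self, name, fields, synonyms=None):
        self.name = name
        self.fields = list(fields)            # [(field_name, struct_code), ...]
        self.synonyms = dict(synonyms or {})  # alternate name -> field name

    def get_field(self, field_name):
        """Search by name or synonym; return (relative offset, struct code)."""
        canonical = self.synonyms.get(field_name, field_name)
        offset = 0
        for name, code in self.fields:
            if name == canonical:
                return offset, code
            offset += struct.calcsize("!" + code)  # skip past this field
        raise KeyError(field_name)


def get_data(record, pointer):
    """Extract the value at the location returned by get_field."""
    offset, code = pointer
    return struct.unpack_from("!" + code, record, offset)[0]
```

A caller that knows only a field name, or a synonym for it, can therefore read the value without knowing where the field sits in the record.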

18. An apparatus for facilitating communication of data between two or more software processes in execution on the same or different computers coupled by a data exchange medium where no process needs to know the port address of any other process, comprising:
one or more computers;
a network comprised of at least one data transfer path, said network coupling said one or more computers by one or more data transfer paths;
at least one application process in execution on at least one of said computers and capable of requesting data by subject;
at least one data publishing process which may or may not be the same as said application process, said data publishing process in execution on at least one said computer and capable of outputting data on at least one subject;
a subject based addressing program including means for obtaining data on different subjects, said subject based addressing program including computer programs linked at least to each of said at least one application and data publishing processes and to said network for receiving and processing subscription requests from said at least one application process on at least one subject and for mapping said subject to an appropriate means for obtaining data on the requested subject, and for entering a subscription for data on the requested subject with said appropriate means for obtaining data on said subject, and for receiving data on the requested subject and passing the data to the appropriate said one or more application processes which requested the data.

19. The apparatus of claim 18 wherein said subject based addressing program further comprises means for issuing a command to establish a subscription communication session on said subject with one or more of said data publishing processes capable of supplying data on the requested subject, and wherein said means for obtaining data on different subjects includes a service discipline program for encapsulating a protocol for obtaining data on the subject, and for receiving said command to establish a subscription communication session on said subject, and for establishing a subscription communication session with one or more of said data publishing processes, and entering a subscription request to said one or more data publishing processes to supply data on said subject, and for receiving data on said subject and passing said data to said one or more application processes which requested data on said subject.

20. The apparatus of claim 19 wherein said one or more computers comprises at least two server computers, and wherein said at least one data publishing process comprises at least two service instances in execution on at least two of said server computers, and wherein at least two of said service instances and/or server computers require different communication protocols to communicate therewith, and wherein said service discipline program encapsulates at least two different service discipline protocols for communicating with said at least two different service instances, and wherein said subject based addressing program includes means for mapping said subject to one or more of said service discipline protocols, and further includes means for issuing a subscription request so as to cause the appropriate service discipline protocol to execute and set up a communication session so as to obtain data on said subject, each said service discipline protocol including means for passing data received on said subject to the one or more application processes which requested data on said subject.

21. The apparatus of claim 20 wherein at least some of said service discipline protocols have different fault tolerance characteristics.

22. The apparatus of claim 21 wherein said network comprises at least two networks each of which is coupled to at least some of said application and data producing processes and wherein said service discipline protocols have different fault tolerance characteristics which comprise at least automatic switchover to a different service instance capable of supplying data on the same subject upon failure of a service instance from which data is being received via a subscription, and automatic switchover to an alternate network coupled to a data producing process capable of supplying data on the same subject upon failure of the data path in use.

23. The apparatus of claim 22 wherein said different fault tolerance characteristics comprise at least automatic switchover to a different server computer having in execution thereon a service instance capable of supplying data on the requested subject upon failure of either the server computer or the service instance from which data is being received.

24. The apparatus of claim 19 wherein said one or more computers includes at least two server computers each of which has running thereon a data publishing process or service instance, at least two of said server computers and the service instances in execution thereon requiring different communication protocols to communicate therewith, and wherein said service discipline program includes means for encapsulating at least two different service discipline protocols, each for communicating with at least one of said different data publishing processes or service instances in execution on said server computers, and wherein said subject based addressing program includes means for mapping said subject to the appropriate said service discipline protocol, and for invoking said service discipline protocol so as to establish a communication session on the requested subject and, for passing data received on said subject to the appropriate one or more application processes which requested data on said subject, and further comprising at least two said networks, and means coupled to said at least two networks and to said at least one application process and said at least one data publishing process or service instance for providing network failure fault tolerance by automatically switching to an alternate network upon failure of the data transfer path being used to transfer data on said subject.

25. The apparatus of claim 24 wherein said service discipline program includes means for monitoring the continued viability of at least one said service instance as a source of data on the requested subject with which a communication session has been established, and, upon failure of the service instance to supply data on the requested subject, for automatically selecting another server computer upon which another service instance is in execution which is capable of supplying data on the requested subject, and for establishing a communication session therewith by invoking an appropriate service discipline protocol and establishing a subscription with said service instance to supply data on the requested subject, and for passing the resulting data on the requested subject to the one or more application processes which requested said data.

26. The apparatus of claim 20 or 24 wherein said service discipline program includes means for monitoring the continued viability of one or more of said server computers as a source of data on the requested subject, and, upon failure of the monitored server computer supplying data on the requested subject, for automatically selecting another server computer upon which a service instance is in execution which is capable of supplying data on said subject, and for establishing a communication session therewith on said subject by invoking an appropriate service discipline protocol, and for passing the resulting data on the requested subject to the one or more application processes which requested said data.

27. The apparatus of claim 19 wherein said one or more computers comprises at least two server computers, and wherein said at least one data publishing process comprises at least two service instances in execution on at least two of said server computers and wherein at least two of said service instances and/or server computers require different communication protocols to communicate therewith, and wherein said service discipline program includes means for encapsulating at least two different service discipline protocols for communicating with said at least two different service instances which require different protocols to communicate therewith, and wherein said subject based addressing program includes means to invoke the appropriate service discipline protocol to obtain data on a particular subject, each said service discipline program including means for passing data received on said subject to the one or more application processes which requested data on the subject, and further comprising communication means coupled at least to said service discipline program and comprising at least one protocol engine for encapsulating a network communication protocol for establishing communications over said network by interfacing said service discipline protocol to said network using the communication protocol native to said network.

28. The apparatus of claim 27 wherein said communication means comprises a plurality of protocol engines, each of which may encapsulate a different network communication protocol, each of said protocol engines being able to interface one or more of said service discipline protocols to said network using the network communication protocol encapsulated in said protocol engine.

29. The apparatus of claim 28 wherein at least some of said service discipline protocols have different fault tolerance characteristics such as automatic switchover to a different server computer supplying data on the same subject upon failure of a server computer from which data is being received via a subscription or automatic switchover to an alternate network to reach the same server computer or another server computer supplying data on the same subject upon failure of the data path currently being used to obtain data on said subject.

30. The apparatus of claim 28 wherein at least some of said protocol engines have different fault tolerance characteristics such as reliable broadcast communication protocols for verifying that all message packets of a broadcast have been successfully received and requiring rebroadcast of any lost or garbled message packets, or automatic switchover to an alternate network upon failure of the data path currently being used.

31. The apparatus of claim 19 wherein said service discipline program is coupled to a data publishing process supplying data on said subject, and supports subject based addressing by filtering data by subject so as to conserve network bandwidth.

32. The apparatus of claim 18 further comprising a data format decoupling program comprised of one or more computer programs coupled to each of said one or more application processes and to each of said one or more data publishing processes, for facilitating the transfer of data via said network between said data publishing process and said application process using self-describing data objects or forms by performing format conversion operations where the formats for the expression and organization of data records used by each computer data publishing process or application process may be different, and where said self-describing data objects each contain one or more fields and are organized into one or more classes each of which has a unique class identification, said data format decoupling program including one or more computer programs to define the general organization of each class of self-describing data objects in terms of the semantic information or names of each field and the format information defining the class identification or code used to express the data contained in each field in a class definition, and wherein the actual data to be transferred and said format information is stored in each instance of a self-describing data object, and wherein said data format decoupling means includes at least one forms manager program means for converting the data format of data on a subject requested by an application process from the data format in which said data is published by said data publishing process to a format suitable for transfer via said network and, upon receipt from said network, for converting said data from the format used for transfer over said network to a format used by said application process, and for performing one or more of said format conversion operations using format information stored in the instance of the form itself or in said class definition.

33. The apparatus of claim 32 further comprising a data format decoupling library comprised of one or more computer programs linked at least to each said one or more application processes and said one or more data publishing processes, including at least one class manager means for performing semantic-dependent operations to facilitate the exchange of self-describing data objects called forms between said at least one application process which uses data in a first format and said at least one data publishing process which may publish data in a second format different from said first format, said forms each being comprised of one or more fields each of which contains another form which may be either a primitive class form in that said field contains data or a constructed class form, said constructed class form containing one or more fields each of which may be a primitive class form or another constructed class form, such that form classes may be nested to any number of nesting levels, said class manager means also for facilitating the exchange of data by receiving a request to get the data from a particular field of a particular instance of a form and for searching a class definition for a field having a name matching the field named in said request, said searching including searching of all class definitions for any fields of said class definition containing constructed class forms and including searching through all levels of nesting of said class definitions, and for returning to the requesting process a relative address pointer identifying the location of the requested field within instances of the form of the class of forms containing the requested field, and further for receiving a request for the particular data contained in said field named in the original request and for using said relative address pointer and the address of the particular instance of the form to read the requested data and return said data to said requesting process.

34. The apparatus of claim 18 further comprising a data format decoupling library comprised of one or more computer programs linked at least to each said one or more application processes and to said one or more data publishing processes, including at least one class manager means for performing semantic-dependent operations to facilitate the exchange of self-describing data objects called forms between said at least one application process which uses data in a first format and said at least one data publishing process which outputs data in a second format which may be different from said first format, said forms each being comprised of one or more fields each of which contains another form which may be either a primitive class form in that said field contains data or a constructed class form, said constructed class form containing one or more fields each of which may be a primitive class form or another constructed class form, such that form classes may be nested to any number of nesting levels, said class manager means further for facilitating the exchange of data by receiving a request to get data from a particular named field of a particular instance of a form, and for searching a class definition for a field having a name matching the field named in said request, said searching including searching of all class definitions for any fields in the class definition containing constructed class forms, and including searching through all levels of nesting of said class definitions, and for returning a relative address pointer to the requesting process identifying the location of the requested field within instances of the form of the class of forms containing the requested field, and further for receiving a request for the particular data contained in said field named in the original request and using said relative address pointer and the address of a particular instance of said form to read the requested data and return said data to said requesting process.
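
The two-step lookup recited in claims 33 and 34 (search the nested class definitions once for a relative address, then use that address to read the field from a form instance) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the flat byte layout, the class numbers 1000 and 1001, the field names, and the helper names find_field and read_primitive are all hypothetical.

```python
import struct
from typing import Dict, Optional, Tuple, Union

# Hypothetical primitive classes, mapped to struct format codes.
PRIMITIVES = {"int32": "<i", "float64": "<d"}

TypeRef = Union[str, int]  # primitive name, or class id of a constructed form
Fields = Tuple[Tuple[str, TypeRef], ...]
Registry = Dict[int, Fields]  # class id -> ordered (field name, type) pairs


def size_of(type_ref: TypeRef, registry: Registry) -> int:
    """Size in bytes of a primitive field or of a whole constructed form."""
    if isinstance(type_ref, str):
        return struct.calcsize(PRIMITIVES[type_ref])
    return sum(size_of(t, registry) for _, t in registry[type_ref])


def find_field(
    class_id: int, name: str, registry: Registry, base: int = 0
) -> Optional[Tuple[int, TypeRef]]:
    """Search a class definition, descending into nested classes, for a field.

    Returns (relative offset, field type), or None if no field matches.
    """
    offset = base
    for field_name, type_ref in registry[class_id]:
        if field_name == name:
            return offset, type_ref
        if isinstance(type_ref, int):  # constructed class: search its fields too
            hit = find_field(type_ref, name, registry, offset)
            if hit is not None:
                return hit
        offset += size_of(type_ref, registry)
    return None


def read_primitive(record: bytes, offset: int, type_ref: str):
    """Read one primitive value from a form instance at a relative offset."""
    return struct.unpack_from(PRIMITIVES[type_ref], record, offset)[0]


REGISTRY: Registry = {
    1000: (("bid", "float64"), ("ask", "float64")),
    1001: (("seq", "int32"), ("quote", 1000), ("volume", "int32")),
}
record = struct.pack("<iddi", 7, 99.5, 100.25, 500)  # one instance of class 1001
ask_offset, ask_type = find_field(1001, "ask", REGISTRY)
ask_value = read_primitive(record, ask_offset, ask_type)
```

Because the offset depends only on the class definition, a requesting process in this sketch can search once and reuse the offset for every instance of the same class.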

35. The apparatus of claim 18 or 32 or 34 further comprising at least two said networks coupling at least some of said one or more computers, and wherein said one or more computers includes at least two server computers upon which at least two data publishing processes or service instances are in execution publishing data on said subject upon which data has been requested and further comprising means coupled to said at least one application process and to said at least two service instances and to said at least two networks for providing network failure and service instance failure fault tolerance by automatically switching to an alternate network upon failure of the data transfer path being used to transfer data to said application process on said subject and by automatically switching to another service instance supplying data on the requested subject in case of service instance failure so as to provide a substantially continuous flow of data on the requested subject to said at least one application process which is requesting data on said subject.

36. The apparatus of claim 35 further comprising means coupled to said server computers and to said at least one application process for providing server failure fault tolerance by automatically switching to another server computer upon which a data publishing process or service instance supplying data on the requested subject is in execution so as to maintain a substantially uninterrupted flow of data on the requested subject to said at least one application process which requested data on said subject.

37. The apparatus of claim 18 or 32 or 33 or 34 wherein said subject based addressing program further comprises means for issuing a command to establish a subscription communication session with one or more of said data publishing processes capable of supplying data on the requested subject, and further comprising a service discipline means for encapsulating a protocol for communicating with one or more of said data publishing processes identified by said subject based addressing program as capable of supplying data on said subject, and for receiving said command to establish a subscription communication session with at least one or more of said data publishing processes, and for establishing a subscription communication session with said one or more data publishing processes, and for entering a subscription request to said data publishing process, and for receiving data on said subject and passing said data to said one or more application processes which requested data on said subject.

38. The apparatus of claim 18 further comprising at least two said networks, and means coupled to said at least one application process and said at least one data publishing process for providing network failure fault tolerance by automatically switching to an alternate network upon failure of the data transfer path being used.

39. The apparatus of claim 18 or 32 or 34 or 38 further comprising at least two server computers coupled to said network each of which has at least one said data publishing process in execution thereon capable of supplying data on the subject upon which data has been requested by said one or more application processes, and further comprising means coupled at least to said server computers and to said at least one application process for providing server failure fault tolerance by automatically switching to another server computer upon which a data publishing process supplying data on the requested subject is in execution so as to maintain a substantially uninterrupted flow of data on the requested subject to said at least one application process which requested data on said subject.

40. The apparatus of claim 18 further comprising communication means coupled at least to said subject based addressing program including at least one protocol engine for encapsulating a network communication protocol for establishing communications regarding said subject using the protocol native to said network.

41. The apparatus of claim 40 wherein said communication means is coupled to each of said application and data publishing processes, and wherein said application and data publishing processes communicate over said network through respective protocol engines using the same communication protocol, and wherein said protocol engines include means for cooperating to implement an intelligent multicast protocol wherein data regarding a particular subject is transmitted over the network using point-to-point communication protocol between the data publishing process and each application process having a current subscription to data on said subject until the number of subscribing application processes exceeds a number of subscriptions wherein point-to-point communications are the most efficient way to send the data, and then for switching automatically to a broadcast communication protocol.

42. The apparatus of claim 41 wherein said protocol engines switch automatically to a reliable broadcast communication protocol when a point-to-point communication protocol is no longer the most efficient way to transmit said data.

43. The apparatus of claim 41 wherein said protocol engines cooperate to implement an intelligent multicast protocol wherein any number of switches between said point-to-point and broadcast protocols may occur depending upon the number of application processes subscribing to said subject at any particular time, and wherein said point-to-point protocol includes reliable point-to-point protocol and said broadcast protocol includes reliable broadcast protocol.

44. The apparatus of claim 40 wherein said communication means is coupled to each of said application and data publishing processes, and wherein said application and data publishing processes communicate over said network through respective protocol engines using the same network communication protocol, and wherein said protocol engines cooperate to implement a reliable broadcast protocol wherein data on a subject is transmitted in discrete messages and wherein the complete reception of all discrete messages of a broadcast data transmission on a subject for which there is an outstanding subscription is verified by the protocol engines coupled to the application and data publishing processes, and any lost or garbled messages are rebroadcast.

45. The apparatus of claim 41 or 44 wherein said protocol engines filter data being transmitted over said network by subject.

46. The apparatus of claim 40 wherein said communication means includes a plurality of different protocol engines, each of which may encapsulate a different network communication protocol.

47. The apparatus of claim 40 wherein at least one said protocol engine supports subject based addressing by filtering incoming data by subject.

48. The apparatus of claim 18 wherein said subject based addressing program further comprises means for issuing a command to establish a subscription communication session with one or more of said data publishing processes capable of supplying data on the requested subject, and wherein said network further comprises transport layer protocol means for transferring data through said data transfer path according to a particular protocol native to said network, and further comprising a service discipline program means for encapsulating a communication protocol program capable of being invoked by said subject based addressing program via said command to establish a subscription communication session on the requested subject, and also for establishing a subscription communication session with one or more of said data publishing processes by invoking said transport layer protocol means and sending thereto data to be transmitted over said network on said subject, said service discipline program means also for sending an appropriate message to said one or more data publishing processes using the appropriate protocol for communicating with said data publishing processes to establish said subscription communication session for data on the requested subject, and also for receiving data on the requested subject and passing said data to said at least one application process which requested said data.

49. An apparatus for facilitating communication of data in an environment comprising one or more computers and a network coupling said one or more computers by one or more data transfer paths and having a transport layer protocol, and having at least one data publishing process in execution on at least one said computer so as to implement at least one service instance capable of outputting data for transmission over said network, and having at least one application process in execution on at least one of said computers for performing various operations including requesting data from one or more of said service instances, comprising:
service discipline means for receiving from at least one said application process a subscription request to obtain data from a specified service instance and for automatically carrying out processing to set up a communication link to said service instance over said network using appropriate communication protocols to communicate with said service instance so as to obtain said data whenever data is published by said service instance and for automatically providing said data to said application process which requested said data until said application process cancels said subscription request; and
at least one protocol engine program in execution on at least one said computer and coupled to said service discipline means for interfacing said service discipline means to said transport layer protocol to set up said communication link over one or more of said data transfer paths.

50. The apparatus of claim 49 wherein said at least one data publishing process comprises a plurality of data publishing processes, and wherein said service discipline means is comprised of a plurality of service discipline means for carrying out the communication protocols needed to communicate with each different data publishing process, and wherein at least one of said service discipline means includes means for automatically establishing a communication session with the same data publishing process running on a different computer or a different but independent data publishing process supplying the same type data upon detection of failure to receive the requested data.

51. The apparatus of claim 50 further comprising one or more redundant networks or data paths coupling said one or more computers, and wherein said service discipline means includes means to re-establish any communication session to obtain the requested data by switching networks or switching computers to find an alternate data path to the selected data publishing process or an alternate computer upon which a data publishing process is running which can supply data of the same type as requested, and further comprising protocol engine means for implementing a reliable communication protocol to obtain the requested data such that missing or garbled packets are retransmitted.

52. The apparatus of claim 51 further comprising a communication means including a plurality of protocol engines for interfacing said service discipline means to said network, at least some of said protocol engines being coupled to said application and data publishing processes and including means for cooperating to implement a reliable broadcast communication protocol wherein the protocol engines coupled to a plurality of said application processes which are all receiving data from the same data publishing process via a broadcast communication protocol communicate with the protocol engine coupled to the data publishing process to ensure that all packets of a message are successfully received by all subscribing application processes including cooperating to cause retransmission of missing or garbled packets, and wherein at least some of said protocol engines include means for cooperating to implement an intelligent multicast protocol wherein the protocol engines coupled to said data publishing process send data to subscribing application processes via a point-to-point communication protocol until the number of subscribing processes reaches a number where it would be more efficient to send the data via a broadcast protocol, and then automatically switching to said reliable broadcast protocol when said number is reached, and for switching back and forth as needed between said point-to-point or reliable broadcast protocols depending upon the number of subscribing application processes at any particular time.

53. The apparatus of claim 52 wherein said service discipline means coupled to said data publishing process further comprises means for using said subscription list to determine the appropriate communication protocol to use in communicating said data over said network.

54. An apparatus as defined in claim 52, further comprising:
means coupled to said application process, said data publishing process and said service discipline means for providing subject based addressing such that an application process may request data on a particular subject by entering a subscription request, and may receive said data continuously until the subscription is cancelled without the need to know the location of the data publishing process which can supply the requested data; and
means coupled to said subject based addressing means for implementing subject entitlement checking such that access to data on certain subjects by certain application processes may be blocked; and
further comprising means for providing fault recovery services so as to maintain the flow of requested data if possible.

55. The apparatus of claim 51 wherein said service discipline means includes means for automatic load balancing between computers running the selected data publishing process supplying the requested data such that communication sessions between requesting application processes and multiple computers and data publishing processes supplying the same data are substantially balanced across all operational computers upon which the selected data publishing process is running.
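
One simple policy that would keep sessions substantially balanced, as claim 55 requires, is to give each new session to the operational computer with the fewest sessions. The claim does not prescribe a policy, so the least-loaded rule, the tie-break by name, and the function name assign_session below are assumptions made for illustration.

```python
from typing import Dict


def assign_session(session_counts: Dict[str, int]) -> str:
    """Assign a new session to the least-loaded server and record it.

    Ties are broken by server name so the choice is deterministic.
    """
    server = min(session_counts, key=lambda name: (session_counts[name], name))
    session_counts[server] += 1
    return server


# Hypothetical session counts for three operational servers.
counts = {"server-a": 2, "server-b": 1, "server-c": 1}
first = assign_session(counts)
second = assign_session(counts)
```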

56. The apparatus of claim 49 further comprising a plurality of protocol engine means for implementing communication protocols for sending data over said network including means for implementing a point-to-point communication protocol and a reliable broadcast communication protocol, and for using said point-to-point communication protocol when the number of subscribing application processes to the same data is less than or equal to a programmable number, and for using said reliable broadcast communication protocol when the number of subscribing application processes to the same data is greater than said programmable number.
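
The selection rule of claim 56 reduces to a single comparison against the programmable number. A minimal sketch follows; the function name and the string labels for the two protocols are hypothetical.

```python
def choose_protocol(subscriber_count: int, programmable_number: int) -> str:
    """Pick point-to-point at or below the threshold, reliable broadcast above it."""
    if subscriber_count < 0 or programmable_number < 0:
        raise ValueError("counts must be non-negative")
    if subscriber_count <= programmable_number:
        return "point-to-point"
    return "reliable-broadcast"


at_threshold = choose_protocol(subscriber_count=3, programmable_number=3)
above_threshold = choose_protocol(subscriber_count=4, programmable_number=3)
```

Re-evaluating the function whenever a subscription is entered or cancelled would give the back-and-forth switching described in claims 43 and 52.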

57. The apparatus of claim 49 wherein said service discipline protocol includes means to communicate over said network the fact that an application process has requested particular data by a subscription request, and further comprising a service discipline means coupled to a data publishing process which includes means for maintaining a list of active subscriptions, and for filtering outgoing data such that only subscribed to data is sent over the network so as to conserve network bandwidth.
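
The publisher-side filtering of claim 57 can be sketched as a list of active subscriptions consulted before anything is sent. The class name SubscriptionFilter, its method names, and the subject strings are hypothetical; the claim states only that a list is maintained and that unsubscribed data is not sent.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set, Tuple


class SubscriptionFilter:
    """Tracks active subscriptions and drops updates nobody has subscribed to."""

    def __init__(self) -> None:
        self._active: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, subject: str, subscriber: str) -> None:
        self._active[subject].add(subscriber)

    def cancel(self, subject: str, subscriber: str) -> None:
        self._active[subject].discard(subscriber)
        if not self._active[subject]:
            del self._active[subject]  # no subscribers left on this subject

    def outgoing(self, updates: List[Tuple[str, Any]]) -> List[Tuple[str, Any]]:
        """Keep only updates whose subject has at least one active subscription."""
        return [(subject, data) for subject, data in updates if subject in self._active]


publisher_filter = SubscriptionFilter()
publisher_filter.subscribe("equity.ibm", "app-1")
publisher_filter.subscribe("equity.dec", "app-2")
publisher_filter.cancel("equity.dec", "app-2")
updates = [("equity.ibm", 101.5), ("equity.dec", 88.0), ("fx.usd.jpy", 143.2)]
sent = publisher_filter.outgoing(updates)
```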

58. The apparatus of claim 49 further comprising data format decoupling means, comprised of one or more computer programs coupled to each of said one or more application processes and said one or more data publishing processes, for facilitating the transfer of data via said network between said data publishing process and said application process using self-describing data objects by performing format conversion operations where the formats for the expression and organization of data records used by each computer, data publishing process and application process may be different, and wherein said self-describing data objects each contain one or more fields and are organized into one or more classes each of which has a unique class identification, and wherein the general organization of each class of self-describing data objects in terms of the names of each field and the format information defining either the class identification of the self-describing data object referenced in a field of another self-describing data object or the computer code used to express data contained in each field of the self-describing data object is defined in a class definition, and wherein the actual data to be transferred and said format information is stored in each instance of a self-describing data object, and wherein said data format decoupling means includes at least one forms manager program means for converting the data format of requested data from the data format in which said data is published by said data publishing process to a format suitable for transfer via said network and, upon receipt from said network, for converting said data from the format used for transfer over said network to a format used by said application process which requested the data, and for performing one or more of said format conversion operations using format information stored in the instance of the form itself.

59. An apparatus as defined in claim 49, further comprising:
means coupled to said application process, said data publishing process and said service discipline means for providing subject based addressing such that an application process may request data on a particular subject by entering a subscription request, and may receive said data continuously until the subscription is cancelled without the need to know the location of the data publishing process which can supply the requested data; and
means coupled to said subject based addressing means for implementing subject entitlement checking such that access to data on certain subjects by certain application processes may be blocked.

60. An apparatus for facilitating communications of data in a computing environment comprised of one or more computers and/or servers executing one or more data consuming software processes and one or more data publishing server processes coupled by one or more networks and/or other interprocess communication paths, comprising:
at least one service discipline means capable of communicating with said server processes;
first means for receiving a subscription request for data on a particular subject from a data consuming process and for mapping said subject to one or more said service discipline means which is capable of communicating with said data publishing server processes; and
second means for invoking said service discipline means identified by said first means and using said service discipline means for establishing a communication session with said data publishing server process over said network or interprocess communication path between the requesting data consuming process and said data publishing server process, and for obtaining data only on the requested subject and passing said data to the data consuming process which requested the data.

61. The apparatus of claim 60 further comprising means for automatically translating data representations in one or more fields of data records containing data published by said data publishing server process from the format used by said data publishing server process to the format used by said data consuming process.

62. The apparatus of claim 61 further comprising means for performing data retrieval operations for data desired by said data consuming process stored in a specific data record from said data publishing server process, all said data records being grouped into one or more classes and each data record comprised of one or more fields, each class having all data records therein having the same organization in terms of names, sizes, sequences and data representation format for the fields comprising each said data record in the class, the foregoing characteristics for each class being defined in a class definition, said means for performing data retrieval operations including first means for receiving a request to obtain the data in a named field of a specified data record, and for accessing the class definition for the class to which said specified data record belongs and for searching the names of said fields listed in said class definition until the name specified in said request or a synonym thereof is found, and for generating a relative offset pointer address indicating the address of the desired field relative to the address of the start of the specified data record, said means for performing data retrieval operation further comprising second means coupled to said first means for using said relative offset pointer address to access said specified data record and the field therein specified by said relative offset pointer address and retrieve the data in the specified field upon request from said data consuming process.

63. An apparatus as defined in claim 60 wherein said first means includes means for storing a mapping table containing service records storing the identity or network address of one or more server processes which supply data on each particular subject and the appropriate one of said service discipline means to use in communicating with said server process, and further comprises means for consulting said mapping table in response to each said subscription request to match the subject of said subscription request to the subject of one or more service records in said mapping table and for returning to said first means the identity or address of one or more server processes which supply data on the requested subject and the identity of the appropriate service discipline means to communicate with the identified server process, and wherein said second means includes means coupled to said first means for receiving information from said first means from said mapping table including at least the identity and/or network address of a server process that supplies data on the requested subject and the identity of the appropriate service discipline means to use in communicating with the server process, if any, which publishes data on the subject identified in said subscription request and for invoking the identified service discipline means so as to establish a communication session with said data publishing server process using a communication protocol used by said server process and passing to said data publishing server process the subject identified in said subscription request, and for receiving data on said subject whenever said data publishing server process publishes data on said subject and passing said data to the data consuming process which made said subscription request until said subscription request is cancelled by said data consuming process.
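
The mapping table of claim 63 can be pictured as a list of service records matched against the subject of a subscription request. The sketch below assumes exact string matching on the subject; the record layout, the subject names, the addresses and the discipline names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ServiceRecord:
    subject: str
    server_address: str  # identity or network address of the server process
    discipline: str      # name of the service discipline to use with that server


def resolve(table: Sequence[ServiceRecord], subject: str) -> List[Tuple[str, str]]:
    """Return (server address, service discipline) pairs that supply the subject."""
    return [(r.server_address, r.discipline) for r in table if r.subject == subject]


MAPPING_TABLE = [
    ServiceRecord("equity.ibm.quote", "host-1:7000", "vendor-feed-a"),
    ServiceRecord("equity.ibm.quote", "host-2:7000", "vendor-feed-b"),
    ServiceRecord("news.markets", "host-3:7100", "news-wire"),
]
matches = resolve(MAPPING_TABLE, "equity.ibm.quote")
```

An empty result corresponds to the "if any" case in the claim, where no server process publishes data on the requested subject.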

64. The apparatus of claim 60 wherein said service discipline means includes means for filtering data by subject when said server process is broadcasting data on many subjects on said one or more networks and/or other interprocess communication paths, and for passing only data on the requested subject to the data consuming process which made the subscription request.

65. A communication interface for assisting in the exchange of data objects between software processes that use different data record formats and/or data representations for data objects, comprising:
means coupled to each said process for manipulating data objects called forms, each said form containing one or more class identifiers corresponding to one or more class definitions that give, for a corresponding class of forms, the name, data type and organization of the fields of each form of the class to which said form belongs, where each said field of a form may contain data or a form of another class, where each said form can store in each field thereof either actual data or a class identifier for another form such that multiple levels of nesting of forms can exist; and
means for converting the format of a form to be transmitted to another process to a format necessary to accomplish the transmission over whatever data communication path exists between the two processes and for performing another conversion to data record format which renders said form suitable for use by the receiving process.

66. A communication interface for assisting in communication of data between computer processes which use different data record organizations and/or different data representations, comprising:
data record format conversion means coupled to each said process for manipulating self-describing data objects called forms each of which has one or more fields and which belongs to a class of forms having a unique class identifier, each field in a form capable of storing either actual data or the class identifier for another form, each form in any particular class having a data record organization, field names, and data format for each said field given by a class definition corresponding to the class identifier for that particular class, said data record format conversion means including means for translating the data record format for a form to be transmitted from one computer process to another first to a data record format suitable for transmission over whatever data communication path coupled the two processes together and then for translating the data record format of the transmitted form to a data record format suitable for use by the receiving software process; and
semantic operation means coupled to said data format translation means and/or to each said process for receiving a request by a computer process to retrieve the data in any selected field of a particular form, said request identifying the particular form of interest and the name of the field of interest or a synonym thereof, and for accessing the class definition defining the field names and organization of the type of form identified in said request and locating a field having a name which matches the field name given in said request or is a synonym thereof and for returning a relative offset pointer address locating said field in every instance of said form and for accessing the form identified in said request and using said relative offset pointer address to locate the data in the field of interest and obtain and return said data to the request process.

67. A process for communicating data between software processes coupled by a data communication path comprising:
receiving a request for information on a particular subject from a requesting software process and automatically mapping that subject using a subject mapper program to the identity of a service discipline program which is capable of communicating with a data publishing process that supplies data on that subject; and
invoking said service discipline program identified by said subject mapper program and establishing a communication link over the network or other communication path used by said data publishing process to publish data using said service discipline program and filtering data published by said data publishing process by subject if necessary such that only data on the requested subject reaches said requesting software process.

68. A process for communicating data between one or more processes running on one or more computers comprising the steps of:
receiving a request for information on a particular subject from a data consuming process and automatically mapping that subject to a data publishing process called a service that supplies data on that subject and to a service discipline encapsulating an appropriate communication procedure in order to communicate with the service identified by the mapping step, and outputting service record data identifying said service and said service discipline;
invoking the service discipline identified by said service record data and establishing a communication link with said service using said service discipline such that data on the requested subject reaches said subscriber process which requested said data;
exchanging the requested data between said service and said requesting computer program using self-describing data objects such that communications with said service are done using data objects having the format used by said service while the data received by said requesting computer program is via data objects having the format used by said data consuming process; and
monitoring the reliability of the exchange of data and retransmitting lost data or switching services supplying data on the subject if necessary so as to maintain the flow of reliable data.

Description



BACKGROUND OF THE INVENTION

The invention pertains to the field of decoupled information exchange between software processes running on different or even the same computer where the software processes may use different formats for data representation and organization or may use the same formats and organization but said formats and organization may later be changed without requiring any reprogramming. Also, the software processes use "semantic" or field-name information in such a way that each process can understand and use data it has received from any foreign software process, regardless of semantic or field name differences. The semantic information is decoupled from data representation and organization information.

With the proliferation of different types of computers and software programs and the ever-present need for different types of computers running different types of software programs to exchange data, there has arisen a need for a system by which such exchanges of data can occur. Typically, data that must be exchanged between software modules that are foreign to each other comprises text, data and graphics. However, there occasionally arises the need to exchange digitized voice or digitized image data or other more exotic forms of information. These different types of data are called "primitives." A software program can manipulate only the primitives that it is programmed to understand and manipulate. Other types of primitives, when introduced as data into a software program, will cause errors.

"Foreign," as the term is used herein, means that the software modules or host computers involved in the exchange "speak different languages." For example, the Motorola and Intel microprocessors widely used in personal computers and work stations use different data representations in that in one family of microprocessors the most significant byte of multibyte words is placed first while in the other family of processors the most significant byte is placed last. Further, in IBM computers text letters are coded in EBCDIC code while in almost all other computers text letters are coded in ASCII code. Also, there are several different ways of representing numbers including integer, floating point, etc. Further, foreign software modules use different ways of organizing data and use different semantic information, i.e., what each field in a data record is named and what it means.

The use of these various formats for data representation and organization means that translations either to a common language or from the language of one computer or process to the language of another computer or process must be made before meaningful communication can take place. Further, many software modules between which communication is to take place reside on different computers that are physically distant from each other and connected only by local area networks, wide area networks, gateways, satellites, etc. These various networks have their own widely diverse protocols for communication. Also, at least in the world of financial services, the various sources of raw data such as Dow Jones News or Telerate.TM. use different data formats and communication protocols which must be understood and followed to receive data from these sources.
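
As a minimal sketch of the representation differences described above, the following Python fragment packs one 32-bit value in both byte orders and encodes one string in ASCII and in an EBCDIC code page. The value, the string, and the choice of code page cp037 are illustrative assumptions and do not come from the specification.

```python
import struct

value = 0x12345678

# Most significant byte first (big-endian, as in the Motorola 68000 family).
big_endian = struct.pack(">I", value)
# Most significant byte last (little-endian, as in the Intel x86 family).
little_endian = struct.pack("<I", value)

# The same text in ASCII and in one EBCDIC code page (cp037, an assumption).
text = "IBM"
ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

# A receiver that assumes the wrong byte order reads a different number.
(misread,) = struct.unpack("<I", big_endian)
```

The bytes differ in both cases even though the value and the text are the same, which is why a translation step is needed before the receiving process can use the data.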

In complex data situations such as financial data regarding equities, bonds, money markets, etc., it is often useful to have nesting of data. That is, data regarding a particular subject is often organized as a data record having multiple "fields," each field pertaining to a different aspect of the subject. It is often useful to allow a particular field to have subfields and a particular subfield to have its own subfields and so on for as many levels as necessary. For purposes of discussion herein, this type of data organization is called "nesting." The names of the fields and what they mean relative to the subject will be called the "semantic information" for purposes of discussion herein. The actual data representation for a particular field, i.e., floating point, integer, alphanumeric, etc., and the organization of the data record in terms of how many fields it has, which are primitive fields which contain only data, and which are nested fields which contain subfields, is called the "format" or "type" information for purposes of discussion herein. A field which contains only data (and has no nested subfields) will be called a "primitive field," and a field which contains other fields will be called a "constructed field" herein.
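
The distinction between primitive fields and constructed fields can be sketched as a small recursive data structure. This is only an illustration: the class names `PrimitiveField` and `ConstructedField`, the helper `depth`, and the sample quote record are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class PrimitiveField:
    """A field that holds only data and has no subfields."""
    name: str
    value: Union[int, float, str]


@dataclass
class ConstructedField:
    """A field that holds other fields, allowing nesting to any level."""
    name: str
    children: List[Union["PrimitiveField", "ConstructedField"]]


def depth(field) -> int:
    """Return the number of nesting levels at or below this field."""
    if isinstance(field, PrimitiveField):
        return 0
    return 1 + max((depth(child) for child in field.children), default=0)


# Hypothetical record: a quote whose "price" field has its own subfields.
quote = ConstructedField("quote", [
    PrimitiveField("symbol", "IBM"),
    ConstructedField("price", [
        PrimitiveField("last", 101.25),
        ConstructedField("range", [
            PrimitiveField("low", 100.0),
            PrimitiveField("high", 102.5),
        ]),
    ]),
])
```

In the terms used above, the field names are the semantic information, while the value types and the shape of the tree are the format or type information.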

There are two basic types of operations that can occur in exchanges of data between software modules. The first type of operation is called a "format operation" and involves conversion of the format of one data record (hereafter data records may sometimes be called "forms") to another format. An example of such a format operation might be conversion of data records with floating point and EBCDIC fields to data records having the packed representation needed for transmission over an ETHERNET.TM. local area network. At the receiving process end another format operation for conversion from the ETHERNET.TM. packet format to integer and ASCII fields at the receiving process or software module might occur. Another type of operation will be called herein a "semantic-dependent operation" because it requires access to the semantic information as well as to the type or format information about a form to do some work on the form such as to supply a particular field of that form, e.g., today's IBM stock price or yesterday's IBM low price, to some software module that is requesting same.
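
The two format operations in the example above (sender format to transmission format, then transmission format to receiver format) might look roughly like the following sketch. The packed layout used here (a one-byte length, an ASCII symbol, and a big-endian 8-byte float) is an invented stand-in and is not the actual ETHERNET.TM. packet format; the function names and the choice to express the received price as integer cents are likewise assumptions.

```python
import struct


def to_wire(symbol_ebcdic: bytes, price: float) -> bytes:
    """Sender-side format operation: EBCDIC text and float to packed bytes."""
    symbol = symbol_ebcdic.decode("cp037").encode("ascii")
    # Hypothetical layout: 1-byte length, ASCII symbol, big-endian double.
    return struct.pack(">B", len(symbol)) + symbol + struct.pack(">d", price)


def from_wire(packet: bytes):
    """Receiver-side format operation: packed bytes to ASCII text and integer."""
    length = packet[0]
    symbol = packet[1:1 + length].decode("ascii")
    (price,) = struct.unpack_from(">d", packet, 1 + length)
    # This receiver is assumed to store prices as integer cents.
    return symbol, round(price * 100)


packet = to_wire("IBM".encode("cp037"), 101.25)
received = from_wire(packet)
```

Neither function needs to know what the fields mean; only their representations matter. That is the property that separates a format operation from a semantic-dependent operation.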

Still further, in today's environment, there are often multiple sources of different types of data and/or multiple sources of the same type of data where the sources overlap in coverage but use different formats and different communication protocols (or even overlap with the same format and the same communication protocol). It is useful for a software module (software modules may hereafter be sometimes referred to as "applications") to be able to obtain information regarding a particular subject without knowing the network address of the service that provides information of that type and without knowing the details of the particular communication protocol needed to communicate with that information source.

A need has arisen therefore for a communication system which can provide an interface between diverse software modules, processes and computers for reliable, meaningful exchanges of data while "decoupling" these software modules and computers. "Decoupling" means that the software module programmer can access information from other computers or software processes without knowing where the other software modules and computers are in a network, the format that forms and data take on the foreign software, what communication protocols are necessary to communicate with the foreign software modules or computers, or what communication protocols are used to transit any networks between the source process and the destination process; and without knowing which of a multiplicity of sources of raw data can supply the requested data. Further, "decoupling," as the term is used herein, means that data can be requested at one time and supplied at another and that one process may obtain desired data from the instances of forms created with foreign format and foreign semantic data through the exercise by a communication interface of appropriate semantic operations to extract the requested data from the foreign forms with the extraction process being transparent to the requesting process.

Various systems exist in the prior art to allow information exchange between foreign software modules with various degrees of decoupling. One such type of system is any electronic mail software which implements Electronic Document Exchange Standards including CCITT's X.409 standard. Electronic mail software decouples applications in the sense that format or type data is included within each instance of a data record or form. However, there are no provisions for recording or processing of semantic information. Semantic operations such as extraction or translation of data based upon the name or meaning of the desired field in the foreign data structure are therefore impossible. Semantic-dependent operations are very important if successful communication is to occur. Further, there is no provision in electronic mail software by which subject-based addressing can be implemented wherein the requesting application simply asks for information by subject without knowing the address of the source of information of that type. Further, such software cannot access a service or network for which a communication protocol has not already been established.

Relational Database Software and Data Dictionaries are another example of software systems in the prior art for allowing foreign processes to share data. The shortcoming of this class of software is that such programs can handle only "flat" tables, records and fields within records but not nested records within records. Further, the above-noted shortcoming in Electronic Mail Software also exists in Relational Database Software.

SUMMARY OF THE INVENTION

According to the teachings of the invention, there is provided a method and apparatus for providing a structure to interface foreign processes and computers while providing a degree of decoupling heretofore unknown.

The data communication interface software system according to the teachings of the invention consists essentially of several libraries of programs organized into two major components, a communication component and a data-exchange component. Interface, as the term is used herein in the context of the invention, means a collection of functions which may be invoked by the application to do useful work in communicating with a foreign process or a foreign computer or both. Invoking functions of the interface may be by subroutine calls from the application or from another component in the communications interface according to the invention.

In the preferred embodiment, the functions of the interface are carried out by the various subroutines in the libraries of subroutines which together comprise the interface. Of course, those skilled in the art will appreciate that separate programs or modules may be used instead of subroutines and may actually be preferable in some cases.

Data format decoupling is provided such that a first process using data records or forms having a first format can communicate with a second process which has data records having a second, different format without the need for the first process to know or be able to deal with the format used by the second process. This form of decoupling is implemented via the data-exchange component of the communication interface software system.

The data-exchange component of the communication interface according to the teachings of the invention includes a forms-manager module and a forms-class manager module. The forms-manager module handles the creation, storage, recall and destruction of instances of forms and calls to the various functions of the forms-class manager. The latter handles the creation, storage, recall, interpretation, and destruction of forms-class descriptors which are data records which record the format and semantic information that pertain to particular classes of forms. The forms-class manager can also receive requests from the application or another component of the communication interface to get a particular field of an instance of a form when identified by the name or meaning of the field, retrieve the appropriate form instance, and extract and deliver the requested data in the appropriate field. The forms-class manager can also locate the class definition of an unknown class of forms by looking in a known repository of such class definitions or by requesting the class definition from the forms-class manager linked to the foreign process which created the new class of form. Semantic data, such as field names, is decoupled from data representation and organization in the sense that semantic information contains no information regarding data representation or organization. The communication interface of the invention implements data decoupling in the semantic sense and in the data format sense. In the semantic sense, decoupling is implemented by virtue of the ability to carry out semantic-dependent operations. These operations allow any process coupled to the communications interface to exchange data with any other process which has data organized either the same or in a different manner by using the same field names for data which means the same thing in the preferred embodiment. 
In an alternative embodiment semantic-dependent operations implement an aliasing or synonym conversion facility whereby incoming data fields having different names but which mean a certain thing are either relabeled with field names understood by the requesting process or are used as if they had been so relabeled.
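
A semantic-dependent operation of the kind described above can be sketched as follows: a request names a field (or a synonym of it), the class definition is searched for that name, a relative offset is computed, and the offset is used to read the value from the form instance. The class definition layout, the synonym table, the field names, and the functions `field_offset` and `get_field` are all hypothetical and serve only to illustrate the idea.

```python
import struct

# Hypothetical class definition: (field name, struct format code) in record order.
CLASS_DEF = [("symbol", "4s"), ("last_price", "d"), ("volume", "I")]

# Hypothetical synonym table mapping foreign field names to local ones.
SYNONYMS = {"price": "last_price", "ticker": "symbol"}


def field_offset(class_def, requested):
    """Search the class definition and return (relative offset, format code)."""
    name = SYNONYMS.get(requested, requested)
    offset = 0
    for field_name, fmt in class_def:
        if field_name == name:
            return offset, fmt
        # ">" selects big-endian with no padding, so sizes simply add up.
        offset += struct.calcsize(">" + fmt)
    raise KeyError(requested)


def get_field(record, class_def, requested):
    """Use the relative offset to extract one field from a form instance."""
    offset, fmt = field_offset(class_def, requested)
    (value,) = struct.unpack_from(">" + fmt, record, offset)
    return value


# One form instance of the class above, stored as packed bytes.
record = struct.pack(">4sdI", b"IBM ", 101.25, 5000)
price = get_field(record, CLASS_DEF, "price")  # requested by synonym
```

Because the offset depends only on the class definition, it can be computed once and reused for every instance of the same class.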

Data distribution decoupling is provided such that a requesting process can request data regarding a particular subject without knowing the network address of the server or process where the data may be found. This form of decoupling is provided by a subject-based addressing system within the communication interface.

Subject-based addressing is implemented by the communication component of the communication interface of the invention. The communication component receives "subscribe" requests from an application which specifies the subject upon which data is requested. A subject-mapper module in the communication component receives the request from the application and then looks up the subject in a database, table or the like. The database stores "service records" which indicate the various server processes that supply data on various subjects. The appropriate service record identifying the particular server process that can supply data of the requested type and the communication protocol (hereafter sometimes called the service discipline) to use in communicating with the identified server process is returned to the subject-mapper module.

The subject mapper has access to a plurality of communications programs called "service disciplines." Each service discipline implements a predefined communication protocol which is specific to a server process, network, etc. The subject mapper then invokes the appropriate service discipline identified in the service record.

The service discipline is given the subject by the subject mapper and proceeds to establish communications with the appropriate server process. Thereafter, instances of forms containing data regarding the subject are sent by the server process to the requesting process via the service discipline which established the communication.

Service protocol decoupling is provided by the process described in the preceding paragraph.
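
The subscription flow described above (subject lookup, service record, service discipline, delivery) can be sketched in a few lines. This is a simplified in-memory model under stated assumptions: the class names, the subject string "equity.ibm", the address "host12:7000", and the discipline name "page_feed" are invented, and the `publish` method merely simulates a server sending data.

```python
from dataclasses import dataclass


@dataclass
class ServiceRecord:
    subject: str
    server_address: str
    discipline: str  # name of the service discipline to use


class ServiceDiscipline:
    """Stand-in for a protocol-specific communication program."""

    def __init__(self):
        self.sessions = []

    def subscribe(self, address, subject, callback):
        self.sessions.append((address, subject, callback))

    def publish(self, address, subject, data):
        # Simulates the server publishing; only matching subjects are passed on.
        for session_address, session_subject, callback in self.sessions:
            if session_address == address and session_subject == subject:
                callback(subject, data)


class SubjectMapper:
    def __init__(self, records, disciplines):
        self.records = {record.subject: record for record in records}
        self.disciplines = disciplines

    def subscribe(self, subject, callback):
        record = self.records[subject]                    # look up the subject
        discipline = self.disciplines[record.discipline]  # pick the discipline
        discipline.subscribe(record.server_address, subject, callback)
        return record


received = []
disciplines = {"page_feed": ServiceDiscipline()}
mapper = SubjectMapper(
    [ServiceRecord("equity.ibm", "host12:7000", "page_feed")], disciplines
)
mapper.subscribe("equity.ibm", lambda subject, data: received.append((subject, data)))
disciplines["page_feed"].publish("host12:7000", "equity.ibm", {"last": 101.25})
disciplines["page_feed"].publish("host12:7000", "equity.dec", {"last": 1.0})
```

The requesting code names only a subject; the server address and protocol choice stay inside the mapper and the discipline, which is the decoupling the text describes.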

Temporal decoupling is implemented in some service disciplines directed to page-oriented server processes such as Telerate.TM. by access to real-time data bases which store updates to pages to which subscriptions are outstanding.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating the relationships of the various software modules of the communication interface of one embodiment of the invention to client applications and the network.

FIG. 2 is an example of a form-class definition of the constructed variety.

FIG. 3 is an example of another constructed form-class definition.

FIG. 4 is an example of a constructed form-class definition containing fields that are themselves constructed forms. Hence, this is an example of nesting.

FIG. 5 is an example of three primitive form classes.

FIG. 6 is an example of a typical form instance as it is stored in memory.

FIG. 7 illustrates the partitioning of semantic data, format data, and actual or value data between the form-class definition and the form instance.

FIG. 8 is a flow chart of processing during a format operation.

FIG. 9 is a target format-specific table for use in format operations.

FIG. 10 is another target format-specific table for use in format operations.

FIG. 11 is an example of a general conversion table for use in format operations.

FIG. 12 is a flow chart for a typical semantic-dependent operation.

FIGS. 13A and 13B are, respectively, a class definition and the class descriptor form which stores this class definition.

FIG. 14 is a block diagram illustrating the relationships between the subject-mapper module and the service discipline modules of the communication component to the requesting application and the service for subject-based addressing.

FIG. 15 illustrates the relationship of the various modules, libraries and interfaces of an alternative embodiment of the invention to the client applications.

FIG. 16 illustrates the relationships of various modules inside the communication interface of an alternative embodiment.

DETAILED DESCRIPTION OF THE PREFERRED AND ALTERNATIVE EMBODIMENTS

Referring to FIG. 1 there is shown a block diagram of a typical system in which the communications interface of the invention could be incorporated, although a wide variety of system architectures can benefit from the teachings of the invention. The communication interface of the invention may be sometimes hereafter referred to as the TIB or Teknekron Information Bus in the specification of an alternative embodiment given below. The reader is urged at this point to study the glossary of terms included in this specification to obtain a basic understanding of some of the more important terms used herein to describe the invention. The teachings of the invention are incorporated in several libraries of computer programs which, taken together, provide a communication interface having many functional capabilities which facilitate modularity in client application development and changes in network communication or service communication protocols by coupling of various client applications together in a "decoupled" fashion. Hereafter, the teachings of the invention will be referred to as the communication interface. "Decoupling," as the term is used herein, means that the programmer of a client application is freed of the necessity to know the details of the communication protocols, data representation format and data record organization of all the other applications or services with which data exchanges are desired. Further, the programmer of the client application need not know the location of services or servers providing data on particular subjects in order to be able to obtain data on these subjects. The communication interface automatically takes care of all the details in data exchanges between client applications and between data-consumer applications and data-provider services.

The system shown in FIG. 1 is a typical network coupling multiple host computers via a network or by shared memory. Two host computers, 10 and 12, are shown in FIG. 1 running two client applications 16 and 18, although in other embodiments these two client applications may be running on the same computer. These host computers are coupled by a network 14 which may be any of the known networks such as the ETHERNET.TM. communication protocol, the token ring protocol, etc. A network for exchanging data is not required to practice the invention, as any method of exchanging data known in the prior art will suffice for purposes of practicing the invention. Accordingly, shared memory files or shared distributed storage to which the host computers 10 and 12 have equal access will also suffice as the environment in which the teachings of the invention are applicable.

Each of the host computers 10 and 12 has random access memory and bulk memory such as disk or tape drives associated therewith (not shown). Stored in these memories are the various operating system programs, client application programs, and other programs such as the programs in the libraries that together comprise the communication interface which cause the host computers to perform useful work. The libraries of programs in the communication interface provide basic tools which may be called upon by client applications to do such things as find the location of services that provide data on a particular subject and establish communications with that service using the appropriate communication protocol.

Each of the host computers may also be coupled to user interface devices such as terminals, printers, etc. (not shown).

In the exemplary system shown in FIG. 1, host computer 10 has stored in its memory a client application program 16. Assume that this client application program 16 requires exchanges of data with another client application program or service 18 controlling host computer 12 in order to do useful work. Assume also that the host computers 10 and 12 use different formats for representation of data and that application programs 16 and 18 also use different formats for data representation and organization for the data records created thereby. These data records will usually be referred to herein as forms. Assume also that the data path 14 between the host computers 10 and 12 is comprised of a local area network of the ETHERNET.TM. variety.

Each of the host processors 10 and 12 is also programmed with a library of programs, which together comprise the communication interfaces 20 and 22, respectively. The communication interface programs are either linked to the compiled code of the client applications by a linker to generate run time code, or the source code of the communication programs is included with the source code of the client application programs prior to compiling. In any event, the communication library programs are somehow bound to the client application. Thus, if host computer 10 was running two client applications, each client application would be bound to a communication interface module such as module 20.

The purpose of the communications interface module 20 is to decouple application 16 from the details of the data format and organization of data in forms used by application 18, the network address of application 18, and the details of the communication protocol used by application 18, as well as the details of the data format and organization and communication protocol necessary to send data across network 14. Communication interface module 22 serves the same function for application 18, thereby freeing it from the need to know many details about the application 16 and the network 14. The communication interface modules facilitate modularity in that changes can be made in client applications, data formats or organizations, host computers, or the networks used to couple all of the above together without the need for these changes to ripple throughout the system to ensure continued compatibility.

In order to implement some of these functions, the communications interfaces 20 and 22 have access via the network 14 to a network file system 24 which includes a subject table 26 and a service table 28. These tables will be discussed in more detail below with reference to the discussion of subject-based addressing. These tables list the network addresses of services that provide information on various subjects.

A typical system model in which the communication interface is used consists of users, user groups, networks, services, service instances (or servers) and subjects. Users, representing human end users, are identified by a user ID. The user ID used in the communications interface is normally the same as the user ID or log-on ID used by the underlying operating system (not shown). However, this need not be the case. Each user is a member of exactly one group.

Groups are comprised of users with similar service access patterns and access rights. Access rights to a service or system object are grantable at the level of users and at the level of groups. The system administrator is responsible for assigning users to groups.

A "network," as the term is used herein, means the underlying "transport layer" (as the term is used in the ISO network layer model) and all layers beneath the transport layer in the ISO network model. An application can send or receive data across any of the networks to which its host computer is attached.

The communication interface according to the teachings of the invention, of which blocks 20 and 22 in FIG. 1 are exemplary, includes for each client application to which it is bound a communications component 30 and a data-exchange component 32. The communications component 30 is a common set of communication facilities which implement, for example, subject-based addressing and/or service discipline decoupling. The communications component is linked to each client application. In addition, each communications component is linked to the standard transport layer protocols, e.g., TCP/IP, of the network to which it is coupled. Each communication component is linked to and can support multiple transport layer protocols. The transport layer of a network does the following things: it maps transport layer addresses to network addresses, multiplexes transport layer connections onto network connections to provide greater throughput, does error detection and monitoring of service quality, error recovery, segmentation and blocking, flow control of individual connections of transport layer to network and session layers, and expedited data transfer. The communications component provides reliable communications protocols for client applications as well as providing location transparency and network independence to the client applications.

The data-exchange component of the communications interface, of which component 32 is typical, implements a powerful way of representing and transmitting data by encapsulating the data within self-describing data objects called forms. These forms are self-describing in that they include not only the data of interest, but also type or format information which describes the representations used for the data and the organization of the form. Because the forms include this type or format information, format operations to convert a particular form having one format to another format can be done using strictly the data in the form itself without the need for access to other data called class descriptors or class definitions which give semantic information. Semantic information in class descriptors basically means the names of the fields of the form.

The ability to perform format operations solely with the data in the form itself is very important in that it prevents the delays encountered when access must be made to other data objects located elsewhere, such as class descriptors. Since format operations alone typically account for 25 to 50% of the processing time for client applications, the use of self-describing objects speeds up processing.

The self-describing forms managed by the data-exchange component also allow the implementation of generic tools for data manipulation and display. Such tools include communication tools for sending forms between processes in a machine-independent format. Further, since self-describing forms can be extended, i.e., their organization changed or expanded, without adversely impacting the client applications using said forms, such forms greatly facilitate modular application development.

Since the lowest layer of the communications interface is linked with the transport layer of the ISO model and since the communications component 30 includes multiple service disciplines and multiple transport-layer protocols to support multiple networks, it is possible to write application-oriented protocols which transparently switch over from one network to another in the event of a network failure.

A "service" represents a meaningful set of functions which are exported by an application for use by its client applications. Examples of services are a historical news retrieval service such as Dow Jones News, a Quotron data feed, and a trade ticket router. Applications typically export only one service, although the export of many different services is also possible.

A "service instance" is an application or process capable of providing the given service. For a given service, several "instances" may be concurrently providing the service so as to improve the throughput of the service or provide fault tolerance.

Although networks, services and servers are traditional components known in the prior art, prior art distributed systems do not recognize the notion of a subject space or data independence by self-describing, nested data objects. Subject space supports one form of decoupling called subject-based addressing. Self-describing data objects which may be nested at multiple levels are new. Decoupling of client applications from the various communications protocols and data formats prevalent in other parts of the network is also very useful.

The subject space used to implement subject-based addressing consists of a hierarchical set of subject categories. In the preferred embodiment, a four-level subject space hierarchy is used. An example of a typical subject is: "equity.ibm.composite.trade." The client applications coupled to the communications interface have the freedom and responsibility to establish conventions regarding use and interpretations of various subject categories.
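The four-level hierarchy can be illustrated with a minimal sketch that splits a subject into its levels. The level names (level1 through level4), the function name parse_subject, and the use of a period as the separator between levels are assumptions made for illustration; the text gives only the example subject itself.

```python
from typing import NamedTuple


class Subject(NamedTuple):
    """One subject split into four hierarchy levels (level names are hypothetical)."""
    level1: str
    level2: str
    level3: str
    level4: str


def parse_subject(text: str) -> Subject:
    """Split a dotted subject into exactly four non-empty levels."""
    parts = text.split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected four non-empty levels, got {text!r}")
    return Subject(*parts)


subject = parse_subject("equity.ibm.composite.trade")
print(subject.level2)  # the second-level category of the example subject
```

What each level means is left to the client applications, which, as stated above, establish their own conventions for the subject categories.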

Each subject is typically associated with one or more services providing data about that subject in data records stored in the system files. Each service has associated with it, in the communication components of the communication interface, a service discipline, i.e., the communication protocol or procedure necessary to communicate with that service. The client applications may therefore request data regarding a particular subject without knowing where the service instances that supply data on that subject are located on the network. They do so by making subscription requests giving only the subject, without the network address of the service providing information on that subject. These subscription requests are translated by the communications interface into an actual communication connection with one or more service instances which provide information on that subject.
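The translation of a subscription request into services and service disciplines can be sketched as a lookup table keyed by subject. This is a minimal sketch, not the mapping used by the communications interface: the class names ServiceRecord and SubjectMapper, the service name, the discipline name, and the addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    """What the mapper knows about one service (all field values are hypothetical)."""
    service_name: str
    discipline: str    # name of the service discipline used to talk to the service
    instances: tuple   # network addresses of the service instances


class SubjectMapper:
    """Maps a subject to the services that supply data on it."""

    def __init__(self):
        self._table = {}

    def register(self, subject, record):
        self._table.setdefault(subject, []).append(record)

    def subscribe(self, subject):
        """Resolve a subject-only request; the caller never supplies an address."""
        records = self._table.get(subject)
        if not records:
            raise LookupError(f"no service registered for subject {subject!r}")
        return list(records)


mapper = SubjectMapper()
mapper.register(
    "equity.ibm.composite.trade",
    ServiceRecord("quote_feed", "feed_discipline", ("host-a:7000", "host-b:7000")),
)
for record in mapper.subscribe("equity.ibm.composite.trade"):
    print(record.discipline, record.instances)
```

The client supplies only the subject; the addresses and the discipline stay inside the mapper, which is the decoupling described above.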

A set of subject categories is referred to as a subject domain. Multiple subject domains are allowed. Each domain can define domain-specific subject and coding functions for efficiently representing subjects in message headers.

DATA INDEPENDENCE: The Data-Exchange Component

The overall purpose of the data-exchange component such as component 32 in FIG. 1 of the communication interface is to decouple the client applications such as application 16 from the details of data representation, data structuring and data semantics.

Referring to FIG. 2, there is shown an example of a class definition for a constructed class which defines both format and semantic information which is common to all instances of forms of this class. In the particular example chosen, the form class is named Player.sub.-- Name and has a class ID of 1000. The instances of forms of this class 1000 include data regarding the names, ages and NTRP ratings for tennis players. Every class definition has associated with it a class number called the class ID which uniquely identifies the class.

The class definition gives a list of fields by name and the data representation of the contents of the field. Each field contains a form and each form may be either primitive or constructed. Primitive class forms store actual data, while constructed class forms have fields which contain other forms which may be either primitive or constructed. In the class definition of FIG. 2, there are four fields named Rating, Age, Last.sub.-- Name and First.sub.-- Name. Each field contains a primitive class form so each field in instances of forms of this class will contain actual data. For example, the field Rating will always contain a primitive form of class 11. Class 11 is a primitive class named Floating.sub.-- Point which specifies a floating-point data representation for the contents of this field. The primitive class definition for the class Floating.sub.-- Point, class 11, is found in FIG. 5. The class definition of the primitive class 11 contains the class name, Floating.sub.-- Point, which uniquely identifies the class (the class number, class 11 in this example, also uniquely identifies the class) and a specification of the data representation of the single data value. The specification of the single data value uses well-known predefined system data types which are understood by both the host computer and the application dealing with this class of forms.

Typical specifications for data representation of actual data values include integer, floating point, ASCII character strings or EBCDIC character strings, etc. In the case of primitive class 11, the specification of the data value is Floating.sub.-- Point.sub.-- 11 which is an arbitrary notation indicating that the data stored in instances of forms of this primitive class will be floating-point data having two digits, one of which is to the right of the decimal point.

Returning to the consideration of the Player.sub.-- Name class definition of FIG. 2, the second field is named Age. This field contains forms of the primitive class named Integer associated with class number 12 and defined in FIG. 5. The Integer class of form, class 12, has, per the class definition of FIG. 5, a data representation specification of Integer.sub.-- 3, meaning the field contains integer data having three digits. The last two fields of the class 1000 definition in FIG. 2 are Last.sub.-- Name and First.sub.-- Name. Both of these fields contain primitive forms of a class named String.sub.-- Twenty.sub.-- ASCII, class 10. The class 10 class definition is given in FIG. 5 and specifies that instances of forms of this class contain ASCII character strings which are 20 characters long.
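The class definitions described for FIG. 2 and FIG. 5 can be pictured as small records. The following is a minimal sketch with class names written with underscores; the Python types PrimitiveClass and ConstructedClass and the function field_representations are hypothetical, and the representation label given for class 10 is a descriptive placeholder, since the text states only that the strings are 20 ASCII characters long.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimitiveClass:
    class_id: int
    name: str
    representation: str  # data representation specification


@dataclass(frozen=True)
class ConstructedClass:
    class_id: int
    name: str
    fields: tuple  # ordered (field name, class ID) pairs


# Primitive classes as described for FIG. 5.
PRIMITIVES = {
    10: PrimitiveClass(10, "String_Twenty_ASCII", "20-character ASCII string"),
    11: PrimitiveClass(11, "Floating_Point", "Floating_Point_11"),
    12: PrimitiveClass(12, "Integer", "Integer_3"),
}

# Constructed class as described for FIG. 2.
PLAYER_NAME = ConstructedClass(
    1000,
    "Player_Name",
    (("Rating", 11), ("Age", 12), ("Last_Name", 10), ("First_Name", 10)),
)


def field_representations(definition, primitives):
    """Map each field name to the representation of its primitive class."""
    return {
        name: primitives[class_id].representation
        for name, class_id in definition.fields
    }


print(field_representations(PLAYER_NAME, PRIMITIVES))
```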

FIG. 3 gives another constructed class definition named Player.sub.-- Address, class 1001. Instances of forms of this class each contain three fields named Street, City and State. Each of these three fields contains primitive forms of the class named String.sub.-- 20.sub.-- ASCII, class 10. Again, the class definition for class 10 is given in FIG. 5 and specifies a data representation of 20-character ASCII strings.

An example of the nesting of constructed class forms is given in FIG. 4. FIG. 4 is a class definition for instances of forms in the class named Tournament.sub.-- Entry, class 1002. Each instance of a form in this class contains three fields named Tournament.sub.-- Name, Player, and Address. The field Tournament.sub.-- Name includes forms of the primitive class named String.sub.-- Twenty.sub.-- ASCII, class 10 defined in FIG. 5. The field named Player contains instances of constructed forms of the class named Player.sub.-- Name, class 1000 having the format and semantic characteristics given in FIG. 2. The field named Address contains instances of constructed forms of the constructed class named Player.sub.-- Address, class 1001, which has the format and semantic characteristics given in the class definition of FIG. 3.

The class definition of FIG. 4 shows how nesting of forms can occur in that each field of a form is a form itself and every form may be either primitive and have only one field or constructed and have several fields. In other words, instances of a form may have as many fields as necessary, and each field may have as many subfields as necessary. Further, each subfield may have as many sub-subfields as necessary. This nesting goes on for any arbitrary number of levels. This data structure allows data of arbitrary complexity to be easily represented and manipulated.
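The nesting can be made concrete with a short recursive walk over the three constructed class definitions described for FIGS. 2 through 4. This is a minimal sketch: the table layout, the function name leaf_paths, and the dotted path notation for subfields are assumptions chosen for illustration.

```python
# Class IDs of the primitive classes described for FIG. 5.
PRIMITIVE_IDS = {10, 11, 12}

# Constructed classes: class ID -> ordered (field name, class ID) pairs.
CONSTRUCTED = {
    1000: [("Rating", 11), ("Age", 12), ("Last_Name", 10), ("First_Name", 10)],
    1001: [("Street", 10), ("City", 10), ("State", 10)],
    1002: [("Tournament_Name", 10), ("Player", 1000), ("Address", 1001)],
}


def leaf_paths(class_id, prefix=""):
    """List every primitive field reachable from a class, at any nesting depth."""
    if class_id in PRIMITIVE_IDS:
        return [(prefix, class_id)]
    paths = []
    for field_name, field_class in CONSTRUCTED[class_id]:
        path = f"{prefix}.{field_name}" if prefix else field_name
        paths.extend(leaf_paths(field_class, path))
    return paths


for path, class_id in leaf_paths(1002):
    print(path, class_id)
```

Because the recursion stops only at a primitive class, the same function handles any number of nesting levels without change.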

Referring to FIG. 6 there is shown an instance of a form of the class of forms named Tournament.sub.-- Entry, class 1002, as stored as an object in memory. The block of data 38 contains the constructed class number 1002 indicating that this is an instance of a form of the constructed class named Tournament.sub.-- Entry. The block of data 40 indicates that this class of form has three fields. Those three fields have blocks of data shown at 42, 44, and 46 containing the class numbers of the forms in these fields. The block of data at 42 indicates that the first field contains a form of class 10 as shown in FIG. 5. A class 10 form is a primitive form containing a 20-character string of ASCII characters as defined in the class definition for class 10 in FIG. 5. The actual string of ASCII characters for this particular instance of this form is shown at 48, indicating that this is a tournament entry for the U.S. Open tennis tournament. The block of data at 44 indicates that the second field contains a form which is an instance of a constructed form of class 1000. Reference to this class definition shows that this class is named Player.sub.-- Name. The block of data 50 shows that this class of constructed form contains four subfields. Those fields contain forms of the classes recorded in the blocks of data shown at 52, 54, 56 and 58. These fields would be subfields of the field 44. The first subfield has a block of data at 52, indicating that this subfield contains a form of primitive class 11. This class of form is defined in FIG. 5 as containing a floating-point two-digit number with one decimal place. The actual data for this instance of the form is shown at 60, indicating that this player has an NTRP rating of 3.5. The second subfield has a block of data at 54, indicating that this subfield contains a form of primitive class 12. The class definition for this class indicates that the class is named integer and contains integer data. The class definition for class 1000 shown in FIG. 2 indicates that this integer data, shown at block 62, is the player's age. Note that the class definition semantic data regarding field names is not stored in the form instance. Only the format or type information is stored in the form instance in the form of the class ID for each field.

The third subfield has a block of data at 56, indicating that this subfield contains a form of primitive class 10 named String.sub.-- 20.sub.-- ASCII. This subfield corresponds to the field Last.sub.-- Name in the form of class Player.sub.-- Name, class 1000, shown in FIG. 2. The primitive class 10 class definition specifies that instances of this primitive class contain a 20-character ASCII string. This string happens to define the player's last name. In the instance shown in FIG. 6, the player's last name is Blackett, as shown at 64.

The last subfield has a block of data at 58, indicating that the field contains a primitive form of primitive class 10 which is a 20-character ASCII string. This subfield is defined in the class definition of class 1000 as containing the player's first name. This ASCII string is shown at 66.

The third field in the instance of the form of class 1002 has a block of data at 46, indicating that this field contains a constructed form of the constructed class 1001. The class definition for this class is given in FIG. 3 and indicates the class is named Player.sub.-- Address. The block of data at 68 indicates that this field has three subfields containing forms of the class numbers indicated at 70, 72 and 74. These subfields each contain forms of the primitive class 10 defined in FIG. 5. Each of these subfields therefore contains a 20-character ASCII string. The contents of these three fields are defined in the class definition for class 1001 and are, respectively, the street, city and state entries for the address of the player named in the field 44. These three character strings are shown at 76, 78 and 80, respectively.
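The stored layout walked through for FIG. 6 (a class number, then a field count for a constructed form, then the fields in order) can be sketched as a flat sequence of blocks. This is a minimal sketch under stated assumptions: the nested tuple notation, the function names flatten and rebuild, and the rule that a primitive form occupies exactly two blocks are illustrative choices. Values that the text does not state (the age, the first name, and the address strings) are placeholders.

```python
# Class IDs the interface recognizes as primitive (FIG. 5).
PRIMITIVE_IDS = {10, 11, 12}


def flatten(form):
    """Turn a nested (class_id, content) form into a flat list of blocks."""
    class_id, content = form
    if class_id in PRIMITIVE_IDS:
        return [class_id, content]           # class number, then the data
    blocks = [class_id, len(content)]        # class number, then field count
    for field in content:
        blocks.extend(flatten(field))
    return blocks


def rebuild(blocks, pos=0):
    """Recover the nested form from the blocks alone; no field names needed."""
    class_id = blocks[pos]
    if class_id in PRIMITIVE_IDS:
        return (class_id, blocks[pos + 1]), pos + 2
    count = blocks[pos + 1]
    pos += 2
    fields = []
    for _ in range(count):
        field, pos = rebuild(blocks, pos)
        fields.append(field)
    return (class_id, fields), pos


# Instance of class 1002; 0 and "" are placeholders for unstated values.
entry = (1002, [
    (10, "U.S. Open"),
    (1000, [(11, 3.5), (12, 0), (10, "Blackett"), (10, "")]),
    (1001, [(10, ""), (10, ""), (10, "")]),
])

blocks = flatten(entry)
restored, consumed = rebuild(blocks)
print(blocks[:4])
```

Note that neither function consults a class definition: the class numbers and field counts stored in the instance are enough to recover its organization.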

Referring to FIG. 7, there is shown a partition of the semantic information, format information and actual data between the class definition and instances of forms of this class. The field name and format or type information are stored in the class definition, as indicated by box 82. The format or type information (in the form of the class ID) and actual data or field values are stored in the instance of the form as shown by box 72. For example, in the instance of the form of class Tournament.sub.-- Entry, class 1002 shown in FIG. 6, the format data for the first field is the data stored in block 42, while the actual data for the first field is the data shown at block 48. Essentially, the class number or class ID is equated by the communications interface with the specification for the type of data in instances of forms of that primitive class. Thus, the communications interface can perform format operations on instances of a particular form using only the format data stored in the instance of the form itself without the need for access to the class definition. This speeds up format operations by eliminating the need for the performance of the steps required to access a class definition which may include network access and/or disk access, which would substantially slow down the operation. Since format-type operations comprise the bulk of all operations in exchanging data between foreign processes, the data structure and the library of programs to handle the data structure defined herein greatly increase the efficiency of data exchange between foreign processes and foreign computers.
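The partition between class definition and form instance can be illustrated with a minimal sketch for the Player_Name class. The table CLASS_DEFS and the functions field_class_ids and with_names are hypothetical names; the age and first name values are placeholders because the text does not state them.

```python
# Class definition side: field names plus class IDs.
CLASS_DEFS = {
    1000: [("Rating", 11), ("Age", 12), ("Last_Name", 10), ("First_Name", 10)],
}

# Instance side: class IDs plus values, with no field names.
# 0 and "" are placeholders for values the text does not give.
player = (1000, [(11, 3.5), (12, 0), (10, "Blackett"), (10, "")])


def field_class_ids(form):
    """Format information available from the instance alone."""
    _, fields = form
    return [class_id for class_id, _ in fields]


def with_names(form, class_defs):
    """Semantic view: requires looking up the class definition."""
    class_id, fields = form
    definition = class_defs[class_id]
    if len(definition) != len(fields):
        raise ValueError("instance does not match its class definition")
    named = {}
    for (name, expected_id), (actual_id, value) in zip(definition, fields):
        if expected_id != actual_id:
            raise ValueError(f"field {name!r} has class {actual_id}, expected {expected_id}")
        named[name] = value
    return named


print(field_class_ids(player))         # no class definition consulted
print(with_names(player, CLASS_DEFS))  # class definition required
```

Only the second function touches CLASS_DEFS, which mirrors the point made above: format operations can proceed from the instance alone, while field names require access to the class definition.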

For example, suppose that the instance of the form shown in FIG. 6 has been generated by a process running on a computer by Digital Equipment Corporation (DEC) and therefore text is expressed in ASCII characters. Suppose also that this form is to be sent to a process running on an IBM computer, where character strings are expressed in EBCDIC code. Suppose also that these two computers were coupled by a local area network using the ETHERNET.TM. communications protocol.

To make this transfer, several format operations would have to be performed. These format operations can best be understood by reference to FIG. 1 with the assumption that the DEC computer is host 1 shown at 10 and the IBM computer is host 2 shown at 12.

The first format operation to transfer the instance of the form shown in FIG. 6 from application 16 to application 18 would be a conversion from the format shown in FIG. 6 to a packed format suitable for transfer via network 14. Networks typically operate on messages comprised of blocks of data comprising a plurality of bytes packed together end to end preceded by multiple bytes of header information which include such things as the message length, the destination address, the source address, and so on, and having error correction code bits appended to the end of the message. Sometimes delimiters are used to mark the start and end of the actual data block.
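The general shape of such a network message can be sketched with Python's struct and zlib modules. This is a minimal sketch with a hypothetical layout: a header holding the payload length, a destination address and a source address, followed by the payload and a trailing check value. A CRC-32 is used here as a stand-in for the appended code bits; it detects errors but does not correct them, and it is not the code used by any particular network.

```python
import struct
import zlib

# Hypothetical layout, network byte order: length (4 bytes),
# destination address (2 bytes), source address (2 bytes).
HEADER = struct.Struct("!IHH")
TRAILER = struct.Struct("!I")  # CRC-32 over header + payload


def frame(payload: bytes, destination: int, source: int) -> bytes:
    """Prefix the payload with a header and append a check value."""
    body = HEADER.pack(len(payload), destination, source) + payload
    return body + TRAILER.pack(zlib.crc32(body))


def unframe(message: bytes):
    """Verify the check value and return (destination, source, payload)."""
    if len(message) < HEADER.size + TRAILER.size:
        raise ValueError("message too short")
    body, trailer = message[:-TRAILER.size], message[-TRAILER.size:]
    (expected,) = TRAILER.unpack(trailer)
    if zlib.crc32(body) != expected:
        raise ValueError("check value mismatch")
    length, destination, source = HEADER.unpack(body[:HEADER.size])
    payload = body[HEADER.size:]
    if len(payload) != length:
        raise ValueError("length field does not match payload")
    return destination, source, payload


message = frame(b"U.S. Open", destination=12, source=10)
print(unframe(message))
```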

The second format operation which would have to be performed in this hypothetical transfer would be a conversion from the packed format necessary for transfer over network 14 to the format used by the application 18 and the host computer 12.
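The two format operations can be sketched for a single 20-character string field. This is a minimal sketch, assuming that the packed form carries space-padded ASCII and that the receiving side re-expresses the characters in EBCDIC; the text does not say at which step the character code conversion occurs. The function names are hypothetical, and Python's cp037 codec is used as one common EBCDIC code page.

```python
FIELD_WIDTH = 20  # length of a class 10 string


def pack_string(text: str) -> bytes:
    """First operation: host form to a fixed-width packed form (ASCII assumed)."""
    if len(text) > FIELD_WIDTH:
        raise ValueError("string longer than the 20-character field")
    return text.ljust(FIELD_WIDTH).encode("ascii")


def unpack_to_ebcdic(packed: bytes) -> bytes:
    """Second operation: packed form to the receiving host's EBCDIC form."""
    if len(packed) != FIELD_WIDTH:
        raise ValueError("packed field has the wrong length")
    return packed.decode("ascii").encode("cp037")


def ebcdic_to_text(data: bytes) -> str:
    """Read the EBCDIC bytes back as text, dropping the padding."""
    return data.decode("cp037").rstrip()


packed = pack_string("Blackett")
ebcdic = unpack_to_ebcdic(packed)
print(ebcdic_to_text(ebcdic))
```

The byte values change between the two forms while the length and the characters they stand for do not, which is why the operation needs only the format information (here, a 20-character string) and not the field name.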

Format operations are performed by the forms-manager modules of the communications interface. For example, the first format operation in the hypothetical transfer would be performed by the forms-manager module 86 in FIG. 1, while the second format operation in the hypothetical transfer would be performed by the forms-manager module in the data-exchange component 88.

Referring to FIG. 8, there is shown a flowchart of the operations performed by the forms-manager modules in performing format operations. Further details regarding the various functional capabilities of the routines in the forms-manager modules of the communications interface will be found in the functional specifications for the various library routines of the communications interface included herein. The process of FIG. 8 is implemented by the software programs in the forms-manager modules of the data-exchange components in the communications interface according to the teachings of the invention. The first step is to receive a format conversion call from either the application or from another module in the communications interface. This process is symbolized by block 90 and the pathways 92 and 94 in FIG. 1. The same ty