United States Patent Application 20030172368
Kind Code A1
Alumbaugh, Elizabeth; et al. September 11, 2003

System and method for autonomously generating heterogeneous data source interoperability bridges based on semantic modeling derived from self adapting ontology
Abstract
A system, including software components, that efficiently and dynamically analyzes changes to data sources, including application programs, within an integration environment and simultaneously re-codes dynamic adapters between the data sources is disclosed. The system also monitors at least two of said data sources to detect similarities within the data structures of said data sources and generates new dynamic adapters to integrate said at least two of said data sources. The system also provides real time error validation of dynamic adapters as well as performance optimization of newly created dynamic adapters that have been generated under changing environmental conditions.

Inventors: Alumbaugh; Elizabeth (El Dorado Hills, CA), Bohorquez; Yuri Adrian Tijerino (Cameron Park, CA), Bain; Mary Elizabeth (Nevada City, CA), Reynolds; Ronald Joseph (Davis, CA), Rasmussen; Steven John (Citrus Heights, CA), Lucky; David Eugene (Orangevale, CA)
Correspondence Name and Address: 2000 UNIVERSITY AVENUE
GRAY CARY WARE & FREIDENRICH LLP
E. PALO ALTO
CA
94303-2248
US
Series Code: 329153
Filed: December 23, 2002
U.S. Current Class: 717/106
U.S. Class at Publication: 717/106
Intern'l Class: G06F 009/44

Claims


What is claimed is:
1. A system connected to multiple heterogeneous data sources each having a data structure, said system monitoring at least one of said data structures, analyzing changes to said at least one of said data structures and providing for simultaneous re-coding of adapters between at least two of said multiple heterogeneous data sources.

2. The system of claim 1 including a system component for monitoring at least one data source and automatically detecting changes in the data structure of said data source.

3. A system connected to multiple heterogeneous data sources each having a data structure, said system monitoring at least two of said data sources to detect similarities within the data structures of said data sources and generating new dynamic adapters to integrate said at least two of said data sources.

4. The process in a system within an integration environment for analyzing changes to multiple heterogeneous data sources each having a data structure and providing for simultaneous re-coding of dynamic adapters between said multiple heterogeneous data sources, including the steps of intelligently analyzing the conceptual relationships and alternative data mapping strategies between a plurality of said data structures by utilizing intelligent computer programs to analyze and adapt to structural, contextual and semantic differences between said multiple heterogeneous data sources.

5. The process of claim 3 wherein said system monitors a plurality of dynamic adapters generated under changing computer environment conditions, said process including the steps of providing real time error validation of said dynamic adapters and performance optimization of at least one of said dynamic adapters.

6. The process of claim 5 including the step of using syntactic processes to automatically create adapter maintenance and support plans.

7. The process of claim 6 wherein the step of using syntactic processes occurs in an App2App Ontology Mapper and a Planner.

8. The process of claim 6 including the step of automatically checking for errors in said dynamic adapter.

9. The system of claim 1 further including error management components for automatically testing said recoded dynamic adapters before they are placed into operation.

10. The process of claim 2 further including the step of generating programming code automatically in response to said automatically detecting changes.

11. The process of claim 6 further including the steps of dynamically detecting changes, including revisions in said at least one data source, analyzing said revisions, generating data structure mapping between heterogeneous data sources, validating errors, and executing appropriate adapter modifications.

12. The process of claim 11 further including the step of determining an optimum update for said dynamic adapters.

13. The system of claim 1 further including models that are jobs, applications, users, change specifications, schemas, applications, ontologies, App2App similarity maps, at least one Common Ontology and at least one database.

14. The system of claim 1 further including system managers for managing system-wide settings and data, schema managers for providing, storing, listing, and deleting schemas, user managers for managing users and their preferences, change specification managers for managing storage and retrieval of change specifications, job managers for managing jobs performing analysis or automation, task managers for managing and running scheduled tasks, ontology managers for mapping the access to and modification of the Common Ontology or other application ontologies, language managers for managing different programming languages in which the system can produce integration adapters.

15. The system of claim 14 wherein each said change specification represents the changes between two specific snapshots of a schema.
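
A change specification of the kind recited in claim 15 can be pictured as a structured difference between two schema snapshots. The following is a minimal sketch only. The snapshot layout (table name mapped to a dictionary of column names and types) and the function name `change_specification` are assumptions made for illustration and do not appear in the application.

```python
# Illustrative sketch: a "change specification" as the difference between
# two snapshots of one schema. The snapshot layout (table -> {column: type})
# is an assumption, not a format taken from the application.

def change_specification(old, new):
    """Return added, removed, and retyped tables/columns between snapshots."""
    spec = {
        "tables_added": sorted(set(new) - set(old)),
        "tables_removed": sorted(set(old) - set(new)),
        "columns_added": [],
        "columns_removed": [],
        "columns_retyped": [],
    }
    # Only tables present in both snapshots can have column-level changes.
    for table in sorted(set(old) & set(new)):
        old_cols, new_cols = old[table], new[table]
        for col in sorted(set(new_cols) - set(old_cols)):
            spec["columns_added"].append((table, col))
        for col in sorted(set(old_cols) - set(new_cols)):
            spec["columns_removed"].append((table, col))
        for col in sorted(set(old_cols) & set(new_cols)):
            if old_cols[col] != new_cols[col]:
                spec["columns_retyped"].append(
                    (table, col, old_cols[col], new_cols[col])
                )
    return spec
```

Under these assumptions, an empty result in every category would indicate that the two snapshots are structurally identical.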

16. The system of claim 14 wherein a language manager allows a user to set preferences for delivery of language specific adapters.

17. The system of claim 1 further including an application ontology factory for mapping schemata of a plurality of data sources to the common ontology to produce data source specific ontologies; an App2App Similarity Mapper for mapping a specific data source ontology to another data source ontology and producing a map of potential integration points between the two data sources; an ontology editor functioning both as a manager and a factory; and a Planner for producing an interactive integration plan between two disparate data sources based on the App2App similarity map.

18. The system of claim 17 wherein said ontology editor manages direct human interaction with the common ontology for validation, expansion and modification of said common ontology.

19. The system of claim 18 wherein said ontology editor provides a visual representation of the common ontology.

20. The system of claim 19 wherein said factories produce specific kinds of models.

21. The system of claim 20 wherein said factories manage persistence operations for said models set forth in claim 13.

22. The system of claim 1 further including (a) a Codegen Agent for interacting with a planner, a change specification manager, an App2App ontology factory and external data source-specific settings to generate and adapt integration code, and (b) a deployment agent for interacting with external data source environment elements and a Codegen Agent for deploying code in a self-adapting fashion.

23. The system of claim 22 wherein said Codegen Agent validates said deployed code.

24. The system of claim 22 wherein said components run on a backend server.

25. The system of claim 1 further including a desktop client running on users' or clients' desktops, said desktop capable of making requests of the system server components via system Proxies, receiving data from those requests, and presenting that data to the user, said desktop comprising an Application Context, a Schema Context, a change Specification context, a Report Generation Context, a Task List Context, an Admin Context, a User Administration context, a Notification context, an Application Ontology View context, an App2App Similarity Mapping Context, a Plan View context, a Language editor and a Code Browser context.

26. The system of claim 25 wherein said Application Context lists previously defined data sources and shows detailed information for the selected data source.

27. The system of claim 26 wherein the Application Context allows a user to add, modify or remove data source definitions.

28. The system of claim 25 wherein the Schema context lists previously collected schemas and shows detailed information for a selected schema.

29. The system of claim 25 wherein the Schema context shows detailed information for the selected schema and allows a user to add or remove schemas.

30. The system of claim 25 wherein the Change Specification Context lists the previously created Change Specifications and shows detailed information for the selected change specification.

31. The system of claim 25 wherein the Change Specification Context allows a user to add or remove change specifications.

32. The system of claim 25 wherein the Report Generation Context allows retrieval of previously saved reports.

33. The system of claim 25 wherein the Report Generation Context creates a new report from an existing schema or change specification.

34. The system of claim 25 wherein the Report Generation Context allows a user to save the current report.

35. The system of claim 25 wherein the Task List Context lists the pending/scheduled tasks for the current user and allows said user to add, modify or remove a task.

36. The system of claim 25 wherein the User Administration Context lists users of the system and allows an administrator user to set up new users and administer passwords.

37. The system of claim 26 wherein the Notification Context displays notifications and sets up notification preferences.

38. The system of claim 25 wherein the Application Ontology View Context lists application ontologies and displays application ontologies for browsing.

39. The system of claim 25 wherein the App2App Similarity Mapping Context lists App2App Similarity Maps and displays App2App Similarity Maps for browsing and user acceptance.

40. The system of claim 25 wherein the Plan View Context lists Integration Plans and displays Integration Plans for user browsing and acceptance.

41. The system of claim 25 wherein the Language Editor lists languages supported by the system and displays specific language settings for user browsing and preference selection.

42. The system of claim 25 wherein the Code Browser Context displays code in specific language for user browsing, user saving and user preference settings.

43. The system of claim 1 including a System Hub for providing clients with components that can be used to directly communicate with server components.

44. The system of claim 1 further including software processes comprising an Assessment Micro Agent, an App2App Similarity Mapper, a Planner, a Hub, and Error Validation and Code Generation components.

45. The system of claim 44 wherein said Assessment Micro Agent component comprises a Schema, Change Specification, a Task Manager and a Job Manager.

46. The process of operating on two data sources within a system including other components than said two data sources, said other components including at least a Common Ontology library, including the steps of: monitoring each of said data sources by an Assessment Micro Agent including a Schema Manager, said Assessment Micro Agent creating an inventory of the data structures and functionalities of said data sources and making said inventory available to predetermined ones of said other components of said system, said Assessment Micro Agent detecting a change in either of said data sources and notifying at least some of said other components of the change.

47. The process of claim 46 further including the step of an Application Ontology Factory accepting a data structure inventory from said Schema Manager and information provided from said Common Ontology library to produce data source ontologies.

48. The process of claim 47 including the further step of an App2App Similarity Mapper accepting the information in the data source ontologies to produce a similarity map between the two data sources.

49. The process of claim 48 including the further step of a Planner using the information contained in said similarity map to produce an integration plan.

50. The process of claim 49 including the further step of a CodeGen Agent accepting the information provided in the integration plan and using it to produce integration code.

51. The process of claim 50 including the further steps of validating said integration code by an Error Management Micro Agent and deploying said integration code between the two data sources.

52. The process of claim 46 including the further step of the Schema Manager of said Assessment Micro Agent reading the data structure stored in a data source to produce a schema that is placed into a memory model.

53. The process of claim 52 including the steps of the Schema Manager collecting data source information, data source driver information, table names, table types, indexes, foreign keys, column names, column data types, column precision, column nullability, primary key designation, view definitions, synonym and alias references, and remarks stored in the database schema and providing said collected information to predetermined ones of said other components.
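
The collection step of claim 53 can be illustrated against a single concrete database engine. The sketch below assumes SQLite and Python's standard `sqlite3` module, neither of which is named in the application. It gathers table and view names, column names, column data types, nullability, primary key designation, foreign keys and index names. The function name `collect_schema` and the shape of the returned dictionary are hypothetical.

```python
import sqlite3

def collect_schema(conn):
    """Read structural metadata from an SQLite connection into a plain dict.

    Assumption: SQLite stands in for the monitored data source; other
    engines expose the same facts through different catalog queries.
    """
    schema = {}
    objects = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for name, kind in objects:
        # PRAGMA arguments cannot be bound as parameters, so quote the name.
        quoted = '"' + name.replace('"', '""') + '"'
        columns = [
            {"name": c[1], "type": c[2],
             "nullable": not c[3], "primary_key": bool(c[5])}
            for c in conn.execute(f"PRAGMA table_info({quoted})")
        ]
        foreign_keys = [
            {"column": fk[3], "ref_table": fk[2], "ref_column": fk[4]}
            for fk in conn.execute(f"PRAGMA foreign_key_list({quoted})")
        ]
        indexes = [ix[1] for ix in conn.execute(f"PRAGMA index_list({quoted})")]
        schema[name] = {"kind": kind, "columns": columns,
                        "foreign_keys": foreign_keys, "indexes": indexes}
    return schema
```

Driver information, synonyms, aliases and remarks recited in the claim have no direct SQLite counterpart and are omitted from this sketch.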

54. The process of claim 46 including the further steps of the Assessment Micro Agent, in response to a change in a monitored data source, detecting alterations including new information in the database structure of said data source and analyzing said change by comparing said new information of said alteration to data stored in the Schema Manager.

55. The process of claim 54 wherein said last named step is performed by the Change Specification Manager comparing one historical view of the schema for one data source to another historical view of said schema.

56. An Assessment Micro Agent comprising a plurality of components including: a Schema Manager connected to at least one data source for analyzing said at least one data source and extracting a meta-data model in the form of a schema, storing said schema and providing an interface to certain of said plurality of components for retrieving the schema; a Change Specification Manager for performing an analysis of what is different between two different versions of a data source by comparing the schemas associated with each version and presenting the change specification file to a user in a structured manner with specific information indicating changes in the schemas; a Task scheduler for allowing a user to schedule tasks; and a Notification Manager for providing an interface in which users can define notifications at several levels of granularity.

57. The Assessment Micro Agent of claim 56 wherein said levels of granularity include setting up notifications on the complete file of the change specifications or on filtered views of said files according to user preferences.

58. The Assessment Micro Agent of claim 56 wherein the Notification Manager can send notifications via standard mediums such as email, pager or PDAs according to user preferences.

59. The Assessment Micro Agent of claim 56 wherein the tasks include the generation of schemas through the Schema Manager and the generation of change specifications through the Change Specification Manager.

60. The Assessment Micro Agent of claim 56 further including the functions of monitoring connectivity between the Assessment Micro Agent and said data sources, managing the schema monitoring, retrieving change specifications, sending system-level notifications and user notifications, and allowing a user to create filtered views of changes according to one or more user preferences.

61. The process of operating an Application Ontology Factory including the steps of: converting the schema obtained from the Schema Manager component of the Assessment Micro Agent into a language compatible with the Common Ontology; mapping schema element identifiers to a WordNet to extract at least one of the senses of said elements; using said senses to extract all possible Common Ontology concept hierarchies to which the element might be a top-most specialization; assigning each concept hierarchy a confidence factor; and merging said concept hierarchies to produce a micro-theory including each of said senses.

62. The process of claim 61 wherein a schema element is associated with one or more concept hierarchies.

63. The process of claim 62 wherein each concept hierarchy has an independent confidence factor.
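
The relationship among schema elements, senses, concept hierarchies and confidence factors in claims 61 to 63 can be sketched with toy data. Everything below is hypothetical: the `SENSES` and `CONCEPT_PARENT` tables stand in for a lexical database and a Common Ontology, and the rule that a confidence factor equals one divided by the number of candidate senses is an assumption chosen only to make the example concrete.

```python
import re

# Toy stand-ins (assumptions): word -> candidate senses, concept -> parent.
SENSES = {
    "account": ["bank_account", "customer_account"],
    "name": ["personal_name"],
}
CONCEPT_PARENT = {
    "bank_account": "financial_instrument",
    "financial_instrument": "thing",
    "customer_account": "business_relationship",
    "business_relationship": "thing",
    "personal_name": "identifier",
    "identifier": "thing",
}

def split_identifier(identifier):
    """Break a schema identifier such as ACCOUNT_NAME into lowercase words."""
    return [t for t in re.split(r"[_\W]+", identifier.lower()) if t]

def concept_hierarchies(identifier):
    """Return one hierarchy per candidate sense, each with its own confidence."""
    result = []
    for token in split_identifier(identifier):
        senses = SENSES.get(token, [])
        for sense in senses:
            chain = [sense]
            while chain[-1] in CONCEPT_PARENT:
                chain.append(CONCEPT_PARENT[chain[-1]])
            # Assumed rule: ambiguity among senses lowers confidence.
            result.append({"element": identifier, "hierarchy": chain,
                           "confidence": 1.0 / len(senses)})
    return result

def micro_theory(identifiers):
    """Merge the hierarchies of several schema elements into one mapping."""
    return {ident: concept_hierarchies(ident) for ident in identifiers}
```

In this sketch a single element may carry several hierarchies, and each hierarchy keeps an independent confidence factor, which mirrors the structure recited in claims 62 and 63.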

64. In an artificial intelligence system connected to multiple heterogeneous data sources for generating new dynamic adapters to integrate changes in at least two of said data sources, the process of describing a schema using the syntax of the Common Ontology language.

65. In a system for automatically re-coding interfaces between heterogeneous data sources the process of monitoring changes in a monitored data source, analyzing the exact nature of the change, evaluating alternative data mapping possibilities, and adjusting the existing dynamic adapter integration code structures to address the changes.

66. The process of claim 65 including the step of using synonym relations for lexical level mapping by computing lexical proximity of elements in the schemas of the data sources.

67. The process of claim 65 including the step of finding semantical proximity by using hypernym relationships.

68. The process of claim 65 including the step of computing the closeness of data values on mapped schema elements.
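
Lexical mapping through synonym relations (claim 66) and semantic proximity through hypernym relationships (claim 67) can be sketched with a small hand-built vocabulary. The `SYNONYMS` and `HYPERNYMS` tables and the scoring formula are assumptions for illustration. The application does not specify a formula, and a real system would draw these relations from a lexical resource rather than from literals.

```python
# Toy vocabulary (assumption): synonym sets and single-parent hypernym links.
SYNONYMS = {
    "client": {"customer", "patron"},
    "customer": {"client", "patron"},
}
HYPERNYMS = {
    "customer": "person",
    "employee": "person",
    "person": "entity",
    "invoice": "document",
    "document": "entity",
}

def lexical_match(a, b):
    """True when two element names are equal or listed as synonyms."""
    a, b = a.lower(), b.lower()
    return a == b or b in SYNONYMS.get(a, set()) or a in SYNONYMS.get(b, set())

def hypernym_chain(term):
    """Follow hypernym links upward from a term to its most general ancestor."""
    chain = [term]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain

def semantic_proximity(a, b):
    """Score 1 / (1 + path length via the nearest shared hypernym), else 0.0."""
    chain_a, chain_b = hypernym_chain(a.lower()), hypernym_chain(b.lower())
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            return 1.0 / (1 + steps_a + chain_b.index(node))
    return 0.0
```

With this scoring, identical terms score 1.0, siblings under a shared parent score lower, and terms that meet only at a distant ancestor score lower still.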

69. In a system for automatically generating dynamic adapters between heterogeneous data sources the process of monitoring changes in a monitored data source using pattern matching, said process including the steps of: generating a data source to ontology mapping for each data source being mapped by evaluating the mathematical probabilities of lexical and semantic relationships between schema entities and ontology concepts; determining lexical closeness between the data source ontology and Common Ontology concepts using synonym relationships; determining mathematical closeness of semantic relationships in the form of hypernyms; and determining confidence factors based on the mathematical probability of said data source ontology and said Common Ontology being lexically and semantically close.

70. The process of claim 69 including the further steps of: comparing the data source ontologies of the monitored data sources to determine common concepts; mapping a data source ontology to another data source ontology using synonym and hypernym relationships; extracting a sample of data element values from each said data sources and comparing said data element values to determine mathematical closeness; validating expected data values for said data source ontology mappings; composing and decomposing semantic relationships between target and source data source ontology elements; and uniting semantically similar schema elements into new ontology concepts.

71. The process of claim 70 wherein the step of validating mappings using expected data values includes the step of validating said closeness by performing pattern matching on the data values of one data source data element and another data source data element by determining how close data values for said elements are.

72. The process of claim 71 including the step of using pattern-matching to normalize data properties of the data structures of the data sources including data type and data length.
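
One way to picture the pattern matching on data values in claims 71 and 72 is to reduce each sampled value to a "shape" that records only character classes and length, then compare how the shapes of two samples overlap. This is an illustrative technique chosen for the sketch, not one the application describes. The names `shape` and `value_closeness` are hypothetical.

```python
import re
from collections import Counter

def shape(value):
    """Normalize a value to its pattern: letters become A, digits become 9."""
    text = re.sub(r"[A-Za-z]", "A", str(value).strip())
    return re.sub(r"[0-9]", "9", text)

def value_closeness(sample_a, sample_b):
    """Overlap between the shape distributions of two samples, 0.0 to 1.0."""
    shapes_a = Counter(map(shape, sample_a))
    shapes_b = Counter(map(shape, sample_b))
    if not shapes_a or not shapes_b:
        return 0.0
    # Counter intersection keeps the minimum count of each shared shape.
    overlap = sum((shapes_a & shapes_b).values())
    return overlap / max(sum(shapes_a.values()), sum(shapes_b.values()))
```

Two columns holding telephone numbers in the same format would score 1.0 under this measure even though no individual values coincide, while a telephone column compared with a surname column would score 0.0.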

73. The process of claim 70 wherein the step of composing semantic relationships includes the steps of comparing data values of data source data structure elements and deriving semantic similarity thereof based on semantic proximity of one data source's data structure elements to another data source's data structure elements.

74. The process of claim 70 wherein the step of decomposing semantic relationships includes the steps of: determining that two data structure elements are similar; determining that one of said data structures has data elements with no associated functional relationship and that said other data structure element has a functional relationship with other data structure elements; determining whether said data elements display any similarity with said other data structure elements.

75. The process of claim 70 wherein the step of uniting data structure elements to form a new concept in the Common Ontology includes the step of mapping two or more different data structure elements from a data source to another data source by determining whether the mapped-to concept in the Common Ontology is the most specialized concept of a concept hierarchy in the Common Ontology and has no children concept, and adding said data structure as a concept to the Common Ontology.

76. In a system for automatically generating dynamic adapters between heterogeneous data sources, a Planner receiving the change specification file created by the Change Specification Manager and developing and logically testing an ordered dynamic adapter development plan.

77. In a system for automatically generating dynamic adapters between heterogeneous data sources, a Planner receiving a similarity map file created by an App2App Similarity Mapper and developing and logically testing an ordered dynamic adapter development plan.

78. The Planner of claim 77, said Planner being a software component for performing the process steps of (a) using a planning engine to evaluate confidence factors determined by an App2App Similarity Mapper and selecting higher confidence factors as planning goals and (b) determining the required data transformation steps that need to occur in order to accomplish said goals.

79. The Planner of claim 78 wherein the mappings having a confidence factor of 100% are provided to a user as planning goals with high degree of confidence and mappings with less than 100% confidence factors produce a plurality of alternative mapping goals.

80. The Planner of claim 79 including a software process responsive to said planning goals to produce the required data transformation steps to accomplish said planning goals.
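
The goal selection recited in claims 78 and 79 can be sketched as a partition of candidate mappings by confidence factor. The tuple layout and the function name `planning_goals` are assumptions. The choice to drop lower-confidence alternatives for an element that already has a 100% mapping is also an assumption of this sketch.

```python
def planning_goals(mappings):
    """Split candidate mappings into firm goals and ranked alternatives.

    mappings: list of (source_element, target_element, confidence in [0, 1]).
    Returns (goals, alternatives); alternatives are grouped per source element
    and ordered from highest to lowest confidence.
    """
    goals, alternatives = [], {}
    for source, target, conf in sorted(mappings, key=lambda m: -m[2]):
        if conf >= 1.0:
            goals.append((source, target))
        else:
            alternatives.setdefault(source, []).append((target, conf))
    # Assumption: an element with a firm goal needs no alternatives.
    settled = {source for source, _ in goals}
    alternatives = {s: alts for s, alts in alternatives.items()
                    if s not in settled}
    return goals, alternatives
```

A subsequent step, not shown, would derive the data transformation steps needed to accomplish each selected goal.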

81. An App2App Ontology Mapper for producing data mapping between schema elements, said mappings having confidence factors, said App2App Ontology Mapper including a software process for detecting that said mapping is accomplished by a lexical, semantic, expected data value, composition or decomposition process and, responsive to any such detecting, increasing said confidence factor.

82. An App2App Ontology Mapper for producing data mapping between schema elements, said mappings having confidence factors, said App2App Ontology Mapper including a software process for detecting that said mapping is refuted by a lexical, semantic, expected data value, composition or decomposition process and, responsive to any such detecting, lowering said confidence factor.

83. An App2App Ontology Mapper for producing data mappings between schema elements, said mappings having confidence factors, said App2App Ontology Mapper including a software process for assigning a lower confidence factor to mappings accomplished by lexical similarity than to mappings accomplished by lexical similarity plus semantic mapping.

84. An App2App Ontology Mapper for producing data mappings between schema elements, said mappings having confidence factors, said App2App Ontology Mapper including a software process for assigning a lower confidence factor to mappings accomplished by semantic mapping than to mappings accomplished by semantic mapping and expected data value mapping.
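
The raising and lowering of confidence factors in claims 81 to 84 can be sketched as a signed sum over the mapping processes that were run. The weights below are invented for illustration. The application states only the ordering (for example, lexical plus semantic ranks above lexical alone) and gives no numeric values.

```python
# Hypothetical weights per mapping process (assumption; not from the claims).
WEIGHTS = {
    "lexical": 0.4,
    "semantic": 0.3,
    "expected_value": 0.2,
    "composition": 0.05,
    "decomposition": 0.05,
}

def confidence(evidence):
    """Combine evidence into a confidence factor between 0.0 and 1.0.

    evidence maps a process name to True (the process supports the mapping)
    or False (the process refutes it); processes that were not run are absent.
    """
    score = 0.0
    for process, supports in evidence.items():
        weight = WEIGHTS[process]
        score += weight if supports else -weight
    return min(1.0, max(0.0, score))
```

Under these assumed weights, a mapping supported lexically and semantically scores higher than one supported lexically alone, and a refutation by any process lowers the score.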

85. In a system for generating dynamic adapters between changed data sources, a process for generating dynamic adapters including the steps of: after an integration plan between two data sources has been generated, an Assessment Micro Agent determining that one of said data source's data structure has changed and, in response to said detecting, informing a Planner software component to generate a new plan if the previously generated plan has been affected by said change; creating a Change Specification File that describes said changes that occurred; discovering which schema elements of said dynamic adapter have changed; mapping the affected schema elements into the existing data source ontology; performing lexical and semantic mapping on the affected schema elements to find new associations with said data source ontology; in response to finding said new associations, validating said new associations; and attempting to find new mappings for the affected elements.

86. The process of claim 85 wherein said attempting to find new mappings is accomplished using an expected data value process.

87. The process of claim 85 including the further step of in response to finding no said mappings, attempting to find new mappings using composition and decomposition processes.

88. The process of claim 85 including the step of producing a new map and presenting said new map to a user.

89. The process of claim 88 including the step of detecting an indication that said user accepts said new map and, in response to said detecting of said indication, providing the map to the Planner.

90. The process of claim 89 wherein said Planner generates the new plan, said plan having confidence factors associated therewith.

91. In a system for generating revised dynamic adapters between changed data sources, a process for revising said adapters including the steps of: a Planner presenting an integration plan approved by a user as input to a CodeGen Agent; said CodeGen Agent executing the development of new adapters by reparsing said integration plan into a user-selected programming language.

92. The process of claim 91 wherein said reparsing is accomplished using a template file that contains transformation instructions to translate each integration operation into compilation-ready source code for the selected adapter language.
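
The template-driven translation of claim 92 can be sketched with Python's standard `string.Template`. The operation names (`copy`, `concat`), the plan layout and the SQL target language are assumptions chosen for the example. The application does not disclose a template format.

```python
from string import Template

# Hypothetical template set: one entry per target language, one template
# per integration operation (assumption; not a format from the application).
TEMPLATES = {
    "sql": {
        "copy": Template(
            "INSERT INTO $target_table ($target_column) "
            "SELECT $source_column FROM $source_table;"
        ),
        "concat": Template(
            "INSERT INTO $target_table ($target_column) "
            "SELECT $left || ' ' || $right FROM $source_table;"
        ),
    },
}

def generate_adapter(plan, language):
    """Translate each plan step into source text using the language's templates."""
    templates = TEMPLATES[language]
    return "\n".join(
        templates[step["op"]].substitute(step["args"]) for step in plan
    )
```

Supporting another adapter language under this design would mean adding another entry to `TEMPLATES` while leaving the plan untouched.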

93. In a system for generating new dynamic adapters between data sources, a process for generating said adapters including the steps of: a Planner presenting as input to a CodeGen Agent an integration plan approved by a user, said integration plan including an indication of a user-selected programming language; said CodeGen Agent executing the development of new adapters by producing programming instructions to accomplish the integration plan in the user-selected programming language.

94. For use in a system for generating new dynamic adapters between data sources, an Error Management Micro Agent coupled to a Planner and accepting the output from said Planner to determine and categorize program errors and remediation plans.

95. The Error Management Micro Agent of claim 94 including a software process capable of detecting errors in one or more of the group consisting of generated code, data extraction, data aggregation and data insertion.

96. The Error Management Micro Agent of claim 95 wherein said detecting errors in said generated code is accomplished by using compiler and script verification technology.

97. The Error Management Micro Agent of claim 95 wherein detecting errors in data extraction, data aggregation and data insertion is accomplished by detecting one or more errors in the logical correctness of the generated code.

98. The Error Management Micro Agent of claim 97 wherein the step of detecting one or more errors in the logical correctness of the code is accomplished by (a) use of a database emulator to emulate database tasks and (b) comparing the results of the emulations against said plan presented by said Planner.
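
The two-part check of claim 98 can be sketched by treating an in-memory SQLite database as the emulator. That substitution is an assumption, as are the function name `validate_adapter` and the idea of expressing the plan's expected outcome as a list of rows. The sketch separates code that fails to run from code that runs but produces results that disagree with the plan.

```python
import sqlite3

def validate_adapter(setup_sql, adapter_sql, check_query, expected_rows):
    """Run generated SQL in a throwaway database and compare with the plan.

    Returns a list of error strings; an empty list means the adapter passed.
    Assumption: an in-memory SQLite database stands in for the emulator.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)          # emulate both data sources
        try:
            conn.executescript(adapter_sql)    # run the generated adapter
        except sqlite3.Error as exc:
            return [f"generated code failed to run: {exc}"]
        actual = conn.execute(check_query).fetchall()
        if actual != expected_rows:
            return [f"logical mismatch: expected {expected_rows}, got {actual}"]
        return []
    finally:
        conn.close()
```

Because the database is discarded after each call, a failed adapter leaves no residue in any real data source.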

99. A system for automatically re-coding interfaces between heterogeneous data sources comprising: means for monitoring modifications made to a data source existing within an integration environment, wherein the environment contains multiple heterogeneous data sources, means for analyzing said modifications, means for formulating a set of potential ontological mappings between heterogeneous data sources, means for providing interoperability code structures between heterogeneous data sources.

100. The system of claim 99, wherein the system is additionally comprised of a means for error detection.

101. A system for automatically re-coding interfaces between heterogeneous data sources comprising: means for monitoring and analyzing modifications made to a data source existing within an integration environment, wherein the environment contains multiple heterogeneous data sources; means for formulating a set of potential ontological mappings between heterogeneous data sources and providing interoperability code structures between data sources.

102. In a system for automatically generating dynamic adapters between heterogeneous data sources the process of generating a new adapter, said process including the steps of: generating a data source to ontology mapping for each data source being mapped by evaluating the mathematical probabilities of lexical and semantic relationships between schema entities and ontology concepts; determining lexical closeness between the data source ontology and Common Ontology concepts using synonym relationships; determining mathematical closeness of semantic relationships in the form of hypernyms; determining confidence factors based on the mathematical probability of said data source ontology and said Common Ontology being lexically and semantically close.

103. The process of claim 102 including the further steps of: comparing the data source ontologies of the monitored data sources to determine common concepts; mapping a data source ontology to another data source ontology using synonym and hypernym relationships; extracting a sample of data element values from each said data sources and comparing said data element values to determine mathematical closeness; validating expected data values for said data source ontology mappings; composing and decomposing semantic relationships between target and source data source ontology elements; and uniting semantically similar schema elements into new ontology concepts.

104. The process of claim 103 wherein the step of validating mappings using expected data values includes the step of validating said closeness by performing pattern matching on the data values of one data source data element and another data source data element by determining how close data values for said elements are.

105. The process of claim 104 including the step of using pattern-matching to normalize data properties of the data structures of the data sources including data type and data length.

106. The process of claim 103 wherein the step of composing semantic relationships includes the steps of comparing data values of data source data structure elements and deriving semantic similarity thereof based on semantic proximity of one data source's data structure elements to another data source's data structure elements.

107. The process of claim 103 wherein the step of decomposing semantic relationships includes the steps of: determining that two data structure elements are similar; determining that one of said data structures has data elements with no associated functional relationship and that said other data structure element has a functional relationship with other data structure elements; determining whether said data elements display any similarity with said other data structure elements.

108. The process of claim 103 wherein the step of uniting data structure elements to form a new concept in the Common Ontology includes the step of mapping two or more different data structure elements from a data source to another data source by determining whether the mapped-to concept in the Common Ontology is the most specialized concept of a concept hierarchy in the Common Ontology and has no children concept, and adding said data structure as a concept to the Common Ontology.

109. The Planner of claim 76, said Planner being a software component for performing the process steps of (a) using a planning engine to evaluate confidence factors determined by an App2App Similarity Mapper and selecting higher confidence factors as planning goals and (b) determining the required data transformation steps that need to occur in order to accomplish said goals.

110. The Planner of claim 109 wherein the mappings having a confidence factor of 100% are provided to a user as planning goals with high degree of confidence and mappings with less than 100% confidence factors produce a plurality of alternative mapping goals.

111. The Planner of claim 110 including a software process responsive to said planning goals to produce the required data transformation steps to accomplish said planning goals.

112. In a system for generating dynamic adapters between two data sources, a process for developing dynamic adapters including the steps of: before an integration plan between said two data sources has been generated, an App2App Similarity Mapper determining the similarities between said two data sources and informing a Planner software component to generate a new plan, said App2App Similarity Mapper performing at least the steps of: creating an App2App similarity map that describes said similarities; mapping the schema elements affected by said similarities to an existing data source ontology; performing lexical and semantic mapping on the affected schema elements to find new associations with said data source ontology; in response to finding said new associations, validating said new associations; and attempting to find new mappings for the affected elements.

113. The process of claim 112 wherein said attempting to find new mappings is accomplished using an expected data value process.

114. The process of claim 112 including the further step of in response to finding no said mappings, attempting to find new mappings using composition and decomposition processes.

115. The process of claim 112 including the step of producing a new map and presenting said new map to a user.

116. The process of claim 115 including the step of detecting an indication that said user accepts said new map and, in response to said detecting of said indication, providing the map to the Planner.

117. The process of claim 116 wherein said Planner generates the new plan, said plan having confidence factors associated therewith.
118. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform in a system within an integration environment for analyzing changes to multiple heterogeneous data sources each having a data structure and providing for simultaneous re-coding of dynamic adapters between said multiple heterogeneous data sources, the process comprising the step of intelligently analyzing the conceptual relationships and alternative data mapping strategies between a plurality of said data structures by utilizing intelligent computer programs to analyze and adapt to structural, contextual and semantic differences between said multiple heterogeneous data sources.
119. The one or more processor readable storage devices of claim 118 wherein said system monitors a plurality of dynamic adapters generated under changing computer environment conditions where said process includes the further steps of providing real time error validation of said dynamic adapters and performance optimization of at least one of said dynamic adapters.
120. The one or more processor readable storage devices of claim 119 where said process includes the further step of using syntactic processes to automatically create adapter maintenance and support plans.
121. The one or more processor readable storage devices of claim 120 where said step of using syntactic processes occurs in an App2App Ontology Mapper and a Planner.
122. The one or more processor readable storage devices of claim 121 where said process includes the further step of automatically checking for errors in said dynamic adapter.
123. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform a process of operating on two data sources within a system including other components than said two data sources, said other components including at least a Common Ontology library, the process comprising the steps of: monitoring each of said data sources by an Assessment Micro Agent including a Schema Manager; said Assessment Micro Agent creating an inventory of the data structures and functionalities of said data sources and making said inventory available to predetermined ones of said other components of said system; said Assessment Micro Agent detecting a change in either of said data sources and notifying at least some of said other components of the change.
124. The one or more processor readable storage devices of claim 123 where said process includes the further step of an Application Ontology Factory accepting a data structure inventory from said Schema Manager and information provided from said Common Ontology library to produce data source ontologies.
125. The one or more processor readable storage devices of claim 124 where said process includes the further step of an App2App Similarity Mapper accepting the information in the data source ontologies to produce a similarity map between the two data sources.
126. The one or more processor readable storage devices of claim 125 where said process includes the further step of a Planner using the information contained in said similarity map to produce an integration plan.
127. The one or more processor readable storage devices of claim 126 where said process includes the further step of a CodeGen Agent accepting the information provided in the integration plan and using it to produce integration code.
128. The one or more processor readable storage devices of claim 127 where said process includes the further step of validating said integration code by an Error Management Micro Agent and deploying said integration code between the two data sources.
129. The one or more processor readable storage devices of claim 123 where said process includes the further step of the Schema Manager of said Assessment Micro Agent reading the data structure stored in a data source to produce a schema that is placed into a memory model.
130. The one or more processor readable storage devices of claim 129 where said process includes the further step of the Schema Manager collecting data source information, data source driver information, table names, table types, indexes, foreign keys, column names, column data types, column precision, column nullability, primary key designation, view definitions, synonym and alias references, and remarks stored in the database schema and providing said collected information to predetermined ones of said other components.
131. The one or more processor readable storage devices of claim 123 where said process includes the further step of the Assessment Micro Agent, in response to a change in a monitored data source, detecting alterations including new information in the database structure of said data source and analyzing said change by comparing said new information of said alteration to data stored in the Schema Manager.
132. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform a process of operating an Application Ontology Factory, the process comprising the steps of: converting the schema obtained from the Schema Manager component of the Assessment Micro Agent into a language compatible to the Common Ontology; mapping schema element identifiers to a WordNet to extract at least one of the senses of said elements; using said senses to extract all possible Common Ontology concept hierarchies to which the element might be a top-most specialization; assigning each concept hierarchy a confidence factor; merging said concept hierarchies to produce a micro-theory including each of said senses.
133. The one or more processor readable storage devices of claim 132 wherein schema element is associated with one or more concept hierarchies.
134. The one or more processor readable storage devices of claim 133 wherein each concept hierarchy has an independent confidence factor.
135. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform a process, in an artificial intelligence system connected to multiple heterogeneous data sources for generating new dynamic adapters to integrate changes in at least two of said data sources, the process of describing a schema using the syntax of the Common Ontology language.
136. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform a process, in a system for automatically recoding interfaces between heterogeneous data sources, the process comprising the step of monitoring changes in a monitored data source, analyzing the exact nature of the change, evaluating alternative data mapping possibilities, and adjusting the existing dynamic adapter integration code structures to address the changes.
137. The one or more processor readable storage devices of claim 136 where said process includes the further step of using synonym relations for lexical level mapping by computing lexical proximity of elements in the schemas of the data sources.
138. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform, in a system for automatically generating dynamic adapters between heterogeneous data sources, the process of monitoring changes in a monitored data source using pattern matching, the process comprising the steps of: generating a data source to ontology mapping for each data source being mapped by evaluating the mathematical probabilities of lexical and semantic relationships between schema entities and ontology concepts; determining lexical closeness between the data source ontology and Common Ontology concepts using synonym relationships; determining mathematical closeness of semantic relationships in the form of hypernyms; and determining confidence factors based on the mathematical probability of said data source ontology and said Common Ontology being lexically and semantically close.
139. The one or more processor readable storage devices of claim 138 where said process includes the further steps of: comparing the data source ontologies of the monitored data sources to determine common concepts; mapping a data source ontology to another data source ontology using synonym and hypernym relationships; extracting a sample of data element values from each said data sources and comparing said data element values to determine mathematical closeness; validating expected data values for said data source ontology mappings; composing and decomposing semantic relationships between target and source data source ontology elements; and uniting semantically similar schema elements into new ontology concepts.
140. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform a process in a system for automatically generating dynamic adapters between heterogeneous data sources, the process comprising the step of a Planner receiving the change specification file created by the Change Specification Manager and developing and logically testing an ordered dynamic adapter development plan.
141. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform a process, in a system for automatically generating dynamic adapters between heterogeneous data sources, the process comprising the step of a Planner receiving a similarity map file created by an App2App Similarity Mapper and developing and logically testing an ordered dynamic adapter development plan.
142. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform a process in a system for generating dynamic adapters between changed data sources, said process for generating dynamic adapters including the steps of: after an integration plan between two data sources has been generated, an Assessment Micro Agent determining that one of said data source's data structure has changed and, in response to said detecting, informing a Planner software component to generate a new plan if the previously generated plan has been affected by said change; creating a Change Specification File that describes said changes that occurred; discovering which schema elements of said dynamic adapter have changed; mapping the affected schema elements into the existing data source ontology; performing lexical and semantic mapping on the affected schema elements to find new associations with said data source ontology; in response to finding said new associations, validating said new associations; and attempting to find new mappings for the affected elements.
143. The one or more processor readable storage devices of claim 142 wherein said attempting to find new mappings is accomplished using an expected data value process.
144. The one or more processor readable storage devices of claim 142 where said process includes the further step of in response to finding no said mappings, attempting to find new mappings using composition and decomposition processes.
145. The one or more processor readable storage devices of claim 142 where said process includes the further step of producing a new map and presenting said new map to a user.
146. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform, in a system for generating revised dynamic adapters between changed data sources, a process for revising said adapters the process comprising the steps of: a Planner presenting an integration plan approved by a user as input to a CodeGen Agent; said CodeGen Agent executing the development of new adapters by reparsing said integration plan into a user-selected programming language.
147. The one or more processor readable storage devices of claim 146 wherein said reparsing is accomplished using a template file that contains transformation instructions to translate each integration operation into compilation-ready source code for the selected adapter language.
148. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform, in a system for generating new dynamic adapters between data sources, a process for generating said adapters, the process comprising the steps of: a Planner presenting as input to a CodeGen Agent an integration plan approved by a user, said integration plan including an indication of a user-selected programming language; said CodeGen Agent executing the development of new adapters by producing programming instructions to accomplish the integration plan in the user-selected programming language.
149. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform, in a system for automatically generating dynamic adapters between heterogeneous data sources the process of generating a new adapter, the process comprising the steps of: generating a data source to ontology mapping for each data source being mapped by evaluating the mathematical probabilities of lexical and semantic relationships between schema entities and ontology concepts; determining lexical closeness between the data source ontology and Common Ontology concepts using synonym relationships; determining mathematical closeness of semantic relationships in the form of hypernyms; determining confidence factors based on the mathematical probability of said data source ontology and said Common Ontology being lexically and semantically close.
150. The one or more processor readable storage devices of claim 149 where said process includes the further steps of: comparing the data source ontologies of the monitored data sources to determine common concepts; mapping a data source ontology to another data source ontology using synonym and hypernym relationships; extracting a sample of data element values from each said data sources and comparing said data element values to determine mathematical closeness; validating expected data values for said data source ontology mappings; composing and decomposing semantic relationships between target and source data source ontology elements; and uniting semantically similar schema elements into new ontology concepts.
151. One or more processor readable storage devices having processor readable code embodied on said processor readable storage devices, said processor readable code for programming one or more processors to perform, in a system for generating dynamic adapters between two data sources, a process for developing dynamic adapters, the process comprising the steps of: before an integration plan between said two data sources has been generated, an App2App Similarity Mapper determining the similarities between said two data sources and informing a Planner software component to generate a new plan, said App2App Similarity Mapper performing at least the steps of: creating an App2App similarity map that describes said similarities; mapping the schema elements affected by said similarities to an existing data source ontology; performing lexical and semantic mapping on the affected schema elements to find new associations with said data source ontology; in response to finding said new associations, validating said new associations; and attempting to find new mappings for the affected elements.
152. The one or more processor readable storage devices of claim 151 wherein said attempting to find new mappings is accomplished using an expected data value process.
153. The one or more processor readable storage devices of claim 151 where said process includes the further step of, in response to finding no said mappings, attempting to find new mappings using composition and decomposition processes.
154. A process of managing revision in a data source including the steps of: connecting an Assessment Micro Agent to a data source; using the Schema Manager, extracting information about the data source; using the Schema Manager, building a schema of the data source from at least some of said extracted information; and presenting the schema to a user.
155. The process of claim 154 including the additional steps of: the user selecting schema elements of interest to the user and creating a filtered view thereof; and the user using the Task Manager to schedule frequency for generating schema specifications.
156. The process of claim 155 including the additional steps of: the Change Specification Manager identifying a change in any of the selected schema elements during running of said data source; and in response to said identifying, informing the user of said detected change.
157. The process of claim 154 wherein the step of collecting information includes the step of collecting data source information, connectivity driver information, table names and types, indexes, primary keys, foreign keys, column names and types, column precision, view definitions, synonym and alias references, and remarks stored in a database schema.
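
As an illustration of the schema inventory and change detection recited in claims 129-131 and 154-157, the following is a minimal sketch and not the claimed implementation. It assumes a SQLite database read through Python's standard sqlite3 module; the function names read_schema and diff_schemas, the record layout, and the sample tables are hypothetical and do not come from this application.

```python
import sqlite3


def read_schema(conn):
    """Return {table: {column: (declared_type, not_null, is_primary_key)}}."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        columns = {}
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, not_null, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            columns[name] = (col_type, bool(not_null), bool(pk))
        schema[table] = columns
    return schema


def diff_schemas(old, new):
    """Return (kind, table, column) change records ordered by table, then column."""
    changes = []
    for table in sorted(set(old) | set(new)):
        if table not in old:
            changes.append(("table_added", table, None))
            continue
        if table not in new:
            changes.append(("table_removed", table, None))
            continue
        old_cols, new_cols = old[table], new[table]
        for column in sorted(set(old_cols) | set(new_cols)):
            if column not in old_cols:
                changes.append(("column_added", table, column))
            elif column not in new_cols:
                changes.append(("column_removed", table, column))
            elif old_cols[column] != new_cols[column]:
                changes.append(("column_changed", table, column))
    return changes


# Hypothetical data source: take a baseline inventory, alter it, then compare.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
baseline = read_schema(conn)

conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, total REAL)")
change_spec = diff_schemas(baseline, read_schema(conn))
```

In this sketch the list change_spec plays the role of a change specification: it records that a column was added to one table and that a new table appeared, which is the kind of information a downstream component would need in order to decide whether an existing plan is affected.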

Description



PRIORITY TO PRIOR PROVISIONAL APPLICATIONS

[0001] Priority is claimed to Provisional Applications Serial No. 60/342,098, filed on Dec. 27, 2001, No. 60/426,761, filed on Nov. 15, 2002, and No. 60/427,395, filed on Nov. 18, 2002.

FIELD OF THE INVENTION

[0002] This invention relates to a system and method for efficiently and dynamically analyzing changes to software applications that exist within a systems integration environment containing multiple heterogeneous data sources; and for providing for the simultaneous data mapping, coding, and maintenance support of interfaces between multiple software applications in real time event driven actions.

COPYRIGHT NOTICE

[0003] A portion of the disclosure of this patent document contains material which is protected by copyright. The copyright owner has no objection to the facsimile reproduction by anyone of this patent document, but otherwise reserves all copyright rights including, without limitation, making derivative works of the material protected by copyright.

BACKGROUND OF THE INVENTION

[0004] Providing application integration between heterogeneous software applications, environments and data resources (data sources) requires some type of provision for transformation, format, interface, and data connectivity services. These services are provided by a collection of software components that are collectively called adapters. Adapters integrate software application and database resources so they can interoperate with other disparate data sources and applications. They provide the interface between the application and, with most current integration approaches, the messaging subsystems that connect to the various applications.

[0005] Historically, adapters have been viewed as the weakest link in application integration. This is because adapters are built to specific versions of software, such as business or database applications, and are specific to the platform upon which those applications operate. Most integration adapters aren't reusable and virtually all require extensive manual customization and ongoing maintenance. Customization almost always adds unforeseen weeks and months to the integration effort and greatly increases the complexity, cost, and time required for adapter maintenance and support efforts. Yet customization is almost unavoidable because business rules and data transformation occur within integration adapters. These issues are compounded whenever any of the software applications and data sources within the integration environment change.

[0006] Each time a data source is upgraded, patched, revised or customized the integration adapters between the modified application and all other applications within the integrated environment must be rewritten. Even relatively simple or minor modifications to mission-critical data sources require extensive manual effort to determine the impact of the revision on the integration environment. Prior to this invention a self-generating and auto repairing solution for building, maintaining and supporting integration adapters did not exist. The prior art for adapter development requires some form of manual user intervention/manipulation to build, maintain and/or support integration infrastructures. Integrating heterogeneous applications is accomplished through the use of a variety of software or hardware based "tools" wielded by highly technical software professionals. For example U.S. Pat. No. 6,016,394 requires the manual development and maintenance of a single monolithic database to address integration needs; U.S. Pat. No. 6,167,564 aggregates multiple integration tools from a variety of vendors within a single coherent development framework so that users only have to navigate one application (which is still manual) pertaining to building integration adapters; U.S. Pat. No. 6,308,178 allows the user to manipulate a graphically enhanced data mapping/code generation and wizard driven screen system that guides the process of configuring inputs, creating interface tables, naming source files, and adding custom integration options; U.S. Pat. No. 6,256,676 requires a user to use a series of middleware tools known as an Application Development Kit (ADK) to manually build integration adapters; U.S. Pat. No. 6,236,994 provides a method and apparatus for manually developing and managing a metadata taxonomy catalog containing the referential linkages of data between multiple heterogeneous documents and multiple heterogeneous data sources.

[0007] It has been estimated that 60-80% of the annual $10.7B software integration market (year 2001-2002) is spent on manual adapter development, maintenance and support efforts rather than on software licensing. The majority of this cost is for the systems analysis, data mapping, hard coding and testing of integration adapters. When done manually, the transformations and validations needed for data integration can require significant developer time and effort. In fact, these tasks are often the cause of costly implementation delays and project overruns. Rapidly evolving business demands, combined with ever-tightening budgets and time constraints, mean that organizations need an integration adapter solution that can be disassociated from specific software applications, versions and operating platforms. Additionally, organizations need an effective integration platform that can dynamically and intelligently adjust to the reality of continuously morphing or changing applications and computing environments.

[0008] Managing change across software and database applications accounts for approximately 33% of a company's entire IT budget, according to some estimates. The majority of this cost is for detailed systems analysis required to understand the impact of product upgrades, revisions and patches on a company's existing computing infrastructure. Prior to this invention, this activity required significant manual effort and was inordinately expensive, time consuming and fraught with error. Users frequently upgraded an application only to find that management reports no longer functioned, integration adapters were compromised, or that the application itself had become unstable. The prior art falls short of these needs and requires months of manual effort including detailed systems analysis, large budgets, and long lead times, as well as additional maintenance and support expenses.

[0009] The object, therefore, of the present invention is to provide a system to efficiently, in terms of both time and resources, and dynamically, in terms of real time event driven actions, analyze changes to data sources within an integration environment and provide for simultaneous recoding of adapters between multiple heterogeneous data sources. In addition, the present invention intelligently analyzes the conceptual relationships and alternative data mapping strategies by utilizing intelligent computer programs that can analyze and adapt to structural, contextual and semantic differences between multiple data sources. It is a further object of the present invention to be disassociated from application specific platforms, business logic and coding structures that are inherent to the specific data source, thereby allowing automatic supportability and maintainability of interoperability adapters that conform to the specific requirements of the source systems. It is also an object of the present invention to provide real time error validation of dynamic adapters as well as performance optimization of newly created adapters that have been generated under changing environmental conditions while maximizing the use of existing integration infrastructures. One of the embodiments of the invention can help users gain control over data source change, thus reducing the risk, time, costs and efforts associated with adapter maintenance and support and allowing users to optimize the value of IT investments and establish governance, visibility and control.

INTEGRATION ADAPTER REQUIREMENTS AND TYPES

[0010] Providing complete application integration between heterogeneous environments and resources requires the provision of the following services:

[0011] Data flow services to provide work and process flow flexibility that can reflect business processes;

[0012] Transformation services to provide data syntax resolution and validation management;

[0013] Format services to provide schema and semantic messages;

[0014] Interface services to provide reconciliation and translation of interfaces including SQL, RPC, IDL, CGI, APIs, etc.;

[0015] Network services to provide functions such as queuing, multiplexing, ordering, routing, security, compression, and recovery; and

[0016] Connectivity services to provide connectivity over protocols and standards such as TCP, HTTP, SOAP, CORBA, and SNA.

[0017] These services are provided by a collection of software components that exist within most integration environments. Adapters provide some of these services: transformation, format, interface and connectivity. Adapters connect software into the integration environment so that disparate applications and data stores can interoperate with other connected resources. There are many different techniques and approaches to achieving interoperability. Since many of these choices are complex, expensive and cumbersome, the selected method should align with the company's long-term business needs without causing the business to lose its ability to quickly exploit opportunities created by new technologies.

[0018] There are five categories of adapters--application, language, environment, data, and middleware.

[0019] Application adapters tie disparate software systems together by mapping processes, workflows or functions from a source software program to a target application. Application adapters use specialized "bridge" programs that are written so that one program can work with the data or the output from functions in another program. The result of this type of integration may be a new application with its own user interface, or a desktop or mainframe application that gains the capability to handle data and include capabilities borrowed from other applications.

[0020] Language adapters accomplish integration by mapping the syntax of one programming language with another (COBOL, RPC, C, Basic, IDL, Tcl, and others) so that older legacy software systems can connect to new applications using the same programming standards (Java, XML, COM, EJB, Visual Basic, and the like) that the more modern systems use to communicate with each other.

[0021] Environment adapters provide platform level integration by using standards such as CICS, SNA, and Mainframe OSI to provide connectivity.

[0022] Data adapters provide connectivity by mapping information between applications from flat files, data sources and database connections using the application's underlying data store (such as Oracle, Sybase, VSAM, and others). Data adapters tend to be used inside applications to provide tightly coupled synchronous access to heterogeneous databases intended for direct use, for which an application-level (API) interface is not preferred or doesn't exist.

Middleware adapters provide connectivity and interoperability by using specialized bridging applications that support application interoperability and data interchange. Middleware adapters use languages and protocols such as XML, FTP, MQ Series and ODBC to accomplish environmental connectivity, transapplication workflow, data mapping, and programmatic exchanges across applications that in turn initiate an event that causes additional programmatic actions.

[0023] Products that exist within each of the above listed adapter categories can be further segmented into the following types--static, intelligent, and dynamic.

[0024] A static adapter is one that is predefined, custom developed, and both application and version specific, and that provides basic application integration to a targeted resource. Static adapters provide very little, if any, data transformation, validation, or filtering; they simply shuttle data from one application to another in either real-time or batch transmission modes.

[0025] An intelligent adapter implements data manipulation, validation, and business rules processing by blending new applications and processes with existing systems. Intelligent adapters are aware of application metadata and they provide integration performance improvements by moving business rule processing from centralized integration brokers to the distributed application adapter, thus reducing network traffic. However, not all intelligent adapters are equal. Each one's functionality is directly controlled by the depth, breadth and amount of application knowledge that has been encapsulated into the adapter by the supplier. Intelligent adapters reduce the amount of custom coding and application expertise required to support an integrated environment because they are designed to address the underlying business logic of version-specific products within the integrated environment. While labeled as "smart," intelligent adapters usually fail to address application/database/logic customizations created by end user customers. Intelligent adapters require manual intervention and custom augmentation whenever an application is modified or upgraded.

[0026] A dynamic adapter has the advantages of an intelligent adapter with few, if any, of the weaknesses. It actually learns from performing its data manipulations and can change its behavior by detecting changes in a monitored application. A dynamic adapter is capable of sensing changes in the integrated environment, automatically re-programming itself once a change has been detected, and fine-tuning its performance as the result of newly learned operational information. Only dynamic adapters can seamlessly function within all five of the above mentioned adapter categories without custom coding.

[0027] Our invention provides a novel system that overcomes the above shortcomings.

[0028] Accordingly, it is an object of the invention to monitor an application and to automatically detect changes in the application's database structure and record this information in a format such as XML format in a knowledge base repository.

[0029] It is another object of the invention to "learn" user preferences and data mapping criteria each time the application is used.

[0030] It is a further object of the invention to automatically detect application changes, reducing the need for extensive database analysis.

[0031] It is yet a further object of the invention to use dynamic syntactic processes to create adapter maintenance and support plans automatically.

[0032] It is an additional object of the invention to significantly reduce the time and manpower required to plan, analyze, design, and generate an interoperability plan for applications.

[0033] It is another object of the invention to provide a system that automatically checks for errors in new adapters, minimizing the number of staff required for this task.

[0034] It is still another object of the invention to automatically maintain and support adapters, reducing the need for expensive integration programmers.

[0035] It is still an additional object of the invention to provide error management components that automatically test updated adapters before they are placed into a production environment.

[0036] It is a further object of the invention to automatically detect application changes, so that end users do not need an in-depth understanding of the structure of each application.

[0037] It is another object of the invention to generate programming code automatically, so that end users do not need to learn numerous interface programming languages.

BRIEF DESCRIPTION OF THE DRAWINGS

[0038] FIG. 1 is a general representation of the overall system architecture useable in the invention.

[0039] FIG. 2 is an alternate illustration of the general operation of the invention, including processes associated with Assessment, Modification Planner, Hub, Error Validation and Code Generation components.

[0040] FIG. 3 illustrates some of the information collected by the Schema Manager, which information becomes the input for ontology generation.

[0041] FIG. 4 illustrates the steps for generating a change specification between two different instances of an application's schemas.

[0042] FIG. 5 illustrates the steps necessary to create an application ontology from an application schema.

[0043] FIG. 6 illustrates the steps necessary to generate a similarity map between two disparate applications.

[0044] FIG. 7 illustrates the three main steps that go into planning an integration adapter.

[0045] In describing our invention we will be using terms used in the software and artificial intelligence technologies. Some of these terms, as used in this patent document, are defined below.

DEFINITIONS

[0046] "Adapter" means software code that allows heterogeneous software applications and data sources to interoperate and share data with each other.

[0047] "Application Ontology Factory" means the concept engine that is responsible for the development of an Application Specific Ontology. The Application Ontology Factory is common and reusable across any application and in turn produces an application specific ontology (conceptual model that is an axiomatic characterization of data and meaning) for each monitored data source, by mapping application schema elements, relationships between those elements and other constraints to a common ontology.

[0048] "Application Program Interface (API)" means a series of functions that programs can use to make the operating system do a specific function. Using Windows APIs, for example, a program can open windows, files, and message boxes--as well as perform more complicated tasks--by passing a single instruction.

[0049] "Assessment Microagent" means an intelligent software program that can independently and in an event driven fashion analyze selected data sources (software applications and databases) thereby creating a point in time situation assessment and application specific concept model of the data source as well as a comparison record that shows the differences between two or more point-in-time snapshots of a data source.

[0050] "Change Specification File" means the record that represents the detailed summary attributes of information about differences between two or more specific point-in-time snapshots of an application, inclusive of the data source's underlying schema.

[0051] "Change Specification Manager" means the mechanism that handles the persistence operations that are associated with retrieval and storage of multiple versions of change specification files.
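As an illustration only, the comparison of two point-in-time snapshots can be sketched as follows. The snapshot representation (a dictionary mapping table names to column names and types), the function name change_specification and the field names of the result are assumptions made for this sketch and are not taken from the system described here.

```python
# Hypothetical sketch: a schema snapshot is modeled as {table: {column: type}}.
def change_specification(old, new):
    """Return the differences between two point-in-time schema snapshots."""
    spec = {
        "added_tables": sorted(set(new) - set(old)),
        "removed_tables": sorted(set(old) - set(new)),
        "changed_tables": {},
    }
    # Tables present in both snapshots are compared column by column.
    for table in sorted(set(old) & set(new)):
        old_cols, new_cols = old[table], new[table]
        delta = {
            "added_columns": sorted(set(new_cols) - set(old_cols)),
            "removed_columns": sorted(set(old_cols) - set(new_cols)),
            "retyped_columns": sorted(
                c for c in set(old_cols) & set(new_cols)
                if old_cols[c] != new_cols[c]
            ),
        }
        # Only record tables that actually differ.
        if any(delta.values()):
            spec["changed_tables"][table] = delta
    return spec


snapshot_1 = {"Person": {"id": "INTEGER", "LastName": "VARCHAR(40)"}}
snapshot_2 = {
    "Person": {"id": "INTEGER", "LastName": "VARCHAR(80)", "addrId": "INTEGER"},
    "Address": {"id": "INTEGER"},
}
spec = change_specification(snapshot_1, snapshot_2)
```

In this sketch the result reports one added table (Address), one added column (Person.addrId) and one column whose type changed (Person.LastName). A fuller record would also cover indexes, keys, triggers and the other schema elements named in these definitions.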

[0052] "Code Generator Agent" means an intelligent software program whose purpose is to generate interoperability adapter code from a generic Integration Plan to a specific implementation programming language selected by a human user.

[0053] "Common ontology" is a general purpose ontology that contains definitions for concepts and relationships among those concepts that have wide coverage among multiple domains. In the ontology community this is sometimes called the upper ontology.

[0054] "Communicator" means the graphic user interface that supports human interaction with all the system's microagents contained in the instant invention. The Communicator implicitly directs the various microagents to be responsive to the plans and goals of the human users.

[0055] "Concept Hierarchy" refers to concepts in an ontology and means the compendium of all concepts and relationships between those concepts as they define a given concept. In other words, "Concept Hierarchy" means all the more abstract concepts and their relationships used to define a concept in an ontology.

[0056] "Constraint" means an attribute of a table which restricts the values that a field can have. (e.g., NOT NULL, UNIQUE, etc.)

[0057] "Cyclic Redundancy Check" or "CRC" means an algorithm applied to a block of data which produces a number, typically 32-bits or more, which has a very high probability of being unique for that block of data. Note, this is more widely known as a "Message Digest" or "Hash" algorithm and for the record CRC's are used primarily to detect data transmission errors whereas hashes are used to determine uniqueness (though having duplicate CRC's for dissimilar blocks of data is also very unlikely and CRC's are typically faster to produce than hashes). Commonly used message digest algorithms include CRC-32, MD5, and SHA-1.
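The distinction drawn above between a CRC and a message digest can be illustrated with a short sketch that computes both for a block of data and uses them to detect that the block has changed. The function name fingerprints and the sample schema text are assumptions for illustration; zlib.crc32 and hashlib.sha1 are standard Python library calls.

```python
import hashlib
import zlib


def fingerprints(block: bytes):
    """Return (CRC-32, SHA-1 hex digest) for a block of data."""
    crc = zlib.crc32(block) & 0xFFFFFFFF  # 32-bit unsigned checksum
    sha1 = hashlib.sha1(block).hexdigest()  # 160-bit message digest
    return crc, sha1


# Hypothetical example: two versions of a table definition.
before = b"CREATE TABLE Person (id INTEGER, LastName VARCHAR(40))"
after = b"CREATE TABLE Person (id INTEGER, LastName VARCHAR(80))"

# Differing fingerprints indicate that the block has changed.
changed = fingerprints(before) != fingerprints(after)
```

Comparing fingerprints rather than the full blocks keeps the stored comparison data small. Equal fingerprints make a change very unlikely but do not rule one out entirely.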

[0058] "Data Source" means any software system with a data structure such as a database, an enterprise application, or flat data files.

[0059] "Deployment Agent" means an intelligent software program whose purpose is to deploy newly generated adapter interoperability code to a user specified location such as a secured server using a deployment strategy that is identified by a system user. Deployment strategies may include File Transfer Protocol (FTP), file-copy, telnet and Secure Socket Shell (SSH).

[0060] "Document Type Definition" or "DTD" means a file used to validate the structure of an XML document. DTDs are used so that a validating XML parser can validate that the tag structure and attributes in an XML document are valid based on the rules laid out in the DTD.

[0061] "Dynamic" means performed when a program is running.

[0062] "Enterprise Application Integration (EAI)" means a method of integrating software applications that is workflow driven.

[0063] "Error Management Microagent" means an intelligent software program that evaluates newly created interoperability adapter code to detect errors in code generation, data extraction, aggregation and insertion, or errors that would hinder the software application programs from interoperating (processing a transaction and exchanging data).

[0064] "Event-Driven" means a trigger that allows a program to react independent of human intervention to changes that have occurred in a software environment.

[0065] "Event of Interest" means an event, such as a structure change in a table, that is of significance to the system.

[0066] "Extensible Markup Language (XML)" means a semantic-preserving markup language used for interchanging data between heterogeneous systems.

[0067] "Foreign Key" means a value stored in a table which is the Primary Key of another table. Used to create a reference between two tables, such as Person.addrId and Address.id.

[0068] "Global Ontology" is synonymous with Common Ontology as defined above.

[0069] "Hub" means the central entry point into the system from external interfaces and from the GUI. The hub controls session management activities including user authentication, retaining information for a specific user about the time between logging in and logging out, and routing of user requests to the appropriate system components and routing of the results back to the requester.

[0070] "Immutability" means an inability to change. Immutable objects, once created, never change their value, which allows for certain assumptions and optimizations to be made when using them.

[0071] "Implementation Language" is a "programming" language in which an integration plan can be implemented. This includes languages such as Perl, Java, and so forth, but also languages such as XML which are not true programming languages per-se.

[0072] "Index" means a hash value calculated for a row based on fields within that row which can then be used for faster querying, such as creating an index of Person.LastName so that queries for Person records by LastName will be faster.

[0073] "Integration Validation" means performing an error check to determine the correctness of newly generated interoperability adapter code as well as ensuring that the newly generated code will not corrupt transported information or adversely impact the targeted data source, as well as other existing interoperability code structures.

[0074] "Interface" means a boundary across which two independent systems meet and act on or communicate with each other.

[0075] "Language Descriptor" is an object which describes a language in a form readable by software. A descriptor would include things like the name of the language, the statement-terminator character, the comment character, the string constant-delimiter, and so forth.
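By way of a minimal sketch, a Language Descriptor with the fields listed above could be represented as follows. The class name, field names and helper methods are assumptions for illustration only. The sketch does not escape delimiter characters inside string constants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageDescriptor:
    """Hypothetical software-readable description of an implementation language."""
    name: str
    statement_terminator: str
    comment_prefix: str
    string_delimiter: str

    def statement(self, text: str) -> str:
        # Append the language's statement-terminator character.
        return f"{text}{self.statement_terminator}"

    def comment(self, text: str) -> str:
        # Prefix the text with the language's comment character(s).
        return f"{self.comment_prefix} {text}"

    def string_constant(self, value: str) -> str:
        # Wrap the value in the language's string delimiter (no escaping).
        return f"{self.string_delimiter}{value}{self.string_delimiter}"


JAVA = LanguageDescriptor("Java", ";", "//", '"')
PERL = LanguageDescriptor("Perl", ";", "#", "'")

java_comment = JAVA.comment("generated adapter")
perl_line = PERL.statement("print " + PERL.string_constant("ok"))
```

A code generator could consult such a descriptor when emitting the same plan in different languages, so that language-specific punctuation is not hard-coded into the generator.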

[0076] "Microagent" means an intelligent software program that can be viewed as perceiving its environment through sensors that communicate what should be accomplished and in turn act upon that environment through effectors which are software tools and services that dynamically determine how and where to satisfy the request.

[0077] "Micro Agent (software robots)" means intelligent software programs that use software tools and services on a person's behalf. Also known as softbots. Micro agents allow a person to communicate what they want accomplished and then dynamically determine how and where to satisfy the person's request.

[0078] "Modification Planning Microagent" means an intelligent software program that defines data mapping and interoperability operations between two or more application specific ontologies. The Modification Planning Micro Agent uses expert traces to dynamically synthesize transformation information between two or more ontologies by means of an inference engine (algorithm) to develop a sequence of actions (plans) that will achieve concept mapping and data transformation conditions which are representative of the ideal interoperability state required by the two or more application specific ontologies that exist within an integration environment.

[0079] "Ontological Comparative Knowledge Base" means the application specific Ontology that maintains information that pertains to a data source's infrastructure (Tables, Columns, Indexes, Foreign Keys, Triggers, Stored Procedures, Primary Keys, Other Constraints, Views, Aliases/Synonyms, etc.). The Assessment Microagent compares one point in time Ontological Comparative Knowledge Base to other point in time snapshots to determine if a change has occurred. Identified changes between two point-in-time versions of the Ontological Comparative Knowledge Base can be used to facilitate understanding, organizing, and formalizing information about the monitored data source supportive of the operational needs of the other micro agents.

[0080] "Ontology" means the specification of conceptualizations, used to help programs and humans share knowledge. In this usage, an ontology is a set of concepts--such as things, events, and relations--that are specified in some way (such as specific natural language) in order to create an agreed-upon vocabulary for exchanging information.

"Ontology Editor" means the mechanism that allows editing of existing ontology settings including information on specific concepts and relationships of a common or application specific ontology.

[0081] "Ontology Manager" means the mechanism that manages the persistence operation associated with storage and retrieval of various versions of the common ontology, application ontologies and application-to-application ontology mappings.

[0082] "Open Database Connectivity (ODBC)" means a widely accepted application programming interface (API) for database access that makes it possible to access different database systems with a common language. ODBC is based on CLI (Call Level Interface). There are ODBC drivers and development tools for a variety of operating systems such as Windows, Macintosh, UNIX and OS/2.

[0083] "Persistence" means that the information stored in a view has to continue to exist even after the application that saved and manipulated the data presented in the view has ceased to run.

[0084] Persistence provides a mechanism for server-side components to create, read, update, and delete and store multiple versions of system data.

[0085] "Planner" means the intelligent software program that takes input from application specific ontology generation processes, understands the differences and similarities between two or more heterogeneous application specific ontologies and generates an integration plan that includes the detailed concept mapping and data transformation rules between heterogeneous applications.

[0086] "Polling" means querying a source on a recurring schedule, such as once every 10 minutes.

[0087] "Primary Key" or "PK" is an identifier which uniquely identifies a single instance of a particular type of object. (e.g., a SSN is a Primary Key for a U.S. citizen).

[0088] "Schema" means the logical organization or structure for representing data that exists in a database. Schema includes definitions and relationships of data and shows abstract representations of an object's characteristics and its relationships to other objects. This process is completed by evaluating the data source's metadata, meta-relationships inclusive of the basic notions of parenthood, integrity, identity, and dependence, etc., which in turn, are compiled into a tag library that becomes the foundation of an application specific Ontological Comparative Knowledge Base.

[0089] "Script Executor Microagent" means the intelligent software program that executes the interoperability code that the Code Generator Agent generates from a generic Integration Plan in a specific implementation programming language selected by a human user.

[0090] "State Machine" means a construct used to describe a flow of events given input and the results of the currently executed state within the machine. State-machines allow for very flexible sequencing and decoupling of their component parts to allow the user of the state-machine to alter and customize its behavior with a minimum of effort. State-machines are normally represented as a directional graph in which each node of the graph represents a state of the machine ("startup", "login", "ftp", "done", "failure") and the branches within the graph represent the flow of control from state to state (`success` at the `login` state results in a transition to the `ftp` state, `failure` at the `login` state results in a transition to the `failure` state, and so forth).
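The state machine described above can be sketched as a transition table keyed by the current state and the result of executing it. The states and the two transitions out of "login" come from the example above; the remaining transitions, the function name run and the use of callables returning True or False are assumptions made for this sketch.

```python
# (current state, result) -> next state.
# Only the two "login" transitions are given in the text; the rest are assumed.
TRANSITIONS = {
    ("startup", "success"): "login",
    ("startup", "failure"): "failure",
    ("login", "success"): "ftp",
    ("login", "failure"): "failure",
    ("ftp", "success"): "done",
    ("ftp", "failure"): "failure",
}
TERMINAL = {"done", "failure"}


def run(actions, start="startup"):
    """Execute each state's action and follow the table; return the states visited."""
    state = start
    visited = [state]
    while state not in TERMINAL:
        result = "success" if actions[state]() else "failure"
        state = TRANSITIONS[(state, result)]
        visited.append(state)
    return visited


# Hypothetical actions: each returns True on success and False on failure.
ok = {"startup": lambda: True, "login": lambda: True, "ftp": lambda: True}
bad_login = dict(ok, login=lambda: False)

happy_path = run(ok)
failed_path = run(bad_login)
```

Because the flow of control lives in the table rather than in the actions, a user can alter the sequencing by editing the table alone, which is the decoupling the definition refers to.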

[0091] "Stored Procedure" means a compiled query stored on the database server and used for efficiency and encapsulation.

[0092] "Structured Query Language (SQL)" means a scripting language used to communicate with a database.

[0093] "Synectics" means the human problem-solving process based on logical elimination of options and heuristic reasoning.

[0094] "Trigger" means an entity within a database which is notified when a specified event occurs, such as a row being added to a table.

[0095] "Validating XML Parser" means a parser that, when parsing XML, validates both that the XML is well formed and that the XML is valid based on the rules specified in a specified DTD or XML-Schema file.

[0096] "WordNet" means a specific online lexical database of the English language, which is maintained by the Cognitive Science Laboratory at Princeton University. The WordNet is commonly used in the computer science field to compare words based on their meanings.

[0097] "Use case" means a formal description of a particular functionality or behavior that the system displays for specific situations.

[0098] "View" means a "fake" table normally composed of data from various tables which appears to the user as a regular database table, such as a consolidated view showing data from both Person and Address data in a single table.

[0099] "XML (Extensible Markup Language)" means a markup language developed by the World Wide Web Consortium (W3C) to organize and deliver content more reliably through the use of customized tags.

DESCRIPTION OF THE PREFERRED EMBODIMENT

[0100] We will now describe the various aspects of our invention.

[0101] Invention Overview

[0102] Every organization is unique and each company has its own distinctive configuration of hardware, software, databases, enterprise applications, product customizations and network infrastructure. Fixed models for integration don't scale because they fail to address a company's individuality. Our invention treats each monitored application within the integration environment as the center of its own unique universe, continually examining the application (data, business logic, etc.) for changes while accommodating the uniqueness of each application within the integration environment. This approach provides a system that is both platform and application independent, that efficiently and dynamically (in terms of time, resources, and event driven actions) analyzes changes to heterogeneous software applications, integration environments and/or data resources, and that provides robust application change management control that allows the user to immediately determine the downstream impact of installing product revisions, patches or new versions within his or her integration environment. Its revision control infrastructure can help solve data integration adapter maintenance and support issues, reduce dependencies on integration professional services consulting, enhance data security and decrease the risks associated with software upgrades.

[0103] The main aspect of our invention is an automated interoperability analysis and code generation tool, or intelligent, dynamic universal adaptor, that dynamically detects application changes, analyzes revisions, generates data mapping between heterogeneous applications, performs error validation, and executes necessary adapter modifications. It features a robust software infrastructure for adapter construction, maintenance and support that consistently develops, deploys and monitors Intelligent, Dynamic Adapters. When a monitored application has been modified, the invention uses a proactive planning and learning approach to determine how best to update the application's integration adapters. This significantly reduces the amount of human intervention as well as the risk, cost, time, and manual effort required to update application integration environments.

[0104] System Architecture

[0105] The system including our invention can be built on a highly extensible, flexible and robust distributed architecture allowing it to scale for an almost unlimited number of users and enterprise applications. The benefits of this architecture include the ability to be deployed in highly complex IT environments, the ability to distribute processing requirements across the IT environment without affecting other critical IT systems, and the ability to support failover, among other functions.

[0106] The distributed architecture can be built on Jini technology from Sun Microsystems, which allows highly distributed components to coexist independent of each other. Jini provides the infrastructure necessary for components to log services and allows other components to find those services when required. Along with Jini, other technologies can be used to further allow flexibility, extensibility and robustness. These technologies include Remote Method Invocation (RMI) for inter-process communication between different components and the use of JavaSpaces as a standard way to persist objects and messages across components. System architectures can be viewed in different ways. Two ways that have been used are Logical Architecture and Physical Architecture.

[0107] The Logical Architecture describes the behavior of a system's application. Since the system of the current invention can be written in Java, the descriptions of the logical architecture map directly to Java packages and classes. For the most part, component types can be mapped to Java packages. Components can be mapped to Java classes.

[0108] The Physical Architecture shows how the logical architecture is mapped to physical things, such as operating system (OS) processes and machines. Put another way, the components defined in the Logical Architecture are allocated onto OS processes and machines. This provides the perspective of how components map to the real, physical world. Because the system components can exist in multiple OS processes on multiple machines, the system architecture is distributed.

[0109] The system architecture is illustrated generally in FIG. 1 showing both logical architecture and physical architecture.

[0110] Logical Architecture

[0111] A number of major component types of the Logical Architecture can be classified as:

[0112] 1. Model

[0113] 2. Managers

[0114] 3. Factories

[0115] 4. Agents

[0116] 5. Desktop Client

[0117] 6. Hub

[0118] 7. Notifications

[0119] 8. Jini and JavaSpaces

[0120] 9. RMI

[0121] 10. Exceptions

[0122] Each is described below.

[0123] The Model

[0124] Model components contain data used by other components within the system. When data is exchanged between server and client components, the data is packaged as one or more Model objects. Examples of the Model component types, along with their components, are:

[0125] 1. Job (Jobid, JobStatus, JobSummary, Step)

[0126] 2. System (Application, AppId)

[0127] 3. User (UserData, UserId, User, UserName, UserPassword, UserPreferences)

[0128] 4. Change Specification

[0129] 5. Schema

[0130] 6. Application Ontology

[0131] 7. App2App Similarity Map

[0132] 8. Common Ontology

[0133] 9. Database

[0134] System Managers

[0135] Backend server components are implemented in the form of managers that address different aspects of the system. The Managers provide the server-side functionality for the system of our invention. Put another way, Managers provide the business behavior and rules for the system. Examples of Managers seen in FIG. 1 are:

[0136] 1. System Manager 2, which manages system-wide settings and data.

[0137] 2. Schema Manager 4, which provides, stores, lists, and deletes schemas.

[0138] 3. User Manager 6, which manages users and their preferences.

[0139] 4. Change Specification Manager 8, which manages storage and retrieval of change specifications. Each change specification represents the changes between two specific snapshots of a schema.

[0140] 5. Job Manager 10, which manages jobs that may run for a long time. Typically, jobs perform heavy analysis and automation.

[0141] 6. Task Manager 12, which manages and runs scheduled tasks.

[0142] 7. Ontology Manager 14, which manages the access to and modification of the Common Ontology and other application ontologies.

[0143] 8. Language Manager 16, which manages the different programming languages in which the system can produce integration adaptors, also referred to as dynamic adapters. This manager allows an advanced user to set preferences for the delivery of language-specific adaptors.

[0144] System Factories

[0145] The system of our invention has several factories running on the server side which produce specific kinds of models. Besides production of models, the factories also have the role of managing persistence operations for the models. These are seen below with reference to FIG. 1.

[0146] 1. Application Ontology Factory 18, which maps application schemata to the Common Ontology 35 and produces application-specific ontologies.

[0147] 2. App2App Similarity Mapper 20, which maps a specific application ontology to another application ontology and produces a map of potential integration points between the two applications.

[0148] 3. Ontology Editor 22, which acts both as a manager and a factory and manages direct human interaction with the Common Ontology 35 for validation, expansion and modification of the Common Ontology. It also provides a visual representation of the Common Ontology 35.

[0149] 4. Planner 24, which produces an interactive integration plan between two disparate applications based on the App2App Similarity Map.
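As a rough sketch of the kind of output the App2App Similarity Mapper produces, the following pairs each concept name from one application with its closest counterpart in another. Plain string similarity from Python's difflib module is used here purely as a stand-in for ontology-based comparison; it is not the method described in this document. The function names, the 0.6 threshold and the sample concept names are all assumptions for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for semantic comparison."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_map(source, target, threshold=0.6):
    """Map each source concept to its best-scoring target concept, if close enough."""
    pairs = {}
    for s in source:
        best = max(target, key=lambda t: name_similarity(s, t))
        if name_similarity(s, best) >= threshold:
            pairs[s] = best
    return pairs


# Hypothetical concept names from two disparate applications.
crm = ["customer_name", "customer_id", "phone"]
billing = ["cust_name", "cust_id", "invoice_total"]

candidates = similarity_map(crm, billing)
```

Here "customer_name" and "customer_id" are paired with "cust_name" and "cust_id", while "phone" has no counterpart above the threshold and is left unmapped. Candidate pairs of this kind are what a user would then browse and accept before a plan is produced.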

[0150] System Agents

[0151] The system implements agents that run on the server side, are highly adaptive and autonomous in nature and interact with internal and external components in a goal-oriented manner. These include:

[0152] 1. CodeGen Agent 26, which interacts with Planner 24, ChangeSpecification Manager 8 and external application-specific settings such as version and programming language to generate and adapt integration code.

[0153] 2. Deployment Agent 28, which interacts with external application environment elements and the CodeGen Agent 26 to deploy and validate code in a self-adapting fashion. It is self-adapting to the extent that when a change such as an IP address change occurs, it is detected and the deployment agent makes the necessary modification autonomously, or semi-autonomously by requesting further input from the human operator, to ensure the continued operation of the code.

[0154] Desktop Client

[0155] The system Desktop Client is seen in FIG. 1 in logical architecture form 7 and in physical architecture form 9. It is used to provide the graphical user interface (GUI) between users and the system. The Desktop Client runs on users' or clients' desktops. It can make requests of the system server components via system Proxies, receive data from those requests, and present that data to the user. Even though the Desktop Client is a full desktop application, it does not need to provide any business logic.

[0156] The Desktop Client contains the following views each functioning as indicated:

[0157] 1. Application Context, illustrated as Application Manager 11

[0158] Lists the applications which were previously defined by the users.

[0159] Shows detailed information for the selected application.

[0160] Adds, modifies or removes application definitions in response to user requests.

[0161] 2. Schema Context 13

[0162] Lists the previously collected schemas.

[0163] Shows detailed information for the selected schema.

[0164] Adds or removes schemas in response to system or user requests.

[0165] 3. Change Specification Context 15

[0166] Lists the previously created Change Specifications.

[0167] Shows detailed information for the selected change specification.

[0168] Adds or removes change specifications in response to system or user requests.

[0169] 4. Report Generation Context 17

[0170] Uses a File selection dialog to open previously saved reports.

[0171] Creates a new report from an existing schema or change specification.

[0172] Saves the current report to the local disk, in HTML or XML.

[0173] 5. Task List Context 19

[0174] Lists the pending/scheduled tasks for the current user.

[0175] Adds, modifies or removes a task.

[0176] 6. User Administration Context 21

[0177] Lists the users of the system

[0178] Sets up new users

[0179] Administers passwords

[0180] 7. Notification Context 23

[0181] Displays notifications

[0182] Sets up notification preferences

[0183] 8. Application Ontology View Context 25

[0184] Lists Application Ontologies

[0185] Displays Application Ontologies for browsing

[0186] 9. App2App Similarity Mapping Context 27

[0187] Lists App2App Similarity Maps

[0188] Displays App2App Similarity Maps for browsing and user acceptance

[0189] 10. Plan View Context 29

[0190] Lists integration Plans

[0191] Displays Plan for user browsing and acceptance

[0192] 11. Language Editor 31

[0193] Lists supported languages

[0194] Displays specific language settings for user browsing and preference selection

[0195] 12. Code Browser Context 33

[0196] Displays code in specific language for user browsing, saving and preference settings

[0197] A context as used above is a particular view or component of the user interface that the user can use to perform specific tasks, browse through system output or interact with the system in general. Each context has a server side counterpart with which it interacts to produce the desired functionality.

[0198] System Hub

[0199] The System Hub 30 is a broker, which means that it is used to connect client components with server components. It need not, and usually does not, however, perform the communication between clients and servers. Rather, the Hub provides clients (typically the Desktop Client) with components that can be used to directly communicate with server components using Java RMI (Remote Method Invocation) 32. In system terms, the Hub provides Proxies to clients. These Proxies know how to communicate directly with Managers, which run on the server.

[0200] A portion of the Hub runs on both the clients and the server. The portion of the Hub running on the server registers itself with Jini as a Jini service. To register in Jini means to make an entry in Jini that other services can look up and connect to if necessary. Once this registration takes place, client Hubs can find the server Hub. Communication between client Hubs and the server Hub takes place using RMI/JRMP.

[0201] A Proxy running on the client finds its associated Manager after the Manager has registered itself as an RMI server object with the server Hub. Once that registration takes place, Proxies can find Managers and they can communicate directly using RMI/JRMP. Manager registration is part of the initialization step for the Hub running on the server.

[0202] From the Desktop Client's perspective, communication with Managers to perform the needed processing is straightforward. When the Desktop Client is started, the client Hub is automatically created and initialized. Afterwards, the Desktop Client can ask the client Hub to provide a Proxy. The Desktop Client can then use the Proxy to communicate directly with its associated manager, bypassing the client Hub completely.
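By way of illustration only, the broker arrangement described above can be sketched in plain Java. The sketch is a single-process stand-in, not the system's implementation: the names Manager, SchemaManagerStub, ManagerProxy and HubSketch are hypothetical, and the direct object reference held by the proxy stands in for what would be an RMI/JRMP remote reference. It shows only that the Hub hands out a Proxy once and is not involved in subsequent calls.

```java
import java.util.HashMap;
import java.util.Map;

// All names below are hypothetical; the source does not give these signatures.
interface Manager {
    String handle(String request);
}

// Stand-in for a server-side Manager.
class SchemaManagerStub implements Manager {
    public String handle(String request) {
        return "schema:" + request;
    }
}

// Client-side Proxy. In the described system the link to the Manager
// would be a remote reference; here it is an ordinary object reference.
class ManagerProxy {
    private final Manager target;

    ManagerProxy(Manager target) {
        this.target = target;
    }

    String call(String request) {
        return target.handle(request); // goes straight to the Manager
    }
}

// The broker: it registers Managers and hands out Proxies,
// but it does not relay the calls themselves.
class HubSketch {
    private final Map<String, Manager> managers = new HashMap<>();
    int lookups = 0; // counts how often the Hub was consulted

    void register(String name, Manager manager) {
        managers.put(name, manager);
    }

    ManagerProxy proxyFor(String name) {
        lookups++;
        Manager manager = managers.get(name);
        if (manager == null) {
            throw new IllegalArgumentException("No manager registered: " + name);
        }
        return new ManagerProxy(manager);
    }

    public static void main(String[] args) {
        HubSketch hub = new HubSketch();
        hub.register("schema", new SchemaManagerStub()); // server-side initialization
        ManagerProxy proxy = hub.proxyFor("schema");     // one lookup through the Hub
        System.out.println(proxy.call("orders"));        // direct calls from here on
        System.out.println(proxy.call("customers"));
        System.out.println("hub lookups: " + hub.lookups);
    }
}
```

In this sketch the lookup counter stays at one however many calls are made through the Proxy, which mirrors the statement that the Desktop Client bypasses the client Hub once it holds a Proxy.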

[0203] System Notifications

[0204] Notifications provide events of interest to the system components. For example, a Desktop Client component may want to know when a particular job has been completed. The component would register interest for a "job completion" event for a specific user. Since registration takes place through Jini, other services that have been registered in Jini will be able to read the request and provide the information if available. When the job for that user has completed, a notification is sent to the registered Desktop Client component. Notifications, managed by Notifications Manager 34, provide a way to check on status rather than continuously polling that status. The system uses both push and pull methods of notifications. Notifications can be persistent, or stored, rather than transient. This means that a registered component receiving a notification does not have to be online at the time of the notification to receive the event. The component can register interest for a particular notification, disconnect from the system, reconnect at a later time, and receive any outstanding events. Notifications can also be set up to be distributed via email, SMS or any other kind of delivery mechanism.
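By way of illustration only, the persistent notification behavior described above (register interest, disconnect, reconnect, receive outstanding events) can be sketched as follows. The class NotificationStoreSketch and its methods are hypothetical, and an in-memory map stands in for the JavaSpaces-backed storage; delivery by email or SMS is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an in-memory map stands in for persistent storage.
class NotificationStoreSketch {
    // subscriber -> event types the subscriber registered interest in
    private final Map<String, List<String>> interests = new HashMap<>();
    // subscriber -> events stored but not yet delivered
    private final Map<String, List<String>> pending = new HashMap<>();

    void registerInterest(String subscriber, String eventType) {
        interests.computeIfAbsent(subscriber, k -> new ArrayList<>()).add(eventType);
        pending.computeIfAbsent(subscriber, k -> new ArrayList<>());
    }

    // Stores the event for every interested subscriber, whether or not
    // that subscriber is currently connected.
    void publish(String eventType, String payload) {
        for (Map.Entry<String, List<String>> entry : interests.entrySet()) {
            if (entry.getValue().contains(eventType)) {
                pending.get(entry.getKey()).add(eventType + ":" + payload);
            }
        }
    }

    // Called when a subscriber reconnects: returns and clears outstanding events.
    List<String> drain(String subscriber) {
        List<String> queue = pending.get(subscriber);
        if (queue == null) {
            return new ArrayList<>();
        }
        List<String> delivered = new ArrayList<>(queue);
        queue.clear();
        return delivered;
    }

    public static void main(String[] args) {
        NotificationStoreSketch store = new NotificationStoreSketch();
        store.registerInterest("desktop-client-1", "job completion");
        // The client is offline while these events occur.
        store.publish("job completion", "job 17");
        store.publish("schema change", "application A");
        store.publish("job completion", "job 18");
        // On reconnect the client receives only the events it registered for.
        System.out.println(store.drain("desktop-client-1"));
    }
}
```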

[0205] Jini and JavaSpaces

[0206] Jini 36 is an object-oriented, distributed processing infrastructure technology developed by Sun to enable the creation of dynamic distributed processing networks of services. Jini provides a way for servers to register their services (with Jini). Clients can use Jini to obtain access to those services. Services may run completely on either the server or client, or partially on both. Once a client has found a service, Jini is not used to facilitate the communication between clients and servers. Instead, the client and server communicate directly using the protocol defined by the service. Jini does, however, use RMI as its mechanism for servers to register services and clients to find those services.

[0207] JavaSpaces 38 is a Jini technology that provides transactionally secure, asynchronous object exchange and object storage for distributed applications. Instead of direct, synchronous communications, JavaSpaces allow applications to communicate indirectly and asynchronously. Using JavaSpaces allows application components to put objects into one or more JavaSpaces. Those objects can be retrieved later by other application components (in the same or different application) using JavaSpaces. JavaSpaces are Jini services, which can have leases so they can come and go on the network.

[0208] The system uses Jini services in two places:

[0209] 1. The Hub, which is a Jini service.

[0210] 2. Notifications, which use JavaSpaces, which, in turn, are Jini services.

[0211] The system uses JavaSpaces in two ways:

[0212] 1. Asynchronous messaging mechanism to support system notifications.

[0213] 2. Short-term data storage mechanism (e.g., holds job status for short period of time).

[0214] Java RMI

[0215] RMI (Remote Method Invocation), shown at 40 in FIG. 1 in respect of desktop clients, is a Java network protocol, which provides the distributed mechanism that allows system Proxies to communicate with Managers. RMI can host two other higher-level transport protocols, JRMP (Java Remote Method Protocol) and IIOP (Internet Inter-ORB Protocol). JRMP is the native, default, and Java-only higher-level protocol. IIOP allows Java objects to communicate with CORBA or J2EE objects. RMI relies on TCP/IP for its underlying network protocol.

[0216] RMI is used for communication between system Proxies and Managers, as well as the client and server portions of the Hub. The system currently can use the default RMI/JRMP.

[0217] Java Swing

[0218] Swing 42 is a technology that is part of standard Java. It provides (along with other complementary technologies, such as AWT) a framework and list of graphical components for building portable graphical user interfaces. Swing is usually used to build Intranet-based applications (i.e., those applications that exist behind company firewalls). Typically, Swing is not used for Internet-based applications.

[0219] XML

[0220] XML 44 is used to represent the following kinds of data:

[0221] 1. Schemas on the Desktop Client. DOM XML technology is used.

[0222] 2. Change Specifications on the Desktop Client (only written at this time). DOM XML technology is used.

[0223] 3. Reports on the Desktop Client. These reports can be transformed using a report template (XSLT) into an HTML file, which can be viewed later by the user.

[0224] 4. Properties on the Desktop Client (manually written, automatically read)

[0225] 5. Properties on the Server (manually written, automatically read)
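By way of illustration only, the transformation of an XML report into HTML by means of an XSLT report template (item 3 above) can be sketched with the standard javax.xml.transform API. The report structure (report, title, item) and the template shown are assumptions made for the example; the source does not specify the report schema or the template contents.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: report layout and template are illustrative assumptions.
class ReportTransformSketch {
    static final String REPORT_XML =
        "<report><title>Schema changes</title><item>Column added</item></report>";

    static final String TEMPLATE_XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/report\">"
      + "<html><body><h1><xsl:value-of select=\"title\"/></h1>"
      + "<ul><xsl:for-each select=\"item\">"
      + "<li><xsl:value-of select=\".\"/></li>"
      + "</xsl:for-each></ul></body></html>"
      + "</xsl:template></xsl:stylesheet>";

    // Applies the XSLT template to the XML report and returns the HTML text.
    static String toHtml(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter html = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(html));
            return html.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Report transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(REPORT_XML, TEMPLATE_XSLT));
    }
}
```

The returned string could then be written to the local disk as an HTML file for later viewing.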

DETAILED DESCRIPTION OF INVENTION COMPONENTS

[0226] The invention is illustrated in an alternate illustration in FIG. 2 and includes processes associated with Assessment Micro Agent, App2App Similarity Mapper, Planner, Hub, Error Validation and Code Generation components. Note that some of the components in FIG. 1 are in fact subcomponents of the functional components described hereafter. For instance, the Assessment Micro Agent component is composed of the Schema, Change Specification, Task and Job Managers in FIG. 1. In other words, the combination of these managers is an embodiment of the Assessment Micro Agent.

[0227] The functional components of the invention are described in FIG. 2. This figure shows how the functional components interact with each other and with two applications that are the target for integration.

[0228] First of all, applications A and B, which may be any ODBC or JDBC compliant data sources, are monitored by the Assessment Micro Agent component of the invention. Note that ODBC and JDBC are just examples of data source standards; the Assessment Micro Agent might support other standards as well, such as XML, HL7 or any other available standard that provides data structure information. The Assessment Micro Agent, when first installed, creates a complete inventory of the data structure and functionality of the data source and makes it available to other components of the invention as described below. If a change occurs in either of the applications, the Assessment Micro Agent notifies other components of the invention, which then act upon this information as described below.

[0229] Once the Assessment Micro Agent has been installed in two or more applications, it is possible to produce similarity maps between those applications based on the data structure inventory provided by it. In order to accomplish this, the Application Ontology Factory uses application data structure information provided by the Assessment Micro Agent and the information provided in the Common Ontology library to produce the application ontologies. The App2App Similarity Mapper then uses the information in the application ontologies to produce a similarity map between the applications. Once the similarity map is completed, the Planner uses the information contained in the similarity map to produce an integration plan. Then the CodeGen Agent uses the information provided in the integration plan to produce the integration code. After the integration code is validated by the Error Management Micro Agent, it is deployed as the x-walk file between the applications and thus they become integrated.

[0230] The details for the process of each of these components are described in more detail in the following sections.

[0231] Assessment Micro Agent

[0232] The Assessment Micro Agent serves three primary functions: schema discovery, change monitoring and system or user notification of changes.

[0233] FIG. 3 illustrates the process of schema discovery. The first time the Assessment Micro Agent 320 is installed for a given application 310, schema discovery is initiated. Schema discovery involves reading the meta-data stored in a data source 310 to produce a schema 360 that is placed into a memory model, which can then be displayed in textual 380 or graphic 390 form. This process is carried out by the Schema Manager 4 of FIG. 1 and includes collecting the following: data source information, data source driver information, table names, table types, indexes, foreign keys, column names, column data types, column precision, column nullability, primary key designation, view definitions, synonym and alias references, and remarks stored in the database schema as illustrated by 330, 340 and 350. The collected information can then be displayed by the client 17 of FIG. 1 in either a textual presentation 380 or graphic presentation 390. The schema 360 extracted in this manner becomes the input for ontology generation.

[0234] The invention's change monitoring capability provides, through the Change Specification Manager 8, detailed analysis of the software under consideration so that the user knows exactly what is different between product versions. The Change Specification Manager receives input of schemas from the Schema Manager 4. The Change Specification Manager 8 then creates change specifications if something has changed between versions of the schema. It can manage revision control against new versions, patches and application upgrades that may affect data interoperability and in turn makes possible the development, maintenance and support of intelligent, dynamic adapters that contain application-level business logic, dependencies and constraints at the sub-modular level. Using an event driven model that is triggered by a system change, the Change Specification Manager 8 automatically detects alterations in the database structure of an application by making comparisons of schemas generated by the Schema Manager 4. When an application is being monitored, the Change Specification Manager 8 proceeds to analyze the change by comparing the new schema to a previous schema or schemas. First the Change Specification Manager 8 is triggered by a user or a system event 410. As seen in FIG. 4, the Change Specification Manager, described subsequently, compares schema information 420 for one historical view of the schema of one application to another historical view of the same application. The trigger mechanism 410 can be set as a scheduled task in the task manager or by some application dependent event such as a trigger mechanism, which usually is included in most commercial database management systems. The comparisons are done first at the table level for name or type differences 430. Then for each table the Change Specification Manager 8 compares meta-data information 440 such as name and type and length changes for the fields, columns, indices, primary keys, foreign keys, etc. The changes are then stored as a change specification 450 for use by other components of the invention. If required to do so, the Change Specification Manager can show the change specification to the user via the Change Specification Browser 15.
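By way of illustration only, the two-level comparison described above (tables first, then the meta-data of each table) can be sketched as follows. The sketch makes simplifying assumptions: a schema is reduced to a map from table name to a map of column name to column type, the class name SchemaDiffSketch and the change labels are hypothetical, and indices, keys and other meta-data are omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: a schema is modeled as table -> (column -> type).
class SchemaDiffSketch {

    static List<String> diff(Map<String, Map<String, String>> oldSchema,
                             Map<String, Map<String, String>> newSchema) {
        List<String> changes = new ArrayList<>();

        // Step 1: table-level comparison over the union of table names.
        TreeSet<String> tables = new TreeSet<>(oldSchema.keySet());
        tables.addAll(newSchema.keySet());
        for (String table : tables) {
            if (!newSchema.containsKey(table)) {
                changes.add("TABLE_REMOVED " + table);
                continue;
            }
            if (!oldSchema.containsKey(table)) {
                changes.add("TABLE_ADDED " + table);
                continue;
            }
            // Step 2: column-level comparison for tables present in both versions.
            Map<String, String> oldCols = oldSchema.get(table);
            Map<String, String> newCols = newSchema.get(table);
            TreeSet<String> columns = new TreeSet<>(oldCols.keySet());
            columns.addAll(newCols.keySet());
            for (String column : columns) {
                String before = oldCols.get(column);
                String after = newCols.get(column);
                String name = table + "." + column;
                if (before == null) {
                    changes.add("COLUMN_ADDED " + name + " " + after);
                } else if (after == null) {
                    changes.add("COLUMN_REMOVED " + name);
                } else if (!before.equals(after)) {
                    changes.add("COLUMN_CHANGED " + name + " " + before + " -> " + after);
                }
            }
        }
        return changes; // plays the role of the change specification
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> v1 = new HashMap<>();
        Map<String, String> customerV1 = new HashMap<>();
        customerV1.put("id", "INTEGER");
        customerV1.put("name", "VARCHAR(40)");
        v1.put("customer", customerV1);

        Map<String, Map<String, String>> v2 = new HashMap<>();
        Map<String, String> customerV2 = new HashMap<>();
        customerV2.put("id", "INTEGER");
        customerV2.put("name", "VARCHAR(80)");
        customerV2.put("email", "VARCHAR(120)");
        v2.put("customer", customerV2);

        for (String change : diff(v1, v2)) {
            System.out.println(change);
        }
    }
}
```

Sorting table and column names before comparison keeps the resulting list in a stable order, which makes successive change specifications easier to compare or display.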

[0235] The Assessment Micro Agent resides on an application server. The Assessment Micro Agent is application/product/version agnostic, which means that because its focus is exclusively on data structures, it does not depend on particular implementations of applications, products or versions of those applications or products.

[0236] In our implementation of the Assessment Micro Agent, we have further broken it down to at least four more components that provide distinctively useful functionality. These include:

[0237] Schema Manager. This component connects to applications through standard interfaces, which include JDBC, ODBC, Flat File Translators, and the like. It makes an analysis of the application and extracts the meta-data model in the form of a schema. The schema manager stores the schema and then provides an interface to other components to retrieve t