United States Patent5343554
Koza , ; et al.August 30, 1994

Title

Non-linear genetic process for data encoding and for solving problems using automatically defined functions

Abstract

An apparatus and method for solving problems using automatic function definitions, for solving problems using recursion and for performing data encoding. The present invention includes an apparatus and process for creating a population and then evolving that population to generate a result. When solving problems using automatic function definition, the apparatus and process initially creates a population of entities. Each of said entities has sub-entities of internally and externally invoked sub-entities. The externally invoked sub-entities are capable of having actions, invocations of sub-entities which are invoked internally, and material. Also, each sub-entity which is invoked internally is capable of including actions, invocations of internally invocable sub-entities, material provided to the externally invocable sub-entity, and material. The population is then evolved to generate a solution to the problem. When using the process to solve problems using recursion, the entities in the population are constructed in such a manner as explicitly to represent the termination predicate, the base case and the non-base case of the recursion. Each entity has access to a name denoting that entity so as allow recursive references. The population is then evolved to generate a solution to the problem. When encoding a set of data values into a procedure capable of approximating those data values, the apparatus and process initially creates a population of entities. The population is then evolved to generate a solution to the problem.


Inventors:Koza; John R. (Los Altos, CA), Rice; James P.  (Redwood City, CA)
Assignee:Koza; John R. (Los Altos, CA)
Appl. No.:881507
Filed:May 11, 1992

Current U.S. Class:706/13 
Field of Search:395/13

U.S. Patent Documents
4697242September 1987Holland et al.
Other References
Towards the Evolution of Symbols; Proc. of the 2nd Int. Conf. on Genetic Algorithms; Dolan et al.; Jul. 28-31, 1987; pp. 123-131. .
Symbolic Schemata in Connectionist Memories: Role Binding and the Evolution of Structure; UCLA-Al-87-11; Dolan et al.; pp. 1-23. .
Cognitive Systems Based on Adaptive Algorithms; Cognitive Systems; Holland et al.; 1978; pp. 313-329. .
On Using Genetic Algorithms to Search Program Spaces; Proc. of the 2nd Int. Conf. on Genetic Algorithms; De Jong; Jul. 28-31, 1987; pp. 210-216. .
Tree Structured Rules in Genetic Algorithms; Proc. of the 2nd Int. Conf. on Genetic Algorithms; Bickel et al.; Jul. 28-31, 1987; pp. 77-81. .
An Adaptive Crossover Distribution Mechanism for Genetic Algorithms; Proc. of the 2nd Int. Conf. on Genetic Algorithms; Schaffer et al.; Jul. 28-31, 1987; pp. 36-40. .
The Argot Strategy: Adaptive Representation Genetic Optimizer Technique; Proc. of the 2nd Int. Conf. on Genetic Algorithms; Sheafer; Jul. 28-31, 1987; pp. 50-58. .
Using the Genetic Algorithm to Generate LISP Source Code to Solve the Prisoner's Dilemma; Proc. of the 2nd Int. Conf. on Genetic Algorithms; Fujiki et al.; Jul. 28-31, 1987; pp. 236-240. .
A Darwinian Approach to Artificial Neural Systems; Proc. of Systems, Man, and Cybernetics Conference; Knisley; 1987; pp. 572-577. .
Sims, Karl, "Artificial Evolution for Computer Graphics", Computer Graphics, vol. 25, No. 4, pp. 319-328, Jul. 1991. .
Storer, James A. and Cohn, Martin, "The Use of Factual Theory in a Video Compression System", Data Compression Conference DCC '92, pp. 259-268, 1992..~
Primary Examiner: MacDonald; Allen R.
Attorney, Agent or Firm:Blakely, Sokoloff, Taylor & Zafman

Parent Case Text



This application is a continuation-in-part of co-pending U.S. patent application Ser. No. 07/787,748, filed Nov. 5, 1991, which is a continuation of U.S. patent application (continuation-in-part) Ser. No. 07/500,791, filed Mar. 28, 1990now abandoned, titled Non-linear Genetic Algorithms for Solving Problems by Finding a Fit Composition of Functions, which is a continuation-in-part of Ser. No. 07/196,973, filed May 20, 1988, now U.S Pat. No. 4,935,877, issued Jun. 19, 1990, titled Non-linear Genetic Algorithms for Solving Problems.

Claims


We claim:
1. In a computing system having at least one processor and at least one memory, a computer implemented process for solving a problem comprising the steps of:
creating a population of programmatic entities having sub-entities, wherein at least one of said sub-entities is externally invocable and at least one of said programmatic entities in the population has at least one internally invocable sub-entity, said at least one externally invocable sub-entity including at least one invocation of an internally invocable sub-entity; and
evolving said population, including the step of executing the population of programmatic entities, wherein said at least one externally invocable sub-entity invokes said at least one internally invocable sub-entity to produce results and, wherein said step of evolving further includes the step of generating at least one new programmatic entity in response to the results.

2. The process defined in claim 1 wherein said step of evolving comprises a series of iterative steps, wherein said series of iterative steps includes executing each of said programmatic entities to produce a result, thereafter subjecting at least one of said programmatic entities to an operation for evolving said at least one of said programmatic entities to generate a new programmatic entity, and adding said new programmatic entity to said population.

3. The process defined in claim 1 wherein the step of creating a population includes the step of creating at least one internally invocable sub-entity within at least one of said programmatic entities that references another internally invocable sub-entity within said at least one of said programmatic entities.

4. The process defined in claim 1 wherein said at least one externally invocable sub-entity includes at least one action.

5. The process defined in claim 1 wherein said at least one externally invocable sub-entity includes material.

6. The process defined in claim 5 wherein material comprises material provided to said at least one externally invocable sub-entity.

7. The process defined in claim 5 wherein material comprises material provided to said at least one internally invocable sub-entity.

8. The process defined in claim 1 wherein said at least one internally invocable sub-entity includes material.

9. The process defined in claim 1 wherein said at least one internally invocable sub-entity includes at least one action.

10. In computing system having at least one processor and at least one memory, a computer implemented process for problem solving using a population of programmatic entities, wherein each of said programmatic entities includes sub-entities, wherein at least one of said sub-entities is externally invocable and at least one of said programmatic entities in the population has at least one internally invocable sub-entity, said at least one externally invocable sub-entity including at least one invocation of an internally invocable sub-entity, said process comprising iterations of a series of steps, each iteration comprising the steps:
executing each said programmatic entity to produce a result;
selecting at least one programmatic entity from said population using selection criteria, said selection criteria based on a fitness associated with each said programmatic entity, said selection criteria preferring each said programmatic entity having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness;
choosing and performing an operation wherein each chosen operation is one of the operations of crossover or reproduction;
retaining said selected programmatic entity such that said selected programmatic entity remains unchanged if said chosen operation is reproduction;
creating at least one new programmatic entity by crossover using a group of programmatic entities if said chosen operation is crossover, said group of programmatic entities comprising said selected programmatic entity and at least another programmatic entity, wherein a portion of the selected programmatic entity and a portion of said at least another programmatic entity are designed, such that the new programmatic entity created by crossover comprises said portion of said selected programmatic entity other than said designed portion and said designated portion of said at least another programmatic entity, said new programmatic entity differing in size and shape from said selected programmatic entity and said at least another programmatic entity when the designated portion of said selected programmatic entity and said designated portion of said at least another programmatic entity differ in size and shape; and
adding said new programmatic entity to said population.

11. The process defined in claim 4 wherein said crossover operation is restrained such that said designated portion of said at least another programmatic entity in said new programmatic entity includes only those actions, material and references to internally invocable sub-entities that have been given a meaning by the other than the designed portion of said selected programmatic entity.

12. The process defined in claim 11 further comprising the step of determining actions, materials and references to internally invocable sub-entities that have meaning to said selected programmatic entity.

13. The process defined in claim 10 wherein said step of creating includes creating at least one programmatic entity having a first internally invocable sub-entity referring to a second internally invocable sub-entity within that programmatic entity, such that said internally invocable sub-entity invokes said second internally invocable sub-entity within said at least one programmatic entity when executed.

14. The process defined in claim 10 wherein the step of executing includes the step of said at least one internally invocable sub-entity invoking itself recursively.

15. The process defined in claim 10 wherein the step of executing includes executing one of said programmatic entities having at least one internally invocable sub-entity, such that said at least one internally invocable sub-entity invokes itself by referring to another internally invocable sub-entity in said one of said programmatic entities.

16. The process defined in claim 10 wherein said step of selecting at least one programmatic entity further comprises selection criteria based on a probability that is proportional to said fitness associated with said programmatic entity.

17. The process defined in claim 10 wherein said step of choosing and performing an operation further comprises the operation of mutation such that if said chosen operation is mutation, a step of mutation occurs before said adding step, wherein said selected programmatic entity is mutated, such that at least one designated portion of said selected programmatic entity is replaced by a randomly generated portion, such that the randomly generated portion contains only actions, material and references to internally invocable sub-entities that have been given meaning by the other than said designated portion of said selected programmatic entity.

18. The process defined in claim 10 further comprising a step of removing at least one programmatic entity having a relatively low associated fitness.

19. The process defined in claim 10 further including the steps of determining if an individual programmatic entity in said population attains a pre-established fitness with respect to a solution to the problem and designating said individual programmatic entity as the solution to the problem.

20. The process defined in claim 10 wherein the initial population of programmatic entities includes programs created by randomly generating programs of various sizes and structures, said programmatic entities consisting of hierarchical sub-entities, said hierarchical sub-entities including at least either actions and material available for the problem.

21. The process defined in claim 10 wherein the steps of executing includes executing more than one programmatic entity simultaneously.

22. The process defined in claim 10 wherein said step of executing includes selecting more than one programmatic entity simultaneously.

23. The process defined in claim 10 wherein said step of choosing and performing is performed simultaneously for more than one programmatic entity.

24. The process defined in claim 10 wherein if said operation of reproduction is chosen for at least one of said programmatic entities then said step of executing is not repeated for said at least one of said programmatic entities.

25. In a computing system having at least one processor and a memory, a computer implemented process for solving a problem comprising the steps of:
creating an initial population of programmatic entities having sub-entities, wherein at least one of said sub-entities produces a result upon execution and at least one of said programmatic entities in the population has at least one function defining sub-entity, said one of said at least one sub-entity capable of including at least one invocation of a function defining sub-entity; and
evolving said population to generate a solution to said problem, wherein said step of evolving includes the step of executing each said programmatic entity to produce a result, wherein the function defining sub-entity is invoked by said one of the programmatic entities that produces results, such that the solution to said problem is derived from results produced by evolving said population.

26. The process defined in claim 25 wherein the step of creating an initial population includes the step of creating at least one programmatic entity having a function defining sub-entity that remains constant, such that the function defining sub-entity does not undergo evolution.

27. The process defined in claim 25 wherein said function defining sub-entity includes at least one action.

28. The process defined in claim 25 wherein said function defining sub-entity includes material.

29. The process defined in claim 28 wherein material comprises material provided to said one of said programmatic entities that produces results.

30. The process defined in claim wherein material comprises material provided to said function defining sub-entity.

31. A computing system for solving problems comprising a processor and a memory means coupled to said processor for storing a population of programmatic entities, wherein each of said programmatic entities is comprised of sub-entities, wherein at least one of said sub-entities is externally invocable and at least one of said programmatic entities in the population has at least one internally invocable sub-entity, said at least one externally invocable sub-entity including at least one invocation of an internally invocable sub-entity, said computing system further comprising:
means for executing each said programmatic entity to produce a result, said means for executing coupled to said memory means;
means for selecting at least one programmatic entity from said population using selection criteria, said selection criteria based on a fitness associated with each said programmatic entity, said selection criteria preferring each said programmatic entity having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness, said means for selecting coupled to said memory means;
means for choosing and performing an operation wherein each chosen operation is one of the operations of crossover or reproduction, said means for choosing and performing coupled to said memory means;
means for creating at least one new programmatic entity by crossover using a group of programmatic entities if said chosen operation is crossover, said group of programmatic entities comprising said selected programmatic entity and at least another programmatic entity, wherein a portion of the selected programmatic entity and a portion of said at least another programmatic entity are designated, such that said new programmatic entity created by crossover comprises the portion of said selected programmatic entity other than said designated portion and said designated portion of said at least another programmatic entity, wherein said crossover operation is restrained such that said designated portion of said at least another programmatic entity in said new programmatic entity includes only those actions, material and references to internally invocable sub-entities that have been given a meaning by the other than the designated portion of said selected programmatic entity, said new programmatic entity differing in size and shape from said selected programmatic entity and said at least another programmatic entity when the designated portion of said selected programmatic entity and said designated portion of said at least another programmatic entity differ in size and shape, said means for creating coupled to said memory means;
means for retaining said selected programmatic entity such that said selected programmatic entity remains unchanged if said chosen operation is reproduction, said means for retaining coupled to said memory means;
means for adding said new programmatic entity to said population, said means for adding coupled to said memory means.

32. The computing system defined in claim 31 wherein the means for executing includes means for invoking said sub-entities wherein at least one internally invocable sub-entity refers to and invokes another internally invocable sub-entity, such that said another internally invocable sub-entity is executed.

33. The computing system defined in claim 31 further comprising means for selecting at least one programmatic entity further comprising selection criteria based on a probability that is proportional to said fitness associated with said program.

34. The computing system defined in claim 31 further comprising means for choosing and performing the operation of mutation, such that if said chosen operation is mutation, a step of mutation occurs before said adding step, wherein said selected programmatic entity is mutated, wherein at least one portion of said selected programmatic entity is replaced by a randomly generated portion, such that said randomly generated portion contains only actions, material and references to internally invocable sub-entities having meaning in said selected programmatic entity.

35. The computing system defined in claim 31 further comprising means for removing at least one programmatic entity having a relatively low associated value.

36. The computing system defined in claim 31 further comprising means for removing said group from said population.

37. The computing system defined in claim 31 further including means for designating the solution coupled to the processor, wherein said means for designating the solution designates an individual programmatic entity in said population attaining a pre-established fitness is designated as the solution to the problem.

38. The computing system defined in claim 31 further comprising means for creating the initial population of programmatic entities by randomly generating programmatic entities of various size and structures, said programmatic entities having hierarchical sub-entities that include at least one action and material available for the problem.

39. The computing system defined in claim 31 wherein said means for executing executes said at least one programmatic entity, wherein at least one internally invocable sub-entity invokes itself recursively.

40. The computing system defined in claim 31 wherein means for executing executes said at least one programmatic entity, wherein at least one internally invocable sub-entity in one of said programmatic entities invokes itself by referring to another internally invocable sub-entity in said one of said programmatic entities.

41. In a computing system having at least one processor and a memory, a computer implemented process for automatically encoding a set of data values into a procedure for at least approximating said set of data values, using a population of programmatic entities of various sizes and shapes, wherein each programmatic entity is a hierarchical arrangement of actions and material, said process comprising iterations of a series of steps which generate said procedure, each iteration comprising the steps:
executing each said programmatic entity to produce a result;
selecting at least one programmatic entity from said population using selection criteria, said selection criteria based on a fitness associated with each said programmatic entity, said selection criteria preferring each said programmatic entity having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness;
choosing and performing an operation wherein each chosen operation is one of the operations of crossover or reproduction;
creating at least one new programmatic entity by crossover using a group of programmatic entities if said chosen operation is crossover, said group of programmatic entities comprising said selected programmatic entity and at least another programmatic entity, such that any new programmatic entity created by crossover comprises at least a portion of said selected programmatic entity and at least a portion of said at least another programmatic entity, said new programmatic entity differing in size and shape from said selected programmatic entity and said at least another programmatic entity when said at least a portion of said selected programmatic entity and said at least a portion of said at least another programmatic entity differ in size and shape;
retaining said selected programmatic entity such that said selected programmatic entity remains unchanged if said chosen operation is reproduction;
adding said new programmatic entity to said population.

42. The process defined in claim 41 wherein said set of data values corresponds to an image, wherein the step of executing each said programmatic entity produces a result approximating the image, and wherein the process further comprises the step of designating the programmatic entity that represents the image as having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness, such that said process performs image compression.

43. The process defined in claim 41 wherein said set of data values corresponds to an image, wherein the step of executing each said programmatic entity produces a result approximating the image, and wherein the process further comprises the step of designating the programmatic entity that represents the image as having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness, such that said process performs image encoding.

44. The process defined in claim 42 or claim 43 wherein each of said set of data values is a data structure, such that each data structure represents a plurality of components, and further wherein the step of designating that program having a relatively high associated fitness selects based on the quality of the approximation to the data values produced by said selected programmatic entity.

45. The process defined in claim 41 wherein each of said set of data values is a data structure, such that each data structure represents a plurality of components, and further wherein a programmatic entity having a relatively high associated fitness is selected based on the quality of the approximation to the data values produced by said selected programmatic entity.

46. The process defined in claim 41 further comprising the step of designating an individual programmatic entity in said population attaining a pre-established value of fitness with respect to encoding said data as an encoding that represents said data, said process including the step of translating said programmatic entity representing an encoding of said data into an executable form while maintaining the logical consistency of said programmatic entity representing said encoding of said data.

47. The process defined in claim 41 wherein said step of executing executes more than one programmatic entity simultaneously.

48. The process defined in claim 41 wherein said executing step is performed simultaneously for more than one element of said set of data for at least one programmatic entity.

49. The process defined in claim 41 wherein said iterating a series of steps is performed simultaneously for more than one programmatic entity.

50. A computing system for automatically encoding a set of data values into a procedure for at least approximating said set of data values comprising a processor and a memory means coupled to said processor for storing a population of programmatic entities of various sizes and shapes, wherein each programmatic entity is a hierarchical arrangement of actions and material appropriate to the domain of data values, said computing system further comprising:
means for executing each said programmatic entity to produce a result by performing said hierarchical arrangement of functions, said means for executing coupled to said memory means;
means for selecting at least one programmatic entity from said population using selection criteria, said selection criteria based on a fitness associated with each said programmatic entity, said selection criteria preferring each said programmatic entity having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness, said means for selecting coupled to said memory means;
means for choosing and performing an operation wherein each chosen operation is one of the operations of crossover or reproduction, said means for choosing and performing coupled to said memory means;
means for creating at least one new programmatic entity by crossover using a group of programmatic entities if said chosen operation is crossover, said group of programmatic entities comprising said selected programmatic entity and at least another programmatic entity, such that any new programmatic entity created by crossover comprises at least a portion of said selected programmatic entity and at least a portion of said at least another programmatic entity, said new programmatic entity differing in size and shape from said selected programmatic entity and said at least another programmatic entity when said at least a portion of said selected program and said at least a portion of said at least another program differ in size and shape, said means for creating coupled to said memory means;
means for retaining said selected programmatic entity such that said selected programmatic entity remain unchanged if said chosen operation is reproduction, said means for retaining coupled to said memory means;
means for adding said new programmatic entity to said population, said means for adding coupled to said memory means, wherein said computing system generates a computer program representing said data, such that said data is encoded.

51. In a computing system having at least one processor and at least one memory, a computer implemented process for solving a problem using a population of named programmatic entities of various sizes and shapes, wherein each programmatic entity has a hierarchical arrangement of actions and material, at least one programmatic entity containing at least one reference to itself by use of said name, said process comprising iterations of a series of steps, each iteration comprising the steps:
executing each said programmatic entity to produce a result, wherein at least one of the programmatic entities invokes itself recursively;
selecting at least one programmatic entity from said population using selection criteria, said selection criteria based on a fitness associated with each said programmatic entity, said selection criteria preferring each said programmatic entity having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness;
choosing and performing an operation wherein each chosen operation is one of the operations of crossover or reproduction;
creating at least one new programmatic entity by crossover using a group of programmatic entities if said chosen operation is crossover, said group of programmatic entities comprising said selected programmatic entity and at least another programmatic entity, wherein a portion of the selected programmatic entity and a portion of said at least another programmatic entity are designated, such that the new programmatic entity created by crossover comprises said portion of said selected programmatic entity other than said designated portion and said designated portion of said at least another programmatic entity, said new programmatic entity differing in size and shape from said selected programmatic entity and said at least another programmatic entity when the designated portion of said selected programmatic entity and said designated portion of said at least another programmatic entity differ in size and shape;
retaining said selected programmatic entity such that said selected programmatic entity remains unchanged if said chosen operation is reproduction;
adding said new programmatic entity to said population.

52. The process defined in claim 51 wherein the step of executing includes executing more than one programmatic entity is activated simultaneously.

53. The process defined in claim 51 wherein said iterating a series of steps is performed simultaneously for more than one programmatic entity.

54. A computing system for solving a problem comprising at least one processor and at least one memory means coupled to said at least one processor for storing a population of named programmatic entities of various sizes and shapes, wherein each programmatic entity has a hierarchical arrangement of actions and material, said programmatic entity containing at least one reference to itself by use of a name, said computing system comprising:
means for executing each said programmatic entity to produce a result by performing said hierarchical arrangement of functions, wherein at least one programmatic entity invokes itself recursively, said means for executing coupled to said memory means;
means for selecting at least one programmatic entity from said population using selection criteria, said selection criteria based on a fitness associated with each said programmatic entity, said selection criteria preferring each said programmatic entity having a relatively high associated fitness over each said programmatic entity having a relatively low associated fitness, said means for selecting coupled to said memory means;
means for choosing and performing an operation wherein each chosen operation is one of the operations of crossover or reproduction, said means for choosing and performing coupled to said memory means;
means for creating at least one new programmatic entity by crossover using a group of programmatic entities if said chosen operation is crossover, said group of programmatic entities comprising said selected programmatic entity and at least another programmatic entity, wherein a portion of the selected programmatic entity and a portion of said at least another programmatic entity are designated, such that the new programmatic entity created by crossover comprises said portion of said selected programmatic entity other than said designated portion and said designated portion of said at least another programmatic entity, said new programmatic entity differing in size and shape from said selected programmatic entity and said at least another programmatic entity when the designated portion of said selected programmatic entity and said designated portion of said at least another programmatic entity differ in size and shape;
means for retaining said selected programmatic entity such that said selected programmatic entity remains unchanged if said chosen operation is reproduction, said means for retaining coupled to said memory means;
means for adding said new programmatic entity to said population, said means for adding coupled to said memory means, wherein said computing system generates a computer program representing said data, such that said data is encoded.

55. In a computing system, a computer implemented process for solving an original problem comprising the stages of:
(a) decomposing said original problem into at least one subproblem;
(b) finding at least one solution to said at least one subproblem; and
(c) assembling said at least one solution to said at least one subproblem into a solution to said original problem;
wherein stages (a)-(c) are implemented using a series of steps including
creating a population of programmatic entities having sub-entities, wherein at least one of said sub-entities is externally invocable and at least one of said programmatic entities in the population has at least one internally invocable sub-entity, said at least one externally invocable sub-entity including at least one invocation of internally invocable sub-entities; and
evolving said population, including the step of executing the programmatic entities, wherein said at least one externally invocable sub-entity performing said at least one invocation of at least one of said internally invocable sub-entities to produce results and, wherein said step of evolving further includes the step of generating at least one new programmatic entity in response to the results, such that at least one of the programmatic entities in said population is designated a solution to the problem.

56. The process defined in claim 55 wherein said at least one externally invocable sub-entity includes at least one action.

57. The process defined in claim 55 wherein said at least one externally invocable sub-entity includes material.

58. The process defined in claim 57 wherein material comprises material provided to said at least one externally invocable sub-entity.

59. The process defined in claim 57 wherein material comprises material provided to said at least one internally invocable sub-entity.

60. The process defined in claim 55 wherein said at least one internally invocable sub-entity includes material.

61. The process defined in claim 55 wherein said at least one internally invocable sub-entity includes at least one action.

Description

BACKGROUND OF THE INVENTION

1. The Field of the Invention

The field of the invention is computer-implemented genetic algorithms. More specifically, the field is genetic algorithms useful for problem solving. The field spans the range of problems wherein a fit composition of functions may be found as a solution to the problem.

2. The Prior Art

The Natural Selection Process in Nature

The natural selection process provides a powerful tool for problem solving. This is shown by nature and its various examples of biological entities that survive and evolve in various environments. In nature, complex combinations of traits give particular biological populations the ability to adapt, survive, and reproduce in their environments. Equally impressive is the complex, relatively rapid, and robust adaptation and relatively good interim performance that occurs amongst a population of individuals in nature in response to changes in the environment. Nature's methods for adapting biological populations to their environment and nature's method of adapting these populations to successive changes in their environments (including survival and reproduction of the fittest) provides a useful model. This model can be used to develop methods to solve a wide variety of complex problems which are generally thought to require "intelligence" to solve.

In nature, a gene is the basic functional unit by which hereditary information is passed from parents to offspring. Genes appear at particular places (called gene "loci") along molecules of deoxyribonucleic acid (DNA). DNA is a long thread-like biological molecule that has the ability to carry hereditary information and the ability to serve as a model for the production of replicas of itself. All known life forms on this planet (including bacteria, fungi, plants, animals, and humans) are based on the DNA molecule.

The so-called "genetic code" involving the DNA molecule consists of long strings (sequences) of 4 possible gene values that can appear at the various gene loci along the DNA molecule. For DNA, the 4 possible gene values refer to 4 "bases" named adenine, guanine, cytosine, and thymine (usually abbreviated as A, G, C, and T, respectively). Thus, the "genetic code" in DNA consists of a long strings such as CTCGACGGT.

A chromosome consists of numerous gene loci with a specific gene value (called an "allele") at each gene locus. The chromosome set for a human being consists of 23 pairs of chromosomes. The chromosomes together provide the information and the instructions necessary to construct and to describe one individual human being and contain about 3,000,000,000 genes. These 3,000,000,000 genes constitute the so-called "genome" for one particular human being. Complete genomes of the approximately
5,000,000,000 living human beings together constitute the entire pool of genetic information for the human species. It is known that certain gene values occurring at certain places in certain chromosomes control certain traits of the individual, including traits such as eye color, susceptibility to particular diseases, etc.

When living cells reproduce, the genetic code in DNA is read. Sub-sequences consisting of 3 DNA bases are used to specify one of 20 amino acids. Large biological protein molecules are, in turn, made up of anywhere between 50 and 500 such amino acids. Thus, this genetic code is used to specify and control the building of new living cells from amino acids.

The organisms consisting of the living cells created in this manner spend their lives attempting to deal with their environment. Some organisms do better than others in grappling with (or opposing) their environment. In particular, some organisms survive to the age of reproduction and therefore pass on their genetic make-up (chromosome string) to their offspring. In nature, the process of Darwinian natural selection causes organisms with traits that facilitate survival to the age of reproduction to pass on all or part of their genetic make-up to offspring. Over a period of time and many generations, the population as a whole evolves so that the chromosome strings in the individuals in the surviving population perpetuate traits that contribute to survival of the organism in its environment.

Prior Art Genetic Algorithms

Genetic algorithms are highly parallel algorithms that transform populations of individual mathematical objects (typically fixed-length binary character strings) into new populations using operations patterned after (1 ) natural genetic operations such as sexual recombination (crossover) and (2) fitness proportionate reproduction (Darwinian survival of the fittest). Genetic algorithms begin with an initial population of individuals (typically randomly generated) and then iteratively (1) evaluate the individuals in the population for fitness with respect to the problem environment and (2) perform genetic operations on various individuals in the population to produce a new population. John Holland of the University of Michigan presented the pioneering formulation of genetic algorithms for fixed-length binary character strings in Adaptation in Artificial and Natural Systems, by Professor John H. Holland, 1975. Holland established, among other things, that the genetic algorithm is a mathematically near optimal (minimax) approach to adaptation in that it maximizes expected overall average payoff when the adaptive process is viewed as a multi-armed slot machine problem requiring an optimal allocation of future trials given currently available information. Recent work in genetic algorithms and genetic classifier systems can be surveyed in Grefenstette ( 1985), Grefenstette (1987), Goldberg (1989), Davis (1987), and Schaffer (1989).

In Adaptation in Artificial and Natural Systems, Holland summarizes his research in genetic algorithms and presents an overall mathematical theory of adaptation for both natural and artificial systems. A key part of this book described a "genetic algorithm" patterned after nature's methods for biological adaptation. However, a limitation of this work resides in using fixed length binary strings to represent the population. U.S. Pat. No. 4,697,242 (Holland) and U.S. Pat. No.
4,881,178 (Holland) are examples of processes which use fixed length binary strings with a genetic algorithm.

Empirical studies by various researchers have demonstrated the capabilities of such genetic algorithms in many diverse areas, including function optimization (De Jong 1980), operation of a gas pipeline (Goldberg 1983), and many others reviewed in Goldberg (1989).

In the chapter entitled "An Overview" contained in the 1987 collection Genetic Algorithms and Simulated Annealing, Lawrence Davis and Martha Steenstrup stated, "In all of Holland's work, and in the work of many of his students, chromosomes are bit strings--lists of 0's and 1's." In addition, they continue, "Some researchers have explored the use of other representations, often in connection with industrial algorithms. Examples of other representations include ordered lists (for bin-packing), embedded lists (for factory scheduling problems), variable-element lists (for semiconductor layout), and the representations used by Glover and Grefenstette in this volume."

Some researchers have attempted to solve search and optimization problems using schemes patterned after evolution that employed mutation-plus-save-the-best strategies. Examples are Box (1957), Hicklin (1986), and the 1966 book by Fogel, Owens, and Walsh entitled Artificial Intelligence Through Simulated Evolution. The few results obtained from these efforts were highly specific to particular application domains and largely reflect the cleverness of implementation rather than its usefulness as a general technique for achieving adaptive increases in fitness in populations. It is important to note that mutation is not the primary means by which biological populations in nature improve their fitness and it is not the primary means used in the present invention.

Since Holland's 1975 book, Holland and various colleagues have developed a novel application of conventional genetic algorithms called a "genetic classifier system". A classifier system is a group of rules. Each rule consists of a condition part and an action part (i.e. an IF-THEN rule). Both the condition part and action part of each rule are like the individuals in the conventional genetic algorithm in that they are a strings of 0's and 1's of fixed length. In a classifier system, messages (consisting of binary strings) are received from the environment and invoke those rules whose conditional part ("IF" part) match the message (binary string) coming in. This invokation triggers the action part ("THEN" part) of the rule. The action part of a rule sends out a new message (binary string).

Classifier Systems are described in the 1978 article "Cognitive Systems based on Adaptive Algorithms" (by Holland and Judith S. Reitman) published in Pattern-Directed Inference Systems, edited by D. A. Waterman and Frederick Hayes-Roth; and David E. Goldberg's 1983 dissertation entitled Computer-Aided Gas Pipeline Operations Using Genetic Algorithms and Rule Learning. In classifier systems, credit is assigned to chains of individual rules that are invoked using a credit allocation scheme known as the "bucket brigade". The Holland process is a combination of a classifier system and a "bucket brigade algorithm". A 1987 paper by Cory Fujiki and John Dickinson in Genetic Algorithms and Their Applications: Proceedings of the Second International Conference on Genetic Algorithms, (John J. Grefenstette, 1987) describes a computer program written in LISP for solving the Prisoner's Dilemma using binary strings of fixed length and IF-THEN classifier rules. In addition, Smith (1980, 1983) has placed IF-THEN rules in genetic strings in lieu of individual characters.

We call conventional genetic algorithms "linear" because they manipulate strings (sequences) of characters over a fixed alphabet (typically strings of binary digits 0 and 1). This is in contrast to the "non-linear" situation in which the objects being manipulated are hierarchical expressions consisting of a hierarchical arrangement of functions and terminals.

The reasons for limiting the conventional genetic algorithm to binary strings of fixed length appear in the literature. First, in his 1983 dissertation entitled Computer-Aided Gas Pipeline Operation Using Genetic Algorithms and Rule Learning, David E. Goldberg argues that any binary string of the common fixed length always has an interpretation (via a well-defined representation scheme) to the problem being solved. This might be called the property of being "well defined" and it is a desirable property.

Secondly, if each individual in the population consists of a binary string of fixed length, then the crossover operation will always produce another binary string of fixed length when applied to any two individuals in the population. This might be called a "closure" property and it is also a desirable property. Of course, binary strings of fixed length are not the only way of achieving these desirable properties of closure and being well-defined.

In Adaptation in Natural and Artificial Systems (1975, page 71), Holland argues in favor of strings consisting only of 0's and 1's (i.e. binary strings) in the conventional genetic algorithm on the basis that the number of strings in the search space that are searched automatically using what he calls the "implicit parallelism" of the conventional genetic algorithm is highest when the strings consist only of two possibilities. This point is true; however, it should not be the controlling consideration. For various reasons cited hereinafter, limiting the genetic algorithm to the one dimensional world of linear strings of fixed length (and, in particular, binary strings of fixed length) precludes solving many problems. The field of computer science is replete with other situations where it is highly unrealistic to assume that the size or shape of a problem is known in advance to the solver so that he can use this information to rigidly pre-specify the size and shape of his search in advance.

Using fixed length binary strings in conventional genetic algorithms limits their ability to solve many problems. The following two separate example problems illustrate additional limitations of conventional genetic algorithms.

First, suppose we want a computer to program itself to solve the problem of finding the point at which two intersecting straight lines intersect. The point of intersection of two straight lines is the pair of numbers that satisfy the two linear equations in two variables that represent the lines. Thus, the computer program we are seeking would use the coefficients of the two equations and various mathematical operators (such as multiplication, subtraction, etc.) to produce the desired answer. To make the problem of having a computer learning to program itself more realistic, it is best not to specify in advance the size or shape of the mathematical expression needed to solve the problem. It is also more realistic if the computer had access to various irrelevant inputs and extraneous mathematical operations that might confuse its search to find the solution to the problem.

There is no simple or convenient way to uniquely associate a binary string whose length is predetermined in advance with an arbitrary mathematical expression composed of specified mathematical operations (functions) and terminals. A binary string of length n can only represent 2.sup.n different things (no matter what the representation scheme). No matter how large an n is pre-selected in advance, there are always additional mathematical expressions.

Before continuing, it should be emphasized that it is not necessary to represent things of infinite size. Rather, what should be avoided is arbitrarily pre-setting a limit on the size and shape of the things being represented (even though any particular thing will itself be finite in size). In most problems, the size and shape of the solution are not necessarily known in advance. The process of solving the problem should be free to develop proposed solutions without any pre-set limit on the size and shape of the solution.

Even if an arbitrary maximum length specified in advance were acceptable, the method for associating each arbitrary mathematical expression (for example: A*B+C-D*E*F) with a binary string would necessarily obscure the underlying mathematical operations involve. The highly complex method used by Godel in 1931 in his proof of the Incompleteness Theorem is an example of such a method for making this kind of association. Thus, this first example problem highlights the need to be able to represent arbitrary mathematical expressions (involving various functions and terminals) whose length is not arbitrarily limited in advance (rather than merely strings of 0's and 1's of the same fixed length).

Let us now consider the problem of solving a system of two linear equations and also the problem of sequence induction.

It should be noted that if it is assumed that the two straight lines in this problem always intersect, the problem is entirely numerical. However, if the two lines might possibly be parallel, the answer from a computer program to this expanded version of the problem might appropriately be a symbolic response (e.g. "The Equations are inconsistent and the lines are parallel") rather than the numeric location of the point of intersection. This situation can be easily recognized by a computer program by checking to see if a certain computed value (the determinant) is zero. Thus, this expanded version of this first example problem highlights the need occasionally to accommodate symbolic processing and symbolic output from a computer program that normally produces a numeric output.

Second, consider the problem of predicting the future elements of a sequence of numbers from a sampling of early numbers from the sequence. This problem is an example of induction. Induction is the logical process by which one observes specific examples of some process (e.g. "The sun has come up every morning so far during my life") and then "induces" a reasonable underlying rule for the process (e.g. "The sun always comes up in the morning"). In applying inductive reasoning, there is no proof that the result is correct. Nonetheless, the process of induction is very important and indeed lies at the heart of all learning.

In contrast, deduction is the logical process in which one starts with some given premises (or facts) and some deductive rules of inference and then reaches a logical conclusion by repeatedly applying the deductive rules to the original given premises or facts. The sequence of steps used in deduction to reach a conclusion is called the proof.

If one is given a sampling of a sequence of numbers such as 0, 2, 4, 6, 8, 10, 12, 14 it is not difficult to reasonably induce that the next number in the sequence is 16. The number 16 is a reasonable induction because each previous element of the sequence is 2 times the element's position in the sequence (counting the first element as position 0). Note, however, that even elements of this simple numerical sequence cannot be represented with strings whose length has been specified in advance.

More interesting sequences involve more complicated mathematical operations. For example, the 6th element of the sequence 2, 4, 8, 16, 32, can be expressed directly in mathematics as 2 raised to the 6th power (i.e. 64). This sequence can also be expressed in mathematics using a recursion--that is, by defining the 6th element in terms of previous element(s) in the sequence. In this case, the m.sup.th element of the sequence is 2 times element m-1 of the sequence (that is, 2 times 32 is 64).

For some important mathematical sequences of integers, there is no known non-recursive expression for each element of the sequence, and the use of a recursion becomes a necessity, not merely an option. The well-known Fibonacci sequence 1, 1, 2,
3, 5, 8, 13, 21, 34, 55, is constructed by adding the 2 previous elements of the sequence. For example, 8 is the sum of 3 and 5, and 13 is the sum of 5 and 8. In general, the m.sup.th element of the Fibonacci sequence is the sum of element m-1 and element m-2 of the sequence (with the understanding that the first two elements of the sequence are a "default" value of 1).

Thus, the problem of sequence induction highlights the need to be able to represent recursions as well as arbitrary mathematical expressions (involving functions and terminals). It also re-emphasizes the need to be able to represent strings whose length has not been pre-specified in advance.

Many problems are best approached by developing hierarchies in which solutions to sub-problems are manipulated and assembled hierarchically into solutions to the original main problem. In fact, many mathematical problems are solved by first "decomposing" a larger problem into smaller sub-problems. Then, an attempt is made to solve each of the sub-problems. And, finally, the solutions to the sub-problems are assembled into a solution to the original problem. The problem of solving large numbers of equations with many variables and solving polynomial equations of high order are examples of problems where decomposition can be used. In some cases, there is a symmetry between this process of assembly and the solution to the individual sub-problems. That is, in this assembly process, the solutions to the sub-problems may be manipulated as if they themselves were merely the elements of a sub-problem.

Even when no symmetry is involved, a "hierarchy" develops when a problem is solved by decomposition. At the lowest level of the hierarchy, the sub-problem is solved. The hierarchy consists of combining the solutions of the sub-problem into the solution to the larger problem. Something similar is commonplace in computer programming in general. For example, sub-routines (or sub-procedures) are typically called by a main program. The main program is at the top of the hierarchy, typically organized to provide an overview of the solution to the whole problem. Each of the sub-routines called by the main program are found at one level lower on the hierarchy. If one of the sub-routines itself happens to call upon another sub-routine, that second sub-routine is one level lower on the hierarchy than the sub-routine which called it. Complex social organizations (such as corporations and military organizations), are similarly organized into hierarchies. The ability to decompose problems into hierarchies of sub-problems is generally important for solving problems.

It should be noted, however that the conventional genetic algorithm imposes at least five important limitations which restrict its usefulness in solving a broad range of problems.

First, the requirement that each individual in the population be a string of the same length arbitrarily limits consideration to only a pre-determined number of situations, cases, or states of the problem environment.

Secondly, the use of a binary string (a string of 0's and 1's) leads to a representation scheme involving an explosively large number of "different" solutions merely to handle consideration of only a few past populations. In contrast, if the representation scheme were not required to be rigidly structured in advance prior to the start of operation of the conventional genetic algorithm, a representation scheme involving only a relative handful of relevant possible histories might have evolved.

Thirdly, the individuals in the population are representational descriptions (codings) of a solution (as opposed to being actionable procedures which directly implement the solution). Any particular solution that one envisions and wants to include in the population must be first coded into a binary string of fixed length before it can be inserted into the population. Before any solution can be implemented, the binary string must be decoded into actionable instructions.

Fourthly, the binary strings of fixed length provide no hierarchical structure for potential solutions to the problem. The binary string is one dimensional. All items in the string operate at the same level.

Fifth, it is often true that conventional genetic algorithms are extremely efficient in searching large, complex, non-linear spaces to find an area that is especially good, but that other search techniques are better than conventional genetic algorithms in zeroing in on the final, precise, global optimum value in the search space. Thus, for some problems, it is common to use conventional genetic algorithms to quickly find the best neighborhood of the overall search space and then to switch to another search technique (such as simulated annealing or hill-climbing) to zero in on the precise global optimum value. This shortcoming of conventional genetic algorithms is, for many problems, the direct result of the fixed representation scheme selected at the beginning of the process. If the representation scheme were adaptive (i.e. not fixed), it could change its size and shape after getting into the right general neighborhood of the solution. It could then become more refined so that it would be capable of finding the precise global optimum solution to the problem.

Background on Genetic Programming Paradigm

Representation is a key issue in genetic algorithm work because genetic algorithms directly manipulate the coded representation of the problem and because the representation scheme can severely limit the window by which the system observes its world. Fixed length character strings present difficulties for some problems--particularly problems in artificial intelligence where the desired solution is hierarchical and where the size and shape of the solution is unknown in advance. The need for more powerful representations has been recognized for some time (De Jong 1985, De Jong 1987, De Jong 1988).

The structure of the individual mathematical objects that are manipulated by the genetic algorithm can be more complex than the fixed length character strings. Smith (1980, 1983) departed from the early fixed-length character strings by introducing variable length strings, including strings whose elements were if-then rules (rather than single characters). Holland's introduction of the classifier system (1986) continued the trend towards increasing the complexity of the structures undergoing adaptation. The classifier system is a cognitive architecture into which the genetic algorithm is embedded so as to allow adaptive modification of a population of string-based if-then rules (whose condition and action parts are fixed length binary strings).

In addition, we have recently shown that entire computer programs can be genetically bred to solve problems in a variety of different areas of artificial intelligence, machine learning, and symbolic processing (Koza 1989, 1990). In this recently developed "genetic programming" paradigm, the individuals in the population are compositions of terminals and functions appropriate to the particular problem domain. The set of terminals used typically includes inputs (sensors) appropriate to the problem domain and various constants. The set of functions used typically includes arithmetic operations, mathematical functions, conditional logical operations, and domain-specific functions. Each function in the function set must be well defined for any element in the range of every other function in the set which may appear as an argument to that function.

Often in writing computer programs, a portion of the programming code (e.g., a subroutine) is dedicated to defining a function, such that a particular calculation can be performed on various different combinations of arguments. For example, if an exponential function in the form of an approximation of the first five terms of the Taylor series is applied to a single variable x on one occasion in a computer program, a programmer could use the following code:

or the equivalent code in whatever programming language that is being utilized. However, if the same exponential function is applied to another variable y or a quantity, such as 3z.sup.2, later in the same program, the programmer would have to tediously reproduce the code for the specific variables. For instance, the code for variable y would be:

and the code for the quantity 3z.sup.2 would be:

In order to overcome this tedious process of writing separate code for each of the three situations, a programmer would want to be able to define a function in terms of a dummy variable (i.e., formal parameter) dv to accommodate all three uses of the exponential function, such as:

Once a function has been defined, it can be called an arbitrary number of times from an arbitrary number of different places in the program with different instantiations of its dummy variable (i.e., formal parameter), such as x, y and 3z.sup.2. Thus, the process of rewriting code can be avoided. Furthermore, defining functions enhances the understandability of a program because common calculations are highlighted.

Moreover, by defining and making multiple uses of a function, a problem can be decomposed into a hierarchy of which the defined function is a part. If one defined function is allowed to make use of another, previously defined function, the hierarchical decomposition is accentuated. Moreover, if one defined function is allowed to call on itself, either directly or indirectly, through a sequence of other functions, the hierarchical decomposition may be even more accentuated As a problem increases in size and complexity, decomposition of a problem using a function definition becomes an increasingly important tool for solving problems.

What is needed is a way to apply some of the general principles of biological natural selection that are embodied in the conventional genetic algorithm (i.e. survival of the fittest and crossing over of parents's traits to offspring) to a greatly expanded class of problems. In particular, what is needed is a method for adaptively creating computer programs involving complicated combinations of mathematical functions and their arguments, recursions, symbolic processing, and complicated data structures with no advance limitations on the size, shape, or complexity of the programs, including the use of function definitions created for the particular problem domain. One object of the present invention is to provide a genetic process to provide solutions for an expanded class of problems. A further object of the present invention is to provide a genetic process without any predetermined limits on the size, shape, or complexity of the members of the subject population.

In solving problems with genetically bred computer programs using a population composed of terminals and functions appropriate to the particular problem domain, a search space is developed for solving the problem in conjunction with the computer programs. This search space is the hyperspace of all possible compositions of functions that can be recursively composed of the available functions and terminals. The symbolic expressions (S-expressions) of the LISP programming language are an especially convenient way to create and manipulate the compositions of functions and terminals described above. These S-expressions in LISP correspond directly to the "parse tree" that is internally created by most compilers.

The basic genetic operations for the genetic programming paradigm are fitness proportionate reproduction and crossover (recombination). Fitness proportionate reproduction is the basic engine of Darwinian reproduction and survival of the fittest and operates for the genetic programming paradigm in the same way as it does for conventional genetic algorithms. The crossover operation for the genetic programming paradigm is a sexual operation that operates on two parental programs (i.e. LISP S-expressions) and produces two offspring S-expressions using parts of each parent. In particular, the crossover operation creates new offspring S-expressions by exchanging sub-trees (i.e. sub-lists) between the two parents. Because entire sub-trees are swapped, this genetic crossover (recombination) operation produces syntactically and semantically valid LISP S-expressions as offspring regardless of which allowable point is selected in either parent.

This genetic programming paradigm has been successfully applied (Koza 1989, 1990) to example problems in several different areas, including, but not limited to, (1) machine learning of functions (e.g. learning the Boolean 11-multiplexer function), (2) planning (e.g. developing a robotic action sequence that can stack an arbitrary initial configuration of blocks into a specified order), (3) automatic programming (e.g. discovering a computational procedure for solving pairs of linear equations, solving quadratic equations for complex roots, and discovering trigonometric identities), (4) sequence induction (e.g. inducing a recursive computational procedure for the Fibonacci and the Hofstadter sequences), (5) pattern recognition (e.g. translation-invariant recognition of a simple one-dimensional shape in a linear retina), (6) optimal control (e.g. centering a cart and balancing a broom on a moving cart in minimal time by applying a "bang bang" force to the cart), (7) symbolic "data to function" regression, symbolic "data to function" integration, and symbolic "data to function" differentiation, (8) symbolic solution to functional equations (including differential equations with initial conditions, integral equations, and general functional equations), (9) empirical discovery (e.g. rediscovering Kepler's Third Law, rediscovering the well-known econometric "exchange equation" MV=PQ from actual time series data for the money supply, the velocity of money, the price level, and the gross national product of an economy), and (10) simultaneous architectural design and training of neural networks.

Prior Art Function Definition

To applicant's knowledge, there is no known usage of automatic function definitions in conjunction with the genetic programming paradigm.

Prior Art Data Encoding

To Applicant's knowledge, genetic algorithms have not been applied to data or image compression. Numerous methods of presentation of video image information on display devices are well-known in the art. One such method involves displaying image data on a red-green-blue (RGB) display monitor. In an RGB color system, a display may be controlled by presenting pieces of color information to drive circuitry which in turn produces three electrical signals which control the red, green and blue colors on the display.

Image data for an image display device, such as a video display or a printer, is typically organized into multiple lines (e.g., scanlines), with each line holding image data for a fixed number of "pixels" (picture elements). The image data stored for each pixel can vary from a single bit for black and white images, 8 bits for representing 256 colors, or even more bits to represent even more colors. The image data stored for a pixel is often also stored as an ordered set (vector) of three such numerical values for color, each denoting the level of the various color attributes (e.g., red, green, blue, etc.). Where the number of bits representing a pixel is large, the amount of memory required to store all of the pixels corresponding to an image, and thus to store the image, is large. In order to reduce the amount of memory required to store or bandwidth required to transmit an image, image (data) compression is typically employed.

Various methods of data compression are known in the field. Known methods often rely on the removal of redundant information or the encoding of information in a more compact representation, including the method of fractal data compression. Fractal data compression and its application to image compression are discussed in The Use of Fractal Theory in a Video Compression System by Maaruf Ali, et al.. Another method of image compression is discussed by Karl Sims in Artificial Evolution of Computer Graphics.

Sims (1991) creates complex visual structures, textures, and motions on a video monitor using a three step process of random generation, personal selection, and mutation of LISP S-expressions. First, Sims randomly generates hundreds or thousands of randomly generated LISP S-expressions on a video monitor. Secondly, Sims selects those he found to have interesting visual structures, textures and motions. When an S-expression has an explicit time variable, the S-expression creates a video image which varies with time and thus presents motion. The images are presented on the video monitor of a computer workstation and Sims communicates his selections to the computer by means of the interactive features of the computer. Third, Sims modifies interesting S-expressions by random mutation of the S-expression, and, in some cases, by directed mutation. In directed mutation, Sims applies his own experience as to what specific mutations are likely to produce particular interesting or desired changes in the video image. By using his three step process of random generation, selection, and mutation (random or directed), Sims discovers an impressive variety of different visual images. Programming comparable images from scratch would have been extremely difficult or impossible.

What is needed is a way to apply some of the general principles of biological natural selection that are embodied in the conventional genetic algorithm (i.e. survival of the fittest and crossing over of parents' traits to offspring) to a greatly expanded class of problems. In particular, what is needed is a method for adaptively creating computer programs involving complicated combinations of mathematical functions and their arguments, recursions, symbolic processing, and complicated data structures with no advance limitations on the size, shape, or complexity of the programs to compress (encode) data to minimize computer storage, transmission costs or some other significant metric. One object of the present invention is to provide a genetic process to provide solutions for an expanded class of problems. A further object of the present invention is to provide a genetic process without any predetermined limits on the size, shape, or complexity of the members of the subject population.

REFERENCES CITED

U.S. Patents

U.S. Pat. No. 4,821,333, "Machine learning procedures for generating image domain feature detector structuring elements", issued Apr. 11, 1989, filed Aug. 22, 1986, Gillies.

U.S. Pat. No. 4,935,877, "Non-Linear Genetic Algorithms for Solving Problems", issued Jun. 19, 1990, filed May 20, 1988, Koza.

U.S. Pat. No. 4,697,242, "Adaptive Computing System Capable of Learning and Discovery", issued Sep. 29, 1987, filed Jun. 11, 1984, Holland et al.

U.S. Pat. No. 4,881,178, "Method of Controlling a Classifier System", issued Nov. 14, 1989, filed May 7, 1987, Holland et al.

Other Publications

Ali, Papadopoulus, Costas, et al., "The Use of Fractal Theory in a Video Compression System," in Proceedings at the 1992 Data Compression Conference, IEEE Computer Society Press, 1992.

Axelrod, Robert (Editor), "The Evolution of Strategies in the Iterated Prisoner's Dilemma" In Genetic Algorithms and Stimulated Annealing, p. 32, Pittman, London 1987.

Binmore, Kenneth G. and Larry Samuelson, "Evolutionary Stable Strategies in Repeated Games Played by Finite Automata" (Draft), Sixth World Congress of the Econometric Society, Barcelona, Spain, August 1990.

Davis, Lawrence (Editor) - Genetic Algorithms and Simulated Annealing, Pitman, London 1987.

De Jong, Kenneth A. Genetic algorithms: A 10 year perspective. Proceedings of an International Conference on Genetic Algorithms and Their Applications, Hillsdale, N.J.: Lawrence Erlbaum Associates 1985.

De Jong, Kenneth A. "On Using Genetic Algorithms to Search Program Spaces", Genetic Algorithms and Their Applications: Proceedings of the Second International Conference on Genetic Algorithms, Hillsdale, N.J.: Lawrence Erlbaum Associates 1987.

De Jong, Kenneth A. Learning with genetic algorithms: an overview. Machine Learning, 3(2), 121-138, 1988.

Fogel, L. J., Owens, A. J. and Walsh, M. J.-Artificial Intelligence through Simulated Evolution, New York: John Wiley 1966.

Fujiki, Cory-An Evaluation of Holland's Genetic Operators Applied to a Program Generator, Master of Science Thesis, Department of Computer Science, University of Idaho, 1986.

Goldberg, David E. - Computer-Aided Gas Pipeline Operation Using Genetic Algorithms and Rule Learning, (Doctoral Dissertation, University of Michigan, 1983) Dissertation Abstracts International 44(10), 3174B (University Microfilms No. 8402282).

Goldberg, David E., Genetic Algorithms in Search, Optimization, and Machine Learning, Reading, Mass.: Addision-Wesley 1989.

Green, Cordell C. et al., Progress Report on Program-Understanding Systems, Standford Artificial Intelligence Laboratory memo AIM-240, Standford University Computer Science Department, August 1974.

Grefenstette, John J. (Editor)-Proceedings of an International Conference on Genetic Algorithms and Their Applications, Pittsburgh, Pa. 1985.

Grefenstette, John J. (Editor)-Genetic Algorithms and Their Applications; Proceedings of the Second International Conference on Genetic Algorithms, Lawrence Erlbaum Associates, Hillsdale, N.J. 1987.

Hicklin, Joseph F.-Application of the Genetic Algorithm to Automatic Program Generation, Master of Science Thesis Department of Computer Science, University of Idaho, 1986.

Hillis, W. Daniel, "Co-Evolving Parasites Improve Simulated Evolution as an Optimizing Procedure", Emergent Computation: Self-organizing, Collective, and Cooperative Computing Networks, edited by S. Forrest, Cambridge, Mass., MIT Press 1990.

Holland, John H.-Adaptation in Natural And Artificial Systems, The University of Michigan Press, Ann Arbor, 1975.

Holland, John H. Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In Michalski, Ryszard S., Carbonell, Jaime G. and Mitchell, Tom M. Machine Learning: An Artificial Intelligence Approach, Volume II, P. 593-623. Los Altos. Calif.: Morgan Kaufman 1986.

Holland, J. H. "ECHO: Explorations of Evolution in a Minature World". In Proceedings of the Second Conference on Artificial Life, edited by C. G. Langton, and J. D. Farmer, J. Doyne, Redwood City, Calif.: Addison-Wesley. 1990. In Press.

Holland, J. H. & Reitman, J. S. (1978), Cognitive systems based on adaptive algorithms, in D. A. Waterman & F. Hayes-Roth (Eds.), Pattern Directed Inference Systems (pp. 313-329), New York: John Wiley 1965.

Jefferson, David, Collins, Rob, et al. "The Genesys System: Evolution as a Theme in Artificial Life". In Proceedings of the 11th International Joint Conference on Artificial Life, edited by C. G. Langton and D. Farmer. Redwood City, Calif.: Addison-Wesley. 1990. In Press.

Koza, John R., Genetic Programming: A Paradigm for Genetically Breeding Populations of Computer Programs to Solve Problems, Stanford University, Dept. of Computer Science, Report No. STAN-CS-90-1314, June 1990.

Koza, John R., Hierarchical genetic algorithms operating on populations of computer programs, Proceedings of the 11th International Joint Conference on Artificial Intelligence (IJCAI), San Mateo, Calif.: Morgan Kaufman 1989.

Lenat, Douglas B. AM: An Artificial Intelligence Approach to Discovery in Mathematics as Heuristic Search, PhD Dissertation, Computer Science Department, Stanford University, 1976.

Lenat, Douglas B., The role of heuristics in learning by discovery: Three case studies, In Michalski, Ryszard S., Carbonell, Jaime G. and Mitchell, Tom M., Machine Learning: An Artificial Intelligence Approach, Volume I, P. 243-306, Los Altos, Calif.: Morgan Kaufman 1983.

Miller, J. H. "The Evolution of Automata in the Repeated Prisoner's Dilemma." In Two Essays on the Economics of Imperfect Information, PhD Dissertation, Department of Economics, University of Michigan, 1988.

Miller, John H., "The Coevolution of Automata in the Repeated Prisoner's Dilemma." Santa Fe Institute and Carnegie-Mellon University, Document No. 89-003, Oct. 15, 1987.

Schaffer, J. D. (editor), Proceedings of the 3rd International Conference of Genetic Algorithms, San Mateo, Calif.: Morgan Kaufman Publishers Inc. 1989.

Sims, Karl, Artificial Evolution of Computer Graphics, Computer Graphics, 25(4): 319-328, 1991.

Sims, Karl, Interactive Evolution of Dynamical Systems, In Varela, Francisco J., and Bourgine, Paul (editors), Toward a Practice of Autonomous Systems: Proceedings of the First European Conference on Artificial Life, MIT Press, 1992.

Smith, Steven F., A Learning System Based on Genetic Adaptive Algorithms, PhD dissertation, Pittsburgh: University of Pittsburgh, 1980.

Smith, Steven F., Flexible learning of problem solving heuristics through adaptive search, Proceeding of the 8th International Conference on Artificial Intelligence, Karlsruhe, Germany: Morgan Kaufman 1983.

Tanese, Reiko, Distributed Genetic Algorithm For Function Optimization, PhD. dissertation, Department of Electrical Engineering and Computer Science, University of Michigan, 1989.

Wilson, Stewart, W. Hierarchical credit allocation in a classifier system. Proceedings of the Tenth International Joint Conference on Artificial Intelligence, 217-220, 1987.

SUMMARY OF THE INVENTION

An apparatus and method for solving problems using automatic function definition and for performing data encoding on a set of data values is described. The present invention includes an apparatus and process for creating an initial population and then evolving that population to generate a result.

In one embodiment, the apparatus and process for solving problems using automatic function definition initially creates a population of entities. Each of the entities has sub-entities. At least one of the sub-entities in each entity is invoked externally by the controlling process. At least one of the entities in the population has a sub-entity which is invoked internally. The externally invoked sub-entities are capable of having actions and invocations of sub-entities which are invoked internally, and have access to material provided to the externally invocable sub-entities. Furthermore, in one embodiment, each sub-entity which is invoked internally is also capable of including actions and invocations of internally invocable sub-entities, and each has access to material provided to the externally invocable sub-entity, and material provided to itself. The population is then evolved to generate a solution to the problem.

One use of this embodiment is to automatically discover abstract functional subunits of the problem being solved, i.e. function definitions, so that the problem can be solved more efficiently, by means of the repeated invocation of these subunits (automatically defined functions) often using different arguments so as to produce a functionally similar effect in each invocation that is specialized by the arguments provided by each respective invocation.

In another embodiment, the apparatus and process initially creates a population of entities which are evolved to automatically encode a set of data values into a procedure capable of approximating those data values. One use of this embodiment is to encode data, such as video sonar, audio or radar images, into a function whose representation is cheaper to store and transmit than is the data that it has been evolved to represent.

In yet another embodiment, a population of entities is evolved such that the entities in the population are capable of making recursive references to themselves. One use of this embodiment is to facilitate the solution to problems that cannot be solved by iterative (non-recursive) means.

The population in each embodiment is evolved by iterating a series of steps. Each iteration includes externally invoking each entity to produce a result. A value is then assigned to each result and the value is associated with the corresponding entity which produced the result. With respect to the data encoding, the value is indicative of the closeness of the encoded version to the original set of data values. Next, at least one entity having a relatively high associated value is selected. Then an operation, such as crossover and reproduction, is chosen and performed on the entity to created a new entity. The new entity is then added to the population, such that the population evolves and generates a solution to the problem or an encoding of the data.

Many seemingly different problems can be reformulated into a problem requiring discovery of an entity, e.g., a mathematical expression or computer program that produces some desired output for particular inputs. When viewed in this way, the process of solving these seemingly different problems in the computer embodiment becomes equivalent to searching a space of possible mathematical expressions or computer programs for a most fit individual mathematical expression or computer program.

Computer programs have the ability to perform alternative computations conditioned on the outcome of intermediate calculations, to perform computations on variables of many different types, to perform iterations and recursions to achieve the desired result, and to define and subsequently use computed values and sub-programs. This flexibility found in computer programs facilitates the solution to these various different problems.

The process of solving these problems can be reformulated as a search for a most fit individual computer program in the space of possible computer programs. In particular, the search space can conveniently be regarded as the space of LISP "symbolic expressions" (S-expressions) composed of various terminals along with standard arithmetic operations, standard programming operations, standard mathematical functions, and various functions peculiar to the given problem domain. For example, the standard arithmetic functions of addition, subtraction, multiplication, etc., are relevant when we are attempting to construct a mathematical expression that might be the solution to an equation for generating an image. In general, the objects that are manipulated in our attempts to build computer programs are of two broad classes. These objects include functions of various numbers of arguments, such as addition mentioned above, or control structures such as If-Then-Else, Do-Until, etc.; terminals, such as dummy variables, the independent variable(s) in an equation (i.e., the actual variables of the problem) or constants, such as 0, 1, etc.

The LISP S-expression required to solve each of the problems described above tends to emerge from a simulated evolutionary progression using the non-linear genetic process which we call "Genetic Programming". This process starts with an initial population of LISP S-expressions (typically randomly generated), each composed of functions and terminals appropriate to the problem domain. When using the automatic function definition mechanism, the S-expressions are generated by superimposing a structure on top of the functions and terminals for the problem domain. The structure sets forth a set of components to automatically define a function or functions, including dummy variables (i.e., formal parameters) for use within the scope of each component. An S-expression is then randomly generated from the functions and terminals, including dummy variables when a function definition is being created.

The fitness of each individual in the population drives the process. In data encoding problems, fitness can be measured by the sum of the distances (taken for all the environmental cases) between the point in the solution space (whether real-valued, complex-valued, vector-valued, multiple-valued, Boolean-valued, integer-valued, or symbolic-valued) created by the S-expression for a given set of actual variables of the problem and the correct point in the solution space. In other problems, other fitness measures can be used.

In problems where fitness is the sum of errors (i.e., distances, differences), the closer this sum is to zero, the better the S-expression. If this sum is close to zero, there is a good fit. If this sum attains the closest possible value to zero, there is a best fit. If this sum actually attains the value of zero, there is a perfect fit. The notions of good, best, and perfect fit are well known in the art. Once the desired level of fitness is attained, the iteration of the evolutionary process can be terminated.

The initial individual S-expressions in the population typically will have exceedingly poor fitness. Nonetheless, some individuals in the population will be somewhat more fit than others.

Then, a process based on the Darwinian principle of reproduction and survival of the fittest (fitness proportionate reproduction) and the genetic operation of crossover (recombination) is used to create a new population of individuals. In particular, a genetic process of sexual reproduction (crossover) among two parental S-expressions will be used to create offspring S-expressions. At least one of the two participating parental S-expressions will be selected in proportion to fitness. The resulting offspring S-expressions will be composed of sub-expressions from their parents.

In addition, other operations such as mutation and permutation, define building blocks (encapsulation) and editing may be used.

Finally, the new population of offspring (i.e. the new generation) will replace the old population of parents and the process will continue.

At each stage of this highly parallel, locally controlled and decentralized process, the state of the process consists only of the current population of individuals. Moreover, the only input to the algorithmic process is the observed fitness of the individuals in the current population in grappling with the problem environment. This process produces a population which, over a period of generations, tends to exhibit increasing average fitness in dealing with its environment, and which, in addition, can robustly (i.e. rapidly and effectively) adapt to changes in their environment. The solution produced by this process at any given time can be viewed as the entire population of distinctive alternatives (typically with improved overall average fitness), or more commonly, as the single best individual in the population found during execution of the run.

The hierarchical character of the computer programs is an essential aspect of the process. The results of this process are inherently hierarchical and in many cases the results contain default hierarchies which often solve the problem in a relatively parsimonious way.

The dynamic variability of the size and shape of the computer programs that are developed along the way to a solution are also an essential aspect of the process. In each case, it would be difficult and unnatural to try to specify or restrict the size and shape of the eventual solution in advance. Moreover, the advance specification or restriction of the size and shape of the solution to a problem narrows the window by which the system views the world and might well preclude finding the solution to the problem.

DESCRIPTION OF THE DRAWINGS

The present invention will be understood more fully from the detailed description given below and from the accompanying drawings of the preferred embodiment of the invention, which, however, should not be taken to limit the invention to the specific embodiment but are for explanation and understanding only.

FIG. 1 is a tree diagram representation of a LISP S-expression (program).

FIG. 2 is a tree diagram representation of a LISP S-expression (program).

FIG. 3 are flow chart diagrams of the present invention.

FIG. 4 is a tree diagram representation of a crossover operation occurring at internal points.

FIG. 5 is a tree diagram representation of a crossover operation occurring at external points.

FIG. 6 is a tree diagram representation of a crossover operation occurring at an internal and external point.

FIG. 7 is a tree diagram representation of a permutation operation.

FIG. 8 is a block diagram of the parallel processing embodiment of the present invention.

FIG. 9 is a chart diagram of the linear equation problem. is a tree diagram representation of an S-expression which is a member of the initial population for solving the linear equation problem using the present invention.

FIG. 11 is a tree diagram representation of a crossover operation as applied to two individuals during a solution of the Fibonacci series problem.

FIG. 12 illustrates a simple entity, namely the symbolic expression in the LISP programming language for the mathematical expression A+B*C.

FIG. 13 illustrates the simple entity in FIG. 12 after application of the "Define Building Block" (encapsulation) operation.

FIG. 14 illustrates the portion of the simple entity in FIG. 12 being represented by the function defined by the "Define Building Block" (encapsulation) operation.

FIG. 15 illustrates an example of an entity under the present invention.

FIG. 16 illustrates one embodiment of the structure of an example entity in the population.

FIG. 17 illustrates an example of the crossover operation in conjunction with two example entities having automatically defined functions.

FIG. 18 illustrates the preferred embodiment for performing the crossover operation in conjunction with two example entities having automatically defined functions.

FIGS. 19, 20 and 21 are graphs of points for an unknown curve used to illustrate symbolic function identification and regression.

FIG. 22 illustrates a black and white diagram of a color image.

FIG. 23 illustrates a black and white diagram representing a color image produced by the present invention.

FIG. 24 illustrates an example of the embodiment of the structure of an entity for the even-4-parity function.

FIG. 25 illustrates the S-expression for the even-4-parity function generated according to the present invention.

FIG. 26 illustrates the result producing component of the best-of-run individual from generation 12 for the even-4-parity problem using the present invention.

FIG. 27 depicts the second defined function evolved during a particular run for the even-4-parity example using the present invention.

FIG. 28 depicts the result-producing component evolved during a particular run for the even-4-parity example using the present invention.

FIG. 29 illustrates an example of the hierarchical functional dependencies generated by the present invention.

FIG. 30 illustrates an example of a solution generated on a run of the even-5-parity problem using the present invention.

FIG. 31 illustrates an example of a solution generated on a run of the odd-5-parity problem using the present invention.

FIG. 32 illustrates a typical computer configuration.

FIG. 33 illustrates a recursive formulation of the multiplication function RMULT.

FIG. 34 illustrates a general structure (template) used for the definition of recursive entities.

NOTATION AND NOMENCLATURE

This following description relates specifically to the automatic definition of functions, a topic not covered in its parent, co-pending U.S. patent application (continuation-in-part) Ser. No. 07/500,791, filed Mar. 28, 1990, titled Non-linear Genetic Algorithms for Solving Problems by Finding a Fit Composition of Functions. The choice of terminology used in the parent patent was made in a manner intended to increase the intelligibility of the description of the invention. However, the same terminology might be confusing in the following description in part because the following description requires a distinction between the definition of a function with arguments and the use of (call to) a function with a distinct set of instantiated arguments. Thus, the following terms are defined as:

ACTION-Actions are the primitive operations in an entity. Actions are often parameterized. If the entities are in the preferred computer embodiment, then an action may be a function, such as "+" or "IF", or a terminal such as "MOVE-LEFT". If the embodiment were to be in the domain of robots, then an action might be an operation by such a robot that applies to something else. For example, an action might be "PICK-UP", where the thing being picked up is specified as an argument to the action. An example of a non-parameterized action might be "MOVE-LEFT". Actions may be invoked using the results of other actions as arguments.

MATERIAL-Material is that which is provided to actions for their execution. This material might consist of externally provided material (the values of the actual variables of the problem), internally provided material (the values of dummy variables), or constants. In the computer embodiment, material is typically comprised of values, such as "TRUE", "FALSE" or 3.5. In an embodiment using robots in manufacturing, the material might be the contents of parts bins or the ingredients for the process. Thus, material encompasses both information and physical objects used in the problem solving process.

FUNCTION-A function is the computer embodiment of an action. We do not use the term in the strict mathematical sense of the word. For example, a function in a particular problem domain, which we will call "MOVE-BLOCK" might be viewed as transforming one state space into another. "MOVE-BLOCK" might cause a block in a simulated world to be moved from one side of a table to the other. This can be viewed as transforming one state space in which the block is on one side of the table into a new state space in which the block is on the other side of the table. Programmers often view this as a process of side-effecting (i.e. changing the values of) state variables. Thus, by "function" we mean any construct that takes zero or more arguments, returns zero or more values, and transforms the arguments and/or side-effects some global or lexical state. Other examples of "function" using this definition could therefore be "+", which takes numbers as its arguments and returns the sum of the arguments, "PRINT" which takes an argument and prints it (to the global environment), "PROGN" which takes program segments as arguments and returns the result of evaluating its last argument after evaluating all of its other arguments in sequence, and so-called non-strict operators such as "IF" which takes program segments as arguments and returns the result of evaluating one of its arguments dependent upon the result of evaluating its "condition" argument. "MOVE-BLOCK" might be, therefore, a function that takes no arguments, returns no values and whose purpose consists of side-effecting the state space of the problem. One could also view "MOVE-BLOCK" as a function that takes an old state space as its argument and returns as its value a new and transformed state space. This definition of "function" therefore subsumes, among others, the programming terms function, operator, procedure, macro, NLAMBDA and Special Form.

PRIMITIVE FUNCTION-A primitive function is a function provided by the user of the present invention for use during the problem-solving process. Primitive functions usually derive from the problem domain or are provided because they are thought to be useful in the problem-solving process. Primitive functions are common to the entire population, though there may be specific limits on which types or categories of functions may be used in certain contexts.

AUTOMATICALLY DEFINED FUNCTION (ADF, DEFINED FUNCTION)-An automatically defined function is a function whose behavior is evolved during the problem-solving process. ADFs are defined within particular entities in the population and the ADFs defined within a particular entity are used only within that particular entity. Each entity will have at most one definition for an ADF of a particular name, though there may be many different definitions for an ADF of that same name within the population.

TERMINAL-A terminal is the termination point of an entity (i.e., program in the preferred computer embodiment) when that program is represented as a tree. A terminal could be, but is not limited to, a constant (such as the number 1.5 or a structured constant value, such as a constant matrix), a variable (such as x, which might be a dummy variable (formal parameter) or an actual variable of the problem), a function of no arguments that performs side-effects on the environment of activation of the entity, a function of no arguments that causes data to be read from a table, database or from some sensor machinery.

DUMMY VARIABLE (FORMAL PARAMETER)-A variable, which is local to the definition of a function. For example, if we define the sine function in the computer embodiment in terms of the Taylor series expansion, a function might be defined in the form:

In this case, the variable X is a dummy variable. When the function so defined is invoked with a specific argument, such as in the call:

the dummy variable X takes on the value of the argument provided to the function (i.e., 0.34 in this case). Thus, this dummy variable is said to have been instantiated with the value 0.34.

A dummy variable is therefore a symbol which takes on the meaning of the specific argument to a function during the execution of that function. A dummy variable has no meaning outside the definition of the function.

ACTUAL VARIABLE OF THE PROBLEM-The actual variables of the problem are variables whose values are defined externally to the problem being solved. These are frequently the independent variables of the problem. For example, if one were evolving a program in the computer embodiment to control a robot, the actual variables of the problem might be the physical coordinates of the robot, i.e. X-POSITION and Y-POSITION. Actual variables of the problem are sometimes called Global Variables.

ARGUMENT-An argument is a specific value which is passed to a function. The value of an argument becomes the specific instantiated value of the associated dummy variable of the function being called (invoked). For example, in the expression:

in the computer embodiment, the sine function is called with one argument, namely, the result of the evaluation of the expression 3*4, i.e., 12. In this case, the one dummy variable of the sine function will be invoked with the value 12. Because the sine function requires exactly one such argument in order to instantiate its dummy variable, the sine function is said to take one argument, or is a one-argument function. A function taking n arguments is said to have arity n.

ARGUMENT LIST-The argument list of a function is the ordered set of dummy variables used by that function. In the example above, the sine function has the argument list (X), where "(X)" denotes a list containing the single element "X" and "(X Y)" would be the argument list for a two-argument function, the arguments being X and Y.

DETAILED DESCRIPTION OF THE INVENTION

The present invention describes a non-linear genetic process for problem solving and for encoding data. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without using these specific details. In other instances, well-known methods and structures have not been described in detail so as not to unnecessarily obscure the present invention.

The present invention operates on a population of entities. The entities must possess an ability to produce an objectively observable result. To provide utility, the entities must direct their actions toward a constructive end, even if their results do not always serve those ends. The iterative process of the present invention produces a population which tends to accomplish its constructive ends better over time.

The present invention uses a value assigned to the objectively observable result produced by an entity to guide an evolutionary process for solving a problem. Entities are selected from the population in accordance with their value (which we often refer to as "fitness"). They are then retained in the population ("reproduced"), retained with slight random modification ("mutated"), or selected, in groups (typically pairs) to participate in an operation which produces a new offspring entity by combining a portion of each entity in the group (pair). This latter operation is referred to as the recombination or crossover operation.

The iterative steps of activating an entity, observing its results, assigning a value to its observed results, selecting, and then reproducing, mutating, or recombining can take place in a variety of different media. Because of their extreme behavioral plasticity, computers offer an extremely flexible medium for performing the foregoing steps; however, the same steps can be performed in other media.

Although the preferred embodiment uses computer programs as entities, using other types of entities remain within the scope and spirit of the present invention. For example, electrical circuits could provide a population for the iterative process of the present invention. The circuits could reproduce and recombine sub-circuits from two parental circuits until a circuit performing the desired behavior (function) is attained. Additionally, different automobile designs could comprise another population, with elements of the design or sub-processes of manufacture taken as different alleles for crossover and rearrangement. Additionally, a population of mechanical nanomachines could comprise another population. Nanomachines are extremely small (and potentially extremely inexpensive) machines built up from individual atoms and molecules or groups of atoms and molecules that are very much smaller than ordinary mechanical devices. Nanomachines, just like more commonplace larger machines, perform specified actions on available material. For example, a nanomachine could perform a physical action on a quantity of physical material, just as an ordinary large machine performs an action on material. The step of reproducing could then be performed on a nanomachine. Similarly, the step of mutating would be performed on a nanomachine. And, similarly, the step of recombining portions to a group of two or more nanomachines could be performed.

The precise result of the action of a machine depends on both its specified action and on the precise material provided to that machine. For example, if a machine performs the action of crushing, the result of the action of that machine when supplied with glass material is crushed glass, whereas the result of the action of that machine when supplied with plastic material is crushed plastic. If the machine performs the action of heating, shaking, squeezing, or pressing, the machine performs those actions on the particular material supplied to it.

The material supplied to a physical machine (whether of ordinary size or very small size) corresponds to the arguments supplied to a computational structure including the actual variables of the problem. The action of the machine is the same (i.e., heating) regardless of the nature of the particular material on which it is acting (e.g., glass, plastic).

The actions performed by a machine on its material need not be a physical action. For example, if the entity is an electrical machine whose action consists of the action of a primitive electrical component (e.g., a resistor, amplifier, inductor, capacitor, etc.), that machine performs its action on whatever signal (i.e., material) is supplied to it. An amplifier that increases the amplitude of an electrical signal by a factor of three can perform its amplifying action on a weak sine wave signal or a strong sawtooth wave signal.

Hierarchical compositions of machines, each performing a specified action on the particular material supplied to it, can perform certain overall tasks.

Thus, although the following description uses computer programs as entities, the description does not limit the present invention. Further, the use of sequential iteration is only a preferred embodiment. Methods for the use of parallel processing is also presented.

The Representation of the Population

The computer languages FORTRAN, COBOL, ALGOL, PL/1, FORTH, PASCAL, C, PROLOG, ADA, BASIC, etc. provide, in general, the ability to write complicated mathematical expressions, recursions, complex data structures, and symbolic expressions. Using any of these languages, one can write symbolic expressions that are executable as computational procedures (or programs) within the language itself. Also, any of these languages can generate symbolic expressions, although often this process is inconvenient and inefficient. In general, most computer languages do not allow arbitrarily complex expressions to be written. Also, most do not delay assigning actual computer memory (and types) in the computer for such expressions until just prior to actual execution of the expression. Such a memory management method is termed dynamic storage allocation.

One existing computer language, however, has all the features discussed above and is generally available in relatively efficient forms on a variety of computers. This language is LISP, and is the computer language of choice for many artificial intelligence applications. Many dialects of the LISP language have been created over the years. A dialect of LISP called "Common LISP" has started to emerge as a standard.

The LISP programming language's basic structure is a list of items (an ordered set of items contained within a pair of parentheses). An important source of LISP's simplicity, generality, and power arises from treating the first element in every list encountered as a function to be executed, termed "evaluated", and treating the remaining elements of the list as arguments to that function. Moreover, unless otherwise indicated, LISP reads, evaluates, and returns a value for each such function it encounters. Thus, in LISP, entire computer programs can appear as merely function invocations within function invocations within yet more function invocations (often called "compositions" of functions and terminals or more simply a "composition" of functions). Applying functions to arguments as encountered controls the flow of LISP programs. In other words, the control structures in LISP are based on the composition of functions.

Within the outermost pair of parentheses in LISP, there may be numerous functions and operators, including functions for performing arithmetic, functions for performing recursions, functions for modifying symbolic expressions, functions for conditionally varying the program flow, and other complex functions. A key feature of LISP is that LISP programs have the same form as the data they manipulate. As the above features indicate, LISP is a functional programming language. LISP is not the only existing functional programming language nor is it the only possible functional programming language. It is, however, the most widely used language in this category and well-suited for the requirements at hand.

In spite of the complex results obtained, LISP can be viewed as being very simple because it simply reads, evaluates, and returns a value for each such function it encounters. This seeming simplicity gives LISP enormous flexibility (including the flexibility to accommodate computational procedures which modify themselves and execute themselves). This enormous flexibility makes LISP the preferred computer language for the present invention.

For example, consider the simple mathematical expression ordinarily written as 5*4-3*2. To evaluate this expression, one must start by first evaluating 5*4. One evaluates 5*4 by performing the function of multiplication (*) on the two arguments (5 and 4). The basic structure in LISP is a list of items (that is, an ordered set of items contained within a set of parentheses). Moreover, unless otherwise indicated, LISP treats the first item in every list encountered as a function and the remaining items in the list as arguments to that function. Thus, LISP represents 5*4 as (*54). Here a function (i.e. the multiplication function denoted by *) is the first item of the list and the two arguments to the function (i.e. the two numbers to be multiplied) follow. Similarly, LISP denotes 3*2 as (*32). Once these two multiplications are executed (evaluated), the subtraction function then has the two arguments (i.e. 20 and 6). The two values obtained by evaluating these two multiplication functions are treated as arguments to the subtraction function which performs the operation of subtraction, which is (-(*54) (*32)). Expressions such as (-(*54) (*32)) in LISP are called symbolic expressions (S-expressions). Here the function of subtraction (-) is performed on the results previously obtained for (*54) and (*32). When a simple number or variable is used as the argument of a function (such as the 3 or 2 in the multiplication 3*2), it is called an "atomic" argument. The contrasting situation occurs with a composition of functions when the argument to one function is itself the result of carrying out an earlier (embedded) computation. We can represent increasingly complex mathematical expressions by embedding previous results within new expressions in this manner.

It is helpful to graphically depict a functional programming language's expressions. Functional expressions can be viewed graphically as a tree with labels on the various points of the tree. In particular, any such expression can be viewed as a rooted point-labeled tree in which the internal points of the tree are labeled with functions and the endpoints of the lines radiating downwards from each such internal point is labeled with the arguments to that function. The term "downwards" in connection with rooted-point labeled trees means extending farther away from the root of the tree. The external points of the tree (sometimes called "leafs") are labeled with the terminals. The root of the tree is the particular internal point labeled with the function executed last. In a LISP S-expression, the first function is the outer-most LISP function (i.e. the function just inside the outermost left parenthesis of the LISP S-expression).

FIG. 1 illustrates this for LISP using the equation 5*4-3*2. In the ordinary notation of arithmetic shown as equation 100, the function 104 (multiplication) operates on the arguments 102 and 106 (i.e. 5 and 4 respectively) and the function 112
(multiplication) operates on the arguments 110 and 114 (i.e. 3 and 2 respectively). The function 108 (subtraction) then operates on the results of these two functions as its arguments. The function 108 is higher in the hierarchy than the functions 104
and 112.

In FIG. 1, the LISP S-expression 120, (-(*54) (*32)) is expressed as the function 124 (multiplication) operating on the arguments 126 (the number 5) and 128 (the number 4) and the function 130 (multiplication) operating on the arguments 132 (the number 3) and 134 (the number 2). The function 122 (subtraction) then operates on the results of these two evaluations.

When presented graphically in FIG. 1, the internal point 150 of the tree 130 with root 140 is labeled with the function of multiplication (*) and the external points 156 and 158 of the tree are labeled with the two arguments to the multiplication function (i.e. 5 and 4 respectively). The arguments to a given function (such as the multiplication function denoted by the internal point 150) are found by following the lines 152 and 154 radiating downwards from the internal point 150. Similarly, the internal point 160 of the tree is labeled with the function of multiplication and the external points of the tree 166 and 168 are labeled with the two arguments to the multiplication function (i.e., 3 and 2, respectively). The arguments to the function
160 are found by following the lines 162 and 164 radiating downwards from the internal point 160. The internal point of the tree 140 is labelled with the subtraction function. The arguments to the subtraction function are found by following the lines
142 and 144 radiating downwards from point 140. These arguments turn out to be the results of the previously performed multiplication operations. Arguments may be found at external points (if they are terminals) or at internal points (i.e. when the arguments to one function, such as subtraction here at 140, are the result of previous functions). The internal point 140 is the root of the tree and is labeled with the outermost function (subtraction). Internal point 140 is equivalent to point 122 in the LISP S-expression 120 (i.e., the function just inside the outermost left parenthesis of the LISP S-expression).

The advantage of a computer language such as LISP for performing work of this kind derives from the enormous flexibility arising from repeated applications of this very simp