United States Patent5560003
Nilsen , ; et al.September 24, 1996

Title

System and hardware module for incremental real time garbage collection and memory management

Abstract

The garbage-collecting memory module (GCMM) functions much like traditional memory in a computer system, thereby permitting the invention to be utilized with a wide variety of computers. It differs from traditional memory in that it automatically cleanses itself of garbage while functioning as traditional memory without causing excessive delays in the execution of application programs by an associated computer. The GCMM can be designed to interface with a computer system via a traditional memory bus and to communicate with the central processing unit (CPU) of the computer using standard communication protocols. The GCMM is comprised of a memory, a means for communicating with the CPU, and a garbage-collecting control unit. The garbage-collecting control unit gives top priority to satisfying the computer's requests for memory services. The collection of garbage takes place during the intervals between memory service requests. Garbage collection is accomplished by copying live objects that are stored in one region of memory to a second region thereby leaving dead objects behind in the first region. When the copying process has been completed, the dead objects are disposed of, and the garbage-collecting process continues with the copying of live objects in the second region back to the first. An up-to-date list of live objects is maintained by the CPU and forwarded to the GCMM at the start of each garbage-collection cycle.


Inventors:Nilsen; Kelvin D. (Ames, IA), Schmidt; William  (Rochester, MN)
Assignee:Iowa State University Research Foundation, Inc. (Ames, IA)
Appl. No.:994517
Filed:December 21, 1992

Current U.S. Class:707/206 
Current International Class:G06F 12/02 (20060101)
Field of Search:395/425,600

U.S. Patent Documents
4775932October 1988Oxley et al.
4797810January 1989McEntee et al.
4807120February 1989Courts
4967353October 1990Brenner et al.
5088036February 1992Ellis et al.
5136706August 1992Courts
5218698June 1993Mandl
5241673August 1993Schelvis
5293614March 1994Ferguson et al.
5321834June 1994Weiser et al.
Other References
Corporaal, Automatic Heapmanagement and Real Time Performance, Compeuro '91 Advanced Computer Technology Prov., 1991, pp. 290-295. .
Corporaal et al, The Design Space of Garbage Collection, Compeuro '91 Advanced Computer Technology Proc., 191, pp. 423-428 using incremental copying collectors, System Sciences, 1991 Annual Haugic International Conf. Proc., 191 pp. 344-352..~
Primary Examiner: Amsbury; Wayne
Attorney, Agent or Firm:Malm; Robert E.

Government Interests:


The U.S. Government has a paid-up license in this invention and the right
in limited circumstances to require the patent owner to license others on
reasonable terms as provided for by the terms of Grant MIP 9010412 awarded
by the National Science Foundation and Grant ITA 87-02 awarded by the
Department of Commerce.

Claims


What is claimed is:
1. A garbage-collecting memory module (GCMM) for use with a computer system having one or more digital processors, said digital processor(s) maintaining list(s) of source descriptors pointing to regions of memory in said GCMM containing live objects, said GCMM comprising:
a memory for the storage of objects;
a means for communicating with said digital processor(s);
a garbage-collecting control unit which (1) allocates space for and stores an object in said memory upon request by one of said digital processor(s), (2) causes an object to be retrieved from said memory and returned to one of said digital processor(s) upon request, and (3) collects garbage from said memory utilizing said source descriptors supplied by said digital processor(s).

2. The garbage-collecting memory module of claim 1 wherein:
said memory comprises a from-space region and a to-space region for the storage of objects, a designated word of each of said objects being a header that specifies the size of said object and whether said object contains descriptors;
said garbage collecting takes place in cycles, the names and functions of said from-space and to-space regions being interchanged at the start of a garbage-collection cycle, new objects being allocated space in to-space during each of said garbage-collecting cycles, said from-space containing both live and dead objects at the start of a garbage-collecting cycle, said to-space being initialized to contain no objects at the start of a garbage-collection cycle;
said garbage-collecting control unit collects garbage by obtaining said source descriptors from said computer system, entering the objects to which said source descriptors point and slice regions containing data belonging to said objects into a copying queue, said objects being called source-descriptor objects, allocating space in to-space for said objects and slice regions, writing a header and a from-space pointer for each of said objects and slice regions in predetermined memory cells of the to-space memory allocations of said objects and slice regions, said from-space pointers pointing to the locations of said objects and slice regions in from-space, replacing the header of each of said objects and slice regions in from-space with a forwarding pointer to the header in to-space, updating said source descriptors to point to to-space, returning the updated source descriptors to said computer system, and copying the objects and slice regions in said copying queue from from-space to to-space when said GCMM is not performing nor been requested to perform memory services for said computer system.

3. The garbage-collecting memory module of claim 2 wherein the objects stored in said memory include weak-pointer objects, a weak-pointer object being an object containing a weak pointer, objects referenced only by weak pointers being garbage, weak-pointer objects being distinguishable by said GCMM from other objects, said weak-pointer objects being entered into a weak-pointer object postprocessing (WPOP) queue after being entered into said copying queue, postprocessing of the objects in the WPOP queue being performed after all objects in said copying queue have been copied, postprocessing of a weak-pointer object consisting of either (1) updating the weak pointer to reflect the referenced object's new location in to-space if the object has been copied into to-space or (2) overwriting the weak pointer with 0 in the weak-pointer field if the object has not been copied into to-space.

4. The garbage-collecting memory module of claim 2 wherein said garbage collecting delays the storage of objects by at most 30 memory cycles, delays the retrieval of objects by at most 50 memory cycles, and delays the allocation of objects near the start of a garbage-collection cycle by at most 25 memory cycles times the number of said source descriptors maintained by said digital processor(s).

5. The garbage-collecting memory module of claim 2 wherein said garbage-collecting control unit identifies descriptors resident within objects in said copying queue, said descriptors being called resident descriptors, adds the objects to which said resident descriptors point to said copying queue unless said resident-descriptor objects have already been copied or placed in said copying queue, allocates space in to-space for said objects, writes a header and a from-space pointer for each of said objects in predetermined memory cells of the to-space memory allocations of said objects, replaces the header of each of said objects in from-space with a forwarding pointer to the header in to-space, and updates said resident descriptors to point to to-space.

6. The garbage-collecting memory module of claim 5 wherein said source-descriptor objects include only those weak-pointer objects from which the data fields are fetched by one of said digital processor(s) during garbage collection, a weak-pointer object being an object containing a weak pointer, objects referenced only by weak pointers being garbage, weak-pointer objects being distinguishable by said GCMM from other objects.

7. The garbage-collecting memory module of claim 6 wherein said digital processor(s) may enquire into the value of a weak pointer and the status of the object referenced by the weak pointer without causing said weak-pointer objects to be included with said source-descriptor objects.

8. The garbage-collecting memory module of claim 5 wherein a slice region comprises a header and a plurality of subregions, all subregions except the first and last having the same number of memory cells, the sum of the numbers of memory cells in the first and last subregions being equal to the number of memory cells in each of the other subregions, said garbage-collecting control unit placing each copied slice object in a scanning queue, entering initial values for a slice region control block into the former slice region in from-space for each slice region copied to to-space, and temporarily replacing the slice region's title in to-space with a pointer to said slice region control block in from-space, said slice region control block consisting of a pointer to the slice region in to-space, a number specifying the number of memory cells occupied by the slice region, a pointer to the slice region control block for the previous slice region copied, and a plurality of subregion control blocks, each subregion control block consisting of a first-cell pointer to the first memory cell referenced by slice objects pointing into said subregion and a length which, when added to said pointer, corresponds to the last memory cell occupied by any slice object pointing into said subregion, said garbage-collecting control unit scanning the slice objects in said scanning queue after said copying queue empties and postprocessing said slice region control blocks after said scanning queue empties, said scanning resulting in the entry of final values in said slice region control blocks for each slice region referenced by a slice object in said scanning queue and the restoration of the title of each slice region following the entry of final values in the associated slice region control block, said postprocessing resulting in the examination of each slice region control block for garbage-containing memory regions, and the separation of slice regions having garbage-containing memory regions into multiple slice regions provided said garbage-containing slice regions are large enough to accept slice region headers.

9. The garbage-collecting memory module of claim 8 wherein the number of memory cells in said first subregion is changed at the start of each garbage-collection cycle.

10. The garbage-collecting memory module of claim 8 wherein said garbage-collecting control unit includes an object space manager for from-space and an object space manager for to-space, the object space manager for the region of memory called to-space generating and storing an object locator code for each memory cell allocated to an object, said object locator code being retrievable by said garbage-collecting control unit and translatable into the address of the memory cell in which said object header is located.

11. The garbage-collecting memory module of claim 10 wherein each of said memory regions contains L.sub.1 memory cells, the individual memory cells being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, L.sub.1 having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to L.sub.1, an object being allocated memory cells from S.sub.1 to S.sub.2, each of said object space managers comprising:
an object space manager memory;
an encoder which generates an object locator code [M(S), S.sub.1 modulo N.sub.M(S) ] for each memory cell S in the range S.sub.1 to S.sub.2, S.sub.1 and S.sub.2 being inputs to said encoder, S.sub.1 being the memory cell occupied by the header of the object, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1 /N.sub.M(S)), INT() being the integer portion of the quantity in parentheses, said object locator code being outputted to said object space manager memory for storage;
an object locator which accepts a memory cell address S allocated to said object as input, retrieves the object locator code from said object space manager memory, and produces said S.sub.1 as output, S.sub.1 being equal to [INT(S/N.sub.M(S))]N.sub.M(S) +(S.sub.1 modulo N.sub.M(S)).

12. The garbage-collecting memory module of claim 8 wherein said garbage-collecting control unit includes a plurality of from-space and a plurality of to-space object space managers, said from-space object space managers being assigned contiguous regions of equal size of from-space, said to-space object space managers being assigned contiguous regions of equal size of to-space, an object space manager generating and storing an object locator code for each memory cell allocated to an object in its assigned region, said object locator code being retrievable by said garbage-collecting control unit and translatable into the address of the memory cell in which said object header is located.

13. The garbage-collecting memory module of claim 12 wherein said from-space and to-space memory regions each contain L.sub.1 memory cells, the individual memory cells being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, the number of object space managers for each of the two regions being equal to L.sub.1 /F, F being a submultiple of L.sub.1, the object space managers being represented by the integers from 0 to L.sub.1 /F-1, a from-space or to-space memory cell L being assigned to object space manager INT(L/F) where INT() is the integer portion of the quantity in parentheses, F having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to F, an object being allocated memory cells from S.sub.1 to S.sub.2, an object space manager comprising:
an object space manager memory;
an encoder which generates an object locator code [M(S), S.sub.1 modulo N.sub.M(S) ] for each from-space or to-space memory cell S assigned to said object space manager and in the range S.sub.1 to S.sub.2, S.sub.1 and S.sub.2 being inputs to said encoder, S.sub.1 being the memory cell occupied by the header of the object, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1 /N.sub.M(S)), said object locator code being [Z, S.sub.1 modulo F] if INT(S.sub.1 /F) is less than INT(S/F), Z being the difference between INT(S.sub.1 /F) and INT(S/F), said object locator code being outputted to said object space manager memory for storage;
an object locator which accepts a memory cell address S allocated to said object as input, S being within said object space manager's assigned region of memory, retrieves the object locator code from said object space manager memory, and produces said S.sub.1 as output, S.sub.1 being equal to [INT(S/N.sub.M(S))]N.sub.M(S) +(S.sub.1 modulo N.sub.M(S)) if the first term of said object locator code is a positive integer, S.sub.1 being equal to [INT(S/F)]F+ZF+(S.sub.1 modulo F) if the first term of said object locator code is a negative integer.

14. The garbage-collecting memory module of claim 8 wherein said scanning comprises the steps:
reading the slice region pointer of the first slice object in said scanning queue;
finding the header of the referenced slice region;
finding the control block associated with said slice region, said header being the pointer to said control block;
identifying the subregion containing the first address referenced by said slice object;
updating said first-cell pointer and said length field within the control block of said subregion;
restoring the title of said slice object and removing said slice object from said scanning queue;
updating each descriptor in said slice region referenced by said slice object to point to to-space.

15. The garbage-collecting memory module of claim 8 wherein said postprocessing comprises the steps:
searching said subregion control blocks from first to last for contiguous segments of live data;
overwriting the memory preceding each contiguous segment of live data found with an appropriate slice data header.

16. The garbage-collecting memory module of claim 2 wherein said garbage-collecting control unit includes a from-space and a to-space object space manager, each of said object space managers generating and storing an object locator code for each memory cell occupied by an object, said object locator code being retrievable by said garbage-collecting control unit and translatable into the address of the memory cell in which said object header is located.

17. The garbage-collecting memory module of claim 16 wherein each of said memory regions contains L.sub.1 memory cells, the individual memory cells being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, L.sub.1 having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to L.sub.1, an object being allocated memory cells from S.sub.1 to S.sub.2, each of said object space managers comprising:
an object space manager memory;
an encoder which generates an object locator code [M(S), S.sub.1 modulo N.sub.M(S) ] for each memory cell S in the range S.sub.1 to S.sub.2 allocated to said object, S.sub.1 and S.sub.2 being inputs to said encoder, S.sub.1 being the memory cell occupied by the header of the object, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1 /N.sub.M(S)), INT() being the integer portion of the quantity in parentheses, said object locator code being outputted to said object space manager memory for storage;
an object locator which accepts a memory cell number S allocated to said object as input, retrieves the object locator code [M(S), S.sub.1 modulo N.sub.M(S) ] from said object space manager memory, and produces said S.sub.1 as output, S.sub.1
being equal to [INT(S/N.sub.M(S))]N.sub.M(S) +(S.sub.1 modulo N.sub.M(S)).

18. The garbage-collecting memory module of claim 2 wherein said garbage-collecting control unit includes a plurality of from-space and a plurality of to-space object space managers, said from-space object space managers being assigned contiguous regions of equal size of from-space, said to-space object space managers being assigned contiguous regions of equal size of to-space, an object space manager generating and storing an object locator code for each memory cell occupied by an object in its assigned region, said object locator code being retrievable by said garbage-collecting control unit and translatable into the address of the memory cell in which said object header is located.

19. The garbage-collecting memory module of claim 18 wherein said from-space and to-space memory regions each contains L.sub.1 memory cells, the individual memory cells being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, the number of from-space object space managers and the number of to-space object space managers each being equal to L.sub.1 /F, F being a submultiple of L.sub.1, the object space managers being represented by the integers from 0 to (L.sub.1 /F)-1, a from-space or to-space memory cell L being assigned to object space manager INT(L/F) where INT() is the integer portion of the quantity in parentheses, the quantity F having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to F, an object being allocated memory cells from S.sub.1 to S.sub.2, an object space manager comprising:
an object space manager memory;
an encoder which generates an object locator code [M(S), S.sub.1 modulo N.sub.M(S) ] for each from-space or to-space memory cell S assigned to said object space manager and in the range S.sub.1 to S.sub.2, S.sub.1 and S.sub.2 being inputs to said encoder, S.sub.1 being the memory cell occupied by the header of the object, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1 /N.sub.M(S)), said object locator code being [Z, S.sub.1 modulo F] if INT(S.sub.1 /F) is less than INT(S/F), Z being the difference between INT(S.sub.1 /F) and INT(S/F), said object locator code being outputted to said object space manager memory for storage;
an object locator which accepts a memory cell number S allocated to said object as input, S being within said object space manager's assigned region of memory, retrieves the object locator code from said object space manager memory, and produces said S.sub.1 as output, S.sub.1 being equal to [INT(S/N.sub.M(S))]N.sub.M(S) +(S.sub.1 modulo N.sub.M(S)) if the first term of said object locator code is a positive integer, S.sub.1 being equal to [INT(S/F)]F+ZF+(S.sub.1 modulo F) if the first term of said object locator code is a negative integer.

20. The garbage-collecting memory module of claim 2 wherein said garbage-collecting control unit maintains the rate of memory allocation for new objects at a level equal to or less than the rate of garbage collection.

21. The garbage-collecting memory module of claim 20 wherein said garbage-collecting control unit includes a ScanBalance register for monitoring the rate of garbage collection vis a vis the rate of memory allocations for new objects, the contents of said register being initialized to 0 at the start of a garbage-collecting cycle, said register being decremented by one for each operation performed on a word in memory in connection with said garbage collection, said register being incremented by 2 nK for each allocation of memory for a new n-word object, K being a constant, the quantity 2 nK being the number of garbage-collection operations that must be performed to collect n words of garbage, said garbage-collecting control unit making a memory allocation for a new object only if said allocation will not cause the contents of said register to exceed 0.

22. A process for collecting garbage in a memory and providing memory services to a computer system, said memory comprising a from-space region and a to-space region for the storage of objects, a designated word of each object being a header that specifies the size of said object and whether said object contains descriptors, said computer system maintaining a list of source descriptors pointing to regions of said memory containing live objects, said process comprising the steps:
performing a memory service at the request of said computer system, said memory service consisting of allocating space for an object in said to-space region of memory;
performing a memory service at the request of said computer system, said memory service consisting of storing an object in said to-space region of memory;
performing a memory service at the request of said computer system, said memory service consisting of retrieving an object from memory and returning said object to said computer system;
obtaining said source descriptors from said computer system;
entering the objects to which said source descriptors point and slice regions containing data belonging to said objects into a copying queue, said objects being called source-descriptor objects;
allocating space in to-space for said objects and slice regions;
writing a header and a from-space pointer for each of said objects and slice regions in predetermined memory cells of the to-space memory allocations of said objects and slice regions, said from-space pointers pointing to the locations of said objects and slice regions in from-space;
replacing the header of each of said objects and slice regions in from-space with a forwarding pointer to the header in to-space;
updating said source descriptors to point to to-space;
returning the updated source descriptors to said computer system;
copying the objects and slice regions in the copying queue from from-space to to-space when not performing or requested to perform memory services for said computer system.

23. The process of claim 22 wherein the objects stored in said memory include weak-pointer objects, a weak-pointer object being an object containing a weak pointer, objects referenced only by weak pointers being garbage, weak-pointer objects being distinguishable from other objects, said process comprising the additional steps:
entering weak-pointer objects into a weak-pointer object postprocessing (WPOP) queue when weak-pointer objects are entered into said copying queue;
postprocessing of the objects in the WPOP queue, said postprocessing being performed after all objects in said copying queue have been copied, postprocessing of a weak-pointer object consisting of either (1) updating the weak pointer to reflect the referenced object's new location in to-space if the object has been copied into to-space or (2) overwriting the weak pointer with 0 in the weak-pointer field if the object has not been copied into to-space.

24. The process of claim 22 wherein each word of an object has a tag that identifies the word as either a descriptor or terminal word, said process comprising the additional steps:
identifying the descriptors resident within objects in the copying queue, said descriptors being called resident descriptors, the identification being made as the objects are copied;
adding the objects to which said resident descriptors point to said copying queue, said objects being called resident-descriptor objects;
allocating space in to-space for said resident-descriptor objects;
updating said resident descriptors to point to to-space.

25. An electronic circuit for practicing the process of claim 24.

26. The process of claim 24 wherein said source-descriptor objects include only those weak-pointer objects from which the data fields are fetched by said computer system during garbage collection, a weak-pointer object being an object containing a weak pointer, objects referenced only by weak pointers being garbage, weak-pointer objects being distinguishable from other objects.

27. The garbage-collecting memory module of claim 26 wherein said computer system may enquire into the value of a weak pointer and the status of the object referenced by the weak pointer without causing said weak-pointer objects to be included with said source-descriptor objects.

28. The process of claim 24 wherein said updating step for each of said resident descriptors comprises the steps:
obtaining the pointer to the header of the resident-descriptor object;
calculating the to-space location of the resident descriptor by adding said header to the difference in said resident descriptor and said header pointer.

29. The process of claim 24 comprising the additional steps:
generating an object locator code [M(S), S.sub.1 modulo N.sub.M(S) ] for each memory cell S in the range S.sub.1 to S.sub.2 allocated to an object entering said copying queue, S.sub.1 being the memory cell occupied by the header of the object, said from-space and to-space regions of memory each containing L.sub.1 memory cells, the individual memory cells in each of said regions being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, L.sub.1 having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to L.sub.1, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1 /N.sub.M(S)), INT() being the integer portion of the quantity in parentheses;
saving said object locator code.

30. The process of claim 29 wherein said updating step for each of said resident descriptors comprises the steps:
obtaining the pointer to the header of a resident-descriptor object;
calculating the to-space location of said resident descriptor by adding said header to the difference in said resident descriptor and said header pointer.

31. The process of claim 30 wherein said header pointer obtaining step comprises the steps:
translating the resident descriptor into the memory cell number S;
retrieving the object locator code corresponding to S;
decoding the object locator code into the memory cell number containing the header of the resident-descriptor object, said memory cell number being equal to [INT(S/N.sub.M(S))]N.sub.M(S) +(S.sub.1 modulo N.sub.M(S));
translating said header memory cell number into said header pointer.

32. The process of claim 24 comprising the additional steps:
generating an object locator code [M(S), S.sub.1 modulo N.sub.M(S) ] for each memory cell S in the range S.sub.1 to S.sub.2 allocated to an object entering said copying queue, S.sub.1 being the first word of said object, said from-space and to-space regions of memory each containing L.sub.1 memory cells, the individual memory cells in each of said regions being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, each of said regions consisting of L.sub.1 /F subregions numbered from 0 to L.sub.1 /F-1, each of said subregions having F memory cells, F being a submultiple of L.sub.1, a memory cell L being located in subregion INT(L/F) where INT() is the integer portion of the quantity in parentheses, F having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to F, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1 /N.sub.M(S)), said object locator code being [Z, S.sub.1
modulo F] if INT(S.sub.1 /F) is less than INT(S/F), Z being the difference between INT(S.sub.1 /F) and INT(S/F);
saving said object locator code.

33. The process of claim 32 wherein said updating step for each of said resident descriptors comprises the steps:
obtaining the pointer to the header of a resident-descriptor object;
calculating the to-space location of said resident descriptor by adding said header to the difference in said resident descriptor and said header pointer.

34. The process of claim 33 wherein said header pointer obtaining step comprises the steps:
translating the resident descriptor into the memory cell number S;
retrieving the object locator code corresponding to S;
decoding the object locator code into the memory cell number containing the header of the resident-descriptor object, said memory cell number being equal to [INT(S/N.sub.M(S))]NM(S)+(S.sub.1 modulo N.sub.M(S)) if the first term of said object locator code is a positive integer, said memory cell number being equal to [INT(S/F)]F+ZF+(S.sub.1 modulo F) if the first term of said object locator code is a negative integer;
translating said first-word memory cell number into said header pointer.

35. An electronic circuit for practicing the process of claim 34.

36. The process of claim 24 comprising the additional step:
generating and saving an object locator code for each object and slice region added to said copying queue, said object locator code enabling the address of any memory cell within the memory allocation of an object or slice region to be decoded into the number of the memory cell in which the header of said object or slice region is located.

37. The process of claim 36 wherein a slice region comprises a header and a plurality of subregions, all subregions except the first and last having the same number of memory cells, the sum of the numbers of memory cells in the first and last subregions being equal to the number of memory cells in each of the other subregions, said process comprising the additional steps:
placing each copied slice object in a scanning queue;
entering initial values for a slice region control block into the slice region in from-space for each copied slice region and temporarily replacing the slice region's title in to-space with a pointer to said slice region control block, said slice region control block consisting of a pointer to the slice region in to-space, a number specifying the number of memory cells occupied by the slice region, a pointer to the slice region control block for the next slice region copied, and a plurality of subregion control blocks, each subregion control block consisting of a first-cell pointer to the first memory cell referenced by slice objects pointing into said subregion and a length which, when added to said pointer, corresponds to the last memory cell occupied by any slice object pointing into said subregion;
scanning the slice objects in said scanning queue after said copying queue empties, said scanning resulting in the entry of final values in said slice region control blocks for each slice region referenced by a slice object in said scanning queue and the restoration of the title of each slice region following the entry of final values in the associated slice region control block;
postprocessing said slice region control blocks after said scanning queue empties, said postprocessing resulting in the examination of each slice region control block for garbage-containing memory regions and the separation of slice regions having garbage-containing memory regions into multiple slice regions provided said garbage-containing slice regions are large enough to accept slice region headers.

38. An electronic circuit for practicing the process of claim 37.

39. The process of claim 37 comprising the additional step:
changing the number of memory cells in said first subregion at the start of each garbage-collection cycle.

40. The process of claim 37 wherein said object locator code is [M(S), S.sub.1 modulo N.sub.M(S) ] for each memory cell S in the range S.sub.1 to S.sub.2 allocated to a slice region entering said copying queue, S.sub.1 being the memory cell occupied by the header of the slice region, said from-space and to-space regions of memory each containing L.sub.1 memory cells, the individual memory cells in each of said regions being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, L.sub.1 having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to L.sub.1, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1 /N.sub.M(S)), INT() being the integer portion of the quantity in parentheses.

41. The process of claim 40 wherein said scanning step includes the steps:
reading the slice region pointer of the first slice object in said scanning queue;
translating said slice region pointer into a memory cell number S in the referenced slice region;
retrieving the object locator code corresponding to S;
decoding the object locator code into the memory cell number containing the header of said slice region, said memory cell number being equal to [INT(S/N.sub.M(S))]N.sub.M(S) +(S.sub.1 modulo N.sub.M(S));
translating said header memory cell number into a pointer to said slice region header.

42. The process of claim 37 wherein said object locator code is [M(S), S.sub.1 modulo NM(S)] for each memory cell S in the range S.sub.1 to S.sub.2 allocated to a slice region entering said copying queue, S.sub.1 being the memory cell occupied by the header of the slice region, said from-space and to-space regions of memory each containing L.sub.1 memory cells, the individual memory cells in each of said regions being represented by the integers from 0 to L.sub.1 -1, the addresses of the individual memory cells being the memory cell integers plus an integer offset, said offset being different for from-space and to-space, the difference in offsets for the two regions of memory being equal to or greater than L.sub.1, each of said regions consisting of L.sub.1 /F subregions numbered from 0 to L.sub.1 /F-1, each of said subregions having F memory cells, F being a submultiple of L.sub.1, a memory cell L being located in subregion INT(L/F) where INT() is the integer portion of the quantity in parentheses, F having a plurality of submultiples N.sub.1, N.sub.2, . . . N.sub.G, the largest submultiple N.sub.G being equal to F, M(S) being the subscript of the smallest of said submultiples for which INT(S/N.sub.M(S)) equals INT(S.sub.1
/N.sub.M(S)), said object locator code being [Z, S.sub.1 modulo F] if INT(S.sub.1 /F) is less than INT(S/F).

43. An electronic circuit for practicing the process of claim 42.

44. The process of claim 42 wherein said scanning step includes the steps:
reading the slice region pointer of the first slice object in said scanning queue;
translating said slice region pointer into a memory cell number S in the referenced slice region;
retrieving the object locator code corresponding to S;
decoding the object locator code into the memory cell number containing the header of said slice region, said memory cell number being equal to [INT(S/N.sub.M(S))]N.sub.M(S) +(S.sub.1 modulo N.sub.M(S)) if the first term of said object locator code is a positive integer, said memory cell number being equal to [INT(S/F)]F+ZF+(S.sub.1 modulo F) if the first term of said object locator code is a negative integer;
translating said header memory cell number into a pointer to said slice region header.

45. The process of claim 37 wherein for each slice object in said scanning queue said scanning step includes the steps:
reading the slice region pointer of the first slice object in said scanning queue;
finding the header of the referenced slice region;
finding the control block associated with said slice region, said header being the pointer to said control block;
identifying the subregion
containing the first address referenced by said slice object;
updating said first-cell pointer and said length field within the control block of said subregion;
restoring the title of said slice object and removing said slice object from said scanning queue;
updating each descriptor in said slice region referenced by said slice object to point to to-space.

46. The process of claim 37 wherein said postprocessing step comprises the steps:
searching said subregion control blocks from first to last for contiguous segments of live data;
overwriting the memory preceding each contiguous segment of live data found with an appropriate slice data header.

47. The process of claim 22 comprising the additional step:
maintaining the rate of allocating memory for new objects in to-space equal to or less than the rate at which garbage is collected.

48. The process of claim 47 wherein said maintaining step comprises the steps:
counting downward in unit increments from 0 at the beginning of a garbage-collection cycle for each operation performed on a word in memory while collecting garbage;
counting upward by a 2 nK increment each time a memory allocation for a new n-word object is made, K being a predetermined constant, the quantity 2 nK being the number of garbage-collection operations that must be performed to collect n words of garbage;
enabling memory allocation for new objects only if the count that results from counting upward and downward is less than or equal to 0.

Description

BACKGROUND OF INVENTION

This invention relates generally to computer systems and more specifically to the memory portions of such systems.

One of the major trends in computer science of the last decade has been the increasing popularity of the "object-oriented" paradigm. While there is little consensus regarding the meaning of this term, any object-oriented system must be concerned with the allocation and maintenance of storage for "objects" where an "object" is data that share a particular attribute and occupy a contiguous region of memory. Objects are not permitted to overlap. "Live" objects are those needed in the computational process currently being performed by a computer system.

If all objects in a system are permanent, then there is no concern about memory management. The space assigned to each object at system startup need never be reclaimed. In most real systems, however, live objects have varying lifetimes that cannot be predicted in advance. In such systems, some method of recognizing expired ("dead") objects and evicting them from memory is necessary if memory resources are to be conserved.

"Garbage" is a term of art in computer technology which refers to data stored in computer system memory that is no longer being used in the performance of an application program. Garbage collection is the process of locating data in dynamically-allocated memory that is no longer being used and reclaiming the memory to satisfy future allocation requests. Since garbage collection greatly reduces low-level programming detail, it offers the potential of significant programmer productivity gains. By freeing programmers from this low-level detail, garbage collection encourages programmers and system designers to dedicate their intellectual efforts to higher-level pursuits, such as the design of fundamental algorithms, user interfaces, and general program functionality. Also, by eliminating many low-level programming concerns, garbage collection reduces the likelihood of programming errors. And finally, dynamic memory management based on copying-types of garbage-collection algorithms are capable of delivering much higher storage throughput than explicit allocation and deallocation, reference-count storage reclamation, and even stack allocation. Together these benefits of garbage collection combine to offer improved software functionality and reliability for lower development costs.

Traditional garbage collectors work by periodically halting execution of system programs in order to traverse all of memory in search of memory regions that are no longer in use. Traditional garbage collectors have a number of major shortcomings: (1) storage throughput in terms of rates of allocation and deallocation of objects is generally much lower than, for example, stack allocation; (2) the times required to allocate memory are only very loosely bounded--the bounds on allocation times are not tight enough to allow reliable programming of highly-interactive or real-time systems such as mouse tracking, interactive multimedia device control, virtual reality systems, and reactive robot control; and (3) in incremental garbage collection systems, the performance penalties associated with memory reads and writes are so high that overall system performance may be unacceptably slow.

Traditional garbage collection systems are incompatible with real-time systems because of their stop-and-wait behavior. Real-time garbage collectors work by dividing the labor of garbage collection into many small steps so that system programs are infrequently halted for the purpose of collecting garbage. Software prototypes of real-time garbage collection algorithms demonstrate the feasibility of the real-time algorithms but exhibit much poorer throughput than traditional garbage collection algorithms. By dedicating hardware to the task of garbage collection, both real-time response and high storage throughput are possible.

A number of incremental garbage collection techniques have been proposed. Some of these are capable of guaranteeing upper bounds on the times required to allocate a unit of memory and to read or write previously allocated memory cells. All of the incremental garbage collection algorithms require frequent synchronization between the application processor and the garbage collector. Depending on the algorithm, this synchronization generally consists of one or more extra instructions executed on every fetch or store that accesses the garbage-collected heap. In detailed performance analysis of these systems, the overhead of synchronizing on writes ranges from 3 to 24 percent of total execution time in one study, and synchronizing on reads was found to more than double execution time in a different study. Furthermore, all garbage collectors occasionally suspend execution of the application while the garbage collector completes certain uninterruptable activities.

A real-time garbage collector must honor a certain upper bound on the duration of time during which it might suspend execution of the application process. The tightest bound currently available on the time applications must wait for garbage collection based on using stock hardware is 500 microseconds for applications that are somewhat restricted in their use of dynamic memory. More general garbage collection systems promise looser bounds ranging from several to several hundred milliseconds. Suspending execution of the application process for time periods such as these is unacceptable for many real-time system applications.

BRIEF SUMMARY OF INVENTION

The garbage-collecting memory module (GCMM) is intended to function much like traditional memory in a computer system thereby permitting the invention to be utilized with a wide variety of computers. It differs from traditional memory in that it automatically cleanses itself of garbage while functioning as traditional memory without causing excessive delays in the execution of application programs by an associated computer. The GCMM can be designed to interface with a computer system via a traditional memory bus and to communicate with the central processing unit (CPU) of the computer using standard communication protocols.

The GCMM is comprised of a memory, a means for communicating with the CPU, and a garbage-collecting control unit. The garbage-collecting control unit gives top priority to satisfying the computer's requests for memory services. The collection of garbage takes place during the intervals between memory service requests.

Garbage collection is accomplished by copying live objects that are stored in one region of memory to a second region, thereby leaving dead objects behind in the first region. When the copying process has been completed, the dead objects are disposed of, and the garbage-collecting process continues with the copying of live objects in the second region back to the first. An up-to-date list of live objects is maintained by the central processing unit and forwarded to the GCMM at the start of each garbage-collection cycle.

The copying process requires that memory space be allocated to the objects being copied in the region of memory to which the objects are being transferred. The information needed for allocating memory space to objects is contained in the header of each object. Thus, the garbage-collecting control unit must be able to access the header of each live object being transferred to another region of memory. This requirement poses a problem in the case of live objects that are discovered as a result of examining the objects on the CPU's list in that the object headers are not readily available.

The list of objects supplied by the CPU is essentially a list of pointers to addresses contained within live objects. Objects on the CPU's list may contain pointers to data within other objects and these "referenced" objects are also live and must also be copied even though they do not appear on the CPU's list of live objects. In order that garbage collecting may proceed expeditiously, it is necessary that pointers to data within objects be translatable into pointers to object headers.

A key circuit in the garbage-collecting control unit is the object space manager which provides the means for deriving pointers to headers of objects from pointers to internal data of objects. The object space manager comprises an encoder that generates an object locator code for each memory cell in which an object is resident, a memory for storing the object locator codes for all memory cells in which objects are resident, and an object locator which identifies the memory cell containing the header of an object by means of the object locator code for any memory cell occupied by the object.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram of a computer system employing the garbage-collecting memory module.

FIG. 2 is a block diagram of the garbage-collecting memory module showing the interface with the conventional system bus of a computer system.

FIG. 3 shows the coding format for object headers.

FIG. 4 is a C++ code fragment which identifies the special I/O addresses that the mutator uses in communicating with the garbage collector.

FIG. 5 gives type definitions and constants to which the C++ code fragments shown in subsequent figures make frequent reference.

FIG. 6 gives variables and data structures to which the C++ code fragments shown in subsequent figures make reference.

FIG. 7 is a C++ code fragment which allocates "size" bytes of memory to a record.

FIG. 8 is a flow diagram which describes the operation of allocating "size" bytes of memory to a record.

FIG. 9 illustrates the relationship between a slice region and slice objects.

FIG. 10 is a C++ code fragment that allocates "size" bytes of new slice region data and a corresponding slice object that refers to that data.

FIG. 11 is a flow diagram which describes the operation of allocating "size" bytes of slice region data and a corresponding slice object that refers to the allocated slice region data.

FIG. 12 is a C++ code fragment which allocates "size" bytes to a slice object in a previously-allocated slice data region

FIG. 13 is a flow diagram which describes the operation of allocating "size" bytes to a slice object in a previously-allocated slice data region.

FIG. 14 is a C++ code fragment which demonstrates the protocol for initializing a block of memory.

FIG. 15 is a flow diagram which describes the operation of initializing a block of memory.

FIG. 16 is a continuation of the flow diagram of FIG. 15.

FIG. 17 is a C++ code fragment which demonstrates the recommended protocol for tending a descriptor.

FIG. 18 is a flow diagram which describes the operation of tending a descriptor.

FIG. 19 is a C++ code fragment which demonstrates the protocol for informing the garbage collector that all descriptors have been tended.

FIG. 20 is a flow diagram which describes the operation of reading a single word from memory.

FIG. 21 is a flow diagram which describes the operation of writing a single word to memory.

FIG. 22 is a flow diagram which describes the operation of reading the descriptor tag associated with the word at a particular memory location.

FIG. 23 is a flow diagram which describes the operation of copying data from one memory region to another.

FIG. 24 is a continuation of the flow diagram of FIG. 23.

FIG. 25 is a continuation of the flow diagram of FIG. 23.

FIG. 26 is a flow diagram which describes the operation of increasing the live portion of a stack, initializing each of the stack-allocated words, and setting descriptor tags.

FIG. 27 is a flow diagram which describes the operation of expanding a stack and then copying data with accompanying descriptor tags onto the top of the stack.

FIG. 28 is a continuation of the flow diagram of FIG. 27.

FIG. 29 is a flow diagram which describes the operation of shrinking the size of a stack.

FIG. 30 is a flow diagram which describes the operation of allocating memory to a record and initializing the descriptor tag of each word.

FIG. 31 is a flow diagram which describes the operation of allocating memory to a stack and returning a pointer to the first of the allocated words.

FIG. 32 is a C++ fragment which demonstrates the protocol for initiating garbage collection.

FIG. 33 gives the port addresses of the garbage-collecting memory module components.

FIG. 34 gives the coding format for random-access memory module commands.

FIG. 35 gives the coding format for object space management module commands.

FIG. 36 illustrates a single level-one object space management group which controls eight possible object locations.

FIG. 37 illustrates two level-one object space management groups controlled by a two-element level-two group.

FIG. 38 illustrates four level-one object space management groups controlled by two level-two groups which are in turn controlled by a single level-three group.

FIG. 39 shows C++ declarations which represent the three levels of the object space management hierarchy exemplified in FIG. 38 and a representation in C++ of the beginning portion of the algorithm for installing a new object within an object space management module.

FIG. 40 is the continuation of a representation in C++ of the beginning portion of the algorithm for installing a new object within an object space management module.

FIG. 41 is a representation in C++ of the algorithm to find a header within the object space management module.

FIG. 42 identifies registers used to keep track of activities while transferring data out of one region of memory into another during garbage collection.

FIG. 43 shows C++ declarations which are used in pseudo-code implementations of TendDesc, HandleRead, and HandleWrite operations.

FIG. 44 shows the algorithm expressed in C++ that is used by the arbiter in monitoring memory read transactions.

FIG. 45 shows the algorithm expressed in C++ that is used by the arbiter in monitoring memory write transactions.

FIG. 46 shows the coding format for mutator-initiated commands to the arbiter.

FIG. 47 shows the algorithm expressed in C++ for tending a descriptor.

FIG. 48 shows the coding format for the microprocessor's work requests.

FIG. 49 is a flow diagram which describes the operation of incrementally copying data contained within a single from-space object to the Relocated address and maintaining the contents of the CopySrc, CopyDest, and CopyEnd registers during copying.

FIG. 50 is a flow diagram which describes the operation of incrementally scanning data contained within a single from-space object while copying it to the Relocated address and maintaining the contents of the CopySrc, CopyDest, and CopyEnd registers during copying.

FIG. 51 is a flow diagram which describes the operation of tending a descriptor held in memory.

FIG. 52 is a flow diagram which describes the operation of incrementally scanning data contained within a single to-space object.

FIG. 53 shows two functions expressed in C++ which abstract the interface between the microprocessor and the arbiter.

FIG. 54 gives constants expressed in C++ which represent the operation code portion of the operation encodings shown in FIG. 48.

FIG. 55 gives the type declarations which are used in the C++ implementation of the garbage collection code that runs on the microprocessor.

FIG. 56 shows the declarations pertaining to the tags used within object headers to represent the type of the object.

FIG. 57 shows the declarations supporting the operation of aligning slice subregions at a different offset relative to the beginning of the slice region on each pass of the garbage collector.

FIG. 58 shows two C++ routines which implement the operations of returning a pointer to a slice data region's control block and making a header for the controlled region consisting of the pointer combined with the region's type tag.

FIG. 59 shows the C++ declarations of constants that represent the values of flags that distinguish terminal from descriptor data and that identify write-protected memory.

FIG. 60 is a C++ code fragment which obtains the size of the object from the object's header.

FIG. 61 shows C++ declarations which represent the configuration of the garbage-collecting memory module.

FIG. 62 defines the pendingOperation variable which remembers whether the arbiter is currently working on an operation whose completion has not yet been verified and shows the functions expressed in C++ which represent the interface between the arbiter and the microprocessor but which return no result when executed.

FIG. 63 shows additional functions expressed in C++ which represent the interface between the arbiter and the microprocessor but which return no result when executed.

FIG. 64 shows additional functions expressed in C++ which represent the interface between the arbiter and the microprocessor but which return no result when executed.

FIG. 65 shows C++ library routines that return arbiter responses to requested operations.

FIG. 66 shows the beginning portion of a C++ implementation of the garbage collector.

FIG. 67 shows the second portion of a C++ implementation of the garbage collector which copies an object into to-space.

FIG. 68 shows the third portion of a C++ implementation of the garbage collector which copies slice region data and initializes a slice region control block.

FIG. 69 shows the fourth portion of a C++ implementation of the garbage collector which (1) updates the control block for the slice region that is associated with the scanned slice object and (2) if the slice object is identified as a descriptor slice, rescans the corresponding slice region and tends any descriptors referenced by this particular slice object.

FIG. 70 shows the fifth portion of a C++ implementation of the garbage collector wherein certain macros are defined and the initial part of the sixth portion of a C++ implementation of the garbage collector which postprocesses the control blocks, restores the headers of slice regions, carves up controlled regions into smaller regions containing live data, looks for live data, looks for the ends of live data, and starts a new region of contiguous data if the next live data starts after the current contiguous data region ends and if certain other conditions are met.

FIG. 71 shows the concluding part of the sixth portion of a C++ implementation of the garbage collector which postprocesses the control blocks, restores the headers of slice regions, carves up controlled regions into smaller regions containing live data, looks for live data, looks for the ends of live data, and starts a new region of contiguous data if the next live data starts after the current contiguous data region ends and if certain other conditions are met.

FIG. 72 shows the seventh portion of a C++ implementation of the garbage collector which detects the end of a segment of live data which will become an independent slice region.

FIG. 73 shows the concluding portion of a C++ implementation of the garbage collector which, whenever the garbage collector isolates a sufficiently large contiguous span of live slice region data, encapsulates this slice data into a smaller slice region.

DESCRIPTION OF THE PREFERRED EMBODIMENT

While garbage-collection researchers struggle to alleviate the shortcomings of traditional garbage collection methods, continuing advances in computer architecture and VLSI technology have made feasible new techniques for high-performance real-time garbage collection. More specifically, memory chips and other VLSI processing elements are increasingly affordable. Already it is common for the RAM in desktop workstations to exceed the memory needs of typical users. Permanently dedicating large segments of physical memory to hard real-time tasks is now economically feasible. VLSI circuitry is so inexpensive that it represents only a small fraction of a modern computer system's cost. These advances make possible cost-effective hardware-assisted garbage collection such as that performed by the garbage-collecting memory module (GCMM).

Recent history has taught that special-purpose architectures such as Lisp machines cannot easily compete in the free market with mass-marketed general-purpose systems. Special-purpose architectures do not enjoy the luxury of large teams of engineers to implement pipelined, superpipelined, and superscalar versions of their processors because the target audience is too small. For similar reasons, major software developers do not consider it economical to port their products to specialized architectures.

To avoid these pitfalls, all of the special circuitry associated with the high-performance garbage-collecting process that provides the conceptual basis for the present invention is incorporated within a special memory module that interfaces with the central processor unit (CPU) by way of a traditional memory bus. The GCMM mimics traditional memory for fetch and store operations and additionally provides several I/O ports to support allocation and identification of objects intended for storage in the GCMM.

Since, in principle, the GCMM can be interfaced with a large number of different CPU and bus architectures, the technology investment can be shared between users of many different processor architectures. Furthermore, computer users can retain their existing computer components and familiar software libraries when they add high-performance real-time garbage collection capabilities to their systems. Additionally, the interface to the GCMM is designed to provide generality and flexibility to application and programming language implementors. The GCMM supports a variety of primitive data structures from which specialized data structures to support languages like C++, Icon, and Smalltalk are easily constructed.

Throughout this specification, the term "garbage collector" refers to the processing elements within the GCMM that perform the garbage collection process. Communicating with the garbage collector consists of reading or writing to dedicated I/O addresses on the computer system bus. Though the contents of the GCMM are usually cached, commands and communication sent via the I/O system typically are not.

"Word", as used herein, is the architecture-specific size of a pointer. The preferred embodiment of the GCMM uses 32-bit words and assumes the address space is byte addressable. All GCMM-allocated objects are aligned on word boundaries. For certain applications of the invention, other word sizes or alignments may be more appropriate.

Throughout this specification, the term "descriptor" is used interchangeably with "pointer". By pointing to objects allocated elsewhere, each descriptor is capable of "describing" all conceivable kinds of information. To the garbage collector, an object is simply a contiguous region of memory that shares a particular attribute. Since some programming language implementations use linked data structures to represent individual language-defined objects, the garbage collector's view of what constitutes an object may differ from the view of a particular object-oriented programming language.

We use the adjective "terminal" to characterize memory locations known not to contain pointers. If all live memory is represented as a directed graph in which nodes represent dynamically-allocated objects and directed edges represent pointers from one object to another, the terminal nodes are those from which no directed edges emanate. The source nodes in this directed graph are pointers residing outside of the GCMM. These source pointers, which are under direct control of the CPU, are called "tended descriptors".

During garbage collection, live objects are copied from one region of memory to another. At the moment garbage collection begins, the application process updates each of the tended descriptors to point to the new locations of the objects they reference by communicating with the GCMM via dedicated I/O ports. Tended descriptors may reside either in physical machine registers of the application processor or within traditional memory.

Application processes run on the CPU and garbage-collection tasks run within the GCMM. Application programs are collectively referred to herein as the "mutator" since, insofar as garbage collection is concerned, their only role is to modify (or mutate) the contents of GCMM memory.

The garbage collector distinguishes between memory representing descriptors and memory representing terminal data by adding a one-bit "descriptor tag" to each 32-bit word of memory. Instead of using an extra bit to tag descriptors, a convention could be established whereby all words are internally tagged without the need for a 33rd bit of RAM to accompany each word. The important point is that the garbage collector must be able to quickly distinguish pointers from non-pointers.

Besides distinguishing between descriptors and terminals, the garbage collection protocol allows some flexibility in declaring the significance of each descriptor with respect to the object it references. In some cases a pointer to a word contained within a larger object is interpreted by the garbage collector as an indication that the entire referenced object is live. In other cases, only a portion of the referenced object is considered to be live, and the garbage collector takes responsibility for shrinking or splitting the enclosing object in order to isolate and reclaim garbage from within it. These different cases are distinguished by the garbage collector based on the types of the referencing and referenced objects.

Every GCMM-allocated object has a header containing information used by the garbage collector. The first word of every header is an encoded "title" representing the object's type and size. The headers of GCMM-allocated stacks contain additional information besides the title, as described below. For all other objects, the title comprises the entire header.

A record is a fixed-size object containing any combination of descriptors and terminal data. The size of an allocated record is defined at the time of its allocation. However, its internal organization as characterized by descriptor tags on individual words within the record does not necessarily remain constant.

The record type is the most fundamental of the supported types. Records can be used to implement C++ and Smalltalk objects; C arrays, structures, and unions; and Lisp dotted pairs. Data structures built from linked records can be used to implement, for example, Icon tables and Smalltalk class hierarchies.

If any address location within a record is referenced by a live descriptor, the entire record is considered live.

A stack is a fixed-size object containing descriptor and terminal data and a one-word field representing the offset of the stack's current top element. The preferred embodiment of the garbage collector implements only stacks that grow downward. Comparisons between the locations of stack-allocated objects and the current top of stack are described in this specification using the adjectives "above" and "below". Because stacks grow downward, addresses above the current top-of-stack location are smaller-valued absolute addresses.

Each time the stack grows or shrinks, the application must update the stack's height by communicating with the garbage collector. Words within the stack are tagged similarly to words within records. Updating these tags makes growth of a garbage-collected stack more expensive than traditional stack allocation, which consists simply of decrementing the dedicated stack pointer register by the desired amount of stack growth. No tag maintenance is performed when the stack shrinks, so removing elements from the stack is nearly as efficient as in traditional stack architectures.

Because of the extra effort spent initializing descriptor tags for words pushed onto the stack, stack allocation of activation frames is not much faster than GCMM allocation of records. However, during certain phases of garbage collection, allocation of records is accompanied by garbage collection efforts that may incur delays proportional to the size of the record. Stack allocation does not incur this overhead, since the stack expands into memory that was allocated previously. Another advantage of stack allocation and deallocation is that it does not contribute to the pool of memory that must eventually be reclaimed by the garbage collector. An application that stack-allocates rather than GCMM-allocates objects collects garbage less frequently.

If any address location contained within a stack object is referenced by a live descriptor (even a location above its current top), then the entire stack object is considered to be live. When processing a live stack, the garbage collector examines only that portion of the stack found beneath its current top in search of pointers to additional objects.

A slice object consists of a pointer to a location within a slice region and a length representing the number of consecutive bytes from that point forward that are contained within the slice. Slices are useful in implementing the built-in string and stream data types of languages like Icon and Conicon. They might also be used to represent the catenation of multimedia audio visual clips into complete audiovisual programs, and to implement shared code segments in a dynamic object-oriented programming environment. Once allocated, a slice object is considered to be read-only. Only the slice region data referenced by the slice object is writable.

When the garbage collector allocates a slice object, it initializes the object to point to a segment of contiguous slice region data. The referenced slice region is either allocated at the same time the slice object is allocated, or it is a subslice of a previously allocated segment of slice data. Slice objects may overlap each other in a slice region. Slice objects may be either descriptor slices or terminal slices. The referenced slice region for a descriptor slice may contain descriptors while that for a terminal slice does not. The distinction between terminal and descriptor slice objects is made because terminal slices make more efficient use of available memory.

Arbitrary descriptors may point directly into the slice region. These descriptors are updated properly during garbage collection. However, the slice region data referenced by an arbitrary descriptor is only treated as live if it is also referenced by a slice object. The rationale for this rule is to provide efficient support for machine register induction variables and derived pointers (including possibly the machine's instruction and stack pointers) to slice regions. These tended descriptors typically obtain new values by incrementing their previous values rather than loading from memory. For the garbage collector to decide how much slice region data should be treated as live, based only on tending of a descriptor that points to a particular location within that region, is not generally feasible. Furthermore, for the garbage collector to treat each read or write of slice region data as enlivening the referenced word significantly adds to the garbage collector's complexity and increases the number of memory cycles required to handle fetch and store operations. For these reasons, the garbage collector considers as live only slice region data that is directly referenced by slice objects.

Within slice regions, descriptors are distinguished from terminals using descriptor tags, as discussed above. Unlike records and stacks, the garbage collector may shrink a slice region or may split a single slice region into several smaller regions if segments of unreachable data are found within the region.

Slice regions are not directly visible to the mutator. There is no way to explicitly allocate one, or to directly manipulate its size. Instead, the mutator asks the garbage collector to allocate a slice object that refers to a particular amount of slice region data. In satisfying this request, the garbage collector may allocate a new slice region or it may obtain the requested segment of slice region data from within a slice region that was allocated previously. After allocating a slice object, the mutator initializes the descriptor tags of the referenced slice region by invoking certain primitive operations that will be described later.

The real-time garbage-collection process is based on an algorithm originally described by Henry Baker (H. G. Baker Jr., "List Processing in Real Time on a Serial Computer", Comm. ACM 21, 4 (Apr. 1978), 280-293). The basic idea of the algorithm is to divide available memory into two regions named "to-space" and "from-space". Objects are allocated space in to-space while previously allocated live objects are incrementally copied into to-space out of from-space.

When the garbage collector copies an object into to-space, the first word of the old object is overwritten with a forwarding pointer to the object's new location. The garbage collector uses this forwarding pointer to update other pointers that refer to the same object. When those pointers are traced, the garbage collector recognizes that the referenced object's first word is a forwarding pointer and updates the pointers appropriately rather than creating yet another copy of the referenced object.

The garbage collector tends all pointers contained within records, stacks, and slice objects as they are copied. Descriptors within slice regions are tended after being copied. Tending of a descriptor consists of first checking whether the descriptor refers to from-space. If the descriptor refers to a from-space object already scheduled for copying into to-space, the object's new location is found by examining the from-space object's forwarding pointer. If the descriptor refers to a from-space object that has not yet been scheduled for copying, the garbage collector inserts the referenced object into the copy queue. In either case, descriptor tending makes sure that the obsolete pointer to from-space is replaced with an updated pointer to the new location of the object in to-space.

New objects are allocated at the same time that old objects are being copied into to-space. When there is no longer adequate memory in to-space to satisfy an allocation request, garbage collection begins. The names assigned to the two memory regions are interchanged, so that allocations are now made from the other region. This is called a "flip". The design of the algorithm guarantees that all live data will have been copied out of the old from-space by the time the next flip occurs.

The application program is allowed to maintain only a limited number of pointers (i.e. descriptors) to dynamically allocated objects. The descriptors under direct control of the application are called tended descriptors as indicated above. When a flip occurs, the objects directly referenced by tended descriptors are scheduled for copying into to-space, and the descriptors are modified to reflect the new locations of the objects that they refer to. The task of updating a pointer to reflect the new location of a live data object is called "tending". The garbage collector follows the rule that tended descriptors always point into to-space. Each time a value is loaded into a tended descriptor by reading from an internal field of a dynamically-allocated object, the value is tended before it is assigned to the tended descriptor.

In order to support fast response to memory read, write, and allocate instructions, it is necessary to divide the garbage collection process into a number of very small atomic actions. Certain system invariants are maintained between execution of these atomic actions. These invariants are sufficient to allow memory read and write operations to interleave with background garbage collection efforts. In order to simplify recognition of addresses referencing particular regions of memory, it is necessary to require that the total size of the module's memory be a power of two. For similar reasons, the base address of the expansion memory must have zeros in all of the low-order bits used to address locations within the module.

Because there is no limit on the size of objects supported by the garbage collector, it is essential that copying and scanning of objects be performed incrementally. Otherwise, the time required to complete a single atomic operation might exceed the desired real-time response. When an object is queued for copying, space is reserved for it in to-space and the first two words of the reserved space are initialized with the object's title and a pointer to its original location respectively. The title of the original object in from-space is overwritten with a forwarding pointer to the space reserved for eventual copying, and the descriptor tag is set for the original object's forwarding pointer.

The memory reserved for copying of objects is allocated starting from the beginning of to-space. Since objects are copied in FIFO order, all uncopied objects reside within a single contiguous range of memory addresses.

As with Baker's original algorithm, the garbage collection algorithm incorporated within the GCMM presents to the mutator the illusion that all live memory is copied instantaneously into to-space at the time of a flip. Though the garbage collector carries the main burden of performing the flip, the mutator's cooperation is required to find all live objects. The mutator keeps track of a bounded number of pointers into the GCMM--the so-called "tended descriptors".

Garbage collection is triggered by a memory allocation request that cannot be satisfied. In response to this request, the GCMM returns a special code informing the mutator that it is time to perform a flip. The mutator then passes each of its tended descriptors to the garbage collector, which queues the referenced objects for copying into to-space and returns updated values for each of the descriptors. Alternatively, the mutator could initiate a flip by passing each of its tended descriptors to the garbage collector.

The process of updating a descriptor to make sure that it does not point into from-space, including the work of queuing the referenced object for copying into to-space if necessary, is, as mentioned previously, called "tending".

Because of the alignment restrictions described above, the GCMM recognizes attempts to read untended descriptors in approximately the same time required to implement traditional memory error-correcting codes. An untended descriptor is simply any word with the descriptor tag set for which the high-order bits exactly match the base address of from-space. Whenever the mutator requests to read an untended descriptor, the requested word is tended before its value is made available to the mutator.

In Baker's original algorithm, each live object is first copied and then scanned. Scanning, in Baker's algorithm, consists of examining copied objects and tending the descriptors contained within them. In the GCMM algorithm, the descriptors within most objects are tended as they are copied. This approach approximately halves the number of memory cycles required to relocate live objects out of from-space. This approach is only possible because copying of the objects referenced by descriptors that were previously untended is deferred until a later time.

The only objects that are not scanned while copying are slice regions. Even though the pointer field of a slice object is tended while copying, it is still necessary for a subsequent scanning phase of garbage collection to visit all of the slice objects copied into to-space. Since only slice objects need to be scanned, each slice object is placed onto a linked list threaded through its title field when it is copied into to-space.

In order to guarantee sufficient space for the copying of all live memory into to-space while new objects are being allocated, it is important to pace the rate of allocation in relation to the garbage collection rate. Either the mutator or the garbage collector may take responsibility for ensuring that the allocation rate does not exceed the rate of garbage collection. The mutator assumes this responsibility in the preferred embodiment.

Each allocation of size n is accompanied by an amount of garbage collection quantified by 2 nK, where K is an experimentally-determined constant. The general technique is for the mutator to maintain two variables called GCProgress and AllocProgress. Both of these variables are initialized to zero at the start of garbage collection. GCProgress records the amount of garbage collection that has been completed. AllocProgress records the amount of allocation that has been performed since the most recent flip operation. Each allocation of n words increments AllocProgress by 2 nK where K is assigned a value, experimentally determined, such that GCProgress continually equals or exceeds AllocProgress. Under normal circumstances, an allocation of n words is only permitted if AllocProgress plus 2 nK is less than or equal to GCProgress. Otherwise, the mutator must delay its allocation request until additional garbage collection has completed, thereby increasing the value of GCProgress so that AllocProgress plus 2 nK is less than or equal to GCProgress.

At the time of a flip, both GCProgress and AllocProgress are initialized to zero. Every word relocated out of from-space causes GCProgress to be incremented by one. For record and stack objects, GCProgress is incremented twice for each word copied to account for the effort of tending the word while it is being copied.

Each slice object contains three words. GCProgress is incremented by two when the slice object is copied into to-space. The slice region pointer contained within the slice object is tended during copying of the slice object. After the slice object has been copied, it is placed on a linked list of slice objects waiting to be scanned. This linked list is called the scanning queue. The header of each slice object on the list is overwritten with a pointer to the next slice object on the list, or given a special value to indicate the end of the list. Since the architecture is assumed to be byte-addressable and each slice object is assumed to be word-aligned, the two least significant bits of this pointer are always zero. The least significant bit of this pointer is used to distinguish between slice objects that may refer to descriptor data and those known not to reference descriptor data. When a slice object residing on the scanning queue is eventually scanned, GCProgress is incremented by four. If the slice object being scanned is identified as a descriptor slice object, then the referenced slice region data is scanned and any descriptors contained therein are tended. GCProgress is incremented by one for each of the slice region words that is scanned in this step.

GCProgress is incremented by one for each word of slice region data copied into to-space, excluding the region's header. After the slice region data has been copied, the original object is overwritten with a slice region control block.

The slice region control block divides the slice region into 8-word segments called subregions, and includes one subregion control block for each of these. Each subregion control block consists of a pointer FirstMemRef to the first memory referenced by slice objects pointing into that particular subregion, and a length LastMemLen that, when added to this pointer, represents the last memory referenced by slice objects pointing into the subregion. During each pass of the garbage collector, alignment of all subregions is offset from the beginning of the corresponding slice regions by the number of bytes specified in the ProbeOffset register. The first three fields of the slice region control block are the slice region pointer, the size in words of the controlled slice region, and a pointer to the next on a linked list of all control blocks being garbage collected.

The optimal size for subregions depends on tradeoffs between the bookkeeping overhead required to maintain large numbers of small subregion control blocks, and the benefits of quickly isolating garbage within slice regions by probing for garbage at more closely spaced intervals. To allow pointers to quickly determine which subregion they refer to, the subregion size must be a power of two. Control blocks are not allocated for slice regions smaller than seven words because the slice region is not large enough to represent its own region control block. In order to guarantee that a slice region of size seven words is large enough to represent its own control block, the garbage collector requires that subregion sizes be no smaller than eight words.

GCProgress is incremented by two following initialization of the first three fields in the slice region control block. GCProgress is incremented by half the number of words contained within the corresponding subregion following initialization of each subregion control block. During postprocessing of slice subregion control blocks, GCProgress is once again incremented by half the number of slice data words contained within each of the corresponding subregions.

The final phase of garbage collection consists of initializing all of from-space memory to zero, and all from-space object-space managers to their initial states. For each word so initialized, GCProgress is incremented by 2K/(K+2).

By means of dedicated I/O ports of the GCMM, the mutator is able to obtain the values of certain state variables that represent the garbage collector's progress. The variable ToSpace points to the first word of to-space, CopyDest points to the next to-space word to which live data currently residing in from-space will be copied, NumSliceObjects counts the number of slice objects that have been placed on the copying queue, CopiedSliceObjects counts the number of slice objects that have been copied into to-space, ScannedSliceObjects counts the number of slice objects that have been removed from the scanning queue, NumSliceRegions counts the number of slice regions that have been placed on the copying queue, NumRegionsCopied counts the number of copied slice regions for which region control blocks have been initialized, TotalSliceData is the number of words of slice data contained within slice data regions that have been placed on the copying queue, TotalSliceCopied is the number of words of slice data contained within slice regions that have been copied, TotalSliceScanned is the number of words of slice data that have been scanned during slice-object scanning, TotalSliceControlled is the number of words of slice data that are currently controlled by slice region control blocks, TotalSlicePostprocessed is the number of words of slice data that have been postprocessed, and TotalZappedWords is the number of words of from-space that have been initialized to zero in preparation for the next garbage collection cycle. The current value of GCProgress is represented by the following equation: ##EQU1## Note that the expression above calculates GCProgress as twice the number of words copied into to-space minus the number of words contained within objects for which copying is not worth two units of garbage collection per word plus the appropriate units of garbage collection for each of the special garbage collection operations that has been completed. Having obtained the values of these variables from the GCMM, the mutator computes the value of GCProgress. Since GCProgress is a non-decreasing variable, it is not necessary to recompute GCProgress after each allocation request. Once GCProgress has been computed, the mutator can freely allocate objects, incrementing AllocProgress appropriately for each allocation until AllocProgress is greater than GCProgress, at which time the mutator must obtain updated values for each of the state variables that contributes to the computation of GCProgress in order to update its value. Experimental evidence collected to date suggests that the allocation rates of typical applications rarely exceed the GCMM's rate of garbage collection.

Each allocation request issued by the mutator is serviced within at most seven traditional memory cycles. Thus, transactions between the mutator and the GCMM are always very short, this circumstance thereby facilitating quick context switching between concurrent tasks. The protocol was designed to simplify context switching between tasks sharing access to the garbage-collected memory. Additionally, this protocol allows very high allocation rates as long as GCProgress is greater than AllocProgress. Since the mutator has a better understanding of the system workload and scheduling constraints, it is much more capable than the GCMM to act intelligently with respect to the pacing of garbage collection versus allocation efforts. For example, the mutator may choose to temporarily allow allocation rates to exceed garbage collection rates. Or it might dynamically adjust the constant K at the time of a flip, or even during the garbage collection effort, depending on the amount of live memory that the mutator needs to have garbage-collected and the rates at which it needs to allocate new data. Another option under mutator control is to flip earlier than would otherwise be required, in order to complete the flip during a lull in system activity. And yet another advantage of relegating this decision to the mutator is that the mutator may choose to time share with tasks that do not require dynamic allocation of memory whenever allocation rates begin to exceed garbage collection rates. This allows the mutator to perform useful work while waiting for the garbage collector to catch up.

New memory is allocated from the end of to-space while live objects are being copied from the beginning of to-space. Several dedicated registers delineate the boundaries between to-space memory in different intermediate stages of garbage collection. The CopyDest register indicates the location to which the next word copied out of from-space will be written. The CopyEnd register holds the address just beyond the end of the object currently being copied. The CopySrc register contains a pointer to the next from-space memory cell to be copied into to-space whenever CopyDest is less than CopyEnd. The Reserved register contains the pointer to the next memory available for objects to be placed on the copying queue.

All objects on the copying queue are located between CopyEnd and Reserved. The New register contains a pointer to the most recently allocated object. At the time of a flip, New is initialized to point to the end of to-space. Each allocation request is satisfied by decrementing New by the size of the allocation and returning its updated value.

As long as the amount of live data referenced by the mutator never exceeds the amount of memory that the garbage collector was configured to handle, the garbage collector guarantees to complete garbage collection prior to overflowing to-space.

The garbage collection system's principal responsibilities are, in order of decreasing priority:

1. To respond quickly to requests made by the mutator;

2. To copy live objects into to-space;

3. To scan slice objects that have already been copied into to-space;

4. After all live objects have been copied and scanned, to examine each of the slice regions copied into to-space and to collect holes of unreachable memory as garbage, this phase of garbage collection being called "postprocessing".

During garbage collection, requests to read or write memory that has not yet been copied are recognized by comparing the address of the requested operation with the current values of CopyDest, CopyEnd, and Reserved. References to memory between CopyDest and CopyEnd are redirected to the address computed by adding CopySrc to the difference between the requested memory address and CopyDest. Whenever references to memory between CopyEnd and Reserved are recognized, special circuitry in the GCMM looks up the location of the uncopied object's header. For objects on the copying queue, the word following the title points to the object waiting to be copied out of from-space. The requested memory operation is redirected to the appropriate address in from-space by adding together the address of the object to be copied and the offset of the requested memory operation's address relative to the encompassing object's header location.

Unlike records, stacks, and slice object headers, descriptors contained within slice regions are not tended during copying. This is because it is not possible to determine which of these descriptors are still live until after all live slice objects have been examined by the garbage collector. If, during garbage collection, the mutator attempts to read untended slice region descriptors, the garbage collector tends the descriptor before its value is made available to the mutator.

The ScanBalance variable, the difference between the AllocProgress and GCProgress variables, is not affected by on-demand tending of descriptors. As long as the mutator does not exceed the limits on total amounts of live data, there are sufficient ScanBalance points to tend every live descriptor in the system. Regardless of whether the mutator demands that certain descriptors be tended out of normal scanning order, the ScanBalance points reserved for tending of a descriptor are collected at the time the descriptor is eventually scanned by the garbage collector. A single ScanBalance point is charged for scanning a word, even if the word is not a descriptor in need of tending. No additional ScanBalance points are charged if scanning requires that an object be queued for copying, even though queuing an object for copying requires that the title of the queued object be copied into to-space. Since the title of an object is copied when the object is queued for copying, and since the title does not need to be tended, the two ScanBalance points reserved for relocation of an object's title are available for special type-dependent processing, as described below.

Tending of a descriptor pointing to any address within a record causes the record to be queued for copying. As each word of the record is copied, ScanBalance is decremented by two, and descriptors contained within the record are tended before their values are written to to-space. The two ScanBalance points associated with the record's title are charged when the garbage collector begins copying the object into to-space.

Tending of a descriptor pointing to any address within a stack causes the stack object to be queued for copying. Within the stack object's header, the word immediately following its title identifies the location of the stack's top element. During incremental copying of the stack object, only that portion of the stack beneath its top element is actually copied. At the moment that copying of the stack begins, ScanBalance is decremented by twice the number of words residing above the top-of-stack mark within the stack object, including the two words contained within the object's header. As each word of the stack is copied, ScanBalance is decremented by two to account for copying and scanning of the word, and descriptors contained within the stack object are tended before their values are written into to-space.

Tending of a descriptor pointing to any location within a slice object causes the slice object to be queued for copying. Copying of the slice object is incremental. The pointer field of the slice object is tended as its value is copied. Since copying takes precedence over scanning, this guarantees that the referenced slice region will have been completely copied into to-space by the time that this slice object is eventually scanned. For each word of the slice object copied into to-space, ScanBalance is decremented by one. The ScanBalance points reserved for scanning of the slice object are expended later, when the object is actually scanned. After the slice object has been completely copied, the slice object is linked onto a list of slice objects waiting to be scanned. The title of the slice object is overwritten with the link field, within which the least significant bit distinguishes between slice objects that reference descriptors and those that refer only to terminal data. Since the GCMM is byte-addressable, the least significant bit of every pointer to word-aligned memory is otherwise not needed.

Even though a slice region that contains some live data may contain segments of dead data also, the entire slice region is copied into to-space one word at a time. There are several reasons for this. First of all, the garbage collector cannot know which data within a slice region is garbage until after all live slice objects have been examined. Second, to postpone copying of slice region data until after the garbage collector knows exactly which data within the slice region is live would add a level of indirection to all fetches and stores that reference the slice region before garbage collection has been completed, thereby impairing system performance. And third, to efficiently handle memory operations that access slice regions on the copying queue, it is necessary that the offset between the requested memory address and the slice region's header location be identical in both the original object and within the space into which the slice region will eventually be copied.

For each slice region word copied, ScanBalance is decremented by one.

After completely copying a particular slice region into to-space, but before beginning to copy the next object on the copying queue, the garbage collector overwrites the original slice region with initial values for the slice region control block. The control block is doubly linked with the slice region it controls by temporarily overwriting the slice region's title with a pointer to the control block. The forwarding pointer for the original slice region now serves both as a forwarding pointer and as the reverse link between the slice region and its control block.

When a slice region is copied into to-space, ScanBalance is decremented by one for each word copied. However, the ScanBalance points traditionally set aside for scanning of the slice region are divided equally between initialization and postprocessing of the region's control block. The two ScanBalance points available for processing of the slice region's title are charged when the control block's header is initialized and the slice region's title is overwritten with a pointer to the region's control block. Following initialization of each subregion control block, ScanBalance is decremented by half the number of words within that subregion. Half a ScanBalance point remains unspent for each word of data in the slice region. These remaining points are spent during postprocessing of control blocks, as described below.

After all objects on the copying queue have been copied, the garbage collector begins (or resumes) scanning of slice objects. Remember that the single descriptor within each slice object is tended when the object is copied into to-space and since slice objects are read-only, every slice object that is being scanned points to a slice region that has been copied out of from-space. Scanning of slice objects consists of the following actions: (1) finding the header of the referenced slice region; (2) reading the slice region's header, which is a pointer to the region's control block; (3) calculating which subregion contains the first address referenced (FirstMemRef) by the slice object; (4) updating the FirstMemRef and LastMemLen fields within the appropriate subregion control block; and (5) restoring the slice object's title and removing the slice object from the linked list of objects waiting to be scanned.

Each of the steps above is performed in constant time. Upon completion of these five tasks, ScanBalance is decremented by the number of words in a slice object (normally three, but larger if, for example, all objects must be aligned on 4-word boundaries) plus the ScanBalance point reserved for scanning of the object's title.

Descriptor slice objects are distinguished from terminal slice objects by a single bit in the object's title. Besides the work described above, scanning of a descriptor slice includes the following additional step: (6) tending each of the slice region descriptors referenced by the slice object. For each of the slice region words scanned in this step ScanBalance is decremented by one. Note that overlapping descriptor slices require redundant scanning of the shared data. This is the only task of the garbage collection algorithm whose execution time is not linear in the total amount of live memory. Generally, users of the garbage collector who need guaranteed availability of live memory must account for the space consumed by each slice object and slice region independently. When accounting for descriptor slice objects, an additional fraction of the referenced slice region segment is added into the total storage needs to account for redundant scanning of the shared segment.

The very last phase of garbage collection consists of postprocessing region control blocks. The linked list of region control blocks is walked, and each slice region is examined in search for holes of unaccessed data. When sufficiently large holes of unaccessed data are found between subregions, the original slice region is split into multiple slice regions. Sufficiently large holes are holes that are large enough to allow an appropriately aligned slice region header to overwrite some of the garbage contained within the hole. After shrinking or splitting a slice region, the garbage within the original slice region is no longer contained within any object and will not be copied during subsequent garbage collection flips. Postprocessing is done incrementally by examining the subregion control blocks one at a time from left to right searching for contiguous segments of live data. For each contiguous segment of live data found, the garbage collector overwrites the memory preceding that segment with an appropriate SliceData header. After postproceasing of a region control block completes, division of the slice region into subregions is no longer meaningful.

During postprocessing of each subregion control block, ScanBalance is decremented by half the number of words within that subregion.

Since holes of garbage located at either the front or rear of a slice region are always found by the garbage collector, regardless of ProbeOffset's value, ProbeOffset is never set to zero. Therefore, the smallest control blocks control two subregions, and the minimum size of a control block is consequently seven words.

The garbage collector refrains from creating slice regions smaller than seven words. Whenever smaller segments of live slice region data are isolated, they are enclosed within a slice region that contains enough of the surrounding garbage to make the slice region's total size seven words. It would be possible for the garbage collector to support slice data regions smaller than seven words by treating them specially during certain phases of garbage collection. However, this adds considerable complexity to the garbage collection system, with very limited improvement in terms of storage utilization.

By changing the value of ProbeOffset with each flip of the garbage collector, the garbage collector guarantees that all holes of garbage within a slice region will eventually be found. However, for any particular flip of the garbage collector, the garbage collector promises only that the amount of slice region memory allocated to a particular slice object does not exceed the amount of memory actually used by that slice object by any more than eight words, the size of each subregion. Garbage collector users who need to verify availability of memory must generally use a conservative estimate when accounting for the memory dedicated to each slice.

A computer system incorporating the GCMM is shown is FIG. 1. The GCMM 2, the random-access memory (RAM) 4, and the read-only memory (ROM) 6 connect to the central processing unit (CPU) 8 via the cache 10 by means of the conventional system bus
12. The GCMM plays the role of traditional expansion memory within a standard bus-oriented system architecture.

For the purposes of describing the GCMM and how it works, it is necessary to postulate a specific computer system design. The computer system design described below for the preferred embodiment is only one of a number of computer system designs in which the GCMM could be incorporated.

In the preferred embodiment of a computer system incorporating the GCMM, all memory is byte-addressable. The memory system uses 32-bit words, and physical memory is addressed with 32-bit words. All pointers are word-aligned. Memory words are big-endian.

Insofar as the garbage collector is concerned, all pointers referring to a particular object directly address a memory location contained within the referenced object. With certain CPU architectures, code optimizers might be tempted to rearrange code so that Programmer-defined variables pointing directly to particular objects are replaced with a pointer base and an integer offset, where the base pointer actually points outside the boundaries of the referenced object. If such techniques are used, they must be hidden from the garbage collector. An off-target base pointer must not be written to garbage-collected memory as a descriptor. Further, the mutator must perform the arithmetic necessary to convert the base pointer to a valid descriptor before tending it, and later, to convert the tended descriptor back to an appropriate off-target base pointer.

Pointers need not address the first word in the referenced object.

Write-through caching is used to ensure that the memory system is always aware of the most recent values represented by particular memory locations.

The CPU is capable of directly manipulating its cache. In particular, the CPU is able to invalidate ranges of addresses that may reside in its cache.

The cache may use a write buffer to improve the efficiency of write-through caching. However, it is important that the write buffer be flushed (written) in FIFO order to memory before reading from or writing to an uncached memory-mapped I/O port.

The cache line size must be no larger than one word. The architecture is assumed to be byte-addressable, with cache lines aligned on addresses evenly divisible by four. It is important that the cache not prefetch words that have not been explicitly referenced by the CPU.

The configuration of the GCMM 2 and the manner of interfacing with the conventional system bus 12 is shown in FIG. 2. The bus interface unit (BIU) 16 provides the interface between the system bus 12 and the internal bus 18 used for communication between components within the GCMM.

There are two identical random-access memories--RAM.sub.1 20 and RAM.sub.2 22. Each RAM module consists of 16 MBytes of random access memory. At any particular time, one of the two independent RAM modules represents to-space and the other, from-space. Each 32-bit word of RAM is accompanied by a one-bit tag that distinguishes pointers from non-pointers, a one-bit write-protect tag that prevents the mutator from overwriting the garbage collector's internal data structures, and six bits of error-correcting codes.

Associated with RAM.sub.1 20 and RAM.sub.2 22 are object space managers OSM.sub.1 24 and OSM.sub.2 26 respectively. Each OSM module manages its associated RAM module by maintaining a data base of locations at which each object residing in the RAM module begins. Given a pointer to any location within a memory module, the associated OSM is capable of reporting the address of the start of the object that contains that address in approximately the same time required to perform a traditional memory read or write. The OSM's primitive operations are reset which initializes the OSM, createHeader which installs an object into the OSM's data base, and findHeader which reports the beginning address of the object containing a particular address.

The arbiter 28 oversees access to the internal bus and performs a number of important garbage collection activities using circuitry dedicated to providing rapid context switching between background garbage collection activities and mutator demands.

The main responsibility of the microprocessor 30 is to supervise garbage collection. The local memory 32 provides the memory resources for the microprocessor 30 to perform its supervisory tasks. The microprocessor 30 oversees garbage collection by dividing the job into a large number of small straightforward activities and individually assigning each of these activities to the arbiter 28. The arbiter 28 works on commands from the microprocessor 30 as a background activity while giving highest priority to servicing requests from the BIU.

The organization illustrated in FIG. 2 permits multiple RAM and OSM components to work in parallel.

RAM.sub.1 20 and RAM.sub.2 22 each implements a 3-slot write buffer. OSM.sub.1 24 and OSM.sub.2 26 can each buffer one createHeader request. Thus, as long as sufficient time has passed since a preceding OSM request has been issued, a createHeader request completes instantly. Furthermore, subsequent findHeader requests need not wait for the buffered createHeader request to complete.

In order for the GCMM to collect garbage with minimal supervision from the CPU, it must know for each word of dynamically-allocated memory whether it contains descriptor or terminal data, and it must know which contiguous regions of memory represent indivisible objects. Within dynamically-allocated objects, all descriptors are tagged to distinguish them from terminal data. Object boundaries are identified when objects are allocated. The garbage collector retains size and type information about each allocated object by prepending a one-word header to each allocated object. This header is transparent to the mutator in that it precedes the address returned by the garbage collector in response to an allocation request.

Object headers take the form shown in FIG. 3. The header is marked as read-only to the mutator in order to protect the memory manager's integrity.

Normally, the descriptor tag associated with each object header is zero. However, when the garbage collector decides to copy an object to a particular to-space location, it copies the object's header into the first word of memory reserved for the copy and overwrites the original object's header with a pointer to the object's new location. The garbage collector also sets the original header's descriptor tag to indicate that the object's header really contains a forwarding pointer. At the same time, it overwrites the second word of memory reserved for the object's copy with a pointer to the original object residing in from-space.

The mutator communicates with the garbage collector by reading and writing special I/O addresses which are given symbolic names in the C++ programming language code fragment shown in FIG. 4. Those who are unfamiliar with the C++ syntax are referred to Paul M. Chirlian, Programming in C++, Merrill Publishing Co., Columbus, Ohio, 1990 and Paul J. Lucas, The C++ Programmer's Handbook, Prentice Hall, Englewood Cliffs, N.J., 1992. For convenience in exposition, it is postulated that these port addresses do not conflict with other I/O ports or memory addresses within the system.

The GC.sub.-- Status and GC.sub.-- Result ports provide responses to service requests issued by way of the input ports. The other output ports allow the mutator to examine the internal state of the garbage collector. The GC.sub.-- ToSpace port provides the current address of to-space, and the GC.sub.-- FromSpace port supplies the current address of from-space. The GC.sub.-- SemiSpaceSize port returns the number of bytes in each memory semi-space. The GC.sub.-- Relocated, GC.sub.-- CopyDest, GC.sub.-- Reserved, and GC.sub.-- New ports return the current values of the arbiter's Relocated, CopyDest, Reserved, and New registers respectively. The GC.sub.-- NumSliceObjects port reports the total number o