United States Patent Application20030135495
Kind CodeA1
Vagnozzi, Paul P.July 17, 2003

Database indexing method and apparatus
Abstract
A database indexing method and apparatus which includes an index structure for indexing a plurality of objects that are logically divided into fine slices of 8,000 records each, and the fine slices are grouped into coarse slices of 4,000 fine slices each. The indexes include fine and coarse keys, each of which corresponds to a particular data value and a particular fine or coarse slice. Bit vectors are used to determine which of the fine slices or objects identified by the fine slices have particular attributes, and these bit vectors can be stored in compressed format. The index structure can be used to index data where no data records exist. A word indexing technique for crossing page boundaries within a document is also disclosed, as well as a technique for enumerating and displaying virtual folders and their contents.

Inventors:Vagnozzi; Paul P. (Farmington Hills, MI)
Correspondence Name and Address:REISING, ETHINGTON, BARNES, KISSELLE, ET AL P.O. BOX 4390
JAMES D. STEVENS
TROY
MI
48099
US
Series Code:176566
Filed:June 21, 2002
U.S. Current Class:707/3
U.S. Class at Publication:707/3
Intern'l Class:G06F 007/00

Claims


What is claimed is:
1. A computer-readable memory for storing one or more indexes used to associate data with a plurality of objects, wherein each of the objects has a plurality of attributes associated therewith and one or more data values for each of the attributes, said computer-readable memory comprising: a non-volatile data storage device; an index stored on said data storage device, said index being associated with a first attribute of the objects, wherein said index comprises a number of keys that include one or more coarse keys and a number of fine keys, wherein each of said fine keys is associated with a group of the objects and with one of the first attribute's data values, and wherein said one or more coarse keys are each associated with a particular one of the first attribute's data values and with a set of said fine keys that are also each associated with that particular data value.

2. A computer-readable memory as defined in claim 1, wherein some of said fine keys are associated with a first data value and others of said fine keys are associated with a second data value, wherein both said first and second data values are data values associated with the first attribute.

3. A computer-readable memory as defined in claim 2, wherein said index includes coarse keys associated with each of said first and second data values.

4. A computer-readable memory as defined in claim 1, wherein said keys include a first portion that identifies whether it is a group key or a set key, a second portion that identifies the group or set to which it corresponds, and a third portion that indicates the data value to which it corresponds.

5. A computer-readable memory as defined in claim 4, wherein said third portion comprises an attributename::attributevalue pair wherein the attributename identifies the attribute with which the key is associated and the attributevalue indicates the data value with which the key is associated.

6. A computer-readable memory as defined in claim 4, wherein said keys are stored in order based upon the contents of said first, second, and third portions.

7. A computer-readable memory as defined in claim 4, wherein said first portion comprises the left most portion of each key, said second portion comprises the middle portion of each key, and said third portion comprises the right-most portion of each key.

8. A computer-readable memory as defined in claim 7, wherein said first portion comprises a single bit.

9. A computer-readable memory as defined in claim 1, wherein said index includes a number of bit vectors, each comprising a sequence of bits, and wherein each of said bit vectors is associated with a particular one of the keys and with the data value associated with that particular key.

10. A computer-readable memory as defined in claim 9, wherein said index includes a link for each of said keys, and wherein at least some of said links each indicate the location of the bit vector corresponding to the key associated with the link.

11. A computer-readable memory as defined in claim 9, wherein at least some of said bit vectors comprise fine bit vectors, with each bit within any selected one of the fine bit vectors corresponding to one of the objects and indicating whether its corresponding object has the data value associated with the bit vector as it first attribute.

12. A computer-readable memory as defined in claim 11, wherein each bit position with a fine bit vector corresponds to a different one of the objects.

13. A computer-readable memory as defined in claim 11, wherein each of said fine keys is associated with a fine link and wherein at least some of said fine links each provide a pointer to a corresponding one of said fine bit vectors.

14. A computer-readable memory as defined in claim 13, wherein at least one other of said fine links identifies a single object within the group of objects associated with that fine link.

15. A computer-readable memory as defined in claim 14, wherein each of said fine links includes at least one bit that indicates whether the link includes a pointer to a fine bit vector or an identifier of the single object.

16. A computer-readable memory as defined in claim 9, wherein at least some of said bit vectors comprise coarse bit vectors each associated with a different one of the coarse keys, with each bit within any selected one of the coarse bit vectors corresponding to a different group of objects and indicating whether any of the objects with the corresponding group has the data value associated with the coarse bit vector as it first attribute.

17. A computer-readable memory as defined in claim 16, wherein each of said coarse keys is associated with a coarse link and wherein at least some of said coarse links are each used to link one of the coarse bit vectors with its associated key.

18. A computer-readable memory as defined in claim 17, wherein at least one other of said coarse links identifies a single group of objects within the set of groups associated with that coarse link.

19. A computer-readable memory as defined in claim 18, wherein each of said coarse links includes at least one bit that indicates whether the link includes a pointer to a coarse bit vector or an identifier of the single group.

20. A computer-readable memory as defined in claim 18, wherein each of the ones of said coarse links that identify a single group include at least one bit that identifies whether or not all of the records within that single group contain the data value associated with the coarse link.

21. A computer-readable memory as defined in claim 1, wherein said index is stored as a B-tree.

22. A computer-readable memory as defined in claim 21, wherein said B-tree includes a root node, a plurality of intermediate nodes, and a plurality of leaves and wherein some of said keys are stored at said leaves and others of said keys are stored at said root and intermediate nodes.

23. A computer-readable memory as defined in claim 1, wherein said non-volatile data storage device comprises a plurality of fixed magnetic disk drives, wherein said database is stored as a single file that spans at least two of said fixed magnetic disk drives.

24. A computer-readable memory as defined in claim 1, further comprising a second index stored on said non-volatile data storage device, said second index being associated with a second attribute of the objects, wherein said second index comprises a number of keys that include one or more coarse keys and a number of fine keys, wherein each of said fine keys of said second index is associated with a group of the objects and with one of the second attribute's data values, and wherein said one or more coarse keys of said second index are each associated with a particular one of the second attribute's data values and with a set of said fine keys from said second index that are also each associated with that particular data value.

25. A computer-readable memory as defined in claim 24, wherein said indexes are each stored in a separate file on said non-volatile data storage device.

26. A computer-readable memory as defined in claim 24, wherein said indexes are stored as a single file and together comprise a single index with each key of the single index including an indexname::attributename::- attributevalue triplet, wherein the indexname identifies one of the two indexes, the attributename identifies the attribute with which the key is associated, and the attributevalue indicates the data value with which the key is associated.

27. A computer-readable memory for storing one or more indexes used to associate data with a plurality of objects, wherein each of the objects has a plurality of attributes associated therewith and one or more data values for each of the attributes, said computer-readable memory comprising: a non-volatile data storage device; an index stored on said data storage device and being associated with a first one of the attributes, said index including a plurality of compressed bit vectors used to identify which objects have a particular data value for the first attribute, wherein at least some of said compressed bit vectors utilize run length encoding and have a variable run length field size.

28. A computer-readable memory for storing one or more word indexes used to index words contained in documents, said computer-readable memory comprising: a non-volatile data storage device; a word index stored on said data storage device, said word index being associated with a text word contained in at least one of the documents, wherein said word index includes a number of bit vectors that include a sequence of bits wherein each bit corresponds to a portion of one of the documents, with some of the bits corresponding to an individual page of the document and others corresponding to portions of two adjacent pages of the document.

29. A computer-readable memory for storing indexes used to associate a time stamp with a plurality of objects, said computer-readable memory comprising: a non-volatile data storage device; a plurality of indexes stored on said data storage device, each having time stamp information related to an event concerning each of the associated objects, wherein each index includes a number of keys with the keys of each index being associated with a different length time interval than the keys of the other index(es), and wherein the time stamp for each of the objects is determinable from a combination of data contained in each of the indexes.

30. A computer-readable memory for storing one or more indexes used to represent hierarchical relationships between objects, wherein each of the objects has a plurality of attributes associated therewith and one or more data values for each of the attributes, said computer-readable memory comprising: a non-volatile data storage device; a plurality of indexes stored on said data storage device and each being associated with a different one of the attributes and each index associating data values of its corresponding attribute with objects that have those data values of the corresponding attribute; and a computer program operable to access the indexes and to provide a graphical user interface displaying a virtual folder corresponding to a particular one of the data values and displaying a set of object identifiers contained within that virtual folder, wherein the object identifiers represent those objects that are associated with the particular data value.

Description



CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims the priority of U.S. Provisional Application No. 60/299,952, filed Jun. 21, 2001 and entitled Database Indexing Method and Apparatus. The entire content of that provisional application is hereby incorporated by reference.

TECHNICAL FIELD

[0002] This invention relates in general to techniques for storing and retrieving data on digital computers, and in particular, to database indexing techniques used for document management.

BACKGROUND OF THE INVENTION

[0003] As the processing speed and storage capacity of digital computers continues to increase, so does their suitability for management of very large databases. To be useful, a database typically needs to be searchable so that a user can perform keyword searching of the database and retrieve information associated with the keywords. For large databases, linear searching of the data for keywords is typically too time consuming to be useful. Consequently, large databases are often indexed to provide fast query processing and retrieval of data.

[0004] As is known, in most databases, data is organized logically and often physically into individual fields within a record, with each record representing a collection of data about an object such as a patient, a vehicle, or a document, and each field within the record containing a different item of data about the object, such as the patient's last name, first name, age, gender, etc. To provide flexibility, the database management program must not only permit single keyword searches of the data fields, but must support more complex queries, including boolean expressions (e.g., AND, OR, NOT) of keywords.

[0005] Often, a database will include both high and low cardinality data; that is, the database will include types of data that can have many different values as well as types of data that will have very few different values. Examples of low cardinality data include a person's gender (male or female), political party affiliation, or the make of a vehicle. Examples of high cardinality data include such things as last names, birth dates, vehicle identification numbers (VIN), and social security numbers (SSN), the last two of which will likely include a unique value in every record in the database. For optimum flexibility, a database index must permit efficient searching for both high and low cardinality data. However, typical storage methods are designed and optimized for retrieval of high cardinality data only, using single or compound key indexes.

[0006] One way of indexing a database is through the use of keys that provide a fast way to locate specific items of data within the database. These keys are used by a database management program to locate and retrieve records from the database that contain the data associated with the keys. Bitmaps or bit vectors are sometimes used along with keys to identify which records contain a particular item of data within a particular data field. These bit vectors comprise a string of bits with each bit representing a single record in the database. A particular bit vector will represent a particular item of data that is found in a particular field in at least one of the records of the database. The presence of a set bit (i.e., a bit set to a logical one) in the bit vector indicates that the record associated with that bit contains the particular item of data. Thus, for example, where a database contains patient records that include a "first name" field, a bit vector of "0010100000" that is associated with the name "Elizabeth" for the "first name" field will indicate that the third and fifth records are for patients named Elizabeth.

[0007] Sometimes, in addition to indexing a database, the database itself is stored along with the index rather than maintaining the database as a separately stored structure. For example, in U.S. Pat. No. 4,606,002 to A. Waisman et al., a B-tree is used to store the database data along with an inverted B-tree index that uses keys and associated sparse array bit maps to identify which records contain particular items of data. A record index is used to identify different tables of records (e.g., a table of patient records versus a table of doctor records) and to distinguish between the database data and the indexes. The data is stored using odd numbered record identifiers and the associated indexes are stored using the next even numbered record identifier. Each key and associated sparse array bit map are associated with a range of records and are stored together in the B-tree. The key itself comprises a record identifier, field identifier, data value, and range value, in that order. The record identifier indicates to which table of data the key relates. The field identifier is a numeric identifier of the field within the records to which the key relates. The data value is the actual item of data contained in at least one of the records to which the key relates. And the range value identifies the range of records to which the key relates. The sparse array bit map includes three levels of bit vectors. The bottom level comprises a number of one-byte bit vectors, with the individual bits of each bit vector indicating which records within the associated range of records contain the associated data value in the field specified by the field identifier. The middle level contains one-byte bit vectors in which each bit represents one of the one-byte bit vectors of the bottom level. A bit in the middle level bit vectors is set (to a logical one) if any of the bits in the bottom level bit vector that it represents is set; otherwise it is zeroed. This same structure is carried out to the upper level which includes a single byte representing all of the middle level bit vectors. Where a bit vector would not include any set bits, the bit vector is not allocated and its absence is indicated by its associated bit from the next higher level being zeroed.

[0008] As noted by Waisman et al., the use of bit vectors simplifies the processing of boolean expressions since two bit vectors can be combined in accordance with the specified boolean operator and the resulting bit vector represents the records that satisfy the boolean expression. One problem in combining upper level bit vectors in which the bits do not represent individual records is in the processing of NOT expressions. Since a set bit indicates only that at least one (but not necessarily all) records in the associated range of records includes the data value, the NOT operation cannot be accomplished simply by inverting the bit. Rather, as discussed in Waisman et al., the lower level bits that represent individual records must be inverted and used.

[0009] Since, in Waisman et al., the data is stored at the bottom level of a B-tree, consecutively numbered records need not be physically stored together. This permits the use of variable length fields and permits fields to be added to the database without having to reorganize the database or make changes to the database management program that is used to access the data. However, this type of database structure can complicate retrieval of records, since the fields of any one record may not all be physically stored together and since separate I/O accesses may often be required for retrieval of even a small group of consecutive records. Given the relative slowness of I/O access, the data storage structure utilized by Waisman et al. can result in undesirably slow retrieval of records.

SUMMARY OF THE INVENTION

[0010] In accordance with the invention, there is provided a database indexing method and apparatus in which an index structure can be used to associate data with a plurality of objects, wherein each of the objects has a plurality of attributes associated therewith and one or more data values for each of the attributes. The index is associated with a first attribute of the objects, and it comprises a number of keys that include one or more coarse keys and a number of fine keys, wherein each of the fine keys is associated with a group of the objects and with one of the first attribute's data values. The one or more coarse keys are each associated with a particular one of the first attribute's data values and with a set of the fine keys that are also each associated with that particular data value. Bit vectors can be used for each data value to indicate whether or not each object has a particular data value for its first attribute. This arrangement permits the storage and retrieval of information concerning the objects without the need for records to store the actual data.

[0011] In accordance with another aspect of the present invention, there is provided an indexing method and apparatus in which in which the index is associated with an attribute of a group of objects and utilizes a plurality of compressed bit vectors to identify which objects have a particular data value for the attribute. At least some of the compressed bit vectors utilize run length encoding and have a variable run length field size.

[0012] In accordance with another aspect of the present invention, there is provided an indexing method and apparatus for use in indexing words contained in documents. The word index is associated with a text word contained in at least one of the documents, and includes a number of bit vectors that include a sequence of bits wherein each bit corresponds to a portion of one of the documents. Some of the bits correspond to an individual page of the document and others correspond to portions of two adjacent pages of the document.

[0013] In accordance with another aspect of the present invention, there is provided an indexing method and apparatus used to associate a time stamp with a plurality of objects. For this, a plurality of indexes are used, each having time stamp information related to an event concerning each of the associated objects. Each index includes a number of keys with the keys of each index being associated with a different length time interval than the keys of the other index(es). The time stamp for each of the objects is determinable from a combination of data contained in each of the indexes.

[0014] In accordance with yet another aspect of the present invention, there is provided an indexing method and apparatus that stores one or more indexes used to represent hierarchical relationships between objects such as files stored on a computer system. Each of the objects has a plurality of attributes associated therewith and one or more data values for each of the attributes. A plurality of indexes are used with each being associated with a different one of the attributes and each index associating data values of its corresponding attribute with objects that have those data values of the corresponding attribute. A computer program is used to access the indexes and to provide a graphical user interface displaying a virtual folder corresponding to a particular one of the data values and displaying a set of object identifiers contained within that virtual folder. The object identifiers then represent those objects that are associated with the particular data value.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] A preferred exemplary embodiment of the present invention will hereinafter be described in conjunction with the appended drawings, wherein like designations denote like elements, and:

[0016] FIG. 1 depicts the structure of a sample database used for explaining the index structure utilized by the present invention;

[0017] FIG. 2 is an overview of query processing using a key-based index;

[0018] FIG. 3 depicts the logical separation of data records into coarse and fine slices for use in generating database indexes;

[0019] FIG. 4 is a diagrammatic representation of sample vehicle color data for each of a number of records from the database of FIG. 1;

[0020] FIG. 5 depicts the general form of the structure of an index utilized by the present invention;

[0021] FIG. 6A depicts an index structure for the sample data of FIG. 4;

[0022] FIG. 6B depicts an improved index structure for the sample data of FIG. 4 using bit vector compression;

[0023] FIG. 7 shows the layout of keys used in the indexes of FIGS. 5, 6A and 6B;

[0024] FIG. 8 shows the layout of the fine links used in the indexes of FIGS. 5, 6A, and 6B;

[0025] FIG. 9 shows the layout of the coarse links used in the indexes of FIGS. 5, 6A, and 6B;

[0026] FIG. 10 shows the layout of the coarse bit vector used in the indexes of FIGS. 5, 6A, and 6B;

[0027] FIG. 11A depicts the B-tree structure which is used to store the keys and links of the index of FIG. 6A;

[0028] FIG. 11B depicts the B-tree structure which is used to store the keys and links of the index of FIG. 6B;

[0029] FIG. 12 is a block diagram of a computer system for use in implementing the present invention;

[0030] FIG. 13 is a flow chart depicting a process for counting the number of records that satisfy a given query;

[0031] FIG. 14 is a flow chart depicting a process for retrieving the records that satisfy a given query;

[0032] FIGS. 15-23 depict various tables used in describing an indexing technique that uses bit vectors to manage arbitrary sets and subsets of objects;

[0033] FIGS. 24-56 are used to depict various aspects of a page indexing technique that can be used for document management; and

[0034] FIGS. 57-71 depict sample data sets, tables, and flow charts used to demonstrate and illustrate the use of the indexing techniques disclosed herein for enumeration of virtual folders and their contents.

DESCRIPTION OF THE PREFERRED EMBODIMENT

[0035] Data Storage within the Database

[0036] Referring to FIG. 1, there is shown a database 20 comprising a table of records 22. Each of the records has a fixed length and is assigned a unique record number, starting with zero. This allows the records to be stored sequentially in a single file with the location within the file of any particular record simply being determined by multiplying the record length (in bytes) by that record's assigned record number and using the resulting value as an offset from the beginning of the file. This arrangement provides very simple and fast allocation, de-allocation, and re-use of data record space within the file. Also, by storing all of the records within a single file, entire databases can be easily created or deleted. Creation simply requires creating a new, empty file. Deletion simply requires deleting an existing file. All user access to the database is by way of a database management program which allows the user to add, change, and delete data from the database. As will be discussed below, the database management program supports boolean and other query processing using key-based indexes that provide fast access to the data within the database.

[0037] As shown in FIG. 1, the database 20 consists of vehicle information with each record 22 having an identical format that includes a number of program data fields 24 and a number of user data fields 26. The program data fields include validity, self-id, next-free, and checksum fields. These fields are not accessible to the user, but are used by the database management program as a part of managing the database, as will be described below. The user data fields 26 include the actual data fields used to store the database data. In the illustrated embodiment, these fields include vehicle make, model, year, VIN, engine, transmission, color, and a number of option fields for storing information about various options a vehicle may have; for example, a sun-roof, aluminum wheels, side-impact airbags, etc. As will be appreciated, the user data fields can be specified by the user as a part of initially setting up the database, with the user specifying the size and data type (e.g., string, date, integer) of each user data field to be included in the database. All fields, both program data fields and user data fields, are fixed length fields so that the records all have a fixed length and can be easily accessed using a calculated offset, as described above.

[0038] The validity data field is used to mark the record as either in-use (valid) or deleted (invalid). This is used where a record has used and then deleted by the user, in which case the validity field is used to indicate that the record does not contain valid data and can be re-used. The validity field can be a single bit, although a byte pattern can preferably be used to prevent loss of data due to a corruption of a single bit. The validity field can be used during recovery operations to determine the validity of data contained in the record's fields. This field can be indexed along with the user data fields, as will be described below. The self-id data field contains a unique identification of the record. The self-id comprises the file identifier (e.g., filename of the database) and the record number of the record. This field is used by the database management program to verify that the record is in fact what it is believed to be. The next-free field points to the next free (i.e., deleted) record using the offset of the record to which it points. This field is used to form a stack of deleted records which can be re-used when new data records are added to the database. Finally, the checksum field contains a checksum value computed on the entire contents of the record, except for the checksum item itself. The checksum provides validity checking to insure database consistency and correctness.

[0039] Indexing of the Database

[0040] To permit keyword searching of user data fields within database 20, all searchable user data fields 26 are indexed with keys that are used to identify which records contain a particular item of data (i.e., data value) within a particular field. Typical queries might be, for example:

[0041] VIN=XYZ123

[0042] MAKE=Chevrolet

[0043] MODEL=Corvette AND YEAR=1975

[0044] ENGINE=V6A OR ENGINE=V8G

[0045] MODEL=Scout AND (Trans=T3 OR Trans=T4)

[0046] AND NOT (COLOR=Grey OR YEAR<1969)

[0047] FIG. 2 provides an overview of how query processing is accomplished using these indexes. For each indexed data field 24 and 26, a pointer 28
is provided which points to an index 30 for that particular data field. The pointers are stored in a table 32 that is separate from the actual database itself. Using the index 30, the database management program obtains a list 34 (represented by one or more bit vectors) of records 22
that contain the keyword used in the query. For boolean and range searches (i.e., for multiple keyword queries), the lists are processed by a query processor module 36 that compares the lists 34 to each other in accordance with the appropriate boolean logic, resulting in a list 38 of records that satisfy the query. The records are then retrieved and, if desired, processed by an optional sort 40, resulting in a final, sorted query result record list 42.

[0048] The indexes 30 are actually collections of keys stored in a B-tree. In creating the indexes, separate keys are generated not only for each user data field, but also for each data value that is stored within that data field in at least one of the records. Associated with each key is a link that is used to determine which records contain the data value associated with the key. To optimize the retrieval of records based upon query processing, a hierarchical structure of keys and bit vectors are used, with each key and bit vector representing no more than a certain number of records in the database. Thus, when creating the index, multiple keys and bit vectors are generated for each data value of each data field and the number of keys and bit vectors generated will depend upon the total number of records in the database and the distribution of data among those records. Two types of keys and bit vectors are used: coarse and fine, with the fine keys and bit vectors each representing a group of consecutive records and the coarse keys and bit vectors each representing a set of consecutive fine bit vectors.

[0049] The indexing of database 20 using these keys and bit vectors will now be described in detail in connection with FIGS. 3-11. FIG. 3 depicts the logical separation of records into groups of what will be referred to as fine slices, with the fine slices being logically organized into sets of what will be referred to as coarse slices. As shown in the top portion of FIG. 3, each fine slice comprises 8,000 consecutive records and each coarse slice comprises 4,000 consecutive fine slices. Accordingly, a single coarse slice represents 32 million consecutive records in the database. Each fine slice has a unique, absolute fine slice number which, for a given record k and fine slice length fsl, is equal to:

[0050] k DIV fsl,

[0051] where DIV indicates integer division which returns the integer quotient. Thus, for (absolute) record number 40,420,973 and a fine slice length of 8,000, the absolute fine slice number will be equal to 5,052
(40,420,973 DIV 8,000). Similarly, each coarse slice has a unique, absolute coarse slice number that, for a given record k, a given coarse slice length csl, and a given fine slice length fsl, is equal to:

[0052] k DIV (csl.times.fsl).

[0053] Thus, for record number 40,420,973 with a coarse slice length of 4,000 and a fine slice length of 8,000, the absolute coarse slice number will be 1 (40,420,973 DIV 32,000,000).

[0054] Within any particular fine slice, each record 22 can be identified by a relative record number which indicates the position of the record within that particular fine slice. For a given record k, the relative record number is equal to:

[0055] k MOD fsl,

[0056] where MOD indicates modulus division which returns the integer remainder. Thus, for a fine slice length of 8,000, (absolute) record number 40,420,973 has a relative record number of 4,973 (40,420,973 MOD 8,000). Similarly, within any particular coarse slice, each fine slice can be identified by a relative fine slice number which indicates the position of the fine slice within that particular coarse slice. For a given record k, the relative fine slice number is equal to:

[0057] (k MOD (csl.times.fsl)) DIV fsl,

[0058] Thus, for the slice lengths given above, the relative fine slice number for record number 40,420,973 would be 1,052 ((40,420,973 MOD 32,000,000) DIV 8,000). As will be discussed further below, the relative record numbers and relative fine slice numbers are used in connection with the links associated with the fine and coarse keys, respectively.

[0059] For purposes of illustration, FIG. 4 provides sample data from database 20 of FIG. 1 showing different data values stored in the "vehicle color" field 26 of a number of records 22 within the database. To aid in understanding the index structure, FIG. 4 also includes columns listing the absolute coarse and fine slice numbers for the records 22, the relative fine slice numbers of the records within a particular coarse slice, and the relative record numbers of the records within a particular fine slice. It will be appreciated that the table of FIG. 4 is simply a logical view of the data and associated slice numbers and does not represent any actual data structure used by the database management program. Also, while the sample data provided in FIG. 4 will be used in connection with the following description and attached drawings, it will be appreciated that the sparseness of the vehicle color data is only provided for the purpose of simplifying the illustration of the database and that, for low cardinality data such as vehicle color, it is probable that most if not all fine slices will contain a large number of records having a particular data value, such as blue.

[0060] For each data value of an indexed data field, there will be one key generated for each fine slice and coarse slice that contains at least one record having the data value within the data field. For example, assuming the database contains 160.5 million records, there will be a minimum of 1
coarse key and 1 fine key and a maximum of 6 coarse keys and 20,063 fine keys generated for each item of data, with the actual number of coarse and fine keys depending upon the number and distribution of the data value among the records in the database. For instance, in the sample data of FIG. 4, only records 32,128,000, 32,128,001, and 60,800,000 contain data value ="white" in the vehicle color field. Thus, for this data value there will be a total of three keys: one coarse key (since all three records are within the same coarse slice) and two fine keys--one for fine slice 4016, which contains both records 32,128,000 and 32,128,001, and one for fine slice 7,600, which contains record number 60,800,000. These three keys are shown in the sample index of FIG. 6, which will be discussed below.

[0061] Referring now to FIG. 5, there is shown the general form of an index 30 for one of the data fields. As mentioned above, the index is comprised of keys 44 and associated links 46, with the key indicating the particular slice and data value with which it is associated and the link being used in determining which records contained in the slice include the data value in the associated user data field. For each data value contained in the data field 26 in at least one record 22 of the database, there will be at least one coarse key 44 and one fine key 44. Very high cardinality data, such as VIN number, may only have a single coarse and fine key, whereas low cardinality data, such as gender, is likely to be found in every fine slice in the database and can therefore require a key for every fine and coarse slice contained in the database. Thus, for any particular data field having data of cardinality l with a number m of coarse slices and a number n of fine slices, the index will logically take the form of FIG. 5, with there potentially being up to (n+m).times.l separate keys and links.

[0062] As a specific example, FIG. 6A depicts the actual contents of the vehicle color index 30 for the sample data from FIG. 4. FIG. 6B shows this same index 30 as it would be implemented using bit vector compression which will be described further below. The sample index includes a plurality of keys 44 and a link 46 associated with each of the keys. As mentioned above, for each data value (e.g., black, blue, gold, green, etc.) found anywhere in the database in the "color" data field, there is provided a coarse key for each coarse slice and a fine key for each fine slice containing at least one record having that data value within the "color" field. Thus, as explained above, for the color white, there would be one coarse key and two fine keys for the sample database. Although the index can be stored in various formats such as in the table format shown in FIGS. 5 and 6, it is preferably stored in a B-tree as will be discussed in connection with FIG. 11. In whatever form the index is stored, the keys are maintained in order by ascending key value, as will now be described in connection with FIG. 7.

[0063] FIG. 7 depicts the format of the keys 44, whether coarse or fine. Each key includes three portions 44a-c that are concatenated together into a single item for ordering of the keys within the index. The first portion 44a is a single bit which indicates the type of key, with a zero indicating that it is a coarse key and a one indicating that it is a fine key. The second portion 44b indicates the absolute slice number for the key, with the first slice being zero. Thus, for a fine key and a fine slice length of 8,000, fine slice 2 would correspond to records 16,000
through 23,999. For a coarse key, coarse slice 2 would correspond to fine slices 8,000 through 11,9999 (and, therefore, records 64,000,000 through 95,999,999). The third and final portion 44c of a key is the key's data value, which is simply the data value (e.g., black, blue, gold, green, etc.) to which that key corresponds.

[0064] Since, for both the fine and coarse slices there is only one key per data value, no two keys will be the same and, consequently, the three portions of the keys that are concatenated together provide a unique key value that is used to maintain the ordering of the keys within the B-tree index. The keys are stored in order of ascending key value. Thus, as shown in FIGS. 6A and 6B, the coarse keys (which begin with a cleared bit) will all be listed in the index before any of the fine keys (which begin with a set bit). Within the group of coarse keys, the keys will then be listed in order of slice number. For two or more coarse keys having the same slice number, the keys will be listed in order of the keys' data values (e.g., black, blue, gold, green, etc.), which may be alphabetically ordered for characters and strings or numerically ordered for integer and decimal numbers. Similarly, the fine keys will be ordered by slice number and, among fine keys having the same slice number, by key data value.

[0065] As mentioned above, the link 46 associated with each fine key is used to indicate which records within the associated fine slice contain the data value associated with the fine key. Referring now to FIG. 8, the layout for the fine link 46 is shown. The fine link can take any of three forms--a pointer to a bit vector, a relative record number (RRN), or a compressed bit vector (CBV). The first portion of the link comprises a link type which is a single bit indicating which type of link it is. A zero (i.e., cleared bit) indicates that the link is a pointer to a bit vector and a one (i.e., set bit) indicates that the link stores data identifying the records having the associated data value. This data is either a relative record number or a compressed bit vector. The first type of link (pointer to a bit vector) is used whenever the fine slice includes at least two records containing the data value. The pointer can either be an offset in the case of the bit vector being stored in the same file as the index, or can be a filename of another file along with an offset, if necessary. In either event the bit vector will include one bit for each record within the fine slice which, in the illustrated embodiment, would be 8,000 bits, with the first bit indicating whether or not the first record in the fine slice contains the data value, the second bit indicating whether or not the second record contains the data value, and so on for each of the remaining records in the fine slice. The second type of link (relative record number) is used whenever there is only one record within the fine slice that contains the associated data value. In that case, the fine link provides the relative record number of that one record. The third type of link (compressed bit vector) is used whenever the fine slice includes at least two records containing the data value, and the bit vector can be compressed to fit within the compressed bit vector portion of the link. When the bit vector can be compressed this provides significant savings of space and improvement in efficiency. As will be described further below, 61 bits are utilized for compression of the bit vector, 5 of which are used for specifying the compression type and the remaining 56 for storing the compressed bit vector. This provides a very large savings over the 8000 bit uncompressed fine bit vector. The second and third types of links are distinguished using the second bit of the link with a zero indicating that the link contains a relative record number and a one indicating that it contains a compressed bit vector. The third bit of the link is reserved for an "ALL" bit that is used by the coarse links, as will be described below.

[0066] Referring back to FIG. 6A, the two fine keys for slice 1 provide an example of the first two types of links. The first fine key associated with slice 1 has a value of blue and can be represented as f: 1:blue, with "f"indicating that it is a fine key, the "1" being the absolute slice number, and "blue" being the data value to which the key relates. As indicated in FIG. 6A, the link associated with key f:1:blue is of type 0, meaning that the link contains a pointer to a bit vector 48. In this case bit vector 48 contains a set bit (logical one) in bit positions 0
and 2. All other bit positions are cleared to a zero. This indicates that relative record numbers 0 and 2 of fine slice 1 contain blue in the vehicle color field. Referring back to the sample database of FIG. 4, it will be evident that this is correct--records 8,000 and 8,002 (which are relative records 0 and 2 of fine slice 1) contain the value "blue" in the vehicle color field.

[0067] Turning back to FIG. 6A, the second key for fine slice 1, namely, f:1:red, has a link of type 1, meaning that the link does not contain a pointer, but instead provides the relative record number (RRN=1) of the only record within fine slice 1 that contains "red" in the vehicle color field. Referring again to FIG. 4, it will be seen that relative record number 1 (which is absolute record number 8001) is in fact the only record within fine slice 1 that contains the value "red" in the vehicle color field. For high cardinality data such as a VIN, there will be a great number of keys created (since the number l of potential data values will be high) but very few records (often only one record) with the data value. Thus, where a slice has only a single record containing the data value, the storage of a record number within the link rather than both a pointer and an almost 1 KB bit vector can result in the saving of large amounts of storage memory.

[0068] Referring now to FIG. 9, the layout is shown for the coarse links that are associated with the coarse keys. As with the fine links, the coarse links can be any of three types which are identified using two bits at the beginning of the link. The first type of link (pointer to bit vector) is indicated by a zero in the first bit position and contains a pointer to a bit vector that represents each of the fine slices within the associated coarse slice. The second type (relative fine slice number) is indicated by a one in the first bit position and a zero in the second position, and is used whenever the coarse slice contains only one fine slice having any records that contain the data value. In this case the link contains the relative fine slice number (RFSN) of that one fine slice. Note that, in addition to the relative fine slice number, the second type of link also utilizes the single "ALL" bit which, as will be discussed below in greater detail, is used to indicate whether or not all of the records within the fine slice include the data value. The third type (compressed bit vector) is indicated by a one in both the first and second bit positions and contains a compressed bit vector representing the fine slices within the associated coarse slice. This compressed bit vector is created in the same manner as that of the fine link of FIG. 8
and this technique used to generate these compressed bit vectors will be described further below.

[0069] Examples of these two types of coarse links can be seen in FIG. 4. As shown therein, the value "orange" is found in records contained within fine slices 8, 20, and 3,000, all of which are within coarse slice 0. Thus, in FIG. 6A, key c:0:orange (coarse slice 0, orange) has associated with it a link that contains a pointer to a bit vector 48 in which bit positions 8, 20, and 3,000 are set to one and the others cleared to zero. As another example, while the value "green" is found in more than one record of the database of FIG. 4, it is only located in one record within coarse slice 0. Thus, key c:0:green of FIG. 6A includes a link that is not a pointer to a bit vector, but rather is the relative fine slice number (RFSN=0) of the fine slice that contains the record having "green" in the vehicle color field.

[0070] With reference to FIG. 10, it will be seen that, unlike the fine bit vectors 48, the coarse bit vector 48 is actually two different bit vectors concatenated together. The first of these two bit vectors includes what will be referred to as "ANY" bits, with the ANY bit vector 48a including a single ANY bit for each of the 4,000 fine slices contained within the coarse slice. An ANY bit is used to indicate whether any of the records contained within its associated fine slice has the data value in its associated field. If so, the ANY bit is set to one. Thus, an ANY bit will be zero only if none of the records within its associated fine slice contain the data value. The second of these two bit vectors includes what will be referred to as "ALL" bits, with the ALL bit vector 48b also including a single ALL bit for each of the 4,000 fine slices. An ALL bit is used to indicate that all of the records contained within its associated fine slice have the data value. If so, the ALL bit is set to one. If the data value is not included within even a single record within the fine slice, then the corresponding ALL bit is cleared to zero. As will be discussed below, the ALL bit is useful in processing queries involving the NOT operator.

[0071] As discussed above, where a particular data value does not exist in any of the records contained in a particular fine slice, no fine key is created. Thus, where an ANY bit in a coarse key is zero, no fine key is created for the fine slice associated with that ANY bit. Similarly, where a particular data value does not exist within any of the records contained in a particular coarse slice, no coarse key or fine keys are created for that coarse slice. Thus, in FIG. 6A, there is no coarse 0 key for "silver" because none of the first 32 million records contain silver in the vehicle color field. Rather, the only keys for silver are a coarse slice 1 key which has a link to its relative fine slice 23, and a fine slice 4023 key which has a link to its relative record number 0 (which is absolute record number 32,184,000), which as shown in FIG. 4 is the only record listing silver as the vehicle color.

[0072] Referring now to FIG. 11A, there is shown the actual structure of the "color" field index 30 of FIG. 6A. The index is stored as a B-tree with keys 44 stored not just at the leaves of the tree, but also at the root and intermediate nodes. This provides faster searching of the B-tree on average, since the search need not traverse all levels of the tree when searching for a key that happens to be located either at the root or an intermediate node. Also, when keys are deleted, the levels in the tree decrease faster than in traditional B-tree structures, since a sole key is never left at a terminal node, but is instead moved up to the next level node. Furthermore, the keys at the root and intermediate nodes are used to determine the search path through the tree. This saves storage space because the data stored at the root and intermediate nodes that is used to determine the search path through the tree is not duplicative of data stored at the leaves of the tree, as in traditional B-trees.

[0073] Compression of Bit Vectors

[0074] As mentioned above, the second type of link (relative record number) is useful for indexing fields that contain high cardinality data (vehicle identification numbers, socials security numbers). In the case of low or medium cardinality data, a lot of records will have the same data values, and thus a lot of bits will be turned on within a fine slice. In this case, the third type of link (compressed bit vector) can often be used within the link rather than both a pointer and a separate, almost 1 KB bit vector. When used, this compressed bit vector results in the same saving of large amounts of storage memory as the use of the single record number approach. The records for low or medium cardinality data can often be clustered together sequentially, with gaps between the clustered sequences. In a fine slice, this results in a bit vector with a relatively small number of alternating strings of 0 bits and strings of 1
bits, with each string being of arbitrary length. The compression method described below for compressing the bit vector has been designed to maximize the effective compression of this kind of data.

[0075] The method used to create a compressed bit vector is an adaptive method. It consists of multiple types of compression. These types of compression fall into two major categories, with multiple minor variations of each category forming sub-categories. Each type of compression is optimized for a particular set of conditions in the bit vector to be compressed. The compression process tries each of the applicable compression types until either the bit vector is successfully compressed into the allowed space; or all compression types fail. Thus, the compression process adapts to each instance of bit vector to be compressed. This adaptive process can be extended by adding additional compression types as required.

[0076] The compression type, consisting of the compression category and compression sub-category, is stored as part of the compressed bit vector. This allows the decompression process to properly and unambiguously decompress the compressed bit vector.

[0077] The first compression category is variable run-length encoding. This method compresses a bit vector by viewing the bit vector as a string of alternating runs of identical bit values (either 0 or 1); and encoding this string of runs as a string of run-length numbers. Since a single bit can have only two values, either 0 or 1, the encoded string of runs does not need to specify a bit value; it need only specify a run length. The bit value of the first run is specified by the compression sub-category; the bit value for each subsequent run simply alternates from the previous run. For example, the following bit vector:

[0078] 0000 1111 1111 1111 1111 1111 1111 0000 0000 0000 0000 1100 0000

[0079] would be compressed and encoded as follows, where the bit value of the first run is 0:

[0080] (4) (24) (16) (2) (6)

[0081] If the total length of the uncompressed bit vector is known, as is the case in the present invention, there is no need to encode the last run. If there is space remaining for the last run in the compressed bit vector, it is encoded so that the decompression process can properly interpret the compressed bit vector. If there is not enough space for the last run in the compressed bit vector, it is not encoded. Since the remaining bits in the uncompressed bit vector must all have the alternate value from the previous run, the decompression process can simply calculate the number of uncompressed bits left and set them to the next alternate value. The optimal compressed and encoded bit vector thus becomes:

[0082] (4) (24) (16) (2)

[0083] Each encoded run-length field occupies a certain number of bits. In the simplest implementation, all run-length fields occupy the same number of bits, which is chosen as the minimum number of bits required to hold the longest run possible. The longest run possible is the size of the uncompressed bit vector, since the entire bit vector could contain the same bit value. For a given total uncompressed bit vector length tubl, this fixed run-length field frlf is calculated as follows:

frlf=floor(log.sub.2 tubl)+1

[0084] where floor is the function returning the largest integer less than or equal to its argument. For the current example, the value of tubl is 52, and thus the value for frlf is calculated as 6. Since the compressed bit vector contains 4 runs, the compressed bit vector will require 4.multidot.6 or 24 bits.

[0085] This implementation can be improved by allowing the run-length fields to occupy a variable number of bits. Since the longest run possible decreases as the remaining uncompressed portion of the bit vector decreases, each successive run-length field need occupy only as many bits as are required to hold the longest remaining run possible. The longest remaining run possible is the size of the remaining uncompressed bit vector, since the entire remaining bit vector could contain the same bit value. For a given remaining uncompressed bit vector length rubl, this variable run-length field vrlf is calculated as follows:

vrlf=floor(log.sub.2 rubl)+1

[0086] For the current example, this gives the following compressed encoded bit vector:

1
encoded runs: (4) (24) (16) (2) remaining uncompressed length: 52 48 24 8
variable run length field size: [6] [6] [5] [4]

[0087] Thus, the compressed bit vector will require 6+6+5+4 or 21 bits.

[0088] This method of using variable run-length field sizes can be further improved by taking advantage of the fact that a run length of 0 will never be encoded; the minimum possible run-length is 1. By biasing the encoded run-length by -1, this can in some cases reduce the size of the variable run-length field required. This biasing is accomplished by subtracting 1 from the run-length value before it is encoded during compression; and adding 1 to the run-length value after it is decoded during decompression. This allows the remaining uncompressed bit vector length to also be biased by -1 when calculating the variable run-length field size. Thus, for a given remaining uncompressed bit vector length rubl, this biased variable run-length field bvrlf is calculated as follows:

bvrlf=floor(log.sub.2 (rubl-1))+1

[0089] For the current example, this gives the following compressed encoded bit vector:

2
biased encoded runs: (3) (23) (15) (1) biased remaining uncompressed length: 51 47 23 7
biased variable run length field size: [6] [6] [5] [3]

[0090] Thus, the compressed bit vector will require 6+6+5+3 or 20 bits.

[0091] The sub-category of the compressed bit vector indicates whether the bit vector starts with a 0 value bit or a 1 value bit. This is necessary because the biased run-length value does not allow for a zero length run. It also provides better compression, because there is no need to waste a run-length field specifying a length of zero bits. If the first run starts with the bit value 1, the encoding process is the same as if the first run started with the bit value 0. The only difference is that the first run-length value indicates the number of 1 value bits, and the bit value of the following runs alternate from there. For example, the following bit vector:

[0092] 1111 0000 0000 0000 0000 0000 0000 1111 1111 1111 1111 0011 1111

[0093] would be compressed and encoded as follows:

3
biased encoded runs: (3) (23) (15) (1) biased remaining uncompressed length: 51 47 23 7
biased variable run length field size: [6] [6] [5] [3]

[0094] Thus, the compressed bit vector will require 6+6+5+3 or 20 bits. Note that the encoded runs for this compressed bit vector are identical to the encoded runs for the compressed bit vector of the prior example, which is actually a different bit vector. This is because the pattern of the alternating runs for the two bit vectors is the same, even though the bit values are reversed. However, because the compression type is stored as part of the compressed bit vector, the two compressed bit vectors are distinct. This allows the decompression process to decompress these two compressed bit vectors into their proper and correct original uncompressed bit vectors. The compression type for the prior example specifies that the uncompressed bit vector starts with a 0 bit value. The compression type for the current example specifies that the uncompressed bit vector starts with a 1 value.

[0095] In the case of the present invention, the total size available for the compressed bit vector is both known and fixed. Thus, in order to be compressible, an uncompressed bit vector must compress to a size less than or equal to the total size available for the compressed bit vector. For example, the following bit vector:

[0096] 0101 0000 1111 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

[0097] would be compressed and encoded as follows:

4
biased encoded runs: (0) (0) (0) (0) (3) (3) biased remaining uncompressed length: 51 50 49 48 47 43
biased variable run length field size: [6] [6] [6] [6] [6] [6]

[0098] Thus, the compressed bit vector will require 6+6+6+6+6+6 or 36
bits. If the total available size for the compressed bit vector run-length encoding was 24 bits, this bit vector would be uncompressible. However, this bit vector consists of a large number of short runs at the most significant (left) end of the uncompressed bit vector, followed by a small number of long runs at the least significant (right) end of the bit vector. Because the compression process operates in the most significant (left) to least-significant (right) direction, the biased variable run length field size remains high. This method can be improved by implementing a reverse-direction compression process, and using the sub-category of the compressed bit vector to also indicate the direction of the run-length encoding process.

[0099] By starting at the least significant (right) end of the uncompressed bit vector, and building runs from right-to-left, the current bit vector would be compressed and encoded as follows:

5
biased encoded runs: (39) (3) (3) (0) (0) (0) biased remaining uncompressed length: 51 11 7 3 2 1
biased variable run length field size: [6] [4] [3] [2] [2] [1]

[0100] Thus, the compressed bit vector will require 6+4+3+2+2+1 or 18
bits. This will fit within the total available size for the compressed bit vector run-length encoding of 24 bits, making this bit vector compressible.

[0101] Thus, the compression category of variable run-length encoding contains the following compression types:

[0102] 1) Variable Run-Length, from Most Significant Bit, first run is 0s

[0103] 2) Variable Run-Length, from Most Significant Bit, first run is 1s

[0104] 3) Variable Run-Length, from Least Significant Bit, first run is 0s

[0105] 4) Variable Run-Length, from Least Significant Bit, first run is 1s

[0106] With the compression type of variable run-length encoding, from Most Significant Bit, first run is 0s the following bit vector:

[0107] 0000 0000 0000 0000 0000 0001 0101 0100 0000 0000 0000 0000 0000

[0108] would be compressed and encoded as follows:

6
biased encoded runs: (22) (0) (0) (0) (0) (0) (0) (0) biased remaining uncompressed 51 28 27 26 25 24 23 22
length: biased variable run length [6] [5] [5] [5] [5] [5] [5] [5] field size:

[0109] Thus, the compressed bit vector will require 6+5+5+5+5+5+5+5 or 41
bits. If the total available size for the compressed bit vector run-length encoding was 24 bits, this bit vector would be uncompressible. In this case, the compression type of variable run-length encoding, from Least Significant Bit, first run is 0s does not provide any improvement. However, this bit vector consists of a large number of short runs closely packed together in the middle of the uncompressed bit vector. Because the short runs are located in the middle of the bit vector, the biased variable run length field size remains high. This method can be improved by another compression category which encodes this string of short runs as a literal, instead of as a series of runs.

[0110] The second compression category is two runs with literal encoding. This method compresses a bit vector by viewing the bit vector as a literal string of arbitrary bit values (both 0 and 1) surrounded on both ends by a string of identical bit values (either 0 or 1); and encoding these three strings as a run-length number followed by a literal string of bits followed by a run-length number. Since a single bit can have only two values, either 0 or 1, the encoded runs of identical bits surrounding the literal string do not need to specify a bit value; they need only specify a run length. The bit value of the first run and the bit value of the last run is specified by the compression sub-category. For example, the following bit vector:

[0111] 0000 0000 0000 0000 0000 0001 0101 0100 0000 0000 0000 0000 0000

[0112] would be compressed and encoded as follows, where the bit value of the both the first run and the last run is 0:

[0113] (23) 1010 101 (22)

[0114] If the total length of the uncompressed bit vector is known, as is the case in the present invention, there is no need to encode the last run. If there is space remaining after the literal string in the compressed bit vector, the literal string is extended to fill the remaining space, setting each filled-in bit to the bit value of the last run. Since the remaining bits after the literal string in the uncompressed bit vector must all have the last run value specified by the compression sub-category, the decompression process can simply calculate the number of uncompressed bits left and set them to the compression sub-category value. The optimal compressed and encoded bit vector thus becomes:

[0115] (23) 1010 1010 0000 00. . .

[0116] The first encoded run-length field occupies a certain number of bits. This is chosen as the minimum number of bits required to hold the longest run possible. The longest run possible is the maximum distance from the end of the bit vector to the start of the literal string. If the compression method can start from either end of the uncompressed bit vector, the maximum distance from the end of the bit vector to the start of the literal string is equal to one-half the size of the uncompressed bit vector. If the literal string starts more than halfway from the most significant (left) end of the uncompressed bit vector, it will start less than halfway from the least significant (right) end of the uncompressed bit vector. The sub-category of the compressed bit vector can be used to also indicate the direction of the run-length encoding process (the end from which to start compressing). For a given total uncompressed bit vector length tubl, the fixed run-length field frlf is calculated as follows:

frlf=floor(log.sub.2 (tubl/2)+1

[0117] For the current example, the value of tubl is 52, and thus the value for frlf is calculated as 5. If the total available size for the compressed bit vector two runs with literal encoding was 24 bits, this would allow 19 bits for the literal string.

[0118] This method of compression can be further improved by taking advantage of the fact that a run length of 0 will never be encoded; the minimum possible run-length is 1. By biasing the encoded run-length by -1, this can in some cases reduce the size of the fixed run-length field required. This biasing is accomplished by subtracting 1 from the run-length value before it is encoded during compression; and adding 1 to the run-length value after it is decoded during decompression. Thus, for a given total uncompressed bit vector length tubl, the biased fixed run-length field bfrlf is calculated as follows:

bfrlf=floor(log.sub.2 ((tubl-1)/2))+1

[0119] For the current example, the value of tubl is 52, and thus the value for bfrlf is calculated as 5. If the total available size for the compressed bit vector two runs with literal encoding was 24 bits, this would allow 19 bits for the literal string. For this example, this would give the following compressed encoded bit vector:

[0120] (22) 1010 1010 0000 0000 000

[0121] Thus, for certain patterns of data, the two runs with literal encoding compression category allows for successful compression where the variable run-length encoding compression category fails.

[0122] The compression category of two runs with literal encoding contains the following compression types:

[0123] 5) Two Runs w/ Literal, from Most Significant Bit, first run is 0s, last run is 0s

[0124] 6) Two Runs w/ Literal, from Most Significant Bit, first run is 0s, last run is 1s

[0125] 7) Two Runs w/ Literal, from Most Significant Bit, first run is 1s, last run is 0s

[0126] 8) Two Runs w/ Literal, from Most Significant Bit, first run is 1s, last run is 1s

[0127] 9) Two Runs w/ Literal, from Least Significant Bit, first run is 0s, last run is 0s

[0128] 10) Two Runs w/ Literal, from Least Significant Bit, first run is 0s, last run is 1s

[0129] 11) Two Runs w/ Literal, from Least Significant Bit, first run is 1s, last run is 0s

[0130] 12) Two Runs w/ Literal, from Least Significant Bit, first run is 1s, last run is 1s

[0131] The present embodiment performs optimization of the fine link in the following manner:

[0132] If the fine slice contains only a single record, the link is formatted as a single record link, with the relative record number (RRN) contained within the link;

[0133] otherwise, if the fine slice bit vector can be compressed to fit within the link, the link is formatted as a compressed bit vector, with the compressed bit vector stored within the link;

[0134] otherwise, the link is formatted as a pointer to a bit vector, with the bit vector stored external to the link.

[0135] For the coarse links, the ANY and ALL bit vectors of FIG. 10 are treated as a single bit vector for the purpose of compression.

[0136] The advantage of using compressed bit vectors is shown in FIGS. 6B and 11B. As shown in FIG. 6B, the compression permits the bit vectors to be stored in the link, rather than needing a pointer to a separate bit vector. When compression is used, the link includes both the compressed bit vector plus a four-bit compression type (CT) data value indicating which of the twelve compression schemes disclosed above are used. For example, for the blue coarse slice 0 key, the compression type is 2
(since the uncompressed bit vector begins with a one bit) and, using the biased, variable run length encoding described above, the compressed bit vector can be stored as a single biased encoded run of (2). For the orange coarse slice 0 key, the first compression type (CT=1) results in a compressed bit vector of 78 bits, too big to fit into the available 56
bits in the link. However, compression type 3 compresses the bit vector to only 51 bits which fits. Therefore, the resulting biased encoded run is (4998)(0)(2978)(0)(1 0)(0).

[0137] Query Processing Using the Indexes

[0138] Referring now to FIG. 12, there is shown a computer system 50 for use in implementing the database, index structure, and query processing shown in the previous figures. Computer system 50 includes a computer 52
having a microprocessor 54, RAM 56, a hard disk 58, a keyboard 60, and monitor 62. Computer 52 can be any of a number of commercially-available personal computers running an operating system such as WindowsNT.RTM., with microprocessor 54 comprising an Intel.RTM. Pentium.RTM. II or equivalent processor. Database 20 is stored as a single file on a computer-readable memory such as a hard drive 58. Similarly, each of the indexes are stored on hard drive 58 as a separate file. Hard drive 58 can comprise a fixed magnetic disk or other non-volatile data storage. Depending upon the size of database 20, the non-volatile data storage device may comprise a number of hard drives 58a-n, such as in a RAID array, with the database spanning two or more of these hard drives. As mentioned above, the database management program 64 is used to setup and maintain the database 20 and its indexes, as well as to perform query processing and associated retrieval of records using the indexes. As with database 20, database management program 64 is also stored in computer-readable format on hard drive 58. As will be appreciated, computer 52 can be a server attached via a network interface card (not shown) to a network, whether it be a local area network or a global computer network such as the Internet.

[0139] Query processing is implemented by computer 52 by way of microprocessor 54 executing instructions from database management program 64. Program 64 locates the one or more records that satisfies a particular user query by creating a target keys (e.g., c:0:blue) for each coarse and fine slice and then searches the appropriate index for those target keys, starting with the lowest key valued key (i.e., coarse slice 0). If no key is found, a bit vector of all zeros is returned. If a matching key is found in the index, then the associated link is used to obtain a bit vector for that key. If the link is of type 0, as shown in FIGS. 8 and 9, then the bit vector identified by the link is returned. Where one or both of the keys' links are of type 1; that is, they contain a relative fine slice number (in the case of a coarse key) or a relative record number (in the case of a fine key) rather than a pointer to a bit vector, then a bit vector is created and, for a fine bit vector, the bit corresponding to the record identified by the link is set to one and the remaining bits of the vector being cleared to zero. When creating a coarse bit vector (which includes both ANY bits and ALL bits), the ANY bit corresponding to the fine slice number identified by the link is set to one, with the remaining ANY bits being cleared to zero, and the ALL bit corresponding to the fine slice number identified by the link is set to the same value (0 or 1) as the ALL bit contained in the link, with the other ALL bits being cleared to zero. In this way, query processing can always be carried out using bit vectors, regardless of which type of link is stored in the index.

[0140] In the case of simple queries, such as MAKE=Chevrolet, once a coarse bit vector has been obtained, it can be used to determine which fine slices contain records satisfying that query. The keys for those fine slices (e.g., f:0:Chevrolet) can then be accessed, in order of their key value, and their associated bit vectors obtained. As records containing the data value are identified, they are retrieved for processing.

[0141] For boolean operations, such as would be required for a query of MODEL=Corvette and YEAR=1975, corresponding bit vectors for each of the keyword search terms are obtained in the manner described above, and then are logically combined in accordance with the boolean logic (AND) specified in the user's query. The following operators are used to perform boolean operations on the bit vectors:

[0142] AND_BV

[0143] This is a binary operator, taking two bit vector parameters, and returning a single bit vector result. All bits of the two bit vectors are logically AND-ed together, yielding a single result bit vector. The operation is identical for coarse and fine bit vectors.

[0144] OR_BV

[0145] This is a binary operator, taking two bit vector parameters, and returning a single bit vector result. All bits of the two bit vectors are logically OR-ed together, yielding a single result bit vector. The operation is identical for coarse and fine bit vectors.

[0146] NOT_BV

[0147] This is a unary operator, taking one bit vector parameter, and returning a single bit vector result. All bits of the one bit vector are logically NOT-ed (complemented), yielding a single result bit vector. The operation is different for coarse and fine bit vectors. For fine bit vectors, all bits are simply NOT-ed in place. For coarse bit vectors, the ANY and ALL bits are NOT-ed and then the ANY and ALL bit vectors are swapped; that is, the ALL bit vector is moved to the left portion of the coarse bit vector as shown in FIG. 10 and thereby becomes the ANY bit vector for the NOT-ed coarse bit vector. Similarly, the ANY bit vector is moved to right portion of the coarse bit vector so that it becomes the ALL bit vector of the NOT-ed coarse bit vector.

[0148] The following are basic "find" operators that are used in searching through an index to obtain a bit vector for specified target keys or key ranges.

[0149] FIND_EQUAL_BV

[0150] This operator searches an index to find the key that matches the specified target key, which is a parameter to this operator. There will be at most one entry in the B-tree matching the target key. If the target key doesn't exist, a bit vector of all zero bits is created and returned. At the end of this operator, a current path structure is created pointing to the location in the B-tree where the target entry was found, or where it would have been found if it had existed, and the target key is saved for use by the FIND_NEXT_BV operator discussed below.

[0151] FIND_NEXT_BV

[0152] This operator searches an index to find the next key whose field Slice Type and Absolute Slice Number values (see FIG. 7) match the target key's field Slice Type and Absolute Slice Number values. Thus, the key data value portion of the key is ignored. The target key is the target key saved by the last previous FIND_EQUAL_BV operator. There can be any number of entries in the B-tree matching the target key. If the next target key does not exist, a special completed result value is returned to indicate that no more entries exist matching the target key. The search starts from the current path created by the last previous FIND_EQUAL_BV operator, or updated by the last previous FIND_NEXT_BV operator. At the end of this operator, the current path structure is updated to point to the location in the B-tree where the next target key was found.

[0153] The following are relational "find" operators that are used to search through an index. Each execution of one of these operators finds one or more target keys in an index, and returns the bit vector, either coarse or fine, associated with the target keys. If multiple keys are found, their associated bit vectors are logically OR-ed together to form a single result bit vector. This single result bit vector is an accumulation of all the keys found in the search. Each operator executes on a single Slice Type value and Absolute Slice Number value. Depending on the operator, it executes on one or more Key Data Values. These operators are executed multiple times to operate on more than one Slice Type or Absolute Slice Number.

[0154] FIND_LSS_SLICE

[0155] This operator searches an index to find all entries whose Key Data Value (see FIG. 7) is less than the specified target key, which is a parameter to this operator. It operates as follows. A FIND_EQUAL_BV operator is executed on the target key, generating an initial FIND_LSS_SLICE result bit vector. The FIND_NEXT_BV operator is executed continuously until it returns a completed result. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_LSS_SLICE result bit vector. The FIND_LSS_SLICE result bit vector is logically NOT-ed with the NOT_BV operator. The FIND_LSS_SLICE result bit vector is returned as the operator result.

[0156] FIND_LEQ_SLICE

[0157] This operator searches an index to find all entries whose Key Data Value is less than or equal to the specified target key, which is a parameter to this operator. It operates as follows. A FIND_EQUAL_BV operator is executed on the target key. The initial FIND_LEQ_SLICE result bit vector is set to all zero bits. The FIND_NEXT_BV operator is executed continuously until it returns a completed result. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_LEQ_SLICE result bit vector. The FIND_LEQ_SLICE result bit vector is logically NOT-ed with the NOT_BV operator. The FIND_LEQ_SLICE result bit vector is returned as the operator result.

[0158] FIND_EQL_SLICE

[0159] This operator searches an index to find all entries whose Key Data Value is equal to the specified target key, which is a parameter to this operator. It operates as follows. A FIND_EQUAL_BV operator is executed on the target key, generating a FIND_EQL_SLICE result bit vector. The FIND_EQL_SLICE result bit vector is returned as the operator result. ps FIND_PEQL_SLICE

[0160] This operator searches an index to find all entries whose Key Data Value is partially equal to the specified target key, which is a parameter to this operator. Partially equal means that the entry Key Data Value matches the specified target key for the length of the target key's Key Data Value (the target key is a partial Key Data Value). It operates as follows. A FIND_EQUAL_BV operator is executed on the target key, generating an initial FIND_PEQL_SLICE result bit vector. The FIND_NEXT_BV operator is executed continuously until it returns a completed result, or the next key's Key Data Value is greater than the target Key Data Value. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_PEQL_SLICE result bit vector. The FIND_PEQL_SLICE result bit vector is returned as the operator result.

[0161] FIND_NEQ_SLICE

[0162] This operator searches an index to find all entries whose Key Data Value is not equal to the specified target key, which is a parameter to this operator. It operates as follows. A FIND_EQUAL_BV operator is executed on the target key, generating an initial FIND_NEQ_SLICE result bit vector. The FIND_NEQ_SLICE result bit vector is logically NOT-ed with the NOT_BV operator. The FIND_NEQ_SLICE result bit vector is returned as the operator result.

[0163] FIND_GEQ_SLICE

[0164] This operator searches an index to find all entries whose Key Data Value is greater than or equal to the specified target key, which is a parameter to this operator. It operates as follows. A FIND_EQUAL_BV operator is executed on the target key, generating an initial FIND_GEQ_SLICE result bit vector. The FIND_NEXT_BV operator is executed continuously until it returns a completed result. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_GEQ_SLICE result bit vector. The FIND_GEQ_SLICE result bit vector is returned as the operator result.

[0165] FIND_GTR_SLICE

[0166] This operator searches an index to find all entries whose Key Data Value is greater than the specified target key, which is a parameter to this operator. It operates as follows. A FIND_EQUAL_BV operator is executed on the target key. The initial FIND_GTR_SLICE result bit vector is set to all zero bits. The FIND_NEXT_BV operator is executed continuously until it returns a completed result. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_GTR_SLICE result bit vector. The FIND_GTR_SLICE result bit vector is returned as the operator result.

[0167] FIND_RANGE_SLICE

[0168] This operator searches an index to find all entries whose Key Data Value is greater than or equal to the specified minimum target key, and less than or equal to the specified maximum target key, which are parameters to this operator. It operates as follows. A FIND_EQUAL_BV operator is executed on the minimum target key, generating an initial FIND_RANGE_SLICE result bit vector. The FIND_NEXT_BV operator is executed continuously until it returns a completed result, or the next Key Data Value is greater than the maximum target key's Key Data Value. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_RANGE_SLICE result bit vector. The FIND_RANGE_SLICE result bit vector is returned as the operator result.

[0169] FIND_RANGE_X_SLICE

[0170] This operator searches an index to find all entries whose Key Data Value is greater than the specified minimum target key, and less than the specified maximum target key, which are parameters to this operator. It is similar to the FIND_RANGE_SLICE operator defined above, except the range is exclusive of both (minimum and maximum) endpoint values, rather than inclusive. It operates as follows. A FIND_EQUAL_BV operator is executed on the minimum target key. The initial FIND_RANGE_X_SLICE result bit vector is set to all zero bits. The FIND_NEXT_BV operator is executed continuously until it returns a completed result, or the next Key Data Value is greater than or equal to the maximum target key's Key Data Value. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_RANGE_X_SLICE result bit vector. The FIND_RANGE_X_SLICE result bit vector is returned as the operator result.

[0171] FIND_RANGE_LX_SLICE

[0172] This operator searches an index to find all entries whose Key Data Value is greater than the specified minimum target key, and less than or equal to the specified maximum target key, which are parameters to this operator. It is similar to the FIND_RANGE_SLICE operator defined above, except the range is exclusive of the left (minimum) endpoint value, rather than inclusive. It operates as follows. A FIND_EQUAL_BV operator is executed on the minimum target key. The initial FIND_RANGE_LX_SLICE result bit vector is set to all zero bits. The FIND_NEXT_BV operator is executed continuously until it returns a completed result, or the next Key Data Value is greater than the maximum target key's Key Data Value. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_RANGE_LX_SLICE result bit vector. The FIND_RANGE_LX_SLICE result bit vector is returned as the operator result.

[0173] FIND_RANGE_RX_SLICE

[0174] This operator searches an index to find all entries whose Key Data Value is greater than or equal to the specified minimum target key, and less than the specified maximum target key, which are parameters to this operator. It is similar to the FIND_RANGE_SLICE operator defined above, except the range is exclusive of the right (maximum) endpoint value, rather than inclusive. It operates as follows. A FIND_EQUAL_BV operator is executed on the minimum target key, generating an initial FIND_RANGE_RX_SLICE result bit vector. The FIND_NEXT_BV operator is executed continuously until it returns a completed result, or the next Key Data Value is greater than or equal to the maximum target key's Key Data Value. Each execution result bit vector is logically OR-ed with the OR_BV operator into the FIND_RANGE_RX_SLICE result bit vector. The FIND_RANGE_RX_SLICE result bit vector is returned as the operator result.

[0175] Retrieval of Records

[0176] As a result of the query process, a list of absolute record numbers is generated, with the list representing a subset of all of the records contained in the database. The records are listed in record number order as an inherent result of the index structure and query processing techniques described above. The retrieval operation is used to select and retrieve from database 20 the list of records generated as a result of the query processing. The retrieval operation described is designed to provide very fast retrieval response.

[0177] In the preferred embodiment, there are two types of retrieval operations: COUNT and FIND. A COUNT retrieval simply counts the selected subset of records. Since the records themselves don't need to be retrieved, this operation is extremely fast. The retrieval operation to select and COUNT a subset of records from a database is shown in FIG. 13. A FIND retrieval retrieves each of the records in the subset. The retrieval operation to select and FIND a subset of records from a database is shown in FIG. 14. Since the selection operation rapidly identifies the selected records, the retrieval time is in general proportional to the number of records selected, not the number of records in the data record table.

[0178] A unique aspect of the retrieval operation is that it operates by examining the indexes in record number order first, and then field value order, rather than field value order first, and then record number order. One reason this is possible is that, as shown in FIG. 11, the index B-tree is organized and ordered for efficient index-sequential operation in this search order. A given slice and data value combination can be efficiently located in the index B-tree. This is accomplished using normal B-tree search methods, starting at the top of the tree, and utilizing a binary search in each tree node, until the target index entry is found. Since all of the field values for the given slice are stored in sequential entries in the B-tree immediately preceding and following the target key, they may be retrieved easily and extremely quickly. A second reason this is possible is the combination of coarse and fine bit vectors. As a result of this combination, a single coarse bit vector represents 32,000,000 records. This means that by processing a single coarse slice for the retrieval criteria, 32,000,000 records have been processed, and the fine slices of interest in the current coarse slice have been identified. From there only the fine slices of interest are directly accessed and processed 8,000 records at a time. Then only the individual records of interest are counted or retrieved. A third reason this is possible is the ability to logically NOT a bit vector rapidly. For example, the NOT EQUAL criteria is implemented by finding the bit vector EQUAL to the criteria, and NOT-ing this bit vector. This is much faster than finding all the bit vectors NOT EQUAL to the criteria. The special capability of a coarse bit vector to contain both "ANY" and "ALL" bit vectors allows the NOT operation to work as effectively on coarse bit vectors as on fine bit vectors. Yet a fourth reason this is possible is the ability to rapidly evaluate a LESS THAN, LESS THAN OR EQUAL, GREATER THAN, GREATER THAN OR EQUAL, or RANGE criteria by simply OR-ing together bit vectors stored sequentially in the index. Since each coarse bit vector represents 32,000,000 records, it is much faster to enumerate the Key Data Values within a slice this way, than to find the Key Data Values first, and then enumerate the record numbers. Thus, the retrieval criteria is quickly processed in record number order 32,000,000 records at a time. Even with a record count of 1 billion records, only 32 coarse slices would need to be examined.

[0179] During execution of the criteria code, the validity index (see FIG. 2) may be used by the retrieval operation. The validity index is a standard index with a one bit for every in-use data record, and a zero bit for every deleted data record. If a NOT_BV operator is executed at any time in the criteria code, a flag is set specifying that the validity index is needed. This flag is needed because a zero bit in a bit vector represents deleted data records as well as data records not matching the current criteria. If this flag is set at the end of execution of the query processing, the validity index bit vector for the current coarse or fine slice is AND-ed in to the current result bit vector with the AND_BV operator. This eliminates deleted records which were introduced by the NOT_BV operator from the final result bit vector.

[0180] The database should be locked against update during certain portions of the retrieval operation. The retrieval operation is optimized to reduce the number of times and duration of this locking. The database is locked with a shared-lock (reader lock) only during execution of the query process. This allows any number of other retrieval operations on the table to proceed concurrently, while temporarily locking out update operations. The database is locked at the beginning of the query processing, and unlocked at the end of the query processing. Since the query is executed a slice at a time (via the bit vector mechanism), a single lock-unlock cycle covers query processing for 32,000,000 records for coarse slices, and 8,000 records for fine slices. The design of the index makes criteria code execution very simple and quick. In this way, the number of lock-unlock cycles is minimized, and the duration of the time the database is locked against update is minimized.

[0181] The database is unlocked during the loading and processing of retrieved data records. This creates the possibility of a data record being modified between the time the query is executed, and the time the data record is loaded. This means that the actual data record loaded could no longer match the retrieval criteria. This problem is avoided by assigning a unique update transaction number to each data record update. This update transaction number is stored in the data record itself. The then-current update transaction number is captured and stored by the retrieval operation during the execution of the query. When a data record is loaded during a retrieval operation, its update transaction number is compared against the query's update transaction number. If the data record update transaction number is higher, it means the data record has been updated since the execution of the query, and may no longer match the retrieval criteria.

[0182] When this condition is detected, the updated data record is not processed. Instead, the retrieval operation is interrupted, and restarted at the current coarse and fine slices. The query is re-executed for the current coarse slice and then for the current fine slice. Processing of the current fine slice is then restarted at the bit represented by the record number which was being loaded and processed at the time of the interruption. This insures that the data record is consistent with the newly executed query, and processing resumes (unless the data record has been updated again, in which case the interruption/restart will be repeated). Thus, the retrieval operation can evaluate retrieval criteria for large numbers of records at a time, with minimal locking, while providing consistency of results. The method of retrieval operation processing in record number order provides the ability of interrupting and restarting retrieval at any record number location simply and effectively.

[0183] Although retrieval operations operate a slice at a time for speed and efficiency, this does not mean that a FIND retrieval operation has to process all of the records meeting the retrieval criteria. A FIND retrieval operation can be terminated whenever a data record is loaded and processed. For example, an application may decide that it should process only 100 data records at a time. The FIND retrieval operation would count the data records as they were loaded and processed, and stop the FIND retrieval operation when that count reached 100. Because retrieval operations process in record number order, this is very easy to implement and very efficient. In addition, a subsequent FIND retrieval operation can simply start with the next record after the last record previously loaded and processed, by starting with the coarse and fine slice containing that next record number.

[0184] Data Indexing Without Records

[0185] In the discussion above, it was shown how indexes and bit vectors could be used to index data that is stored within records. The example of automobile records for an automobile database was used for this purpose. This approach to using indexes and bit vectors can be generally abstracted and this will be done using the example of employees and employee attributes. Generally, if there are several employees in an organization a user would want to be able to go a list of employees and select sets or sub-sets of employees based on attributes. For example, finding all of the employees who belong to a given department, or all those employees that are manager level, or all the employees that work on a given product or some combination. For a more complex example, finding all the employees who are in engineering and a supervisor or higher level and work on just storage products, and the kind of queries previously addressed exactly maps into the developed database architecture. The difference, however, is that the records do not exist. Previously, a record for each automobile with information concerning that automobile can have associated indexes to find associated information. In the case of abstraction, for the example of employees, the database of employee information does not have to exist. The employee information is a separate set of data and the indexing approach described below provides the ability to abstract it and build a list of employees that have attributes names and values, without needing to have and maintainbut we don't really need to ever have data records.

[0186] For example, a human resources department wants to keep a skills database to find employees so that when they need to find people to work on certain projects they can. Or to route information to recipients, in a company its very common to be sending information out to employees which could be reports coming out of the computer systems, financial reports, production and sales reports. The right report has to be sent to the right person, so the attributes of the employees can be used to build queries that find the people. For instance, the financial people need to receive the financial reports. For these applications, however, lists are created, but the actual employee record is not necessary. This example is actually a one way implemented distribution list, but it is a more efficient way to implement distribution lists than just keeping a list of employees.

[0187] For another example, assume there were 100 different reports that had to be distributed to 100 different groups of employees in a company. If a simple distribution list is used, then there would be 100 reports each with various employees on them every time an employee changed or moved from one division or one level or one product or one department to another--all 100 lists would have to accessed to update them. However, by using the attribute method described herein, all that is needed to do is go to that one employee and change his attributes. If he switched departments, his department attribute can be changed; if he switched products, the product attribute is changed and it would automatically be reflected in all of the 100 distribution lists because those distribution lists are queried against the attribute database.

[0188] The following is an example of how to use this with employee attributes and then this concept will be generalized further. In the explanation that follows, the employee application is provided as an example, but really this can be used to maintain an attribute value database for any kind of structure for any kind of data.

[0189] FIG. 15 shows for the purposes of illustrating the preferred embodiment, a list of employee attribute names and associated attribute values. In this table, department, for example, is an attribute name with example attribute values: engineering, manufacturing, sales, services, and administration. Division is another attribute name and has the values U.S., Europe and Asia and so forth. Level is an attribute name that indicates employee level, with the sample attribute values of hourly, salaried, supervisor, manager and director. Products is yet another sample attribute name, with attribute values of workstations, servers, disk storage, magnetic tape, financial software, manufacturing software and consultant services. Cost center designates which cost center the employee belongs to, with range of attribute values from 1000 to 9999. All of the attribute values may not apply to all attribute names, but on some attribute names it is possible that a given employee could have more than one attribute value. For example, in the products attribute name, a given employee could work with financial software and manufacturing software or with workstations and servers and disk storage.

[0190] FIG. 16 illustrates an example set of data including a set of 10
sample employees, each of which is assigned a department, a division, a level, various products (an employee can have multiple products or no products that they are associated with), and the costs centers. The employee can have one or more cost centers or a range of cost centers.

[0191] FIG. 17 shows a conventional way to implement this employee attribute data, specifically implementing it with data records. In the example illustrated in FIG. 17, we've got a table called Employee Attributes and each single record in that table indicates an employee name, an attribute name, and an attribute value. So in a single record one attribute value for one attribute name is associated with one employee. So if with reference to this example, Employee Peter is in record number 0, and his department attribute is engineering. As illustrated, this actually is a table that has three fields. A field called employee name or employee, a field call attribute name, and a field called attribute value. So record 0 has been created with employee Peter, attribute name Department and attribute value Engineering. Another record called record number 1 with employee Peter, attribute name Division, and Value U.S., etc. Looking at records 3 and 4, specifically products, this employee has two products (two values) for the attribute name products so that takes two records, record number 3 is employee name Peter, attribute name Products and attribute value Work Stations. Likewise record number 4 is employee name Peter, attribute name of Products, and attribute value Servers. This employee Peter has a large number of costs centers. He has cost center 1100 and then also cost centers 1500 through 1599 so a record for each one of them will have to be made. At record number 5 is the cost center attribute name and the attribute value is 1100, and then at record number 6 for the attribute name cost center the attribute value is 1500, the next record is 1501 and then with the ellipses there are all the values in between records 8
through a and then for a+1 the cost center attribute value is 1598, and the next record is 1599, and then move on to the next employee Paul.

[0192] There are so many records per employee to provide the flexibility that is required in this example--there could be thousands of permutations to complicate things even more, requiring the need to add attributes at any time. A given attribute name could have any number of attribute values, so it would be best to not define one employee record with a given number of attribute names and attribute values; instead the need exists to make a relational table like this where for each attribute name, attribute value, and employee name there is a separate record.

[0193] There are two problems with the example illustrated in FIG. 17, that require solving. The first problem is the amount of data and the time to create it. In this conventional approach, for every employee and their set of attributes a lot of records are created. Each of these records takes space and it also takes time to create it and delete it. The second problem is not as apparent. If this table is designed and implemented the way described earlier in the patent, each one of the columns (employee, attribute name and attribute value) would be a field in the record and they would be a key field, they would be indexed with an index by employee, an index by attribute name, and an index by attribute value. This would provide the kind of processing needed to find all employees who are in the U.S. division, for example. One could then search with the method described above using the fields saying find it where the attribute name field is equal to division and the attribute value field is equal to the U.S. and that would execute very quickly and that would give in this case record number 1 for division U.S., it would give record number a+4 for division U.S., and it would give record c+2
for division U.S. The record numbers would have to be retrieved to find out that record number 1 is employee Peter, record number a+4 is employee Paul and record number c+2 is employee Robert.

[0194] However, this approach falls apart or fails when we try to do a combination to find everybody who is either in the division U.S. or is a director. In this example, searching for division U.S. would uncover record 1; searching for the level of director would uncover record a+4
for level director, also record a+5 would return, and the problem is that there is no way to indicate that record a+4 is the same person as record a+5 because different record numbers have been retrieved. Reviewing the way it works, when AND and OR and NOT are all used, it assumed that the results included the same record number for the same value. So in this case, if it was desirable to find everybody who was in the division U.S. and was a director that would fail. For division U.S. record number 1 is returned, and next record a+4. Then for level director record a+5 is returned, but those would be different bits; a+4 and a+5 are different bits. When those are AND-ed together, a zero bit would be returned, and the result is that nobody is both division U.S. and is a director where it is very obvious here that Paul is both in the U.S. division and his level is a director.

[0195] Since each attribute name/value pair for a given employee or given entity is stored in a separate record, then the normal retrieval operations using the above-described bit vector mechanism cannot resolve that. One way to do this, which is the way the traditional database would do, is to: (1) execute the queries, (2) come up with a list of record numbers, (3) go through those record numbers, and (4) see if they relate to the same person.

[0196] To overcome these difficulties, an arbitrary index can be generated using the indexing approach described in conjunction with FIGS. 1-14, except without the use of data records. Thus, the index is the database itself. The index is referred to as arbitrary index because it uses arbitrary ID numbers; that is, they are arbitrary in that they are not related to the attribute data that they are indexing. FIG. 18 shows the assignment of an arbitrary ID number to each employee and in other words, employee Peter in this case is assigned the arbitrary ID number 0, Paul is 1, Mary 2, and so forth on through Lisa which is 9, and these numbers should be assigned sequentially and they should be closely packed together because, as discussed further, the ID number really becomes (takes the place of) the record number. It's a pseudo-record number so if they are packed together then they will correspond to a single bit in a contiguous bit vector that will provide the same performance previously discussed. FIG. 18 is an example implementation for the case of a database with employees, where it is desirable to add a field to the employee record. In the master employee record, for each employees there is an employee number, name, date of birth, etc., and if it is desirable to add a field for ID number, assign it in a row. To assign employee ID, it is desirable to start with 0 and continuing sequentially, all while maintaining a counter to track numbers already used.

[0197] FIG. 19 illustrates a sample database, where the Employee ID column is added help in understanding the following figures. This sample data is depicted in FIG. 19 in the form of a table for ease of explanation, although it will be appreciated that no such database table would be used. This data shows, for instance, that employee Peter is ID 0, in department Engineering, in division U.S., level Salaried, the products he is responsible for is associated with Workstations and Servers, and the cost centers are 1100 though 1599.

[0198] FIG. 20 illustrates how the data is stored with a method that utilizes the indexes without the need for records. Recalling the discussion above, an index consists of a key and associated with it is a list of records and, referring back to FIG. 2, everything except item 32
(which is the database list of field names) would be used That is, the records 32 can be eliminated resulting in the arbitrary index 30 that comprise index, model index, etc. For the employee example, the arbitrary index is the employee attribute index. Then in place of list of records it would be lists of ID numbers (corresponding to bits within a bit vector). The query processor would then return a query result ID number list in ID number. From that point, the ID number list could be processed (e.g., sorted), as shown in FIG. 2.

[0199] To properly illustrate the operation of an index, also referring to FIG. 11A, begin with an index for field color, which could be created as a structure of an index for employee attributes and it would work the same. There is the word "fine," colon (:), the slice number, colon (:), then instead of, for example, orange or brown the key would be an entry from this table; and the key would be shown as a pair. It's the attribute name concatenated with the attribute value and in the table of FIG. 20
appears the concatenation with two colons (::), which can be any pattern that is unique that is not part of the data, for example, it could be a null character, it could be a special unicode character, but basically its just a reserved character that indicates that the attribute name has stopped and the attribute value has started. By using this then it becomes transparent to the key management and bit vector management routines that are written to assume just a simple string. As the key, in this case the string is Department::Engineering or Division::Asia and, for purposes of the database management part of the system, it is being treated as one key. The key is really subdivided into two separate strings or a pair of strings which are the attribute name and the attribute value. Now all possible combinations are enumerated, showing that if it desirable to implement either a new attribute name or a new attribute value for an existing attribute name, all that occurs is key creation and then start processing it.

[0200] The following routines are used for such things as creating arbitrary indexes, destroying arbitrary indexes, inserting attribute name/value keys into an arbitrary index, deleting an attribute name/value key from an arbitrary index, showing the arbitrary index keys or enumerating them, and doing a find or query against the arbitrary index retrieving a list of ID's. The database provides exported methods whereby the definition of attributename::value arbitrary indexes, insertion/deletion operations, and retrieval operations can all be performed under the control of applications external to the database. These operations are performed manually by the routines that manage the members of the set, instead of being automatically tied to the creation, modification, and deletion of data records. This allows the indexing and retrieval operations of the database to be utilized by arbitrary sets of arbitrary objects.

[0201] The following operations are specially defined for arbitrary indexes:

[0202] CREATE_ARB_INDEX