Home
Patent Search
IMT Blog
REGISTER
|
SIGN IN
United States Patent
7058929
Charnell , ; et al.
June 6, 2006
Title
Direct invocation of methods using class loader
Abstract
A system and method of direct invocation of Methods using class loaders. The method includes compiling a call to and a Method of a first class (assuming that the Method is final), determining whether the second class includes an instance of the Method of the first class, determining whether the instance of the Method of the second class overrides the Method of the first class, and altering the compiled code of the Method of the first class if the instance of the Method of the second class overrides the Method of the first class.
Inventors:
Charnell; William Thomas
(Bucks,
GB
)
, Plummer; Wayne
(Bucks,
GB
)
, Darnell; Stephen
(Maidenhead,
GB
)
, Dias; Blaise Abel Alec
(Middx,
GB
)
, Guthrie; Philippa Joy
(Bucks,
GB
)
, Kramskoy; Jeremy Paul
(Long Kitton,
GB
)
, Sexton; Jeremy James
(Herts,
GB
)
, Wynn; Michael John
(Maidenhead,
GB
)
, Rautenbach; Keith
(Bucks,
GB
)
, Thomas; Stephen Paul
(Bucks,
GB
)
Assignee:
Esmertec AG
(Dubendorf,
CH
)
Appl. No.:
09/859,135
Filed:
May 16, 2001
Foreign Application Priority Data
Nov 16, 1998 [GB] 9825102.8
Current U.S. Class:
717/135
717/140
717/150
Current International Class:
G06F 9/44 (20060101)
Field of Search:
717/158,166,117,7,154,130-150 707/206
U.S. Patent Documents
20020108106
August 2002
Kramskoy
20030009743
January 2003
Fresko
4675829
June 1987
Clemenson
4924408
May 1990
Highland
5210876
May 1993
Uchida
5301260
April 1994
Miyashita
5301325
April 1994
Benson
5339436
August 1994
Tairaku et al.
5367685
November 1994
Gosling
5442792
August 1995
Chun
5450575
September 1995
Sites
5452457
September 1995
Alpert et al.
5469574
November 1995
Chang et al.
5530964
June 1996
Alpert et al.
5551040
August 1996
Blewett
5590332
December 1996
Baker
5598561
January 1997
Funaki
5603030
February 1997
Gray et al.
5613120
March 1997
Palay et al.
5655122
August 1997
Wu
5675804
October 1997
Sidik et al.
5721854
February 1998
Ebcioglu et al.
5761513
June 1998
Yellin et al.
5764989
June 1998
Gustafsson et al.
5815720
September 1998
Buzbee
5835771
November 1998
Veldhuizen
5848274
December 1998
Hamby et al.
5857104
January 1999
Natarjan et al.
5872978
February 1999
Hoskins
5873104
February 1999
Tremblay et al.
6072953
June 2000
Cohen et al.
6463582
October 2002
Lethin et al.
6490599
December 2002
Kolodner et al.
Other References
Core Java vol. l--Fundamentals, by Cay S. Horstmann, Gary Cornell 1999 Sun Microsystems, Inc. cited by examiner .
Microsoft Computer Dictionary, Fifth Edition, 2002. cited by examiner .
Microsoft Computer Dictionary, Third Edition, 1997. cited by examiner .
"Java in a Nutshell", published by O'Reilly & Associates, Inc., By David Flanagan, 1997. cited by examiner .
Java World--Di Giorgio--Jul. 1997--Use native methods to expand the Java environment. cited by other .
Karaorman, M. et al.--jContractor: a reflective Java library to support design by contract--V 1616, Jul. 19-21, 1999, pp. 175-196, Saint-Malo, Fr. cited by other .
Java Native Interface Specification--Java Native Interface Specification Release May 16, 1997 , Sun Microsystems, Inc., California. cited by other .
Dyadkin, L.J.--Multibox Parsers--ACM Sigplan Notices, Association for Computing Machinery, New York, vol. 29, No. 7, Jul. 1, 1994, p. 54-60. cited by other.~
Primary Examiner:
Dam; Tuan
Assistant Examiner:
Chow; Chih-Ching
Attorney, Agent or Firm:
Caeser, Rivise, Berstein, Cohen & Pokotilow, Ltd.
Parent Case Text
CROSS-REFERENCE TO RELATED APPLICATIONS
This is a continuation of International Application PCT/GB99/00788, filed on Mar. 16, 1999, which claims priority to U.K. Patent Application GB9825102.8, filed on Nov. 16, 1998, now abandoned.
Claims
What is claimed is:
1. A method of invoking a Method of a first class in a computer system, the method comprising: compiling by a compiling system a call to a Method of the first class assuming the Method is final; determining the occurrence of a second class and further determining whether the second class includes an instance of the Method of the first class; determining whether the instance of the Method in the second class overrides the Method of the first class; altering the compiled code of the Method of the first class in response to the further determining whether the instance of the Method in the second class overrides the Method of the first class; and operating the computer system in accordance with the altered compiled code.
2. A method as claimed in claim 1, further comprising marking the Method of the first class as final.
3. A method as claimed in claim 1, further comprising determining whether the Method of the first class is final before altering the compiled code.
4. A method as claimed in claim 1, further comprising determining whether the second class is valid.
5. A method as claimed in claim 4, further comprising throwing an exception if the second class is not valid.
6. A method as claimed in claim 1, wherein compiling includes optimizing the code of the call.
7. A method as claimed in claim 6, wherein optimizing the code includes creating a patch.
8. A method as claimed in claim 7, wherein altering the compiled code includes undoing a patch.
9. A method as claimed in claim 1, wherein altering the compiled code includes deleting code.
Description
Computer System, Computer-Readable Storage Medium and Method of Operating Same, and Method of Operating that System
This invention relates, in its most general aspects, to a computer system and to a method of operating that system, and to improvements in the performance of various operations within such a system. It also relates to a computer-readable storage medium. The computer system may be, may include, or may be part of, a virtual machine. The computer-readable storage medium may contain executable code or other instructions for programming the computer system/virtual machine.
In recent years, there have been developments in programming languages towards what is known as an object-oriented language. In these developments, concepts are regarded as `objects`, each carrying with it a set of data, or attributes, pertinent to that object, as well as information relating to so-called `methods`, that is functions or sub-routines, that can be performed on that object and its data. This is well known to those skilled in the art of computing and/or programming.
The advent and rapid advancement in the spread and availability of computers has led to the independent development of different types of systems, such as the IBM and IBM-compatible PC running IBM-DOS or MS-DOS or MS-Windows applications, the Apple Macintosh machines running their own Apple System operating system, or various Unix machines running their own Unix operating systems. This proliferation of independent systems has led to useful applications being available only in one format and not being capable of running on a machine for which the application was not designed.
Under such circumstances, programmers have devised software which `emulates` the host computer's operating system so that a `foreign` application can be made to run successfully in such a way that, as far as the user is concerned, the emulation is invisible. In other words, the user can perform all of the normal functions of say a Windows-based application on a Unix machine using a Unix-based operating system without noticing that he is doing so.
A particularly notable product of this type is that developed by Insignia Solutions of High Wycombe, GB and Santa Clara, Calif., USA and known under the name `SoftWindows 2.0 for Powermac`. This software enables a physical Macintosh computer to emulate a PC having an Intel 80486DX processor and 80487 maths co-processor plus memory, two hard disks, IBM-style keyboard, colour display and other features normally found on recent versions of the PC-type of computer.
Furthermore, there is an ever-increasing demand by the consumer for electronics gadgetry, communications and control systems which, like computers, have developed independently of one another and have led to incompatibility between operating systems and protocols. For example, remote-control devices for video players, tape players and CD players have similar functions, analogous to `play,` `forward,` `reverse,` `pause,` etc, but the codes for transmission between the remote control, or commander, operated by the user may not be compatible either between different types of equipment made by the same manufacturer or between the same types of equipment made by different manufacturers. There would be clear benefits of having software within the equipment which can produce for example the correct `play` code based upon a `play` command regardless of the specific hardware used in the equipment. Such software is commonly known as a `Virtual Machine.`
Other uses and applications are legion: for example, set-top boxes for decoding television transmissions, remote diagnostic equipment, in-car navigation systems and so-called `Personal Digital Assistants.` Mobile telephones, for instance, can have a system upgrade downloaded to them from any service provider.
Emulation software packages tend to have certain features in common, notably that they are not general purpose but are dedicated. They are of most benefit in rapid development areas and have a distinct advantage in enabling manufacturers to cut costs. In particular, they can divorce software from the physical machine, i.e., the effect of the software in the physical machine can be altered by the emulating software without having to go into the machine's native software to implement those changes.
The specific object-oriented language used in some of the implementations described later is that known as Java (registered trade mark to Sun Microsystems Corporation). Some of the following implementations will enable Java to be used in smaller devices than is currently possible because of the improved performance and/or reduced memory footprint. Future uses projected for embedded software (virtual machines) include computers worn on the body, office equipment, household appliances, and intelligent houses and cars.
While it is recognised that there are clear advantages in the use of virtual machines, especially those using object-oriented languages, there are naturally areas where it is important and/or beneficial for some of the operations that are carried out within the system to be optimised. These may include reducing the memory requirement, increasing the speed of operation, and improving the `transparency` of the system when embedded in another system. One of the principal aims of the inventions described herein is to provide a Virtual Machine which is optimised to work as quickly as possible within a memory constraint of, for example, less than 10, 5, 2 or even 1 Mbyte. Such a constraint is likely to be applicable, for example, to electronics gadgetry and other equipment where cost (or size) is a major constraint.
Reference will be made, where appropriate, purely by way of example, to the accompanying figures of the drawings (which represent schematically the above improvements) in which:
FIG. 1 shows certain components of the virtual machine.
General Considerations
A specific example of a preferred embodiment of virtual machine is now described with reference to FIG. 1.
The virtual machine 20 is an executable code installed in the particular item of equipment 22. It can provide a degree of independence from the hardware and operating system. The virtual machine may typically include any, some, or all of the following features: an operating engine, a library of routines, one or more interpreters, one or more compilers, storage means for storing a plurality of instruction sequences, queue management means, and buffer management means.
The virtual machine is coupled to one or more applications 24 on one side (the "high level" side), and, on the other side (the "low level" side), perhaps via various intermediate logical units, to the hardware 26 of the item of equipment. The hardware can be regarded as including various ports or interfaces 28 (perhaps an interface for accepting user input); the virtual machine receives events from those ports or interfaces. The hardware also includes one or more processors/control means 30
and memory 32.
Agent's Reference No. 1--Computer System, Computer-Readable Storage Medium and Method of Operating Same, and Method of Operating that System
The present invention relates to a computer system and to a method of operating a computer system. In particular, the invention relates to computer systems including a compiler for compiling code for execution. In a preferred embodiment, the invention relates to Dynamic Compilation of the Dominant Path.
This invention is preferably related to the optimisation of the runtime representation of object-oriented computer languages by means of runtime compilation technology and preferably to the optimisation of the runtime representation of object-oriented computer languages by means of runtime compilation technology. Aspects of the invention are related to optimised execution of virtual machines, and in particular Java virtual machines.
The invention relates in particular to trace scheduling, optimising compilers, dynamic compilation, profile guided optimisations, just in time compilers and the Java VM specification.
In some applications, for example using the Java language, code may be interpreted directly using an interpreter. The interpreter translates the code during execution and thus, the interpretation of code can be very slow. The execution of compiled code is therefore preferred since such execution is generally significantly faster than interpretation.
Standard compilers translate all of the code of an application to give a complete compiled runtime representation of the code for execution. Such standard compilation is time consuming, especially where optimisation of the compiled code is desired, and is usually carried out off-line before execution of the code.
The Just-in-Time (JIT) compiler provides on-line compilation of code. For example, using a JIT compiler, when a method is first encountered in the execution of the code, the execution is stopped and the JIT compiler compiles the whole of the method, optimising where possible. Thus the JIT compiler compiles the whole method, including parts of the method which are unlikely to be used. Such compilation wastes time in compilation and the compiled version of the code takes up space in the memory. This can present a particular problem for an embedded application where minimising the use of memory is of importance.
Generally, compilers of the runtime representation of computer languages and in particular so-called Just-in-time (JIT) compilers, compile the representation of a whole method at a time, or a larger unit (for example, a file or one of many classes at a time). Often a significant portion of an application relates to handling exceptional situations, or rarely executed code. Typically, the compiler blocks any further progress of the application until the compilation completes.
The conventional compilation approach therefore spends time compiling code which is rarely executed, and the compiled result occupies space which would have not been needed if the rarely executed code were not present. Optimisation opportunities are often reduced by having to cater for control paths through the rarely executed code.
Offline compilers which use profile input from a previous run of the application can often optimise the frequently executed paths of an application to mitigate the latter problem. However they still must compile every path through the application, and cannot easily react when an application exhibits different behaviour, to that of the profile run.
For the JIT compiler, when the `invoke` instruction for a method is encountered, control is passed to the JIT compiler and, if the method has not previously been compiled, a compiled version is created. The compiled version is then used for the subsequent execution of the method. Once the budgeted memory available to the JIT compiler is used, the compilation of new methods is not possible and the use of the JIT compiler ceases. Methods subsequently found will be interpreted, thus slowing subsequent execution of the non-compiled code.
The amount of memory available to the compiler varies depending on the computer system used. The overall memory allocated to the compiler includes the code buffer space, the space allocated to the compiler for building required internal data structures and for register allocation. That memory is usually set aside for the compiler prior to compilation.
JIT compilers were designed for use on desktop computer systems having plenty of memory. The memory allocated to the compiler is generally so great that the amount of buffer space available to the compiler is, in practice, unlimited.
For embedded systems, however, the amount of memory allocated to the compiler might be 70 or 80K. Clearly, that imposes constraints on the amount of code that may be compiled.
In summary, the Invention described in this application involves any, some or all of the following features, in any combination:
1. Compile fragments of code for the dominant path rather than whole methods.
2. Use execution history to determine which paths through the application are the dominant ones.
3. Use a fallback interpreter to interpret infrequently executed code.
4. Have an online compilation system which can compile code on demand as the application executes. This system does not block progress of the application. The system runs as a separate thread, whose priority is adaptive.
5. Have the ability to incorporate new fragments of code into a running multi-threaded system.
6. Support removal of fragments of code from a running multi-threaded system.
7. Constrain the amount of memory used by the dynamic compiler during its execution at any time.
The invention described in this application aims to, among other things, reduce the performance impact of online compilation, generate code which is optimised for the dominant paths through an application, allow better optimisation of code, within time and memory constraints, reduce the storage overhead of compiled code which is rarely executed, improve application responsiveness in a multi-threaded computer system, and reduce the amount of memory used by the compiler itself.
According to the present invention, there is provided a computer system including a compiler for compiling the code of an application, wherein the compiler is arranged to compile a fragment of the code.
By compiling only fragments of code rather than whole methods, it is made possible only to compile the most desirable sections of code, leaving the less desirable fragments uncompiled.
By this method, the compilation may be made more efficient as only those fragments required are compiled. Also, the memory of the system need not be filled with compiled versions of rarely executed code.
Where reference is made to a fragment of code, it preferably refers to a section of code which represents less than a whole method. Preferably the fragment of code includes one or more blocks of code. It is preferred that the smallest unit of compilation of the code is a block.
A particularly preferred feature of the invention is that the fragment of code is a dominant path fragment of the code.
It will be understood that a dominant path fragment includes a fragment including a number of blocks of code which represents a preferred execution route through the relevant code. For example, where a section of code includes a conditional branch, on repeated execution of code through the branch, one path through the branch is likely to be preferred over another path through the branch. The fragment of code associated with the preferred route through the branch is preferably considered to be a dominant path fragment.
As indicated below, in some cases, another less preferred route through the branch may also be a dominant path.
In a preferred embodiments of the present invention, the dominant path fragments of code include code which is frequently executed. Preferably, the dominant path does not include infrequently executed code. Such infrequently executed code may include, for example, code for handling infrequently encountered exceptions.
By compiling only the dominant path, in accordance with a preferred embodiment of the invention, the storage overhead of storing compiled code which is rarely executed can be minimised. Further, optimisation techniques can be used to optimise the execution of the dominant path code thus increasing the speed of execution of the dominant path code. Further, the compiler need not waste time on-line in compiling rarely executed code and so the overall speed of execution in the system can be improved.
In a preferred embodiment of the invention, a fragment of code is considered to be part of a dominant path if it is executed more than a predetermined number of times.
Preferably, the computer system further includes an execution history recorder for recording the number of times a fragment of code is executed, preferably interpreted.
Preferably the execution history recorder records the number of times a block of code is interpreted.
In preferred embodiments of the invention, as well as recording how many times a particular block has been interpreted, the execution history recorder also records further information regarding the execution of the block, for example, from where the transfer of control into the block came and to where control was transferred out of the block. The recorder preferably also records what type of code was executed in the block.
Preferably a fragment which has been interpreted a number of times which is equal to or greater than a threshold is able to be compiled. Preferably, the threshold is greater than or equal to 2, 5 or even 10.
Thus the frequently executed blocks of code are compiled. It is generally unpreferable for unexecuted blocks to be compiled. In preferred embodiments of the invention, no unexecuted blocks are compiled.
Preferably the system further includes a compiler manager and the execution history recorder is arranged to alert the compiler manager when a fragment of code has been interpreted the threshold number of times. In preferred embodiments of the invention, the compiler manager administers a queue of frequently executed blocks for compilation. Preferably the queue is managed in such a way that only the more frequently executed blocks are chosen from the queue for compilation by the compiler.
Preferably, the threshold is able to be dynamically tuned. For the example above, in which the compiler manage administers a queue, if the queue is persistently long, the threshold is preferably raised so that fewer blocks are sent to the queue for compilation.
It is highly preferable for the execution history recorder to be arranged to record during the execution of the application. It is preferred for the execution history to be collected on-line so that a representation of the dominant path for the particular execution of the application by the system may be determined and used to generate the compiled code. In the alternative, when information regarding the dominant path is captured from a previous run, there is a risk that conditions may have changed from the previous run and the dominant path of the previous run is not a representation of the dominant path of the present run. Furthermore, the dominant path may change during a run.
Preferably, the system further includes an interpreter for interpreting the code of the application and the execution history recorder is arranged to record the interpretation of fragments of code. It is more efficient for the interpreter to manage the execution history recordal. It is envisaged that the recordal of execution of compiled fragments of code could be carried out but in many cases it is thought that it would not be worthwhile having regard to the time and memory required to do so.
Most preferably, the execution history recorder is arranged to record a path of execution from a first fragment to a second fragment. Preferably, the path of execution from a first block to a second block is recorded. In a preferred embodiment, the execution history recorder records, for the execution of a particular block, to where control was transferred from the block. Thus, for a particular block, the most likely successor block can be determined. Thus a dominant path from the particular block can be determined. If the particular block passes the threshold number of executions and is compiled, a dominant path from that particular block through the most likely successors can be compiled.
Thus, preferably, the compiler is arranged to compile a path of fragments.
Preferably, the system is arranged so that only fragments in which all of the code has been executed are able to be compiled. Some sections of code are not always suitable for compilation. If sections of the code have not been executed, the unexecuted portions might include "hidden" code which is unsuitable for compilation. Compilation of such unexecuted code is avoided in preferred embodiments of the invention.
In embodiments of the present invention, a block of code is unsuitable for compilation if it has not executed all the way to a control transfer. As a result, there may still be symbolic resolution required--a job left for the interpreter to implement.
Preferably, the compiled version of the dominant path exposes only one external entry point to the rest of the system. Therefore, assumptions may be made in the compilation of the code. Thus the compiler is preferably arranged to create compiled fragments having only one external entry point.
Where the fragments of code are compiled, preferably the compiler is able to optimise the compiled code. Such optimisations might include inlining. Where compiled code has been optimised, in particular where assumptions have been made when optimising the code which might later prove to be untrue or too limiting, preferably the compiled code is associated with a marker to indicate that a particular optimisation or assumption has been made.
In preferred embodiments of the invention, several optimisations are made, in many cases using various assumptions, to produce particularly efficient compiled code for the dominant path.
Preferably, the system includes a fallback interpreter. Preferably the fallback interpreter is not used when a compiled version of code is available, but is used when no compiled version is available, an exception occurs, or an assumption proves false during execution.
Preferably the system includes an interpreter and at least one portion of compiled code wherein, on execution of the code, at least a first portion of the code is executed from compiled code and at least a second portion of the code is executed from non-compiled code by the interpreter. Preferably, the system uses a fall back interpreter.
This feature is of particular importance and may be provided independently. Thus, a further aspect of the invention provides a computer system including an interpreter and further including the code of an application, the code including at least one portion of compiled code, wherein, on execution of the code, at least a first portion of the code is executed from the compiled code and at least a second portion of the code is executed by the interpreter.
The interpreter can be used where there is no compiled versions of the code available or, for example, where assumptions made in the compilation of the code are found to be untrue. Thus more aggressive optimisation is thus made possible to produce optimised code which might not be `safe` to use in all cases. Where a case is identified in which the compiled version is not safe to use, the fallback interpreter can complete the execution of the necessary code without excessive disruption to the execution and without the need to cease execution while a fresh compiled version of the section of code is produced.
Preferably the system further includes a searching device for determining whether there is a compiled version of a fragment of code. Thus, the possibility of time being wasted when an interpreter interprets a section of compiled code is available, is reduced. Preferably, the compiler is able to compile on-line. Thus the compiler is able to create compiled versions for any new dominant path fragments which may appear during a run.
In a preferred system, the system is multi-threaded. Preferably the compiler runs on a separate thread to the thread executing code.
Preferably, the compiler is able to limit the memory which is used by itself and by the compiled fragments. Thus the compiler preferably has a memory management policy enforced by the compiler to limit the memory used by compilation. This is of particular importance for virtual machines which have limited memory. Preferably the system also includes a deletion device for deletion of compiled code. Thus compiled versions of less frequently used code are able to be deleted to release memory for new compiled code.
The present invention finds particular application for virtual machines, in particular in embedded systems. It is envisaged that the invention could also find general use in systems for which there is the choice of executing compiled code and interpreting code. The invention is of particular use in systems having memory constraints.
The invention also provides a compiler for compiling code in a computer system, the compiler being arranged for the compilation of a fragment of code. Preferably the compiler is arranged for the compilation of a dominant path fragment of the code.
Accordingly, the invention provides a computer system containing a compiler for compiling the operating code of an application, in which only dominant path (or near dominant path) fragments of the code are compiled.
This technique can afford the primary advantage of enhancing performance and reducing compiled space. It is important for a small memory application and involves a mixture of trade offs between memory size, compilation time and performance.
In its preferred form, it also enables the use of key optimisation techniques, involving loops and inlining, without the overhead of global dataflow analysis, and hence allows the compiler itself to execute much faster than compilers that do perform global dataflow analysis. The memory usage of the compiler itself is also much lower.
In the system as defined, advantageously only the dominant path of execution is compiled, rather than all the paths through the code, while the remaining paths are interpreted.
It is a particularly preferred feature that the compiler is operating on-line, in the sense that as the operating code is running parts of it are being compiled; what is termed the dominant path may be constantly changing as execution of the code progresses.
The invention further provides a method of operating a computer system, the computer system including a compiler for compiling the code of an application, wherein a fragment of the code is compiled.
Preferably, the number of times a fragment of code is executed is recorded by an execution history recorder.
In a preferred embodiment wherein the system further includes a compiler manager and the execution history recorder alerts the compiler manager when a fragment of code has been executed a threshold number of times, and preferable wherein the execution history recorder records during the execution of the application.
The invention provides in a further aspect a method of operating a computer system including an interpreter and further including the code of an application, the code including at least one portion of compiled code, wherein the method includes executing at least a first portion of the code from the compiled code and executing at least a second portion of the code using the interpreter.
Preferably the compiler compiles on line. Preferably the memory available to the compiler is limited and preferably the method further includes the step of deleting compiled code.
Also, according to the invention, there is provided a method of operating a computer system containing a compiler for compiling the operating code of an application, the method including compiling only the dominant path fragments of the code.
The method can enhance the performance and reduce the compiled space requirement of the computer system and the memory space requirements of the compiler itself.
Advantageously, information identifying the dominant path is provided from the execution history of the code. The execution history information is preferably derived dynamically as the program runs. The execution history information is advantageously captured from a previous run of the code.
In its preferred embodiment, infrequently executed code is interpreted in a fallback interpreter, whereby preferably execution of the code can continue without the need for compiled code for the infrequently executed code.
Advantageously, an online compilation system is provided which can compile code on demand as the application/program executes whereby compilation information can be generated in response to the appearance of a new frequently executed path.
When the computer system is operating in a multi-threaded system, new fragments of code are preferably incorporated into the multi-threaded system, whereby preferably to achieve smoother operation without stopping running threads.
The invention further provides a method of operating a computer system containing a compiler for compiling the operating code of an application, the method including compiling only the dominant path fragments of the code.
Preferably the method includes compiling a fragment of the code and preferably includes compiling a dominant path fragment of the code.
The invention also provides the use of a fall back interpreter to execute infrequently executed code.
Further provided by the invention is code for a computer system, the code including compiled code produced by a method as aforesaid.
Any, some, or all of the features of any of the aspects of the invention may be applied to any other aspect.
Reference will be made, where appropriate, purely by way of example, to the accompanying figures of the drawings (which represent schematically the above improvements) in which:
FIG. 1A shows paths of execution;
FIG. 1B shows the comparative costs of compiling dominant paths;
FIG. 1C shows a dispatch table;
FIG. 1D is a schematic representation of apparatus for carrying out the invention; and
FIG. 1E shows paths of execution through code.
The following considerations apply to any and all the inventions and aspects of the inventions described above.
1. Compile Fragments of Code for the Dominant Path Rather than Whole Methods.
A summary of a preferred embodiment is as follows:
The compiler takes as input the runtime representation of the source program, and execution history information (which may be obtained as described below). The execution history information could be live (that is, dynamically changing as the program runs), or captured from a previous run of the program.
Execution history information is combined with structural information determined from the runtime representation of the program source, to establish what is the dominant path of the program the compiler should compile. Unexecuted code is preferably never included in the dominant path.
The compiler treats the dominant path as a super-block fragment, laying the code out sequentially, even though the program source may not be. Branches and tests are adjusted where necessary to make the dominant path fall-through. Code and registers are optimised with the assumption that the dominant path will be followed to the end. This improves performance on modem processor architectures. Critically, the dominant path only exposes one external entry point. This greatly simplifies and enhances optimisations.
As shown in FIG. 1A, where the path of execution would leave the dominant path, the appropriate run-time tests are inserted with a forward branch 1000 to some stub code referred to as an "Outlier" 1002. The outlier stub updates any state which the dominant path has not written back yet, before transferring control out of the fragment. The mainline code of dominant paths are generally kept together, as are the outlier stubs as shown at 1002. This improves performance on modem processors, especially where branch prediction software/hardware initially assumes that forward branches are less likely. It also provides better instruction cache behaviour.
Compiling dominant paths of execution allows loop optimisations and inlining to be performed, while simplifying the analysis required for many optimisations. It obviates the need for the compiler to have to resolve symbolic references. That is left to the fallback interpreter.
For example, when loading a new class symbolic references are used, for example, for fields so that when the first time the reference is seen it is necessary to load the class hierarchy satisfying the symbolic references. Where, in a preferred embodiment of the invention, all of the relevant code has been interpreted at least once, the symbolic references have already been resolved before the code is compiled.
Often exceptions need to be recognised in the middle of a loop after some global state has changed. The exception check can be performed early outside the loop, forcing the code into the fallback interpreter, thus allowing the check to be removed from the loop, and code motion to be performed in the presence of those exceptions.
The fallback interpreter will execute the loop and recognise the exception at the right time, albeit more slowly. It is assumed that exceptions rarely occur, and therefore the benefits of the optimised loop will outweigh the disadvantages.
Various optimisations can be made in compiling the code. The optimisations may be made at block level or may be more widespread, in particular where several blocks are involved. An advantage of the preferred embodiments of the invention is that flow analysis need not be carried out. Registers are preferably used for the compiled code to give faster execution of the compiled code.
Where the fall back interpreter is available for use, it is possible to make various assumptions when compiling the code and to omit several safety checks which might otherwise have been required if no fallback interpreter were available. If later any of the assumptions is proved wrong, or if the lack of safety checks would cause something to go wrong, the fallback interpreter can be used to interpret the relevant non-compiled code.
When the compiler is being executed online as the application is executed, the compilation overheads are often critical. By only compiling the dominant path, the compiler is simpler, quicker, and uses less memory for its analysis and therefore can afford to perform more optimisations than would otherwise be feasible, especially in a small memory system.
2. Use Execution History to Determine which Paths Through the Application are the Dominant Ones.
Execution history is captured as the application executes. It is maintained at the block level, when a transfer of control occurs. It is preferred for the execution history recorder to record when a block is entered (when the transfer of control into the block occurs). The execution history recorder may also record other details relating to the execution of the block, for example which is the next block (successor) that was executed after the block in question. Thus information about the preferred route of execution through the blocks of code may be obtained rather than only information about individual blocks.
For each block an entry count and list of successors is kept with a count associated with each. These counts act as an indicator of popularity. Execution history records also contain an indication of what instruction caused the transfer of control which ends the block. Only blocks that have executed up to the transfer of control are candidates. For blocks which have not executed all of the way through, it is not known what type of code is `hidden` in that part of the block which has not been executed. Such hidden code might contain code which requires symbolic resolution. It is therefore preferred that such blocks are not compiled. Where the count of the block is made in the execution history recorder as the control is transferred from the block, only blocks which have executed to the end will be counted. Alternatively, or in addition, checks can be carried out prior to compilation to check whether the block has executed to the end.
When memory is constrained, execution history records are recycled in two ways. Firstly, the list of successors is limited to a small number, and when a new successor is encountered the least popular existing successor is replaced with the new one. When there are no free execution history records, all of the history records associated with the least frequently used method are moved to the free list.
In summary, compilation of a fragment is triggered by the entry count of a block exceeding a given threshold. The threshold may be fixed, or dynamically tuned. However, if the state of the history block indicates that the block is already queued to be compiled, or is not compilable, it is ignored. Such a block may not be queued for compilation.
In a preferred embodiment, when the code is first executed, none of the code is compiled. Execution is initially carried out by the interpreter. As each block is interpreted, the count of the block held by the execution history is increased by one. The execution history recorder records, for each block, from where the transfer of control into the block came and to where the control was transferred from the block. The execution history may also contain further information about the execution of the block, for example the type of code executed in the block. A threshold is set and when the count for a particular block reaches the threshold value, the block is entered on the queue for compilation. The threshold may be 5; when a particular block has been executed 5 times, it is entered on the queue.
The compiler is associated with a compiler manager which manages the queue of blocks for compilation. When a particular block reaches the threshold number of executions, the execution history recorder sends a message to the compiler manager to enter the block on the queue for compilation. The compiler is running on a separate thread and checks at intervals to see whether there is an item for compilation in the queue and, at some time, the compiler will start to compile the block referred to at the top of the queue.
In a preferred embodiment, the queue is managed so that new entries onto the queue are entered at the top of the queue and are therefore most likely to be compiled. When the queue is managed in that way, blocks which reach the threshold many times are more likely to be compiled than blocks which reach the threshold only a few times, or once. So that the queue does not become unmanageable, the compiler manager may delete part or all of the queue from time to time.
If it is found that too many blocks are being queued for compilation, the threshold can be raised. Equally, if few, or no, blocks are being queued for compilation, the threshold can be lowered. This can be carried out dynamically during the execution of the application. The compiler manager can monitor the length of the queue and, if desired, send a message to the execution history recorder to increase or decrease the threshold.
When the compiler compiles a block which is queued by the compiler manager, it may proceed to compile just that single block. It is preferred, however, that the compiler uses the information gathered by the execution history recorder regarding the successors of the block and compiles not only the single block which has reached the threshold but also the most popular successors of the block, thus compiling the most popular path from the block (the dominant path). It will be appreciated that the successors of the block may or may not have been executed the threshold number of times to be eligible for compilation in their own right but, nevertheless, are compiled as a part of the dominant path from a block which has been executed the threshold number of times.
When the compiler takes a block for compilation, it carries out checks to determine whether the block is one which is desirable to compile, for example, if it is able to be compiled, and whether there is already a compiled version of the block available.
The compiler then traces the dominant path (though the most popular successors of the block) until it gets to the end of the method or comes across a piece of code which it is not desirable to compile, for example because a compiled version already exists. Other code which is not desirable to compile would be code which merges back into the dominant path other than at the original block that triggered compilation. Flow analysis would be required for optimal compilation otherwise. The compiler detects and prevents such control flow merges from occurring (having determined the likely flow at a branch, the unlikely flow is handled by generating code to exit the fragment). It will not pass beyond the end of the method but it will follow, for example, invokes to follow the dominant path. When the compiler stops in its tracing of the dominant path, it starts to compile the code, starting at the beginning of the dominant path.
When a compilation triggers, the dominant path can be determined by following the most popular successors a block at a time, including following method calls.
Generally speaking, execution history of the running application is a good indicator of which paths are the dominant ones.
It will be appreciated that, where there are two or more paths through a method, both or all of the paths through the method may be dominant paths and be compiled if the relevant blocks are executed sufficient times.
Execution history does not need to be accurate, and can be updated in a number of ways. Rather than track execution history in compiled code, which would slow execution down significantly, execution history is maintained by the fallback interpreter.
3. Have a Fallback Interpreter which Interprets Infrequently Executed Code.
Having a fallback interpreter means that when infrequent or exceptional code is executed, execution can continue without the presence of compiled code for it. The fallback interpreter maintains execution history. It also means that all issues to do with class resolution can be solely handled by the fallback interpreter.
Where only the dominant path of the code is compiled, where the path of execution leaves the dominant path, interpretation of non-compiled code will be necessary. Furthermore, optimisations may have been carried out in the compilation of the compiled code and, if it is discovered at a later stage that assumptions which were made in the optimisations were incorrect, the fallback interpreter is used to interpret the relevant section of code. Also, the run starts execution using the interpreter before any compiled versions of the code have been created.
It will be seen, therefore, that there are many occasions where it might be necessary to pass control of execution from the compiled version to the interpreter and away from the interpreter when compiled code is available.
As is described in more detail below for a particular embodiment, while the interpreter is translating code, checks are carried out to see if there is a compiled version of the code next to be executed. Thus unnecessary interpretation can be avoided.
Again, as discussed in more detail below, when control is passed to and from the interpreter and between separate pieces of compiled code, special conversion devices are provided. Examples of such devices are "glue code" and "outliers". The conversion devices help to ensure the smooth transfer of execution between compiled versions of the code. They hold, for example, information regarding the address of code to be interpreted at the end of a compiled section and are of particular importance where optimisations have been made in the compiled version to ensure that the variables are up to date and are stored on the correct registers, for example, when the execution is transferred.
For example, when a jump is made from the compiled code to the interpreter, the interpreter expects memory state to be current, so if a memory location has been put into a register for the compiled version, it needs to be returned to the correct memory location before the interpreter proceeds.
4. Have an Online Compilation System which can Compile Code on Demand as the Application Executes.
As and when application behaviour changes, a dynamic compiler can generate optimised code for any new frequently executed paths which show up. By running as a separate thread, this allows the application to continue useful work via the fallback interpreter.
5. Have the Ability to Incorporate New Fragments of Code into a Running Multi-Threaded System.
Smoother operation is obtained if a new fragment of code can be incorporated without stopping running threads.
Once the compiler has completed the compilation of the dominant path for a particular block, it sends a message to the compiler manager that the compilation has been completed. Until complete, the compiled code is kept from the executable code. The compiler manager loads the compiled code in the executable code. The necessary changes are made in the dispatch tables and code cache to indicate that the compiled code is available for the relevant block and where the compiled code is. The introduction of the compiled code is carried out atomically so that the stopping of running threads is not required.
6. Support Removal of Fragments of Code from a Running Multi-Threaded System.
Removal of code fragments is also key to being able to operate in restricted memory environments. It also allows code which was optimised for one dominant path to be replaced with different code when new dominant paths appear. Code can be compiled with optimistic optimisations on the basis that they can be deleted if the optimistic assumptions under which the code was compiled are broken.
As indicated above, where assumptions made about the dominant path are found to be incorrect for subsequent execution of the code, the fallback interpreter can be used to interpret a non-dominant path through the code. However, if a dominant path which has been compiled is subsequently executed infrequently, it would be desirable to remove the compiled version of the code to release the memory used by the compiled version.
In some embodiments, the number of times of execution of each piece of compiled code is monitored and, if it is executed infrequently, can be marked as suitable for deletion.
In a preferred embodiment, the number of times a code buffer is accessed is recorded. Before passing control into a buffer, its execution count is increased. The least popular buffer may be deleted when desirable.
For example, at a certain point, the compiler may run out of code buffer space. A buffer is then deleted. If a count has been made of the number of times control has been passed into the various buffers, the least popular buffer may be deleted. Alternatively, the oldest buffer may be deleted.
It will be appreciated that various checks will usually be carried out before the deletion of the buffer to reduce the risk of disruption to the system. See, for example, Agent's reference no. 6 of this specification.
The fact that compilation costs can be radically reduced is illustrated by the schematic diagram in FIG. 1B in which the comparative time taken up in profiling, compiling and executing at full speed for the invention 1020 and the typical prior art 1022 are shown as a proportion of a 10-second time slot.
Use of the dominant path also allows the dynamic compiler to be memory constrained by truncating a fragment some way along the path when the compiler reaches its budgeted memory limit. This is impossible in prior-art compilers.
Thus, when the compiler has used all of its allocated memory, the compilation of a fragment can be terminated. It will be understood that suitable steps would usually need to be taken so that at the end of the truncated compiled fragment, control can be passed back to the interpreter so that execution can continue at the correct byte code address and with the correct updated parameters and register structures, where required.
It is crucial in small memory computer systems that the compiler adheres to a memory budget. Prior art compilers typically view memory as an unlimited resource. Hence they may consume large amounts of memory during compilation, to build internal representations of its input program, and to hold results of dataflow analysis and the like.
In contrast, the dynamic compiler works within external configurable constraints imposed upon it at system start up or build time. It then compiles as much of a fragment as it can within these constraints. If necessary, it truncates the fragment, by relying on the feedback interpreter to receive control at the truncation point. This is impossible in prior art compilers, where the unit of compilation is a method or greater, and where no interaction with a fallback interpreter is available.
There now follows an example of a run in which execution history is used to determine a dominant path, the dominant path fragment is compiled and execution switches between compiled and non-compiled code.
The system described includes a virtual machine (VM) and includes an interpreter (in C language) and a Java application. The system is multithreaded and includes a Java main thread, a Compiler Manager thread and a compiler thread.
For example, the Java application includes Class A:
TABLE-US-00001 Class A static main ( ) { for (i=f; i<100; i++) Aa=newA( ); a.method(i); }
The Java thread is started: Java A class load A
Class A is loaded and A's dispatch table is loaded. The dispatch table is shown schematically in FIG. 1C. FIG. 1C shows A's dispatch table 1030 having various address entries 1032. For example, the main method is located at address 4000.
The main program of the VM identifies the address of the method main A at 4000 and calls glue code:
call glue (4000)
Glue code is a part of the conversion device which enables the execution to switch between the use of the interpreter and the execution of compiled code. Glue code includes several devices for effecting smooth transfer between the execution of compiled code and non-compiled code. Glue code includes sections for one or more of: 1. updating states of memory locations and register states. 2. passing control to the interpreter when no compiled version of code is available or optimisations made in compiling code are found to be inappropriate. 3. passing control away from the interpreter when a compiled version of code for execution is available.
The conversion device may include outliers as described above for updating the states. For example, when an exception is encountered in execution of compiled code, control may pass first to an outlier for states to be updated before passing to the glue code for instructing the interpreter to begin executing the code for dealing with the exception.
The glue code then calls the interpreter to start to execute code beginning at address 4000:
call interpreter (4000)
The interpreter starts at address 4000 and executes the byte code until it reaches the invoke instruction. The interpreter returns to the glue code which determines that the interpreter is trying to perform the invoke. The interpreter knows where the invoke is in the dispatch table, and tells the glue code.
The glue code takes the object reference for the method off the stack and looks at the dispatch table to get the address for the method.
If a compiled version of the start of the method is available, the address of the compiled version will be entered in the dispatch table, and the compiled version is executed.
If there is no reference to a compiled version of the start of the method, the dispatch table includes an entry for "invoke glue" and a return is effected to a separate section of the glue code which starts interpretation of the method at the relevant address:
call interpreter (5000)
When the interpreter jumps into the method, it sends a message to the execution history recorder that the method is about to be executed.
At the end of the method, there is a return, and the interpreter returns to the glue code which returns the execution to the previous method for interpretation or execution of a compiled version as indicated above.
The glue code includes a dedicated portion for handling returns which ensures that the register, stacks, and so on are correct for the execution of the next piece of code. For example, where the method has been executed from a compiled version and the next piece of code is to be interpreted, anything put onto registers for the compiled version has to be restored into the correct memory location before the next section of code is generated. Thus the return handling glue code restores any states which have been altered as a result of the use of the compiled code.
Thus the return to the glue code further returns to the return handling glue code before execution passes to the next portion of code.
The various portions of glue code described above may all be a part of the same piece of glue code, or may be separate glue code pieces. The updating of the states may be carried out by outliers as described above and in Agent's reference no. 3
of this specification.
A further example below describes the action of the interpreter for a transfer of control other than an invoke.
In this embodiment, the following method has just been invoked and is to be executed using the interpreter:
TABLE-US-00002 void func (int p, int a) { int x = p; for (int i=a; i<p; i++) { x=x/i; } }
The interpreter executes the method in byte code, symbolised in numbered lines as follows:
TABLE-US-00003 Bytecode: Java: O iload_1 x=p; 1 istore_3 2 iload_2 i=a; 3 istore 4 5 goto 16 8 iload_3 x=x/i; 9 iload4 11 idiv 12 istore_3 13 i inc 4 1 i++; 16 iload 4 i<p? - reiterate if true 18 iload_1 19 if_icmplt 8 22 return
The method void func is called for the first time. There is no compiled version so the method starts execution by the interpreter. At execution time, the following blocks (groups of lines of code) are recognised by the interpreter: b.sub.1={0
5} b.sub.2={19} b.sub.3={8 19} (not a basic block) b.sub.4={22}
The interpreter executes the first block b.sub.1. The interpreter runs an execution history recorder in which it records that b.sub.1 has been executed once and has a count of 1. (Preferably, it also records that the successor of b.sub.1, is b.sub.2 and that b.sub.1 was executed all of the way through. For simplicity, references to the recordal of such extra information is omitted below).
At the end of the block, the interpreter consults the code cache to see if there is a compiled version of the next block b.sub.2. (Note that in this example, while there is a transfer of control from one block to another, there is not an invoke and thus there is no return to the glue code. In an alternative embodiment, the interpreter might return to the glue code after every block, but that is likely to be time consuming. In the preferred embodiments described herein, the interpreter only returns to the glue code when a. it encounters an invoke, b. it encounters a return, c. it finds from the code cache that there is a compiled version of the next block, or d. via an exception.
In this case there is no compiled version, so the interpreter proceeds to execute b.sub.2, giving b.sub.2 a count of 1 in the execution history recorder. The interpreter consults the cache again and, finding no compiled version of b.sub.3, proceeds to execute b.sub.3. For the present example, the loop is repeated 3 times so when a return is made from the method by block b.sub.4 (going through the return handler glue code as described above), the counts of the blocks in the execution history recorder are as follows: b.sub.1=1 b.sub.2=1 b.sub.3=3 b.sub.4=1
If the threshold for compilation is 5, none of the blocks b.sub.1, b.sub.2 or b.sub.3 will be queued for compilation.
After the next time the method void func is called, the counts will be as follows: b.sub.1=2 b.sub.2=2 b.sub.3=6 b.sub.4=2
Thus the execution history recorder sends a message to the Compiler Manager to queue b.sub.3 for compilation. At some later time, the compiler will consult the queue, and compile b.sub.3. Before compilation, the compiler determines the dominant path from b.sub.3 using the record for b.sub.3 in the execution history recorder which indicates the successors of b.sub.3. In this simple case, the most popular successor of b.sub.3 is b.sub.3 so that only the single block b.sub.3 representing the loop is compiled. The compilation of b.sub.3 may be optimised for example by using registers to store the values of p, x, i and a. A pre-exception condition check could be inserted for an i=0 check (division by zero) (see Agent's reference no. 2 of this specification). When the compiler has completed the compilation, it notifies the Compiler Manager what compilation has been done, where the compiled version is and whether it includes a method entry point or not. The compiled version is not available for execution at this time.
In due course, the compiler manager will load the compiled version of b.sub.3. The code cache is updated so that the host code address for that part of the method now points to where the compiled code is.
At a later time when the method func is called, the interpreter consults the code cache after execution of b.sub.2 and finds that a compiled version of b.sub.3 is available.
The interpreter returns to the glue code which, as described above, effects the execution of the compiled version of b.sub.3.
At a later time still, the method func will have been executed 5 times so that b.sub.1 and b.sub.2 are queued for compilation.
When b.sub.1 is taken for compilation, the compiler will determine the dominant path from b.sub.1. The successor of b.sub.1 is b.sub.2 (the compiler does not consider b.sub.3 for compilation as part of the dominant path on this occasion because there is already a compiled version).
The fragment b.sub.1 and b.sub.2 is compiled and the dispatch table is updated.
On a subsequent execution, the compiled code for b.sub.1/b.sub.2 is executed, a return is made to the glue code, which effects execution of the b.sub.3 compiled code. If the path from compiled b.sub.1/b.sub.2 to the glue to the compiled b.sub.3
is effected a sufficient number of times, a patch connecting the compiled b.sub.1/b.sub.2 to compiled b.sub.3 may be made. (Patching is described in more detail under Agent's reference no. 12 of this specification). Thus the execution can be made more efficient because the step through the glue is no longer required.
At a later time, a memory manager associated with the compiler manager decides that memory for the compiler should be freed. The oldest buffer chosen for deletion includes the compiled version of b.sub.3. The compiler manager calls the deleter to delete the buffer. Certain checks have to be carried out before deletion (see for example Agent's reference no. 6 of this specification). In the example given above, there is a particular problem because a patch was inserted between the compiled code for b.sub.1/b.sub.2 (which is not deleted) and the compiled code for b.sub.3 (which will be deleted). For a discussion of how this problem may be overcome, see Agent's reference no. 12 of this specification).
FIG. 1D shows apparatus 1040 suitable for carrying out the embodiment described above.
The apparatus 1040 includes an interpreter 1042 for interpreting Java code 1043 in the computer system. When the interpreter reaches the end of a block of code, unless there is an invoke or a return, it consults the code cache using the code cache searcher 1044 to see if a compiled version of the next block is available. If there is, the converter device 1046 (which includes the glue code referred to above) carries out the necessary changes and alterations before passing control to an executer 1048 for executing the compiled version 1049 of the code.
As interpreter 1042 executes, it records in the execution history recorder 1050 which blocks of code have been executed as well as further details about the execution of the block, for example which blocks were executed before and after the block and what type of code was executed.
The execution history recorder 1050 notifies the compiler manager 1052 when a block is executed a threshold number of times. The block is held in a queue 1054 managed by the compiler manager 1052. A threshold tuner 1056 monitors the length of the queue from information from the compiler manager 1052. Based on information regarding the length of the queue, the threshold tuner 1056 alters the threshold for the execution history recorder 1050 to send a block to the compiler manager.
A compiler 1058 compiles blocks referred to in the queue 1054. The compiler 1058 uses information from the execution history recorder 1050 regarding the execution of the block to determine the dominant path from the block and prepares a complied version of the code. When the compiled version is complete, the compiler 1058 notifies the compiler manager 1052 which updates the necessary dispatch tables and code caches and loads the compiled version.
The compiler manager 1052 includes a memory manager 1060 which monitors the memory available to the compiler 1058. If memory available becomes low, the memory manager 1060 instructs a deleter 1062 to delete some of the compiled code. Also, if the queue 1054 becomes too long, the compiler manager 1052 instructs the deleter 1062 to delete some or all of the queue 1054.
FIG. 1E shows paths of execution through code of a method generally referred to as 1066.
The figure shows schematically various fragments of code, for example 1068, 1070, 1072. Such fragments of code may each represent a block of code.
The code shown in the Figure has one external entry point 1074. After block 1072, there is a conditional branch 1076, for example an exception check. If an exception occurs, the execution passes along path A to code 1078 to handle the exception. Otherwise, code passes along path B to code block 1080 at which point there may be a call (path C to block 1082) or the execution may follow path D to code sections 1083, 1084. Execution may pass along path E to block 1085 or path F to block
1086.
Information about execution runs through the code 1066 is recorded on the execution history recorder 1050 run by the interpreter 1042.
If block 1068 is found to have been executed by the interpreter the threshold number of times, it is passed to the queue 1054. The compiler 1058 consults the execution history in the recorder 1050 and finds that: 1. The more popular successor of 1072 is 1080 (that is, execution passed along path B more often than along path A); 2. The more popular successor of 1080 is 1083 (that is, execution passed along D more often than along C); and 3. The more popular successor of 1084 is 1085 (that is, execution passed along D more often than along C).
The compiler 1058 determines that the dominant path is therefore 1068, 1070, 1072, 1080, 1083, 1084, 1085 through the code. The dominant path is indicated as 1088.
While the compiler 1058 was tracing the dominant path 1088, it noted that fragment 1084 was never executed all the way through (path F was never followed). Thus, 1084 is not a suitable candidate for compilation and the dominant path fragment for compilation does not include fragments 1084 or 1085.
Thus the compiled dominant path fragment includes fragments 1068, 1070, 1072, 1080 and 1083.
In any or all of the aforementioned, certain features of the present invention have been implemented using computer software. However, it will of course be clear to the skilled man that many of these features may be implemented using hardware or a combination of hardware and software. Furthermore, it will be readily understood that the functions performed by the hardware, the computer software, and such like are performed on or using electrical and like signals.
Features which relate to the storage of information may be implemented by suitable memory locations or stores. Features which relate to the processing of information may be implemented by a suitable processor or control means, either in software or in hardware or in a combination of the two.
In any or all of the aforementioned, the invention may be embodied in any, some or all of the following forms: it may be embodied in a method of operating a computer system; it may be embodied in the computer system itself; it may be embodied in a computer system when programmed with or adapted or arranged to execute the method of operating that system; and/or it may be embodied in a computer-readable storage medium having a program recorded thereon which is adapted to operate according to the method of operating the system.
As used herein throughout the term "computer system" may be interchanged for "computer", "system", "equipment", "apparatus", "machine" and like terms. The computer system may be or may include a virtual machine.
In any or all of the aforementioned, different features and aspects described above, including method and apparatus features and aspects, may be combined in any appropriate fashion.
It will be understood that the present invention(s) has been described above purely by way of example, and modifications of detail can be made within the scope of the invention.
Each feature disclosed in the description, and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination.
Agent's Reference No. 2--Computer System, Computer-Readable Storage Medium and Method of Operating Same, and Method of Operating that System
The present invention relates to computer systems and to methods of operating computer systems. In particular, the invention preferably relates to a computer system including a compiler for compiling code and to a method of compiling code in a computer system. Preferably the invention relates to computer systems running interpreted languages, for example Java. The invention preferably relates to object-oriented programs (preferably Java). In a preferred embodiment, the invention relates to pre-exception condition checks.
In order to avoid problems arising during the course of a program or method execution in an object-oriented program such as Java, safety systems are normally built in which will detect an impermissible situation and throw an error and/or an exception. The system will usually respond to the exception condition being detected and will cease execution in the area where the exception has been detected. In some such systems, an exception handler will be invoked in order to handle the exception, for example to close down an illegal operation, before allowing the execution to continue.
Java throws both errors and exceptions. For simplicity, these will be referred to herein as `exceptions`. It should be understood that the term `exception` used herein is to be interpreted broadly to include, for example run-time errors, exceptions and other occurrences that occur in the Java language and/or in other languages, unless clear from the context otherwise.
Java is a language which is rich in exceptions. Java also has various mechanisms for dealing with exceptions when they occur.
For example, a section of code may include the term `y=i/z`. If, when the code is executed, z=0, a `divide by zero` exception is thrown. When compiled, the method containing the possible exception is marked to throw an exception.
If a method is invoked in Java which has declared itself to throw an exception, then the Java compiler requires that any method which invokes that method also to declare an exception or to provide an exception handler to deal with the exception. Thus the exception can ripple up the call chain until it is either caught and dealt with by an exception handler or falls off the end of the chain. This will be well understood by those familiar with the Java language who will also appreciate that there are essentially two types of exceptions in Java, namely `checked` and `unchecked`.
A `checked` exception will either be `caught` or `thrown`. Indeed the compiler will force a checked exception to be caught or thrown. In contrast, an `unchecked` exception is more like a runtime error, such as divide-by-zero, and neither Java nor C++ forces declaration of a throw.
Consider the situation where a stack is formed in which a particular exception, such as divide-by-zero, is declared in the uppermost, or oldest, frame a whilst the most recent frames b, c, d and so on are regarded as being added in sequence below frame a. If the exception is encountered in frame d, the evaluation stack for that frame is cleared, the VM creates an exception object and a reference to it will be placed on the evaluation stack of the frame with a matching handier.
The object reference indicates the type of exception and goes to a table for instructions (assuming there are any for that exception) on how the exception is to be handled. For example, the table might indicate that if the exception occurs in any of lines 1 20, it will be handled in line 21.
When the exception in d is encountered, first frame d is searched for the handler but, since the exception is declared in frame a it clearly will not be found so frame d is wiped and the search continues in frame c. The same situation obtains in c, so the search continues backwards, wiping each of d, c and b in turn, until frame a is reached where the handler can be located. It should be emphasised that only local variables are stored in the wiped frames, so there is no loss of valuable information; all global variables (called arrays, objects, static fields in Java) and objects created in other programming languages (for example) remain stored in the heap.
Java is a language rich in exceptions. Java state must be written to as dictated by the semantics of the Java program.
When the Java program is compiled, however, it is possible to make various optimisations. One such optimisation might be possible in the case where the fragment of code to be compiled includes a loop. It is desirable to move any loop invariable operations outside the loop to make execution at run-time more efficient. However, that can give rise to difficulties where an exception may occur within the loop. Thus, in the following simple example, one cannot update "x" before the array access is executed, in case the array access "arr[i]" raises an "index out of bounds" exception. If the write to "x" was incorrectly moved before the access, and an exception did occur, we would now have an incorrect value for "x".
TABLE-US-00004 x = . . . ; b = 10 for (int i=a; i<b; i++) { arr[i]++; x = b }
Standard code-motion optimisations, such as loop invariance, are thus blocked in the presence of such exceptions, which act as barriers across which code cannot be moved.
In the above example, "x" is being written with a loop-invariant value (10). In the presence of the potential exception, we cannot move the write outside of the loop. If "a" did not fall within the range of allowable index values for the array "arr", then the first access to "arr[i]" would raise an exception and "x" would have the same value extant at entry to the loop, and not the value 10. Moreover, the exception check itself executes within the loop body, hence incurring its own execution penalty.
If optimisations are to be made in the case of the compilation of the code of the above example, it would be necessary to carry out an analysis to prove that `i` could never fall outside the range of allowable index values. If that can be proved, then the write to x could be safely moved outside the loop. In order to prove the necessary conditions, complex analysis of the code would be required. In some cases local analysis might be sufficient, for example where it can be shown from an analysis of a basic block of a single method that the exception would not occur. In most cases, however, it will be necessary to look at several blocks, for example back to the block in which the array was created to be able to make the proof. In that case, global data flow analysis (the analysis of an entire single method) or interprocedural analysis (the analysis of the entire program or class) would be required. Clearly, such analysis is time consuming and costly in memory usage and could really only be contemplated for use in off-line compilation. In any case, if it is found as a result of the detailed analysis that the exception might occur, optimisation would in any case not be possible. Thus, such analysis is rarely done in practice at runtime on limited memory systems and optimisations of code in which exceptions may occur are usually not attempted.
Another example involves an exception condition covering the situation where a point may be reached in a division step where the denominator equals zero.
This example involves division of a variable x by another variable i. There may be certain circumstances where i becomes zero, leading to division by zero, a non-calculable function, such as follows:
TABLE-US-00005 int x=p; b=10; for (int i=a; i<b; i++) { x=x/i; y=b; }
It is not advisable to throw the exception too early for fear that the program loop may have executed something which is of value. It is not impossible for a loop including a possible exception to be circulated a large number of times (perhaps on average 10 times) before the exception is raised.
Thus, while it would be desirable to remove the loop invariant term out of the loop to save time at run-time in the repeated execution of the loop, it would not be safe to move the term out of the loop without having carried out detailed analysis.
The present invention seeks to mitigate this and/or other problems.
According to the invention, there is provided a method of compiling a fragment of code including a possible exception, the method including the step of including a pre-exception condition check.
The pre-exception condition check is preferably included in the compiled version of the fragment of code. By using a pre-exception condition check, it can be determined early on before the code which might raise an exception is executed, whether an exception will occur. If the check shows that no exception will occur, it will then be safe to execute the code including the exception.
It will be understood that the pre-exception condition check will preferably be included immediately before the body of the fragment of code in which the exception might occur. Thus, code other than the pre-exception check can be optimised. Preferably, the condition check is included at the beginning of the compiled fragment. That is especially preferred where the fragment contains a loop.
Preferably, the fragment of code is compiled on the assumption that the exception will not occur. When the pre-exception check is used, if the check is passed it is known that the exception will not occur. Thus optimisations may be made which would not have been safe to make if it were not known whether or not the exception would occur. Thus the compiled code can be more efficient, both in terms of the increased speed in executing the code as well as being more compact, thus occupying less memory space.
Preferably, the method includes providing a bailout device for use if the condition check determines that an exception will occur. In many cases, the pre-exception condition check will determine that no exception will occur and execution of the compiled code can proceed. However, in some cases, an exception will occur and the condition check will determine that an exception condition is imminent. The bailout device preferably allows the exception to be encountered in the interpreter at the expected point of execution.
Preferably, if the original code fragment included code for dealing with the exception, that code is not included in the compiled version as a part of the optimisation procedure. In any case, the code is preferably compiled so as not to be cluttered with code for use in the further detection and handling of exceptions which occur infrequently. Rather than the code for dealing with exceptions being compiled, therefore, preferably an interpreter is used to interpret uncompiled code for handling the exception. Preferably, the bailout device is arranged to pass control to an interpreter. The control is forced to pass to the interpreter because, since there is a compiled version of the code, the interpreter would normally not be used for execution of that code.
Thus, in effect, the compiled version of the code is preferably prepared only for use when the exception does not occur and is compiled so as to optimise the compiled code for that situation. Where the exception does occur, the compiled code is preferably not used and the interpreter is used to execute up to the point of detecting the condition, and raising the exception. It would be possible to provide two versions of the compiled code: one for use in the case where the exception occurred and one for use where the exception did not occur, each version of code being optimised for the relevant situation. In many cases however, that would be undesirable, especially where the system was one having limited memory (for example a VM). The compiled version of the code for use where an exception occurred would be infrequently used and would clutter up the memory allocated for compiled versions of code.
Where the compiled code has been optimised, it is possible that the condition of states, (for example the values of integer variables and the register states) when the condition check reveals that the exception will occur, is not the same as for the corresponding uncompiled code. Preferably, the bailout device includes an outlier for updating states.
Preferably, the fragment is a dominant path fragment of code. Preferably, at least part of the code forms a loop. In particular where the memory available is limited, as in a virtual machine, it is highly preferable not to compile code which is infrequently executed. Preferably, the method also includes the step of determining a dominant path through the code. Preferably, infrequently executed code, for example non-dominant path fragments of code are not compiled. Preferably, the compiler compiles only dominant path fragments of code.
According to the invention there is further provided the use of a pre-exception condition check in compiled code.
The invention also provides a compiler for compiling code according to the method described above.
Also provided by the invention is an apparatus for compiling a fragment of code including a possible exception, the apparatus including means for including a pre-exception condition check.
The apparatus is preferably a part of a computer system, preferably a virtual machine. The invention relates in particular to interpreted languages, and has particular relevance to Java.
Preferably, the compiler is arranged to include the condition check at the beginning of the compiled fragment and preferably the compiler is arranged to compile the fragment of code on the assumption that the exception will not occur. This is of particular relevance where the fragment includes a loop.
Preferably, the apparatus includes a bailout device for use if the condition check determines that an exception will occur. The bailout device is preferably provided on compilation by the compiler.
Preferably, the apparatus further includes an interpreter and the bailout device is arranged to pass control to the interpreter. Preferably, the interpreter is arranged to interpret the code for handling the exception.
Preferably, the bailout device includes an outlier for updating states. In particular, where control is relinquished from the execution of compiled code and, it will often be necessary to update states before the control is passed.
Preferably, the fragment is a dominant path fragment of code and preferably the compiler is arranged to compile the dominant path code. Preferably, the compiler is arranged to compile only dominant path fragments of code. Preferably the compiler is an on-line compiler. The execution time impact of the compiler and the amount of memory that it uses can be reduced if the compiler only compiles dominant path fragments of code.
The invention also provides code compiled using a method described above.
According to the invention, there is also provided code for a computer system, the code including a fragment of compiled code including a possible exception, the code further including a pre-exception condition check.
Preferably, the code further includes a bailout device for use if an exception is indicated and preferably, the bailout device includes means for forcing a transfer of control to an interpreter.
Also provided by the invention is a computer-readable storage medium having structured data recorded thereon including code as described above, and also a computer-readable storage medium having a programme recorded thereon for carrying out a method as described above.
Further provided by the invention is a computer system when programmed with a method as aforesaid, and a computer system when programmed according to a method in which a fragment of code including a possible exception is compiled, the method including a pre-exception check.
The invention aims to allow optimisations relating to code motion in the presence of exception conditions within loops, which in turn improves the execution speed of the resulting compiled fragment.
The solution is achieved by use of "pre-exception condition checks", whereby the compiled fragment contains equivalent checks placed prior to the loop entry point.
Advantageously, such a check critically relies upon the presence of the fallback interpreter. If the check detects an exception condition, then control reverts to the fallback interpreter without the possibility of reentering the fragment at this loop entry point. The failback interpreter continues execution at the loop entry point, and hence executes up to the point where the exception is encountered at its correct control point, thus raising the exception with all Java states containing the correct values. If the pre-exception condition check passes however, then the fragment is safely usable, and any code motion optimisations are valid.
In the above example therefore, one could have moved the loop-invariant assignment of "x" out of the loop, so long as it follows the check. This allows omission of the original exception check in the loop, which also offers improved performance.
Preferably all pre-exception condition checks are effected outside any execution loops, to reduce any time penalty of execution of the checks (in particular where the loop may be repeated a large number of times).
Preferably the compiled code includes several pre-exception condition checks, to check for several possible exceptions. Such checks may be arranged as a collection of individual checks, or may include a single check which determines whether any of a number of exception conditions exists.
Preferably, the computer system includes a virtual machine. The method of the invention finds particular application in the context of a virtual machine (VM). A VM requires a small memory footprint in embedded systems, and the present invention allows the footprint of the compiled version of code in the virtual machine to be reduced.
The invention finds particular application for interpreted languages, where an interpreter may be used, and in particular the Java language. The interpreter can be used as a fall back for when an exception is indicated. If the interpreter were not present, a number of different compiled versions of code might have to be provided to deal with alternative routes through the code, for example in the presence of exceptions. Such an arrangement might reduce, or indeed cancel, any benefit in reduced memory space occupied by compiled versions of the code.
There is likely to be a balance between the number of checks which can be inserted into the compiled version of code (the checks incurring a time penalty at execution) and the benefit of reduced execution time in execution of the optimised compiled code.
The benefits of the invention may include increased safety in execution (by use of the condition checks), preferably without incurring increased execution time and memory penalties.
A further advantage of the invention is the choice of the fast (unchecked) route through the compiled fragment or the slow (exception detecting route through the fallback interpreter. The invention enables the fast route to take advantage of code motion (including exception condition checks) outside of a loop, even in the presence of exception conditions within the loop. This choice is unavailable to prior compilers which have compiled the entire method and whose compiled methods do not have the ability to interact with an interpreter to field exception conditions.
By virtue of the invention, the performance of the compiled fragment may be greatly improved due to the ability to move code out of loops. Hence greater freedom is available to the dynamic compiler in its choice and application of optimisations which are not normally available to prior compilers.
According to alternative aspects of the invention, there is provided a computer system including (preferably during the running of a program) means for compiling an exception check to identify the occurrence of an exception condition, and means for executing an exception, when identified by the exception check, in an interpreted language.
Optionally there may also be provided means for carrying out an exception check to identify the occurrence of an exception condition.
In another aspect, the invention provides a method of operating a computer system including the steps of: running a program; compiling an exception check to identify the occurrence of an imminent exception condition, and executing an exception, when identified by the exception check, in an interpreted language.
Preferably, the exception check is carried out outside a processing loop, whereby preferably to avoid the need for the exception check to be carried out at each circulation of the loop. An advantage of the invention is the choice of taking the fast (unchecked) route through the compiler or the slow (exception detecting) route through the interpreter which is not available to prior compilers off-line.
It may be possible, according to the invention, to decide outside the loop that the exception will be reached at some future point in time. When that occurs, control is passed off to the interpreter and therefore there is no necessity for the exception to be checked in each circulation of the loop.
The exception check itself is compiled but interpretation of the exception itself in the slower interpreter serves to save compilation time and, in particular, reduce memory requirements by not having multiple compiled versions of code for dealing with possible exceptions, but does not prejudice optimisation. Indeed, optimisation can be positively enabled. (In Java, exception handling is carried out a programming level.)
Any, some or all of the features of any aspects of the invention may be applied to any other aspect.
The following considerations apply to any and all the inventions and aspects of the inventions described above.
Preferred embodiments of the invention will now be described, purely by way of example, having reference to the accompanying figures of the drawings (which represent schematically the improvements) in which:
FIG. 2A shows apparatus for carrying out the method of the invention;
FIG. 2B shows a fragment of code including an exception; and
FIG. 2C shows a compiled fragment of code in accordance with the present invention. Consider the following example:
A method is called:
invoke func (20,200)
The method func:
TABLE-US-00006 void func (int p, int a) { int x=p; int b=10; int y; for (int i=a; i<b; i++){ x=x/i; y=b; } }
It will be seen that an exception will occur if i=0 and a divide by zero is attempted. Previously, it would not have been possible to move the loop invariant code (y=b) out of the loop because, if the exception occurred, the write to x would be affected.
When the method func is first invoked, it will be executed by the interpreter. If an exception occurs, it will be dealt with in the normal way and, because the code is being interpreted, the write to x will only occur if the exception does not occur. In accordance with a preferred aspect, if fragments of the code of the method func are executed sufficient times by the interpreter such that the fragments are considered to be dominant path fragments of the code, they are queued for compilation. A detailed discussion will be found in Agent's reference no. 1 of this specification.
From that discussion, it will be seen that it is likely that the loop will be compiled first, and that the dominant path for the loop includes only the block or blocks including the loop.
As explained in Agent's reference no. 1 of this specification, the repeating loop represents a third block b.sub.3. The byte code (as translated by the interpreter) can be symbolised as follows (the equivalent Java instruction being indicated):
TABLE-US-00007 Bytecode Java O iload_1 x=p; 1 istore_3 2 sipush 10 b=10; 4 istore 5 6 iload_2 i=a; 7 istore 4 9 goto 21 12 iload_3 x=x/i; 13 iload 4 15 idiv 16 istore_3 17 iload 5 y=b; 19 istore 6 21 iinc 4 1 ++; 24 iload 4 i<p ?. Reiterate if true 26 iload_1 27 if_icmplt 12 30 return
Block b.sub.3 is represented by lines 12 to 27 of the bytecode. When block b.sub.3 has been executed sufficient times, it will be queued for compilation.
The compiler sees that there is a possible `divide by zero` exception in the block b.sub.3. A pre-exception condition check is inserted into the compiled version of the block b.sub.3. In the present case, the check is inserted at the beginning of the compiled fragment. (It could, of course, be inserted at any point before the exception might occur in the compiled code. Where the exception could occur within a loop, preferably the check is inserted prior to the entry point of the loop. Often, as in the present example, the entry point of the loop will, in any case, be the start of a dominant path fragment.)
The compiler will also see that the block b.sub.3 includes a loop invariant term y=b, and that an optimisation can be carried out to remove the loop invariant term from the loop.
A compiled version of block b.sub.3 might be, for example, as shown in the left-hand column below (given in simplified code for clarity). An indication as to the step performed by each section of compiled code is included in the right-hand column.
TABLE-US-00008 Compiled code Step performed cmp i, 0 compare i with zero ble glue_bailout if i is less than or equal to 0, go to the glue code load r.sub.a, b load b into the register store r.sub.a, y y=b (loop invariant step) load r.sub.n, i load registers for start of loop load r.sub.m, x div r.sub.s, r.sub.m, r.sub.n x/i and store result in register s add r.sub.n, 1 i++ cmp r.sub.n, r.sub.a i<b blt if i<b, repeat the loop (from div r.sub.s, r.sub.m, r.sub.n)
The first two lines of the compiled code above include the pre-exception condition check. If i is greater than zero, the check passes and the remainder of the compiled code is executed (from the third line). If i is less than or equal to 0, the second line of code transfers the execution to the glue code of the bailout device as described below. The remainder of the compiled block b.sub.3 is not then executed. Note that the interpreter then interprets the loop from the start of the loop body, through to the point where an exception is detected. Thus the check in the compiled code is giving an early warning of an imminent exception rather than an immediate one. In some case this can reduce the number of steps carried out in the compiled code which have to be "undone" before control is transferred to the interpreter.
It will be seen that various optimisations have been made in the compiled version of the loop. In particular, the loop invariant term y=b has been moved outside the loop. That would not have been safe to do if there had not been a pre-exception condition check present.
The above example has been simplified. In practice there may also be an `index out of bounds` pre-exception condition check (either before or after the i is less than or equal to 0 check), for the situation where i is out of bounds for the execution of the loop. Thus, each section of compiled code may have several pre-exception condition checks. Examples of types of pre-exception condition checks are discussed below.
For a detailed discussion of the execution of code including compiled and non-compiled fragments see Agent's reference nos. 1 and 3 of this specification. A summary of some of the steps is given here for the above example in the case in which the condition check determines that there is an exception condition.
The first line of the compiled code is executed to check if i is less than or equal to 0. If it does, the second line of code directs the execution to a specific entry point of the glue code. The glue code then forces control to pass to the interpreter. The glue code tells the interpreter at which address to start to interpret code (and not to consult the code cache before executing (because the code cache will contain a reference to the compiled version and in this case the compiled version cannot be used)). The glue code indicates to the interpreter to recommence execution at the beginning of the non-compiled version of the block b.sub.3 (from iload.sub.--3, see above). The interpreter sees the exception at the correct time and it is dealt with accordingly. (The interpreter cannot raise the exception too early.)
Once the interpreter has executed the fragment including the exception, the control may pass back through the glue code for the execution of a compiled version of code as discussed in Agent's reference no. 1 of this specification.
Equally, where an `index out of bounds` pre-exception condition check is inserted, if the relevant check fails, control is passed to the glue code, and to the interpreter.
A separate pre-exception condition check could be used for any exception which could occur in the code to be compiled. One pre-exception condition check could be used to check for several possible exceptions.
A suite of such pre-exception checks are available for use, including early typecast check, early bounds check against the possible range of array index values, early null-reference check, early divide by zero, and early object type check, to enable code motion and other early checks to be applied to inlined methods.
A checkcast check proves whether or not an object of a given type can be stored in a field for that type--for example, the check could answer the question whether a `graphics` type object could be stored in a `car` type object.
Java (and other object oriented languages) has a hierarchical structure for classes where if Class A extends a Class O and Class B extends a Class O then Class A and Class B are not related. Conversely, if Class A extends Class O and Class B extends Class A then Class B is a subclass of A and the system could use B where it uses A. Thus it will be seen that there is scope for an exception to arise where the hierarchy of objects in a section of code is not appropriate.
The checkcast condition check checks to see that the class relationship is correct and, if the checkcast check fails, control passes to the bailout device.
A bounds check, as the name implies, proves whether the array index is within the permitted limits, that is, the bounds, of the array, otherwise it refers to the bailout device (glue code) to raise the exception for the index being out of bounds. An example is given above of a situation in which an `index out of bounds` exception might be raised.
A null-reference check identifies whether a field reference is null, in which case nothing can be done with that field.
As an example, consider the following steps: aload s //push reference for an object onto stack getfield At this stage the `gefield` loads the specified field from the object. If the situation arises: aload s getfield (class X, field Y) then if s is null, nothing further can be done and an exception must be raised by the getfield. The pre-exception condition check determines whether there will be a null. If so, the bailout device is called.
A divide-by-zero check, as has already been discussed in the examples above, determines whether a situation will or may be reached where the denominator of a divider function becomes zero, an uncomputable function.
An object type check can best be described as a check to ensure that objects are fitted into the hierarchical structure of an object-oriented system with the correct implementation of methods.
As an illustration of this check, consider the situation where a method might call draw where draw is a method for drawing an object of the Graphics class. If there is no subclass of graphics at that stage which includes a different implementation of draw, it can been assumed that the method draw is final and will not be overridden by a new draw method. Thus, it is assumed that the draw method is not polymorphic, even though it is potentially polymorphic. The code can be compiled with the assumption that the draw method is final. Optimisations can be made based on that assumption, for example inlining of the method draw into the code. See Agent's reference no. 9 of this specification.
The object type check is made to determine whether the called method can appropriately be implemented on the relevant object. In the present example, the check will determine whether the object is a graphics type rather than anything else and whether the draw method is appropriate for the object.
Apparatus for carrying out the method of the present invention is shown schematically in FIG. 2A. The apparatus includes an interpreter 2000 for interpreting code. An execution history recorder 2002 records details of the execution of the code by the interpreter 2000. When a block of code is executed a predetermined number of times, the execution history recorder 2002 notifies the compiler manager 2004 which administers a queue of blocks for compilation. The compiler 2006 consults the queue and takes blocks for compilation, determines the dominant path from the records of the execution history recorder 2002. The compiler also determines whether there are any possible exceptions which may occur in the dominant path fragment to be compiled. If so, the necessary pre-exception condition checks are inserted at the beginning of the compiled fragment of code. The compiler 2006 compiles the fragment and sets up any necessary links to bailout devices 2008. The compiled code is executed by the execution device 2010. If the pre-exception condition check indicates that an exception will occur, the bailout device 2008 transfers to glue code 2014 which passes control to the interpreter 2000 for execution of non-compiled code relating to the exception.
FIG. 2B shows a section of uncompiled Java code 2100. Code section 2100 would be executed using the interpreter 2000.
The section 2100 includes a loop 2102. Within the loop 2102 is a possible exception 2104 (for example a division which might result in a `divide by zero` exception). The loop 2102 also includes a loop invariant term 2106 which it is desired to move out of the loop to increase the speed of execution of the loop 2102.
After several executions of the code 2100, it is found that the code fragment forming the loop 2102 is a dominant path fragment of code and it is queued for compilation. FIG. 2C shows the compiled version of the code fragment (indicated generally as 2108). The compiled code fragment 2108 includes a pre-exception condition check 2112 to check to see whether the exception will occur. The compiled version still includes a loop 2114 but, due to optimisations made in the compilation, it is smaller than before, and quicker to execute. The loop invariant term 2116 has been moved out of the loop 2114, to increase the speed of execution. The pre-exception condition check 2112 includes a path 2118 to a bailout device 2008 for the case in which it is found that an exception will occur.
In any or all of the aforementioned, certain features of the present invention have been implemented using computer software. However, it will of course be clear to the skilled man that any of these features may be implemented using hardware or a combination of hardware and software. Furthermore, it will be readily understood that the functions performed by the hardware, the computer software, and such like are performed on or using electrical and like signals.
Features which relate to the storage of information may be implemented by suitable memory locations or stores. Features which relate to the processing of information may be implemented by a suitable processor or control means, either in software or in hardware or in a combination of the two.
In any or all of the aforementioned, the invention may be embodied in any, some or all of the following forms: it may be embodied in a method of operating a computer system; it may be embodied in the computer system itself; it may be embodied in a computer system when programmed with or adapted or arranged to execute the method of operating that system; and/or it may be embodied in a computer-readable storage medium having a program recorded thereon which is adapted to operate according to the method of operating the system.
As used herein throughout the term `computer system` may be interchanged for `computer`, `system`, `equipment`, `apparatus`, `machine` and like terms. The computer system may be or may include a virtual machine.
In any or all of the aforementioned, different features and aspects described above, including method and apparatus features and aspects, may be combined in any appropriate fashion.
It will be understood that the present invention(s) has been described above purely by way of example, and modifications of detail can be made within the scope of the invention.
Each feature disclosed in the description, and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination.
Agent's Reference No. 3--Computer System, Computer-Readable Storage Medium and Method of Operating Same, and Method of Operating that System
The present invention relates to a computer system and to a method of operating a computer system. Preferably, the invention relates to the management of memory in a computer system, and in particular to the management of cache memory in a computer system. In a preferred embodiment, the invention relates to outliers for spatial separation of infrequent code etc.
In a computer system there are various levels of cache memory. It is of benefit to the system, in terms of improved efficiency and therefore speed, if the caches themselves can be operated efficiently. It has been appreciated pursuant to the present invention that it would be advantageous to have code which is likely to be executed frequently located in the caches and in particular in the fastest cache. In the embodiment of the invention described below, Java code is compiled for faster execution at run-time using a dynamic compiler. In order to improve cache density of useful code (density), as one of the aims of the invention, it would be beneficial to have in the fastest of the caches the compiled code that the dynamic compiler has produced.
Prior art solutions do not maximise the density of cache memory. For example, as is discussed in more detail below, it has been appreciated that the fast caches of prior art systems are often occupied by large amounts of infrequently accessed code reducing the density of frequently accessed code in the cache which may lead to more cache misses. The present invention seeks to mitigate this and/or other problems.
According to a first aspect of the present invention, there is provided a computer system including a compiler, the compiler being arranged to compile dominant path fragments of code.
A dominant path represents a frequently executed path of execution through the code and may include a large number of individual blocks of code. By arranging for the dominant path to be compiled (and preferably only the dominant path to be compiled), the density of useful code in the compiled version of the code is increased since the compiled version includes only code which is executed frequently. Thus the density of useful code in the cache can be increased.
By arranging for the dominant path to be compiled, it is possible to arrange for blocks of code including the most frequently executed paths through the code to be more likely to be stored in the cache, and more likely to be stored in the same (L1) cache as other blocks of the dominant path code