Home
Patent Search
IMT Blog
REGISTER
|
SIGN IN
United States Patent
5428786
Sites
June 27, 1995
Title
Branch resolution via backward symbolic execution
Abstract
Possible values for a computed destination address of an execution transfer instruction are found by a backward search through a flowgraph of a program. During the search, a symbolic expression for the destination address is successively modified to reflect the effect of each prior instruction until the symbolic expression represents an absolute or program-counter relative address, or until the search can no longer continue. The search can no longer continue, for example, when an instruction is reached that affects the value of the expression in an indefinite way. When backward symbolic execution reaches the entry point of a block in the flowgraph, backward symbolic execution proceeds backward to each predecessor block that has not already been examined for the execution transfer instruction. Therefore multiple definite values as well as a value of "unknown" may be found for a computed destination address. Preferably backward symbolic execution is performed while constructing the flowgraph, in order to find the locations of additional instructions. As additional instructions are found, new blocks and new paths between blocks are added to the flowgraph. Backward symbolic execution is repeated when the new paths may provide additional values for the computed destination addresses.
Inventors:
Sites; Richard L.
(Boylston,
MA
)
Assignee:
Digital Equipment Corporation
(Maynard,
MA
)
Appl. No.:
666070
Filed:
March 7, 1991
Current U.S. Class:
717/151
Field of Search:
395/375,700,650,500
U.S. Patent Documents
4951195
August 1990
Fogg, Jr. et al.
5005119
April 1991
Rumbaugh et al.
Foreign Patent Documents
0372835
Jun., 1990
EP
9001738
Feb., 1990
WO
Other References
Morgan et al., 8086/8088: 16-Bit Microprocessor Primer, BYTE/McGraw-Hill, 1982, pp. 130-138. .
Banning, "The XDOS Binary Code Conversion System," COMPCON 89 (Sep. 27, 1989) San Francisco, Calif., pp. 282-287. .
Hunter and Banning, "DOS at RISC," Byte, vol. 14, No. 12, (Nov. 1989), St. Peterborough, United States, pp. 361-368. .
Gaines, "On the Translation of Machine Language Programs," Communications of the Association for Computing Machinery, vol. 8, No. 12, (Dec. 1965), New York, N.Y., pp. 736-741. .
Bergh et al., "HP 3000 Emulation on HP Precision Architecture Computers," Hewlett-Packard Journal, Dec. 1987, pp. 87-89. .
Beyond RISC!-An Essential Guide To Hewlett-Packard Precision Architecture, Wayne E. Holt, Ed., 1988, pp. 225-238. .
Eve M. Tanner, "Providing Programmers with a Driver Debug Technique," Hewlett-Packard Journal, Oct. 1989, pp. 76-80. .
Program Flow Analysis: Theory and Applications, Muchnick & Jones, eds., Prentice-Hall, Englewood Cliffs, N.J., 1981, pp. 160-161, 178-179, 184-187, 264-265, 272-275, 280-283, 294-297. .
The Handbook of Artificial Intelligence, vol. II, Barr & Feigenbaum, eds., William Kaufmann, Los Altos, Calif., 1982, pp. 297-379. .
S. Reiss, "PECAN: Program Development System That Supports Multiple Views," IEEE Transactions on Software Engineering, SE-11, No. 3., Mar. 1985, New York, N.Y., pp. 276-285. .
Michael Saari, "68000 Binary Code Translator," 1987 FORML Conference Proceedings, 1987, pp. 48-52. .
Max Schindler, "Translator Optimizes Transfer of 8-bit Programs to 16-bit Programs," Electronic Design, Jul. 23, 1991, pp. 35-36..~
Primary Examiner:
Lall; Parshotam S.
Assistant Examiner:
Treat; William M.
Attorney, Agent or Firm:
Arnold, White & Durkee
Claims
What is claimed is:
1. A method of operating a digital computer to analyze a series of instructions in a computer program; each of said instructions having a respective program address in said computer program; said computer program including instructions that transfer execution to specified addresses and that together with the program addresses of said instructions define an execution sequence for said instructions, said instructions also including instructions that specify operations that modify contents of memory at specified memory addresses, and instructions that specify operations that modify contents of specified general purpose registers, said program including at least one execution transfer instruction specifying a transfer of program execution to a destination address in said program determined from contents of a specified one of said general purpose registers; said method comprising a search for at least one value for said destination address prior to program execution by the steps of:
a) forming an expression of said destination address in terms of the contents that a designated one of said general purpose registers will have during program execution;
b) inspecting, in reverse order of said execution sequence, instructions in said execution sequence that precede said execution transfer instruction, and
(1) when an inspected instruction specifies modification of the contents of a general purpose register designated in said expression but does not specify a modification such that said expression expresses a value that is definite with respect to the address of said inspected instruction, modifying said expression to account for the specified modification so that said expression expresses said destination address in terms of contents that at least one general purpose register designated in said expression will have during program execution when program execution reaches the program address of the inspected instruction; and
(2) when an inspected instruction specifies modification of the contents of a general purpose register designated in said expression such that said expression expresses a value that is definite with respect to the address of said inspected instruction, determining said value for said destination address from said expression.
2. The method as claimed in claim 1, wherein said expression includes a memory access function expressing said destination address in terms of the contents of memory at a memory address specified by a memory access argument expression, and wherein said method further includes the step of attempting to match the memory access argument expression to a memory address specifier in the inspected instruction when the inspected instruction modifies the contents of memory at a specified memory address, and when a match occurs, replacing the memory access function in said expression with an expression of the modified contents of memory at the specified memory address.
3. The method as claimed in claim 2, wherein said step of attempting to match the memory access argument of said expression to a memory address specifier in the inspected instruction includes searching instructions preceding said inspected instruction to find an instruction that sets said memory address specifier to a form of expression matching said memory access argument expression.
4. The method as claimed in claim 1, further comprising the step of attempting to match said expression to a predefined expression format for a subroutine argument passed to a subroutine during a subroutine call operation, and when a match occurs, recording that a callback may occur to said subroutine argument from said execution transfer instruction.
5. The method as claimed in claim 1, further comprising the step of attempting to match said expression to a predefined expression format for a subroutine argument of a subroutine call instruction when the address of the inspected instruction is a destination address of the subroutine call instruction, and when a match is found, continuing said search by inspecting, in reverse order of said execution sequence, instructions in said execution sequence that precede said subroutine call instruction.
6. The method as claimed in claim 5, wherein said subroutine argument is stored in memory on a stack having a memory address indicated by the contents of a stack register, said subroutine call instruction sets an argument pointer register with a value indicating a memory address of the subroutine argument, and wherein said expression is modified to account for the effect of the subroutine call instruction by substituting a designation of the stack register for a designation of the argument pointer register.
7. The method as claimed in claim 5, wherein said subroutine call instruction passes a number of subroutine arguments and includes a specifier specifying the number of subroutine arguments, and wherein the predefined expression format for an argument of said subroutine is dependent upon the specified number of subroutine arguments.
8. The method as claimed in claim 1, wherein said step of determining said value for said destination address from said expression includes the steps of computing the destination address from the program address of the inspected instruction when the inspected instruction includes a register specifier specifying a program counter register.
9. The method as claimed in claim 1, wherein said expression is stored in a fixed format including a memory access function flag, a register designation number, and a displacement constant.
10. The method as claimed in claim 1, wherein said expression is stored in a variable-length format including a string of characters selected from a predefined set of register designators, operators and numerical constants.
11. The method as claimed in claim 10, wherein said operators include addition, multiplication, and a memory access operation.
12. The method as claimed in claim 10, wherein said modifying said expression includes substituting in said expression a new expression for each occurrence of a register designator designating a register specified by a register specifier of said inspected instruction, and then simplifying the expression by applying rules of algebraic simplification.
13. The method as claimed in claim 12, wherein said expression expresses a value that is definite with respect to the address of said inspected instruction when said expression simplifies to an absolute or program-counter relative program address.
14. The method as claimed in claim 1, further comprising the step of validating said value for said destination address after said value is determined from said expression, and wherein said validating tests whether said destination address has an access mode that is either absolute, absolute indirect, program-counter relative, or program-counter relative indirect, and rejects as invalid destination addresses that do not fall within a respective predefined set of valid addresses for each access mode.
15. The method as claimed in claim 1, wherein said instructions include instructions having opcodes, auto-increment specifiers, and auto-decrement specifiers, and wherein said step of modifying said expression to account for the specified modification of the contents of said designated one of said general purpose registers comprises the following steps in sequence:
(i) for each auto-increment specifier in the instruction that specifies an auto-increment register designated in the expression, incrementing a displacement in the expression that is associated with the auto-increment register designated in the expression; and then
(ii) for each operation specified by an opcode in the instruction that modifies the contents of a general purpose register designated in the expression or that modifies a memory location indicated by a memory access argument in the expression, modifying said expression to account for the modification by the opcode; and then
(iii) for each auto-decrement specifier in the instruction that specifies a auto-decrement register designated in the expression, decrementing a displacement in the expression that is associated with the auto-decrement register designated in the expression.
16. The method as claimed in claim 1, wherein said step of inspecting further includes:
(3) when an inspected instruction specifies a modification of the contents of a general purpose register designated in the expression but said modification is a predetermined kind of modification that cannot be accounted for by modifying said expression in a predetermined way, determining a value of "unknown" for said expression and terminating the inspecting of instructions that precede said inspected instruction.
17. The method as claimed in claim 1, wherein said step of inspecting further includes:
(3) when an inspected instruction specifies a modification of the contents of said designated one of said general purpose registers and said inspected instruction has an opcode in a predefined subset of instruction opcodes for which modification of said expression is not permitted, determining a value of "unknown" for said expression and terminating the inspecting of instructions that precede said inspected instruction.
18. The method as claimed in claim 1, wherein said step of inspecting further includes:
(3) when an inspected instruction has an opcode in a predefined subset of instruction opcodes which modify neither the contents of the general purpose registers nor addressable memory locations, discontinuing the inspecting of the instruction without any modification of said expression to account for any effect of the instruction and without determining any value for said destination address from said inspected instruction.
19. The method as claimed in claim 1, wherein said execution sequence includes a plurality of execution paths in said sequence prior to said execution transfer instruction, and wherein said step of inspecting includes inspecting instructions in reverse order backwards through each of said execution paths to determine a plurality of values for said destination address including a value for said destination address obtained by inspecting instructions in reverse order of said execution sequence through each of said paths.
20. The method as claimed in claim 19, wherein at least one of said execution paths loops back onto itself, and wherein said inspecting instructions in reverse order backwards through each of said execution paths includes the step of checking whether said each of said execution paths loop back onto itself, and terminating the inspecting of instructions in said each of said execution paths when said each of said execution paths loop back onto itself.
21. The method as claimed in claim 19, further comprising the steps of: grouping consecutive ones of said instructions into blocks, wherein the instructions in each block are executed in program address sequence and said execution transfer paths occur from a last instruction in one of said blocks to a first instruction in one of said blocks; and generating for each block a list of predecessor blocks from which execution transfer paths to said each block originate, and a list of successor blocks to which execution transfer paths from said each block extend; and wherein said step of inspecting instructions in reverse order backwards through each of said execution paths includes inspecting instructions in reverse order backwards through the block including the execution transfer instruction and through predecessor blocks beginning with the predecessors of said block including the execution transfer instruction.
22. The method as claimed in claim 21, wherein each of said blocks has an epoch number attribute, and wherein said search for at least one value of said destination address is assigned an epoch number, and wherein each block is searched no more than once for a value for said destination address by setting the epoch number attribute of said each block equal to the epoch number when the block is searched, and detecting an attempt to search said each block a second time by comparing the epoch number to the epoch number attribute of the block.
23. The method as claimed in claim 19, wherein said computer program includes a sequence of instructions beginning at a destination address found for said execution transfer instruction, said sequence of instructions beginning at said destination address found for said execution transfer instruction defines an execution transfer path that loops back to an instruction in said program preceding said execution transfer instruction in said execution sequence, and wherein said method further comprises the step of decoding said sequence of instructions beginning at said destination address of said execution transfer instruction to find said execution transfer path to said instruction in said program preceding said execution transfer instruction in said execution sequence, and thereupon again searching for a value of said destination address of said execution transfer instruction by inspecting, in reverse order of said execution sequence, decoded instructions which precede in said execution sequence said execution transfer path that loops back.
24. A method of operating a digital computer to search for instructions in an original computer program and to translate the instructions that are found to generate a translated program; each of said instructions having a respective program address in said computer program; said computer program including instructions that transfer execution to specified addresses and that together with the program addresses of said instructions define an execution sequence for said instructions, said execution sequence beginning at an entry point address in said program; said instructions also including instructions that specify operations that modify contents of memory at specified memory addresses, and instructions that specify operations that modify contents of specified general purpose registers, said program including at least one execution transfer instruction specifying a transfer of program execution to a destination address in said program determined from contents of a specified one of said general purpose registers; said method comprising the steps of:
(a) decoding instructions beginning at said entry point address, and when decoding said instructions, recognizing that said execution transfer instruction has a destination address determined by the contents of a specified one of said general purpose registers, and thereupon performing a search for at least one value of said destination address, said search including
(1) forming an expression of said destination address in terms of the contents that a designated one of said general purpose registers will have during program execution,
(2) inspecting, in reverse order of said execution sequence, instructions in said execution sequence that precede said execution transfer instruction, and
(i) when an inspected instruction specifies modification of the contents of a general purpose register designated in said expression but does not specify a modification such that said expression expresses a value that is definite with respect to the address of said inspected instruction, modifying said expression to account for the specified modification so that said expression expresses said destination address in terms of contents that at least one designated general purpose register will have during program execution when program execution reaches the program address of the inspected instruction, and
(ii) when an inspected instruction specifies modification of the contents of a general purpose register designated in said expression such that said expression expresses a value that is definite with respect to the address of said inspected instruction, determining said value for said destination address from said expression,
(b) decoding instructions beginning at the destination address found for the execution transfer instruction, and
(c) translating the decoded instructions to generate said translated program.
25. The method as claimed in claim 24, wherein said expression includes a memory access function expressing said destination address in terms of the contents of memory at a memory address specified by a memory access argument expression, and wherein said method further includes the step of attempting to match the memory access argument expression to a memory address specifier in the inspected instruction when the inspected instruction modifies the contents of memory at a specified memory address, and when a match occurs, replacing the memory access function in said expression with an expression of the modified contents of memory at the specified memory address.
26. The method as claimed in claim 25, wherein said step of attempting to match the memory access argument of said expression to a memory address specifier in the inspected instruction includes searching instructions preceding said inspected instruction to find an instruction that sets said memory address specifier to a form of expression matching said memory access argument expression.
27. The method as claimed in claim 24, further comprising the step of attempting to match said expression to a predefined expression format for a subroutine argument passed to a subroutine during a subroutine call operation, and when a match occurs, recording that a callback may occur to said subroutine parameter from said execution transfer instruction.
28. The method as claimed in claim 24, further comprising the step of attempting to match said expression to a predefined expression format for a subroutine argument of a subroutine call instruction when the address of the inspected instruction is a destination address of a subroutine call instruction, and when a match is found, continuing said search by inspecting, in reverse order of said execution sequence, instructions in said execution sequence that precede said subroutine call instruction.
29. The method as claimed in claim 24, wherein said execution sequence includes a plurality of execution paths in said sequence prior to said execution transfer instruction, and wherein said step of inspecting includes inspecting instructions in reverse order backwards through each of said execution paths to determine a plurality of values for said destination address including a value for said destination address obtained by inspecting instructions in reverse order of said execution sequence through each of said paths.
30. The method as claimed in claim 29, wherein at least one of said execution paths loops back onto itself, and wherein said inspecting instructions in reverse order backwards through each of said execution paths includes the step of checking whether said each of said execution paths loop back onto itself, and terminating the inspecting of instructions in said each of said execution paths when said each of said execution paths loop back onto itself.
31. The method as claimed in claim 29, further comprising the steps of: grouping consecutive ones of said instructions into blocks, wherein the instructions in each block are executed in program address sequence and said execution transfer paths occur from a last instruction in one of said blocks to a first instruction in one of said blocks; and generating for each block a list of predecessor blocks from which execution transfer paths to said each block originate, and a list of successor blocks to which execution transfer paths from said each block extend; and wherein said step of inspecting instructions in reverse order backwards through each of said execution paths includes inspecting instructions in reverse order backwards through the block including the execution transfer instruction and through predecessor blocks beginning with the predecessors of the block including the execution transfer instruction.
32. The method as claimed in claim 29, wherein said computer program includes a sequence of instructions beginning at said destination address found for the execution transfer instruction, said sequence of instructions define an execution transfer path that loops back to an instruction in said program preceding said execution transfer instruction in said execution sequence, and wherein said decoding of instructions beginning at said destination address of said execution transfer instruction finds said execution transfer path to said instruction in said program preceding said execution transfer instruction in said execution sequence, and thereupon initiates a search for another value of said destination address of said execution transfer instruction by inspecting, in reverse order of said execution sequence, decoded instructions which precede in said execution sequence said execution transfer path that loops back.
33. A digital computer system for searching for instructions in an original computer program and translating the instructions that are found to generate a translated program; each of said instructions having a respective program address in said computer program; said computer program including instructions that transfer execution to specified addresses and that together with the program addresses of said instructions define an execution sequence for said instructions, said execution sequence beginning at an entry point address in said program; said instructions also including instructions that specify operations that modify contents of memory at specified memory addresses, and instructions that specify operations that modify contents of specified general purpose registers, said program including at least one execution transfer instruction specifying a transfer of program execution to a destination address in said program determined from contents of a specified one of said general purpose registers; said digital computer system comprising, in combination:
(a) means for decoding instructions beginning at said entry point address, and when decoding said instructions, recognizing that said execution transfer instruction has a destination address determined by the contents of a specified one of said general purpose registers, and thereupon performing a search for at least one value of said destination address, said search including
(1) forming an expression of said destination address in terms of the contents that a designated one of said general purpose registers will have during program execution,
(2) inspecting, in reverse order of said execution sequence, instructions in said execution sequence that precede said execution transfer instruction, and
(i) when an inspected instruction specifies modification of the contents of a general purpose register designated in said expression but does not specify a modification such that said expression expresses a value that is definite with respect to the address of said inspected instruction, modifying said expression to account for the specified modification so that said expression expresses said destination address in terms of contents that at least one general purpose register designated in said expression will have during program execution when program execution reaches the program address of the inspected instruction, and
(ii) when an inspected instruction specifies modification of the contents of a general purpose register designated in said expression such that said expression expresses a value that is definite with respect to the address of said inspected instruction, determining said value for said destination address from said expression;
(b) means for decoding instructions beginning at the destination address found for the execution transfer instruction; and
(c) means for translating the decoded instructions to generate said translated program.
34. The digital computer system as claimed in claim 33, wherein said means for decoding instructions beginning at said destination address found for said execution transfer instruction includes means for finding execution transfer paths to instructions in said program preceding said execution transfer instruction in said execution sequence, and thereupon initiating searches for additional values of said destination address of said execution transfer instruction by inspecting, in reverse order of said execution sequence, decoded instructions which precede in said execution sequence said execution transfer paths to instructions in said program preceding said execution transfer instruction in said sequence.
Description
RELATED APPLICATIONS
This application discloses subject matter that is related to subject matter disclosed in the following copending applications, which are filed herewith and assigned to Digital Equipment Corporation, the assignee of the present invention, and are incorporated by reference herein:
Richard L. Sites, LOCATING PROGRAM CODE BY SUCCESSIVE CODE EXECUTION AND INTERPRETATION, U.S. application Ser. No. 07/666,212 filed Mar. 7, 1991;
Richard L. Sites, USE OF STACK DEPTH TO IDENTIFY MACHINE CODE MISTAKES, U.S. application Ser. No. 07/666,210 filed Mar. 7, 1991;
Scott Robinson, Richard L. Sites, and Richard Witek, IMPROVED SYSTEM AND METHOD FOR PRESERVING INSTRUCTION STATE-ATOMICITY FOR TRANSLATED PROGRAM CODE, U.S. application Ser. No. 07/666,071 filed Mar. 7, 1991;
Richard L. Sites, CROSS-IMAGE REFERENCING OF PROGRAM CODE, U.S. application Ser. No. 07/666,223 filed Mar. 7, 1991;
Scott Robinson and Richard L. Sites, IMPROVED SYSTEM AND METHOD FOR PRESERVING INSTRUCTION GRANULARITY FOR TRANSLATED PROGRAM CODE, U.S. application Ser. No. 07/666,091 filed Mar. 7, 1991;
Thomas R. Benson, USE OF STACK DEPTH TO IDENTIFY ARCHITECTURE AND CALLING STANDARD DEPENDENCIES IN MACHINE CODE, U.S. application Ser. No. 07/666,083 filed Mar. 7, 1991;
Thomas R. Benson, REGISTER USAGE TRACKING TO SUPPORT COMPILED 32-BIT CODE IN 64-BIT ENVIRONMENT, U.S. application Ser. No. 07/666,084 filed Mar. 7, 1991;
Thomas R. Benson, MAPPING ASSEMBLY LANGUAGE ARGUMENT LIST REFERENCES ACROSS MACHINE ARCHITECTURES, U.S. application Ser. No. 07/666,085 filed Mar. 7, 1991;
Thomas R. Benson, TRACKING VAX.TM. CONDITION CODES FOR PORTING TO RISC ARCHITECTURE, U.S. application Ser. No. 07/666,082 filed Mar. 7, 1991;
Daniel L. Murphy, EFFICIENT AND FLEXIBLE LINK OF PROGRAM UNITS AT PROGRAM ACTIVATION, U.S. application Ser. No. 07/666,023 filed Mar. 7, 1991; and
Daniel L. Murphy, AUTOMATIC ADJUSTMENT OF INTERFACE CONVENTIONS BETWEEN TWO DISSIMILAR PROGRAM UNITS, U.S. application Ser No. 07/666,028 filed Mar. 7, 1991.
Richard L. Sites, AUTOMATIC FLOWCHART GENERATION FOR PROGRAM ANALYSIS AND TRANSLATION, U.S. application Ser. No. 07/666,196 filed Mar. 7, 1991;
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the translation of program code for a digital computer, and more particularly to the translation of program code from one language to another in cases where the location of all of the program code to be translated is not known until the program code is actually executed.
2. Description of the Background Art
Computer language translation programs are well known for translating high-level languages such as Pascal, Fortran, Cobol, PL/I or C into machine language. For these languages, programs are coded in an English-like style. A language translation program called a compiler reads the high level language program (called the source program) and translates it into a machine language program (called the object program).
One major advantage of high level languages, besides the ease with which they can express algorithms, is their machine independence. They hide the specifics of the hardware machine and instruction set. Nonetheless, there are a number of applications where machine language programming is desirable. To increase the execution speed of a program, it is often desirable to write code for repetitively executed procedures in machine language to minimize the number of machine cycles that are needed to execute the procedures. Machine language programming is also required in many computer systems to directly control specific hardware features of a particular computer. For example, the parts of an operating system that manage memory and input/output devices are often written in machine language.
Machine language programming is usually written in assembly language code instead of binary code. Assembly language permits the programmer to specify machine operations using symbolic names for memory locations and instructions. A program called an assembler translates the assembly language program into binary machine code. The assembler does all of the work of remembering the values of symbols and the addresses of data elements. However, unlike the high level language, each assembly language instruction corresponds to exactly one machine instruction.
More recently there has arisen a need to translate machine language for one kind of computer to machine language for another kind of computer. This need has arisen due to rapid advances in computer hardware that have made new computer architectures more cost effective. In particular, for more than a decade most high performance computers for general purpose applications used a "complex instruction set architecture" (CISC) characterized by having a large number of instructions in the instruction set, often including variable-length instructions and memory-to-memory instructions with complex memory accessing modes. The VAX.TM. instruction set is a primary example of CISC and employs instructions having one to two byte opcodes plus from zero to six operand specifiers, where each operand specifier is from one byte to eighteen bytes in length. The size of the operand specifier depends upon the addressing mode, size of displacement (byte, word or longword), etc. The first byte of the operand specifier describes the addressing mode for that operand, while the opcode defines the number of operands: zero to six. The opcode itself, however, does not always determine the total length of the instruction, because many opcodes can be used with operand specifiers of different lengths. Another characteristic of the VAX.TM. instruction set is the use of byte or byte string memory references, in addition to quadword or longword references; that is, a memory reference may be of a length variable from one byte to multiple words, including unaligned byte references.
The CISC architecture provided compactness of code, and also made assembly language programming easier. When the central processor units (CPU) were much faster than memory, it was advantageous to do more work per instruction, because otherwise the CPU would spend an inordinate amount of time waiting for the memory to deliver instructions. Recently, however, advances in memory speed as well as techniques such as on-chip cache and hierarchical cache have eliminated the primary advantages of the CISC architecture. Therefore the selection of the instruction architecture is now dictated by the required complexity of the CPU for maximizing execution speed at reasonable cost. These considerations indicate that a reduced instruction set architecture (RISC) has superior performance and cost advantages.
Reduced instruction set or RISC processors are characterized by a smaller number of instructions which are simple to decode, and by the requirement that all arithmetic/logic operations are performed register-to-register. Another feature is that complex memory accesses are not permitted; all memory accesses are register load/store operations, and there are only a small number of relatively simple addressing modes, i.e., only a few ways of specifying operand addresses. Instructions are of only one length, and memory accesses are of a standard data width, usually aligned. Instruction execution is of the direct hardwired type, as distinct from microcoding. There is a fixed instruction cycle time, and the instructions are defined to be relatively simple so that they all execute in one short cycle (on average, since pipelining will spread the actual execution over several cycles).
Unfortunately there is a vast amount of computer software already written for established instruction architectures, and much of that software includes machine language programming that did not originate from the compilation of a high-level language. In these cases the software can not be "ported" to the new computer architecture by the usual method of re-compiling the source code using a compiler written for the new instruction architecture.
In some cases, assembly language code exists for the machine language programming of existing computer software. Therefore it should be possible to write a translator program for translating each assembly language instruction into one or more machine instructions in the new instruction architecture that perform the same basic function. The practicality of such a direct translation is dependent upon the compatibility of the new instruction architecture. For translating CISC code including VAX.TM. instructions to RISC code, for example, the practicality of the translation is improved significantly by innovations in the RISC CPU hardware and the RISC instruction set, as further described in Richard L. Sites and Richard T. Witek, "Branch Prediction in High-Performance Processor," U.S. application Ser. No. 07/547,589 filed Jun. 29, 1990, herein incorporated by reference.
In many cases existing computer software includes binary machine language code for which there does not exist a complete or coherent set of high-level or assembly language source code. This presents a very difficult problem of locating all of the binary machine code to translate. In the usual case a portion of the binary machine code in a program cannot be found prior to execution time because the code includes at least one execution transfer instruction, such as a "Jump" or "Call" instruction, having a computed destination address. At execution time, the destination address is computed, and execution is transferred from the instruction to the "missing" code.
In more unusual cases, some of the binary machine code in a program is not created until execution time. These unusual cases are typically due to poor programming techniques, although code is often created at execution time for security purposes, for example, as part of a "license check" routine. The "license check" routine, for example, writes a sequence of instructions to a scratch memory area and then executes the sequence of instructions. To circumvent the licensing routine, one must discern the mode of operation of the routine from the sequence of instructions written to the scratch area. But the sequence of instructions is absent from the usual print-out or dump of the program, because the sequence of instructions does not exist until execution time.
When there is a problem of locating all of the machine code in an original program, the code has been interpreted at execution time. The translation process of the interpreter is similar to an assembly language translator, but interpretation at execution time is about two orders of magnitude slower than execution of translated code, due to the fact that a multiplicity of instructions in the interpreter program must be executed to interpret each machine instruction. To permit the use of an incomplete translation, an interpreter program, a copy of the original program, and address conversion information is provided together with the translated code when porting the original program to a CPU using the new instruction architecture. When execution of the translated code reaches a point corresponding to an execution transfer in the original program to untranslated code, the CPU transfers execution instead to the interpreter program to interpret the untranslated code in the original program. The interpreter successively translates each untranslated machine instruction into one or more machine instructions for the new architecture and executes them. The interpreter uses the address conversion information to transfer execution back to the translated program at an appropriate time. The presence of untranslated code, however, has a substantial impact on performance unless almost all of the instructions can be located and translated prior to execution time.
SUMMARY OF THE INVENTION
As introduced above, program code cannot be located in the usual case because the code includes an execution transfer instruction having a computed destination address. In accordance with an important aspect of the invention, the computed destination address is resolved by a backward search from the execution transfer instruction through the program to find at least one prior instruction which is used to resolve the computed destination address to an address value that is fixed with respect to the addresses of the instructions in the program. Therefore, at least one additional instruction for the program is found at the address value to which the computed destination address is resolved.
Preferably a plurality of prior instructions are used to resolve the computed destination address by a process of backward symbolic execution in which the destination address for the execution transfer instruction is represented by a symbolic expression, and the expression is successively modified to reflect the effect of each prior instruction. For example, suppose the execution transfer instruction is "JMP R6+4" which transfers execution to the absolute address computed by adding four to the contents of a general purpose register R6. The computed destination address, for example, is represented symbolically as "R6+4". Suppose that in the backward direction through the program, the next instruction that references R6 is "MOVAB 4(R5),R6" which adds 4 to the contents of a general purpose register R5 and puts the sum into R6. Then the symbolic expression "R6+4" is pushed back through the prior "MOVAB 4(R5),R6" instruction to obtain the modified expression "R5+8". The modified expression has the same value before the prior instruction as the symbolic expression has after the prior instruction.
The process of backward symbolic execution continues until either the computed destination address is resolved to an absolute or image-relative address, or a prior instruction is reached which affects the value of the computed destination address but for which backward symbolic execution is not permitted. Backward symbolic execution is not permitted, for example, for certain instructions (such as XORB2 R1,R2) for which backward symbolic execution is unable or unlikely to resolve the computed destination address. In this case it is convenient to reduce the computed destination address to a symbolic value of "UNKNOWN" to indicate the possibility of missing code. Permitting symbolic execution for only a limited set of instructions (such as additions, subtractions, loads, and stores involving general purpose registers), however, allows the resolution of a large fraction of the computed destination addresses found in typical programs.
Preferably backward symbolic execution to resolve computed destination addresses is performed while constructing a flow graph of the instructions found for the program. The flow graph includes execution paths interconnecting basic blocks of the instructions found for the program. Each basic block is a sequence of contiguous instructions that has a single known entry point at the beginning of the basic block and a single exit point at the end of the block. Execution is transferred only to the beginnings of the basic blocks, and execution is transferred only from the ends of the basic blocks.
When backward symbolic execution for the computed destination address of a given execution transfer instruction reaches the entry point of a basic block, backward symbolic execution proceeds backward to each predecessor block that has not already been examined for the given execution transfer instruction. This last qualification avoids infinite loops in the backward symbolic execution. For this purpose, for example, an "epoch number" is incremented each time that backward symbolic execution is begun for a different given execution transfer instruction having a computed destination address, and that value of the epoch number is assigned to an epoch number attribute for each block when the backward symbolic execution is performed through each block.
Because backward symbolic execution may proceed from the entry point of one basic block to the ends of a plurality of other blocks, it is possible that the computed destination address of a given execution transfer instruction may be resolved to a plurality of different values. It is also possible that the process of backward symbolic execution will cause a series of contiguous instructions identified as a basic block to be re-identified as two basic blocks. This happens when a computed destination address is resolved to be an address in a basic block of code other than the address of the first instruction in that basic block of code.
Whenever a new value for a computed destination address is resolved, either additional instructions are located or a new execution transfer path is discovered to previously located instructions. Whenever a new execution transfer path is discovered to a previously located instruction, it is possible that backward symbolic execution from the previously located instruction along the new path may resolve one or more additional values for a computed destination address in an execution transfer instruction following the previously located instruction. Therefore backward symbolic execution is repeated in this case. The repetition of backward symbolic execution may locate additional execution transfer paths to previously located instructions causing still further repetition of backward symbolic execution, but eventually no additional execution transfer paths will be found, so that the process terminates.
BRIEF DESCRIPTION OF THE DRAWINGS
Other objects and advantages of the invention will become apparent upon reading the following detailed description and upon reference to the drawings in which:
FIG. 1 is a block diagram of a CISC digital computer system having a pipelined processing unit especially constructed for processing complex instructions;
FIG. 2 is a diagram showing the state of the pipelined processing unit of FIG. 1 when simultaneously performing different tasks for a number of instructions;
FIG. 3 is a diagram showing the preferred format of a variable length instruction from the complex instruction set;
FIG. 4 is a diagram of a particular variable length instruction for performing an addition between longwords;
FIG. 5 is a table showing the decoding of the mode information in the first byte of a specifier;
FIG. 6 is a block diagram of a RISC digital computer system;
FIG. 7 is a block diagram of preferred instruction formats for instructions in the RISC instruction set;
FIG. 8 is a block diagram of a translator having an analyzer and a RISC code generator for translating an original CISC program to a translated program having RISC instructions;
FIG. 9 is a block diagram showing data and program input and data output from the RISC computer when execution is shared between a partial RISC translation of an original CISC program and an interpreter for interpreting untranslated portions of the original CISC program;
FIG. 10 is a flowchart of steps in a RISC code generator of FIG. 8 that call the interpreter used in FIG. 9;
FIG. 11 is a flowchart of the interpreter and its operation when execution is shared with the partial RISC translation as shown in FIG. 9;
FIG. 12 is a flowchart of an address checking routine that determines when execution is passed from the interpreter to the partial RISC translation;
FIG. 13 is a flowchart of the operation of the translator of FIG. 8;
FIG. 14 is a summary page generated by the translator of FIG. 8 and including a memory map of the original CISC program having been translated;
FIG. 15 is a flowgraph of the original CISC program;
FIG. 16 is an error-specific flowgraph of the original CISC program;
FIG. 17 is block diagram of data structures used by the analyzer of the translator for automatic generation of flowgraphs from machine code;
FIG. 18 is a flowchart of a procedure used by the analyzer for the automatic generation of flowgraphs from machine code;
FIG. 19 is flowchart of a procedure used by the analyzer for investigating paths from execution transfer instructions during the automatic generation of flowgraphs from machine code;
FIG. 20 is a flowchart of a procedure used by the analyzer for performing backward symbolic execution in an attempt to find values of a computed destination address prior program execution;
FIG. 21 is a fixed format representation of a term in a symbolic expression for a destination address that may include a constant displacement, a specified general purpose register, and an indication of direct or indirect memory access;
FIG. 22 is a fixed format representation of a term in a symbolic expression for a destination address that may include a constant displacement, a specified general purpose "base" register, a specified general purpose "index" register, a constant offset, and an indication of direct or indirect memory access;
FIG. 23 is a schematic representation of a flowgraph of a main program routine illustrating an indirect program loop including execution transfer paths shown by dashed lines;
FIG. 24 is a flowchart of a procedure used in backward symbolic execution for searching a basic block in an attempt to find values for a given symbolic expression;
FIG. 25 is a flowchart of a procedure used in backward symbolic execution for pushing a given symbolic expression backward through an instruction in order to modify the symbolic expression so that the modified expression would represent the same value during program execution just prior to execution of the instruction as the given symbolic expression would represent just after execution of the instruction;
FIG. 26 is a format for data that could be predetermined for each possible instruction opcode to assist in pushing symbolic expressions backward through instructions;
FIG. 27 is a flowchart of a routine for checking whether the pushing of a symbolic expression through an instruction has converted the expression to a form that represents a definite address in the computer program or in the operating system of the computer, and whether that address represents a permissible destination for the transfer of program execution;
FIGS. 28 and 29 together comprise a flowchart of a procedure including three scanning modes for finding plausible program code in unknown code areas of program memory;
FIG. 30 is a flowchart of a general procedure for program conversion and maintenance by iteratively retranslating an original program with the benefit of an execution log file identifying locations of untranslated program code that was interpreted during program execution;
FIG. 31 is a flowchart of a procedure for alternately translating and retranslating two mutually dependent programs and testing for convergence in a situation where the two programs cannot be easily merged or translated together at the same time;
FIG. 32 is a block diagram of image information files for two mutually dependent programs and showing linkages between the files; and
FIG. 33 is a flowchart of a procedure which exploits linkages between the information files for two mutually dependent programs in order obtain rapid convergence when the programs are alternately translated and re-translated.
While the invention is susceptible to various modifications and alternative forms, a specific embodiment thereof has been shown by way of example in the drawings and will be described in detail herein. It should be understood, however, that it is not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
The preferred embodiment of the invention is advantageously employed in translating program code for a complex instruction set computer (CISC) to program code for a reduced instruction set computer (RISC).
Turning now to the drawings and referring first to FIG. 1, there is shown a complex instruction set digital computer 20 which includes a central processing unit 21, an input/output unit 22, and a main memory 23.
Both data and instructions for processing the data are stored in addressable storage locations within the main memory 23. An instruction includes an operation code (opcode) that specifies, in coded form, an operation to be performed by the CPU, and operand specifiers that provide information for locating operands. In a typical CISC instruction architecture, the first byte of each instruction contains the opcode, and the following bytes contain the operand specifiers. The instructions preferably have a variable length, and various types of specifiers can be used with the same opcode, as disclosed in Strecker et al., U.S. Pat No. 4,241,397 issued Dec. 23, 1980, incorporated herein by reference.
The main memory 23 includes a large amount of storage, and therefore it is constructed of inexpensive but relatively slow storage devices such as dynamic random access (DRAM) memory chips. Because the CPU 21 executes instructions at a faster rate than the access time of the main memory, the CPU includes a memory access unit 24 that keeps a copy of a portion of the data and instructions in a high-speed cache memory of relatively small storage capacity. Usually the CPU obtains the data and instructions that it needs from the cache memory in the memory access unit 24, and when the required instructions or data are not found in the cache memory, the memory access unit accesses the main memory unit 23 to "refill" the cache memory with the required instructions or data.
The memory access unit 24 may also include means such as a translation buffer for translating "virtual" addresses used in the instructions to "physical" addresses used in the main memory 23. This technique permits the programmer to reference a "virtual address space" including a greater number of addressable memory locations than are provided in the main memory 23. Therefore the programmer is less constrained by limited storage capacity of the main memory 23.
To provide even faster access to data, the instructions may reference a limited number of general purpose registers included in a multi-port register file 25. Some of these general purpose registers may be assigned specific functions, for example, one general purpose register is typically used as a "program counter" and provides the address of the instruction being decoded when referenced as a source operand or specifies the address of the next instruction to be decoded when referenced as a destination operand.
The execution of an individual instruction is broken down into multiple sub-tasks that are performed by respective pipelined functional units. These pipelined functional units include an instruction buffer 26 for fetching instructions from memory, an instruction decoder 27 for decoding the fetched instructions, an operand unit 28 for fetching source operands identified by source operand specifiers, an execution unit 29 for performing arithmetic, logic, and special operations upon the source operands to obtain results, and a retire unit 30 for storing the results at memory locations or registers identified by destination operand specifiers.
For further details regarding the preferred construction of the complex instruction set digital computer 20, one may refer to Fite et al., "Decoding Multiple Specifiers in a Variable Length Instruction Architecture", U.S. Pat. No. 5,148,528
issued Sep. 15, 1992 incorporated herein by reference.
Turning now to FIG. 2, there is shown a state diagram of the pipelined functional units (26-30 in FIG. 1) for an ideal situation where the digital computer (20 in FIG. 1) executes instructions at an average rate of one instruction per cycle. Generally, the following steps are performed during the execution of each instruction by respective ones of the pipelined units 26-30: instruction fetch, instruction decode, operand fetch, execute, and store result.
One cycle is the minimum time for each functional unit to perform its sub-task. A given functional unit might not perform its sub-task in a single cycle due to contention in the memory access unit (24 in FIG. 1), for example, when the memory access unit is performing a "refill" operation. Moreover, when executing particularly complex instructions, a given functional unit may require more than one cycle to perform its sub-task due to the complexity of the instruction. In any case, by the use of dedicated hardware stages, the steps can be overlapped to some degree in pipelined operation, thereby increasing the total instruction throughput.
FIG. 3 depicts a typical instruction 40 that can be processed by the central processing unit (CPU) shown in FIG. 1. This instruction corresponds to the VAX.TM. variable-length instruction architecture as described in Levy and Eckhouse, Jr., Computer Programming and Architecture, The VAX-11, Digital Equipment Corporation, (1980), incorporated herein by reference. The instruction 40 includes an operation code 41 consisting of either one or two bytes. If the first byte 42 has a value of FD hexadecimal, then it is recognized as a double-byte operation code. Otherwise, the instruction decoder (27 in FIG. 1) recognizes the operation code as including only a single byte. The instruction 40 may further include up to six specifiers following the operation code.
The operation code indicates how many specifiers are included in the instruction. The specifiers used in connection with any given operation code may have various attributes and different lengths. The attributes of a particular specifier are determined at least in part by an addressing mode in the first byte of the specifier. However, the permissible attributes of the specifier are some times limited by the operation code. Further, for a particular kind of addressing mode known as "immediate addressing," the length of the specifier information is determined by a "data type" specified by the specifier.
A specific variable length instruction is shown in FIG. 4. In assembler notation, this instruction is written as "ADDL3 R0,#4,L203(R2)". In machine code, the instruction includes eight bytes generally designated 45. The first byte is an operation code of 23 hexadecimal which corresponds to the assembler mnemonic "ADDL3." The operation code indicates that a first longword operand is to be added to a second longword operand and the longword result is to be stored at a destination.
Following the operation code is a "register specifier" having a value of 50 hexadecimal. The hexadecimal digit of 5 denotes that the specifier is a register specifier, and the hexadecimal digit 0 indicates that the specified register is the R0
general purpose register in the CPU. The register specifier therefore specifies that the first source operand is the content of the general purpose register R0.
Following the register specifier is a "short literal specifier" having a value of 04 hexadecimal. The short literal specifier specifies a value of four for the second source operand.
Following the short literal specifier is the first byte of a "complex specifier" that specifies the destination of the addition operation. The hexadecimal digit E indicates a "longword displacement" addressing mode in which the following four bytes are to be interpreted as a thirty-two-bit address displacement to be added to the value of the content of a base register to obtain an address specified by the complex specifier. The hexadecimal digit 2 indicates that the general purpose register R2 is to be used as the base register. The complex specifier therefore specifies that the sum or result of the longword addition indicated by the operand code is to be stored in memory at an address computed by adding the value of 203 hexadecimal to the content of the general purpose register R2.
Turning now to FIG. 5, there is shown a decoding table for decoding the first byte of an operand specifier which is not a branch displacement. If the two most significant bits of the first byte of the operand specifier are both zero, then the operand specifier consists of the single first byte, and the six least significant bits of this byte are interpreted or decoded as specifying a six-bit value referred to as a "short literal."
If the first two most significant bits of the first byte of an operand specifier are not zero, and assuming that the byte is not part of a branch displacement, then the byte is decoded as a particular one of twelve possible register addressing modes relating to a specified one of sixteen general purpose registers R0 to R15 in the CPU. The most significant four bits of the byte (constituting a register mode field) are decoded to indicate the addressing mode, and the four least significant bits (constituting a general purpose register address field) are used to address a particular one of the sixteen general purpose registers.
If the register mode field has a hexadecimal value of four, then an "index mode" is specified in which the value of the content of the general purpose register addressed by the register address field is multiplied by the size in bytes of the operand (e.g., by 1, 2, 4, 8 or 16 for respective byte, word, longword, quadword or octaword data types) and the sum is included as a term in the address computation performed for an immediately following complex specifier; the next byte must have a register mode field with a value of 6 to F hexadecimal, and a register address field which addresses a base register for the complex specifier.
If the register mode field has a hexadecimal value of five, then the specifier is a "register specifier" in which the operand value is found in the general purpose register indicated by the register address field or, if the specifier is for the destination of the instruction, then the specifier specifies that the result is to be stored in the general purpose register indicated by the register address field.
For each of register modes 6, 7 and 8, the designated register contains the memory address for the operand. For a source operand, the operand value is read from this memory address, and for a destination operand, the result is written to this memory address. In mode 6 the designated register contains the address of the operand. In register mode 7 the content of the designated general purpose register is first decremented before computation of the address; in mode 8 the content of the designated general purpose register is incremented after the register is used to compute the address. Register mode 9 is similar to register mode 8, except that the content of the designated general purpose register specifies the address in memory at which the operand address will be found rather than the operand itself.
Modes 10 through 15 are various kinds of "displacement modes." In a displacement mode a displacement value, which may comprise a byte, word, or longword in modes 10, 12 and 14 respectively, is added to the content of the designated general purpose register to obtain the operand address. The operand is determined in a similar fashion in modes 11, 13 and 15 except that the sum of the displacement value and the content of the general purpose register identifies a memory address at which the address of the operand can be found.
In modes 8 through 15, the register address field of the first byte of the operand specifier can designate any of the general purpose registers, including register R15 which is the program counter. For modes 8 and 9, if the program counter is addressed, the value of the program counter itself is incremented which causes program execution to jump over operand data or an operand address disposed in the instruction stream. In mode 8, this special case is known as an "immediate" addressing mode, and for mode 9 it is known as an "absolute" addressing mode. Specifically, when modes 8 and 9 are decoded for any of the general purpose registers 0 through 14, the next specifier or the next operation code appears immediately following the byte designating the mode and the general purpose register. For the immediate mode, however, a number of bytes of the immediate data appear and the number of bytes is determined by the specifier's datatype.
Because of the variety and complexity of the VAX.TM. variable-length instructions, the digital computer 20 of FIG. 1 is very complex to achieve performance approaching the ideal of one VAX.TM. variable-length instruction executed per cycle. Recent advances in semiconductor processing technology and memory architecture, however, have made it possible to fabricate a single-chip central processing unit having comparable performance when executing only simple instructions. To achieve comparable performance, the so-called "reduced instruction set computer" (RISC) executes simple instructions at a rate substantially in excess of one instruction per cycle. This performance is obtained with minimum hardware complexity by imposing the requirement that all arithmetic/logic operations are performed register-to-register. In addition, complex memory accesses are not permitted; all memory accesses are register load/store operations, and there are only a small number of relatively simple addressing modes, i.e., only a few ways of specifying operand addresses. Instructions are of only one length, and memory accesses are of a standard data width, usually aligned. Instruction execution is of the direct hardwired type, as distinct from microcoding.
Turning now to FIG. 6, there is shown a block diagram of a reduced instruction set (RISC) computer 50. The RISC computer 50 includes a central processing unit 51, an input/output unit 52, and a main memory 53. The central processing unit 51
includes a memory access unit 54, register file 55, and a number of functional units including an instruction unit 56, an addressing unit 57, an integer and logic execution unit 58, and a floating point execution unit 59. Because all arithmetic/logic operations are performed register-to-register, the execution units 58, 59 do not directly access the memory unit 54. Memory-to-register load operations and register-to-memory store operations are performed by the addressing unit 57.
To execute instructions at a rate substantially in excess of one per cycle, the instruction unit 56 can fetch and decode at least two instructions simultaneously when resources are available, and the addressing unit 57, the integer and logic execution unit 58, and the floating point execution unit 59 can execute three different instructions simultaneously. In a preferred implementation, for example, two instructions can be decoded and issued simultaneously when one instruction is from Column A and the second instruction is from Column B:
______________________________________ Column A Column B ______________________________________ Integer Operate Floating Operate Floating Load/Store Integer Load/Store Floating Branch Integer Branch ______________________________________
For further details regarding the RISC computer 50 of FIG. 6, one may refer to the above-mentioned Richard L. Sites and Richard T. Witek, "Branch Prediction in High-Performance Processor," U.S. application Ser. No. 07/547,589 filed Jun. 29,
1990, herein incorporated by reference.
Referring to FIG. 7, there are shown the preferred formats of the various types of instructions of the RISC instruction set executed by the computer 50 of FIG. 6. Each instruction has a fixed length of 32 bits.
A memory instruction 70 contains a 6-bit opcode in bits <31:26>, two 5-bit register address fields Ra and Rb in bits <25:21> and <20:16>, and a 16-bit signed displacement in bits <15:0>. This instruction is used to transfer data between the register file and memory, to load an effective address to a register in the register file, and for subroutine jumps. The displacement field <15:0> is a byte offset; it is sign-extended and added to the contents of register Rb to form a virtual address. The virtual address is used as a memory load/store address or a result value depending upon the specific instruction.
A branch instruction 71 includes a 6-bit opcode in bits <31:26>, a 5-bit address field in bits <25:21>, and a 21-bit signed branch displacement in bits <20:0>. The displacement is treated as a longword offset, meaning that it is shifted left two bits (to address a longword boundary), sign-extended to 64-bits and added to the updated contents of PC 33 to form the target virtual address (overflow is ignored).
Operate instructions have two different formats 72 and 73. The format 72 has three register operands, and the format 73 has two register operands and a literal. The operate format is used for instructions that perform integer register operations, allowing two source operands and one destination operand in register file 43. One of the source operands can be a literal constant. Bit-12 defines whether the operate instruction is for a two source register operation or one source register and a literal. In addition to the 6-bit opcode at bits <31:26>, the operate format has a 7-bit function field at bits <11:5> to allow a wider range of choices for arithmetic and logical operation. The source register Ra is specified in either case at bits <25:21>, and the destination register Rc at <4:0>. If bit-12 is a zero, the source register Rb is defined at bits <20:16>, while if bit-12 is a one then an 8-bit zero-extended literal constant is formed by bits <20:13> of the instruction. This literal is interpreted as a positive integer in the range 0-255, and is zero-extended to 64-bits.
FIG. 7 also illustrates the floating point operate instruction format 74, used for instructions that perform floating point register to floating point register operations. The floating point operate instructions contain a 6-bit opcode at bits <31:26> as before, along with an 11-bit function field at bits <15:5>. There are three operand fields, Fa, Fb and Fc, each specifying either an integer or a floating-point operand as defined by the instruction; only the registers 13 are specified by Fa, Fb and Fc, but these registers can contain either integer or floating-point values. Literals are not supported. Floating point conversions use a subset of the floating point operate format 74 and perform register-to-register conversion operations; the Fb operand specifies the source and the Fa operand should be reg-31 (all zeros).
The last instruction format 75 is used for privileged architecture library (PAL or PALcode) instructions, which specify extended processor functions. In these instructions a 6-bit opcode is present at bits <31:26> as before, and a 26-bit PALcode function field <25:0> specifies the operation. The source and destination operands for PALcode instructions are supplied in fixed registers that are specified in the individual instruction definitions.
The six-bit opcode field <31:26> in the instruction formats of FIG. 7 allows only 2.sup.6 or sixty-four different instructions to be coded. Thus the instruction set would be limited to sixty-four. However, the "function" fields in the instruction formats 72, 73 and 74 allow variations of instructions having the same opcode in bits <31:26>.
The preferred embodiment of the present invention more particularly concerns a translator for translating programs for the CISC computer 20 of FIG. 1 to programs for the RISC computer 50 of FIG. 6. A major difficulty that must be addressed by the translator is that it is not always possible to translate all of the instructions in the original CISC program. One reason that translation is not always possible is that the CISC architecture permits CISC programs to be executed from "read-write" memory areas so that the programs may modify themselves during execution. The RISC architecture, however, requires programs to be executed from "read only" memory areas. Although CISC program instructions to be executed from "read-write" memory areas cannot be translated to CISC instructions, they can be interpreted on the RISC computer during execution time, at the expense of a severe penalty in execution speed. Fortunately, very few CISC programs modify themselves during execution. Many CISC programs that have program instructions in "read-write" memory areas, for example, do not actually modify themselves.
A more typical reason for failing to translate all of a CISC program is the difficulty of finding all of the CISC instructions to be translated. To run a CISC program, only the starting address of the program need be known. Although one may parse the instructions beginning at the starting address, the CISC program usually will include at least one execution transfer instruction, such as a "Jump" or "Call" instruction, having a computed destination address. Although the destination address is easily computed at execution time, it could be entirely indeterminate at translate time, for example, because the destination address is computed based on program input data. In the typical case, however, the present invention provides techniques for resolving, at translate time, a majority of the computed destination addresses.
Turning now to FIG. 8, there is shown a block diagram of a CISC-to-RISC translator 80 and its associated input and output images and data structures. The translator 80 receives an original program 81 having CISC instructions and generates a corresponding translated program 82 having RISC instructions. In association with the translated program 82, the translator 80 generates address conversion information 96, which correlates addresses of CISC instructions in the original program with addresses of RISC instructions in the translated program.
In addition to a RISC code generator 83, the translator 80 includes a program analyzer 84 that analyzes the original program 81 to separate instructions from data, to trace program flow for RISC code optimization, and to detect CISC instructions that specify behavior that cannot be reproduced by the RISC instruction architecture.
The program analyzer 84 may reference supplemental input 85 to find the location of instructions in the original program. The supplemental input 85 may include, for example, a linker map file 86 generated when the original program 81 was created by linking together a number of component images. In this case the linker map will identify entry points to instructions in the original program, and provide names for the entry points. These names are useful for making messages and other output of the translator more readable.
The supplemental input 85 may include a program counter (PC) sampling histogram file 87 that might have been generated to identify frequently executed instructions in the original program. If a PC sampling histogram file is available, then the program analyzer may use it to ensure that an attempt is made to translate CISC instructions at every program counter address having a sampling frequency greater than zero; i.e., every program counter address corresponding to an instruction that was executed at the time that the PC sampling histogram file was generated.
The supplemental input 85 may also include an execution log file 88 containing the addresses in the original program of instructions that were not translated in a previous translation, but which were interpreted by the interpreter during execution of the previously translated program. As further described below in connection with FIGS. 9 and 11, the execution log file is generated, for example, during interpretation of the previously untranslated code, and it includes pairs of the origin addresses in the original program of instructions which transfer execution to the untranslated code, and the destination addresses in the original program where the untranslated code is located. The translator uses the information identifying the pairs of origin and destination addresses to translate the previously untranslated instructions in the original program. The origin addresses are used to identify the execution paths that were not discovered during prior translation of the original program. As further described below, identification of these execution paths may permit the program analyzer 84 to discover additional execution paths that were not taken during execution of the previously translated program or interpretation of the untranslated portions of the original program. Preferably this feedback of information from execution to re-translation is performed after each execution of the translated program so that virtually all of the instructions in the original program will eventually be located and translated.
The program analyzer 84 also generates auditing output 89 including error, warning and information messages 90; a summary page 91 showing a memory map of the original program 81 identifying regions of the instructions that were found and the location of entry points; flowgraph language files 92 that define a flowchart of the translated portions of the original program 81; and an output image information file 93 for the original program.
An image information file contains information about the interface characteristics or properties of predefined entry points in an image. The translator 80 generates an output image information file 89 for the original program 81 when the original program is first translated. During retranslations, the information input file for the original program can be used as an input file 94, and it will be updated during retranslation based upon additional instructions and execution transfer paths discovered by the program analyzer 84.
Image information files are used by the program analyzer 84 to resolve references to other images and to generate the appropriate linkage. For example, when the original program 81 calls library routines, the supplemental input 85 should include image information files 95 for these shared library routines. Preferably the image information file is an ASCII file that is easily edited by the programmer to add or correct information.
Turning now to FIG. 9, there is shown a block diagram of the images and data files used when executing the translated program 82 on the RISC computer 50. In addition to the translated program 82 and its associated address conversion information
96, the RISC computer 50 uses as input a copy of the original program 81, an interpreter 101 that is written in RISC code for interpreting CISC instructions, and data input 102 for the original and translated programs. The interpreter 101 is called upon to interpret untranslated instructions in the original program 81, so that the original program 81 serves as data input for the interpreter. The address conversion information 96 also serves as data input to the interpreter 101, and the interpreter uses this information when passing execution back to the translated program at appropriate times. The data output from the RISC computer includes output data 103 from the original and translated programs, and the execution log file 88, which is output data from the interpreter.
Turning now to FIG. 10, there is shown a flowchart of steps in the RISC code generator 83 that generate calls to the interpreter when CISC instructions are not translated prior to execution time. As noted above, code in "read-write" memory areas usually should not be translated because the program might modify the code during execution. The user may, however, believe that program will not modify the code during execution, and therefore the user may wish to have code in read-write memory translated to avoid the severe performance penalty associated with code interpretation. Therefore the user may set a software override switch, and in step 111 the RISC code generator 83 tests the override switch to decide whether to translate a CISC instruction in a "read-write" memory area. If the override switch is not set, then in step 112 the RISC code generator 83 checks whether the CISC instruction is in a read-write memory area. In this case, the code generator issues a message in step 113
to warn the user that the CISC instruction must be interpreted, and therefore a performance penalty will result. Then in step 114 the code generator generates RISC code to call the interpreter and pass the CISC instruction address as a parameter. Finally, in step 115 the code generator advances to the next CISC instruction that is found in a "read-only" memory area, in order avoid translating CISC instructions that will be interpreted during the call to the interpreter.
The RISC code generator cannot generate code for CISC instructions that cannot be located in the original program. The RISC code generator, however, can recognize execution transfer instructions in the original program that have unresolved computed destination addresses, and therefore generate RISC code that calls the interpreter to interpret the untranslated CISC instructions at execution time. In step 116 the RISC code generator checks whether the CISC instruction has an unresolved computed destination address. If so, then in step 117 the user is warned that the interpreter must be called to interpret untranslated CISC instructions, and in step 118 the code generator generates RISC code to compute the destination address, and to call the interpreter and pass the address of the CISC execution transfer instruction and the computed destination address as parameters.
In step 119, the code generator generates RISC code for a CISC instruction which does not have an unresolved computed destination address. This is done by translating the CISC instruction to one or more RISC instructions that have the same effect as the CISC instruction. This code translation is made more efficient by using information from the program analyzer (84 in FIG. 8) about execution transfers and register usage in the program. The RISC computer (50 in FIG. 6), for example, preferably has subroutine call and return instructions which reference a return address in a specified general purpose register, and the RISC computer provides a large number of general purpose registers. Therefore it is inefficient to translate CISC subroutine call and return instructions into RISC instructions which emulate stack operations in cases where register could be used instead.
There is also an anomalous case of CISC instructions which reference the program counter directly as a source operand. In the preferred construction of the RISC computer, the program counter cannot be referenced directly as a source operand, although the program counter value can be computed by decrementing the return address provided by a subroutine call instruction. However, the program counter value in this case will be the program counter of the RISC computer, which is referenced to the RISC code, and not the program counter of the CISC computer, which is referenced to the CISC code. An excessive amount of overhead would be required to maintain in a general purpose register a constantly changing value corresponding to the CISC program counter. Therefore the program analyzer (84 in FIG. 8) will attempt to resolve references to the program counter for execution transfers in terms of relative jumps in the program, but will not attempt to translate code that uses the value of the program counter for any other purpose.
Turning now to FIG. 11, there is shown a flowchart of the interpreter and its interaction with execution of the translated program. In step 131, instructions in the translated program 82 (or operating system routines called by the translated program) are executed until the interpreter is called. As shown, the interpreter is entered at a first entry point 132 when called to interpret a CISC instruction at a computed destination address, and is entered at a second entry point 133 when called to interpret a CISC instruction in a "read-write" area of memory.
A computed destination address might not necessarily be in an untranslated portion of the original program. Therefore, in step 134, the computed address is checked to determine whether execution should continue in the interpreter, or whether execution should return to an entry point in the translated program or somewhere else.
When the computed destination address is in the original program but does not correspond to an entry point in the translated program, it is desirable to log the destination address in the execution log file (88 in FIG. 8) together with the address of the execution transfer instruction having the destination address. This is done in step 135. During retranslation, the program analyzer (84 in FIG. 8) reads the information in the execution log file to locate untranslated code, for the case where the computed destination address does not correspond to a translated CISC instruction, or to identify a new execution transfer path in the translated program, for the case where the computed destination address corresponds to a translated CISC instruction but not an entry point in the translated program.
In step 136, the interpreter interprets the CISC instruction at the address in the original program. A general purpose register in the RISC computer is allocated to reproduce the contents of a corresponding register in the CISC computer. This set of allocated registers defines a model of the state of the CISC computer. The interpreter program reads the opcode of the CISC instruction and performs a table look-up to determine the number of specifiers to decode and the address of a program routine for the opcode. Then the interpreter decodes each specifier. Literal and register source operands are transferred to general purpose registers allocated as source operand registers. Complex specifiers are decoded and the specified operands are fetched from memory and transferred to the general purpose registers allocated as source operand registers. Then the program routine for the opcode is executed. Finally, the destination specifier is decoded, and the result is transferred to the specified destination.
In step 137 the address of the next CISC instruction in the original program is computed. This computation is based upon the instruction interpreted in step 136. The next CISC instruction follows the prior CISC instruction unless the prior instruction transfers execution, in which case the address of the next CISC instruction is the destination address of the prior instruction.
In step 138 the computed address is checked to determine whether execution should continue in the interpreter, or whether execution should return to an entry point in the translated program or somewhere else. Execution in the interpreter continues in step 136. In addition, step 136 is the initial step from the entry point 133.
Turning now to FIG. 12, there is shown a flowchart of the address check routine used in steps 134 and 138 of FIG. 11. In step 141 the computed address is compared to the beginning and ending addresses of the original program to determine whether it is in the original program. If the destination address is outside of the original program, then the computed destination address, for example, could be the entry point of an operating system routine for the operating system of the CISC computer (20
in FIG. 1). The operating system of the RISC computer (50 in FIG. 6) may or may not have a corresponding routine. Therefore in step 142, the computed destination address is validated, for example, by look-up in a table of CISC destination addresses having corresponding entry points in the RISC operating system. If the computed destination address is invalid, the error is reported and execution is terminated in step 143. Otherwise, in step 144, execution is transferred to the entry point of the RISC operating system routine corresponding to the computed destination address.
When step 141 determines that the computed destination address is in the original program, it is possible that the computed destination may correspond to a translated CISC instruction in the original program. But even in this case, it may be desirable to interpret the CISC instruction because it may be desirable for translation to assume that the CISC program has only certain predefined entry points so that the translator can translate blocks of CISC instructions rather than individual instructions in order to generate more optimum RISC code. In any case, the interpreter must know the correspondence between CISC instruction addresses and corresponding entry points in the translated program if execution is ever to return from the interpreter to the translated program. This correspondence is provided by the address conversion information (96 in FIG. 8) associated with the translated program. Preferably this address conversion information is provided by a conversion table of the CISC addresses and corresponding entry points in the translated program. Therefore, in step 145, the interpreter checks whether the computed destination address corresponds to an entry point in the translated program, for example, by look-up in the conversion table. When the computed destination address corresponds to an entry point in the translated program, the interpreter uses the address conversion information to covert the computed CISC address to the RISC entry point address in step 146, for example, by reading the corresponding RISC entry point address from the conversion table. In step 147, execution returns to the entry point in the translated program. Otherwise, in step 148, execution returns to the interpreter.
Turning now to FIG. 13, there is shown a flowchart of the steps performed by the translator. 80. In the first step 161, the original program and the supplemental input files are read into memory. Then in step 162 the translator searches for entry points in the original program. This is further described below with reference to a specific example for the original program organization shown in FIG. 14.
In step 163 the translator follows threads of execution from the entry points in order to locate instructions in the original program and to generate a flowgraph of the program. The flow graph identifies "basic blocks" of instructions and execution paths interconnecting the basic blocks. Each basic block is a sequence of contiguous instructions that has a single known entry point at the beginning of the basic block. Execution is transferred only to the beginnings of the basic blocks, and execution is transferred only from the ends of the basic blocks. The generation of the flowgraph is further described below with reference to FIGS. 18 and 19.
In step 164 the translator scans the original program for plausible code. The result of such a scan is useful for quantifying the number of plausible instructions in the original program for comparison to the number of instructions located by the translator. The result of the comparison, for example, is an estimate of the "percentage of code found" computed as 100 times the bytes of known code divided by the sum of the bytes of known code and the bytes of plausible code. This estimate can be used by a systems analyst in choosing a porting strategy for the original program or for monitoring or augmenting the performance of the translator. If the translator is unable to translate a high percentage of the instructions, then the systems analyst may run a number of test cases upon the translated program and employ feedback of information from execution to re-translation to reduce the indicated estimate of untranslated code before releasing the translated program for general use. As a last resort, the systems analyst may have to search for documentation by the original programmer or inspect the code at the address locations indicated as containing plausible instructions in an attempt to verify the estimate and make a judgement whether to release the translated program for general use, or to direct the translator to translate the plausible code that is believed to contain untranslated instructions. The preferred method of scanning for plausible code is described further below with reference to FIGS. 29 and 29.
In step 165 the flowgraph is analyzed to find possible errors indicated by deviations from typical programming practices and standard conventions. In a typical program, for example, a data item is pushed on a stack so that it can be popped off at a later time, without regard to how many other data items are pushed on and popped off the stack by intervening instructions. If the number of pushes do not cancel the number of pops in the intervening instructions, then errors will be caused by improper programming of the intervening push and pop instructions. If the intervening instructions include separate paths that branch apart and rejoin at an intervening node, and the improper programming occurs in one of the paths, then the improper programming will also cause the calculated stack depth to be different for the two paths when they join at an intermediate node. This is especially likely to cause an error if the node is an RSB subroutine return instruction. Therefore, one of the error checks performed with the aid of the flowgraph is to calculate the stack depth at all paths from the main entry points of the original program, and to check whether the calculated stack depth is the same whenever these paths rejoin. If not, the translator issues a warning message that says: "Stack depths don't match on all incoming paths at VA=%X", where VA is the virtual address of the node in the original program.
The flowgraph is also analyzed for compliance with typical programming conventions of the CISC instruction architecture. For the VAX.TM. architecture, for example, a "return mask" is used in connection with procedure calls (CALLS and CALLG instructions) to indicate which registers to load during a return. If the return masks are different depending on the path to the return instruction, then there probably was a programming error. The translator issues a warning: "Return Masks don't match on all incoming paths VA=%X". In addition, if a RET instruction is found that may be executed with different return masks, that instruction must be translated as a call to the interpreter so that the interpreter will inspect the return mask to determine the particular registers to load during the return.
Another calling convention associated with the procedure calls is that register and condition code bits are to be set by the procedure before they are used in the procedure, rather than using register values or condition codes that might be set by the calling routine. If a deviation from this calling convention is found, the translator issues the warning: "Possible uninitialized variable(s) at call entry VA=%X: %s". The "%s" in this message stands for a list of the possible uninitialized variables that were found. Other deviations from coding conventions of the VAX.TM. instruction architecture include a failure to align the stack on longword boundaries (which causes a severe performance penalty for the RISC code), the use of a RET instruction to return from a JSB, and the use of an RSB instruction to return from a CALLS or CALLG. The translator analyses the flowgraph for these deviations and issues warning messages identifying the particular deviation and a corresponding address.
To reduce the amount of required searching through the flowgraph, the analysis of the flowgraph for possible errors also performs propagation of resources used in each basic block. Each basic block uses various resources, such as the stack, general purpose registers, and condition codes. To deduce the behavior of subroutines, their resource usage is propagated back to their calling routines. This information is used to terminate the search for the setting of a register used in a subroutine called by a CALLS or CALLG instruction, and this information is also used later during RISC machine code generation (step 168) to avoid needless generating of condition code or register values that are never used. Moreover, the consistency checks can be performed by comparing the resources for the basic blocks. For example, the "return mask" warning is generated when a basic block has two or more preceding basic blocks and the return masks for the preceding basic blocks are different, and the "stack depth" warning is issued when a basic block has two or more preceding basic blocks whose stack depth changes are different. A final scan over all of the basic blocks is used to uncover any failure to align the stack on longword boundaries (i.e., a stack change for a basic block that is not divisible by four bytes), the use of a RET instruction to return from a JSB, and the use of an RSB instruction to return from a CALLS or CALLG.
In step 166 the image information file for the original program is created, using the information about the resources used in each basic block. The image information file, for example, consists of an image header, which identifies the image name and version of the image being described, followed by a series of image property records (one per line) each of which attaches a particular property to an offset in that image. Each property record consists of an image offset followed by the property name, optionally followed by a comma-delimited list of attribute values. The attribute values are specific to particular properties. An image offset is a representation of the offset in the image to which the property applies. It may consist of a symbol (which could be defined by other image information file records) .+-.the hexadecimal offset value. Either the symbolic name or the offset value may be omitted (but not both).
The properties and their defined attributes preferably include:
______________________________________ Interface Properties Property Name Attributes Interpretation ______________________________________ absolute.sub.-- ok specifies RISC operating system support for entry point jmpentry symbolic name defines a JMP entry point and name callback parameter defines a CALL to a number procedure parameter callentry symbolic name defines a CALL entry point and name caselimit integer specifies maximum number of cases delta.sub.-- sp integer specifies change in stack pointer by routine image identification identifies the image that the image information file describes jmpback parameter defines a JMP to a number procedure parameter jsbentry symbolic name defines a JSB entry point and name no.sub.-- standard.sub.-- return specifies no standard return sets list of registers specifies registers set by routine symbol symbolic name attaches a symbolic name to an offset uses list of registers specifies registers used by routine ______________________________________
An informal representation of the syntax of an image information file is shown below.
______________________________________ <WHITESPACE> :== non-null sequence of tabs and spaces <COMMA> :== [<WHITESPACE>] `,` [<WHITESPACE>] <NL> :== newline <comment line> :== `;` any text <NL> <IIF file> :== <IMAGE ID>] <IMAGE.sub.-- PROPERTIES> <IMAGE.sub.-- ID> :== <WHITESPACE> `image` <IMAGE.sub.-- NAME> <COMKA> <VERSION><NL> <IMAGE.sub.-- PROPERTIES> :== <IMAGE.sub.-- OFFSET> <PROPERTY> [<ATTRIBUTE>, . . . ] <NL> <IMAGE.sub.-- OFFSET> :== [<SYMBOL>] (`+`.vertline.`-`) [<OFFSET>] ______________________________________
Comment records (lines beginning with `;`) may occur anywhere in the file and are ignored by the translator.
An example of an image information file is shown below, for an image named "VAXCRTL":
______________________________________ ; comments image VAXCRTL, "V05-001" +000000 jabentry "C$MAIN" +000000 uses "AP FP SP RSB " +000000 sets "R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 AP SP N Z V C" ______________________________________
This image information file provides the translator with the following information about VAXCRTL.
VAXCRTL+0 (i.e., the first memory location in the image, at an offset of zero from the beginning of the image) is marked as a JSB entry with the name C$MAIN. From this information, the translator can confirm that the JSB is proper, and can insert the name VAXCRTL:C$MAIN into flowgraphs. VAXCRTL+0 is also marked as using VAX.TM. registers AP, FP, and SP, and the VAX.TM. instruction RSB. From this last piece of information, the translator can legitimately continue parsing VAX instructions at the return point, just after a calling instruction in a calling image.
VAXCRTL+0 is also marked as setting VAX.TM. registers R0 . . . R10, AP, and SP, and the VAX condition code bits NZVC. From this information, the translator can infer, for example, that a test of R0 after the calling instruction refers to the value left in R0 by C$MAIN, not to an earlier value put in R0 by the calling image itself.
The following example shows an image information file for a library file, LIBRTL, that has numerous entry points:
______________________________________ ; comments image LIBRTL, V05-000 +000000 jsbentry LIB$AB.sub.-- ASC.sub.-- EBC +000000 jsbentry LIB$AEF.sub.-- EBC.sub.-- ASC +000200 jsbentry LIB$AB7.sub.-- PCASE +000300 callentry LIB$ANALYZE.sub.-- SDES +0004F0 callentry LIB$SIGNAL +0004F8 callentry LIB$STOP LIB$STOP no.sub.-- standard.sub.-- return +000500 callentry LIB$SIG.sub.-- TO.sub.-- RET +000508 callentry LIB$SKPC ______________________________________
In step 167 of FIG. 13, a flowgraph language file is created for the translated program. This flowgraph language file can then be used as input to flowgraph program that will send commands to a laser printer to draw a flowgraph of the translated program. The flowgraph program, for example, may also receive an optional scale parameter that controls the size of the printed flowgraph, and an optional "code block" parameter that may specify the address of a particular block of code to be printed on a page with other blocks to which it is connected. The flowgraph consists of different kinds or shapes of blocks, arcs interconnecting the blocks, and instructions printed in assembler notation in each block. Each basic block of instructions found by the translator has a corresponding block in the flowchart.
The flowgraph language file contains a definition of the kinds of blocks in the flowgraph, the interconnections between the blocks, and the instructions (in ASCII, assembler notation) that go into each block. The flowgraph language file is created from an internal binary data structure that includes a data block or record defining each basic block of instructions found by the translator. Each data block or record contains the starting address (in the original program) of the basic block of instructions, the length of the basic block in bytes, the kind of the block, an "epoch number" attribute used to indicate whether the block has been included in a particular search, a list of addresses of "successor" data blocks, and a list of addresses of "predecessor" data blocks. A data block can be created dynamically and easily inserted or removed from the data structure by inserting or deleting references to the data block in its successor and predecessor blocks, and the data structure is easily searched or scanned in either a forward or backward manner by successive references to the successors or predecessors.
There are five possible kinds of blocks used in the flowgraph, and each kind of block has a distinctive shape. The kinds of blocks include CALLx entries, JSB entries, "normal" blocks, CALLx placeholders, and JSB placeholders. "CALLx" refers to a procedure call, which is made by a CALLS or CALLG instruction in the VAX.TM. instruction architecture.
In the printed flowgraph, a CALLx entry block is hexagonal and contains the name of the block, the name of the procedure if known, and the call mask in hex. The name of the block consists of the address (in the original program) of the fist byte of the block in hex followed by ".sub.-- CALL". A CALLx entry block is reached via a CALLS or CALLG instruction. In the VAX.TM. instruction architecture, the main entry point of the original program is a CALLx entry block; this main entry point is called by the operating system to run the program.
In the printed flowgraph, the JSB entry block is oval and contains the name of block, and the name of the subroutine if known. The name of the block consists of the address (in the original program) of the first byte of the block in hex followed by ".sub.-- JSB". A JSB entry point is reached via a JSB, BSBB, or BSBW instruction. JSB entries are typically used in macro code.
In the printed flowgraph, a normal block is rectangular and contains the name of the block and all of the instructions within the block. The name of block is the address (in the original program) of the first byte of the block in hex. A normal block is reached via the normal flow of execution or via an execution transfer from a branch, case, jump, or return instruction.
In the printed flowgraph, a CALLx placeholder block is dashed hexagonal and contains the name of the block, and the name of the called procedure if known. The name of the block consists of the address (in the original program) of the first byte of the called procedure in hex (if known at translate time, or else zero), followed by ".sub.-- " and a "call number" of the placeholder for the called procedure. A Callx placeholder block represents the flow in and out of a procedure and any recorded side effects for each call of that procedure. If a procedure is called from five different places, there will be five different placeholder blocks with five different call numbers for those calls (in addition to a CALLx entry block for the procedure).
In the printed flowgraph, a JSB placeholder block is dashed oval and contains the name of the block, and the name of the subroutine if known. The name of the block consists of the address (in the original program) of the first byte of the subroutine in hex (if known at translate time, or else zero), followed by ".sub.-- " and a unique "call number" of the placeholder for the called subroutine. A JSB placeholder block represents the flow in and out of a subroutine and any recorded side effects for each call of that subroutine.
In a preferred form of the flowgraph language, there are three basic operators: graph (G), block (B) and arc (A). Each line of the flowgraph language file starts with one of these operators; a semicolon, indicating that the line contains a comment; a blank, indicating that alphanumeric data in the line to be included in a block defined by a previous line having a block operator; an H followed by a block name, indicating that a named block is to be highlighted; or a C followed by the names of two blocks, indicating that the two blocks are to be connected with a wide dotted line, for example to highlight a path including an error.
A line beginning with the operator G delimits a new graph, and may include alphanumeric text that is printed as a heading for the graph.
A line beginning with the operator B also includes a name (such as a hex address) for a block of code, followed by a letter code indicating the kind and shape of the block (e.g., A=normal, B=CALLx entry, C=JSB entry, D=CALLx placeholder, E=JSB placeholder). An optional numeric scale factor may immediately follow the operator B.
A line beginning with the operator A specifies a directed arc from a first-named block to a second-named block, and after the two block names, a letter code indicates the style of the arc (e.g., A=normal branch drawn with a solid line, B=true branch drawn with a dotted line).
The following is an example of a flowgraph language file for printing the flowgraph in FIG. 15:
______________________________________ ; FLOWGRAPH LANGUAGE FILE FOR DHRYSTONE.sub.-- SHR ; G DHRYSTONE.sub.-- SHR B 2E00.sub.-- CALL B 2E00.sub.-- CALL main Transfer.sub.-- address mask:0000 A 2E00.sub.-- CALL 2E02 A B 2E02 A 2E02 SUB2 #08, SP JSB @0000063D(PC) A 2E02 0.sub.-- 12972C.sub.-- 1 A B 0.sub.-- 12972C.sub.-- 1 E 0.sub.-- 12972C .sub.-- 1 VAXCRTL:C$MAIN A 0.sub.-- 12972C.sub.-- 1 2E0B A B 2E0B A 2EOB CALLS #00, 00000006(PC) A 2E0B 2E18.sub.-- 1297E4.sub.-- 1 A B 2E18.sub.-- 129714.sub.-- 1 D 2E18.sub.-- 1297E4.sub.-- .sub.-- 1 Proc0 A 2E18.sub.-- 1297E4.sub.-- 1 2E12 A B 2E12 A 2E12 CVTWL #0001, R0 RET ______________________________________
In step 168 of FIG. 13, the translator receives the basic blocks of instructions from the original program, and generates corresponding RISC machine code. The CISC instructions are translated one at a time into RISC instructions that will reproduce the same results and condition codes. Preferably each CISC instruction is translated into a corresponding sequence of RISC instructions including in sequence four groups of instructions. The first group has instructions that fetch source operands and place them in temporary storage. The instructions in the second group operate upon the source operands to produce results in temporary storage. The instructions in the third group update the contents of memory or registers and are subject to possible exceptions. Finally, the instructions in the fourth group are those that update the contents of memory or registers and are free of possible exceptions. For further details regarding the translation of the CISC instructions to RISC instructions, one may refer to the above-cited related application by Scott Robinson and Richard Sites entitled "Improved System and Method of Preserving Instruction Granularity for Translated Program Code."
In step 169 of FIG. 13, the translator creates the translated program by placing the RISC machine code into a program image format that is recognized by the operating system of the RISC computer. In association with this translated program image, the translator also generates a mapping between the address of each basic block of CISC instructions in the original program and the address of the corresponding basic block of RISC instructions in the translated program.
In step 170 of FIG. 13, the translator creates a summary page, which graphically illustrates where the translated code is located in the program image. The summary page is further described below with reference to FIG. 14.
Finally, in step 171 of FIG. 13, the translator may create flowgraph language files for printing flowgraphs of subroutines having errors that were located in step 165 during analysis of the program flowgraph. An example of an error-specific flowgraph is shown in FIG. 16. The error-specific flowgraph has enlarged blocks 210, 211, 212, and 213 along the path where the error was traced, and a heavy dotted line 214 along an arc over the path where the error was traced. The error indicated in FIG. 16 is an uninitialized variable (general purpose register R3) being used in a procedure. The heavy dotted line 214 connects the block 210 at the beginning of the procedure to the block 213 where the uninstantiated variable was first used. This error was not discovered prior to being revealed by the translator because during execution of the original program, execution apparently never followed the indicated error path around block 215.
Turning now to FIG. 14, there is shown a summary page of the kind generated in step 170 of FIG. 13. As shown in FIG. 14, the summary page includes a memory map 181 of the original program. At the top of the memory map, there is an annotation including the program name 182, the date 183 when the original program was created, and the percentage 184 of code found during the translation.
The memory map 181 shows the organization of a typical program using the VAX.TM. instruction architecture and VMS.TM. operating system. The program includes a header 185, a read/write image section 186, a number of read-only sections 187, a fixup vector section 188, a debug symbol table (DST) 189, a debug module table (DMT) 190, and a global symbol table (GST) 191. The number 192 of disc blocks per line of the memory map is printed at the upper left of the memory map, and the total number
193 of disc blocks in the memory map is printed at the lower right of the memory map. In this example, there is one disk block per line, and 12 disc blocks total. The translated code is indicated by narrow bands 194, 195 and 196 in the read-only section 187. These narrow bands are made up of a short vertical line for each byte of code that is translated. Plausible code is indicated by a wider band 197.
To the right of the memory map 181, the summary page lists up to 100 entry points that are defined by the image format of the original program. During step 161 of FIG. 13, the translator parses the command line of switches and the file name at the beginning of the original program, and reads the original program into memory. The in step 162 of FIG. 13, the translator searches for the entry points defined by the image format of the original program. The translator looks at the first few and last few bytes of the first disc block to see if it is a header for a translatable VAX.TM./VMS.TM. program image. If so, the translator parses the rest of the header, looking for image section descriptors (ISDs), and organizing the disk blocks into the described image sections, such as the images sections 185-191 shown in the memory map 181 of FIG. 14. If a VAX.TM./VMS.TM. fixup vector is present, the translator parses it to determine addresses in the original program that are indirect references to other images.
To find entry points that are defined by the image format of the original program, the translator scans the image header 185 for entry points but ignores debugger entry points. The translator then scans, when present, the debug symbol table (DST) 189 and, when present, the global symbol table (GST) 191, to find CALL entries and SYMBOL entries. Validity checks are performed on each possible CALL entry point, such as insuring that the CALL mask is valid, the mask is followed by legal code, and the code does not contain privileged instructions. The legal code check decoded the code through three layers of basic blocks from the call mask. SYMBOL entries are ignored when the symbols are flagged by the linker as being absolute references, because absolute references are typically constants. Otherwise, a SYMBOL entry might be a JSB entry point or a mislabeled CALL entry point. The code from the entry point is decoded and the legal code check just described is performed assuming that the entry point is a CALL entry,