United States Patent
6289498
Dupenloup
September 11, 2001
Title
VDHL/Verilog expertise and gate synthesis automation system
Abstract
A method of fabricating an integrated circuit chip (IC), said method comprising the steps of defining the IC at the RTL code level, translating said RTL code into a generic netlist description, generating logic synthesis tool scripts based on said generic netlist description, and executing said logic synthesis tool scripts to synthesize the RTL code. The step of generating logic synthesis tool scripts comprises the substeps of identifying hardware elements and structure of the IC design, determining interrelationships between said identified hardware elements and structures, and generating logic synthesis tool scripts to synthesize said identified hardware elements to netlists as a function of said hardware elements and said interrelationships.
Inventors:
Dupenloup; Guy (Marly-le-Roi, FR)
Assignee:
LSI Logic Corporation (Milpitas, CA)
Appl. No.:
027422
Filed:
February 20, 1998
Current U.S. Class:
716/18
Current International Class:
G06F 17/50 (20060101)
Field of Search:
395/500.02,500.19,500.04 716/1,3,18
U.S. Patent Documents
5210700    May 1993          Tom
5493508    February 1996     Dangelo et al.
5526277    June 1996         Dangelo et al.
5544066    August 1996       Rostoker et al.
5544067    August 1996       Rostoker et al.
5557531    September 1996    Rostoker et al.
5572437    November 1996     Rostoker et al.
5740347    April 1998        Avidan
5790435    August 1998       Lewis et al.
5801958    September 1998    Dangelo et al.
5812416    September 1998    Gupte et al.
5854752    December 1998     Agarwal
5867395    February 1999     Watkins
5870308    February 1999     Dangelo
5880971    March 1999        Dangelo et al.
5903475    May 1999          Gupte et al.
5956256    September 1999    Rezek et al.
Primary Examiner:
Smith; Matthew
Assistant Examiner:
Garbowski; Leigh Marie
Attorney, Agent or Firm:
Mitchell, Silberberg & Knupp LLP
Claims
What is claimed is:
1. A method of designing an integrated circuit chip (IC), said method comprising the steps of:
obtaining a definition of the IC in RTL code;
translating said RTL code into a generic netlist description;
generating a logic synthesis tool script based on said generic netlist description;
executing said logic synthesis tool script to synthesize the RTL code.
2. The method according to claim 1 wherein said step of generating a logic synthesis tool script comprises the steps of:
identifying hardware elements and structure of the IC;
determining interrelationships between said identified hardware elements and structure; and
generating the logic synthesis tool script to synthesize said identified hardware elements to netlists as a function of said hardware elements and said interrelationships.
3. The method according to claim 2 wherein said generated logic synthesis tool script comprises logic synthesis tool commands to perform following operations:
library customization;
hierarchy ungrouping of modules included within an IC design;
initial mapping of the IC design;
characterization of each module included within the IC design; and
re-synthesis of the IC design.
4. The method according to claim 3 wherein said generated logic synthesis tool script further comprises a command to synthesize each module included within the IC design.
5. The method according to claim 2 further comprising analysis and detection of predetermined configurations of hardware.
6. A method of generating instructions for a logic synthesis tool to synthesize an integrated circuit chip (IC) design to a gate-level description of the IC design, said method comprising the steps of:
identifying hardware elements and structure of the IC design;
determining interrelationships between said identified hardware elements and structure; and
generating a logic synthesis tool script as a function of said hardware elements and said interrelationships.
7. An apparatus for designing an integrated circuit chip (IC), said apparatus comprising:
means for obtaining a definition of the IC in RTL code;
means for translating said RTL code into a generic netlist description;
means for generating logic synthesis tool scripts based on said generic netlist description;
means for executing said logic synthesis tool scripts to synthesize the RTL code.
8. An apparatus for generating instructions for a logic synthesis tool to synthesize an integrated circuit chip (IC) design to a gate-level description of the IC design, said apparatus comprising:
means for identifying hardware elements and structure of the IC design;
means for determining interrelationships between said identified hardware elements and structure; and
means for generating a logic synthesis tool script as a function of said hardware elements and said interrelationships.
9. A computer storage medium having encoded thereon instructions for a computer processor, comprising:
a computer encoded instruction for obtaining a definition of the IC in RTL code;
a computer encoded instruction for translating said RTL code into a generic netlist description;
a computer encoded instruction for generating logic synthesis tool scripts based on said generic netlist description;
a computer encoded instruction for executing said logic synthesis tool scripts to synthesize the RTL code.
10. A computer storage medium of claim 9 wherein said computer storage medium is selected from a group consisting of magnetic device, optical device, magneto-optical device, floppy diskette, CD-ROM, magnetic tape, computer hard drive, and memory card.
11. A computer storage medium having instructions thereon for generating instructions for a logic synthesis tool to synthesize an integrated circuit chip (IC) design to a gate-level description of the IC design, said computer storage device comprising:
a computer encoded instruction for identifying hardware elements and structure of the IC design;
a computer encoded instruction for determining interrelationships between said identified hardware elements and structure; and
a computer encoded instruction for generating a logic synthesis tool script as a function of said hardware elements and said interrelationships.
12. A computer storage medium of claim 11 wherein said computer storage medium is selected from a group consisting of magnetic device, optical device, magneto-optical device, floppy diskette, CD-ROM, magnetic tape, computer hard drive, and memory card.
Description
This application includes a microfiche appendix that consists of two microfiche for a total of 150 frames. All references herein to the term "Appendix" are intended to refer to this microfiche appendix.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a method of analyzing and optimizing integrated circuit (IC) designs. In particular, the present invention relates to a method of analyzing IC designs and automating design synthesis by generating synthesis scripts.
2. Description of the Related Art
Today, the design of most digital integrated circuits (IC's) is a highly structured process based on an HDL (Hardware Description Language) methodology. FIG. 1 illustrates a simplified flowchart representation of an IC design cycle. First, as indicated by the reference number 102, the IC to be designed is specified by a specification document.
Then, the IC design is reduced to HDL code, as indicated by the reference number 104. This level of design abstraction is referred to as the Register Transfer Level (RTL), and is typically implemented using an HDL such as Verilog-HDL ("Verilog") or VHDL. At the RTL level of abstraction, the IC design is specified by describing the operations that are performed on data as it flows between circuit inputs, outputs, and clocked registers. The RTL level description is referred to as the RTL code, which is generally written in Verilog or in VHDL.
The IC design, as expressed by the RTL code, is then synthesized to generate a gate-level description, or a netlist. This is referred to by the reference number 106 of FIG. 1. Synthesis is the step taken to translate the architectural and functional descriptions of the design, represented by RTL code, to a lower level of representation of the design, such as logic-level and gate-level descriptions.
The IC design specification and the RTL code are technology independent. That is, the specification and the RTL code do not specify the exact gates or logic devices to be used to implement the design. However, the gate-level description of the IC design is technology dependent. This is because, during the synthesis process, the synthesis tool uses a given technology library, 108 of FIG. 1, to map the technology independent RTL code into technology dependent gate-level netlists.
An integrated circuit chip (hereafter referred to as an "IC" or a "chip") comprises cells and connections between the cells formed on a surface of a semiconductor substrate. The IC may include a large number of cells and require complex connections between the cells.
A cell is a group of one or more circuit elements such as transistors, capacitors, and other basic circuit elements grouped to perform a function. Each of the cells of an IC may have one or more pins, each of which, in turn, may be connected to one or more other pins of the IC by wires. The wires connecting the pins of the IC are also formed on the surface of the chip.
A net is a set of two or more pins which must be connected, thus connecting the logic circuits having the pins. Because a typical chip has thousands, tens of thousands, or hundreds of thousands of pins that must be connected in various combinations, the chip also includes definitions of thousands, tens of thousands, or hundreds of thousands of nets, or sets of pins. The number of nets for a chip is typically of the same order of magnitude as the number of cells on that chip. Commonly, a majority of the nets include only two pins to be connected; however, many nets comprise three or more pins. Some nets may include hundreds, thousands, or tens of thousands of pins to be connected. A netlist is a list of nets including names of connected pins, or a list of cells including names of nets that connect to pins of cells.
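The two netlist views mentioned above (a list of cells naming their nets, and a list of nets naming their pins) can be related by a small Python sketch. It is an illustration only: the cell, pin and net names and the `cells_to_nets` helper are hypothetical and are not taken from the Appendix.

```python
from collections import defaultdict

# Cell-oriented view: each cell maps its pins to the net attached to them.
# All names are invented for illustration. Top-level ports are left out,
# so some nets show a single pin here.
CELLS = {
    "U1": {"A": "n1", "B": "n2", "Z": "n3"},
    "U2": {"A": "n3", "Z": "n4"},
    "U3": {"D": "n3", "CP": "clk", "Q": "n5"},
}


def cells_to_nets(cells):
    """Convert a cell-oriented netlist into a net-oriented one.

    Returns a dict mapping each net name to the sorted list of
    "cell/pin" names connected to that net.
    """
    nets = defaultdict(list)
    for cell, pins in cells.items():
        for pin, net in pins.items():
            nets[net].append(f"{cell}/{pin}")
    return {net: sorted(pins) for net, pins in nets.items()}


if __name__ == "__main__":
    for net, pins in sorted(cells_to_nets(CELLS).items()):
        print(net, pins)
```

In this toy data, net "n3" is the only net with three pins; the others would be completed by ports or cells that the sketch omits.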
A netlist may be generic or technology specific. A generic netlist is a netlist created from the RTL code that has not yet been correlated with a technology specific library of cells. A technology specific netlist, or a mapped netlist, is a netlist created after the IC design has been mapped to a particular technology-specific library of cells.
Continuing to refer to FIG. 1, after the synthesis of the design, the gate-level netlist is verified 110, the layout of the circuits is determined 112, and the IC is fabricated 114.
At the RTL level, designers must make all key design decisions, such as design hierarchy and partitioning, clocking scheme, reset scheme, and locations of registers. All those decisions are contained and reflected in the RTL code. The RTL code is technology independent, as well as independent from design tools.
As a result, some characteristics of the RTL code can strongly influence further design steps, including logic synthesis, gate-level simulation, static timing analysis, test insertion and layout. Unexpected problems and difficulties with the IC design can be encountered at any of these steps and cause implementation obstacles impacting project schedules and costs.
Some problems, referred to as showstoppers, may render the design not feasible for fabrication. For example, it may be realized during clock distribution that the design uses an unsupported clocking scheme, such as clock signals that are gated "on the fly" whenever needed. A clock signal is gated "on the fly" when a gate, usually an AND gate, is used to turn on the clock signal only when it is needed by a particular sub-circuit and to turn it off the rest of the time. This is a common technique to reduce power consumption. The problem arises if and when the same clock signal is needed elsewhere. Then, clock distribution cannot be made, and the RTL code needs significant re-work.
Other design problems may present implementation obstacles requiring the engineering efforts to be much higher than expected. For example, it may be realized during logic synthesis that the design is partitioned in a very "synthesis unfriendly" manner. In such a case, the automatic features of the synthesis tools cannot be used, and, in its place, a lot of manual work is required to meet timing and other parameters.
Encountered late in the design cycle, such problems can greatly impact project schedules and design cost. The later the problems are discovered, the more significant the impact and the higher the cost in time and expenditure to correct the error. For example, timing or routability problems encountered during layout can require a new run through logic synthesis, gate-level verification, and test logic insertion. Modifying the RTL code late in the design process is generally the worst case scenario because once the RTL code is modified, all design steps must be re-run, including the RTL functional validation. For many design projects, RTL modification is not even a viable option.
To identify the potential problems with the IC design as early as possible, RTL code can be analyzed, manually or automatically. However, some design issues can be missed if the RTL code itself is used to analyze the design. In addition, some constructs of the languages used for the RTL code, such as Verilog and VHDL, leave room for more than one interpretation when implementing them in hardware. These shortcomings exist because the languages used for the RTL code, Verilog and VHDL, lack formally defined synthesis semantics. These languages lack formally defined synthesis semantics because they were developed as simulation languages before logic synthesis tools were available.
SUMMARY OF THE INVENTION
The general purpose of the present invention is to provide the means to analyze IC designs early in the design process in order to allow correction of problems early on. Therefore, an object of the present invention is to extract critical design information from RTL code and identify early in the design process issues that can impact further design steps. The size and complexity of RTL code make "manual" RTL analysis unworkable.
Based on the context described above, another object of the invention is to define tools referred to as "synthesis script generation tools", that automatically extract from RTL code design information that is required for synthesis, including design hierarchy, clock sources and fanouts, hierarchy purity of modules, and types of pins that drive module outputs, and create optimized scripts to synthesize the design in a given target technology, using a given target synthesis tool. Purity of hierarchy refers to whether a particular level includes sub-designs only, logic only (if leaf), or sub-designs mixed with logic. Types of pins that drive module outputs may be registered or unregistered, and driven or not-driven by a flipflop.
It is a further object of this invention to provide a method of accessing the generic netlist from the Synopsys Design Compiler or similar synthesis tools. As discussed above, a generic netlist is a netlist created from the RTL code which has not been correlated with a technology-specific library. For example, RTL code describing a select function between sixteen input signals to a single output signal may be implemented as a multiplexer circuit (a "MUX"). A generic netlist may represent the sample circuit as a 16×1 MUX having 16 input signals, four input selection signals, and one output signal. In contrast, a technology-specific netlist may represent the sample circuit as a cascade of several 4×1 MUX's.
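The difference between the two representations of the sixteen-input select function can be made concrete with a short Python sketch. It assumes one particular mapping, five 4-to-1 MUXes arranged in two levels; the source only says "a cascade of several", so the exact structure and the function names below are illustrative assumptions.

```python
def mux(inputs, select):
    """Generic N-to-1 multiplexer: return the input chosen by select."""
    return inputs[select]


def mux16_generic(inputs, sel):
    """Generic view: one 16-to-1 MUX driven by a 4-bit select value."""
    assert len(inputs) == 16 and 0 <= sel < 16
    return mux(inputs, sel)


def mux16_cascade(inputs, sel):
    """Mapped view (assumed): five 4-to-1 MUXes in two levels."""
    assert len(inputs) == 16 and 0 <= sel < 16
    low, high = sel & 0b11, sel >> 2
    # First level: four 4-to-1 MUXes share the two low select bits.
    stage1 = [mux(inputs[4 * g:4 * g + 4], low) for g in range(4)]
    # Second level: one 4-to-1 MUX picks a group with the two high bits.
    return mux(stage1, high)


if __name__ == "__main__":
    data = list(range(100, 116))
    same = all(mux16_generic(data, s) == mux16_cascade(data, s)
               for s in range(16))
    print("equivalent for all select values:", same)
```

Both functions compute the same output for every select value; only the structure differs, which is the distinction between the generic and the technology-specific netlist.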
Another object of the present invention is to extract critical design information from a generic netlist and identify as soon as possible issues that can impact further design steps. Analysis of RTL code may miss some design issues. These potentially problematic issues which can be missed at the RTL code analysis phase can be identified if the IC design is analyzed at the generic netlist level.
Accordingly, it is a further object of this invention to provide a method of accessing the generic netlist from the Synopsys Design Compiler or similar synthesis tools. Before information can be extracted from a generic netlist, one must first have access to the generic netlist.
Another object of the present invention is to effectively analyze mapped designs for buffering trees and determine their structure, their root pins, or active transitions or levels on their leaf pins.
Another object of the present invention is to utilize mapping techniques to maintain the known names of the source pins of the clocks even after the initial mapping process. During the initial mapping process, the names of cells and pins are assigned by synthesis tools. Because of the name assignments, the names of the source pins of internal clocks are modified and are no longer available for resynthesis and characterization steps.
Another object of the present invention is to increase the speed at which large designs are synthesized by creative use of a dc_shell command to characterize the modules of the design. As discussed herein, synthesis of IC designs involves iterations of the following two steps: bottom-up synthesis of sub-modules, and top-down characterization. The top-down characterization step can be improved by characterizing a list of module instances rather than characterizing one module at a time.
Another object of the present invention is to define a practical technique to synthesize an IC design having DesignWare modules. As discussed herein, DesignWare modules are typically predefined structured logic circuits with predefined characteristics. Because they are predefined to be general logic elements, DesignWare components may include circuits and pins which may not be necessary, such as unused I/O ports. The present invention discloses techniques, including ungrouping and resynthesis, to improve the performance of the synthesis script.
These and other aspects, features, and advantages of the present invention will be apparent to those persons having ordinary skill in the art to which the present invention relates from the foregoing description and the accompanying drawings.
Accordingly, the present invention is a method of fabricating an integrated circuit chip (IC), said method comprising the steps of defining the IC at the RTL code level, translating said RTL code into a generic netlist description, generating logic synthesis tool scripts based on said generic netlist description, and executing said logic synthesis tool scripts to synthesize the RTL code. The step of generating logic synthesis tool scripts comprises the substeps of identifying hardware elements and structure of the IC design, determining interrelationships between said identified hardware elements and structures, and generating logic synthesis tool scripts to synthesize said identified hardware elements to netlists as a function of said hardware elements and said interrelationships.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart depicting the design cycle with respect to the early analysis of RTL code.
FIG. 2 illustrates a clock domain interface between a first clock domain and a second clock domain.
FIG. 3A illustrates asynchronous clock domains.
FIG. 3B illustrates related clock domains with coincident active edges.
FIG. 3C illustrates related clock domains with sequenced active edges.
FIG. 4 illustrates the exchange of a data bus between two asynchronous clock domains.
FIG. 5 illustrates a scheme where delay cells are used to delay signals.
FIG. 6 illustrates that some violations will not be caught when creating modules.
FIG. 7 illustrates one possible time budgeting scheme.
FIG. 8 illustrates that modules that mix logic with sub-modules can create complex paths spanning several other modules that are compiled independently.
FIG. 9 illustrates that modules that are compiled independently must have all their outputs "registered", or driven, by flipflops.
FIG. 10A illustrates that flipflops can then be chained together to isolate the RAM in scan mode.
FIG. 10B illustrates that if there is some logic in "RAM's shadow", then specific logic has to be added to isolate the RAM in test mode.
FIG. 11 illustrates the concept of using multiple buffers in a tree structure to provide a signal on line to a plurality of elements.
FIG. 12 illustrates a preferred process of translating an RTL code into a generic netlist.
FIG. 13 illustrates an overview of the system.
FIG. 14 illustrates that bottom-up synthesis starts from leaf modules in the design hierarchy.
FIG. 15 illustrates the characterization of a synthesized module.
FIG. 16 illustrates the technique where a "snapshot" of the design is obtained, with conditions and constraints on I/O modules, that both reflect the current implementation of the design and synthesis goals.
FIG. 17 illustrates default constraints used for initial mapping.
FIG. 18 shows the result of the initial mapping process.
FIG. 19 illustrates the iterative improvement process.
FIG. 20A illustrates "broken timing paths."
FIG. 20B illustrates that the delay consumed in driving flipflops can easily be approximated.
FIG. 20C illustrates the clock period allocation technique.
FIG. 21 illustrates that synthesis scripts generation tools have to include three types of elements.
FIG. 22 illustrates ungrouping of small modules, used to build bigger modules that are more appropriate for synthesis.
FIG. 23 illustrates how grouping can be used to eliminate broken timing paths due to non-registered module outputs, and to embed clouds of logic mixed with design hierarchy.
FIG. 24 illustrates support for design hierarchy re-arrangement.
FIG. 25 illustrates module processing order for parallel bottom-up synthesis.
FIG. 26 illustrates the database to be used to run VEGA scripts.
FIG. 27 is a flowchart illustrating the script flow implemented by VEGA.
FIG. 28 is a flowchart illustrating the structure of initial mapping script.
FIG. 29 is a flowchart illustrating operations performed on each module by initial mapping.
FIG. 30 is a flowchart illustrating the structure of characterization.
FIG. 31 is a flowchart illustrating the structure of constraints setting on top-level.
FIG. 32 is a flowchart illustrating the structure of re-synthesis.
FIG. 33 is an example of RTL code and equivalent hardware view for the RTL analysis.
FIG. 34A illustrates the most intuitive RTL model for a register with partial asynchronous reset.
FIG. 34B shows how Synopsys Design Compiler maps the VHDL code of FIG. 34A to a target technology.
FIG. 34C shows how AMBIT BuildGates maps the same piece of VHDL code to a target technology.
FIG. 35A shows an example of module with unconnected input pins.
FIG. 35B shows how Synopsys Design Compiler ties unconnected module input pins to logic zero.
FIG. 36 illustrates the logic synthesis process.
FIG. 37 illustrates the failing implementation of a latch with clear.
FIG. 38 shows how templates in the RTL code, which are based on basic statements and constructs of the HDL, are transformed in a straightforward manner to equivalent hardware structures.
FIG. 39 illustrates external and internal clocks.
FIG. 40 illustrates the process used to map cells that create internal clocks.
FIG. 41A illustrates a clock source retrieved through using a connected port.
FIG. 41B illustrates a clock retrieved through using a connected clock input pin on a RAM.
FIG. 41C illustrates a clock source retrieved through using a connected net.
FIG. 42 is a diagram illustrating the altering of internal clocks through initial mapping.
FIG. 43 is a diagram illustrating handling clock generators with a "blackbox_design" direction.
FIG. 44 is an example of a buffering tree used for clock distribution.
FIG. 45 is an example of parallel buffers.
FIG. 46 illustrates the environment in which the present invention generally is operated and practiced.
FIG. 47 is an illustration of an integrated circuit chip fabricated in accordance with the design ultimately derived by use of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Presented herein is a system for analyzing a circuit design at the RTL level. The system can be based on the extraction and analysis of information from RTL code. Preferably, however, the system analyzes information extracted from a generic netlist created by a logic synthesis tool.
A. RTL DESIGN ANALYSIS
As discussed above, information for analysis can be extracted directly from RTL code. This section discusses the extraction of information from RTL code and the analysis of such information.
1. Extract Critical Design Information
The first step in utilizing RTL code is to extract the critical design information required for the analysis.
a. Identify Key Hardware Elements And Their Key Pins With Active Edges Or Levels.
RTL code (in Verilog or VHDL) can be parsed in order to identify key hardware elements. Such key hardware elements include flipflops, latches, tristate buffers, bidirectional buffers and memories. With respect to these key hardware elements, key pins with the elements' active edges or levels can also be identified. For example, with regard to a flipflop, the following information can be extracted: The data input pin; the clock pin with an active edge (rising or falling), a clear pin with an active level (low or high); and a preset pin with active level (low or high).
In addition to the key hardware elements, interconnections between hardware elements must be understood. Finally, the function of the clouds, or sets, of combinational logic needs to be understood to some extent, so that RTL analysis is able to track design issues.
Extracting key hardware elements is referred to herein as "inference." Accordingly, for example, "flipflop inference", "latch inference", "tristate logic inference," and "bidirectional logic inference" refer to the inference of flipflops, latches, tristate logic and bidirectional logic respectively. Inference involves identifying templates in the RTL code that indicate the presence of those elements.
For example, the following Verilog-HDL construct implies that signal "Q" is driven by a flipflop that is clocked on the rising edge of net "CLK" and cleared asynchronously on the low level of net "RESET_N". This is the usual way of describing a flipflop or a register (a set of flipflops). RTL designers are quite familiar with this kind of template, which logic synthesis tools also recognize.
always @ (negedge RESET_N or posedge CLK)
begin
  if (~RESET_N)
    Q <= 1'b0;   // asynchronous clear on the low level of RESET_N
  else
    Q <= DATA;   // load DATA on the rising edge of CLK
end
Flipflops, latches, tristate buffers and bidirectional buffers of a technology library can also be directly instantiated in the RTL code. For example, the following Verilog-HDL instantiation statement implies that signal "Q" is driven by an "FD1A" flipflop of a technology library, that is clocked on the rising edge of "CLK" and cleared asynchronously on the low level of net "RESET_N":
A library of technology cells can be used to identify cells and their special pins.
Memories used in RTL code and associated key pins can be identified based on naming conventions provided by the chip manufacturer and utilized in the code, or by asking the user to declare those cells as being memories. For example, using the naming convention of one semiconductor fabricator, the following VHDL module declaration indicates the presence of a synchronous one-port 256×8 bit RAM:
entity RRS1P256x8 is
  port (
    CLK : in STDU_LOGIC;
    WE  : in STDU_LOGIC;
    OEN : in STDU_LOGIC;
    DI7 : in STDU_LOGIC;
    DI6 : in STDU_LOGIC;
    DI5 : in STDU_LOGIC;
    DI4 : in STDU_LOGIC;
    DI3 : in STDU_LOGIC;
    DI2 : in STDU_LOGIC;
    DI1 : in STDU_LOGIC;
    DI0 : in STDU_LOGIC;
    A7  : in STDU_LOGIC;
    A6  : in STDU_LOGIC;
    A5  : in STDU_LOGIC;
    A4  : in STDU_LOGIC;
    A3  : in STDU_LOGIC;
    A2  : in STDU_LOGIC;
    A1  : in STDU_LOGIC;
    A0  : in STDU_LOGIC;
    DO7 : out STDU_LOGIC;
    DO6 : out STDU_LOGIC;
    DO5 : out STDU_LOGIC;
    DO4 : out STDU_LOGIC;
    DO3 : out STDU_LOGIC;
    DO2 : out STDU_LOGIC;
    DO1 : out STDU_LOGIC;
    DO0 : out STDU_LOGIC);
end RRS1P256x8;
In the above VHDL module, the clock pin is "CLK" active on the rising edge. The write enable pin, which controls whether data is being written into or read from the RAM, is pin "WE" active on the high level. The RAM also features output enabling, controlled through pin "OEN" active on the low level.
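Recognizing a memory from its name can be sketched in Python as below. The fabricator's actual naming convention is not spelled out in the text, so the pattern used here (a letter prefix, a port count followed by "P", then "<words>x<bits>") is purely an assumption inferred from the single example name, and `parse_ram_name` is a hypothetical helper.

```python
import re

# ASSUMED convention, inferred only from the example "RRS1P256x8":
#   <letters><ports>P<words>x<bits>
RAM_NAME_RE = re.compile(
    r"^(?P<prefix>[A-Z]+)(?P<ports>\d)P(?P<words>\d+)x(?P<bits>\d+)$"
)


def parse_ram_name(name):
    """Return RAM parameters decoded from a module name, or None."""
    m = RAM_NAME_RE.match(name)
    if m is None:
        return None
    words = int(m.group("words"))
    return {
        "prefix": m.group("prefix"),
        "ports": int(m.group("ports")),
        "words": words,
        "bits": int(m.group("bits")),
        # Address pins needed to select one of `words` locations.
        "address_bits": (words - 1).bit_length(),
    }


if __name__ == "__main__":
    print(parse_ram_name("RRS1P256x8"))
    print(parse_ram_name("FD1A"))
```

For the example name, the decoded sizes agree with the entity declaration: eight address pins (A7 to A0) and eight data pins in each direction.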
Accordingly, as a first step in the process, a conventional parser searches the RTL code for recognizable patterns and naming conventions implying critical hardware elements, as well as specific pins of those elements, and also correlates technology cells instantiated in the code (if any) with a library of technology cells.
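As a rough illustration of the template search described above, the following Python sketch looks for the flipflop template in Verilog text using regular expressions. It is not the parser of the invention: a real tool would use a full HDL parser, and this sketch handles only a single asynchronous reset per block. The function name and result keys are hypothetical.

```python
import re

# Sensitivity list, e.g. "always @ (negedge RESET_N or posedge CLK)".
ALWAYS_RE = re.compile(r"always\s*@\s*\(([^)]*)\)")
EDGE_RE = re.compile(r"\b(posedge|negedge)\s+(\w+)")
# First condition in the block, e.g. "if (~RESET_N)" or "if (RESET)".
IF_RE = re.compile(r"\bif\s*\(\s*([~!]?)\s*(\w+)\s*\)")


def infer_flipflops(rtl):
    """Return one dict per edge-triggered always block found in rtl."""
    results = []
    blocks = list(ALWAYS_RE.finditer(rtl))
    for i, m in enumerate(blocks):
        edges = {sig: edge for edge, sig in EDGE_RE.findall(m.group(1))}
        if not edges:
            continue  # level-sensitive block: not a flipflop template
        info = {"clock": None, "clock_edge": None,
                "async_reset": None, "reset_level": None}
        # Only look for the condition inside this block's own text.
        end = blocks[i + 1].start() if i + 1 < len(blocks) else len(rtl)
        cond = IF_RE.search(rtl, m.end(), end)
        if cond and len(edges) > 1 and cond.group(2) in edges:
            info["async_reset"] = cond.group(2)
            info["reset_level"] = "low" if cond.group(1) else "high"
        for sig, edge in edges.items():
            if sig != info["async_reset"]:
                info["clock"] = sig
                info["clock_edge"] = ("rising" if edge == "posedge"
                                      else "falling")
        results.append(info)
    return results


if __name__ == "__main__":
    sample = """always @ (negedge RESET_N or posedge CLK) begin
      if (~RESET_N) Q = 1'b0; else Q = DATA;
    end"""
    print(infer_flipflops(sample))
```

The signal tested first in the block is taken as the asynchronous reset only when it also appears in the sensitivity list; any other edge-sensitive signal is treated as the clock.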
b. Determine Critical Design Information.
Following the identification of key hardware elements and associated key pins, critical design information can be extracted from RTL code, including design hierarchy, nets, hierarchy purity of modules, registered/unregistered module outputs, logic surrounding memories, data busses, and high-fanout nets and fanout statistics.
The "design hierarchy" includes hierarchical modules, memories, IP cores and hard macros, and instantiated technology cells. Nets include, for example, multiply-driven nets, clock nets, asynchronous reset nets, synchronous reset nets, and RAM write enable nets.
A "multiply-driven net" is a net that is driven by more than one driver. Possible drivers are cell output pins, input ports on the top-level module, and module output pins that are unconnected. Tristate nets and bidirectional nets fall into this category. A multiply-driven net can be identified in RTL code by searching for design objects and statements that can assign values to the corresponding RTL signal. Design objects that can assign a value to a signal are connected input and input/output ports on the top-level of the design. Statements that can assign a value to a signal are process statements and concurrent signal assignment statements. If a signal can be assigned a value by more than one top-level port and/or statement, then the corresponding net in the implied hardware structure is multiply-driven.
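The driver-counting rule above can be sketched in Python as follows. The input format is an assumption made for illustration: a list of signals driven by top-level input or input/output ports, and a list of (statement, signal) pairs produced by some earlier parsing step. The `find_multiply_driven` name and the labels in the output are hypothetical.

```python
from collections import defaultdict


def find_multiply_driven(top_level_inputs, assignments):
    """Return signals that have more than one driver.

    top_level_inputs: signals driven by top-level input or inout ports.
    assignments: (statement_id, signal) pairs, one per process or
        concurrent assignment statement that can assign the signal.
    """
    drivers = defaultdict(set)
    for sig in top_level_inputs:
        drivers[sig].add(f"port:{sig}")
    for stmt, sig in assignments:
        # A set is used so one statement counts once per signal.
        drivers[sig].add(f"stmt:{stmt}")
    return {sig: sorted(d) for sig, d in drivers.items() if len(d) > 1}


if __name__ == "__main__":
    ports = ["DIN", "BUS"]
    stmts = [("p1", "BUS"), ("p2", "Q"), ("p3", "BUS"), ("p2", "Q")]
    print(find_multiply_driven(ports, stmts))
```

In the sample data only "BUS" is reported, because it is driven by a top-level port and by two different statements, while "Q" is assigned by a single statement.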
A "clock net" is a net connected to one or more clock pins of flipflops, enable pins of latches, and clock pins of synchronous RAMs. A clock net can be identified in an RTL code as a signal that controls the assignment of other signals, based on a transition of that signal from one level to another (implies flipflops) or based on setting of that signal to a given level (implies latches). Templates are well know by RTL designers, and are recognized by logic synthesis tools during the "translation" phase. A Verilog-HDL example has been given above, for a flipflop clocked on the rising edge of a signal CLK.
An "asynchronous reset net" is a net connected to one or more asynchronous clear and preset pins of flipflops. An asynchronous reset net can be identified in the RTL code as a signal that controls the loading of constant data into a set of flipflops (resp. latches) regardless of the clock (resp. the enable) that controls those flipflops (resp. latches). Templates are well know by RTL designers, and are recognized by logic synthesis tools during the "translation" phase. A Verilog-HDL example has been given above, for a flipflop cleared on the low level of a signal RESET_N.
A "synchronous reset net" is a net connected to one or more synchronous clear and preset pins of flipflops. If the target technology library does not include flipflops with synchronous reset pins, synchronous reset modes implied by the RTL code are implemented by logic synthesis like any other functional clock-synchronous mode, using the data input pins of flipflops. A synchronous reset net can be identified in the RTL code as a signal that controls the loading of constant data into a set of flipflops synchronously to the clock that controls those flipflops. Templates are well know by RTL designers, and are recognized by logic synthesis tools during the "translation" phase.
A "RAM write enable net" is a net connected to one or more write enable pins of asynchronous or synchronous RAMs. In most cases, recognizing a RAM write enable pin from RTL code is difficult. Modeling uses behavioral code for memories, not RTL code, and numerous templates and constructs can be used. Naming conventions or other identifying indications from designers are needed.
A "clock domain" is defined as the set of all flipflops and synchronous RAMs that are clocked on the same edge of the same clock net. A "clock domain interface" is the logic that allows data to be transferred from one clock domain to another. Referring to FIG. 2, two clock domains 120 and 122 are illustrated. The first clock domain 120 comprises two flipflops 124 and 126 and a set of combinational logic circuits 128 (represented by a "cloud"). The elements 124, 126, and 128 are driven by a first clock signal CLK1 121. The second clock domain 122 includes flipflops 132 and 134 and a set of circuits 136, and is driven by a second clock signal CLK2 123. FIG. 2 also illustrates a clock domain interface 130 between the first clock domain 120 and the second clock domain 122. The interface 130 may include combinational logic circuits 138. Clock domain extraction can proceed after all clock nets have been extracted, including all memory elements that are connected to each clock net together with active edges and levels. Memory elements include flipflops, latches and synchronous RAMs. A clock domain is defined as the set of memory elements that are controlled by the same clock net on the same edge or level. A given memory element can belong to several clock domains. Clock domain extraction is being implemented in VEGA and will be described in further disclosures.
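The grouping rule in this definition can be sketched in Python as follows. This is a minimal sketch, not the VEGA implementation: the tuple format, the function name `extract_clock_domains`, the element names, and the assumption that both clocks of FIG. 2 are active on their rising edge are all introduced for illustration only.

```python
from collections import defaultdict

def extract_clock_domains(clock_pins):
    """Group memory elements by (clock net, active edge or level).

    `clock_pins` is a hypothetical list of (element, clock_net, active)
    tuples, one per clock or enable pin. An element with several such
    pins therefore appears in several clock domains.
    """
    domains = defaultdict(set)
    for element, clock_net, active in clock_pins:
        domains[(clock_net, active)].add(element)
    return dict(domains)

# Element names loosely follow FIG. 2; the "rising" edges are assumed.
pins = [
    ("ff124", "CLK1", "rising"),
    ("ff126", "CLK1", "rising"),
    ("ff132", "CLK2", "rising"),
    ("ff134", "CLK2", "rising"),
]
domains = extract_clock_domains(pins)
```

Keying each domain on the pair (clock net, active edge or level) means that flipflops clocked on opposite edges of the same net fall into different domains, as the definition requires.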
The "hierarchy purity" of a module is defined as "pure" if the module contains only sub-modules. It is defined as "leaf" if it contains only logic (i.e., no further hierarchy exists below that level). The hierarchy purity of a module is defined as "mixed" if it contains both sub-modules and logic. A module in an RTL code is pure if it only contains statements that instantiate other modules. A module is a "leaf" if it contains no statements that instantiate other modules. Other modules are "mixed".
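The classification above reduces to two counts per module. A minimal Python sketch follows; the function name `hierarchy_purity` and its two count arguments are assumptions made for illustration, and how those counts are obtained from the RTL code is outside the sketch.

```python
def hierarchy_purity(num_instantiations, num_logic_statements):
    """Classify a module as "pure", "leaf", or "mixed".

    num_instantiations:   statements that instantiate other modules
    num_logic_statements: all other statements (logic)
    """
    if num_instantiations == 0:
        return "leaf"    # no further hierarchy below this module
    if num_logic_statements == 0:
        return "pure"    # contains sub-modules only
    return "mixed"       # sub-modules and logic together
```

For example, a module with three instantiations and no other statements is classified as "pure", while one with two instantiations and four logic statements is "mixed".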
Busses are groups of nets related together. Modeled as single entities in the RTL code, busses are split into a number of individual nets by logic synthesis. Generally, busses are identified for analysis purposes by recognizing naming conventions. For example, the following VHDL statement declares a 32-bit bus named "DATA" that will be split into 32 individual nets by logic synthesis:
The "fanout" of a net is defined as the number of input pins connected to the net. If the net is a multiply-driven net, it is defined as the total number of pins connected to the net minus one, as there should be only one pin driving the net at a time. The generic netlist is more suitable here, but only gives an approximate value of fanouts. More accurate values can be obtained through synthesizing the design to a generic library like the Synopsys GTECH library. Actual values will only be obtained after the design is mapped to the target technology. However, it must be noted that the fanout values of some critical nets, in particular clock nets and asynchronous reset nets, are fully accurate when working on the generic netlist, and could also be directly computed from the RTL code.
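The two cases of the fanout definition can be written as a short Python function. This is a sketch only; the name `net_fanout` and the representation of a net as two pin counts are assumptions for illustration.

```python
def net_fanout(input_pins, driver_pins):
    """Fanout of a net, following the definition above.

    Singly driven net:   number of input pins on the net.
    Multiply-driven net: total number of pins on the net minus one,
                         since only one pin should drive at a time.
    """
    if driver_pins > 1:
        return input_pins + driver_pins - 1
    return input_pins
```

For instance, a net with four input pins and three tristate drivers has seven pins in total and therefore a fanout of six.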
2. Critical Design Analysis
The design hierarchy is, in general, a piece of information that is used in all design steps. It shows how the design has been partitioned in modules of lower complexity. The number of hierarchy levels, the number of modules and the approximate complexity of each module are key information at all design steps, in particular for logic synthesis and layout.
Once the above-discussed information is extracted from the RTL code, the information is then analyzed and tested in order to determine whether there exist associated potential design problems. The hierarchy is analyzed to determine how the design has been partitioned into smaller units. If the hierarchy is too detailed, it will have to be dissolved to a large extent for synthesis. Ideally, modules should include a single clock domain, mostly for synthesis efficiency reasons.
a. Multiply-Driven Nets.
The present invention analyzes and tests the design with respect to various issues associated with a tristate net or a bidirectional net. First, if all drivers are simultaneously put in the high impedance state, then the net can "float" at an undefined voltage value, causing the value of currents to be unpredictable for testing. A device must be connected to the net to pull its voltage up or down when it is not driven (on-chip bus holders, external resistors), or the logic that controls enable pins of drivers must be designed in such a way that the net is always driven. If the net is connected to a pull-up or pull-down device (which can be shown by RTL analysis), it cannot float. Otherwise, simulation has to be used to make sure the net is always driven.
Second, there should never be more than one driver active at the same time. Otherwise, the logic value of the net cannot be predicted, and hot spots are created on the chip. This is referred to as "contention". This condition is detected using simulation.
Finally, in test modes, a net should always be driven if it is not connected to a pull-up or pull-down device, and no contention should occur. The real value of RTL analysis is mostly to point at tristate nets. Once aware of the presence of such nets, the way they are handled can be investigated (possibly with help from the designers who wrote the code). Again, simulation is used to detect this problem.
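The floating and contention conditions described above can be expressed as a check on the driver enable values observed at one simulation time step. The following Python sketch assumes such values are available as a list of booleans; the function name `check_tristate_net` and its arguments are hypothetical and do not correspond to any simulator interface.

```python
def check_tristate_net(enables, has_pull_device=False):
    """Classify the state of a tristate net at one simulation time step.

    enables:         one boolean per driver, True if that driver is active
    has_pull_device: True if a pull-up, pull-down, or bus holder is attached
    """
    active = sum(1 for enabled in enables if enabled)
    if active > 1:
        return "contention"   # more than one driver active at once
    if active == 0 and not has_pull_device:
        return "floating"     # nothing drives the net and nothing holds it
    return "ok"
```

Running such a check on every time step of a simulation, in functional and test modes, corresponds to the simulation-based verification that the text calls for.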
Multiply-driven nets can also be created "by accident", in particular when working with resolved signals in VHDL ("std_logic" and "std_logic_vector" signal types). If a signal is assigned by several process statements or concurrent assignment statements, then the signal has several drivers. This will not be reported by the simulator, and will only be reported as a warning by the synthesis tool. If the chip was fabricated, the logic value of the signal would be unpredictable, and hot spots would be created. Multiply-driven nets created "by accident" can be distinguished from tristate and bidirectional nets by looking for drivers that are not tristate pins or bidirectional pins.
b. Clock Nets.
In accordance with the present invention, all clock sources and nets are identified. It is generally critical to identify all clock sources, compute the fanout of each clock net, and identify pieces of the design where clock nets are connected to input pins other than clock input pins on flipflops, latches or synchronous RAMs.
For logic synthesis, all clocks must be properly defined, including source pin, period, and waveform. Ideally, all clock sources should be located in specific modules, referred to as "clock generators", that are synthesized apart from the rest of the logic. Clock constraints are then easily defined and the synthesizer is easily directed not to insert buffers on clock nets that will need Balanced Clock Trees (BCTs). Modules that create and use internal clocks require a more complex synthesis approach, in particular if timing constraints associated with those clocks are tight.
Chip manufacturers often use BCTs to distribute clock signals to flipflops and synchronous RAMs with a low and predictable skew. BCTs are implemented during layout. This is a complex operation, that involves significant engineering effort and requires appropriate planning. The total number of BCTs that can be implemented on a chip is limited. Not all clock nets require a BCT for distribution. In particular, clock nets with low fanouts (typically less than 100) can be buffered through synthesis, and laid out as any other net. Some attention needs to be paid to cell placement though, to make sure that the skew is under control.
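The fanout criterion mentioned above can be sketched as a simple threshold test. In this Python sketch the function name `clock_distribution` is hypothetical, and the default threshold of 100 is taken from the "typically less than 100" figure in the text; an actual threshold would depend on the technology and on the chip manufacturer's guidelines.

```python
def clock_distribution(fanout, threshold=100):
    """Suggest how a clock net might be distributed, based on fanout only.

    Nets below the threshold can be buffered by synthesis; the others
    are candidates for a Balanced Clock Tree (BCT).
    """
    if fanout < threshold:
        return "synthesis buffering"
    return "BCT candidate"
```

Fanout is only one input to the decision; as noted above, cell placement still needs attention for clock nets buffered through synthesis.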
Clock gating and clock dividing are techniques that are used more and more to save power, which has become a limiting factor in some applications. Clock gating consists in switching the clock off in some portions of the design when they are not active. Clock dividing consists in creating lower frequency clocks from a master clock and distributing them to some sections of the design that operate at lower speed. When used intensively in a design, those power saving techniques require specific clock distribution schemes to make sure that the clock skew is under control. For example, clock gating can be implemented through generating enable signals that control gates directly included in the BCT (Gated BCT). Such clock distribution schemes need to be discussed and anticipated when the RTL code is being developed. The chip manufacturer provides guidelines to model clock signals in RTL, in particular for gated clocks and multiplexed clocks. If those guidelines are not followed, then BCT insertion tools can fail, requiring significant RTL code re-work.
c. Clock Domains and Clock Domain Interfaces.
As described above, a clock domain is the set of all flipflops and synchronous RAMs that are clocked on the same edge of the same clock net. Each of the clock domains is identified and analyzed. Clock domains that exchange data need to be interfaced in reliable and predictable ways, depending on relationships between clocks.
When the relationship between clocks is unknown and can vary over time, clock domains are referred to as "asynchronous clock domains". Asynchronous clock domains should be interfaced through a double level of flipflops to reduce the probability of meta-stability. Asynchronous clock domains are illustrated by FIG. 3A. Referring to FIG. 3A, two clock domains 180 and 181 are illustrated. The first clock domain 180 comprises a flipflop 183 and combinational logic circuits 182. The second clock domain 181 comprises two flipflops 185 and 186 and combinational logic circuits 187. Other circuits 188 may be in between the clock domains 180 and 181. The user has to indicate to the system that the two clocks are asynchronous (have no known relationship). There is no way for the system to determine this fact.
When the relationship between clocks is known and stable over time, clock domains are referred to as "related clock domains". If active edges are coincident, it is preferable to have combinational logic between related clock domains to minimize the risk of hold time violations. Related clock domains with coincident active edges are illustrated by FIG. 3B.
Referring to FIG. 3B, two clock domains 190 and 191 are illustrated. The clock domains 190 and 191 have combinational logic 196 in between the domains to minimize the risk of hold time violations. The clock domain 190 includes flipflop 193 and other circuits 192, and clock domain 191 includes a flipflop 194 and other circuits 195.
If active edges are sequenced, it is preferable not to have any combinational logic that would contribute with clock skew to make timing constraints more difficult to meet. Related clock domains with sequenced active edges are illustrated by FIG. 3C. Referring to FIG. 3C, two clock domains 200 and 201 are illustrated. The first clock domain 200 comprises a flipflop 203 and combinational logic circuits 202. The second clock domain 201 comprises flipflop 204 and combinational logic circuits 205. Note that the clock domains 200 and 201 are connected directly, without having any combinational logic in between them.
There are several motivations for identifying clock domains and clock domain interfaces in the RTL code early in the IC design process. From the chip architecture perspective, clock domains reflect fundamental partitioning decisions. Unclear and interlaced clock domains often indicate partitioning decisions that are inadequate for an integrated circuit implementation. Such problems should obviously be discovered when the RTL code is still in early stages and where the problems can be efficiently corrected.
From the design success perspective, clock domain interfaces must be checked carefully to make sure that no timing hazard can affect the design functionality after layout and fabrication. RTL analysis extracts clock domains and clock domain interface logic. Then, the user has to analyze this data based on clock relationships. A typical example of a potential timing hazard is illustrated by the exchange of a data bus between two asynchronous clock domains of FIG. 4. Referring to FIG. 4, two clock domains 210 and 212 are illustrated. The first clock domain 210 comprises a set of flipflops 214a to 214z which provides outputs A0 to A31 from clock domain 210. Some outputs from clock domain 210 travel through circuits having a first wire delay 220 while other outputs travel through circuits having a second wire delay 222. The data from clock domain 210 having traveled through circuits having wire delays 220 and 222 serve as inputs to clock domain 212. Clock domain 212 has flipflops 216a to 216z which drive other flipflops or circuits such as flipflops 218a to 218z. In this example, the wire delay time 220 is different from the wire delay time 222. Because of different delays through wires 220 and 222, there is no guarantee that the same event is captured on each bit of the bus, causing the bus value to be erroneous. Chips can fail after fabrication because of this issue. Possible solutions include using Gray encoding to make sure that only one bit of the bus changes at a time. This can be accomplished during the RTL code stage.
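The property that makes Gray encoding useful here can be illustrated with a few lines of Python. This is a generic sketch of the standard reflected binary Gray code, not code taken from the design flow; the function names are chosen for the example. The single-bit-change property holds between consecutive values, so it applies to buses that carry counters or pointers that step by one.

```python
def binary_to_gray(n):
    """Convert a non-negative integer to its reflected Gray code."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Convert a reflected Gray code back to the plain binary integer."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# Going from 7 to 8, plain binary toggles four bits, Gray code only one.
binary_toggles = bits_changed(7, 8)
gray_toggles = bits_changed(binary_to_gray(7), binary_to_gray(8))
```

With only one bit changing per step, unequal wire delays can at worst cause the receiving domain to capture either the old value or the new value, never an unrelated intermediate value.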
From the RTL simulation perspective, delta-time hazards can be encountered at clock domain interfaces, in particular for related clock domains with coincident edges. The problem is encountered when running RTL simulation. When the problem exists, the design does not simulate as expected. Typically, data is transferred from one clock domain to another in one clock cycle instead of two clock cycles. Knowing the clock domains and clock domain interfaces, an RTL designer should be able to quickly analyze and fix the problem. Without such knowledge, this is much more difficult. Such issues are usually solved through adding delays to RTL signal assignment statements to mimic the hardware behavior.
From the logic synthesis perspective, each module that contains logic (leaf module) should involve one clock domain only, in particular if the design is timing critical. The ability of logic synthesis tools to meet timing constraints is reduced when several clock domains are simultaneously involved, and run times may increase significantly. When a module is synthesized, it is easy to see, based on clock domain information, whether the module includes one or several clock domains. If several, it can be difficult for the synthesis tool, especially if timing constraints are tight.
From the gate-level simulation perspective, the knowledge of clock domains and clock domain interfaces is needed to ensure that test benches are appropriate and do not cause any setup and hold violations. RTL test benches are generally re-used for gate-level validation. So it should be checked when the RTL code is being developed that test benches will be re-usable at the gate-level.
From the static timing analysis perspective, creating tool setup files is a significant engineering effort that mostly consists in defining clock domains, relationship between clocks, and interaction between clock domains.
Finally, from the scan testing perspective, the knowledge of clock domains is required for defining scan chains. Depending on the number of clock domains and the number of flipflops in each clock domain, a different scan chain can be allocated to each clock domain, or the same scan chain can link together flipflops from different clock domains. In this last case, the RTL code has to make provision for clock multiplexing logic.
d. Asynchronous Reset Nets and Synchronous Reset Nets.
RTL descriptions usually do not deal with the uninitialized simulation state (`U` state) and the unknown simulation state (`X` state). This would be difficult to handle and would reduce the clarity and the level of abstraction of RTL descriptions.
States `U` and `X` play a key role for gate-level verification, however. Gate-level simulation starts with all signals in the `U` state. Reset signals are then activated and, through setting flipflops and latches to known values (`0` and `1` values), should set the design to a known configuration from where it can evolve in a predictable manner. If all flipflops and latches are initialized through asynchronous reset and/or synchronous reset, no `U` states can propagate and generate `X` states. If this is not the case, it must be carefully checked that the gate-level simulation will initialize properly and will be able to proceed. For this reason, it is important to identify all flipflops and latches that are not resettable. Flipflops that are not resettable can easily be identified because they are not controlled by a reset signal--asynchronous or synchronous.
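The identification of non-resettable memory elements described above amounts to listing those that no reset net controls. A minimal Python sketch follows, assuming the reset nets controlling each element have already been extracted into a dictionary; the names `non_resettable`, `elements`, and the element and net names are hypothetical.

```python
def non_resettable(elements):
    """Return the sorted names of memory elements with no reset net.

    `elements` is a hypothetical mapping from element name to the set of
    asynchronous or synchronous reset nets that control that element.
    """
    return sorted(name for name, resets in elements.items() if not resets)

# Hypothetical extracted data: "ff_b" is not controlled by any reset net.
elements = {
    "ff_a": {"RESET_N"},
    "ff_b": set(),
    "latch_c": {"SYNC_RST"},
}
to_review = non_resettable(elements)
```

The resulting list points at the elements whose initialization in gate-level simulation must be checked carefully.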
Test issues are also associated with reset signals. In test modes, all reset signals must be under control to prevent the design or some portions of the design from resetting unexpectedly. For example, when using scan testing, all asynchronous reset signals must be disabled in scan mode to make sure that no elements of scan chains can reset. It may be needed to add specific logic to control reset signals in test modes, like multiplexers and gates controlled through "test enable" signals.
Reset nets often have very high fanout values. For example, a single asynchronous reset net may be connected to all flipflops and latches of a design. Properly buffering and distributing such nets may raise difficult issues, and in some cases BCTs are used. This makes another motivation for identifying all reset nets in a design as soon as possible.
e. RAM Write Enable Nets.
RAM write enable nets need to be identified as soon as possible. Once identified, the RAM write enable nets can be checked to ensure that the RAM write enable logic has been implemented according to supported schemes.
"Pulse generator" logic is required to write data to a RAM in a single clock cycle and as soon as input data and addresses are available. However, "pulse generator" logic creates a pulse on the RAM write enable pin that is contained within the clock period and consistent with setup and hold times. This implementation of pulse generators requires delaying signals. This is generally not allowed, in particular in an ASIC context, because of uncertainty on delay values due to process variations and layout effects (wire delays).
Therefore, it is generally preferable that specific cells, referred to as "delay cells", are used to delay signals, and that RAM write enable logic follows given schemes that have proved to be fully reliable. An example of such a scheme is illustrated by FIG. 5. Referring to FIG. 5, random access memory chip (RAM) 220 is illustrated. Flipflop 222 provides the input data signal to the RAM 220 and flipflop 224 provides the address for the input data. The pulse generator signal is provided by flipflop 226. The signal from the pulse generation circuit 226 creates a pulse on the RAM write enable pin 227. This implementation of pulse generator requires delay signals. The delay signal of the clock 228 is provided by delay cell 229 which, in combination with an AND gate 230, provides synchronized write enable signal on the line 227. For reading operations the output from the RAM 220 is provided to flipflop register 232.
For that reason, RAM write enable nets need to be identified as soon as possible in the design process. The design can then be checked to make certain that RAM write enable logic has been implemented according to supported schemes.
f. Hierarchy Purity of Module and Pins Driving Each Module Output.
The hierarchy purity of modules is key information for logic synthesis, together with the type of pins that drive each module output. For each module output, the RTL object that can assign a value to the output has to be located. It can be an input or an input/output port on the top-level or an assignment statement. If it is an assignment statement that is inside a clock-synchronous section, then the output is registered. Here, a netlist or a similar representation is quite efficient.
Current logic synthesis tools are too limited to synthesize an entire design at once with realistic run times and memory requirements. Depending on available processing resources, the logic synthesis tool and the nature of the logic to be synthesized, modules of sizes that range from 5,000 gates to 50,000 gates can be synthesized. Building larger designs requires that modules of reasonable sizes are first synthesized independently, then put together. When related combinational logic is split between modules, the logic synthesis tool does not work on entire flipflop-to-flipflop timing paths, but on artificially broken segments. Referring to FIG. 6, if the logic synthesis tool cannot work on the entire flipflop-to-flipflop timing path illustrated by module C 236, then module C is synthesized in portions. The first portion, module A 237, is synthesized first. Then, module B 238 is synthesized.
Two types of issues are associated with such a situation. First, exploratory synthesis on individual modules may not reveal all paths that violate clock constraints. For example, on FIG. 6, paths from the output of flipflop 239 to the input of flipflop 240 may violate clock constraints because of the combination of circuits 241 and 242. The violating path from flipflop 239 to flipflop 240 cannot be detected if module A 237 and module B 238 are synthesized separately. However, when module A 237 and module B 238 are synthesized together as module C 236, then the violating path from the flipflop 239 to flipflop 240 can be detected. Some violating paths can still be found when putting modules together. For an RTL design team, it means that some violations will not be caught when creating modules, but only when integrating modules together. These concepts are illustrated by FIG. 6.
Second, allocating "time budgets" to modules may be required to properly constrain synthesis. So far, time budgets had to be set manually, requiring a lot of engineering effort and potentially impacting the turn-around-time of synthesis. FIG. 7 illustrates one possible time budgeting scheme. Referring to FIG. 7, given a predetermined amount of time for a signal to travel from flipflop 253 of module A 250 via module B 251 to flipflop 237 of module C 252, a portion of that predetermined time is allocated to each of the modules 250, 251 and 252. Because of the fact that the signals must travel through combinational logic circuits 254, 255 and 256, the available time, in the example of FIG. 7, has been allocated as follows: 30% to module A 250, 50% to module B 251, and 20% to module C 252.
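The arithmetic of the FIG. 7 budgeting scheme can be sketched in Python. The 30/50/20 split comes from the example above; the 10 ns path time, the function name `allocate_time_budgets`, and the module labels are assumptions made purely for illustration.

```python
def allocate_time_budgets(path_time_ns, shares_percent):
    """Split the time available on a path among the modules it crosses.

    shares_percent maps a module name to its share in percent; the
    shares must add up to 100.
    """
    if sum(shares_percent.values()) != 100:
        raise ValueError("time budget shares must add up to 100 percent")
    return {module: path_time_ns * pct / 100
            for module, pct in shares_percent.items()}

# Assumed 10 ns available between the launching and capturing flipflops.
budgets = allocate_time_budgets(10.0, {"A": 30, "B": 50, "C": 20})
```

With these assumed numbers, module A is allowed 3 ns, module B 5 ns, and module C 2 ns; each value would then be used to constrain the synthesis of the corresponding module.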
Artificially breaking flipflop-to-flipflop timing paths should be avoided as much as possible. This is achieved when two guidelines are followed. First, the design hierarchy should be pure down to leaf modules. In other words, modules should either contain sub-modules only, or logic only. If allowed, modules that mix logic with sub-modules can create complex paths spanning several other modules that are compiled independently, as illustrated by FIG. 8, and can also increase synthesis run times and reduce optimization opportunities. Referring to FIG. 8, module C 260, as illustrated, is an impure hierarchical module that mixes logic with sub-modules. Module C 260 contains module A 261 and module B 262 as well as combinational logic 263. Modules A 261 and B 262 are leaf modules because each of them contains only logic circuits. Module A 261 contains combinational logic circuits 264 and flipflop 265, and module B 262 contains combinational logic circuits 266 and flipflop 267.
Second, modules that are compiled independently must have all their outputs "registered", or driven, by flipflops. In this case, the delay consumed in output flipflops can be neglected, and no time budgets are needed. The diagram of FIG. 9 illustrates this concept. Referring to FIG. 9, modules 270 and 271 are illustrated. Module 270 having combinational logic circuits 272 has a flipflop 273 which drives the output of the module 270. Therefore, module 270 has registered outputs. Delay of the flipflop 273 can be ignored in this case. Module 271 also has combinational logic circuits 274 and flipflop 275.
For functional reasons, it is not always possible to register all module outputs. However, timing paths that span over several modules synthesized independently often require manual investigation and time budgets. So it is critical to evaluate how many paths of that type are included in a design to assess the synthesis turn-around-time. The ratio "number of registered outputs/number of combinational outputs" of each module, that is easy to compute, is a valuable criterion for that purpose.
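The ratio mentioned above can be computed directly once each module output has been classified. In the following Python sketch, the function name `registered_ratio` and the dictionary that flags each output as registered or combinational are hypothetical; returning infinity when a module has no combinational output is a choice made for the sketch.

```python
def registered_ratio(outputs):
    """Return (registered, combinational, ratio) for one module.

    `outputs` maps each output name to True if the output is driven by
    a flipflop (registered) and False otherwise (combinational).
    """
    registered = sum(1 for is_reg in outputs.values() if is_reg)
    combinational = len(outputs) - registered
    # A module with no combinational output needs no time budget at all.
    ratio = registered / combinational if combinational else float("inf")
    return registered, combinational, ratio

# Hypothetical module with three registered outputs and one combinational.
summary = registered_ratio({"q0": True, "q1": True, "q2": True, "y": False})
```

A low ratio flags a module whose outputs are likely to require manual investigation and time budgets during synthesis.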
g. Logic Surrounding Memories
The logic that surrounds memories, particularly RAMs, is extracted by the present invention. Currently, VEGA extracts only the WE nets. Then, given this pointer, the user has to investigate in the code. Using a TCL interface it is possible to extract the whole WE generation logic. When scan testing is used for a design, possibly in conjunction with "RAM BIST" (Built-In Self Test) for testing RAMs, all RAMs should be surrounded by flipflops. Those flipflops can then be chained together to isolate the RAM in scan mode as illustrated by FIG. 10A. Referring to FIG. 10A, the RAM 280 and the surrounding flipflops 281 and 282 are grouped in the scan chain 283 for scan testing. In this case, because combinational logic circuits 284 and 285 are outside the scan chain 283, the RAM 280 and the flipflops 281 and 282 can be chained together and isolated.
If there is some combinational logic in the RAM's shadow, or scan chain, then specific logic has to be added to isolate the RAM in test mode as illustrated by FIG. 10B. Referring to FIG. 10B, combinational logic circuits 288 and 289 are in between the RAM 280 and flipflops 286 and 287. In such a case, the scan chain 292 includes the combinational logic circuits 288 and 289. Then, specific logic circuits 290 and 291 must be added to surround the RAM 280 and separate it from combinational logic circuits 288 and 289 which are in the RAM's shadow. This may reduce the efficiency of ATPG tools (Automatic Test Pattern Generation), and require manual writing of some test vectors. The logic that needs to be added also impacts timing, which can be an issue if critical paths go through the RAM.
h. Data Busses
Data bus design information is extracted from the RTL code. Two critical issues are associated with the routing of data busses of significant width, especially if they connect several modules together at the chip top-level. First, the dispersion of wire delay values for the different bits of a bus should be minimized, in such a way that the bus value gets valid on connected terminal pins within a minimum time interval. Also, congestion can occur in some routing areas. The RTL analysis brings busses to the attention of users who are not familiar with the design, like the application engineer in charge of floorplanning. This is key information, that can result in a lot of issues if missed.
Ideally, the routing of busses should be defined when creating the chip floor plan, that consists in placing the input/output cells (bounding diagram), and approximately defining the shapes and positions of modules. Defining bus routing at the floorplan level is referred to as "Bus Planning", and consists in defining routes of nets in terms of used wiring areas (global routing).
With deep sub-micron processes, more focus is being put on early floor planning based on RTL code, which is referred to as "RTL Floorplanning". The design hierarchy is defined in the RTL code, and the sizes of modules that contain logic can be approximated based on the RTL code, for example through a quick logic synthesis pass with simplified constraints.
The extracted bus design information is then passed to Bus Planning and RTL Floorplanning modules in lower level synthesis.
i. High-Fanout Nets and Fanout Statistics.
High fanout nets must be identified because analyzing their buffering and routing requirements is an important element of a successful fabrication. The fanout of each net is computed. A predefined threshold is used to determine whether or not it is a "high-fanout" net. High fanout nets typically include clock nets, reset nets, and scan enable nets.
Clock signals are usually distributed using BCTs, to make sure that the skew is within a given range. Logic synthesis is then directed not to insert any buffer on those nets. BCTs are sometimes used for other high-fanout nets than clock nets, like reset nets and scan enable nets. In general, high-fanout nets are buffered by logic synthesis, that uses "buffering trees" to make the fanout of each cell lower than a given value. FIG. 11 illustrates the concept of using multiple buffers 340a, 340b, . . . 340e, in a tree structure to provide a signal on line 342 to a plurality of elements. The definition of a maximum fanout value depends on routing tools (e.g., 70). Specific constraints, referred to as "design rules", can be used to properly direct logic synthesis.
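The size of a buffering tree for a given maximum fanout can be estimated with a short calculation. The Python sketch below assumes an idealized, fully balanced tree in which every buffer and the source drive at most `max_fanout` pins; actual synthesis tools also account for loads, timing, and placement, so this is an approximation only. The function name `buffer_tree` and the load count of 10,000 are assumptions for illustration; the limit of 70 is the example value from the text.

```python
import math

def buffer_tree(num_loads, max_fanout):
    """Return the number of buffers per level, from the loads to the source.

    An empty list means the source can drive all loads directly.
    """
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    levels = []
    pins_to_drive = num_loads
    while pins_to_drive > max_fanout:
        # Each buffer at this level drives up to max_fanout pins.
        pins_to_drive = math.ceil(pins_to_drive / max_fanout)
        levels.append(pins_to_drive)
    return levels

# Assumed net with 10,000 loads and a maximum fanout of 70.
levels = buffer_tree(10000, 70)
```

With these assumed numbers, 143 buffers drive the loads and 3 buffers drive those 143, so the source itself sees a fanout of only 3.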
Statistics can be computed from the fanout value of each net extracted from the RTL code, including: the average number of pins per net; and tables showing the percentage of nets for each fanout value. For a layout engineer, those figures are helpful to develop an early understanding of how "routing friendly" the design is going to be.
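The statistics listed above can be sketched in Python from a list of per-net fanout values. The function name `fanout_statistics` and the sample values are hypothetical. The sketch reports the average fanout; for singly driven nets, adding one (the driver) to each value gives the number of pins per net.

```python
from collections import Counter

def fanout_statistics(fanouts):
    """Return (average fanout, {fanout value: percentage of nets}).

    `fanouts` is a list with one fanout value per net.
    """
    if not fanouts:
        raise ValueError("at least one net is required")
    average = sum(fanouts) / len(fanouts)
    counts = Counter(fanouts)
    percentages = {value: 100.0 * count / len(fanouts)
                   for value, count in sorted(counts.items())}
    return average, percentages

# Hypothetical design with four nets.
average, table = fanout_statistics([1, 1, 2, 4])
```

In this small example half of the nets have a fanout of one, which is the kind of distribution a layout engineer would read as favorable for routing.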
3. Verifying Design Rules--RTL Design Rule Checking (DRC)
RTL design rule checking consists in checking basic design rules, including:
Combinational loops;
Unconnected pins;
Pins permanently tied to logic zero or one;
Nets that connect several pins on the same module;
Nets that are not driven;
Nets that drive nothing; and
Input/output pins directly connected together (including "feedthroughs", that directly connect an input pin to an output pin).
It must be noted that some of the rules listed above are showstoppers, such as combinational loops, while some others should be considered warnings, such as unconnected pins. In accordance with the present invention, these design rules are checked with respect to the information extracted from the RTL code.
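As an illustration of the first rule in the list, a combinational loop can be found by searching for a cycle in a graph whose edges follow combinational paths only. The Python sketch below uses a standard depth-first search; the graph format, the function name `find_combinational_loop`, and the node names are assumptions made for the example, and the recursive form is suitable only for small graphs.

```python
def find_combinational_loop(graph):
    """Return one combinational loop as a list of nodes, or None.

    `graph` is a hypothetical mapping from a node to the nodes it drives
    through combinational logic only; flipflops break the paths and
    therefore contribute no edges.
    """
    in_progress, done = set(), set()

    def visit(node, path):
        in_progress.add(node)
        path.append(node)
        for successor in graph.get(node, []):
            if successor in in_progress:
                # Reached a node on the current path: a loop is closed.
                return path[path.index(successor):] + [successor]
            if successor not in done:
                found = visit(successor, path)
                if found:
                    return found
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for start in list(graph):
        if start not in done:
            found = visit(start, [])
            if found:
                return found
    return None

loop = find_combinational_loop({"a": ["b"], "b": ["c"], "c": ["a"]})
```

The returned list names the nodes on the loop, which gives the user a direct pointer to the statements that need to be investigated.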
4. Present the Design Information in Efficient Ways
The information should be presented in such a way that users immediately get "pointers" to pieces of code that involve issues and that design situations can quickly be evaluated. For example, the following information can be highlighted in order to direct the users towards pieces of code that need to be investigated:
Connections of a multiply-driven net to output pins that are not output pins of tristate buffers. This usually indicates multiply-driven nets created through accidentally connecting signals together.
Connections of a clock net to input pins that are not clock pins on flipflops, or enable pins on latches, or clock pins on synchronous RAMs. This usually indicates portions of a design where "clock manipulation" takes place, like clock gating, clock multiplexing, or clock dividing.
Connections of an asynchronous reset net to input pins that are not clear or preset pins on flipflops or latches. This usually indicates portions of a design where "reset signal manipulation" takes place, like gating used to disable reset in scan mode.
Connections of a write enable net to input pins that are not write enable pins on a RAM. This usually indicates a non-supported or non-recommended scheme for creating a write enable pulse.
Clock nets that are active on both edges. This can indicate unclear and unseparated clock domains, or may draw attention to other characteristics like the duty cycle of the clock.
Clock nets that are connected to both flipflops and latches. This usually indicates tricky clocking schemes, that need to be further understood.
Whenever possible, the information should be presented in a synthetic way. For example, the design hierarchy can be reported with synthesis-related key information attached to module names, like their hierarchy purity, and the numbers of registered and combinational outputs.
Summaries of extracted design information should also be created. In particular, for each extracted clock net, the following summary should be provided:
Clock source;
Fanout;
Active level;
List of connections to input pins other than clock pins on flipflops, enable pins on latches, or clock pins on synchronous RAMs;
List of connections to synchronous RAMs; and
List of connections to enable pins on latches.
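The per-clock summary described above could be assembled along the following lines. This Python sketch is illustrative: the function name, the `(cell, kind, role)` tuples, and the labels such as "flipflop" or "sync_ram" are hypothetical and chosen only for this example.

```python
def summarize_clock(net_name, source, active_level, loads):
    """Build a summary record for one extracted clock net.

    `loads` is a list of (cell, cell_kind, pin_role) tuples describing the
    input pins the clock net is connected to (labels are hypothetical).
    """
    # Connections that a clock net is normally expected to have.
    expected = {("flipflop", "clock"), ("latch", "enable"), ("sync_ram", "clock")}
    return {
        "net": net_name,
        "source": source,
        "fanout": len(loads),
        "active_level": active_level,
        "other_pins": [c for c, kind, role in loads if (kind, role) not in expected],
        "sync_rams": [c for c, kind, role in loads if kind == "sync_ram"],
        "latch_enables": [c for c, kind, role in loads
                          if (kind, role) == ("latch", "enable")],
    }


loads = [
    ("FF1", "flipflop", "clock"),
    ("L1", "latch", "enable"),
    ("RAM0", "sync_ram", "clock"),
    ("G7", "and_gate", "input"),   # clock feeding a gate: worth highlighting
]
summary = summarize_clock("CLK", "port CLK", "rising", loads)
```

The `other_pins` entry is the one that directs users toward clock manipulation such as gating or multiplexing.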
Statistics on extracted design information should also be provided whenever they can help users develop "global pictures" of a design. For example, the following design hierarchy statistics should be provided:
Number of modules in hierarchy;
Number of different types of modules in hierarchy;
Number of hierarchy levels;
Number of modules that mix hierarchy and logic; and
Average complexity of modules that include logic.
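The hierarchy statistics listed above can be computed with a single traversal of the design tree. The Python sketch below is an assumption-laden illustration: each module instance is a dictionary with hypothetical keys "type", "cells" and "children", and "complexity" is taken to mean the number of logic cells placed directly in a module, which the text does not define.

```python
def hierarchy_stats(top):
    """Compute hierarchy statistics for a tree of module instances."""
    nodes, depth = [], 0
    stack = [(top, 1)]                 # explicit stack of (node, level)
    while stack:
        node, level = stack.pop()
        nodes.append(node)
        depth = max(depth, level)
        stack.extend((child, level + 1) for child in node["children"])
    with_logic = [n for n in nodes if n["cells"] > 0]
    average = (sum(n["cells"] for n in with_logic) / len(with_logic)
               if with_logic else 0.0)
    return {
        "modules": len(nodes),
        "module_types": len({n["type"] for n in nodes}),
        "levels": depth,
        "mixed": sum(1 for n in with_logic if n["children"]),
        "avg_complexity": average,
    }


def leaf(module_type, cells):
    """Helper that builds a module instance without sub-modules."""
    return {"type": module_type, "cells": cells, "children": []}


top = {"type": "TOP", "cells": 2, "children": [
    {"type": "CORE", "cells": 0, "children": [leaf("ALU", 40), leaf("ALU", 40)]},
    leaf("IO", 10),
]}
stats = hierarchy_stats(top)
```

In this example, TOP is the only module that mixes hierarchy and logic, because it holds two cells of its own in addition to its sub-modules.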
B. EXTRACTING GENERIC NETLIST FROM RTL
After the RTL code has been finalized, the code is read in by logic synthesis tools such as the Synopsys Design Compiler ("Synopsys" or "Design Compiler"). Then, the logic synthesis tool synthesizes the design, as reflected by the RTL code, into a gate-level description of the design. Although the preferred embodiment described herein is discussed in terms of the Synopsys Design Compiler, the present invention is applicable to any synthesis system.
Logic synthesis tools such as Synopsys Design Compiler provide two commands that are employed successively to read in RTL code. The analyze command causes the compiler to parse the RTL code and store the parsed information as binary files in libraries. The elaborate command causes the compiler to build a generic netlist from the binary files created by the command analyze.
When the elaborate operation is completed, the generic netlist is available in the Synopsys Design Compiler's work space and can be accessed. For example, the generic netlist can be written out as a Verilog-HDL model, using the "write-format verilog" command. It can also be written out as a VHDL model, using the "write-format vhdl" command. Design objects that build the generic netlist can be accessed through "dc_shell", which is the Design Compiler's command shell, or environment.
Once extracted, a generic netlist can be parsed and analyzed in the same manner as discussed above with respect to RTL code. The information derived from the generic netlist parsing and analysis can also be utilized in the same manner as the information derived from parsing and analyzing RTL code.
The "write-format verilog" command is fast, typically taking about 10 minutes on a Sun/Ultra-II workstation for a 350 kGates design. The obtained Verilog-HDL models are generally easy to parse. However, directions of cell pins (input, output, input/output) do not appear in the Verilog-HDL models. In addition, some cells of the generic netlist are represented as concurrent signal assignments with combinational expressions on their right hand side, similar to the following example.
As a result, the names of the cells of the generic netlist that correspond to those signal assignments cannot be found in the Verilog-HDL models. In addition, a specific, expensive Synopsys license, referred to as "Verilog writer", is required to use the "write-format verilog" command.
As mentioned above, the generic netlist can also be written out as a VHDL model, using the "write-format vhdl" command. The "write-format vhdl" command is also fast, typically taking about 15 minutes on a Sun/Ultra-II workstation for a 350 kGates design. However, cells are modeled using VHDL generics to parameterize numbers and names of inputs and outputs, and a package is included to handle signal types. As a result, the VHDL models are much more difficult to parse than the Verilog-HDL models.
The Synopsys Design Compiler infers "DesignWare cells" from arithmetic operators used in the RTL, including `+`, `-`, `*`, `<`, etc. Those cells do not appear in the VHDL models, but are modeled through process statements, as in the following example:
add_369_plus : process (UCONV_TIMEOUT_LOP_3_port, UCONV_TIMEOUT_LOP_2_port,
                        UCONV_TIMEOUT_LOP_1_port, UCONV_TIMEOUT_LOP_0_port,
                        i_2_port, i_0_port)
  variable A : SIGNED( 3 downto 0 );
  variable B : SIGNED( 3 downto 0 );
  variable Z : SIGNED( 3 downto 0 );
begin
  A := (UCONV_TIMEOUT_LOP_3_port, UCONV_TIMEOUT_LOP_2_port,
        UCONV_TIMEOUT_LOP_1_port, UCONV_TIMEOUT_LOP_0_port );
  B := (i_2_port, i_2_port, i_2_port, i_0_port );
  Z := A + B;
  ( ARG2489_3_port, ARG2489_2_port, ARG2489_1_port, ARG2489_0_port ) <= Z;
end process;
Thus, the names of DesignWare cells that are used in the generic netlist cannot be found in VHDL models. Moreover, a specific Synopsys license, referred to as "VHDL writer", is needed to use the "write-format vhdl" command.
Design Compiler provides a command language named "dc_shell". All objects of the generic netlist created by the elaborate command can be accessed through the following dc_shell commands:
"current_design"--to set the context to a given module;
"find"--to search for design objects like cells, pins and nets;
"all_connected"--to search for the net connected to a given pin/port, or for the pins/ports connected to a given net; and
"get_attribute"--to access design objects attributes, like direction of pins (input, output, input/output).
From the software development perspective, dc_shell is quite limited: procedures are not supported, all variables are global, recursion is not supported (an important issue, because designs consist of trees), and run-time performance is very low. Therefore, dc_shell is not an appropriate language for complex applications like RTL analysis.
Referring to FIG. 12, a preferred process of translating an RTL code into a generic netlist and extracting the generic netlist is illustrated.
The RTL code 350 is read in by the Design Compiler 352 to be processed. The analyze command 354 causes the Design Compiler to parse the RTL code and to create binary files. The elaborate command 356 causes the Design Compiler to build a generic netlist from the binary files. After the elaborate command, the generic netlist 358 is resident within the workspace of the Design Compiler 352, but not outside of the Design Compiler's workspace.
The generic netlist 358 is then read by the dump script 360 and an ASCII "dump file" 362 is created. The script that creates the ASCII file containing the generic netlist representation of the IC design is referred to as the "dump script". The ASCII file 362 itself is referred to as the "dump file" because it is a "dump" of the generic netlist that the Design Compiler produced within its own workspace. The result of executing the dump script is the dump file 362, which exists outside the Design Compiler's workspace and is available for analysis outside the Design Compiler 352.
A script is a set of commands understood by the environment in which the commands operate. In this case, a script is a set of commands which is understood and can be executed by a design compiler. In the preferred embodiment, a script is a set of dc_shell commands, and each of the elements of the set is a command for the dc_shell. The script is submitted or presented to the command environment, dc_shell, to be executed.
The following is the outline of the dump script 360 implemented for the Synopsys Design Compiler. It writes out the generic netlist available in the Design Compiler's workspace for the current design, set through using the "current_design" command. A variable named "dump_file" contains the name of the target ASCII file.
Find all input ports of current design
For each input port
  Search for net connected to input port
  Write out port name, port direction and connected net
end for
Find all output ports of current design
For each output port
  Search for net connected to port
  Write out port name, port direction and connected net
end for
Find all cells of current design
For each cell
  Search for is_hierarchical attribute of cell
  Search for ref_name attribute of cell
  Find all cell pins
  For each cell pin
    Search for direction of pin
    Search for net connected to pin
    Write out pin name, pin direction and connected net
  end for
end for
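To make the loop structure of the outline above concrete, the following Python sketch walks a small in-memory design and writes one record per port, cell and pin. It is not the dc_shell script of Appendix A1: the dictionary layout, the record keywords ("port", "cell", "pin") and the cell name "INV" are assumptions introduced for this illustration, and the actual dump file format may differ.

```python
import io


def dump_design(design, out):
    """Write ports first, then each cell followed by its pins."""
    for direction in ("in", "out"):          # input ports, then output ports
        for port in design["ports"]:
            if port["direction"] == direction:
                out.write(f"port {port['name']} {direction} {port['net']}\n")
    for cell in design["cells"]:
        hier = str(cell["is_hierarchical"]).lower()
        out.write(f"cell {cell['name']} {cell['ref_name']} {hier}\n")
        for pin in cell["pins"]:
            out.write(f"  pin {pin['name']} {pin['direction']} {pin['net']}\n")


design = {
    "ports": [
        {"name": "A", "direction": "in", "net": "A"},
        {"name": "Z", "direction": "out", "net": "n1"},
    ],
    "cells": [
        {"name": "U1", "ref_name": "INV", "is_hierarchical": False,
         "pins": [
             {"name": "A", "direction": "in", "net": "A"},
             {"name": "Z", "direction": "out", "net": "n1"},
         ]},
    ],
}
buffer = io.StringIO()
dump_design(design, buffer)
dump_text = buffer.getvalue()
```

Writing to any file-like object keeps the sketch testable; a real dump would target the file named by the "dump_file" variable.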
An example of the dump script as implemented for the Synopsys Design Compiler is reproduced as Appendix A1 attached hereto. Also, an example of the ASCII dump file produced by the dump script of Appendix A1 is reproduced as Appendix A2 attached hereto.
The dump script technique used to extract the generic netlist from a Design Compiler has several benefits. First, no additional Synopsys license is required. The dc_shell language is always provided with the basic Design Compiler license. Accordingly, both Verilog-HDL users and VHDL users can run the dump script, without acquiring an additional tool license.
Second, all objects of the generic netlist can be accessed. As described above, the "write-format verilog" and "write-format vhdl" commands both represent some cells as behavioral constructs, so that, for instance, cell names and pin names are not available for those cells.
Third, directions of cell pins can be accessed. As described above, the "write-format verilog" command does not provide pin directions.
Fourth, other useful information can be accessed. In particular, the boolean attribute "is_hierarchical" is available to indicate whether a cell is hierarchical. A hierarchical cell is defined as an instance of a module that has sub-modules (hierarchy below). This information cannot be obtained with the "write-format verilog" and "write-format vhdl" commands.
Finally, because the dump file is available outside the design compiler, the design can be fully analyzed using complex applications. Inside the Design Compiler, only the dc_shell commands are available to analyze the design. The dc_shell language is used only to access objects of the generic netlist and create the dump file. Outside the Design Compiler, applications may be built using powerful programming languages, like C or C++, to analyze the design as represented by the dump file.
It must be noted that mapped netlists can be loaded using Design Compiler's "read" command, then written out using the dump script. Netlists in formats supported by Synopsys, including Verilog-HDL, VHDL, Edif and NDL (LSI Logic format), can thus be translated to the dump file format.
All types of input design descriptions that are supported by the synthesis tool can be mixed for describing the same design. For example, it is possible to handle designs that mix Verilog-HDL code and VHDL code, a need that is sometimes encountered. It is also possible to mix RTL modules described in Verilog-HDL and/or VHDL with gate-level modules that use any type of netlist description, like Verilog-HDL, VHDL, Edif, and NDL (LSI Logic format).
Synopsys binary files, referred to as "DB files", can be loaded using Design Compiler's "read" command, then written out using the dump script. Those DB files can contain unmapped designs, or mapped designs, or a mix of unmapped and mapped designs.
In addition to the script reproduced above, two additional scripts have been created. First, the "dump-all script" dumps all the designs that have been loaded in the design compiler's workspace. This script uses the command "find(design, "*")" to obtain a list of all loaded designs. This is the command provided by dc_shell to obtain a list of all the designs that are present in the work space. Each design on the list is then dumped, using a loop.
Second, the "dump-tree script" dumps all the designs within the tree under a given top-level. This script uses the command "find(design-hierarchy, <top-level>)" to obtain a list of all designs that are under design <top-level>. This is the command provided by dc_shell to obtain a list of all the designs that are present in the work space in the design tree under a given module. Each design on the list is then dumped, using a loop.
Both the dump-all script and dump-tree script call the basic dump script, that writes out the generic netlist for the current design.
1. Faster Dump Script
For large designs, writing Design Compiler's database into an ASCII file using the dump script can be CPU intensive. Typically, it will take 4 to 5 hours on a Sun/Ultra-II workstation for a 350 kGates design.
To decrease the time required to create the dump file, two techniques are used to modify the dump script outlined above. The first technique applies when working with lists. When attributes are needed for some design objects, the "get_attribute" dc_shell command can be applied to a list of objects, instead of being applied sequentially to each object. The command then returns a list of attributes that match the list of objects one-to-one, and that list is written out with a single "echo" command. This process is faster than loops that use the "get_attribute" command on a single object at a time. This technique can be applied to:
Direction attributes for a list of ports that belongs to the current design;
Direction attributes for a list of pins that belongs to a given cell;
Hierarchical/not-hierarchical attributes for a list of cells; and
Reference name attributes for a list of cells (names of instantiated modules).
Second, the "all_connected" command, which returns the net connected to a given pin/port or the list of pins/ports connected to a given net, consumes significant amounts of CPU time. The number of calls to that command can be reduced through describing nets explicitly in the dump file as lists of connected pins/ports, instead of searching for nets connected to each pin/port. A "fast dump script" has been designed to take advantage of this fact. These two approaches have been used to improve run times. The following is an outline of another embodiment of the dump script 360 of FIG. 12:
Find all input ports of current design
Search for direction attributes of port list
Write out port list
Write out direction list
Find all cells of current design
Search for is_hierarchical attributes of cell list
Search for ref_name attributes of cell list
Write out cell list
Write out is_hierarchical list
Write out ref_name list
For each cell
  Find all cell pins
  Search for direction attributes of pin list
end for
Find all nets of current design
For each net
  Search for list of connected pins/ports
  Write out net name
  Write out list of connected pins
end for
An example of the fast dump script as implemented for the Synopsys Design Compiler is reproduced as Appendix A3 attached hereto. Also, an example of the ASCII dump file produced by the dump script of Appendix A3 is reproduced as Appendix A4 attached hereto.
A disadvantage of the fast dump script is that it creates dump files that are not fully human readable, because they contain lists of objects and associated attributes that have to be matched one-to-one. Parsing remains trivial, though. An example of a dump file created by the fast dump script is given in Appendix A4.
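The one-to-one matching mentioned above is straightforward for a parser. The Python sketch below pairs a list of object names with a list of attributes; it assumes, purely for illustration, that each list is stored as one line of whitespace-separated items, which is not necessarily the layout used in Appendix A4.

```python
def pair_up(names_line, attributes_line):
    """Match object names with their attributes, position by position."""
    names = names_line.split()
    attributes = attributes_line.split()
    if len(names) != len(attributes):
        # A length mismatch means the two lists cannot be matched reliably.
        raise ValueError("object and attribute lists differ in length")
    return dict(zip(names, attributes))


directions = pair_up("CLK RESET_N D Q", "in in in out")
```

The explicit length check is a design choice: a silent truncation by `zip` would hide a corrupted dump file.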
2. Using TCL (the Tool Command Language)
Yet another technique to decrease the time required to create the dump file is to use TCL (the Tool Command Language), which is available in the public domain. The basic idea is to store object names in associative arrays that are provided by TCL, in order to avoid calling dc_shell to query the same information several times. When information is needed, the associative arrays are searched for the required information. If it is not available, dc_shell is called to obtain it. The information is then saved in associative arrays for further re-use, with fast access times. This principle can be applied to the "is_hierarchical" and "ref_name" cell attributes, and to names and directions of cell pins. As the same types of cells are instantiated a large number of times, a significant amount of time can be saved.
The complete listing of a dump script based on TCL, referred to as the "TCL dump script", is given in Appendix A5. The dump files obtained are in the same format as files created by the basic dump script, and are fully human readable (see Appendix A2 for an example).
The TCL dump script has minor disadvantages:
TCL and a number of extensions have to be installed. All this software is public domain though, and is free of charge.
When a cell is encountered for the first time, its list of pins is stored in associative arrays. For all further occurrences of the same cell, the list of pins is read back from associative arrays. Therefore, although there are no obvious reasons for this to happen, cells with identical names and different pin sets cannot be detected, and can lead to inconsistencies in the dump file.
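The caching principle described above can be sketched in Python, with a dictionary playing the role of an associative array. This is an illustration and not the TCL listing of Appendix A5: the `PinCache` class, the `query` callback standing in for a slow call to the synthesis tool, and the small `library` table are all hypothetical.

```python
class PinCache:
    """Cache pin lists per cell type, querying the tool only on a miss."""

    def __init__(self, query):
        self._query = query    # stand-in for a slow synthesis-tool query
        self._pins = {}        # plays the role of an associative array
        self.queries = 0       # number of times the slow query was used

    def pins_of(self, ref_name):
        if ref_name not in self._pins:
            self.queries += 1
            self._pins[ref_name] = self._query(ref_name)
        return self._pins[ref_name]


library = {"AND2": ["A", "B", "Z"], "DFF": ["D", "CP", "Q"]}
cache = PinCache(lambda ref_name: library[ref_name])
instances = ["AND2", "DFF", "AND2", "AND2", "DFF"]
pin_lists = [cache.pins_of(ref_name) for ref_name in instances]
```

Five instances cost only two queries here. The sketch also exhibits the drawback noted above: because the cache is keyed by cell name alone, a second cell with the same name but a different pin set would silently receive the first cell's pin list.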
Results obtained with the various dump scripts for a 350 kGates design are summarized in Table 1 below. CPU times are for a Sun/Ultra-II workstation.
TABLE 1
Run times of various dump scripts

Type of script      CPU time (Hours:Minutes)   Human readable dump file
Dump script         5:45                       Yes
Fast dump script    2:19                       No
TCL dump script     0:50                       Yes
C. VEGA SYSTEM OVERVIEW
Once the dump file, representing the IC design, is created, the information contained in the dump file can be parsed and analyzed, and various useful functions can be performed. Disclosed herein is a system to read and parse the dump file and produce analysis reports and scripts for an efficient synthesis of the IC design. The system extracts design information and creates RTL analysis reports and logic synthesis scripts. Although the present invention is disclosed in terms of the Synopsys Design Compiler, the techniques disclosed herein are applicable to any design compiler, such as AMBIT.
FIG. 13 illustrates an overview of the VEGA system. A preferred embodiment of the system is implemented based on Synopsys, and has the following properties:
The input dump file 362 may be of any format sufficiently describing the IC design. In the preferred embodiment, the dump file 362 is the ASCII dump file of the generic netlist generated by the dump script from Synopsys as described above; and the synthesis scripts the tool generates are Design Compiler scripts ("dc_shell" scripts).
As described above, it is preferred that RTL analysis uses the generic netlist created by the target logic synthesis tool as its input description, rather than the RTL code itself; however, the use of the generic netlist is a mere preference in the current implementation. In fact, any input format is sufficient if the input format sufficiently describes the underlying IC design. The generic netlist created by "translation" represents the "synthesis view" of the RTL code, and reflects interpretations of the RTL code that may be made by the target synthesis tool.
Input files 362 to VEGA are the dump files described herein, created using the dump script described above. Dump files are ASCII files that contain a human-readable description of Design Compiler's generic netlist.
Continuing to refer to FIG. 13, dump files 362 are read in, using a parser 363 that builds a data structure to represent the design information they contain. This data structure is optimized for fast information query.
Then, a setup file 365 named "VEGA_extract.setup", is read in. This file is available for VEGA users to configure and drive design information extraction and reporting. The setup file is discussed in further detail below.
The IC design loaded from the dump files 362 is "linked" 366 to VEGA libraries 368. The linking process 366 comprises the steps of matching, or mapping, all cells of the dump files 362 with cells described in a library 368. VEGA libraries 368 contain descriptions of Synopsys Design Compiler cells that are used in generic netlists created by the "elaborate" command, and of LSI Logic technology cells that are used in mapped designs or mapped design pieces. Linking to libraries allows identification of all cells used in dump files, together with their pins that have special functions, like clock pins, clear pins, RAM write enable pins, etc.
The next step in the VEGA process is to pre-process 370 the generic cells. This step applies only to designs that were built from RTL code by Design Compiler's "elaborate" command. It consists of cleaning up designs by removing cells that drive nothing and nets that are either unloaded or undriven, and of extracting the function of sequential cells. Design Compiler represents all generic flipflops and latches as instances of a component named "SEQGEN". Depending on how the pins of a "SEQGEN" cell are used, the cell implements either a flipflop or a latch.
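The extraction of the function of a generic sequential cell can be sketched as a decision on which pins are actually used. The Python sketch below is hedged on several points: the pin names "clocked_on" and "enable", the convention that unused pins are tied to a constant net, and the constant markers are assumptions made for this illustration and should be checked against the actual "SEQGEN" description.

```python
# Hypothetical markers for pins that are tied off or left open.
CONSTANT_NETS = {"0", "1", None}


def classify_seqgen(pin_nets):
    """Classify a generic sequential cell from the nets on its pins.

    `pin_nets` maps a pin name to the name of the connected net.
    """
    def used(pin):
        return pin_nets.get(pin) not in CONSTANT_NETS

    if used("clocked_on"):      # assumed edge-triggered clock input
        return "flipflop"
    if used("enable"):          # assumed level-sensitive enable input
        return "latch"
    return "unknown"


kind_ff = classify_seqgen({"clocked_on": "CLK", "next_state": "n5", "enable": "0"})
kind_latch = classify_seqgen({"enable": "EN", "data_in": "D", "clocked_on": "0"})
kind_open = classify_seqgen({})
```

Returning "unknown" rather than guessing keeps unexpected pin usage visible in later reports.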
Following the pre-processing 370 of the generic cells, nets with specific functions are identified 372, including multiply-driven nets, clock nets, asynchronous reset nets, and RAM write enable nets.
Design information reports 380 are created 374. Various reports which can be generated are described in detail in the following sections. Finally, synthesis scripts 378 are generated 376.
The "VEGA_extract.setup" file 365 can be used to bypass design information reporting by inserting an appropriate command line. In a similar way, the "VEGA_synthesis.setup" file 364 can be used to bypass synthesis script generation.
At each of the above described steps, the VEGA system provides status messages to the operator of the VEGA system. These messages can be directed to the console or to a log file. Examples of messages displayed by VEGA when processing a design are given in Appendix B.
D. RTL ANALYSIS USING A GENERIC NETLIST RATHER THAN THE RTL CODE
As discussed above, RTL code in Verilog or VHDL can be parsed in order to identify key hardware elements. Such key hardware elements can include flipflops, latches, tristate buffers, bidirectional buffers and memories. With respect to these key hardware elements, key pins with the elements' active edges or levels can also be identified.
For example, with regard to a flipflop, the following information can be extracted: The data input pin; the clock pin with an active edge (rising or falling), a clear pin with an active level (low or high); and a preset pin with active level (low or high).
In addition to the key hardware elements, interconnections between hardware elements must be understood. Finally, the function of the clouds, or sets, of combinational logic needs to be understood to some extent, so that RTL analysis is able to track design issues.
For example, referring to FIG. 33, the Verilog-HDL code given in box 300 implies that signal "Z" is the output of a flipflop 301 that is clocked by signal "INT_CLK" 302, which is created by gating 303 signal "CLK" 304 with "GCLK" 305. This information is critical from the RTL analysis perspective because it indicates a gated clock (this hardware structure is the definition of a gated clock). On the other hand, the exact function of the combinational cloud "C" 306 is not necessarily needed. Knowing that its inputs are "D0" 307, "D1" 308 and "SEL" 309 could be sufficient.
One approach to building the hardware view needed for RTL analysis is to process directly the RTL code and create a specific data structure to represent it. RTL code is technology-independent, and is also supposed to be independent from design tools. So RTL analysis could focus on HDLs, supporting both Verilog-HDL and VHDL, which are the two standard languages currently in use in the industry. However, it may be more advantageous to analyze the IC design further down the design cycle.
By utilizing the RTL code directly for design analysis, key design issues can be missed. This is because Verilog-HDL and VHDL were both developed as simulation languages, before logic synthesis tools were made available, and have no formally defined synthesis semantics.
Preferably, therefore, instead of processing the RTL code, RTL analysis can process a "generic netlist" created by the target logic synthesis tool through "translation", which is the first step of logic synthesis.
1. Examples of Design Issues That Can Be Missed When Directly Using the RTL Code
As mentioned above, some key design issues can be missed if RTL analysis extracts information directly from the RTL code. Because the synthesis semantics of Verilog-HDL and VHDL were not formally defined, logic synthesis tools can sometimes interpret RTL code in multiple ways. Several representative examples are provided in the following paragraphs.
a. Register with Partial Asynchronous Reset
Flipflops that have an asynchronous reset, either clear or preset, have a larger area than flipflops that do not have an asynchronous reset. As the number of flipflops in a design can be high, when die size is a critical issue, designers often optimize the number of flipflops that have an asynchronous reset. The objective is to minimize the number of flipflops with asynchronous reset, while making sure that gate-level simulation can successfully proceed after activating reset signals.
Using VHDL, the most intuitive RTL model for a register with partial asynchronous reset is illustrated in FIG. 34A (Verilog-HDL could be used instead of VHDL with the same conclusions). In this example, the register includes two flipflops "Q1" and "Q2". "Q1" can be reset asynchronously, through setting signal "RESET_N" to logic zero. "Q2" cannot be reset. Most designers would model partial asynchronous reset this way, and would expect the logic synthesis tool to use a flipflop with clear for "Q1", and a flipflop with no clear for "Q2".
FIG. 34B shows how Synopsys Design Compiler maps the VHDL code 310 of FIG. 34A to a target technology. A multiplexer 311 is used to hold the value of "Q2" 312a when "RESET_N" 313 is low. This configuration obviously has a larger gate count than a flipflop with clear and a flipflop with no clear. It also has a larger gate count than two flipflops with an asynchronous reset, and involves more wires with all associated issues, including routing, delays, etc. Q1 314a is reset by signal RESET_N 313.
FIG. 34C shows how AMBIT BuildGates maps the same piece of VHDL code 310 to a target technology. This time, the hardware has the structure that most designers would expect, that is a flipflop 314b with clear for "Q1" and a flipflop 312b with no clear for "Q2". It should be noted that this issue is not VHDL-specific. A similar RTL model could be written using Verilog-HDL, and the results would be identical.
Neither Design Compiler nor BuildGates can be declared as being "right" or "wrong" in this situation. First, Design Compiler favors consistency of RTL and gate-level simulation. Strictly looking at the VHDL model 310 of FIG. 34A, "Q2" cannot change value when "RESET_N" is low. This is implemented at the gate level by using a multiplexer that holds the value of "Q2" when "RESET_N" is low. But this implementation is not the one most designers would expect for such a model.
Second, BuildGates favors the designer's perspective. It does create the implementation that most designers would expect. But RTL and gate-level simulations can diverge. If there is a rising edge of "CLK" when "RESET_N" is low, "Q2" cannot change value in the RTL model, but can change value in the gate-level implementation.
This is a typical case in which different logic synthesis tools can interpret the same RTL code in multiple ways. Therefore, decisions made by the target logic synthesis tool cannot necessarily be analyzed with the RTL code by itself.
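The divergence between the two interpretations can be reproduced with a very small behavioral model. The Python sketch below is an illustration under a stated assumption: the RTL model of FIG. 34A is taken to have the usual form in which the clocked assignments are only reached when reset is inactive. The function names and the data input "D2" are hypothetical.

```python
def rtl_q2_next(q2, d2, reset_n):
    """RTL view: the clocked branch is skipped while reset is active,
    so Q2 holds its value (the behavior preserved by the multiplexer)."""
    return q2 if reset_n == 0 else d2


def gate_q2_next(q2, d2, reset_n):
    """Gate-level view with a plain flipflop without clear:
    Q2 captures D2 on every rising clock edge, whatever RESET_N is."""
    return d2


# One rising edge of CLK while RESET_N is low, with Q2 = 0 and D2 = 1.
rtl_next = rtl_q2_next(0, 1, reset_n=0)
gate_next = gate_q2_next(0, 1, reset_n=0)

# The same edge with RESET_N high: both views agree.
rtl_normal = rtl_q2_next(0, 1, reset_n=1)
gate_normal = gate_q2_next(0, 1, reset_n=1)
```

The two models differ only while reset is active, which is why the discrepancy is easy to miss in simulations that never clock the design during reset.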
b. Unconnected Pins on Module Instances
Neither Verilog-HDL nor VHDL requires that input pins of instantiated modules be connected to any net. As a result, some module input pins can be left unconnected when modules get instantiated.
FIG. 35A shows an example of module 320 with unconnected input pins 321. Module "M1" 320 instantiated in module "TOP" 322 has its input pin "A" 321 left unconnected in the context of module "TOP" 322. Module M2 323 does not have any unconnected pins.
Synopsys Design Compiler ties unconnected module input pins to logic zero (FIG. 35B). During synthesis, the logic is simplified based on this assumption. Only a warning is issued to tell users that unconnected input pins have been tied to zero. Referring to FIG. 35B, the unconnected pin 321 is now connected to ground, a logic zero, in this example.
This is an arbitrary decision that is made by the logic synthesis tool and that is not present in the RTL code. Different logic synthesis tools can make different choices, for example leaving pins unconnected without simplifying logic during synthesis, or tying them to logic one. Here again, decisions made by the target logic synthesis tool have to be analyzed instead of the RTL code itself.
c. Enumerated Types in VHDL
VHDL supports "enumerated data types", which are defined as lists of abstract user-defined values. Enumerated types are typically used to model state vectors of state machines. For example, Appendix K shows a state machine with a state vector defined as an enumerated type:
type TYPE_STATE_VECTOR is (IDLE, WAIT_FOR_WINDOW, IN_WINDOW,
ERROR_SEEN);
signal STATE_VECTOR : TYPE_STATE_VECTOR;
Enumerated types have no direct hardware representation since they are just lists of abstract values signals can hold. Therefore, logic synthesis tools have to decide on a binary encoding for those values. By default, logic synthesis tools encode enumerated types using a compact code, based on the declaration order of values in the enumeration list. For the example of Appendix K, the following encoding would be used:
IDLE "00"
WAIT_FOR_WINDOW "01"
IN_WINDOW "10"
ERROR_SEEN "11"
Synopsys Design Compiler provides a specific VHDL attribute that can be used to enforce a different encoding style. For the example of Appendix K, the following VHDL declarations could be used:
attribute ENUM_ENCODING : STRING;
type TYPE_STATE_VECTOR is (IDLE, WAIT_FOR_WINDOW, IN_WINDOW,
ERROR_SEEN);
signal STATE_VECTOR : TYPE_STATE_VECTOR;
attribute ENUM_ENCODING of TYPE_STATE_VECTOR: type is "0001 0010 0100 1000";
This would result in the following encoding for the state vector, that is referred to as "one-hot encoding":
IDLE "0001"
WAIT_FOR_WINDOW "0010"
IN_WINDOW "0100"
ERROR_SEEN "1000"
The "enum_encoding" attribute has no effect on the simulation behavior of VHDL models. It is specific to Synopsys Design Compiler, and considered as a comment by other logic synthesis tools that will ignore it. As a result, the number of flipflops used to represent enumerated types in hardware cannot be predicted in a reliable manner through analyzing the VHDL code. Decisions made by the target logic synthesis tool have to be analyzed instead.
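The two encodings discussed above can be derived mechanically from the declaration order of the enumeration values. The following Python sketch is illustrative only; the function names are hypothetical, and the one-hot bit order is chosen to match the example above, in which the first value occupies the least significant bit.

```python
def compact_encoding(values):
    """Binary code in declaration order, using the fewest possible bits."""
    width = max(1, (len(values) - 1).bit_length())
    return {value: format(index, f"0{width}b")
            for index, value in enumerate(values)}


def one_hot_encoding(values):
    """One bit per value; the first value gets the least significant bit."""
    width = len(values)
    return {value: format(1 << index, f"0{width}b")
            for index, value in enumerate(values)}


states = ["IDLE", "WAIT_FOR_WINDOW", "IN_WINDOW", "ERROR_SEEN"]
compact = compact_encoding(states)
one_hot = one_hot_encoding(states)
flipflops_compact = len(compact["IDLE"])   # width of the state vector
flipflops_one_hot = len(one_hot["IDLE"])
```

For four states, the compact code needs two flipflops and the one-hot code needs four, which is the difference that cannot be predicted from the VHDL type declaration alone.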
2. RTL Code Translation and Generic Netlists Created by Logic Synthesis Tools
The process used by the logic synthesis tools can be modeled as a two-step process as illustrated by FIG. 36. First, the "translation" step 330 transforms the RTL code 331 to a "generic netlist" 333, i.e., a netlist that instantiates abstract cells, referred to as "generic cells", that do not belong to any particular technology library. The "optimization and mapping" step 334 then optimizes the generic netlist 333 using boolean and algebraic optimization and then maps it to the target technology library 335 based on constraints and design rules 336 that are specified by the user. The generic cells that are instantiated in generic netlists 333 are similar to cells used in technology netlists 337, for example, basic gates, multiplexers, flipflops, latches, etc.
From the RTL analysis perspective, the key aspect of generic netlists created by translation is that they fully describe the "logic synthesis view" of the RTL code. All choices associated with constructs that do not have uniquely defined synthesis semantics and all decisions that are made by the target logic synthesis tool are reflected in those netlists.
In particular, generic netlists created by translation reflect the following:
All decisions made to make sure that RTL and gate-level simulations are consistent. This includes "latch inferring", which consists of using latches to store the value of combinational outputs under input conditions that are not covered in the RTL code (Verilog-HDL and VHDL signals retain their current value until they get modified). This also includes partial asynchronous reset, which has been described above;
All decisions made to map complex abstract types offered by VHDL onto hardware structures. This includes enumerated types, which have been described above, and also record types and array-of-array types; and
All kinds of arbitrary design decisions made by the logic synthesis tool. This includes unconnected module input pins in RTL code, which get tied to logic zero in logic synthesis.
It must be noted that tracking some specific issues can also require some knowledge about the target technology library. In particular, this is the case for latches that are modeled in the RTL code as having an asynchronous reset, either clear or preset. This information is of course reflected in the generic netlist. But if the target technology library does not include latches with asynchronous reset, most logic synthesis tools will implement them using the enable pins of latches together with some gates. On silicon, such an implementation can only fail, because of signal races due to delays introduced by wires. This is a well known cause of chip failure, and an example of such a circuit is illustrated in FIG. 37.
Referring to FIG. 37, a failing implementation of a latch with clear signal is illustrated. The latch 390 receives its data from an AND gate 391. In the example of FIG. 37, a reset signal 393 is implemented using an enable signal 394 with a NAND gate 392. This implementation fails.
3. Correlation of RTL Code and Generic Netlist Objects
Performing RTL analysis based on generic netlists created by the target logic synthesis tool instead of the RTL code raises a correlation issue. Reports created by RTL analysis tools will then be based on objects of the generic netlist and not on constructs of the RTL code. But designers actually need to be able to identify design issues in the RTL code so that they can understand how issues were created and how they could be fixed.
Translation creates generic netlists by mirroring the RTL code. Templates in the RTL code, which are based on basic statements and constructs of the HDL, are transformed in a straightforward manner to equivalent hardware structures. Some examples of such transforms are illustrated in FIG. 38.
Referring now to FIG. 38, examples of transforms used for RTL code translation to hardware structures are illustrated. HDL statements and constructs 400, 402 and 404 are transformed to hardware equivalents 400', 402' and 404'. The hardware structure 400' implements the HDL construct 400 using a multiplexer 406 that receives data inputs D0 and D1 and provides output Z; the selection between D0 and D1 is performed by an OR gate 408 having signals A and B as inputs. The hardware structure 402' utilizes a 4×1 multiplexer 410 to select one of four possible inputs as the output. The addition, subtraction, AND, and XOR functions are implemented by an adder circuit 412, a subtractor circuit 414, an AND gate 416, and an XOR gate 418, respectively. The hardware structure 404' utilizes a flipflop 420 which is reset by the negative edge of signal RESET_N 421.
Names of objects in the RTL code are used as much as possible to name objects in generic netlists, as follows:
RTL-code names for modules are always retained;
RTL-code names for module I/O ports are always retained;
RTL-code names for signals, that become nets in generic netlists, are always retained; and
Cells of generic netlists, which do not exist in RTL code, are named based on the RTL-code names of the signals they generate and on naming rules. For example, Synopsys Design Compiler names flipflops and latches by adding a "_reg" suffix to the RTL-code names of the signals they generate.
VHDL signals that use abstract data types are mapped to net names using straightforward naming conventions, that can be re-defined by users. Appendix L shows examples of net names created by Synopsys Design Compiler for RTL-code abstract-type signals.
Because of those naming conventions and because the structure of the generic netlist mirrors the structure of the RTL code, design issues that are reported in terms of objects of the generic netlist can be easily correlated with corresponding constructs in the RTL code. The VEGA system uses the generic netlist created by Synopsys Design Compiler.
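As an illustration of this correlation, the sketch below maps the name of a sequential cell in the generic netlist back to the RTL signal it generates by stripping the "_reg" suffix described above. It is a minimal Python sketch, not VEGA code: the function name is hypothetical, and the handling of a bracketed bit index after the suffix (for example "COUNT_reg[3]") is an assumption about how individual bits of a vector might be named.

```python
REG_SUFFIX = "_reg"  # suffix added to flipflop and latch names, per the text


def rtl_signal_for_cell(cell_name):
    """Return the RTL signal name for a flipflop/latch cell, or None.

    A trailing bit index such as "[3]" is assumed (not taken from the
    source) to follow the suffix, and is carried over to the signal name.
    """
    base, bracket, index = cell_name.partition("[")
    if base.endswith(REG_SUFFIX):
        return base[: -len(REG_SUFFIX)] + bracket + index
    return None  # not a cell named after a registered signal


SCALAR = rtl_signal_for_cell("STATE_VECTOR_reg")
INDEXED = rtl_signal_for_cell("COUNT_reg[3]")
OTHER = rtl_signal_for_cell("U12")
```

A report entry against the cell "STATE_VECTOR_reg" can then be presented to the designer in terms of the RTL signal "STATE_VECTOR".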
4. Benefits of the Netlist Analysis
As described above, working from generic netlists created by the target synthesis tool rather than from the RTL code itself better ensures that design issues will not be missed. In addition to fulfilling this requirement, the approach that has been described has additional substantial benefits as listed below:
First, there is no need to develop analyzers for RTL models in Verilog-HDL and VHDL because RTL code analysis is performed by logic synthesis tools. Developing analyzers for Verilog-HDL is a significant task. For VHDL, because of the sophisticated compilation mechanisms and complex data types the language provides, it becomes an ambitious project that requires language and compiler construction expertise.
Second, RTL code that instantiates some components from a target technology, which is quite a frequent situation, is easily handled. RTL analysis always works on netlists, that instantiate either generic or technology cells. The architecture of RTL analysis tools is then simplified, together with required algorithms.
Third, RTL code analysis tools can be used to investigate designs that have been mapped to a generic library, like for example the Synopsys GTECH library. Mapping RTL code to a generic library, that does not involve any timing and design rules, is at least one order of magnitude faster than synthesizing to a technology library. The obtained netlist is closer to a technology implementation than the generic netlist created by translation, because gates are less abstract (in particular multiplexers) and arithmetic operators are built from gates (instead of consisting in high-level cells like N-bit adders and subtractors). An implementation to a generic library is appropriate to obtain better gate-count and fanout estimates, that are more accurate than estimates based on the generic netlist created by translation. It can also be used to track technology-specific issues, like failing implementations of latches with asynchronous reset that have been described above. Correlation with RTL code is more difficult though, due to re-structuring, boolean and algebraic optimizations, name changes, and mapping of arithmetic operators to gates.
Finally, RTL analysis tools can also be used for designs that are fully mapped to a technology library. In this case, the netlist only instantiates technology cells. Analysis tools for mapped designs have a lot of value, in particular to analyze netlists delivered by customers for sign-off.
E. RTL Design Analysis and Reporting
The analyses performed and the reports 380 of FIG. 13 produced by the VEGA system include the following:
1. Design Checks Analysis and Report
RTL DRC (Design Rule Checking) consists of checking the design against basic rules as discussed above. The severity of DRC violations ranges from showstoppers to simple warnings.
For each module of the design that is being analyzed, the following checks are performed by VEGA, and reported in a file that is referred to as "design checks report":
Ports that are directly connected together, including "feedthroughs" that are direct connections of an input port to an output port. Ports connected together used to be a problem for some EDA tools. Today, most tools can deal with them, and just issue warnings. In the RTL code, this appears as an output port that is assigned directly from an input port or from another output port. In the generic netlist, there is a net connecting the ports together. The user needs to modify the code to fix the design.
Bidirectional ports and pins. When developing RTL VHDL models, designers sometimes introduce bidirectional ports to deal with output ports that are both assigned and read in the same module. The "buffer" mode should be used for such ports, not the bidirectional "inout" mode. VEGA's list of all bidirectional ports and pins allows users to quickly check that bidirectional ports are used only when appropriate and to make appropriate corrections to the design. The direction of ports is explicit in the RTL code, and has to be declared when writing models. If the port mode is INOUT (both Verilog-HDL and VHDL), then the port is bidirectional. A bidirectional pin is a bidirectional port of a module that is instantiated. All this information is also available in the generic netlist.
Unconnected ports and pins. Some RTL coding errors, in particular when instantiating components, can show up as unconnected ports and pins. Most further design tools will issue warnings. In the generic netlist, no net is connected to the port or pin. In RTL code, the value of the port/pin is never read (used) if an input port/pin, and never written (assigned a value) if an output port/pin.
Ports and pins permanently tied to zero or one. Some RTL coding errors can show up as tied ports or tied pins. Testability issues can also be associated with tied ports and pins, and most further design tools will issue warnings. Synopsys Design Compiler ties to zero input pins that have been left unconnected, so undriven input pins show up in VEGA's design checks report as pins that are tied to zero. In the generic netlist, ports and pins are connected to special nets called logic_0 and logic_1, which represent logical states zero and one. In RTL models, they are connected to a signal that is assigned a constant value (either '0' or '1') or, if they are output pins/ports, are directly assigned a constant.
Nets that connect multiple pins on the same instance. Modules that have several input ports shorted together by external nets will get reported. In this case, the clarity of RTL code would require that multiple input ports shorted together are replaced with a single port. These nets are easily identifiable through examining sets of pins connected by inter-module nets in the generic netlist, or inter-module signals in RTL.
Floating nets that have no driver. A Y-pattern generator (DC current testing) may not handle floating nets, which have to be removed to run the tool successfully. Other design tools can also issue warnings on floating nets. Floating nets are identified in the generic netlist by looking at the set of pins connected to each net: if the net has no driving pin/port, it is floating. Floating nets can be introduced in RTL, for example by a signal that is used but never assigned a value, but such nets are arbitrarily removed or tied to zero by Synopsys Design Compiler when building the generic netlist. In most cases, floating nets are introduced by the logic synthesis tool and are still there in the final netlist. So it really makes more sense to search for floating nets in the generic netlist.
Unloaded nets that drive nothing. Further design tools can issue warnings on unloaded nets. Errors forcing unloaded nets to be removed should not be encountered though. These nets are identified using the same technique used to identify the floating nets.
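The net-based checks in the list above (tied pins, floating nets, unloaded nets) all reduce to examining the set of pins and ports connected to each net. The following Python sketch shows one way this could be done. It is an assumption-laden illustration rather than the VEGA implementation: the Connection record, the "driver"/"load" roles, and the dictionary netlist model are hypothetical, and the two constant nets are written here as logic_0 and logic_1.

```python
from collections import namedtuple

# Hypothetical netlist model: each net maps to the pins/ports attached to it.
# role is "driver" (cell output or module input port) or "load" (the reverse).
Connection = namedtuple("Connection", "owner pin role")

CONSTANT_NETS = ("logic_0", "logic_1")  # special nets for logical zero and one


def classify_nets(nets):
    """Return (floating, unloaded) net names, skipping the constant nets.

    A net with no connection at all is reported in both lists.
    """
    floating, unloaded = [], []
    for name in sorted(nets):
        if name in CONSTANT_NETS:
            continue
        roles = {conn.role for conn in nets[name]}
        if "driver" not in roles:
            floating.append(name)   # nothing drives this net
        if "load" not in roles:
            unloaded.append(name)   # this net drives nothing
    return floating, unloaded


def tied_pins(nets):
    """Return (owner, pin, constant net) for every pin tied to zero or one."""
    return [
        (conn.owner, conn.pin, name)
        for name in CONSTANT_NETS
        if name in nets
        for conn in nets[name]
        if conn.role == "load"
    ]


NETS = {
    "n1": [Connection("U1", "Z", "driver"), Connection("U2", "A", "load")],
    "n2": [Connection("U2", "B", "load")],    # floating: no driver
    "n3": [Connection("U2", "Z", "driver")],  # unloaded: no load
    "logic_0": [Connection("U3", "A", "load")],
}
FLOATING, UNLOADED = classify_nets(NETS)
TIED = tied_pins(NETS)
```

In this small example, net n2 is reported as floating, net n3 as unloaded, and pin A of instance U3 as tied to zero.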
Appendix D1 gives an example of design checks report. Options are available in the "VEGA_extract.setup" setup file to control the generation of design checks reports, and are described in further sections.
2. Cell Analysis and Report
The "cell report" provides information about the types and numbers of cells that are used in each module.
For modules that consist of generic logic created through RTL code translation by Design Compiler's "elaborate" command, the following information is reported:
Total number of generic cells and total number of cell pins. VEGA counts the number of generic cells in the generic netlist for each module, and also counts and accumulates the number of pins on each cell. These totals give a rough indication of the complexity of the module, which designers use to re-arrange the hierarchy and make it more suitable for synthesis. Rough estimates can also be obtained by searching for templates in the RTL code and associating a gate count with them. Designers can compile to the target technology, with or without design rules and timing constraints, to get better estimates.
List of DesignWare cells. Those cells contain arithmetic logic that is inferred from operators in the RTL code, like '+', '-' or '*'. The complexity of DesignWare cells can be quite high, so it is important to provide a complete list of them. Names used by Design Compiler contain all the useful information, including the type of operator (adder, subtractor, multiplier, etc.), the sizes of operands, and the size of the result. The DesignWare cells are easily identified in the generic netlist based on their naming style. For example, ADD_UNS_8_4_8 is an unsigned adder, with 8-bit and 4-bit operands and an 8-bit result. VEGA actually uses libraries to identify DesignWare cells with pattern matching. For example, "ADD_UNS*n" is used to represent unsigned adders. DesignWare cells would also be easy to identify in the RTL code. For example, in most cases, the '+' operator will indicate an adder. The knowledge of DesignWare components is used by designers to have an idea of the complexity of a module. DesignWare components can very significantly increase the gate count. Also, the designer may decide to retain or dissolve the specific modules that are created to encapsulate DesignWare logic.
List of technology cells. It is always important to have a close look at technology cells that have been instantiated by designers in the RTL code. A typical case is delay cells used to generate RAM write enable pulses. Special attention needs to be paid to those cells, both in synthesis and placement. This list gives pointers to pieces of code where the RTL designers have done things that are tricky, potentially unsupported, or likely to cause problems later on. In particular, delay cells should be used only to build pulse generators for RAM write enables.
For modules that consist of mapped logic, the report provides the complete list of cells of the target technology that have been used.
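The identification of DesignWare cells by naming style, as described for generic modules above, can be sketched with shell-style pattern matching. The Python below is only an illustration under stated assumptions: the pattern table stands in for VEGA's libraries and is hypothetical, only the unsigned-adder prefix and the name ADD_UNS_8_4_8 come from the text, and the subtractor entry is invented to show how the table would grow.

```python
import fnmatch
import re

# Hypothetical pattern library. "ADD_UNS" is taken from the text; the
# subtractor pattern is an assumed entry added for illustration only.
DESIGNWARE_PATTERNS = {
    "ADD_UNS_*": "unsigned adder",
    "SUB_UNS_*": "unsigned subtractor",
}

# Trailing "_<operand A>_<operand B>_<result>" sizes, as in ADD_UNS_8_4_8.
SIZES = re.compile(r"_(\d+)_(\d+)_(\d+)$")


def identify_designware(cell_name):
    """Return (kind, (size_a, size_b, size_result)) or None if no pattern matches."""
    for pattern, kind in DESIGNWARE_PATTERNS.items():
        if fnmatch.fnmatchcase(cell_name, pattern):
            match = SIZES.search(cell_name)
            sizes = tuple(int(group) for group in match.groups()) if match else None
            return kind, sizes
    return None


ADDER = identify_designware("ADD_UNS_8_4_8")
DELAY = identify_designware("DEL4")  # a technology cell, not DesignWare
```

Here the adder name decodes to 8-bit and 4-bit operands with an 8-bit result, while a technology cell such as "DEL4" matches no pattern and would be listed with the technology cells instead.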
Appendix D2 provides examples of cell reports, including:
(a) Hierarchical module, also containing a synchronous RAM;
(b) Generic module. Note the list of DesignWare cells, and the technology cell "DEL4" that has been instantiated in the RTL code to delay a signal by 4 ns; and
(c) Mapped module.
3. Design Hierarchy Analysis and Report
A "design hierarchy report" provides an easy-to-read view of the design hierarchy, together with the following key synthesis-related information:
Module names are indented according to their depth relative to the top level of the design.
For each module, the total number of generic cells and the total number of technology cells used in the module are indicated, together with corresponding numbers of cell pins. The number of generic Designware cells is also indicated, and all memories are explicitly listed. This information can be used to obtain