United States Patent
6573905
MacInnis et al.
June 3, 2003
Title
Video and graphics system with parallel processing of graphics windows
Abstract
A display engine of a video and graphics system includes one or more processing elements and receives graphics from a memory. The graphics data define multiple graphics layers, and the processing elements process two or more graphics layers in parallel to generate blended graphics. Alpha values may be used while blending graphics. The processing elements may be integrated on an integrated circuit chip with an input for receiving the graphics data and other video and graphics components. The display engine may also include a graphics controller for receiving two or more graphics layers in parallel, for arranging the graphics layers in an order suitable for parallel processing, and for providing the arranged graphics layers to the processing elements. The blended graphics may be blended with HDTV video or SDTV video, which may be extracted from compressed data streams such as an MPEG Transport stream.
Inventors:
MacInnis; Alexander G. (Los Altos, CA), Tang; Chengfuh Jeffrey (Saratoga, CA), Xie; Xiaodong (San Jose, CA)
Assignee:
Broadcom Corporation (Irvine, CA)
Appl. No.:
641935
Filed:
August 18, 2000
Current U.S. Class:
345/629; 345/630; 348/597; 345/501; 345/502; 345/505; 345/506; 345/519
Field of Search:
345/597,601,603,604,501,502-503,505-506,519-522,530-532,536-538,545-549,555 348/589,599,715,597 710/124,244
Primary Examiner:
Bella; Matthew C.
Assistant Examiner:
Sajous; Wesner
Attorney, Agent or Firm:
Christie, Parker & Hale, LLP
Parent Case Text
CROSS-REFERENCE TO RELATED APPLICATION(S)
This application is a continuation-in-part of U.S. patent application Ser. No. 09/437,208, filed Nov. 9, 1999 and entitled "Graphics Display System," and claims the benefit of the filing date of U.S. provisional patent application No. 60/170,866, filed Dec. 14, 1999 and entitled "Graphics Chip Architecture," the contents of which are hereby incorporated by reference.
The present application contains subject matter related to the subject matter disclosed in U.S. patent application Ser. No. 09/641,374 entitled "Video, Audio and Graphics Decode, Composite and Display System," U.S. patent application Ser. No. 09/641,936 entitled "Video and Graphics System with an MPEG Video Decoder for Concurrent Multi-Row Decoding," U.S. patent application Ser. No. 09/643,223 entitled "Video and Graphics System with MPEG Specific Data Transfer Commands," U.S. patent application Ser. No. 09/640,870 entitled "Video and Graphics System with Video Scaling," U.S. patent application Ser. No. 09/640,869, now issued as U.S. Pat. No. 6,538,656 on Mar. 25, 2003 entitled "Video and Graphics System with a Data Transport Processor," U.S. patent application Ser. No. 09/641,930 entitled "Video and Graphics System with a Video Transport Processor," U.S. patent application Ser. No. 09/642,510 entitled "Video and Graphics System with a Single-Port RAM," and U.S. patent application Ser. No. 09/642,458 entitled "Video and Graphics System with an Integrated System Bridge Controller," all filed Aug. 18, 2000.
Claims
What is claimed is:
1. A display engine comprising: an input for receiving data representing graphics from a memory, wherein the data representing graphics defines a plurality of graphics windows; one or more processing elements capable of blending the plurality of graphics windows in parallel to generate blended graphics to be blended with video; and a plurality of processing pipelines, each pipeline being capable of processing one of the plurality of graphics windows to place in a common format for blending in the processing elements.
2. The display engine of claim 1 wherein the input for receiving data representing graphics and said one or more processing elements are integrated on an integrated circuit chip.
3. The display engine of claim 1 further comprising: a graphics controller for receiving the plurality of graphics windows in parallel from the processing pipelines, for arranging the graphics windows in an order suitable for parallel processing, and for providing the arranged graphics windows to the one or more processing elements.
4. The display engine of claim 1, wherein each pipeline monitors whether its graphics window is in range for blending, and if it is not in the range, puts out a pixel equivalent to a transparent one so that it will have no effect on a net output when blended with graphics windows from other pipelines.
5. The display engine of claim 1, wherein each graphics window comprises a plurality of pixels, and the graphics windows are processed in the processing pipelines in parallel and in synchronization such that the graphics windows are aligned to each other pixel by pixel in the processing pipelines.
6. The display engine of claim 1, wherein if a particular processing pipeline is not ready, then all pipelines stall and wait for the particular pipeline to become ready again.
7. The display engine of claim 3, wherein the graphics controller synchronizes the processing pipelines and stalls the pipelines if necessary so that the graphics windows processed in the processing pipelines are aligned in order to be blended together in the processing elements.
8. The display engine of claim 3, wherein the graphics controller orders the graphics windows in accordance with a window depth order.
9. The display engine of claim 3, wherein each processing pipeline comprises a graphics converter for converting the format of one of the plurality of graphics windows to the common format, and for providing it to the graphics controller.
10. The display engine of claim 9 wherein each processing pipeline comprises a CLUT for converting the format of said one of the plurality of graphics windows from a CLUT format to the common format.
11. The display engine of claim 1 further comprising: a graphics filter for receiving and spatially processing the blended graphics prior to compositing the blended graphics with the video.
12. The display engine of claim 11 wherein the spatial processing includes filtering.
13. The display engine of claim 11 wherein the spatial processing includes scaling.
14. The display engine of claim 1 wherein a plurality of alpha values are used to blend the plurality of graphics windows to generate the blended graphics.
15. The display engine of claim 14 wherein the plurality of alpha values include at least one alpha value per each of the plurality of graphics windows.
16. The display engine of claim 15 wherein the plurality of alpha values include at least one alpha value per each of the plurality of pixels.
17. The display engine of claim 14 wherein the plurality of alpha values are blended to generate at least one composite alpha value.
18. The display engine of claim 11 further comprising a plurality of line buffers, and wherein the blended graphics include a plurality of lines of pixels, and each of the plurality of lines of pixels is loaded into one of the plurality of line buffers to be provided to the graphics filter.
19. The display engine of claim 1 wherein the one or more processing elements include three graphics blenders, and wherein at least two graphics windows are blended together in each of first two of the graphics blenders in parallel, and outputs of the first two of the graphics blenders are provided to the last graphics blender for blending to generate the blended graphics.
20. A method of blending a plurality of graphics windows to generate blended graphics comprising the steps of: receiving data representing graphics, the data representing graphics defining the plurality of graphics windows, wherein each graphics window comprises a plurality of pixels; processing the graphics windows, each graphics window in one of processing pipelines, in parallel and in synchronization such that the graphics windows are aligned to each other pixel by pixel in the processing pipelines; and blending two or more of the plurality of graphics windows in parallel to generate the blended graphics to be blended with video.
21. The method of blending a plurality of graphics windows of claim 20 further comprising the step of: arranging two or more graphics windows in an order suitable for parallel processing.
22. The method of blending a plurality of graphics windows of claim 21 further comprising the step of: converting formats of the plurality of graphics windows to a common format.
23. The method of blending a plurality of graphics windows of claim 22 wherein the step of converting the format of the plurality of graphics windows to a common format comprises the step of converting the format of the plurality of graphics windows from one or more CLUT formats to the common format.
24. The method of blending a plurality of graphics windows of claim 20 further comprising the step of spatially processing the blended graphics.
25. The method of blending a plurality of graphics windows of claim 24 wherein the step of spatially processing the blended graphics comprises the step of filtering the blended graphics.
26. The method of blending a plurality of graphics windows of claim 24 wherein the step of spatially processing the blended graphics comprises the step of scaling the blended graphics.
27. The method of blending a plurality of graphics windows of claim 20 wherein the step of blending two or more of the plurality of graphics windows in parallel comprises the step of blending two or more of the plurality of graphics windows in parallel using a plurality of alpha values.
28. The method of blending a plurality of graphics windows of claim 27 wherein the plurality of alpha values include at least one alpha value per each of the plurality of graphics windows.
29. The method of blending a plurality of graphics windows of claim 28 wherein each of the plurality of graphics windows includes a plurality of pixels, and the plurality of alpha values include at least one alpha value per each of the plurality of pixels.
30. The method of blending a plurality of graphics windows of claim 28 further comprising the step of: blending the plurality of alpha values to generate at least one composite alpha value.
31. A system comprising: a transport processor for receiving a plurality of compressed data streams, at least one of the plurality of compressed data streams including video data; a video decoder for decoding the video data to generate decoded video data; a display engine for receiving a plurality of graphics windows and for blending them in parallel to generate blended graphics; and a video compositor for blending the decoded video data with the blended graphics.
32. The system of claim 31 wherein the transport processor, the video decoder, the display engine and the video compositor are integrated on an integrated circuit chip.
33. The system of claim 31 wherein the plurality of compressed data streams include one or more MPEG Transport streams, and the transport processor includes an MPEG Transport processor.
34. The system of claim 31 wherein the video data include MPEG-2 video data and the video decoder includes an MPEG-2 video decoder.
35. The system of claim 31 wherein the video data include SDTV video data.
36. The system of claim 31 wherein the video data include HDTV video data.
Description
FIELD OF THE INVENTION
The present invention relates generally to integrated circuits, and more particularly to an integrated circuit system for processing and displaying video and graphics.
BACKGROUND OF THE INVENTION
Video and graphics systems are typically used in television control electronics, such as set top boxes, integrated digital TVs, and home network computers. Video and graphics systems typically include a display engine that may perform display functions. The display engine is the part of the video and graphics system that receives display pixel data from any combination of locally attached video and graphics input ports, processes the data in some way, and produces final display pixels as output.
This application includes references to both graphics and video, which reflects in certain ways the structure of the hardware itself. This split does not, however, imply the existence of any fundamental difference between graphics and video, and in fact much of the functionality is common to both. Graphics as used herein may include graphics, text and video.
SUMMARY OF THE INVENTION
In one embodiment of the present invention, a display engine for processing graphics includes an input for receiving data representing graphics from a memory, and one or more processing elements. The graphics may include multiple graphics layers or windows. The processing elements process two or more graphics layers in parallel to generate blended graphics. The input for receiving data representing graphics and the processing elements may be integrated on an integrated circuit chip. The display engine may also include a graphics controller for receiving two or more graphics layers in parallel, for arranging the graphics layers in an order suitable for parallel processing, and for providing the arranged graphics layers to the one or more processing elements.
In another embodiment of the present invention, a method is provided for blending multiple graphics layers to generate blended graphics. Data representing graphics defining the multiple graphics layers are received, and two or more graphics layers are blended in parallel. Prior to blending, two or more graphics layers may be arranged in an order suitable for parallel processing. The format of the graphics layers may be converted to a common format.
In yet another embodiment of the present invention, a video and graphics system is disclosed including a transport processor, a video decoder, a display engine and a video compositor. The transport processor may receive multiple compressed data streams including video data. The video decoder decodes the video data to generate decoded video data. The display engine receives multiple graphics layers and blends them in parallel to generate blended graphics. The video compositor blends the decoded video data with the blended graphics. The transport processor, the video decoder, the display engine and the video compositor may be integrated on an integrated circuit chip. The compressed data streams may include one or more MPEG Transport streams, which may include SDTV video data and/or HDTV video data.
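The parallel blending summarized above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each graphics layer has already been converted to a common format of premultiplied (r, g, b, a) pixels and that the blenders apply the Porter-Duff "over" operator; neither assumption is taken from this document. The names `over`, `blend_four` and `blend_sequential` are hypothetical. Because "over" on premultiplied pixels is associative, two blenders may each combine a pair of layers independently and a third blender may combine their outputs, giving the same result as blending the layers one after another.

```python
from typing import Sequence, Tuple

# Assumption: common pixel format is premultiplied (r, g, b, a), each in [0, 1].
Pixel = Tuple[float, float, float, float]

TRANSPARENT: Pixel = (0.0, 0.0, 0.0, 0.0)


def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Porter-Duff 'over' for premultiplied pixels: top + (1 - top_alpha) * bottom."""
    k = 1.0 - top[3]
    return tuple(t + k * b for t, b in zip(top, bottom))


def blend_four(w0: Pixel, w1: Pixel, w2: Pixel, w3: Pixel) -> Pixel:
    """Blend four depth-ordered pixels (w0 frontmost) using three blenders."""
    front = over(w0, w1)      # first blender
    back = over(w2, w3)       # second blender, independent of the first
    return over(front, back)  # last blender combines the two partial results


def blend_sequential(layers: Sequence[Pixel]) -> Pixel:
    """Reference result: blend front to back, one layer at a time."""
    result = TRANSPARENT
    for px in layers:
        result = over(result, px)
    return result
```

In this sketch a fully transparent pixel (0, 0, 0, 0) leaves the other operand unchanged, so a pipeline whose window is out of range could supply it without affecting the net output.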
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of an integrated circuit graphics display system according to a presently preferred embodiment of the invention;
FIG. 2 is a block diagram of certain functional blocks of the system;
FIG. 3 is a block diagram of an alternate embodiment of the system of FIG. 2 that incorporates an on-chip I/O bus;
FIG. 4 is a functional block diagram of exemplary video and graphics display pipelines;
FIG. 5 is a more detailed block diagram of the graphics and video pipelines of the system;
FIG. 6 is a map of an exemplary window descriptor for describing graphics windows and solid surfaces;
FIG. 7 is a flow diagram of an exemplary process for sorting window descriptors in a window controller;
FIG. 8 is a flow diagram of a graphics window control data passing mechanism and a color look-up table loading mechanism;
FIG. 9 is a state diagram of a state machine in a graphics converter that may be used during processing of header packets;
FIG. 10 is a block diagram of an embodiment of a display engine;
FIG. 11 is a block diagram of an embodiment of a color look-up table (CLUT);
FIG. 12 is a timing diagram of signals that may be used to load a CLUT;
FIG. 13 is a block diagram illustrating exemplary graphics line buffers;
FIG. 14 is a flow diagram of a system for controlling the graphics line buffers of FIG. 13;
FIG. 15 is a representation of left scrolling using a window soft horizontal scrolling mechanism;
FIG. 16 is a representation of right scrolling using a window soft horizontal scrolling mechanism;
FIG. 17 is a flow diagram illustrating a system that uses graphics elements or glyphs for anti-aliased text and graphics applications;
FIG. 18 is a block diagram of certain functional blocks of a video decoder for performing video synchronization;
FIG. 19 is a block diagram of an embodiment of a chroma-locked sample rate converter (SRC);
FIG. 20 is a block diagram of an alternate embodiment of the chroma-locked SRC of FIG. 19;
FIG. 21 is a block diagram of an exemplary line-locked SRC;
FIG. 22 is a block diagram of an exemplary time base corrector (TBC);
FIG. 23 is a flow diagram of a process that employs a TBC to synchronize an input video to a display clock;
FIG. 24 is a flow diagram of a process for video scaling in which downscaling is performed prior to capture of video in memory and upscaling is performed after reading video data out of memory;
FIG. 25 is a detailed block diagram of components used during video scaling with signal paths involved in downscaling;
FIG. 26 is a detailed block diagram of components used during video scaling with signal paths involved in upscaling;
FIG. 27 is a detailed block diagram of components that may be used during video scaling with signal paths indicated for both upscaling and downscaling;
FIG. 28 is a flow diagram of an exemplary process for blending graphics and video surfaces;
FIG. 29 is a flow diagram of an exemplary process for blending graphics windows into a combined blended graphics output;
FIG. 30 is a flow diagram of an exemplary process for blending graphics, video and background color;
FIG. 31 is a block diagram of a polyphase filter that performs both anti-flutter filtering and vertical scaling of graphics windows;
FIG. 32 is a functional block diagram of an exemplary memory service request and handling system with dual memory controllers;
FIG. 33 is a functional block diagram of an implementation of a real time scheduling system;
FIG. 34 is a timing diagram of an exemplary CPU servicing mechanism that has been implemented using real time scheduling;
FIG. 35 is a timing diagram that illustrates certain principles of critical instant analysis for an implementation of real time scheduling;
FIG. 36 is a flow diagram illustrating servicing of requests according to the priority of the task;
FIG. 37 is a block diagram of a graphics accelerator, which may be coupled to a CPU and a memory controller;
FIG. 38 is a block diagram of an integrated circuit chip, which embodies the system of the present invention, coupled to the CPU and other devices;
FIG. 39 is a block diagram of the integrated circuit chip in one embodiment of the present invention;
FIG. 40 is a block diagram of the integrated circuit chip in one embodiment of the present invention;
FIG. 41 is a block diagram that illustrates distribution of MPEG Transport streams in one embodiment of the present invention;
FIG. 42 is a block diagram of one embodiment of a data transport;
FIG. 43 is a block diagram of another embodiment of a data transport;
FIG. 44 is a block diagram of a video transport;
FIG. 45 is a block diagram of first and second decode row paths with which four macroblock rows may be decoded simultaneously;
FIG. 46 is a block diagram of a video RISC;
FIG. 47 is a context flow graph of the operation of one of the two row decode paths;
FIG. 48 is a block diagram which illustrates providing an SDTV video output while displaying an HDTV video;
FIG. 49 is a block diagram of MPEG video decoding stages in one embodiment;
FIG. 50 is a block diagram of MPEG video decoding stages in another embodiment;
FIG. 51 is a process diagram illustrating frame-prediction for I-pictures and P-pictures;
FIG. 52 is a process diagram illustrating field-prediction in a frame-picture;
FIG. 53 is a process diagram illustrating prediction of the first field-picture;
FIG. 54 is a process diagram illustrating prediction of the "bottom field" second field-picture;
FIG. 55 is a process diagram illustrating prediction of the "top field" second field-picture;
FIG. 56 is a process diagram illustrating prediction of B field pictures or B frame pictures;
FIG. 57 is a process diagram illustrating frame prediction for B-pictures;
FIG. 58 is a block diagram of image organization in SDRAM;
FIG. 59 is a block diagram of an audio decode processor (ADP);
FIG. 60 is a block diagram of a system bridge controller;
FIG. 61 is a process diagram that illustrates how graphics windows are blended together into blended graphics and composited with video;
FIG. 62 is a block diagram of an integrated circuit containing a display engine, the integrated circuit being coupled to external memory and a television;
FIG. 63 is a block diagram of a window control block;
FIG. 64 is a block diagram of window controller state machines;
FIG. 65 is a state diagram of a window descriptor state machine;
FIG. 66 is a state diagram of a window state machine;
FIG. 67 is a state diagram of a window state machine;
FIG. 68 is a priority diagram that illustrates window arbitration priorities;
FIG. 69 is a block diagram of a display engine in one embodiment of the present invention;
FIG. 70 is a process diagram that illustrates conversion stages of graphics data in a graphics converter;
FIG. 71 is a block diagram of a two-port SRAM;
FIG. 72 is a block diagram of a single-port SRAM that functions equivalently to a dual-port SRAM;
FIG. 73 is a block diagram of a graphics filter coupled to graphics line buffers; and
FIG. 74 is a block diagram of a filter core in the graphics filter.
DETAILED DESCRIPTION
I. Graphics Display System Architecture
Referring to FIG. 1, the graphics display system according to the present invention is preferably contained in an integrated circuit 10. The integrated circuit may include inputs 12 for receiving video signals 14, a bus 20 for connecting to a CPU 22, a bus 24 for transferring data to and from memory 28, and an output 30 for providing a video output signal 32. The system may further include an input 26 for receiving audio input 34 and an output 27 for providing audio output 36.
The graphics display system accepts video input signals that may include analog video signals, digital video signals, or both. The analog signals may be, for example, NTSC, PAL and SECAM signals or any other conventional type of analog signal. The digital signals may be in the form of decoded MPEG signals or another digital video format. In an alternate embodiment, the system includes an on-chip decoder for decoding the MPEG or other digital video signals input to the system. Graphics data for display is produced by any suitable graphics library software, such as DirectDraw marketed by Microsoft Corporation, and is read from the CPU 22 into the memory 28. The video output signals 32 may be analog signals, such as composite NTSC, PAL, Y/C (S-video), SECAM or other signals that include video and graphics information. In an alternate embodiment, the system provides serial digital video output to an on-chip or off-chip serializer that may encrypt the output.
The graphics display system memory 28 is preferably a unified synchronous dynamic random access memory (SDRAM) that is shared by the system, the CPU 22 and other peripheral components. In the preferred embodiment the CPU uses the unified memory for its code and data while the graphics display system performs all graphics, video and audio functions assigned to it by software. The amount of memory and CPU performance are preferably tunable by the system designer for the desired mix of performance and memory cost. In the preferred embodiment, a set-top box is implemented with SDRAM that supports both the CPU and graphics.
Referring to FIG. 2, the graphics display system preferably includes a video decoder 50, video scaler 52, memory controller 54, window controller 56, display engine 58, video compositor 60, and video encoder 62. The system may optionally include a graphics accelerator 64 and an audio engine 66. The system may display graphics, passthrough video, scaled video or a combination of the different types of video and graphics. Passthrough video includes digital or analog video that is not captured in memory. The passthrough video may be selected from the analog video or the digital video by a multiplexer. Bypass video, which may come into the chip on a separate input, includes analog video that is digitized off-chip into conventional YUV (luma chroma) format by any suitable decoder, such as the BT829 decoder, available from Brooktree Corporation, San Diego, Calif. The YUV format may also be referred to as YCrCb format, where Cb and Cr correspond to U and V, respectively.
The video decoder (VDEC) 50 preferably digitizes and processes analog input video to produce internal YUV component signals with separated luma and chroma components. In an alternate embodiment, the digitized signals may be processed in another format, such as RGB. The VDEC 50 preferably includes a sample rate converter 70 and a time base corrector 72 that together allow the system to receive non-standard video signals, such as signals from a VCR. The time base corrector 72 enables the video encoder to work in passthrough mode, and corrects digitized analog video in the time domain to reduce or prevent jitter.
The video scaler 52 may perform both downscaling and upscaling of digital video and analog video as needed. In the preferred embodiment, scale factors may be adjusted continuously from a scale factor of much less than one to a scale factor of four. With both analog and digital video input, either one may be scaled while the other is displayed full size at the same time as passthrough video. Any portion of the input may be the source for video scaling. To conserve memory and bandwidth, the video scaler preferably downscales before capturing video frames to memory, and upscales after reading from memory, but preferably does not perform both upscaling and downscaling at the same time.
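The placement of scaling relative to memory capture described above can be sketched as follows. This is an illustrative model only: the function name `plan_scaling`, the dictionary keys and the treatment of a scale factor of exactly one are assumptions, while the upper limit of four and the rule that downscaling precedes capture and upscaling follows readout come from the paragraph above.

```python
MAX_SCALE = 4.0  # upper scale factor stated in the text


def plan_scaling(scale_factor: float) -> dict:
    """Decide where a single scale factor is applied in the capture path.

    Downscaling happens before frames are written to memory and upscaling
    after they are read back, so the stored frame is never larger than
    necessary. The two are never applied together.
    """
    if not 0.0 < scale_factor <= MAX_SCALE:
        raise ValueError("scale factor must be in (0, 4]")
    if scale_factor < 1.0:
        # Shrink first: fewer pixels are written to and read from memory.
        return {"before_capture": scale_factor, "after_readout": 1.0}
    # Enlarge last: memory holds the smaller, unscaled frame.
    return {"before_capture": 1.0, "after_readout": scale_factor}
```

For example, `plan_scaling(0.5)` places the whole factor before capture, while `plan_scaling(2.0)` places it after readout.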
The memory controller 54 preferably reads and writes video and graphics data to and from memory by using burst accesses with burst lengths that may be assigned to each task. The memory is any suitable memory such as SDRAM. In the preferred embodiment, the memory controller includes two substantially similar SDRAM controllers, one primarily for the CPU and the other primarily for the graphics display system, while either controller may be used for any and all of these functions.
The graphics display system preferably processes graphics data using logical windows, also referred to as viewports, surfaces, sprites, or canvasses, that may overlap or cover one another with arbitrary spatial relationships. Each window is preferably independent of the others. The windows may consist of any combination of image content, including anti-aliased text and graphics, patterns, GIF images, JPEG images, live video from MPEG or analog video, three dimensional graphics, cursors or pointers, control panels, menus, tickers, or any other content, all or some of which may be animated.
Graphics windows are preferably characterized by window descriptors. Window descriptors are data structures that describe one or more parameters of the graphics window. Window descriptors may include, for example, image pixel format, pixel color type, alpha blend factor, location on the screen, address in memory, depth order on the screen, or other parameters. The system preferably supports a wide variety of pixel formats, including RGB 16, RGB 15, YUV 4:2:2 (ITU-R 601), CLUT2, CLUT4, CLUT8 or others.
In addition to each window having its own alpha blend factor, each pixel in the preferred embodiment has its own alpha value. In the preferred embodiment, window descriptors are not used for video windows. Instead, parameters for video windows, such as memory start address and window size are stored in registers associated with the video compositor.
In operation, the window controller 56 preferably manages both the video and graphics display pipelines. The window controller preferably accesses graphics window descriptors in memory through a direct memory access (DMA) engine 76. The window controller may sort the window descriptors according to the relative depth of their corresponding windows on the display. For graphics windows, the window controller preferably sends header information to the display engine at the beginning of each window on each scan line, and sends window header packets to the display engine as needed to display a window. For video, the window controller preferably coordinates capture of non-passthrough video into memory, and transfer of video between memory and the video compositor.
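The depth sort performed by the window controller can be sketched in a few lines of Python. This is an illustration only: the `Window` record, its `layer` field, and the convention that layer 0 is the back-most window are assumptions made for the example, not the stored descriptor layout.

```python
from dataclasses import dataclass


@dataclass
class Window:
    name: str
    layer: int  # assumed convention: 0 is back-most, larger values are closer to the viewer


def sort_back_to_front(windows):
    """Return the windows ordered from the back-most layer to the front-most."""
    return sorted(windows, key=lambda w: w.layer)


windows = [Window("menu", 5), Window("background art", 0), Window("cursor", 15)]
ordered = [w.name for w in sort_back_to_front(windows)]
print(ordered)  # ['background art', 'menu', 'cursor']
```

Sorting once per field or frame means the downstream blending stage can simply walk the list in order.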
The display engine 58 preferably takes graphics information from memory and processes it for display. The display engine preferably converts the various formats of graphics data in the graphics windows into YUV component format, and blends the graphics windows to create blended graphics output having a composite alpha value that is based on alpha values for individual graphics windows, alpha values per pixel, or both. In the preferred embodiment, the display engine transfers the processed graphics information to memory buffers that are configured as line buffers. In an alternate embodiment, the buffer may include a frame buffer. In another alternate embodiment, the output of the display engine is transferred directly to a display or output block without being transferred to memory buffers.
The video compositor 60 receives one or more types of data, such as blended graphics data, video window data, passthrough video data and background color data, and produces a blended video output. The video encoder 62 encodes the blended video output from the video compositor into any suitable display format such as composite NTSC, PAL, Y/C (S-video), SECAM or other signals that may include video information, graphics information, or a combination of video and graphics information. In an alternate embodiment, the video encoder converts the blended video output of the video compositor into serial digital video output using an on-chip or off-chip serializer that may encrypt the output.
The graphics accelerator 64 preferably performs graphics operations that may require intensive CPU processing, such as operations on three dimensional graphics images. The graphics accelerator may be programmable. The audio engine 66 preferably supports applications that create and play audio locally within a set-top box and allow mixing of the locally created audio with audio from a digital audio source, such as MPEG or Dolby, and with digitized analog audio. The audio engine also preferably supports applications that capture digitized baseband audio via an audio capture port and store sounds in memory for later use, or that store audio to memory for temporary buffering in order to delay the audio for precise lip-syncing when frame-based video time correction is enabled.
Referring to FIG. 3, in an alternate embodiment of the present invention, the graphics display system further includes an I/O bus 74 connected between the CPU 22, memory 28 and one or more of a wide variety of peripheral devices, such as flash memory, ROM, MPEG decoders, cable modems or other devices. The on-chip I/O bus 74 of the present invention preferably eliminates the need for a separate interface connection, sometimes referred to in the art as a north bridge. The I/O bus preferably provides high speed access and data transfers between the CPU, the memory and the peripheral devices, and may be used to support the full complement of devices that may be used in a full featured set-top box or digital TV. In the preferred embodiment, the I/O bus is compatible with the 68000 bus definition, including both active DSACK and passive DSACK (e.g., ROM/flash devices), and it supports external bus masters and retry operations as both master and slave. The bus preferably supports any mix of 32-bit, 16-bit and 8-bit devices, and operates at a clock rate of 33 MHz. The clock rate is preferably asynchronous with (not synchronized with) the CPU clock to enable independent optimization of those subsystems.
Referring to FIG. 4, the graphics display system generally includes a graphics display pipeline 80 and a video display pipeline 82. The graphics display pipeline preferably contains functional blocks, including window control block 84, DMA (direct memory access) block 86, FIFO (first-in-first-out memory) block 88, graphics converter block 90, color look up table (CLUT) block 92, graphics blending block 94, static random access memory (SRAM) block 96, and filtering block 98. The system preferably spatially processes the graphics data independently of the video data prior to blending.
In operation, the window control block 84 obtains and stores graphics window descriptors from memory and uses the window descriptors to control the operation of the other blocks in the graphics display pipeline. The windows may be processed in any order. In the preferred embodiment, on each scan line, the system processes windows one at a time from back to front and from the left edge to the right edge of the window before proceeding to the next window. In an alternate embodiment, two or more graphics windows may be processed in parallel. In the parallel implementation, it is possible for all of the windows to be processed at once, with the entire scan line being processed left to right. Any number of other combinations may also be implemented, such as processing a set of windows at a lower level in parallel, left to right, followed by the processing of another set of windows in parallel at a higher level.
The DMA block 86 retrieves data from memory 110 as needed to construct the various graphics windows according to addressing information provided by the window control block. Once the display of a window begins, the DMA block preferably retains any parameters that may be needed to continue to read required data from memory. Such parameters may include, for example, the current read address, the address of the start of the next lines, the number of bytes to read per line, and the pitch. Since the pipeline preferably includes a vertical filter block for anti-flutter and scaling purposes, the DMA block preferably accesses a set of adjacent display lines in the same frame, in both fields. If the output of the system is NTSC or other form of interlaced video, the DMA preferably accesses both fields of the interlaced final display under certain conditions, such as when the vertical filter and scaling are enabled. In such a case, all lines, not just those from the current display field, are preferably read from memory and processed during every display field. In this embodiment, the effective rate of reading and processing graphics is equivalent to that of a non-interlaced display with a frame rate equal to the field rate of the interlaced display.
The FIFO block 88 temporarily stores data read from the memory 110 by the DMA block 86, and provides the data on demand to the graphics converter block 90. The FIFO may also serve to bridge a boundary between different clock domains in the event that the memory and DMA operate under a clock frequency or phase that differs from the graphics converter block 90 and the graphics blending block 94. In an alternate embodiment, the FIFO block is not needed. The FIFO block may be unnecessary, for example, if the graphics converter block processes data from memory at the rate that it is read from the memory and the memory and conversion functions are in the same clock domain.
In the preferred embodiment, the graphics converter block 90 takes raw graphics data from the FIFO block and converts it to YUValpha (YUVa) format. Raw graphics data may include graphics data from memory that has not yet been processed by the display engine. One type of YUVa format that the system may use includes YUV 4:2:2 (i.e. two U and V samples for every four Y samples) plus an 8-bit alpha value for every pixel, which occupies overall 24 bits per pixel. Another suitable type of YUVa format includes YUV 4:4:4 plus the 8-bit alpha value per pixel, which occupies 32 bits per pixel. In an alternate embodiment, the graphics converter may convert the raw graphics data into a different format, such as RGBalpha.
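The per-pixel storage figures quoted for the two YUVa variants follow from simple arithmetic, sketched below. The helper name `bits_per_pixel` and the assumption of 8-bit Y, U and V samples are illustrative; the text states only the 8-bit alpha and the resulting totals.

```python
def bits_per_pixel(y, u, v, sample_bits=8, alpha_bits=8):
    """Average storage per pixel for a Y:U:V sampling pattern counted over four pixels."""
    color_bits = (y + u + v) * sample_bits  # bits for one group of four pixels
    return color_bits // 4 + alpha_bits     # per-pixel color bits plus per-pixel alpha


yuva_422 = bits_per_pixel(4, 2, 2)
yuva_444 = bits_per_pixel(4, 4, 4)
print(yuva_422, yuva_444)  # 24 32
```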
The alpha value included in the YUVa output may depend on a number of factors, including alpha from chroma keying in which a transparent pixel has an alpha equal to zero, alpha per CLUT entry, alpha from Y (luma), or alpha per window where one alpha value characterizes all of the contents of a given window.
The graphics converter block 90 preferably accesses the CLUT 92 during conversion of CLUT formatted raw graphics data. In one embodiment of the present invention, there is only one CLUT. In an alternate embodiment, multiple CLUTs are used to process different graphics windows having graphics data with different CLUT formats. The CLUT may be rewritten by retrieving new CLUT data via the DMA block when required. In practice, it typically takes longer to rewrite the CLUT than the time available in a horizontal blanking interval, so the system preferably allows one horizontal line period to change the CLUT. Non-CLUT images may be displayed while the CLUT is being changed. The color space of the entries in the CLUT is preferably in YUV but may also be implemented in RGB.
The graphics blending block 94 receives output from the graphics converter block 90 and preferably blends one window at a time along the entire width of one scan line, with the back-most graphics window being processed first. The blending block uses the output from the converter block to modify the contents of the SRAM 96. The result of each pixel blend operation is a pixel in the SRAM that consists of the weighted sum of the various graphics layers up to and including the present one, and the appropriate alpha blend value for the video layers, taking into account the graphics layers up to and including the present one.
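One common way to realize this kind of back-to-front accumulation is sketched below in Python. It is an assumption-laden illustration rather than the circuit described here: the function name `blend_scan_line`, the single color component per pixel, and alpha expressed as a float between 0 and 1 are all choices made for the example. For each pixel the line buffer holds the weighted sum of the graphics layers seen so far, together with the weight that remains for the video layers behind them.

```python
def blend_scan_line(windows, width):
    """Blend windows (ordered back-most first) into one line buffer.

    Each window is (start_x, pixels), where pixels is a list of (value, alpha)
    and alpha is a float in [0, 1]. Returns (graphics, video_weight): the
    weighted sum of graphics layers and the weight left for the video layers.
    """
    graphics = [0.0] * width
    video_weight = [1.0] * width  # with no graphics, video shows through fully
    for start_x, pixels in windows:
        for offset, (value, alpha) in enumerate(pixels):
            x = start_x + offset
            if 0 <= x < width:
                graphics[x] = alpha * value + (1.0 - alpha) * graphics[x]
                video_weight[x] = (1.0 - alpha) * video_weight[x]
    return graphics, video_weight


back = (0, [(100, 1.0), (100, 1.0)])   # opaque window covering pixels 0-1
front = (1, [(200, 0.5), (200, 0.5)])  # half-transparent window covering pixels 1-2
graphics, video_weight = blend_scan_line([back, front], width=4)
print(graphics)      # [100.0, 150.0, 100.0, 0.0]
print(video_weight)  # [0.0, 0.0, 0.5, 1.0]
```

Under this formulation a later compositing stage could form the final pixel as `graphics[x] + video_weight[x] * video[x]`, which is why both values are kept per pixel.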
The SRAM 96 is preferably configured as a set of graphics line buffers, where each line buffer corresponds to a single display line. The blending of graphics windows is preferably performed one graphics window at a time on the display line that is currently being composited into a line buffer. Once the display line in a line buffer has been completely composited so that all the graphics windows on that display line have been blended, the line buffer is made available to the filtering block 98.
The filtering block 98 preferably performs both anti-flutter filtering (AFF) and vertical sample rate conversion (SRC) using the same filter. This block takes input from the line buffers and performs finite impulse response polyphase filtering on the data. While anti-flutter filtering and vertical axis SRC are done in the vertical axis, there may be different functions, such as horizontal SRC or scaling that are performed in the horizontal axis. In the preferred embodiment, the filter takes input from only vertically adjacent pixels at one time. It multiplies each input pixel times a specified coefficient, and sums the result to produce the output. The polyphase action means that the coefficients, which are samples of an approximately continuous impulse response, may be selected from a different fractional-pixel phase of the impulse response every pixel. In an alternate embodiment, where the filter performs horizontal scaling, appropriate coefficients are selected for a finite impulse response polyphase filter to perform the horizontal scaling. In an alternate embodiment, both horizontal and vertical filtering and scaling can be performed.
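The multiply-and-sum step of a vertical polyphase filter can be sketched as follows. The four-tap width, the `PHASES` coefficient table (plain linear interpolation between the two middle lines), and selecting the phase once per output line are hypothetical choices for the example; actual coefficients would be programmed to suit the anti-flutter and scaling requirements.

```python
# Hypothetical coefficient table: one row of four tap weights per fractional phase.
PHASES = [
    [0.0, 1.00, 0.00, 0.0],
    [0.0, 0.75, 0.25, 0.0],
    [0.0, 0.50, 0.50, 0.0],
    [0.0, 0.25, 0.75, 0.0],
]


def vertical_fir(lines, phase):
    """Filter vertically adjacent lines into one output line.

    lines holds one row per filter tap, all the same width. Each output pixel
    is the sum of the vertically aligned input pixels times the tap weights
    selected by phase.
    """
    coeffs = PHASES[phase]
    width = len(lines[0])
    return [sum(c * line[x] for c, line in zip(coeffs, lines)) for x in range(width)]


line_buffers = [[0, 0], [10, 20], [30, 40], [0, 0]]
on_line = vertical_fir(line_buffers, phase=0)
halfway = vertical_fir(line_buffers, phase=2)
print(on_line, halfway)  # [10.0, 20.0] [20.0, 30.0]
```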
The video display pipeline 82 may include a FIFO block 100, an SRAM block 102, and a video scaler 104. The video display pipeline portion of the architecture is similar to that of the graphics display pipeline, and it shares some elements with it. In the preferred embodiment, the video pipeline supports up to one scaled video window per scan line, one passthrough video window, and one background color, all of which are logically behind the set of graphics windows. The order of these windows, from back to front, is preferably fixed as background color, then passthrough video, then scaled video.
The video windows are preferably in YUV format, although they may be in either 4:2:2 or 4:2:0 variants or other variants of YUV, or alternatively in other formats such as RGB. The scaled video window may be scaled up in both directions by the display engine, with a factor that can range up to four in the preferred embodiment. Unlike graphics, the system generally does not have to correct for square pixel aspect ratio with video. The scaled video window may be alpha blended into passthrough video and a background color, preferably using a constant alpha value for each video signal.
The FIFO block 100 temporarily stores captured video windows for transfer to the video scaler 104. The video scaler preferably includes a filter that performs both upscaling and downscaling. The scaler function may be a set of two polyphase SRC functions, one for each dimension. The vertical SRC may be a four-tap filter with programmable coefficients in a fashion similar to the vertical filter in the graphics pipeline, and the horizontal filter may use an 8-tap SRC, also with programmable coefficients. In an alternate embodiment, a shorter horizontal filter is used, such as a 4-tap horizontal SRC for the video upscaler. Since the same filter is preferably used for downscaling, it may be desirable to use more taps than are strictly needed for upscaling to accommodate low pass filtering for higher quality downscaling.
In the preferred embodiment, the video pipeline uses a separate window controller and DMA. In an alternate embodiment, these elements may be shared. The FIFOs are logically separate but may be implemented in a common SRAM.
The video compositor block 108 blends the output of the graphics display pipeline, the video display pipeline, and passthrough video. The background color is preferably blended as the lowest layer on the display, followed by passthrough video, the video window and blended graphics. In the preferred embodiment, the video compositor composites windows directly to the screen line-by-line at the time the screen is displayed, thereby conserving memory and bandwidth. The video compositor may include, but preferably does not include, display frame buffers, double-buffered displays, off-screen bit maps, or blitters.
Referring to FIG. 5, the display engine 58 preferably includes graphics FIFO 132, graphics converter 134, RGB-to-YUV converter 136, YUV444-to-YUV422 converter 138 and graphics blender 140. The graphics FIFO 132 receives raw graphics data from memory through a graphics DMA 124 and passes it to the graphics converter 134, which preferably converts the raw graphics data into YUV 4:4:4 format or other suitable format. A window controller 122 controls the transfer of raw graphics data from memory to the graphics converter 134. The graphics converter preferably accesses the RGB-to-YUV converter 136 during conversion of RGB formatted data and the graphics CLUT 146 during conversion of CLUT formatted data. The RGB-to-YUV converter is preferably a color space converter that converts raw graphics data in RGB space to graphics data in YUV space. The graphics CLUT 146 preferably includes a CLUT 150, which stores pixel values for CLUT-formatted graphics data, and a CLUT controller 152, which controls operation of the CLUT.
The YUV444-to-YUV422 converter 138 converts graphics data from YUV 4:4:4 format to YUV 4:2:2 format. The term YUV 4:4:4 means, as is conventional, that for every four horizontally adjacent samples, there are four Y values, four U values, and four V values; the term YUV 4:2:2 means, as is conventional, that for every four samples, there are four Y values, two U values and two V values. The YUV444-to-YUV422 converter 138 is preferably a UV decimator that sub-samples U and V from four samples per every four samples of Y to two samples per every four samples of Y.
Graphics data in YUV 4:4:4 format and YUV 4:2:2 format preferably also includes four alpha values for every four samples. Graphics data in YUV 4:4:4 format with four alpha values for every four samples may be referred to as being in aYUV 4:4:4:4 format; graphics data in YUV 4:2:2 format with four alpha values for every four samples may be referred to as being in aYUV 4:4:2:2 format.
The YUV444-to-YUV422 converter may also perform low-pass filtering of UV and alpha. For example, if the graphics data with YUV 4:4:4 format has higher than desired frequency content, a low pass filter in the YUV444-to-YUV422 converter may be turned on to filter out high frequency components in the U and V signals, and to perform matched filtering of the alpha values.
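A UV decimator with an optional low-pass step can be sketched as below. The function name `yuv444_to_yuv422`, the tuple layout, and the two-sample average used as the low-pass filter are assumptions for illustration; the actual filter response is not specified here, and the matched filtering of alpha is omitted for brevity.

```python
def yuv444_to_yuv422(pixels, low_pass=False):
    """Sub-sample U and V by two horizontally; Y is kept for every pixel.

    pixels is an even-length list of (y, u, v) tuples. With low_pass=True each
    retained chroma sample is the average of a horizontal pair (an assumed,
    very simple filter); otherwise the second sample of each pair is dropped.
    """
    ys = [y for y, _, _ in pixels]
    us, vs = [], []
    for i in range(0, len(pixels), 2):
        (_, u0, v0), (_, u1, v1) = pixels[i], pixels[i + 1]
        if low_pass:
            us.append((u0 + u1) / 2)
            vs.append((v0 + v1) / 2)
        else:
            us.append(u0)
            vs.append(v0)
    return ys, us, vs


line = [(10, 100, 200), (20, 110, 210), (30, 120, 220), (40, 130, 230)]
plain = yuv444_to_yuv422(line)
filtered = yuv444_to_yuv422(line, low_pass=True)
print(plain)     # ([10, 20, 30, 40], [100, 120], [200, 220])
print(filtered)  # ([10, 20, 30, 40], [105.0, 125.0], [205.0, 225.0])
```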
The graphics blender 140 blends the YUV 4:2:2 signals together, preferably one line at a time using alpha blending, to create a single line of graphics from all of the graphics windows on the current display line. The filter 170 preferably includes a single 4-tap vertical polyphase graphics filter 172, and a vertical coefficient memory 174. The graphics filter may perform both anti-flutter filtering and vertical scaling. The filter preferably receives graphics data from the display engine through a set of seven line buffers 59, where four of the seven line buffers preferably provide data to the taps of the graphics filter at any given time.
In the preferred embodiment, the system may receive video input that includes one decoded MPEG video in ITU-R 656 format and one analog video signal. The ITU-R 656 decoder 160 processes the decoded MPEG video to extract timing and data information. In one embodiment, an on-chip video decoder (VDEC) 50 converts the analog video signal to a digitized video signal. In an alternate embodiment, an external VDEC such as the Brooktree BT829 decoder converts the analog video into digitized analog video and provides the digitized video to the system as bypass video 130.
Analog video or MPEG video may be provided to the video compositor as passthrough video. Alternatively, either type of video may be captured into memory and provided to the video compositor as a scaled video window. The digitized analog video signals preferably have a pixel sample rate of 13.5 MHz, contain a 16 bit data stream in YUV 4:2:2 format, and include timing signals such as top field and vertical sync signals.
The VDEC 50 includes a time base corrector (TBC) 72 comprising a TBC controller 164 and a FIFO 166. To provide passthrough video that is synchronized to a display clock preferably without using a frame buffer, the digitized analog video is corrected in the time domain in the TBC 72 before being blended with other graphics and video sources. During time base correction, the video input which runs nominally at 13.5 MHz is synchronized with the display clock which runs nominally at 13.5 MHz at the output; these two frequencies that are both nominally 13.5 MHz are not necessarily exactly the same frequency. In the TBC, the video output is preferably offset from the video input by a half scan line per field.
A capture FIFO 158 and a capture DMA 154 preferably capture the digitized analog video signals and MPEG video. The SDRAM controller 126 provides captured video frames to the external SDRAM. A video DMA 144 transfers the captured video frames to a video FIFO 148 from the external SDRAM.
The digitized analog video signals and MPEG video are preferably scaled down to less than 100% prior to being captured and are scaled up to more than 100% after being captured. The video scaler 52 is shared by both upscale and downscale operations. The video scaler preferably includes a multiplexer 176, a set of line buffers 178, a horizontal and vertical coefficient memory 180 and a scaler engine 182. The scaler engine 182 preferably includes a set of two polyphase filters, one for each of horizontal and vertical dimensions.
The vertical filter preferably includes a four-tap filter with programmable filter coefficients. The horizontal filter preferably includes an eight-tap filter with programmable filter coefficients. In the preferred embodiment, three line buffers 178 supply video signals to the scaler engine 182. The three line buffers 178 preferably are 720×16 two-port SRAM. For vertical filtering, the three line buffers 178 may provide video signals to three of the four taps of the four-tap vertical filter while the video input provides the video signal directly to the fourth tap. For horizontal filtering, a shift register having eight cells in series may be used to provide inputs to the eight taps of the horizontal polyphase filter, each cell providing an input to one of the eight taps.
For downscaling, the multiplexer 168 preferably provides a video signal to the video scaler prior to capture. For upscaling, the video FIFO 148 provides a video signal to the video scaler after capture. Since the video scaler 52 is shared between downscaling and upscaling filtering, downscaling and upscaling operations are not performed at the same time in this particular embodiment.
In the preferred embodiment, the video compositor 60 blends signals from up to four different sources, which may include blended graphics from the filter 170, video from a video FIFO 148, passthrough video from a multiplexer 168, and background color from a background color module 184. Alternatively, various numbers of signals may be composited, including, for example, two or more video windows. The video compositor preferably provides final output signal to the data size converter 190, which serializes the 16-bit word sample into an 8-bit word sample at twice the clock frequency, and provides the 8-bit word sample to the video encoder 62.
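The data size conversion at the end of this path amounts to splitting each 16-bit word into two 8-bit words that are emitted at twice the rate. A minimal sketch follows; the function name and the choice to emit the high byte first are assumptions, since the byte order is not stated here.

```python
def serialize_16_to_8(words):
    """Split each 16-bit word into two 8-bit words (assumed order: high byte first)."""
    out = []
    for word in words:
        out.append((word >> 8) & 0xFF)
        out.append(word & 0xFF)
    return out


stream = serialize_16_to_8([0x8010, 0x80EB])
print([hex(b) for b in stream])  # ['0x80', '0x10', '0x80', '0xeb']
```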
The video encoder 62 encodes the provided YUV 4:2:2 video data and outputs it as an output of the graphics display system in any desired analog or digital format.
II. Window Descriptor and Solid Surface Description
Often in the creation of graphics displays, the artist or application developer has a need to include rectangular objects on the screen, with the objects having a solid color and a uniform alpha blend factor (alpha value). These regions (or objects) may be rendered with other displayed objects on top of them or beneath them. In conventional graphics devices, such solid color objects are rendered using the number of distinct pixels required to fill the region. It may be advantageous in terms of memory size and memory bandwidth to render such objects on the display directly, without expending the memory size or bandwidth required in conventional approaches.
In the preferred embodiment, video and graphics are displayed on regions referred to as windows. Each window is preferably a rectangular area of screen bounded by starting and ending display lines and starting and ending pixels on each display line. Raw graphics data to be processed and displayed on a screen preferably resides in the external memory. In the preferred embodiment, a display engine converts raw graphics data into a pixel map with a format that is suitable for display.
In one embodiment of the present invention, the display engine implements graphics windows of many types directly in hardware. Each of the graphics windows on the screen has its own value of various parameters, such as location on the screen, starting address in memory, depth order on the screen, pixel color type, etc. The graphics windows may be displayed such that they may overlap or cover each other, with arbitrary spatial relationships.
In the preferred embodiment, a data structure called a window descriptor contains parameters that describe and control each graphics window. The window descriptors are preferably data structures for representing graphics images arranged in logical surfaces, or windows, for display. Each data structure preferably includes a field indicating the relative depth of the logical surface on the display, a field indicating the alpha value for the graphics in the surface, a field indicating the location of the logical surface on the display, and a field indicating the location in memory where graphics image data for the logical surface is stored.
All of the elements that make up any given graphics display screen are preferably specified by combining all of the window descriptors of the graphics windows that make up the screen into a window descriptor list. At every display field time or a frame time, the display engine constructs the display image from the current window descriptor list. The display engine composites all of the graphics windows in the current window descriptor list into a complete screen image in accordance with the parameters in the window descriptors and the raw graphics data associated with the graphics windows.
With the introduction of window descriptors and real-time composition of graphics windows, a graphics window with a solid color and fixed translucency may be described entirely in a window descriptor having appropriate parameters. These parameters describe the color and the translucency (alpha) just as if it were a normal graphics window. The only difference is that there is no pixel map associated with this window descriptor. The display engine generates a pixel map accordingly and performs the blending in real time when the graphics window is to be displayed.
For example, a window consisting of a rectangular object having a constant color and a constant alpha value may be created on a screen by including a window descriptor in the window descriptor list. In this case, the window descriptor indicates the color and the alpha value of the window, and a null pixel format, i.e., no pixel values are to be read from memory. Other parameters indicate the window size and location on the screen, allowing the creation of solid color windows with any size and location. Thus, in the preferred embodiment, no pixel map is required, memory bandwidth requirements are reduced and a window of any size may be displayed.
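The idea of a window that needs no pixel map can be illustrated with a short sketch in which one scan line of a solid window is generated purely from descriptor-style parameters. The `SolidWindow` record, its field names, and the use of exclusive end coordinates are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class SolidWindow:
    x_start: int  # first pixel on each line (inclusive)
    x_end: int    # end pixel on each line (exclusive)
    y_start: int  # first display line (inclusive)
    y_end: int    # end display line (exclusive)
    color: int    # single color applied to the whole window
    alpha: int    # single alpha applied to the whole window


def solid_line(win, y, width):
    """Return one scan line as (color, alpha) entries, or None where the window is absent.

    No memory is read: every covered pixel comes from the window parameters.
    """
    line = [None] * width
    if win.y_start <= y < win.y_end:
        for x in range(max(win.x_start, 0), min(win.x_end, width)):
            line[x] = (win.color, win.alpha)
    return line


banner = SolidWindow(x_start=2, x_end=5, y_start=10, y_end=12, color=0x001F, alpha=128)
inside = solid_line(banner, y=10, width=8)
outside = solid_line(banner, y=12, width=8)
print(inside)
print(outside)
```

Because the window size appears only in the loop bounds, a full-screen solid window costs no more memory than a small one.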
Another type of graphics window that the window descriptors preferably describe is an alpha-only type window. The alpha-only type windows preferably use a constant color and preferably have graphics data with 2, 4 or 8 bits per pixel. For example, an alpha-4 format may be an alpha-only format used in one of the alpha-only type windows. The alpha-4 format specifies the alpha-only type window with alpha blend values having four bits per pixel. The alpha-only type window may be particularly useful for displaying anti-aliased text.
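Unpacking an alpha-4 line can be sketched as follows. The packing of two pixels per byte with the high nibble first, and the expansion of 4-bit alpha to an 8-bit range by multiplying by 17, are assumptions for the example rather than details given in the text.

```python
def unpack_alpha4(data):
    """Expand packed 4-bit alpha values into 8-bit alpha values.

    Assumes two pixels per byte, high nibble first. Multiplying by 17 maps
    0..15 onto 0..255 so that 0 stays transparent and 15 becomes fully opaque.
    """
    alphas = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            alphas.append(nibble * 17)
    return alphas


alphas = unpack_alpha4(bytes([0xF0, 0x3A]))
print(alphas)  # [255, 0, 51, 170]
```

Combined with the single window color, these per-pixel alpha values are enough to draw the soft edges of anti-aliased glyphs.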
A window controller preferably controls transfer of graphics display information in the window descriptors to the display engine. In one embodiment, the window controller has internal memory to store eight window descriptors. In other embodiments, the window controller may have memory allocated to store more or fewer window descriptors. The window controller preferably reads the window descriptors from external memory via a direct memory access (DMA) module.
The DMA module may be shared by both paths of the display pipeline as well as some of the control logic, such as the window controller and the CLUT. In order to support the display pipeline, the DMA module preferably has three channels where the graphics pipeline and the video pipeline use separate DMA modules. These may include window descriptor read, graphics data read and CLUT read. Each channel has externally accessible registers to control the start address and the number of words to read.
Once the DMA module has completed a transfer as indicated by its start and length registers, it preferably activates a signal that indicates the transfer is complete. This allows the module that sets up operations for that channel to begin setting up another transfer. In the case of graphics data reads, the window controller preferably sets up a transfer of one line of graphics pixels and then waits for the DMA module to indicate that the transfer of that line is complete before setting up the transfer of the next line, or of a line of another window.
Referring to FIG. 6, each window descriptor preferably includes four 32-bit words (labeled Word 0 through Word 3) containing graphics window display information. Word 0 preferably includes a window operation parameter, a window format parameter and a window memory start address. The window operation parameter preferably is a 2-bit field that indicates which operation is to be performed with the window descriptor. When the window operation parameter is 00b, the window descriptor performs a normal display operation and when it is 01b, the window descriptor performs graphics color look-up table ("CLUT") re-loading. The window operation parameter of 10b is preferably not used. The window operation parameter of 11b preferably indicates that the window descriptor is the last of a sequence of window descriptors in memory.
The window format parameter preferably is a 4-bit field that indicates a data format of the graphics data to be displayed in the graphics window. The data formats corresponding to the window format parameter are described in Table 1 below.
TABLE 1 Graphics Data Formats

win_format   Data Format   Data Format Description
0000b        RGB16         5-BIT RED, 6-BIT GREEN, 5-BIT BLUE
0001b        RGB15+1       RGB15 plus one bit alpha (keying)
0010b        RGBA4444      4-BIT RED, GREEN, BLUE, ALPHA
0100b        CLUT2         2-bit CLUT with YUV and alpha in table
0101b        CLUT4         4-bit CLUT with YUV and alpha in table
0110b        CLUT8         8-bit CLUT with YUV and alpha in table
0111b        ACLUT16       8-BIT ALPHA, 8-BIT CLUT INDEX
1000b        ALPHA0        Single win_alpha and single RGB win_color
1001b        ALPHA2        2-bit alpha with single RGB win_color
1010b        ALPHA4        4-bit alpha with single RGB win_color
1011b        ALPHA8        8-bit alpha with single RGB win_color
1100b        YUV422        U and V are sampled at half the rate of Y
1111b        RESERVED      Special coding for blank line in new header, i.e., indicates an empty line
The window memory start address preferably is a 26-bit data field that indicates a starting memory address of the graphics data of the graphics window to be displayed on the screen. The window memory start address points to the first address in the corresponding external SDRAM which is accessed to display data on the graphics window defined by the window descriptor. When the window operation parameter indicates the graphics CLUT reloading operation, the window memory start address indicates a starting memory address of data to be loaded into the graphics CLUT.
Word 1 in the window descriptor preferably includes a window layer parameter, a window memory pitch value and a window color value. The window layer parameter is preferably a 4-bit data indicating the order of layers of graphics windows. Some of the graphics windows may be partially or completely stacked on top of each other, and the window layer parameter indicates the stacking order. The window layer parameter preferably indicates where in the stack the graphics window defined by the window descriptor should be placed.
In the preferred embodiment, a graphics window with a window layer parameter of 0000b is defined as the bottom most layer, and a graphics window with a window layer parameter of 1111b is defined as the top most layer. Preferably, up to eight graphics windows may be processed in each scan line. The window memory pitch value is preferably a 12-bit data field indicating the pitch of window memory addressing. Pitch refers to the difference in memory address between two pixels that are vertically adjacent within a window.
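As a minimal illustration of the pitch definition above, the following sketch computes the memory address of a pixel from the window memory start address and the window memory pitch value. The function name, the byte-addressed memory and the bytes_per_pixel argument are assumptions made for this example; the text does not state the units of the pitch field.

```python
def pixel_address(start_address, pitch, x, y, bytes_per_pixel):
    """Address of pixel (x, y) inside a window.

    pitch is the address difference between two vertically adjacent
    pixels, as defined in the text. Byte addressing and the
    bytes_per_pixel parameter are assumptions of this sketch.
    """
    return start_address + y * pitch + x * bytes_per_pixel


# Moving down one line at the same x advances the address by one pitch.
upper = pixel_address(0x1000, 640, 3, 4, 2)
lower = pixel_address(0x1000, 640, 3, 5, 2)
assert lower - upper == 640
```

Under these assumptions the window controller would only need the start address and the pitch to find the first word of any line of the window.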
The window color value preferably is a 16-bit RGB color, which is applied as a single color to the entire graphics window when the window format parameter is 1000b, 1001b, 1010b, or 1011b. Every pixel in the window preferably has the color specified by the window color value, while the alpha value is determined per pixel and per window as specified in the window descriptor and the pixel format. The engine preferably uses the window color value to implement a solid surface.
Word 2 in the window descriptor preferably includes an alpha type, a window alpha value, a window y-end value and a window y-start value. The word 2 preferably also includes two bits reserved for future definition, such as high definition television (HD) applications. The alpha type is preferably a 2-bit data field that indicates the method of selecting an alpha value for the graphics window. The alpha type of 00b indicates that the alpha value is to be selected from chroma keying. Chroma keying determines whether each pixel is opaque or transparent based on the color of the pixel. Opaque pixels are preferably considered to have an alpha value of 1.0, and transparent pixels have an alpha value of 0, both on a scale of 0 to 1. Chroma keying compares the color of each pixel to a reference color or to a range of possible colors; if the pixel matches the reference color, or if its color falls within the specified range of colors, then the pixel is determined to be transparent. Otherwise it is determined to be opaque.
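The chroma keying rule described above can be sketched as follows. This is only an illustration: representing a pixel as a tuple of color components and the key as per-component inclusive bounds (key_low, key_high) are assumptions of the example, not details taken from the text. A single reference color is expressed as a range whose bounds are equal.

```python
def chroma_key_alpha(pixel, key_low, key_high):
    """Return 0.0 (transparent) if every component of pixel lies within
    the inclusive range [key_low, key_high], otherwise 1.0 (opaque)."""
    in_range = all(lo <= c <= hi
                   for c, lo, hi in zip(pixel, key_low, key_high))
    return 0.0 if in_range else 1.0


# A pixel matching a single reference color is transparent.
assert chroma_key_alpha((0, 255, 0), (0, 255, 0), (0, 255, 0)) == 0.0
# A pixel outside the key range is opaque.
assert chroma_key_alpha((200, 10, 10), (0, 200, 0), (50, 255, 50)) == 1.0
```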
The alpha type of 01b indicates that the alpha value should be derived from the graphics CLUT, using the alpha value in each entry of the CLUT. The alpha type of 10b indicates that the alpha value is to be derived from the luminance Y. The Y value that results from conversion of the pixel color to the YUV color space, if the pixel color is not already in the YUV color space, is used as the alpha value for the pixel. The alpha type of 11b indicates that only a single alpha value is to be applied to the entire graphics window. The single alpha value is preferably the window alpha value described next.
The window alpha value preferably is an 8-bit alpha value applied to the entire graphics window. The effective alpha value for each pixel in the window is the product of the window alpha and the alpha value determined for each pixel. For example, if the window alpha value is 0.5 on a scale of 0 to 1, coded as 0x80, then the effective alpha value of every pixel in the window is one-half of the value encoded in or for the pixel itself. If the window format parameter is 1000b, i.e., a single alpha value is to be applied to the graphics window, then the per-pixel alpha value is treated as if it is 1.0, and the effective alpha value is equal to the window alpha value.
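The product rule above can be written out as a short sketch. The normalization constant is an assumption: it is chosen as 256 so that the code 0x80 maps to exactly 0.5 as in the example in the text, which does not state how a window alpha of 1.0 would be coded. The function and parameter names are likewise hypothetical.

```python
ALPHA_SCALE = 256  # assumption: makes 0x80 correspond to 0.5


def effective_alpha(window_alpha_code, pixel_alpha, single_alpha_format=False):
    """Effective alpha = window alpha * per-pixel alpha (both on 0..1).

    When the window format carries a single alpha for the whole window
    (format 1000b in the text), the per-pixel alpha is treated as 1.0.
    """
    window_alpha = window_alpha_code / ALPHA_SCALE
    if single_alpha_format:
        pixel_alpha = 1.0
    return window_alpha * pixel_alpha


# Window alpha 0x80 halves the alpha of every pixel.
assert effective_alpha(0x80, 0.5) == 0.25
# With a single-alpha format the result equals the window alpha.
assert effective_alpha(0x80, 0.3, single_alpha_format=True) == 0.5
```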
The window y-end value preferably is a 10-bit data field that indicates the ending display line of the graphics window on the screen. The graphics window defined by the window descriptor ends at the display line indicated by the window y-end value. The window y-start value preferably is a 10-bit data field that indicates a starting display line of the graphics window on a screen. The graphics window defined by the window descriptor begins at the display line indicated in the window y-start value. Thus, a display of a graphics window can start on any display line on the screen based on the window y-start value.
Word 3 in the window descriptor preferably includes a window filter enable parameter, a blank start pixel value, a window x-size value and a window x-start value. In addition, the word 3 includes two bits reserved for future definition, such as HD applications. Five bits of the 32-bit word 3 are not used. The window filter enable parameter is a 1-bit field that indicates whether low pass filtering is to be enabled during YUV 4:4:4 to YUV 4:2:2 conversion.
The blank start pixel value preferably is a 4-bit parameter indicating a number of blank pixels at the beginning of each display line. The blank start pixel value preferably signifies the number of pixels of the first word read from memory, at the beginning of the corresponding graphics window, to be discarded. This field indicates the number of pixels in the first word of data read from memory that are not displayed. For example, if memory words are 32 bits wide and the pixels are 4 bits each, there are 8 possible first pixels in the first word. Using this field, 0 to 7 pixels may be skipped, making the 1.sup.st to the 8.sup.th pixel in the word appear as the first pixel, respectively. The blank start pixel value allows graphics windows to have any horizontal starting position on the screen, and may be used during soft horizontal scrolling of a graphics window.
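The 4-bit-pixel example above can be made concrete with a small sketch that splits the first memory word into pixels and discards the number given by the blank start pixel value. The function name and the assumption that the first pixel occupies the most significant bits of the word are illustrative only; the text does not specify the pixel order within a word.

```python
def visible_pixels(first_word, bits_per_pixel, blank_start, word_bits=32):
    """Split first_word into pixels and drop the first blank_start of them.

    Assumes (for illustration) that the first pixel sits in the most
    significant bits of the memory word.
    """
    count = word_bits // bits_per_pixel
    mask = (1 << bits_per_pixel) - 1
    pixels = [(first_word >> (word_bits - bits_per_pixel * (i + 1))) & mask
              for i in range(count)]
    return pixels[blank_start:]


# Eight 4-bit pixels per 32-bit word; skipping 3 makes the 4th pixel first.
assert visible_pixels(0x01234567, 4, 0) == [0, 1, 2, 3, 4, 5, 6, 7]
assert visible_pixels(0x01234567, 4, 3) == [3, 4, 5, 6, 7]
```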
The window x-size value preferably is a 10-bit data field that indicates the size of a graphics window in the x direction, i.e., horizontal direction. The window x-size value preferably indicates the number of pixels of a graphics window in a display line.
The window x-start value preferably is a 10-bit data field that indicates a starting pixel of the graphics window on a display line. The graphics window defined by the window descriptor preferably begins at the pixel indicated by the window x-start value of each display line. With the window x-start value, any pixel of a given display line can be chosen to start painting the graphics window. Therefore, there is no need to load pixels on the screen prior to the beginning of the graphics window display area with black.
III. Graphics Window Control Data Passing Mechanism
In one embodiment of the present invention, a FIFO in the graphics display path accepts raw graphics data as the raw graphics data is read from memory, at the full memory data rate using a clock of the memory controller. In this embodiment, the FIFO provides this data, initially stored in an external memory, to subsequent blocks in the graphics pipeline.
In systems such as graphics display systems where multiple types of data may be output from one module, such as a memory controller subsystem, and used in another subsystem, such as a graphics processing subsystem, it typically becomes progressively more difficult to support a combination of dynamically varying data types and data transfer rates and FIFO buffers between the producing and consuming modules. The conventional way to address such problems is to design a logic block that understands the varying parameters of the data types in the first module and controls all of the relevant variables in the second module. This may be difficult due to variable delays between the two modules, due to the use of FIFOs between them and varying data rate, and due to the complexity of supporting a large number of data types.
The system preferably processes graphics images for display by organizing the graphics images into windows in which the graphics images appear on the screen, obtaining data that describes the windows, sorting the data according to the depth of the window on the display, transferring graphics images from memory, and blending the graphics images using alpha values associated with the graphics images.
In the preferred embodiment, a packet of control information called a header packet is passed from the window controller to the display engine. All of the required control information from the window controller preferably is conveyed to the display engine such that all of the relevant variables from the window controller are properly controlled in a timely fashion and such that the control is not dependent on variations in delays or data rates between the window controller and the display engine.
A header packet preferably indicates the start of graphics data for one graphics window. The graphics data for that graphics window continues until it is completed without requiring a transfer of another header packet. A new header packet is preferably placed in the FIFO when another window is to start. The header packets may be transferred according to the order of the corresponding window descriptors in the window descriptor lists.
In a display engine that operates according to lists of window descriptors, windows may be specified to overlap one another. At the same time, windows may start and end on any line, and there may be many windows visible on any one line. There are a large number of possible combinations of window starting and ending locations along vertical and horizontal axes and depth order locations. The system preferably indicates the depth order of all windows in the window descriptor list and implements the depth ordering correctly while accounting for all windows.
Each window descriptor preferably includes a parameter indicating the depth location of the associated window. The range that is allowed for this parameter can be defined to be almost any useful value. In the preferred embodiment there are 16 possible depth values, ranging from 0 to 15, with 0 being the back-most (deepest, or furthest from the viewer), and 15 being the top or front-most depth. The window descriptors are ordered in the window descriptor list in order of the first display scan line where the window appears. For example, if window A spans lines 10 to 20, window B spans lines 12 to 18, and window C spans lines 5 to 20, the order of these descriptors in the list would be {C, A, B}.
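The list ordering in this example can be reproduced with a short sketch that orders descriptors by their first display line. The dictionary layout and the function name are assumptions made for the illustration.

```python
# Window name -> (first display line, last display line), from the example.
windows = {"A": (10, 20), "B": (12, 18), "C": (5, 20)}


def descriptor_list_order(windows):
    """Order window names by the first scan line on which each appears."""
    return sorted(windows, key=lambda name: windows[name][0])


assert descriptor_list_order(windows) == ["C", "A", "B"]
```

Note that this ordering depends only on the starting line; the depth value of each window plays no part in it.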
In the hardware, which is preferably a VLSI device, there is preferably on-chip memory capable of storing a number of window descriptors. In the preferred implementation, this memory can store up to 8 window descriptors on-chip; however, the size of this memory may be made larger or smaller without loss of generality. Window descriptors are read from main memory into the on-chip descriptor memory in order from the start of the list, stopping when the on-chip memory is full or when the most recently read descriptor describes a window that is not yet visible, i.e., its starting line is on a line that has a higher number than the line currently being constructed. Once a window has been displayed and is no longer visible, it may be cast out of the on-chip memory and the next descriptor in the list may be read from main memory. At any given display line, the order of the window descriptors in the on-chip memory bears no particular relation to the depth order of the windows on the screen.
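The cast-out and read-ahead policy described above might be modeled as in the following sketch. The Descriptor tuple, the refill function and its arguments are hypothetical names introduced for the illustration, and the on-chip memory is modeled as a plain list kept in read order.

```python
from collections import namedtuple

# Hypothetical, simplified descriptor: only the fields this sketch needs.
Descriptor = namedtuple("Descriptor", "name y_start y_end")
ON_CHIP_CAPACITY = 8  # preferred implementation stores up to 8 descriptors


def refill(on_chip, descriptor_list, next_index, current_line):
    """Cast out finished windows, then read descriptors in list order.

    Reading stops when the on-chip memory is full, the list is exhausted,
    or the most recently read descriptor is not yet visible.
    Returns the updated (on_chip, next_index).
    """
    on_chip = [d for d in on_chip if d.y_end >= current_line]
    while (len(on_chip) < ON_CHIP_CAPACITY
           and next_index < len(descriptor_list)
           and not (on_chip and on_chip[-1].y_start > current_line)):
        on_chip.append(descriptor_list[next_index])
        next_index += 1
    return on_chip, next_index


dlist = [Descriptor("C", 5, 20), Descriptor("A", 10, 20), Descriptor("B", 12, 18)]
chip, nxt = refill([], dlist, 0, current_line=5)
# C is visible; A is read but not yet visible, so reading stops there.
assert [d.name for d in chip] == ["C", "A"] and nxt == 2
```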
The hardware that controls the compositing of windows builds up the display in layers, starting from the back-most layer. In the preferred embodiment, the back most layer is layer 0. The hardware performs a quick search of the back-most window descriptor that has not yet been composited, regardless of its location in the on-chip descriptor memory. In the preferred embodiment, this search is performed as follows:
All 8 window descriptors are stored on chip in such a way that the depth order numbers of all of them are available simultaneously. While the depth numbers in the window descriptors are 4-bit numbers, representing 0 to 15, the on-chip memory has storage for 5 bits for the depth number. Initially the fifth bit for each descriptor is set to 0. The depth order values are compared in a hierarchy of pair-wise comparisons, and the lower of the two depth numbers in each comparison wins the comparison. That is, at the first stage of the test, descriptor pairs {0, 1}, {2, 3}, {4, 5}, and {6, 7} are compared, where {0-7} represent the eight descriptors stored in the on-chip memory. This results in four depth numbers with associated descriptor numbers. At the next stage two pair-wise comparisons compare {(0, 1), (2, 3)} and {(4, 5), (6, 7)}.
Each of these results in a depth number of the lower depth order number and the associated descriptor number. At the third stage, one pair-wise comparison finds the smallest depth number of all, and its associated descriptor number. This number points to the descriptor in the on-chip memory with the lowest depth number, and therefore the greatest depth, and this descriptor is used first to render the associated window on the screen. Once this window has been rendered onto the screen for the current scan line, the fifth bit of the depth number in the on-chip memory is set to 1, thereby ensuring that the depth value number is greater than 15, and as a result this depth number will preferably never again be found to be the back-most window until all windows have been rendered on this scan line, preventing rendering this window twice.
Once all the windows have been rendered for a given scan line, the fifth bits of all the on-chip depth numbers are again set to 0; descriptors that describe windows that are no longer visible on the screen are cast out of the on-chip memory; new descriptors are read from memory as required (that is, if all windows in the on-chip memory are visible, the next descriptor is read from memory, and this repeats until the most recently read descriptor is not yet visible on the screen), and the process of finding the back most descriptor and rendering windows onto the screen repeats.
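The search described above, three stages of pair-wise comparisons plus a fifth bit that marks rendered windows, can be sketched in software as follows. This models the behavior only, not the hardware; the function names are hypothetical, all eight slots are assumed to hold visible windows, and resolving equal depths in favor of the lower slot number is an assumption, since the text does not specify a tie-break.

```python
DONE_BIT = 1 << 4  # fifth bit: set once a window is rendered on this line


def find_back_most(keys):
    """Tournament of pair-wise comparisons over eight 5-bit depth keys.

    Returns (key, slot) for the smallest key. Ties go to the lower slot,
    which is an assumption of this sketch.
    """
    stage = [(key, slot) for slot, key in enumerate(keys)]
    while len(stage) > 1:  # 8 -> 4 -> 2 -> 1 entries
        stage = [min(stage[i], stage[i + 1]) for i in range(0, len(stage), 2)]
    return stage[0]


def render_order(depths):
    """Slots in the order they would be rendered on one scan line."""
    keys = list(depths)  # fifth bits start at 0
    order = []
    for _ in depths:
        _, slot = find_back_most(keys)
        order.append(slot)
        keys[slot] |= DONE_BIT  # key is now > 15, never back-most again
    return order


# Depths stored in on-chip slots 0..7, in no particular order.
assert render_order([3, 0, 15, 7, 1, 9, 2, 4]) == [1, 4, 6, 0, 7, 3, 5, 2]
```

Because a rendered window's key always exceeds 15, the same comparison tree can be reused unchanged for every window on the line.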
Referring to FIG. 7, window descriptors are preferably sorted by the window controller and used to transfer graphics data to the display engine. Each of the window descriptors, including the window descriptor 0 through the window descriptor 7 300a-h, preferably contains a window layer parameter. In addition, each window descriptor is preferably associated with a window line done flag indicating that the window descriptor has been processed on a current display line.
The window controller preferably performs window sorting at each display line using the window layer parameters and the window line done flags. The window controller preferably places the graphics window that corresponds to the window descriptor with the smallest window layer parameter at the bottom, while placing the graphics window that corresponds to the window descriptor with the largest window layer parameter at the top.
The window controller preferably transfers the graphics data for the bottom-most graphics window to be processed first. The window parameters of the bottom-most window are composed into a header packet and written to the graphics FIFO. The DMA engine preferably sends a request to the memory controller to read the corresponding graphics data for this window and send the graphics data to the graphics FIFO. The graphics FIFO is then read by the display engine to compose a display line, which is then written to graphics line buffers.
The window line done flag is preferably set true whenever the window surface has been processed on the current display line. The window line done flag and the window layer parameter may be concatenated together for sorting. The window line done flag is added to the window layer parameter as the most significant bit during sorting such that {window line done flag[4], window layer parameter[3:0]} is a five bit binary number, a window layer value, with window line done flag as the most significant bit.
The window controller preferably selects a window descriptor with the smallest window layer value to be processed. Since the window line done flag is preferably the most significant bit of the window layer value, any window descriptor with this flag set, i.e., any window that has been processed on the current display line, will have a higher window layer value than any of the other window descriptors that have not yet been processed on the current display line. When a particular window descriptor is processed, the window line done flag associated with that particular window descriptor is preferably set high, signifying that the particular window descriptor has been processed for the current display line.
A sorter 304 preferably sorts all eight window descriptors after any window descriptor is processed. The sorting may be implemented using binary tree sorting or any other suitable sorting algorithm. In binary tree sorting for eight window descriptors, the window layer values for four pairs of window descriptors are compared at a first level using four comparators to choose the window descriptor that corresponds to a lower window in each pair. In the second level, two comparators are used to select the window descriptor that corresponds to the bottom-most graphics window in each of two pairs. In the third and last level, the bottom-most graphics windows from each of the two pairs are compared against each other preferably using only one comparator to select the bottom window.
A multiplexer 302 preferably multiplexes parameters from the window descriptors. The output of the sorter, i.e., window selected to be the bottom most, is used to select the window parameters to be sent to a direct memory access ("DMA") module 306 to be packaged in a header packet and sent to a graphics FIFO 308. The display engine preferably reads the header packet in the graphics FIFO and processes the raw graphics data based on information contained in the header packet.
The header packet preferably includes a first header word and a second header word. Corresponding graphics data is preferably transferred as graphics data words. Each of the first header word, the second header word and the graphics data words preferably includes 32 bits of information plus a data type bit. The first header word preferably includes a 1-bit data type, a 4-bit graphics type, a 1-bit first window parameter, a 1-bit top/bottom parameter, a 2-bit alpha type, an 8-bit window alpha value and a 16-bit window color value. Table 2 shows contents of the first header word.
TABLE 2 First Header Word

Bit Position   32         31-28          27            26          25-24       23-16         15-0
Data Content   Data type  graphics type  First Window  top/bottom  alpha type  window alpha  window color
The 1-bit data type preferably indicates whether a 33-bit word in the FIFO is a header word or a graphics data word. A data type of 1 indicates that the associated 33-bit word is a header word while the data type of 0 indicates that the associated 33-bit word is a graphics data word. The graphics type indicates the data format of the graphics data to be displayed in the graphics window similar to the window format parameter in the word 0 of the window descriptor, which is described in Table 1 above. In the preferred embodiment, when the graphics type is 1111, there is no window on the current display line, indicating that the current display line is empty.
The first window parameter of the first header word preferably indicates whether the window associated with that first header word is a first window on a new display line. The top/bottom parameter preferably indicates whether the current display line indicated in the first header word is at the top or the bottom edges of the window. The alpha type preferably indicates a method of selecting an alpha value individually for each pixel in the window similar to the alpha type in the word 2 of the window descriptor.
The window alpha value preferably is an alpha value to be applied to the window as a whole and is similar to the window alpha value in the word 2 of the window descriptor. The window color value preferably is the color of the window in 16-bit RGB format and is similar to the window color value in the word 1 of the window descriptor.
The second header word preferably includes the 1-bit data type, a 4-bit blank pixel count, a 10-bit left edge value, a 1-bit filter enable parameter and a 10-bit window size value. Table 3 shows contents of the second header word in the preferred embodiment.
TABLE 3 Second Header Word

Bit Position   32         31-28              25-16      10             9-0
Data Content   data type  Blank pixel count  Left edge  filter enable  window size
Similar to the first header word, the second header word preferably starts with the data type indicating whether the second header word is a header word or a graphics data word. The blank pixel count preferably indicates a number of blank pixels at a left edge of the window and is similar to the blank start pixel value in the word 3 of the window descriptor. The left edge preferably indicates a starting location of the window on a scan line, and is similar to the window x-start value in the word 3 of the window descriptor. The filter enable parameter preferably enables a filter during a conversion of graphics data from a YUV 4:4:4 format to a YUV 4:2:2 format and is similar to the window filter enable parameter in word 3 of the window descriptor. Some YUV 4:4:4 data may contain higher frequency content than others, which may be filtered by enabling a low pass filter during a conversion to the YUV 4:2:2 format. The window size value preferably indicates the actual horizontal size of the window and is similar to the window x-size value in word 3 of the window descriptor.
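The bit positions listed in the two header word tables above can be exercised with a small packing sketch. The function names are hypothetical, and any bit not listed in the tables is simply left at zero here, which is an assumption of the example.

```python
HEADER = 1  # data type bit (bit 32): 1 = header word, 0 = graphics data word


def pack_first_header(graphics_type, first_window, top_bottom,
                      alpha_type, window_alpha, window_color):
    """Build the 33-bit first header word from its fields."""
    return ((HEADER << 32)
            | ((graphics_type & 0xF) << 28)    # bits 31-28
            | ((first_window & 0x1) << 27)     # bit 27
            | ((top_bottom & 0x1) << 26)       # bit 26
            | ((alpha_type & 0x3) << 24)       # bits 25-24
            | ((window_alpha & 0xFF) << 16)    # bits 23-16
            | (window_color & 0xFFFF))         # bits 15-0


def pack_second_header(blank_pixels, left_edge, filter_enable, window_size):
    """Build the 33-bit second header word from its fields."""
    return ((HEADER << 32)
            | ((blank_pixels & 0xF) << 28)     # bits 31-28
            | ((left_edge & 0x3FF) << 16)      # bits 25-16
            | ((filter_enable & 0x1) << 10)    # bit 10
            | (window_size & 0x3FF))           # bits 9-0


def is_header(word33):
    """True if the data type bit marks the FIFO word as a header word."""
    return bool((word33 >> 32) & 1)


first = pack_first_header(0b0000, 1, 0, 0b11, 0x80, 0xF800)
second = pack_second_header(3, 100, 1, 640)
assert first == 0x10B80F800 and second == 0x130640680
assert is_header(first) and not is_header(0x0DEADBEEF)
```

A consumer of the FIFO would test the data type bit of each 33-bit word first, and only then interpret the remaining 32 bits as header fields or as graphics data.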
When the composition of the last window of the last display line is completed, an empty-line header is preferably placed into the FIFO so that the display engine may release the display line for display.
Packetized data structures have been used primarily in the communication world where large amounts of data need to be transferred between hardware using a physical data link (e.g., wires). The idea is not known to have been used in the graphics world where localized and small data control structures need to be transferred between different design entities without requiring a large off-chip memory as a buffer. In one embodiment of the present system, header packets are used, and a general-purpose FIFO is used for routing. Routing may be accomplished in a relatively simple manner in the preferred embodiment because the write port of the FIFO is the only interface.
In the preferred embodiment, the graphics FIFO is a synchronous 32×33 FIFO built with a static dual-port RAM with one read port and one write port. The write port preferably is synchronous to an 81 MHz memory clock while the read port may be asynchronous (not synchronized) to the memory clock. The read port is preferably synchronous to a graphics processing clock, which runs preferably at 81 MHz, but not necessarily synchronized to the memory clock. Two graphics FIFO pointers are preferably generated, one for the read port and one for the write port. In this embodiment, each graphics FIFO pointer is a 6-bit binary counter which ranges from 000000b to 111111b, i.e., from 0 to 63. The graphics FIFO is only 32 words deep and requires only 5 bits to address each 33-bit word in the graphics FIFO. An extra bit is preferably used to distinguish between FIFO full and FIFO empty states.
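The role of the extra pointer bit can be illustrated with the following sketch of the usual pointer comparison for a 32-word FIFO with 6-bit pointers. The function names are hypothetical, and the sketch ignores the clock-domain crossing between the read and write ports that a real implementation would have to handle.

```python
DEPTH = 32       # FIFO words, addressed by the low 5 pointer bits
PTR_MASK = 0x3F  # 6-bit pointers: 5 address bits plus one extra bit


def advance(ptr):
    """Increment a 6-bit pointer, wrapping from 63 back to 0."""
    return (ptr + 1) & PTR_MASK


def is_empty(rd, wr):
    """Empty: pointers are identical, including the extra bit."""
    return rd == wr


def is_full(rd, wr):
    """Full: same 5-bit address, but the extra bits differ."""
    return (rd ^ wr) == DEPTH


def fill_level(rd, wr):
    """Number of words currently stored."""
    return (wr - rd) & PTR_MASK


rd = wr = 0
assert is_empty(rd, wr) and not is_full(rd, wr)
for _ in range(DEPTH):  # write 32 words without reading
    wr = advance(wr)
assert is_full(rd, wr) and not is_empty(rd, wr) and fill_level(rd, wr) == 32
```

Without the sixth bit, the empty and full states would both appear as equal 5-bit addresses and could not be told apart.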
The graphics data words preferably include the 1-bit data type and 32-bit graphics data bits. The data type is 0 for the graphics data words. In order to adhere to a common design practice that generally limits the size of a DMA burst into a FIFO to half the size of the FIFO, the number of graphics data words in one DMA burst preferably does not exceed 16.
In an alternate embodiment, a graphics display FIFO is not used. In this embodiment, the graphics converter processes data from memory at the rate that it is read from memory. The memory and conversion functions are in a same clock domain. Other suitable FIFO designs may be used.
Referring to FIG. 8, a flow diagram illustrates a process for loading and processing window descriptors. First the system is preferably reset in step 310. Then the system in step 312 preferably checks for a vertical sync ("VSYNC"). When the VSYNC is received, the system in step 314 preferably proceeds to load window descriptors into the window controller from the external SDRAM or other suitable memory over the DMA channel for window descriptors. The window controller may store up to eight window descriptors in one embodiment of the present invention.
The system in step 316 preferably sends a new line header indicating the start of a new display line. The system in step 320 preferably sorts the window descriptors in accordance with the process described in reference to FIG. 7. Although sorting is indicated as a step in this flow diagram, sorting actually may be a continuous process of selecting the bottom-most window, i.e., the window to be processed. The system in step 322 preferably checks to determine if a starting display line of the window is greater than the line count of the current display line. If the starting display line of the window is greater than the line count, i.e., if the current display line is above the starting display line of the bottom-most window, the current display line is a blank line. Thus, the system in step 318 preferably increments the line count and sends another new line header in step 316. The process of sending a new line header and sorting window descriptors continues as long as the starting display line of the bottom-most (in layer order) window is below the current display line.
The display engine and the associated graphics filter preferably operate in one of two modes, a field mode and a frame mode. In both modes, raw graphics data associated with graphics windows is preferably stored in frame format, including lines from both interlaced fields in the case of an interlaced display. In the field mode, the display engine preferably skips every other display line during processing. In the field mode, therefore, the system in step 318 preferably increments the line count by two each time to skip every other line. In the frame mode, the display engine processes every display line sequentially. In the frame mode, therefore, the system in step 318 preferably increments the line count by one each time.
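The difference between the two modes in step 318 amounts to the line count increment, as the following sketch shows. The function name and its arguments are hypothetical; in field mode the choice of first_line (even or odd) is assumed to select which interlaced field is processed.

```python
def lines_processed(first_line, last_line, field_mode):
    """Display lines visited between first_line and last_line inclusive.

    Field mode increments the line count by two, skipping every other
    line; frame mode increments by one.
    """
    step = 2 if field_mode else 1
    return list(range(first_line, last_line + 1, step))


assert lines_processed(0, 6, field_mode=True) == [0, 2, 4, 6]
assert lines_processed(1, 6, field_mode=True) == [1, 3, 5]
assert lines_processed(0, 3, field_mode=False) == [0, 1, 2, 3]
```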
When the system in step 322 determines that the starting display line of the window is not greater than the line count, the system in step 324 preferably determines from the header packet whether the window descriptor is for displaying a window or re-loading the CLUT. If the window header indicates that the window descriptor is for re-loading the CLUT, the system in step 328 preferably sends the CLUT data to the CLUT and turns on the CLUT write strobe to load the CLUT.
If the system in step 324 determines that the window descriptor is for displaying a window, the system in step 326 preferably sends a new window header to indicate that graphics data words for a new window on the display line are going to be transferred into the graphics FIFO. Then, the system in step 330 preferably requests the DMA module to send graphics data to the graphics FIFO over the DMA channel for graphics data. In the event the FIFO does not have sufficient space to store graphics data in a new data packet, the system preferably waits until such space is made available.
When graphics data for a display line of a current window is transferred to the FIFO, the system in step 332 preferably determines whether the last line of the current window has been transferred. If the last line has been transferred, i.e., when the current window descriptor is completely processed, the system in step 334 preferably sets a window descriptor done flag associated with the current window. The window descriptor done flag indicates that the graphics data associated with the current window descriptor has been completely transferred. Then the system in step 336 preferably sets a new window descriptor update flag and increments a window descriptor update counter to indicate that a new window descriptor is to be copied from the external memory.
Regardless of whether the last line of the current window has been processed, the system in step 338 preferably sets the window line done flag for the current window descriptor to signify that processing of this window descriptor on the current display line has been completed. The system in step 340 preferably checks the window line done flags associated with all eight window descriptors to determine whether they are all set, which would indicate that all the windows of the current display line have been processed. If not all window line done flags are set, the system preferably proceeds to step 320 to sort the window descriptors and repeat processing of the new bottom-most window descriptor.
If all eight window line done flags are determined to be set in step 340, all window descriptors on the current display line have been processed. In this case, the system in step 342 preferably checks whether an all window descriptor done flag has been set to determine whether all window descriptors have been processed completely. The all window descriptor done flag is set when all window descriptors in the current frame or field have been processed completely. If the all window descriptor done flag is set, the system preferably returns to step 310 to reset and awaits another VSYNC in step 312. If not all window descriptors have been processed, the system in step 344 preferably determines if the new window descriptor update flag has been set. In the preferred embodiment, this flag would have been set in step 334 if the current window descriptor has been completely processed.
When the new window descriptor update flag is set, the system in step 352 preferably sets up the DMA to transfer a new window descriptor from the external memory. Then the system in step 350 preferably clears the new window descriptor update flag.
After the system c