United States Patent 6026484
Golston, February 15, 2000

Title

Data processing apparatus, system and method for if, then, else operation using write priority

Abstract

A data processing apparatus employs write priority to execute an if, then, else operation in a single instruction cycle. The data processing apparatus includes pipelined data unit (110) and address unit (120) operations. The address unit (120) data move operation has a higher write priority than the storing of the data unit (110) operation result. The data unit (110) includes an arithmetic logic unit (230) that performs an unconditional operation with the result to be stored in a destination register (200). The address unit (120) sets the address for a data move operation to the same destination register (200). The data move operation is conditional upon the if condition set by the instruction and based upon a set of status bits in a status register (210). The status register (210) includes a plurality of status bits set according to a prior arithmetic logic unit (230) result. The status bits preferably include a negative status bit, a carry status bit, an overflow status bit and a zero status bit. If the status bits match the condition specified in the instruction, the address unit (120) data move operation, having a higher write priority than the data unit (110) operation, controls the data written into the destination register (200). If the status bits do not match the condition specified in the instruction, then the conditional data move does not take place and the result of the data unit (110) operation is stored in the destination register (200).
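
The write-priority behavior described in the abstract can be illustrated with a small software model. This is only a sketch, not the patented hardware: the function names, the register name `D0`, and the numeric priority values are assumptions chosen for illustration. The unconditional ALU result and the conditional move both target the same register in one cycle, and the write with the higher priority is the one committed.

```python
# Illustrative model only: names and priority values are assumptions,
# not taken from the patent.
ALU_PRIORITY = 1    # data unit result (lower write priority)
MOVE_PRIORITY = 2   # address unit data move (higher write priority)


def resolve_writes(registers, writes):
    """Commit one cycle's writes; the highest priority wins per register."""
    winners = {}
    for dest, priority, value in writes:
        if dest not in winners or priority > winners[dest][0]:
            winners[dest] = (priority, value)
    for dest, (_, value) in winners.items():
        registers[dest] = value
    return registers


def if_then_else(registers, dest, alu_result, move_value, condition_met):
    """One cycle: an unconditional ALU write plus a conditional move to dest."""
    writes = [(dest, ALU_PRIORITY, alu_result)]           # the "else" value
    if condition_met:                                      # the "if" test
        writes.append((dest, MOVE_PRIORITY, move_value))   # the "then" value
    return resolve_writes(registers, writes)


regs = {"D0": 0}
if_then_else(regs, "D0", alu_result=8, move_value=99, condition_met=True)
print(regs["D0"])  # condition met: the move overrides the ALU result
if_then_else(regs, "D0", alu_result=8, move_value=99, condition_met=False)
print(regs["D0"])  # condition not met: the ALU result is stored
```

In this model no branch selects between two instruction streams; the selection falls out of which write wins the destination register.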


Inventors: Golston; Jeremiah E. (Sugar Land, TX)
Assignee: Texas Instruments Incorporated (Dallas, TX)
Appl. No.: 160300
Filed: November 30, 1993

Current U.S. Class: 712/226; 712/234; 712/225
Field of Search: 712/234,220,236,221,225,223,226,227; 710/244; 708/525

U.S. Patent Documents
4179746    December 1979     Tubbs
4811266    March 1989        Woods et al.
4873627    October 1989      Baum et al.
4933878    June 1990         Guttag et al.
5045995    September 1991    Levinthal et al.
5125092    June 1992         Prener
5140687    August 1992       Dye et al.
5146592    September 1992    Pfeiffer et al.
5197140    March 1993        Balmer
5212777    May 1993          Gove et al.
5226125    July 1993         Balmer et al.
5231694    July 1993         Novak et al.
5239654    August 1993       Ing-Simmons et al.
5247627    September 1993    Murakami et al.
5249266    September 1993    Dye et al.
5274777    December 1993     Kawata
5289427    February 1994     Nicholes et al.
5349671    September 1994    Maeda et al.
Foreign Patent Documents
2228652B   Oct., 1993        GB
Other References
Slater, Michael, "IIT Ships Programmable Video Processor," Microprocessor Report, vol. 5, No. 20, Oct. 30, 1991, pp. 1, 6-7, 13.
Primary Examiner: Vu; Viet D.
Attorney, Agent or Firm: Marshall, Jr.; Robert D., Laws; Gerald E., Donaldson; Richard L.

Claims


We claim:
1. A method of conditional data processing operation comprising the steps of:
setting a condition to either a first state or a second state;
performing a first arithmetic/logical operation and storing a first result in a first data register with a first write priority; and
conditionally moving predetermined data into said first data register if said condition has said first state with a second write priority, said second write priority of said conditional move being higher than said first write priority of said first arithmetic/logical operation whereby said first data register stores said predetermined data if said condition has said first state or said first result of said first arithmetic/logical operation if said condition does not have said first state.

2. The method of claim 1, wherein:
said step of setting a condition in either a first state or a second state includes
performing a second arithmetic/logical operation and storing a second result in a second data register with said first write priority,
setting at least one status bit dependent upon said second result, and
storing said at least one status bit in a status register.

3. The method of claim 2, wherein:
said step of setting a condition in either a first state or a second state further includes
moving data stored in said second data register into said second data register with said second write priority, said move thereby having a higher write priority than said storing of said second result whereby said second data register stores unchanged data.
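
As a hedged illustration of claims 2 and 3, the sketch below models a compare-style operation: an arithmetic operation sets status bits, while a higher-priority move of the register onto itself leaves the register contents unchanged. The function name `compare`, the dictionary-based registers, the choice of subtraction, and the use of only negative (`N`) and zero (`Z`) bits with unbounded Python integers are all assumptions for illustration.

```python
def compare(registers, status, reg_a, reg_b):
    """Set status bits from reg_a - reg_b while leaving reg_a unchanged."""
    result = registers[reg_a] - registers[reg_b]
    status["N"] = result < 0     # negative status bit
    status["Z"] = result == 0    # zero status bit
    # Two writes target reg_a in the same cycle, each as (priority, value).
    alu_write = (1, result)              # first (lower) write priority
    self_move = (2, registers[reg_a])    # second (higher) write priority
    registers[reg_a] = max(alu_write, self_move, key=lambda w: w[0])[1]
    return status


regs = {"D1": 5, "D2": 7}
flags = compare(regs, {}, "D1", "D2")
print(regs["D1"], flags)  # D1 keeps its value; only the status bits change
```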

4. The method of claim 2, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes a negative status bit set to said first state if said second result is negative and set to said second state if said second result is not negative.

5. The method of claim 2, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes a carry out status bit set to said first state if said second result generates a carry out and set to said second state if said second result does not generate a carry out.

6. The method of claim 2, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes an overflow status bit set to said first state if said second result generates an overflow and set to said second state if said second result does not generate an overflow.

7. The method of claim 2, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes a zero status bit set to said first state if said second result is zero and set to said second state if said second result is not zero.
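
The four status bits named in claims 4 through 7 can be sketched for a simple addition. The 32-bit word width, the function name `add_with_status`, and the use of addition as the example operation are assumptions made for illustration; the claims do not fix any of them.

```python
WIDTH = 32                     # assumed word width, not stated in the claims
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def add_with_status(a, b):
    """Add two words and derive negative, carry, overflow and zero bits."""
    a &= MASK
    b &= MASK
    full = a + b
    result = full & MASK
    status = {
        "N": bool(result & SIGN),   # result is negative (sign bit set)
        "C": full > MASK,           # carry out of the top bit
        # Signed overflow: operands share a sign that the result lacks.
        "V": bool(~(a ^ b) & (a ^ result) & SIGN),
        "Z": result == 0,           # result is zero
    }
    return result, status


print(add_with_status(0x7FFFFFFF, 1))
print(add_with_status(0xFFFFFFFF, 1))
```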

8. The method of claim 2, wherein:
said step of setting a condition in either a first state or a second state further includes
setting a plurality of status bits dependent upon said second result, and
storing said plurality of status bits in said status register; and
said step of conditionally moving predetermined data into said first data register if said condition has said first state includes testing a selected set of said plurality of said status bits and conditionally moving said predetermined data into said first register only if each status bit in said selected set of said plurality of status bits has a predetermined state.

9. The method of claim 1, wherein:
said step of conditionally moving predetermined data into said first data register moves said predetermined data from a predetermined memory location to said first data register.

10. The method of claim 1, wherein:
said step of conditionally moving predetermined data into said first data register moves said predetermined data from a third data register to said first data register.

11. A data processing apparatus comprising:
a plurality of data registers;
a status register storing at least one status bit;
an arithmetic logic unit having first and second inputs and an output coupled to said plurality of data registers, said arithmetic logic unit performing an arithmetic or logical operation upon data received at said first and second inputs and generating a result at said output; and
an instruction logic circuit connected to said plurality of data registers, said status register and said arithmetic logic unit, said instruction logic circuit controlling said plurality of data registers, said status register and said arithmetic logic unit in response to a received instruction, said instruction logic circuit in response to a first instruction
controlling said arithmetic logic unit to form a first combination of inputs from first designated source data registers and store a first result in a first designated destination data register with a first write priority, and
controlling said plurality of data registers to conditionally move predetermined data into said first designated destination data register if said at least one status bit in said status register has a predetermined state with a second write priority, said second write priority of said conditional move being higher than said first write priority of said first combination of inputs of said arithmetic logic unit whereby said first designated destination data register stores said predetermined data if said at least one status register has said predetermined state or said result of said first combination if said at least one status register does not have said predetermined state.

12. The data processing apparatus of claim 11, wherein:
said instruction decode logic in response to a second instruction
controlling said arithmetic logic unit to form a second combination of inputs from designated source data registers and store a second result in a second designated destination data register with said first write priority, and
setting said at least one status bit in said status register corresponding to said second result.

13. The data processing apparatus of claim 12, wherein:
said instruction decode logic in response to a second instruction further
controlling said data registers to move data stored in said second data register into said second data register with said second write priority, said move thereby having a higher write priority than said storing of said second result whereby said second data register stores data unchanged by said second instruction.

14. The data processing apparatus of claim 12, wherein:
said status register sets a negative status bit when said second result is negative.

15. The data processing apparatus of claim 12, wherein:
said status register sets a carry status bit when said second result generates a carry out signal.

16. The data processing apparatus of claim 12, wherein:
said status register sets an overflow status bit when said second result generates an overflow at said output.

17. The data processing apparatus of claim 12, wherein:
said status register sets a zero status bit when said second result is zero.

18. The data processing apparatus of claim 12, wherein:
said status register includes a plurality of status bits;
said instruction decode logic in response to said second instruction setting each of said plurality of status bits in said status register corresponding to said second result;
said first instruction includes an indication of a set of said status bits upon which said conditional move is conditional; and
said instruction decode circuit in response to said first instruction controlling said plurality of data registers to conditionally move predetermined data into said first designated destination data register if said set of status bits indicated in said first instruction have a predetermined state.

19. The data processing apparatus of claim 11, further comprising:
a memory for storing data at a plurality of memory locations;
an address unit connected to said instruction decode circuit and said memory for generating an address for accessing said memory; and
said instruction decode circuit in response to said first instruction
controlling said address generator and said data registers to conditionally move predetermined data into said first data register by accessing an address via said address unit and transferring data stored in a memory location corresponding to said address to said first data register.

20. The data processing apparatus of claim 11, wherein:
said instruction decode circuit in response to said first instruction
controlling said data registers to conditionally move data stored in a third data register to said first data register.

21. A data processing system comprising:
a data system bus transferring data and addresses;
a system memory connected to said data system bus, said system memory storing data and transferring data via said data system bus;
a data processor circuit connected to said data system bus, said data processor circuit including
a plurality of data registers,
a status register storing at least one status bit,
an arithmetic logic unit having first and second inputs and an output coupled to said plurality of data registers, said arithmetic logic unit performing an arithmetic or logical operation upon data received at said first and second inputs and generating a result at said output, and
an instruction logic circuit connected to said plurality of data registers, said status register and said arithmetic logic unit, said instruction logic circuit controlling said plurality of data registers, said status register and said arithmetic logic unit in response to a received instruction, said instruction logic circuit in response to a first instruction
controlling said arithmetic logic unit to form a first combination of inputs from first designated source data registers and store a first result in a first designated destination data register with a first write priority, and
controlling said plurality of data registers to conditionally move predetermined data into said first designated destination data register if said at least one status bit in said status register has a predetermined state with a second write priority, said second write priority of said conditional move being higher than said first write priority of said first combination of inputs of said arithmetic logic unit whereby said first designated destination data register stores said predetermined data if said at least one status register has said predetermined state or said result of said first combination if said at least one status register does not have said predetermined state.

22. The data processing system of claim 21, wherein:
said data processor circuit wherein
said instruction decode logic in response to a second instruction
controlling said arithmetic logic unit to form a second combination of inputs from designated source data registers and store a second result in a second designated destination data register with said first write priority, and
setting said at least one status bit in said status register corresponding to said second result.

23. The data processing system of claim 22, wherein:
said data processor circuit wherein
said instruction decode logic in response to a second instruction further
controlling said data registers to move data stored in said second data register into said second data register with said second write priority, said move thereby having a higher write priority than said storing of said second result whereby said second data register stores data unchanged by said second instruction.

24. The data processing system of claim 22, wherein:
said data processor circuit wherein
said status register sets a negative status bit when said second result is negative.

25. The data processing system of claim 22, wherein:
said data processor circuit wherein
said status register sets a carry status bit when said second result generates a carry out signal.

26. The data processing system of claim 22, wherein:
said data processor circuit wherein
said status register sets an overflow status bit when said second result generates an overflow at said output.

27. The data processing system of claim 22, wherein:
said data processor circuit wherein
said status register sets a zero status bit when said second result is zero.

28. The data processing system of claim 22, wherein:
said data processor circuit wherein
said status register includes a plurality of status bits,
said instruction decode logic in response to said second instruction setting each of said plurality of status bits in said status register corresponding to said second result;
said first instruction includes an indication of a set of said status bits upon which said conditional move is conditional, and
said instruction decode circuit in response to said first instruction controlling said plurality of data registers to conditionally move predetermined data into said first designated destination data register if said set of status bits indicated in said first instruction have a predetermined state.

29. The data processing system of claim 21, wherein:
said data processor circuit further including
a memory for storing data at a plurality of memory locations,
an address unit connected to said instruction decode circuit and said memory for generating an address for accessing said memory, and
said instruction decode circuit in response to said first instruction
controlling said address generator and said data registers to conditionally move predetermined data into said first data register by accessing an address via said address unit and transferring data stored in a memory location corresponding to said address to said first data register.

30. The data processing system of claim 21, wherein:
said data processor circuit wherein
said instruction decode circuit in response to said first instruction
controlling said data registers to conditionally move data stored in a third data register to said first data register.

31. The data processing system of claim 21, wherein:
said data processor circuit further includes
a plurality of data memories connected to said data processor circuit,
an instruction memory supplying instructions to said data processor circuit, and
a transfer controller connected to said data system bus, each of said data memories and said instruction memory controlling data transfer between said system memory and said plurality of data memories and between said system memory and said instruction memory.

32. The data processing system of claim 31, wherein:
said data processor circuit further includes
at least one additional data processor circuit identical to said data processor circuit,
a plurality of additional data memories connected to each additional data processor circuit,
an additional instruction memory supplying instructions to each additional data processor circuit, and
said transfer controller is further connected to each of said additional data memories and each said additional instruction memory controlling data transfer between said system memory and said each of said additional data memories and between said system memory and each said additional instruction memory.

33. The data processing system of claim 32, wherein:
said data processor circuit including said data processor circuit, said data memories, said instruction memories, each of said additional data processor circuits, each of said additional data memories, each additional instruction memory and said transfer controller are formed on a single integrated circuit.

34. The data processing system of claim 31, wherein:
said data processor circuit further includes
a master data processor,
a plurality of master data memories connected to said master data processor,
at least one master instruction memory supplying instructions to said master data processor, and
said transfer controller is further connected to each of said master data memories and each said master instruction memory controlling data transfer between said system memory and said each of said master data memories and between said system memory and each said master instruction memory.

35. The data processing system of claim 34, wherein:
said data processor circuit including said data processor circuit, said data memories, said instruction memories, said master data processor, each of said master data memories, each master instruction memory and said transfer controller are formed on a single integrated circuit.

36. The data processor system of claim 21, wherein:
said system memory consists of an image memory storing image data in a plurality of pixels; and
said data processor system further comprising:
an image display unit connected to said image memory generating a visually perceivable output of an image consisting of a plurality of pixels stored in said image memory.

37. The data processor system of claim 36, further comprising:
a palette forming a connection between said image memory and said image display unit, said palette transforming pixels recalled from said image memory into video signals driving said image display unit;
and wherein said data processor circuit further includes
a frame controller connected to said palette controlling said palette transformation of pixels into video signals.

38. The data processor system of claim 21, wherein:
said system memory consists of an image memory storing image data in a plurality of pixels; and
said data processor system further comprising:
a printer connected to said image memory generating a printed output of an image consisting of a plurality of pixels stored in said image memory.

39. The data processor system of claim 38, wherein:
said printer consists of a color printer.

40. The data processor system of claim 38, further comprising:
a printer controller forming a connection between said image memory and said printer, said printer controller transforming pixels recalled from said image memory into print signals driving said printer;
and wherein said data processor circuit further includes
a frame controller connected to said print controller controlling said print controller transformation of pixels into print signals.

41. The data processor system of claim 21, wherein:
said system memory consists of an image memory storing image data in a plurality of pixels; and
said data processor system further comprising:
an imaging device connected to said image memory generating an image signal input.

42. The data processor system of claim 41, further comprising:
an image capture controller forming a connection between said imaging device and said image memory, said image capture controller transforming said image signal into pixels supplied for storage in said image memory;
and wherein said data processor circuit further includes
a frame controller connected to said image capture controller controlling said image capture controller transformation of said image signal into pixels.

43. The data processor system of claim 21, further comprising:
a modem connected to said data system bus and to a communications line.

44. The data processor system of claim 21, further comprising:
a host processing system connected to said data system bus.

45. The data processor system of claim 44, further comprising:
a host system bus connected to said host processing system transferring data and addresses; and
at least one host peripheral connected to said host system bus.

46. A method of conditional data processing operation comprising the steps of:
a) setting a condition to either a first state or a second state;
b) providing a first instruction type including
an arithmetic logic unit code field indicating a first arithmetic/logical operation,
a first source field indicating a first source data register from among a plurality of data registers,
a second source field indicating a second source data register from among a plurality of data registers,
a destination field indicating a first destination data register from among a plurality of data registers,
at least one data transfer source field indicating a source of predetermined data for a first data transfer,
a data transfer destination field indicating a second destination data register from among a plurality of data registers,
a condition field indicating a state of said condition upon which a data transfer is conditional;
c) specifying an if-then-else instruction in said first instruction type having said destination field indicating said first destination data register as a common destination data register and said data transfer destination field indicating said second destination data register as said common destination data register; and
d) executing said if-then-else instruction by
performing said first arithmetic/logical operation on data from said first and second source data registers and storing a first result in said common destination data register with a first write priority, and
conditionally performing said first data transfer of said predetermined data into said common destination data register if said condition has said first state with a second write priority, said second write priority being higher than said first write priority thereby storing said predetermined data in said common destination data register if said condition has said first state or storing said first result of said first arithmetic/logical operation in said common destination data register if said condition does not have said first state.
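
The instruction fields recited in claim 46 can be sketched as a record whose ALU destination and data transfer destination name the same register. This is a minimal model under stated assumptions: the class and field names, the small set of ALU operations, the use of a register as the data transfer source (the case of claim 57), and the boolean condition are all hypothetical.

```python
from dataclasses import dataclass
import operator

# Hypothetical ALU code field values for illustration.
ALU_OPS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}


@dataclass
class Instruction:
    alu_op: str      # arithmetic logic unit code field
    src1: str        # first source field
    src2: str        # second source field
    dest: str        # destination field
    xfer_src: str    # data transfer source field (a register in this sketch)
    xfer_dest: str   # data transfer destination field
    cond: bool       # condition field: the state required for the transfer


def is_if_then_else(instr):
    """Both destination fields name a common destination register."""
    return instr.dest == instr.xfer_dest


def execute(instr, registers, condition):
    """Apply the ALU write, then let the conditional transfer override it."""
    result = ALU_OPS[instr.alu_op](registers[instr.src1], registers[instr.src2])
    pending = {instr.dest: result}                  # first write priority
    if condition == instr.cond:
        # Second (higher) write priority: replaces the ALU write when the
        # two destination fields name the same register.
        pending[instr.xfer_dest] = registers[instr.xfer_src]
    registers.update(pending)
    return registers


instr = Instruction("add", "D1", "D2", "D0", "D3", "D0", cond=True)
print(execute(instr, {"D0": 0, "D1": 5, "D2": 3, "D3": 42}, condition=True))
print(execute(instr, {"D0": 0, "D1": 5, "D2": 3, "D3": 42}, condition=False))
```

When the two destination fields differ, the same model simply performs both writes, which is why the common destination is what turns the instruction type into an if-then-else.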

47. The method of claim 46, wherein:
said step of setting a condition in either a first state or a second state includes
specifying a second instruction including a second arithmetic logic operation having third and fourth source data registers and a third destination data register,
executing said second instruction by performing said second arithmetic/logical operation on data from said third and fourth source data registers and storing a second result in said third destination data register with said first write priority,
setting at least one status bit dependent upon said second result, and
storing said at least one status bit in a status register.

48. The method of claim 47, wherein:
said step of setting a condition in either a first state or a second state wherein said substep of specifying said second instruction includes specifying said second instruction as said first instruction type having a destination field indicating said third destination data register as a second common destination data register;
said step of setting a condition in either a first state or a second state further includes
specifying in said second instruction an unconditional second data transfer of data stored in said second common destination data register into said second common destination data register,
executing said second instruction by transferring data stored in said second common destination data register into said second common destination data register with said second write priority, said second data transfer operation thereby having a higher write priority than said storing of said second result whereby said second common destination data register stores unchanged data.

49. The method of claim 48, wherein:
said step of providing a first instruction type further includes a data transfer conditional field having either a first state indicating a conditional data transfer or a second state indicating an unconditional data transfer;
said step of specifying said if-then-else instruction includes providing said data transfer conditional field in said first state; and
said step of specifying said second instruction employs said first instruction type and provides said data transfer conditional field in said second state.

50. The method of claim 47, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes a negative status bit set to said first state if said second result is negative and set to said second state if said second result is not negative.

51. The method of claim 47, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes a carry out status bit set to said first state if said second result generates a carry out and set to said second state if said second result does not generate a carry out.

52. The method of claim 47, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes an overflow status bit set to said first state if said second result generates an overflow and set to said second state if said second result does not generate an overflow.

53. The method of claim 47, wherein:
said step of setting a condition in either a first state or a second state wherein
said at least one status bit includes a zero status bit set to said first state if said second result is zero and set to said second state if said second result is not zero.

54. The method of claim 47, wherein:
said step of setting a condition in either a first state or a second state further includes
setting a plurality of status bits dependent upon said second result, and
storing said plurality of status bits in said status register;
said step of providing said first instruction type includes said condition field indicating said state of said condition as a predetermined state of each status bit of a specified set of said plurality of status bits; and
said step of conditionally transferring predetermined data into said common destination data register if said condition has said first state includes testing said specified set of said plurality of said status bits of said condition field and conditionally transferring said predetermined data into said common destination data register if and only if each status bit in said selected set of said plurality of status bits has said predetermined state of said condition.

55. The method of claim 54, wherein:
said plurality of status bits consists of a negative status bit, a carry out status bit, an overflow status bit and a zero status bit;
said condition field of said first instruction type consists of 4 bits coding the following sixteen combinations of conditions
1) unconditional, in which the data transfer takes place regardless of the condition of said status bits,
2) positive, in which the data transfer takes place if and only if (Not N AND Not Z) is true,
3) lower than or same, in which the data transfer takes place if and only if (Not C OR Z) is true,
4) higher than, in which the data transfer takes place if and only if (C AND Not Z) is true,
5) less than, in which the data transfer takes place if and only if ((N AND Not V) OR (Not N AND V)) is true,
6) less than or equal to, in which the data transfer takes place if and only if ((N AND Not V) OR (Not N AND V) OR Z) is true,
7) greater than or equal to, in which the data transfer takes place if and only if ((N AND V) OR (Not N AND Not V)) is true,
8) greater than, in which the data transfer takes place if and only if ((N AND V AND Not Z) OR (Not N AND Not V AND Not Z)) is true,
9) lower than, carry, in which the data transfer takes place if and only if C is true,
10) higher than or same, no carry, in which the data transfer takes place if and only if (Not C) is true,
11) equal, zero, in which the data transfer takes place if and only if Z is true,
12) not equal, not zero, in which the data transfer takes place if and only if (Not Z) is true,
13) overflow, in which the data transfer takes place if and only if V is true,
14) no overflow, in which the data transfer takes place if and only if (Not V) is true,
15) negative, in which the data transfer takes place if and only if N is true, and
16) non-negative, in which the data transfer takes place if and only if (Not N) is true,
where: N is the negative status bit; C is the carry out status bit; V is the overflow status bit; and Z is the zero status bit.
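
The sixteen conditions of claim 55 can be written as a lookup table of predicates over the four status bits. The Boolean expressions below are transcribed as they appear in the claim. The table name `CONDITIONS`, the helper `transfer_allowed`, and the use of descriptive strings as keys are assumptions; the claim states that 4 bits code the conditions but does not give the bit patterns, so none are assigned here.

```python
# Each predicate takes the status bits (N, C, V, Z) as booleans.
CONDITIONS = {
    "unconditional": lambda N, C, V, Z: True,
    "positive": lambda N, C, V, Z: (not N) and (not Z),
    "lower than or same": lambda N, C, V, Z: (not C) or Z,
    "higher than": lambda N, C, V, Z: C and (not Z),
    "less than": lambda N, C, V, Z: (N and not V) or (not N and V),
    "less than or equal to":
        lambda N, C, V, Z: (N and not V) or (not N and V) or Z,
    "greater than or equal to":
        lambda N, C, V, Z: (N and V) or (not N and not V),
    "greater than":
        lambda N, C, V, Z: (N and V and not Z) or (not N and not V and not Z),
    "lower than, carry": lambda N, C, V, Z: C,
    "higher than or same, no carry": lambda N, C, V, Z: not C,
    "equal, zero": lambda N, C, V, Z: Z,
    "not equal, not zero": lambda N, C, V, Z: not Z,
    "overflow": lambda N, C, V, Z: V,
    "no overflow": lambda N, C, V, Z: not V,
    "negative": lambda N, C, V, Z: N,
    "non-negative": lambda N, C, V, Z: not N,
}


def transfer_allowed(condition, N, C, V, Z):
    """Return True if the data transfer takes place for these status bits."""
    return bool(CONDITIONS[condition](N, C, V, Z))


print(transfer_allowed("less than", N=True, C=False, V=False, Z=False))
print(transfer_allowed("equal, zero", N=False, C=False, V=False, Z=False))
```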

56. The method of claim 46, wherein:
said step of providing a first instruction type includes providing said at least one data transfer source field indicating a predetermined memory location of a memory as said source of said predetermined data for said first data transfer; and
said step of conditionally transferring predetermined data into said common destination data register transfers said predetermined data from said predetermined memory location to said common destination data register.

57. The method of claim 46, wherein:
said step of providing a first instruction type includes providing said at least one data transfer source field indicating a third source data register as said source of said predetermined data for said first data transfer; and
said step of conditionally transferring predetermined data into said common destination data register transfers said predetermined data from said third source data register to said common destination data register.

58. A data processing apparatus comprising:
a plurality of data registers;
a status register storing at least one status bit;
an arithmetic logic unit having first and second inputs and an output coupled to said plurality of data registers, said arithmetic logic unit performing an arithmetic or logical operation upon data received at said first and second inputs and generating a result at said output;
a source of instructions including at least one if-then-else instruction having a first instruction type, said first instruction type including
an arithmetic logic unit code field indicating a first arithmetic/logical operation,
a first source field indicating a first source data register from among a plurality of data registers,
a second source field indicating a second source data register from among a plurality of data registers,
a destination field indicating a first destination data register from among a plurality of data registers,
at least one data transfer source field indicating a source of predetermined data for a first data transfer,
a data transfer destination field indicating a second destination data register from among a plurality of data registers,
a condition field indicating a predetermined state of said at least one status bit upon which a data transfer is conditional,
said at least one if-then-else instruction having said destination field indicating said first destination data register as a common destination data register and said data transfer destination field indicating said second destination data register as said common destination data register;
an instruction logic circuit connected to said plurality of data registers, said status register, said arithmetic logic unit and said source of instructions, said instruction logic circuit controlling said plurality of data registers, said status register and said arithmetic logic unit in response to a received instruction, said instruction logic circuit in response to an if-then-else instruction
controlling said arithmetic logic unit to form a first combination of said first and second source data registers corresponding to said specified first arithmetic/logical operation and store a first result in said common destination data register with a first write priority, and
controlling said plurality of data registers to conditionally transfer predetermined data into said common destination data register if said at least one status bit in said status register has said predetermined state with a second write priority, said second write priority being higher than said first write priority whereby said common destination data register stores said predetermined data if said at least one status register has said predetermined state or said first result if said at least one status register does not have said predetermined state.
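Illustrative note, not part of the claim language: the write-priority behavior recited above can be sketched in Python. This is a minimal model under stated assumptions. The function name, the list used as a register file, and the callable standing in for the arithmetic logic unit are all hypothetical.

```python
def if_then_else(regs, alu_op, src1, src2, dest, move_data, condition_met):
    """Model one if-then-else instruction writing a common destination register.

    regs          : list of register values (hypothetical register file)
    alu_op        : function combining two source values (unconditional operation)
    src1, src2    : indices of the first and second source data registers
    dest          : index of the common destination data register
    move_data     : predetermined data for the conditional data transfer
    condition_met : True if the status bit(s) have the predetermined state
    """
    # First (lower) write priority: the ALU result is always formed.
    pending = alu_op(regs[src1], regs[src2])
    # Second (higher) write priority: when the conditional transfer occurs,
    # it determines what the destination register finally holds.
    if condition_met:
        pending = move_data
    regs[dest] = pending
    return regs[dest]
```

With the condition satisfied the destination holds the transferred data; otherwise it holds the ALU result, so a single instruction selects between two values.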

59. The data processing apparatus of claim 58, wherein:
said source of instructions includes a second instruction specifying a second arithmetic logic operation having third and fourth source data registers and a third destination data register; and
said instruction decode logic in response to a second instruction
controlling said arithmetic logic unit to form a second combination of said third and fourth source data registers corresponding to said specified second arithmetic/logical operation and store a second result in said third destination data register with a first write priority, and
setting said at least one status bit in said status register corresponding to said second result.

60. The data processing apparatus of claim 59, wherein:
said second instruction
is of said first instruction type and further includes a destination field indicating said third destination data register as a second common destination data register, and
specifies an unconditional second data transfer operation of data stored in said second common destination data register into said second common destination data register with said second write priority;
said instruction decode logic in response to a second instruction further
controlling said data registers to transfer data stored in said second common destination data register into said second common destination data register with said second write priority, said second data transfer operation thereby having a higher write priority than said storing of said second result whereby said second common destination data register stores data unchanged by said second instruction.
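Illustrative note, not part of the claim language: the effect recited above resembles a compare operation, in which the status bits are updated but the destination keeps its old value. The sketch below is a simplified model. The function name is hypothetical, and only the negative and zero status bits are modeled, because the carry and overflow conventions are not assumed here.

```python
def status_only_operation(regs, alu_op, src3, src4, dest):
    """Model a second instruction that sets status but leaves dest unchanged."""
    result = alu_op(regs[src3], regs[src4])
    # Status bits set corresponding to the second result (N and Z only).
    status = {"N": result < 0, "Z": result == 0}
    # First (lower) write priority: the ALU result targets the destination.
    pending = result
    # Second (higher) write priority: unconditional move of the destination
    # register into itself, which overrides the ALU result.
    pending = regs[dest]
    regs[dest] = pending
    return status
```

The status bits produced this way can then serve as the condition for a later if-then-else instruction.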

61. The data processing apparatus of claim 60, wherein:
said first instruction type further includes a data transfer conditional field having either a first state indicating a conditional data transfer or a second state indicating an unconditional data transfer;
said if-then-else instruction includes providing said data transfer conditional field in said first state; and
said second instruction employs said first instruction type and provides said data transfer conditional field in said second state.

62. The data processing apparatus of claim 59, wherein:
said status register sets a negative status bit when said second result is negative.

63. The data processing apparatus of claim 59, wherein:
said status register sets a carry status bit when said second result generates a carry out signal.

64. The data processing apparatus of claim 59, wherein:
said status register sets an overflow status bit when said second result generates an overflow at said output.

65. The data processing apparatus of claim 59, wherein:
said status register sets a zero status bit when said second result is zero.

66. The data processing apparatus of claim 59, wherein:
said status register includes a plurality of status bits;
said instruction decode logic in response to said second instruction setting each of said plurality of status bits in said status register corresponding to said second result;
said if-then-else instruction specifies said condition as a predetermined state of each status bit of a specified set of said plurality of status bits; and
said instruction decode circuit in response to said if-then-else instruction controlling said plurality of data registers to conditionally transfer predetermined data into said common destination data register if and only if each status bit in said selected set of said plurality of status bits has said predetermined state of said condition.

67. The data processing apparatus of claim 66, wherein:
said plurality of status bits stored in said status register consists of a negative status bit, a carry out status bit, an overflow status bit and a zero status bit;
said condition field of said first instruction type consists of 4 bits coding the following sixteen combinations of conditions
1) unconditional, in which the data transfer takes place regardless of the condition of said status bits,
2) positive, in which the data transfer takes place if and only if (Not N AND Not Z) is true,
3) lower than or same, in which the data transfer takes place if and only if (Not C OR Z) is true,
4) higher than, in which the data transfer takes place if and only if (C AND Not Z) is true,
5) less than, in which the data transfer takes place if and only if ((N AND Not V) OR (Not N AND V)) is true,
6) less than or equal to, in which the data transfer takes place if and only if ((N AND Not V) OR (Not N AND V) OR Z) is true,
7) greater than or equal to, in which the data transfer takes place if and only if ((N AND V) OR (Not N AND Not V)) is true,
8) greater than, in which the data transfer takes place if and only if ((N AND V AND Not Z) OR (Not N AND Not V AND Not Z)) is true,
9) lower than, carry, in which the data transfer takes place if and only if C is true,
10) higher than or same, no carry, in which the data transfer takes place if and only if (Not C) is true,
11) equal, zero, in which the data transfer takes place if and only if Z is true,
12) not equal, not zero, in which the data transfer takes place if and only if (Not Z) is true,
13) overflow, in which the data transfer takes place if and only if V is true,
14) no overflow, in which the data transfer takes place if and only if (Not V) is true,
15) negative, in which the data transfer takes place if and only if N is true, and
16) non-negative, in which the data transfer takes place if and only if (Not N) is true,
where: N is the negative status bit; C is the carry out status bit; V is the overflow status bit; and Z is the zero status bit.
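Illustrative note, not part of the claim language: the sixteen conditions listed above can be written as Boolean functions of the four status bits. The sketch below transcribes the expressions exactly as recited. The 4-bit encoding of each condition is not given in the list, so the table is keyed by condition name, and the names `CONDITIONS` and `transfer_takes_place` are hypothetical.

```python
# Each entry maps a condition name to a predicate over (N, C, V, Z).
CONDITIONS = {
    "unconditional": lambda N, C, V, Z: True,
    "positive": lambda N, C, V, Z: (not N) and (not Z),
    "lower than or same": lambda N, C, V, Z: (not C) or Z,
    "higher than": lambda N, C, V, Z: C and (not Z),
    "less than": lambda N, C, V, Z: (N and not V) or (not N and V),
    "less than or equal to": lambda N, C, V, Z: (N and not V) or (not N and V) or Z,
    "greater than or equal to": lambda N, C, V, Z: (N and V) or (not N and not V),
    "greater than": lambda N, C, V, Z: (N and V and not Z) or (not N and not V and not Z),
    "lower than, carry": lambda N, C, V, Z: C,
    "higher than or same, no carry": lambda N, C, V, Z: not C,
    "equal, zero": lambda N, C, V, Z: Z,
    "not equal, not zero": lambda N, C, V, Z: not Z,
    "overflow": lambda N, C, V, Z: V,
    "no overflow": lambda N, C, V, Z: not V,
    "negative": lambda N, C, V, Z: N,
    "non-negative": lambda N, C, V, Z: not N,
}


def transfer_takes_place(name, N, C, V, Z):
    """Return True if the named condition permits the data transfer."""
    return bool(CONDITIONS[name](N, C, V, Z))
```

For example, `transfer_takes_place("greater than", N=False, C=False, V=False, Z=False)` evaluates the second term of condition 8 and returns `True`.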

68. The data processing apparatus of claim 58, further comprising:
a memory for storing data at a plurality of memory locations;
an address unit connected to said instruction decode circuit and said memory for generating an address for accessing said memory;
said first instruction type includes providing said at least one data transfer source field indicating a predetermined operation of said address unit for generating said address of said memory as said source of said predetermined data for said first data transfer; and
said instruction decode circuit in response to said if-then-else instruction
controlling said address generator and said data registers to conditionally transfer predetermined data into said common destination data register by accessing an address via said predetermined operation of said address unit and transferring data stored in a memory location corresponding to said address to said common destination data register.
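Illustrative note, not part of the claim language: when the source of the conditional transfer is a memory location, the address unit first forms an address and the data at that address competes with the ALU result for the destination register. In the sketch below, `base + index` merely stands in for the "predetermined operation" of the address unit. That address arithmetic, the function name, and the dictionary used as memory are assumptions.

```python
def conditional_load(regs, memory, base, index, dest, alu_result, condition_met):
    """Model the memory-sourced form of the conditional data transfer."""
    address = base + index            # hypothetical address generation
    pending = alu_result              # lower write priority: ALU result
    if condition_met:
        pending = memory[address]     # higher write priority: memory data
    regs[dest] = pending
    return regs[dest]
```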

69. The data processing apparatus of claim 58, wherein:
said first instruction type includes said at least one data transfer source field indicating a third source data register as said source of said predetermined data for said first data transfer; and
said instruction decode circuit in response to said if-then-else instruction
controlling said data registers to conditionally transfer data stored in said third source data register to said common destination data register.
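Illustrative note, not part of the claim language: one hypothetical use of a register-sourced conditional transfer is selecting the larger of two values in two instructions. The sequence below is an assumed example, not one taken from the claims. It uses Python integers, so overflow is ignored, and only the negative status bit is modeled.

```python
def max_via_if_then_else(a, b):
    """Return the larger of a and b using a status-setting step and a select."""
    # Instruction 1 (hypothetical): form a - b only to set the status bits.
    diff = a - b
    negative = diff < 0               # N status bit, assuming no overflow

    # Instruction 2 (hypothetical if-then-else):
    # the ALU passes a through (a OR a) with the lower write priority ...
    dest = a | a
    # ... and b is moved into the same register with the higher write
    # priority, but only if the negative condition holds (a < b).
    if negative:
        dest = b
    return dest
```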

70. A data processing system comprising:
a data system bus transferring data and addresses;
a system memory connected to said data system bus, said system memory storing data and transferring data via said data system bus;
a data processor circuit connected to said data system bus, said data processor circuit including
a plurality of data registers;
a status register storing at least one status bit;
an arithmetic logic unit having first and second inputs and an output coupled to said plurality of data registers, said arithmetic logic unit performing an arithmetic or logical operation upon data received at said first and second inputs and generating a result at said output;
a source of instructions including at least one if-then-else instruction having a first instruction type, said first instruction type including
an arithmetic logic unit code field indicating a first arithmetic/logical operation,
a first source field indicating a first source data register from among a plurality of data registers,
a second source field indicating a second source data register from among a plurality of data registers,
a destination field indicating a first destination data register from among a plurality of data registers,
at least one data transfer source field indicating a source of predetermined data for a first data transfer,
a data transfer destination field indicating a second destination data register from among a plurality of data registers,
a condition field indicating a predetermined state of said at least one status bit upon which a data transfer is conditional,
said at least one if-then-else instruction having said destination field indicating said first destination data register as a common destination data register and said data transfer destination field indicating said second destination data register as said common destination data register;
an instruction logic circuit connected to said plurality of data registers, said status register, said arithmetic logic unit and said source of instructions, said instruction logic circuit controlling said plurality of data registers, said status register and said arithmetic logic unit in response to a received instruction, said instruction logic circuit in response to an if-then-else instruction
controlling said arithmetic logic unit to form a first combination of said first and second source data registers corresponding to said specified first arithmetic/logical operation and store a first result in said common destination data register with a first write priority, and
controlling said plurality of data registers to conditionally transfer predetermined data into said common destination data register if said at least one status bit in said status register has said predetermined state with a second write priority, said second write priority being higher than said first write priority whereby said common destination data register stores said predetermined data if said at least one status register has said predetermined state or said first result if said at least one status register does not have said predetermined state.

71. The data processing system of claim 70, wherein:
said data processor circuit wherein
said source of instructions includes a second instruction specifying a second arithmetic logic operation having third and fourth source data registers and a third destination data register; and
said instruction decode logic in response to a second instruction
controlling said arithmetic logic unit to form a second combination of said third and fourth source data registers corresponding to said specified second arithmetic/logical operation and store a second result in said third destination data register with a first write priority, and
setting said at least one status bit in said status register corresponding to said second result.

72. The data processing system of claim 71, wherein:
said data processor circuit wherein
said second instruction
is of said first instruction type and includes a destination field indicating said third destination data register as a second common destination data register, and
specifies an unconditional second data transfer operation of data stored in said second common destination data register into said second common destination data register with said second write priority;
said instruction decode logic in response to a second instruction further
controlling said data registers to transfer data stored in said second common destination data register into said second common destination data register with said second write priority, said second data transfer operation thereby having a higher write priority than said storing of said second result whereby said second common destination data register stores data unchanged by said second instruction.

73. The data processing system of claim 72, wherein:
said data processor circuit wherein
said first instruction type further includes a data transfer conditional field having either a first state indicating a conditional data transfer or a second state indicating an unconditional data transfer;
said if-then-else instruction includes providing said data transfer conditional field in said first state; and
said second instruction employs said first instruction type and provides said data transfer conditional field in said second state.

74. The data processing system of claim 71, wherein:
said data processor circuit wherein
said status register sets a negative status bit when said second result is negative.

75. The data processing system of claim 71, wherein:
said data processor circuit wherein
said status register sets a carry status bit when said second result generates a carry out signal.

76. The data processing system of claim 71, wherein:
said data processor circuit wherein
said status register sets an overflow status bit when said second result generates an overflow at said output.

77. The data processing system of claim 71, wherein:
said data processor circuit wherein
said status register sets a zero status bit when said second result is zero.

78. The data processing system of claim 71, wherein:
said data processor circuit wherein
said status register includes a plurality of status bits;
said instruction decode logic in response to said second instruction setting each of said plurality of status bits in said status register corresponding to said second result;
said if-then-else instruction specifies said condition as a predetermined state of each status bit of a specified set of said plurality of status bits; and
said instruction decode circuit in response to said if-then-else instruction controlling said plurality of data registers to conditionally transfer predetermined data into said common destination data register if and only if each status bit in said selected set of said plurality of status bits has said predetermined state of said condition.

79. The data processing system of claim 78, wherein:
said data processor circuit wherein
said plurality of status bits stored in said status register consists of a negative status bit, a carry out status bit, an overflow status bit and a zero status bit;
said condition field of said first instruction type consists of 4 bits coding the following sixteen combinations of conditions
1) unconditional, in which the data transfer takes place regardless of the condition of said status bits,
2) positive, in which the data transfer takes place if and only if (Not N AND Not Z) is true,
3) lower than or same, in which the data transfer takes place if and only if (Not C OR Z) is true,
4) higher than, in which the data transfer takes place if and only if (C AND Not Z) is true,
5) less than, in which the data transfer takes place if and only if ((N AND Not V) OR (Not N AND V)) is true,
6) less than or equal to, in which the data transfer takes place if and only if ((N AND Not V) OR (Not N AND V) OR Z) is true,
7) greater than or equal to, in which the data transfer takes place if and only if ((N AND V) OR (Not N AND Not V)) is true,
8) greater than, in which the data transfer takes place if and only if ((N AND V AND Not Z) OR (Not N AND Not V AND Not Z)) is true,
9) lower than, carry, in which the data transfer takes place if and only if C is true,
10) higher than or same, no carry, in which the data transfer takes place if and only if (Not C) is true,
11) equal, zero, in which the data transfer takes place if and only if Z is true,
12) not equal, not zero, in which the data transfer takes place if and only if (Not Z) is true,
13) overflow, in which the data transfer takes place if and only if V is true,
14) no overflow, in which the data transfer takes place if and only if (Not V) is true,
15) negative, in which the data transfer takes place if and only if N is true, and
16) non-negative, in which the data transfer takes place if and only if (Not N) is true,
where: N is the negative status bit; C is the carry out status bit; V is the overflow status bit; and Z is the zero status bit.

80. The data processing system of claim 70, wherein:
said data processor circuit further including
a memory for storing data at a plurality of memory locations,
an address unit connected to said instruction decode circuit and said memory for generating an address for accessing said memory, and
said first instruction type includes providing said at least one data transfer source field indicating a predetermined operation of said address unit for generating said address of said memory as said source of said predetermined data for said first data transfer; and
said instruction decode circuit in response to said if-then-else instruction
controlling said address generator and said data registers to conditionally transfer predetermined data into said common destination data register by accessing an address via said predetermined operation of said address unit and transferring data stored in a memory location corresponding to said address to said common destination data register.

81. The data processing system of claim 70, wherein:
said data processor circuit wherein
said first instruction type includes said at least one data transfer source field indicating a third source data register as said source of said predetermined data for said first data transfer; and
said instruction decode circuit in response to said if-then-else instruction
controlling said data registers to conditionally transfer data stored in said third source data register to said common destination data register.

82. The data processing system of claim 70, wherein:
said data processor circuit further includes
a plurality of data memories connected to said data processor circuit,
an instruction memory supplying instructions to said data processor circuit, and
a transfer controller connected to said data system bus, each of said data memories and said instruction memory controlling data transfer between said system memory and said plurality of data memories and between said system memory and said instruction memory.

83. The data processing system of claim 82, wherein:
said data processor circuit further includes
at least one additional data processor circuit identical to said data processor circuit,
a plurality of additional data memories connected to each additional data processor circuit,
an additional instruction memory supplying instructions to each additional data processor circuit, and
said transfer controller is further connected to each of said additional data memories and each said additional instruction memory controlling data transfer between said system memory and said each of said additional data memories and between said system memory and each said additional instruction memory.

84. The data processing system of claim 83, wherein:
said data processor circuit including said data processor circuit, said data memories, said instruction memories, each of said additional data processor circuits, each of said additional data memories, each additional instruction memory and said transfer controller are formed on a single integrated circuit.

85. The data processing system of claim 82, wherein:
said data processor circuit further includes
a master data processor,
a plurality of master data memories connected to said master data processor,
at least one master instruction memory supplying instructions to said master data processor, and
said transfer controller is further connected to each of said master data memories and each said master instruction memory controlling data transfer between said system memory and said each of said master data memories and between said system memory and each said master instruction memory.

86. The data processing system of claim 85, wherein:
said data processor circuit including said data processor circuit, said data memories, said instruction memories, said master data processor, each of said master data memories, each master instruction memory and said transfer controller are formed on a single integrated circuit.

87. The data processor system of claim 70, wherein:
said system memory consists of an image memory storing image data in a plurality of pixels; and
said data processor system further comprising:
an image display unit connected to said image memory generating a visually perceivable output of an image consisting of a plurality of pixels stored in said image memory.

88. The data processor system of claim 87, further comprising:
a palette forming a connection between said image memory and said image display unit, said palette transforming pixels recalled from said image memory into video signals driving said image display unit;
and wherein said data processor circuit further includes
a frame controller connected to said palette controlling said palette transformation of pixels into video signals.

89. The data processor system of claim 70, wherein:
said system memory consists of an image memory storing image data in a plurality of pixels; and
said data processor system further comprising:
a printer connected to said image memory generating a printed output of an image consisting of a plurality of pixels stored in said image memory.

90. The data processor system of claim 89, wherein:
said printer consists of a color printer.

91. The data processor system of claim 89, further comprising:
a printer controller forming a connection between said image memory and said printer, said printer controller transforming pixels recalled from said image memory into print signals driving said printer;
and wherein said data processor circuit further includes
a frame controller connected to said print controller controlling said print controller transformation of pixels into print signals.

92. The data processor system of claim 70, wherein:
said system memory consists of an image memory storing image data in a plurality of pixels; and
said data processor system further comprising:
an imaging device connected to said image memory generating an image signal input.

93. The data processor system of claim 92, further comprising:
an image capture controller forming a connection between said imaging device and said image memory, said image capture controller transforming said image signal into pixels supplied for storage in said image memory;
and wherein said data processor circuit further includes
a frame controller connected to said image capture controller controlling said image capture controller transformation of said image signal into pixels.

94. The data processor system of claim 70, further comprising:
a modem connected to said data system bus and to a communications line.

95. The data processor system of claim 70, further comprising:
a host processing system connected to said data system bus.

96. The data processor system of claim 95, further comprising:
a host system bus connected to said host processing system transferring data and addresses; and
at least one host peripheral connected to said host system bus.

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application relates to improvements in the inventions disclosed in the following copending U.S. patent applications, all of which are assigned to Texas Instruments:

U.S. patent application Ser. No. 08/263,501 filed Jun. 21, 1994 entitled "MULTI-PROCESSOR WITH CROSSBAR LINK OF PROCESSORS AND MEMORIES AND METHOD OF OPERATION", a continuation of U.S. patent application Ser. No. 08/135,754 filed Oct. 12, 1993 and now abandoned, a continuation of U.S. patent application Ser. No. 07/933,865 filed Aug. 21, 1993 and now abandoned, a continuation of U.S. patent application Ser. No. 435,591 filed Nov. 17, 1989 and now abandoned;

U.S. Pat. No. 5,212,777, issued May 18, 1993, filed Nov. 17, 1989 and entitled "SIMD/MIMD RECONFIGURABLE MULTI-PROCESSOR AND METHOD OF OPERATION";

U.S. patent application Ser. No. 08/264,111 filed Jun. 22, 1994 entitled "RECONFIGURABLE COMMUNICATIONS FOR MULTI-PROCESSOR AND METHOD OF OPERATION," a continuation of U.S. patent application Ser. No. 07/895,565 filed Jun. 5, 1992 and now abandoned, a continuation of U.S. patent application Ser. No. 07/437,856 filed Nov. 17, 1989 and now abandoned;

U.S. patent application Ser. No. 08/264,582 filed Jun. 22, 1994 entitled "REDUCED AREA OF CROSSBAR AND METHOD OF OPERATION", a continuation of U.S. patent application Ser. No. 07/437,852 filed Nov. 17, 1989 and now abandoned;

U.S. patent application Ser. No. 08/032,530 filed Mar. 15, 1993 entitled "SYNCHRONIZED MIMD MULTI-PROCESSING SYSTEM AND METHOD OF OPERATION," a continuation of U.S. patent application Ser. No. 07/437,853 filed Nov. 17, 1989 and now abandoned;

U.S. Pat. No. 5,197,140 issued Mar. 23, 1993 filed Nov. 17, 1989 and entitled "SLICED ADDRESSING MULTI-PROCESSOR AND METHOD OF OPERATION";

U.S. Pat. No. 5,339,447 issued Aug. 16, 1994 filed Nov. 17, 1989 entitled "ONES COUNTING CIRCUIT, UTILIZING A MATRIX OF INTERCONNECTED HALF-ADDERS, FOR COUNTING THE NUMBER OF ONES IN A BINARY STRING OF IMAGE DATA";

U.S. Pat. No. 5,239,654 issued Aug. 24, 1993 filed Nov. 17, 1989 and entitled "DUAL MODE SIMD/MIMD PROCESSOR PROVIDING REUSE OF MIMD INSTRUCTION MEMORIES AS DATA MEMORIES WHEN OPERATING IN SIMD MODE";

U.S. patent application Ser. No. 07/911,562 filed Jun. 29, 1992 entitled "IMAGING COMPUTER AND METHOD OF OPERATION", a continuation of U.S. patent application Ser. No. 07/437,854 filed Nov. 17, 1989 and now abandoned; and

U.S. Pat. No. 5,226,125 issued Jul. 6, 1993 filed Nov. 17, 1989 and entitled "SWITCH MATRIX HAVING INTEGRATED CROSSPOINT LOGIC AND METHOD OF OPERATION".

This application is also related to the following concurrently filed U.S. patent applications, which include the same disclosure:

U.S. patent application Ser. No. 08/160,229 "THREE INPUT ARITHMETIC LOGIC UNIT WITH BARREL ROTATOR";

U.S. patent application Ser. No. 08/158,742 "ARITHMETIC LOGIC UNIT HAVING PLURAL INDEPENDENT SECTIONS AND REGISTER STORING RESULTANT INDICATOR BIT FROM EVERY SECTION";

U.S. patent application Ser. No. 08/160,118 "MEMORY STORE FROM A REGISTER PAIR CONDITIONAL";

U.S. patent application Ser. No. 08/324,323 "ITERATIVE DIVISION APPARATUS, SYSTEM AND METHOD FORMING PLURAL QUOTIENT BITS PER ITERATION" a continuation of U.S. application Ser. No. 08/160,115 concurrently filed with this application and now abandoned;

U.S. patent application Ser. No. 08/158,285 "THREE INPUT ARITHMETIC LOGIC UNIT FORMING MIXED ARITHMETIC AND BOOLEAN COMBINATIONS";

U.S. patent application Ser. No. 08/160,119 "METHOD, APPARATUS AND SYSTEM FORMING THE SUM OF DATA IN PLURAL EQUAL SECTIONS OF A SINGLE DATA WORD";

U.S. patent application Ser. No. 08/159,359 "HUFFMAN ENCODING METHOD, CIRCUITS AND SYSTEM EMPLOYING MOST SIGNIFICANT BIT CHANGE FOR SIZE DETECTION";

U.S. patent application Ser. No. 08/160,296 "HUFFMAN DECODING METHOD, CIRCUIT AND SYSTEM EMPLOYING CONDITIONAL SUBTRACTION FOR CONVERSION OF NEGATIVE NUMBERS";

U.S. patent application Ser. No. 08/160,112 "METHOD, APPARATUS AND SYSTEM FOR SUM OF PLURAL ABSOLUTE DIFFERENCES";

U.S. patent application Ser. No. 08/160,120 "ITERATIVE DIVISION APPARATUS, SYSTEM AND METHOD EMPLOYING LEFT MOST ONE'S DETECTION AND LEFT MOST ONE'S DETECTION WITH EXCLUSIVE OR";

U.S. patent application Ser. No. 08/160,114 "ADDRESS GENERATOR EMPLOYING SELECTIVE MERGE OF TWO INDEPENDENT ADDRESSES";

U.S. patent application Ser. No. 08/160,116 "METHOD, APPARATUS AND SYSTEM METHOD FOR CORRELATION";

U.S. patent application Ser. No. 08/160,297 "LONG INSTRUCTION WORD CONTROLLING PLURAL INDEPENDENT PROCESSOR OPERATIONS";

U.S. patent application Ser. No. 08/159,346 "ROTATION REGISTER FOR ORTHOGONAL DATA TRANSFORMATION";

U.S. patent application Ser. No. 08/159,652 "MEDIAN FILTER METHOD, CIRCUIT AND SYSTEM";

U.S. patent application Ser. No. 08/159,344 "ARITHMETIC LOGIC UNIT WITH CONDITIONAL REGISTER SOURCE SELECTION";

U.S. patent application Ser. No. 08/160,301 "APPARATUS, SYSTEM AND METHOD FOR DIVISION BY ITERATION";

U.S. patent application Ser. No. 08/159,650 "MULTIPLY ROUNDING USING REDUNDANT CODED MULTIPLY RESULT";

U.S. patent application Ser. No. 08/159,349 "SPLIT MULTIPLY OPERATION";

U.S. patent application Ser. No. 08/158,741 "MIXED CONDITION TEST CONDITIONAL AND BRANCH OPERATIONS INCLUDING CONDITIONAL TEST FOR ZERO";

U.S. patent application Ser. No. 08/160,302 "PACKED WORD PAIR MULTIPLY OPERATION";

U.S. patent application Ser. No. 08/160,573 "THREE INPUT ARITHMETIC LOGIC UNIT WITH SHIFTER";

U.S. patent application Ser. No. 08/159,282 "THREE INPUT ARITHMETIC LOGIC UNIT WITH MASK GENERATOR";

U.S. patent application Ser. No. 08/160,111 "THREE INPUT ARITHMETIC LOGIC UNIT WITH BARREL ROTATOR AND MASK GENERATOR";

U.S. patent application Ser. No. 08/160,298 "THREE INPUT ARITHMETIC LOGIC UNIT WITH SHIFTER AND MASK GENERATOR";

U.S. patent application Ser. No. 08/159,345 "THREE INPUT ARITHMETIC LOGIC UNIT FORMING THE SUM OF A FIRST INPUT ADDED WITH A FIRST BOOLEAN COMBINATION OF A SECOND INPUT AND THIRD INPUT PLUS A SECOND BOOLEAN COMBINATION OF THE SECOND AND THIRD INPUTS";

U.S. patent application Ser. No. 08/160,113 "THREE INPUT ARITHMETIC LOGIC UNIT FORMING THE SUM OF FIRST BOOLEAN COMBINATION OF FIRST, SECOND AND THIRD INPUTS PLUS A SECOND BOOLEAN COMBINATION OF FIRST, SECOND AND THIRD INPUTS";

U.S. patent application Ser. No. 08/159,640 "THREE INPUT ARITHMETIC LOGIC UNIT EMPLOYING CARRY PROPAGATE LOGIC".

TECHNICAL FIELD OF THE INVENTION

The technical field of this invention is the field of digital data processing and more particularly microprocessor circuits, architectures and methods for digital data processing especially digital image/graphics processing.

BACKGROUND OF THE INVENTION

This invention relates to the field of computer graphics and in particular to bit mapped graphics. In bit mapped graphics computer memory stores data for each individual picture element or pixel of an image at memory locations that correspond to the location of that pixel within the image. This image may be an image to be displayed or a captured image to be manipulated, stored, displayed or retransmitted. The field of bit mapped computer graphics has benefited greatly from the lowered cost and increased capacity of dynamic random access memory (DRAM) and the lowered cost and increased processing power of microprocessors. These advantageous changes in the cost and performance of component parts enable larger and more complex computer image systems to be economically feasible.

The field of bit mapped graphics has undergone several stages in evolution of the types of processing used for image data manipulation. Initially a computer system supporting bit mapped graphics employed the system processor for all bit mapped operations. This type of system suffered several drawbacks. First, the computer system processor was not particularly designed for handling bit mapped graphics. Design choices that are very reasonable for general purpose computing are unsuitable for bit mapped graphics systems. Consequently some routine graphics tasks operated slowly. In addition, it was quickly discovered that the processing needed for image manipulation of bit mapped graphics was so loading the computational capacity of the system processor that other operations were also slowed.

The next step in the evolution of bit mapped graphics processing was dedicated hardware graphics controllers. These devices can draw simple figures, such as lines, ellipses and circles, under the control of the system processor. Many of these devices can also do pixel block transfers (PixBlt). A pixel block transfer is a memory move operation of image data from one portion of memory to another. A pixel block transfer is useful for rendering standard image elements, such as alphanumeric characters in a particular type font, within a display by transfer from nondisplayed memory to bit mapped display memory. This function can also be used for tiling by transferring the same small image to the whole of bit mapped display memory. The built-in algorithms for performing some of the most frequently used graphics functions provide a way of improving system performance. However, a useful graphics computer system often requires many functions besides those few that are implemented in such a hardware graphics controller. These additional functions must be implemented in software by the system processor. Typically these hardware graphics controllers allow the system processor only limited access to the bit map memory, thereby limiting the degree to which system software can augment the fixed set of functions of the hardware graphics controller.

The graphics system processor represents yet a further step in the evolution of bit mapped graphics processing. A graphics system processor is a programmable device that has all the attributes of a microprocessor and also includes special functions for bit mapped graphics. The TMS34010 and TMS34020 graphics system processors manufactured by Texas Instruments Incorporated represent this class of devices. These graphics system processors respond to a stored program in the same manner as a microprocessor and include the capability of data manipulation via an arithmetic logic unit, data storage in register files and control of both program flow and external data memory. In addition, these devices include special purpose graphics manipulation hardware that operates under program control. Additional instructions within the instruction set of these graphics system processors control the special purpose graphics hardware. These instructions and the hardware that supports them are selected to perform base level graphics functions that are useful in many contexts. Thus a graphics system processor can be programmed for many differing graphics applications using algorithms selected for the particular problem. This provides an increase in usefulness similar to that provided by changing from hardware controllers to programmed microprocessors. Because such graphics system processors are programmable devices in the same manner as microprocessors, they can operate as stand alone graphics processors, graphics co-processors slaved to a system processor or tightly coupled graphics controllers.

New applications are driving the desire to provide more powerful graphics functions. Several fields require more cost effective graphics operations to be economically feasible. These include video conferencing, multi-media computing with full motion video, high definition television, color facsimile and digital photography. Each of these fields presents unique problems, but image data compression and decompression are common themes. The amount of transmission bandwidth and the amount of storage capacity required for images and particularly full motion video is enormous. Without efficient video compression and decompression that result in acceptable final image quality, these applications will be limited by the costs associated with transmission bandwidth and storage capacity. There is also a need in the art for a single system that can support both image processing functions such as image recognition and graphics functions such as display control.

SUMMARY OF THE INVENTION

This invention permits a data processing apparatus to execute an if, then, else operation in a single instruction cycle. The data processing apparatus includes pipelined data unit and address unit operations. The data unit writes results to an instruction designated data register. The address unit controls data movement between memory and the data registers, and may also control register to register data movement. The address unit data move operation has a higher write priority than the storing of the data unit result. Thus if the destination registers are the same, only the data move operation occurs.

The write priority permits the if, then, else operation. Due to the pipelined arithmetic logic unit and address unit operations, a single instruction of the data processing apparatus may specify both a data unit operation and an address unit operation. The data unit includes an arithmetic logic unit that performs an unconditional operation with the result to be stored in a destination register specified in the instruction. This executes the "else" part of the if, then, else operation. The address unit sets the address for a data move operation to the same destination register. The data move operation is conditional upon the if condition. In the preferred embodiment, this condition is set by the instruction and is based upon a set of status bits in a status register. If the status bits match the instruction specified condition, then the conditional data move operation takes place. This data move operation, having a higher write priority than the data unit operation, controls the data written into the destination register. If the status bits do not match the condition specified in the instruction, then the conditional data move does not take place. In this event the results of the arithmetic logic unit operation are stored in the destination register. Thus the alternative or "else" result is stored in the destination register.
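The effect of write priority described above can be modeled in a few lines. The following Python sketch is illustrative only: the function name `execute_cycle`, the dictionary used as a register file and the register name `D0` are assumptions made for this example and are not part of the disclosed apparatus.

```python
def execute_cycle(regs, dest, alu_result, move_value, condition_met):
    """Model one instruction cycle with two pending writes to `dest`.

    The data unit always produces `alu_result` (the "else" value).
    The address unit data move supplies `move_value` (the "then" value)
    only when `condition_met` is true. Because the move has the higher
    write priority, it decides the final register contents when both
    writes target the same destination register.
    """
    new_regs = dict(regs)          # leave the caller's register file untouched
    new_regs[dest] = alu_result    # lower priority: unconditional data unit write
    if condition_met:
        new_regs[dest] = move_value  # higher priority: conditional data move
    return new_regs
```

Calling `execute_cycle({"D0": 0}, "D0", 5, 9, True)` leaves 9 in `D0` (the "then" value), while passing `False` as the last argument leaves 5 (the "else" value).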

In the preferred embodiment the status register includes a plurality of bits. In the preferred embodiment a prior instruction sets the status bits to a particular state based upon the arithmetic logic unit result. This state of the status register bits is the basis for the conditional operation. This prior instruction can use write priority to preserve the status of the destination register of this instruction. An unconditional register to register move of the destination register to itself overrules the storing of the result of the arithmetic logic unit. Thus the prior instruction can perform an arithmetic or logical combination, set the status bits correspondingly and not change the status of any register. The status bits preferably include: a negative status bit set when the arithmetic logic unit result is negative; a carry status bit set when the arithmetic logic unit result generates a carry out signal; an overflow status bit set when the arithmetic logic unit result generates an overflow; and a zero status bit set when the arithmetic logic unit result is zero. The condition may be based upon a single status bit or a combination of status bits as set by the instruction.
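As a rough illustration of how the four status bits could be derived from an arithmetic logic unit result, the Python sketch below computes them for an addition. The 32 bit operand width, the function name `add_with_status` and the dictionary keys `N`, `C`, `V` and `Z` are assumptions chosen for this example; the sketch does not describe the actual status register circuit.

```python
MASK32 = 0xFFFFFFFF  # assumed 32 bit arithmetic logic unit width


def add_with_status(a, b):
    """Add two 32 bit values and return (result, status bits)."""
    a &= MASK32
    b &= MASK32
    full = a + b                  # keeps the carry out in bit 32
    result = full & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    status = {
        "N": sign_r == 1,                              # result is negative
        "C": full > MASK32,                            # carry out generated
        "V": sign_a == sign_b and sign_r != sign_a,    # signed overflow
        "Z": result == 0,                              # result is zero
    }
    return result, status
```

A later conditional operation could then test a single bit, such as `status["Z"]`, or a combination of bits, such as `status["N"] != status["V"]` for a signed less-than comparison following a subtraction.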

In the preferred embodiment of this invention, the data unit including data registers, the arithmetic logic unit and the status register, and the address unit are embodied in at least one digital image/graphics processor as a part of a multiprocessor formed in a single integrated circuit used in image processing.

BRIEF DESCRIPTION OF THE FIGURES

These and other aspects of the present invention are described below together with the Figures, in which:

FIG. 1 illustrates the system architecture of an image processing system such as would employ this invention;

FIG. 2 illustrates the architecture of a single integrated circuit multiprocessor that forms the preferred embodiment of this invention;

FIG. 3 illustrates in block diagram form one of the digital image/graphics processors illustrated in FIG. 2;

FIG. 4 illustrates in schematic form the pipeline stages of operation of the digital image/graphics processor illustrated in FIG. 2;

FIG. 5 illustrates in block diagram form the data unit of the digital image/graphics processors illustrated in FIG. 3;

FIG. 6 illustrates in schematic form field definitions of the status register of the data unit illustrated in FIG. 5;

FIG. 7 illustrates in block diagram form the manner of splitting the arithmetic logic unit of the data unit illustrated in FIG. 5;

FIG. 8 illustrates in block diagram form the manner of addressing the data register of the data unit illustrated in FIG. 5 as a rotation register;

FIG. 9 illustrates in schematic form the field definitions of the first data register of the data unit illustrated in FIG. 5;

FIG. 10a illustrates in schematic form the data input format for 16 bit by 16 bit signed multiplication operands;

FIG. 10b illustrates in schematic form the data output format for 16 bit by 16 bit signed multiplication results;

FIG. 10c illustrates in schematic form the data input format for 16 bit by 16 bit unsigned multiplication operands;

FIG. 10d illustrates in schematic form the data output format for 16 bit by 16 bit unsigned multiplication results;

FIG. 11a illustrates in schematic form the data input format for dual 8 bit by 8 bit signed multiplication operands;

FIG. 11b illustrates in schematic form the data input format for dual 8 bit by 8 bit unsigned multiplication operands;

FIG. 11c illustrates in schematic form the data output format for dual 8 bit by 8 bit signed multiplication results;

FIG. 11d illustrates in schematic form the data output format for dual 8 bit by 8 bit unsigned multiplication results;

FIG. 12 illustrates in block diagram form the multiplier illustrated in FIG. 5;

FIG. 13 illustrates in schematic form generation of Booth quads for the first operand in 16 bit by 16 bit multiplication;

FIG. 14 illustrates in schematic form generation of Booth quads for dual first operands in 8 bit by 8 bit multiplication;

FIG. 15a illustrates in schematic form the second operand supplied to the partial product generators illustrated in FIG. 12 in 16 bit by 16 bit unsigned multiplication;

FIG. 15b illustrates in schematic form the second operand supplied to the partial product generators illustrated in FIG. 12 in 16 bit by 16 bit signed multiplication;

FIG. 16a illustrates in schematic form the second operand supplied to the first three partial product generators illustrated in FIG. 12 in dual 8 bit by 8 bit unsigned multiplication;

FIG. 16b illustrates in schematic form the second operand supplied to the first three partial product generators illustrated in FIG. 12 in dual 8 bit by 8 bit signed multiplication;

FIG. 16c illustrates in schematic form the second operand supplied to the second three partial product generators illustrated in FIG. 12 in dual 8 bit by 8 bit unsigned multiplication;

FIG. 16d illustrates in schematic form the second operand supplied to the second three partial product generators illustrated in FIG. 12 in dual 8 bit by 8 bit signed multiplication;

FIG. 17a illustrates in schematic form the output mapping for 16 bit by 16 bit multiplication;

FIG. 17b illustrates in schematic form the output mapping for dual 8 bit by 8 bit multiplication;

FIG. 18 illustrates in block diagram form the details of the construction of the rounding adder 226 illustrated in FIG. 5;

FIG. 19 illustrates in block diagram form the construction of one bit circuit of the arithmetic logic unit of the data unit illustrated in FIG. 5;

FIG. 20 illustrates in schematic form the construction of the resultant logic and carry out logic of the bit circuit illustrated in FIG. 19;

FIG. 21 illustrates in schematic form the construction of the Boolean function generator of the bit circuit illustrated in FIG. 19;

FIG. 22 illustrates in block diagram form the function signal selector of the function signal generator of the data unit illustrated in FIG. 5;

FIG. 23 illustrates in block diagram form the function signal modifier portion of the function signal generator of the data unit illustrated in FIG. 5;

FIG. 24 illustrates in block diagram form the bit 0 carry-in generator of the data unit illustrated in FIG. 5;

FIG. 25 illustrates in block diagram form a conceptual view of the arithmetic logic unit illustrated in FIGS. 19 and 20;

FIG. 26 illustrates in block diagram form a conceptual view of an alternative embodiment of the arithmetic logic unit;

FIG. 27 illustrates in block diagram form the address unit of the digital image/graphics processor illustrated in FIG. 3;

FIG. 28 illustrates in block diagram form an example of a global or a local address unit of the address unit illustrated in FIG. 27;

FIG. 29a illustrates the order of data bytes according to the little endian mode;

FIG. 29b illustrates the order of data bytes according to the big endian mode;

FIG. 30 illustrates a circuit for data selection, data alignment and sign or zero extension in each data port of a digital image/graphics processor;

FIG. 31 illustrates in block diagram form the program flow control unit of the digital image/graphics processors illustrated in FIG. 3;

FIG. 32 illustrates in schematic form the field definitions of the program counter of the program flow control unit illustrated in FIG. 31;

FIG. 33 illustrates in schematic form the field definitions of the instruction pointer-address stage register of the program flow control unit illustrated in FIG. 31;

FIG. 34 illustrates in schematic form the field definitions of the instruction pointer-return from subroutine register of the program flow control unit illustrated in FIG. 31;

FIG. 35 illustrates in schematic form the field definitions of the cache tag registers of the program flow control unit illustrated in FIG. 31;

FIG. 36 illustrates in schematic form the field definitions of the loop logic control register of the program flow control unit illustrated in FIG. 31;

FIG. 37 illustrates in block diagram form the loop logic circuit of the program flow control unit;

FIG. 38 illustrates in flow chart form a program example of a single program loop with multiple loop ends;

FIG. 39 illustrates the overlapping pipeline stages in an example of a software branch from a single instruction hardware loop;

FIG. 40 illustrates in schematic form the field definitions of the interrupt enable register and the interrupt flag register of the program flow control unit illustrated in FIG. 31;

FIG. 41 illustrates in schematic form the field definitions of a command word transmitted between processors of the single integrated circuit multiprocessor illustrated in FIG. 2;

FIG. 42 illustrates in schematic form the field definitions of the communications register of the program flow control unit illustrated in FIG. 31;

FIG. 43 illustrates in schematic form the instruction word controlling the operation of the digital image/graphics processor illustrated in FIG. 3;

FIG. 44 illustrates in schematic form data flow within the data unit during execution of a divide iteration instruction;

FIG. 45 illustrates in flow chart form the use of a left most one's function in a division algorithm;

FIG. 46 illustrates in flow chart form the use of a left most one's function and an exclusive OR in a division algorithm;

FIG. 47 illustrates in schematic form the data flow during an example sum of absolute value of differences algorithm;

FIGS. 48a, 48b, 48c, 48d and 48e illustrate in schematic form a median filter algorithm;

FIG. 49 illustrates the overlapping pipeline stages in an example of a single instruction hardware loop with a conditional hardware branch;

FIG. 50 illustrates in schematic form a hardware divider that generates two bits of the desired quotient per divide iteration;

FIG. 51 illustrates in schematic form the data flow within the hardware divider illustrated in FIG. 50;

FIG. 52 illustrates in schematic form a hardware divider that generates three bits of the desired quotient per divide iteration;

FIG. 53 illustrates in schematic form the data flow within the hardware divider illustrated in FIG. 52; and

FIG. 54 illustrates in schematic form the multiprocessor integrated circuit of this invention having a single digital image/graphics processor in a color facsimile system.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

FIG. 1 is a block diagram of an image data processing system including a multiprocessor integrated circuit constructed for image and graphics processing according to this invention. This data processing system includes a host processing system 1. Host processing system 1 provides the data processing for the host system of the image data processing system of FIG. 1. Included in the host processing system 1 are a processor, at least one input device, a long term storage device, a read only memory, a random access memory and at least one host peripheral 2 coupled to a host system bus. Arrangement and operation of the host processing system are considered conventional. Because of its processing functions, the host processing system 1 controls the function of the image data processing system.

Multiprocessor integrated circuit 100 provides most of the data processing including data manipulation and computation for image operations of the image data processing system of FIG. 1. Multiprocessor integrated circuit 100 is bi-directionally coupled to an image system bus and communicates with host processing system 1 by way of this image system bus. In the arrangement of FIG. 1, multiprocessor integrated circuit 100 operates independently from the host processing system 1. The multiprocessor integrated circuit 100, however, is responsive to host processing system 1.

FIG. 1 illustrates two image systems. Imaging device 3 represents a document scanner, charge coupled device scanner or video camera that serves as an image input device. Imaging device 3 supplies this image to image capture controller 4, which serves to digitize the image and form it into raster scan frames. This frame capture process is controlled by signals from multiprocessor integrated circuit 100. The thus formed image frames are stored in video random access memory 5. Video random access memory 5 may be accessed via the image system bus permitting data transfer for image processing by multiprocessor integrated circuit 100.

The second image system drives a video display. Multiprocessor integrated circuit 100 communicates with video random access memory 6 for specification of a displayed image via a pixel map. Multiprocessor integrated circuit 100 controls the image data stored in video random access memory 6 via the image system bus. Data corresponding to this image is recalled from video random access memory 6 and supplied to video palette 7. Video palette 7 may transform this recalled data into another color space, expand the number of bits per pixel and the like. This conversion may be accomplished through a look-up table. Video palette 7 also generates the proper video signals to drive video display 8. If these video signals are analog signals, then video palette 7 includes suitable digital to analog conversion. The video level signal output from the video palette 7 may include color, saturation, and brightness information. Multiprocessor integrated circuit 100 controls data stored within the video palette 7, thus controlling the data transformation process and the timing of image frames. Multiprocessor integrated circuit 100 can control the line length and the number of lines per frame of the video display image, the synchronization, retrace, and blanking signals through control of video palette 7. Significantly, multiprocessor integrated circuit 100 determines and controls where graphic display information is stored in the video random access memory 6. Subsequently, during readout from the video random access memory 6, multiprocessor integrated circuit 100 determines the readout sequence from the video random access memory 6, the addresses to be accessed, and control information needed to produce the desired graphic image on video display 8.
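The look-up table conversion mentioned above can be pictured as a simple indexed mapping. The Python sketch below is a minimal illustration under assumed values: the function name `apply_palette`, the 2 bit pixel indices and the 24 bit (red, green, blue) entries are hypothetical and are not taken from this description.

```python
def apply_palette(pixels, palette):
    """Map each stored pixel value to the palette entry it selects.

    `pixels` is a sequence of small integer indices recalled from video
    memory; `palette` is the look-up table, indexed by pixel value. The
    output has more bits per pixel than the input.
    """
    return [palette[p] for p in pixels]


# Hypothetical table: 2 bit indices expanded to (R, G, B) triples.
EXAMPLE_PALETTE = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (255, 255, 255)]
```

Because the processor controls the data stored in the table, rewriting `EXAMPLE_PALETTE` changes the displayed colors without touching the pixel map itself.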

Video display 8 produces the specified video display for viewing by the user. There are two widely used techniques. The first technique specifies video data in terms of color, hue, brightness, and saturation for each pixel. For the second technique, color levels of red, blue and green are specified for each pixel. Video palette 7 and video display 8 are designed and fabricated to be compatible with the selected technique.

FIG. 1 illustrates an additional memory 9 coupled to the image system bus. This additional memory may include additional video random access memory, dynamic random access memory, static random access memory or read only memory. Multiprocessor integrated circuit 100 may be controlled either wholly or partially by a program stored in the memory 9. This memory 9 may also store various types of graphic image data. In addition, multiprocessor integrated circuit 100 preferably includes memory interface circuits for video random access memory, dynamic random access memory and static random access memory. Thus a system could be constructed using multiprocessor integrated circuit 100 without any video random access memory 5 or 6.

FIG. 1 illustrates transceiver 16. Transceiver 16 provides translation and bidirectional communication between the image system bus and a communications channel. One example of a system employing transceiver 16 is video conferencing. The image data processing system illustrated in FIG. 1 employs imaging device 3 and image capture controller 4 to form a video image of persons at a first location. Multiprocessor integrated circuit 100 provides video compression and transmits the compressed video signal to a similar image data processing system at another location via transceiver 16 and the communications channel. Transceiver 16 receives a similarly compressed video signal from the remote image data processing system via the communications channel. Multiprocessor integrated circuit 100 decompresses this received signal and controls video random access memory 6 and video palette 7 to display the corresponding decompressed video signal on video display 8. Note this is not the only example where the image data processing system employs transceiver 16. Also note that the bidirectional communications need not be the same type signals. For example, in an interactive cable television system the cable system head end would transmit compressed video signals to the image data processing system via the communications channel. The image data processing system could transmit control and data signals back to the cable system head end via transceiver 16 and the communications channel.

FIG. 1 illustrates multiprocessor integrated circuit 100 embodied in a system including host processing system 1. Those skilled in the art would realize from the following disclosure of the invention that multiprocessor integrated circuit 100 may be employed as the only processor of a useful system. In such a system multiprocessor integrated circuit 100 is programmed to perform all the functions of the system.

This invention is particularly useful in a processor used for image processing. According to the preferred embodiment, this invention is embodied in multiprocessor integrated circuit 100. This preferred embodiment includes plural identical processors that embody this invention. Each of these processors will be called a digital image/graphics processor. This description is a matter of convenience only. The processor embodying this invention can be a processor separately fabricated on a single integrated circuit or a plurality of integrated circuits. If embodied on a single integrated circuit, this single integrated circuit may optionally also include read only memory and random access memory used by the digital image/graphics processor.

FIG. 2 illustrates the architecture of the multiprocessor integrated circuit 100 of the preferred embodiment of this invention. Multiprocessor integrated circuit 100 includes: two random access memories 10 and 20, each of which is divided into plural sections; crossbar 50; master processor 60; digital image/graphics processors 71, 72, 73 and 74; transfer controller 80, which mediates access to system memory; and frame controller 90, which can control access to independent first and second image memories. Multiprocessor integrated circuit 100 provides a high degree of operation parallelism, which will be useful in image processing and graphics operations, such as in multi-media computing.

Multiprocessor integrated circuit 100 includes two random access memories. Random access memory 10 is primarily devoted to master processor 60. It includes two instruction cache memories 11 and 12, two data cache memories 13 and 14 and a parameter memory 15. These memory sections can be physically identical, but connected and used differently. Random access memory 20 may be accessed by master processor 60 and each of the digital image/graphics processors 71, 72, 73 and 74. Each digital image/graphics processor 71, 72, 73 and 74 has five corresponding memory sections. These include an instruction cache memory, three data memories and one parameter memory. Thus digital image/graphics processor 71 has corresponding instruction cache memory 21, data memories 22, 23, 24 and parameter memory 25; digital image/graphics processor 72 has corresponding instruction cache memory 26, data memories 27, 28, 29 and parameter memory 30; digital image/graphics processor 73 has corresponding instruction cache memory 31, data memories 32, 33, 34 and parameter memory 35; and digital image/graphics processor 74 has corresponding instruction cache memory 36, data memories 37, 38, 39 and parameter memory 40. Like the sections of random access memory 10, these memory sections can be physically identical but connected and used differently. Each of these memory sections of memories 10 and 20 preferably includes 2K bytes, with a total memory within multiprocessor integrated circuit 100 of 50K bytes.
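The section assignments listed above can be tabulated to confirm the stated totals. The Python sketch below is only a bookkeeping aid: the names `SECTION_BYTES`, `build_section_table` and `total_bytes` are assumptions for this example, while the reference numerals and roles follow the text.

```python
SECTION_BYTES = 2 * 1024  # each memory section preferably includes 2K bytes


def build_section_table():
    """Return {reference numeral: (owner, role)} for every memory section."""
    table = {
        11: ("master processor 60", "instruction cache"),
        12: ("master processor 60", "instruction cache"),
        13: ("master processor 60", "data cache"),
        14: ("master processor 60", "data cache"),
        15: ("master processor 60", "parameter memory"),
    }
    roles = ["instruction cache", "data memory", "data memory",
             "data memory", "parameter memory"]
    for index, processor in enumerate([71, 72, 73, 74]):
        first = 21 + 5 * index  # sections start at 21, 26, 31 and 36
        owner = "digital image/graphics processor %d" % processor
        for offset, role in enumerate(roles):
            table[first + offset] = (owner, role)
    return table


def total_bytes(table):
    """Total on-chip memory implied by the section count."""
    return len(table) * SECTION_BYTES
```

The table holds 5 sections for random access memory 10 and 20 sections for random access memory 20, so `total_bytes(build_section_table())` gives 25 × 2K = 50K bytes, matching the total stated above.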

Multiprocessor integrated circuit 100 is constructed to provide a high rate of data transfer between processors and memory using plural independent parallel data transfers. Crossbar 50 enables these data transfers. Each digital image/graphics processor 71, 72, 73 and 74 has three memory ports that may operate simultaneously each cycle. An instruction port (I) may fetch 64 bit data words from the corresponding instruction cache. A local data port (L) may read a 32 bit data word from or write a 32 bit data word into the data memories or the parameter memory corresponding to that digital image/graphics processor. A global data port (G) may read a 32 bit data word from or write a 32 bit data word into any of the data memories or the parameter memories of random access memory 20. Master processor 60 includes two memory ports. An instruction port (I) may fetch a 32 bit instruction word from either of the instruction caches 11 and 12. A data port (C) may read a 32 bit data word from or write a 32 bit data word into data caches 13 or 14, parameter memory 15 of random access memory 10 or any of the data memories or the parameter memories of random access memory 20. Transfer controller 80 can access any of the sections of random access memory 10 or 20 via data port (C). Thus fifteen parallel memory accesses may be requested at any single memory cycle. Random access memories 10 and 20 are divided into 25 memories in order to support so many parallel accesses.
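The figure of fifteen parallel accesses follows directly from the port counts given above. The short Python sketch below tallies them; the dictionary `PORTS` and the function `max_parallel_requests` are names assumed for this example.

```python
# Memory ports per requester, as enumerated in the description.
PORTS = {
    "digital image/graphics processor 71": ["I", "L", "G"],
    "digital image/graphics processor 72": ["I", "L", "G"],
    "digital image/graphics processor 73": ["I", "L", "G"],
    "digital image/graphics processor 74": ["I", "L", "G"],
    "master processor 60": ["I", "C"],
    "transfer controller 80": ["C"],
}


def max_parallel_requests(ports):
    """Count the memory accesses that may be requested in one memory cycle."""
    return sum(len(port_list) for port_list in ports.values())
```

Four processors with three ports each contribute 12 requests, the master processor adds 2 and the transfer controller adds 1, for a total of 15.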

Crossbar 50 controls the connections of master processor 60, digital image/graphics processors 71, 72, 73 and 74, and transfer controller 80 with memories 10 and 20. Crossbar 50 includes a plurality of crosspoints 51 disposed in rows and columns. Each column of crosspoints 51 corresponds to a single memory section and a corresponding range of addresses. A processor requests access to one of the memory sections through the most significant bits of an address output by that processor. This address output by the processor travels along a row. The crosspoint 51 corresponding to the memory section having that address responds either by granting or denying access to the memory section. If no other processor has requested access to that memory section during the current memory cycle, then the crosspoint 51 grants access by coupling the row and column. This supplies the address to the memory section. The memory section responds by permitting data access at that address. This data access may be either a data read operation or a data write operation.
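The address decoding performed by a column of crosspoints can be sketched as follows. This Python example assumes, purely for illustration, that the memory sections occupy consecutive 2K byte address ranges starting at address zero, so that the upper address bits form a section index; the actual address map is not given here, and the names `decode` and `crosspoint_grant` are hypothetical.

```python
SECTION_BYTES = 2 * 1024
OFFSET_BITS = 11  # log2(2048); assumes sections are aligned on 2K boundaries


def decode(address):
    """Split an address into (section index, offset within that section)."""
    return address >> OFFSET_BITS, address & (SECTION_BYTES - 1)


def crosspoint_grant(requests):
    """Grant a section to a processor only if it is the sole requester.

    `requests` maps a processor name to the address it outputs this
    memory cycle. Sections with competing requests are omitted from the
    result; resolving such conflicts is a separate arbitration step.
    """
    by_section = {}
    for processor, address in requests.items():
        section, _ = decode(address)
        by_section.setdefault(section, []).append(processor)
    return {section: names[0]
            for section, names in by_section.items() if len(names) == 1}
```

For example, `decode(0x0805)` selects section 1 at offset 5 under the assumed layout.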

If more than one processor requests access to the same memory section simultaneously, then crossbar 50 grants access to only one of the requesting processors. The crosspoints 51 in each column of crossbar 50 communicate and grant access based upon a priority hierarchy. If two requests for access having the same rank occur simultaneously, then crossbar 50 grants access on a round robin basis, with the processor last granted access having the lowest priority. Each granted access lasts as long as needed to service the request. The processors may change their addresses every memory cycle, so crossbar 50 can change the interconnection between the processors and the memory sections on a cycle by cycle basis.
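One possible reading of this arbitration rule is sketched below in Python. The rank values, the processor names and the use of a grant history list to implement the round robin are all assumptions made for illustration; the description does not specify the priority hierarchy or the circuit that tracks the last grant.

```python
def arbitrate(requesters, rank, grant_history):
    """Pick one requester for a contended memory section.

    requesters:    names requesting the section this cycle.
    rank:          name -> priority rank; a lower number wins (assumed).
    grant_history: names ordered from least to most recently granted.
    """
    best = min(rank[name] for name in requesters)
    tied = [name for name in requesters if rank[name] == best]

    def recency(name):
        # Never-granted requesters sort ahead of any previously granted one.
        return grant_history.index(name) if name in grant_history else -1

    # Among equal ranks the most recently granted requester has lowest priority.
    return min(tied, key=recency)


def record_grant(grant_history, winner):
    """Return a new history with `winner` moved to the most recent position."""
    return [name for name in grant_history if name != winner] + [winner]
```

With two equally ranked requesters, the first grant goes to one of them and the next contended cycle goes to the other, which is the round robin behavior described above.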

Master processor 60 preferably performs the major control functions for multiprocessor integrated circuit 100. Master processor 60 is preferably a 32 bit reduced instruction set computer (RISC) processor including a hardware floating point calculation unit. According to the RISC architecture, all accesses to memory are performed with load and store instructions and most integer and logical operations are performed on registers in a single cycle. The floating point calculation unit, however, will generally take several cycles to perform operations when employing the same register file as used by the integer and logical unit. A register score board ensures that correct register access sequences are maintained. The RISC architecture is suitable for control functions in image processing. The floating point calculation unit permits rapid computation of image rotation functions, which may be important to image processing.

Master processor 60 fetches instruction words from instruction cache memory 11 or instruction cache memory 12. Likewise, master processor 60 fetches data from either data cache 13 or data cache 14. Since each memory section includes 2K bytes of memory, there are 4K bytes of instruction cache and 4K bytes of data cache. Cache control is an integral function of master processor 60. As previously mentioned, master processor 60 may also access other memory sections via crossbar 50.

The four digital image/graphics processors 71, 72, 73 and 74 each have a highly parallel digital signal processor (DSP) architecture. FIG. 3 illustrates an overview of exemplary digital image/graphics processor 71, which is identical to digital image/graphics processors 72, 73 and 74. Digital image/graphics processor 71 achieves a high degree of parallelism of operation employing three separate units: data unit 110; address unit 120; and program flow control unit 130. These three units operate simultaneously on different instructions in an instruction pipeline. In addition each of these units contains internal parallelism.

The digital image/graphics processors 71, 72, 73 and 74 can execute independent instruction streams in the multiple instruction multiple data (MIMD) mode. In the MIMD mode, each digital image/graphics processor executes an individual program from its corresponding instruction cache, which may be independent or cooperative. In the latter case crossbar 50 enables inter-processor communication in combination with the shared memory. Digital image/graphics processors 71, 72, 73 and 74 may also operate in a synchronized MIMD mode. In the synchronized MIMD mode, the program flow control unit 130 of each digital image/graphics processor inhibits fetching the next instruction until all synchronized processors are ready to proceed. This synchronized MIMD mode allows the separate programs of the digital image/graphics processors to be executed in lock step in a closely coupled operation.
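The lock step behavior of the synchronized MIMD mode can be illustrated with a small timing sketch. The function name `lockstep_fetch_cycles`, the per-instruction stall counts, and the one-cycle base cost are hypothetical values chosen only to show that every synchronized processor waits for the slowest member of the group; they are not taken from the specification.

```python
def lockstep_fetch_cycles(stalls):
    """Return the cycle at which each instruction is fetched by the group.

    stalls maps a processor label to a list giving, per instruction, how
    many extra cycles that processor needs before it is ready to proceed
    (hypothetical figures). Fetch of instruction i is inhibited on every
    processor until all synchronized processors are ready.
    """
    count = min(len(s) for s in stalls.values())
    cycle = 0
    fetch_cycles = []
    for i in range(count):
        # One base cycle plus the stall of the slowest processor.
        cycle += 1 + max(s[i] for s in stalls.values())
        fetch_cycles.append(cycle)
    return fetch_cycles


schedule = lockstep_fetch_cycles({"DIGP71": [0, 2, 0], "DIGP72": [1, 0, 0]})
```

In this model a stall on either processor delays the fetch on both, so the two programs advance together.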

Digital image/graphics processors 71, 72, 73 and 74 can execute identical instructions on differing data in the single instruction multiple data mode (SIMD). In this mode a single instruction stream for the four digital image/graphics processors comes from instruction cache memory 21. Digital image/graphics processor 71 controls the fetching and branching operations and crossbar 50 supplies the same instruction to the other digital image/graphics processors 72, 73 and 74. Since digital image/graphics processor 71 controls instruction fetch for all the digital image/graphics processors 71, 72, 73 and 74, the digital image/graphics processors are inherently synchronized in the SIMD mode.

Transfer controller 80 is a combined direct memory access (DMA) machine and memory interface for multiprocessor integrated circuit 100. Transfer controller 80 intelligently queues, sets priorities and services the data requests and cache misses of the five programmable processors. Master processor 60 and digital image/graphics processors 71, 72, 73 and 74 all access memory and systems external to multiprocessor integrated circuit 100 via transfer controller 80. Data cache or instruction cache misses are automatically handled by transfer controller 80. The cache service (S) port transmits such cache misses to transfer controller 80. Cache service port (S) reads information from the processors and not from memory. Master processor 60 and digital image/graphics processors 71, 72, 73 and 74 may request data transfers from transfer controller 80 as linked list packet requests. These linked list packet requests allow multi-dimensional blocks of information to be transferred between source and destination memory addresses, which can be within multiprocessor integrated circuit 100 or external to multiprocessor integrated circuit 100. Transfer controller 80 preferably also includes a refresh controller for dynamic random access memories (DRAMs), which require periodic refresh to retain their data.

Frame controller 90 is the interface between multiprocessor integrated circuit 100 and external image capture and display systems. Frame controller 90 provides control over capture and display devices, and manages the movement of data between these devices and memory automatically. To this end, frame controller 90 provides simultaneous control over two independent image systems. These would typically include a first image system for image capture and a second image system for image display, although the application of frame controller 90 is controlled by the user. These image systems would ordinarily include independent frame memories used for either frame grabber or frame buffer storage. Frame controller 90 preferably operates to control video dynamic random access memory (VRAM) through refresh and shift register control.

Multiprocessor integrated circuit 100 is designed for large scale image processing. Master processor 60 provides embedded control, orchestrating the activities of the digital image/graphics processors 71, 72, 73 and 74, and interpreting the results that they produce. Digital image/graphics processors 71, 72, 73 and 74 are well suited to pixel analysis and manipulation. If pixels are thought of as high in data but low in information, then in a typical application digital image/graphics processors 71, 72, 73 and 74 might well examine the pixels and turn the raw data into information. This information can then be analyzed either by the digital image/graphics processors 71, 72, 73 and 74 or by master processor 60. Crossbar 50 mediates inter-processor communication. Crossbar 50 allows multiprocessor integrated circuit 100 to be implemented as a shared memory system. Message passing need not be a primary form of communication in this architecture. However, messages can be passed via the shared memories. Each digital image/graphics processor, the corresponding section of crossbar 50 and the corresponding sections of memory 20 have the same width. This permits architecture flexibility by accommodating the addition or removal of digital image/graphics processors and corresponding memory modularly while maintaining the same pin out.

In the preferred embodiment all parts of multiprocessor integrated circuit 100 are disposed on a single integrated circuit. In the preferred embodiment, multiprocessor integrated circuit 100 is formed in complementary metal oxide semiconductor (CMOS) using feature sizes of 0.6 μm. Multiprocessor integrated circuit 100 is preferably constructed in a pin grid array package having 256 pins. The inputs and outputs are preferably compatible with transistor-transistor logic (TTL) voltages. Multiprocessor integrated circuit 100 preferably includes about 3 million transistors and employs a clock rate of 50 MHz.

FIG. 3 illustrates an overview of exemplary digital image/graphics processor 71, which is virtually identical to digital image/graphics processors 72, 73 and 74. Digital image/graphics processor 71 includes: data unit 110; address unit 120; and program flow control unit 130. Data unit 110 performs the logical or arithmetic data operations. Data unit 110 includes eight data registers D7-D0, a status register 210 and a multiple flags register 211. Address unit 120 controls generation of load/store addresses for the local data port and the global data port. As will be further described below, address unit 120 includes two virtually identical addressing units, one for local addressing and one for global addressing. Each of these addressing units includes an all "0" read only register enabling absolute addressing in a relative address mode, a stack pointer, five address registers and three index registers. The addressing units share a global bit multiplex control register used when forming a merging address from both address units. Program flow control unit 130 controls the program flow for the digital image/graphics processor 71 including generation of addresses for instruction fetch via the instruction port. Program flow control unit 130 includes: a program counter PC 701; an instruction pointer-address stage IRA 702 that holds the address of the instruction currently in the address pipeline stage; an instruction pointer-execute stage IRE 703 that holds the address of the instruction currently in the execute pipeline stage; an instruction pointer-return from subroutine IPRS 704 holding the address for returns from subroutines; a set of registers controlling zero overhead loops; and four cache tag registers TAG3-TAG0 collectively called 708 that hold the most significant bits of four blocks of instruction words in the corresponding instruction cache memory.

Digital image/graphics processor 71 operates on a three stage pipeline as illustrated in FIG. 4. Data unit 110, address unit 120 and program flow control unit 130 operate simultaneously on different instructions in an instruction pipeline. The three stages in chronological order are fetch, address and execute. Thus at any time, digital image/graphics processor 71 will be operating on differing functions of three instructions. The phrase pipeline stage is used instead of referring to clock cycles, to indicate that specific events occur when the pipeline advances, and not during stall conditions.
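The overlap of the three pipeline stages can be illustrated with a short sketch that records which instruction occupies each stage after every pipeline advance. The function name `pipeline_occupancy` and the instruction labels are hypothetical, and stall conditions are not modeled.

```python
def pipeline_occupancy(instructions):
    """List the (fetch, address, execute) occupants after each advance.

    None marks an empty stage while the pipeline fills or drains.
    """
    stages = [None, None, None]  # fetch, address, execute
    history = []
    # Feed every instruction, then two empty slots to drain the pipeline.
    for incoming in list(instructions) + [None, None]:
        # Each advance moves every instruction one stage to the right.
        stages = [incoming, stages[0], stages[1]]
        history.append(tuple(stages))
    return history


occupancy = pipeline_occupancy(["I0", "I1", "I2"])
```

On the third advance the fetch, address and execute stages hold I2, I1 and I0 respectively, so three different instructions are in progress at once.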

Program flow control unit 130 performs all the operations that occur during the fetch pipeline stage. Program flow control unit 130 includes a program counter, loop logic, interrupt logic and pipeline control logic. During the fetch pipeline stage, the next instruction word is fetched from memory. The address contained in the program counter is compared with cache tag registers to determine if the next instruction word is stored in instruction cache memory 21. Program flow control unit 130 supplies the address in the program counter to the instruction port address bus 131 to fetch this next instruction word from instruction cache memory 21 if present. Crossbar 50 transmits this address to the corresponding instruction cache, here instruction cache memory 21, which returns the instruction word on the instruction bus 132. Otherwise, a cache miss occurs and transfer controller 80 accesses external memory to obtain the next instruction word. The program counter is updated. If the following instruction word is at the next sequential address, program flow control unit 130 post increments the program counter. Otherwise, program flow control unit 130 loads the address of the next instruction word according to the loop logic or software branch. If the synchronized MIMD mode is active, then the instruction fetch waits until all the specified digital image/graphics processors are synchronized, as indicated by sync bits in a communications register.
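The tag comparison and program counter update of the fetch pipeline stage can be illustrated as follows. The 512 byte block size (a 2K byte instruction cache divided evenly among four tag registers), the 8 byte instruction word, and the names `tag_of` and `fetch_stage` are assumptions made for this sketch and are not stated in this passage.

```python
BLOCK_BYTES = 512  # assumption: 2K byte cache split into four equal blocks
WORD_BYTES = 8     # assumption: size of one instruction word


def tag_of(address):
    """Most significant address bits that identify a cache block."""
    return address // BLOCK_BYTES


def fetch_stage(pc, tags, next_address=None):
    """Model one fetch pipeline stage.

    tags holds the contents of the four cache tag registers.
    next_address is the target supplied by loop logic or a software
    branch, or None when the next instruction word is sequential.
    Returns (hit, new_pc); a miss would be serviced from external memory.
    """
    hit = tag_of(pc) in tags
    # Post increment for sequential flow, otherwise load the new address.
    new_pc = pc + WORD_BYTES if next_address is None else next_address
    return hit, new_pc


tags = [tag_of(0x1000), tag_of(0x1200), tag_of(0x3000), tag_of(0x4000)]
sequential = fetch_stage(0x1008, tags)
missed = fetch_stage(0x2000, tags)
branched = fetch_stage(0x1208, tags, next_address=0x3000)
```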

Address unit 120 performs all the address calculations of the address pipeline stage. Address unit 120 includes two independent address units, one for the global port and one for the local port. If the instruction calls for one or two memory accesses, then address unit 120 generates the address(es) during the address pipeline stage. The address(es) are supplied to crossbar 50 via the respective global port address bus 121 and local port address bus 122 for contention detection/prioritization. If there is no contention, then the accessed memory prepares to allow the requested access, but the memory access occurs during the following execute pipeline stage.

Data unit 110 performs all of the logical and arithmetic operations during the execute pipeline stage. All logical and arithmetic operations and all data movements to or from memory occur during the execute pipeline stage. The global data port and the local data port complete any memory accesses, which are begun during the address pipeline stage, during the execute pipeline stage. The global data port and the local data port perform all data alignment needed by memory stores, and any data extraction and sign extension needed by memory loads. If the program counter is specified as a data destination during any operation of the execute pipeline stage, then a delay of two instructions is experienced before any branch takes effect. The pipelined operation requires this delay, since the next two instructions following such a branch instruction have already been fetched. According to the practice in RISC processors, other useful instructions may be placed in the two delay slot positions.
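The two instruction delay after a write to the program counter can be illustrated with a sketch of the resulting execution order. The function name `execution_trace` and the program encoding are hypothetical, and the model is simplified: a branch placed inside a delay slot is ignored.

```python
def execution_trace(program, max_steps=20):
    """Return instruction names in the order they execute.

    program is a list of (name, target) pairs, where target is the index
    written to the program counter or None for a non-branching
    instruction. The two instructions following a branch were already
    fetched, so they execute before the branch takes effect.
    """
    trace = []
    pc = 0
    pending = None  # (delay slots left, branch target)
    while pc < len(program) and len(trace) < max_steps:
        name, target = program[pc]
        trace.append(name)
        pc += 1
        if pending is not None:
            slots_left, destination = pending
            slots_left -= 1
            if slots_left == 0:
                pc = destination  # the branch takes effect now
                pending = None
            else:
                pending = (slots_left, destination)
        elif target is not None:
            pending = (2, target)
    return trace


program = [("A", None), ("BR", 5), ("D1", None), ("D2", None),
           ("SKIPPED", None), ("T", None)]
order = execution_trace(program)
```

Here D1 and D2 occupy the two delay slot positions and execute before control reaches T, which is why useful instructions may be placed there.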

Digital image/graphics processor 71 includes three internal 32 bit data buses. These are local port data bus Lbus 103, global port source data bus Gsrc 105 and global port destination data bus Gdst 107. These three buses interconnect data unit 110, address unit 120 and program flow control unit 130. These three buses are also connected to a data port unit 140 having a local port 141 and global port 145. Data port unit 140 is coupled to crossbar 50 providing memory access.

Local data port 141 has a buffer 142 for data stores to memory. A multiplexer/buffer circuit 143 loads data onto Lbus 103 from local port data bus 144 from memory via crossbar 50, from a local port address bus 122 or from global port data bus 148. Local port data bus Lbus 103 thus carries 32 bit data that is either register sourced (stores) or memory sourced (loads). Advantageously, arithmetic results in address unit 120 can be supplied via local port address bus 122 and multiplexer buffer 143 to local port data bus Lbus 103 to supplement the arithmetic operations of data unit 110. This will be further described below. Buffer 142 and multiplexer buffer 143 perform alignment and extraction of data. Local port data bus Lbus 103 connects to data registers in data unit 110. A local bus temporary holding register LTD 104 is also connected to local port data bus Lbus 103.

Global port source data bus Gsrc 105 and global port destination data bus Gdst 107 mediate global data transfers. These global data transfers may be either memory accesses, register to register moves or command word transfers between processors. Global port source data bus Gsrc 105 carries 32 bit source information of a global port data transfer. The data source can be any of the registers of digital image/graphics processor 71 or any data or parameter memory corresponding to any of the digital image/graphics processors 71, 72, 73 or 74. The data is stored to memory via the global port 145. Multiplexer buffer 146 selects lines from local port data bus Lbus 103 or global port source data bus Gsrc 105, and performs data alignment. Multiplexer buffer 146 writes this data onto global port data bus 148 for application to memory via crossbar 50. Global port source data bus Gsrc 105 also supplies data to data unit 110, allowing the data of global port source data bus Gsrc 105 to be used as one of the arithmetic logic unit sources. This latter connection allows any register of digital image/graphics processor 71 to be a source for an arithmetic logic unit operation.

Global port destination data bus Gdst 107 carries 32 bit destination data of a global bus data transfer. The destination is any register