May 1991  
NS32532-20/NS32532-25/NS32532-30  
High-Performance 32-Bit Microprocessor  
General Description
The NS32532 is a high-performance 32-bit microprocessor in the Series 32000® family. It is software compatible with the previous microprocessors in the family but with a greatly enhanced internal implementation.
The high-performance specifications are the result of a four-stage instruction pipeline, on-chip instruction and data caches, on-chip memory management unit and a significantly increased clock frequency. In addition, the system interface provides optimal support for applications spanning a wide range, from low-cost, real-time controllers to highly sophisticated, general purpose multiprocessor systems.
The NS32532 integrates more than 370,000 transistors fabricated in a 1.25 µm double-metal CMOS technology. The advanced technology and mainframe-like design of the device enable it to achieve more than 10 times the throughput of the NS32032 in typical applications.
In addition to generally improved performance, the NS32532 offers much faster interrupt service and task switching for real-time applications.
Features
- Software compatible with the Series 32000 family
- 32-bit architecture and implementation
- 4-GByte uniform addressing space
- On-chip memory management unit with 64-entry translation look-aside buffer
- 4-Stage instruction pipeline
- 512-Byte on-chip instruction cache
- 1024-Byte on-chip data cache
- High-performance bus
  - Separate 32-bit address and data lines
  - Burst mode memory accessing
  - Dynamic bus sizing
- Extensive multiprocessing support
- Floating-point support via the NS32381 or NS32580
- 1.25 µm double-metal CMOS technology
- 175-pin PGA package
Block Diagram
FIGURE 1. CPU Block Diagram (drawing TL/EE/9354-1)
Series 32000® and TRI-STATE® are registered trademarks of National Semiconductor Corporation.
© 1995 National Semiconductor Corporation  TL/EE/9354  RRD-B30M105/Printed in U. S. A.
Table of Contents
1.0 PRODUCT INTRODUCTION
2.0 ARCHITECTURAL DESCRIPTION
2.1 Register Set
2.1.1 General Purpose Registers
2.1.2 Address Registers
2.1.3 Processor Status Register
2.1.4 Configuration Register
2.1.5 Memory Management Registers
2.1.6 Debug Registers
2.2 Memory Organization
2.2.1 Address Mapping
2.3 Modular Software Support
2.4 Memory Management
2.4.1 Page Tables Structure
2.4.2 Virtual Address Spaces
2.4.3 Page Table Entry Formats
2.4.4 Physical Address Generation
2.4.5 Address Translation Algorithm
2.5 Instruction Set
2.5.1 General Instruction Format
2.5.2 Addressing Modes
2.5.3 Instruction Set Summary
3.0 FUNCTIONAL DESCRIPTION
3.1 Instruction Execution
3.1.1 Operating States
3.1.2 Instruction Endings
3.1.2.1 Completed Instructions
3.1.2.2 Suspended Instructions
3.1.2.3 Terminated Instructions
3.1.2.4 Partially Completed Instructions
3.1.3 Instruction Pipeline
3.1.3.1 Branch Prediction
3.1.3.2 Memory Mapped I/O
3.1.3.3 Serializing Operations
3.1.4 Slave Processor Instructions
3.1.4.1 Regular Slave Instruction Protocol
3.1.4.2 Pipelined Slave Instruction Protocol
3.1.4.3 Instruction Flow and Exceptions
3.1.4.4 Floating-Point Instructions
3.1.4.5 Custom Slave Instructions
3.2 Exception Processing
3.2.1 Exception Acknowledge Sequence
3.2.2 Returning from an Exception Service Procedure
3.2.3 Maskable Interrupts
3.2.3.1 Non-Vectored Mode
3.2.3.2 Vectored Mode: Non-Cascaded Case
3.2.3.3 Vectored Mode: Cascaded Case
3.2.4 Non-Maskable Interrupt
3.2.5 Traps
3.2.6 Bus Errors
3.2.7 Priority Among Exceptions
3.2.8 Exception Acknowledge Sequences: Detailed Flow
3.2.8.1 Maskable/Non-Maskable Interrupt Sequence
3.2.8.2 Abort/Restartable Bus Error Sequence
3.2.8.3 SLAVE/ILL/SVC/DVZ/FLG/BPT/UND Trap Sequence
3.2.8.4 Trace Trap Sequence
3.2.8.5 Integer-Overflow Trap Sequence
3.2.8.6 Debug Trap Sequence
3.2.8.7 Non-Restartable Bus Error Sequence
3.3 Debugging Support
3.3.1 Instruction Tracing
3.3.2 Debug Trap Capability
3.4 On-Chip Caches
3.4.1 Instruction Cache (IC)
3.4.2 Data Cache (DC)
3.4.3 Cache Coherence Support
3.4.4 Translation Look-aside Buffer (TLB)
3.5 System Interface
3.5.1 Power and Grounding
3.5.2 Clocking
3.5.3 Resetting
3.5.4 Bus Cycles
3.5.4.1 Bus Status
3.5.4.2 Basic Read and Write Cycles
3.5.4.3 Burst Cycles
3.5.4.4 Cycle Extension
3.5.4.5 Interlocked Bus Cycles
3.5.4.6 Interrupt Control Cycles
3.5.4.7 Slave Processor Bus Cycles
3.5.5 Bus Exceptions
3.5.6 Dynamic Bus Configuration
3.5.6.1 Instruction Fetch Sequences
3.5.6.2 Data Read Sequences
3.5.6.3 Data Write Sequences
3.5.7 Bus Access Control
3.5.8 Interfacing Memory-Mapped I/O Devices
3.5.9 Interrupt and Debug Trap Requests
3.5.10 Cache Invalidation Requests
3.5.11 Internal Status
4.0 DEVICE SPECIFICATIONS
4.1 Pin Descriptions
4.1.1 Supplies
4.1.2 Input Signals
4.1.3 Output Signals
4.1.4 Input/Output Signals
4.2 Absolute Maximum Ratings
4.3 Electrical Characteristics
4.4 Switching Characteristics
4.4.1 Definitions
4.4.2 Timing Tables
4.4.2.1 Output Signals: Internal Propagation Delays
4.4.2.2 Input Signal Requirements
4.4.3 Timing Diagrams
APPENDIX A: INSTRUCTION FORMATS
APPENDIX B: COMPATIBILITY ISSUES
B.1 Restrictions on Compatibility
B.2 Architecture Extensions
B.3 Integer-Overflow Trap
B.4 Self-Modifying Code
B.5 Memory-Mapped I/O
APPENDIX C: INSTRUCTION SET EXTENSIONS
C.1 Processor Service Instructions
C.2 Memory Management Instructions
C.3 Instruction Definitions
APPENDIX D: INSTRUCTION EXECUTION TIMES
D.1 Internal Organization and Instruction Execution
D.2 Basic Execution Times
D.2.1 Loader Timing
D.2.2 Address Unit Timing
D.2.3 Execution Unit Timing
D.3 Instruction Dependencies
D.3.1 Data Dependencies
D.3.1.1 Register Interlocks
D.3.1.2 Memory Interlocks
D.3.2 Control Dependencies
D.4 Storage Delays
D.4.1 Instruction Cache Misses
D.4.2 Data Cache Misses
D.4.3 TLB Misses
D.4.4 Instruction and Operand Alignment
D.5 Execution Time Calculations
D.5.1 Definitions
D.5.2 Notes on Table Use
D.5.3 Teff Evaluation
D.5.4 Instruction Timing Example
D.5.5 Execution Timing Tables
D.5.5.1 Basic and Memory Management Instructions
D.5.5.2 Floating-Point Instructions, CPU Portion
List of Illustrations  
Figure 1. CPU Block Diagram
Figure 2-1. NS32532 Internal Registers
Figure 2-2. Processor Status Register (PSR)
Figure 2-3. Configuration Register (CFG)
Figure 2-4. Page Table Base Registers (PTBn)
Figure 2-5. Memory Management Control Register (MCR)
Figure 2-6. Memory Management Status Register (MSR)
Figure 2-7. Debug Condition Register (DCR)
Figure 2-8. Debug Status Register (DSR)
Figure 2-9. NS32532 Address Mapping
Figure 2-10. NS32532 Run-Time Environment
Figure 2-11. Two-Level Page Tables
Figure 2-12. Page Table Entries (PTE’s)
Figure 2-13. Virtual to Physical Address Translation
Figure 2-14. General Instruction Format
Figure 2-15. Index Byte Format
Figure 2-16. Displacement Encodings
Figure 3-1. Operating States
Figure 3-2. NS32532 Internal Instruction Pipeline
Figure 3-3. Memory References for Consecutive Instructions
Figure 3-4. Memory References after Serialization
Figure 3-5. Regular Slave Instruction Protocol: CPU Actions
Figure 3-6. ID and Operation Word
Figure 3-7. Slave Processor Status Word
Figure 3-8. Instruction Flow in Pipelined Floating-Point Mode
Figure 3-9. Interrupt Dispatch Table
Figure 3-10. Exception Acknowledge Sequence: Direct-Exception Mode Disabled
Figure 3-11. Exception Acknowledge Sequence: Direct-Exception Mode Enabled
Figure 3-12. Return From Trap (RETTn) Instruction Flow: Direct-Exception Mode Disabled
Figure 3-13. Return From Interrupt (RETI) Instruction Flow: Direct-Exception Mode Disabled
Figure 3-14. Exception Processing Flowchart
Figure 3-15. Service Sequence
Figure 3-16. Instruction Cache Structure
Figure 3-17. Data Cache Structure
Figure 3-18. TLB Model
Figure 3-19. Power and Ground Connections
Figure 3-20. Bus Clock Synchronization
Figure 3-21. Power-On Reset Requirements
Figure 3-22. General Reset Timing
Figure 3-23. Basic Read Cycle
Figure 3-24. Write Cycle
Figure 3-25. Burst Read Cycles
Figure 3-26. Cycle Extension of a Basic Read Cycle
Figure 3-27. Slave Processor Write Cycle
Figure 3-28. Slave Processor Read Cycle
Figure 3-29. Bus Retry During a Basic Read Cycle
Figure 3-30. Basic Interface for 32-Bit Memories
Figure 3-31. Basic Interface for 16-Bit Memories
Figure 3-32. Hold Acknowledge: (Bus Initially Idle)
Figure 3-33. Typical I/O Device Interface
Figure 4-1. NS32532 Interface Signals
Figure 4-2. 175-Pin PGA Package
Figure 4-3. Output Signals Specification Standard
Figure 4-4. Input Signals Specification Standard
Figure 4-5. Basic Read Cycle Timing
Figure 4-6. Write Cycle Timing
Figure 4-7. Interlocked Read and Write Cycles
Figure 4-8. Burst Read Cycles
Figure 4-9. External Termination of Burst Cycles
Figure 4-10. Bus Error or Retry During Burst Cycles
Figure 4-11. Extended Retry Timing
Figure 4-12. HOLD Timing (Bus Initially Idle)
Figure 4-13. HOLD Acknowledge Timing (Bus Initially Not Idle)
Figure 4-14. Slave Processor Read Timing
Figure 4-15. Slave Processor Write Timing
Figure 4-16. Slave Processor Done
Figure 4-17. FSSR Signal Timing
Figure 4-18. Cache Invalidation Request
Figure 4-19. INT and NMI Signals Sampling
Figure 4-20. Debug Trap Request
Figure 4-21. PFS Signal Timing
Figure 4-22. ISF Signal Timing
Figure 4-23. Break Point Signal Timing
Figure 4-24. Clock Waveforms
Figure 4-25. Bus Clock Synchronization
Figure 4-26. Power-On Reset
Figure 4-27. Non-Power-On Reset
Figure C-1. LPRi/SPRi Instruction Formats
Figure C-2. CINV Instruction Format
Figure C-3. LMR/SMR Instruction Formats
List of Tables  
Table 2-1. Access Protection Levels
Table 2-2. NS32532 Addressing Modes
Table 2-3. NS32532 Instruction Set Summary
Table 3-1. Floating-Point Instruction Protocol
Table 3-2. Custom Slave Instruction Protocols
Table 3-3. Summary of Exception Processing
Table 3-4. Interrupt Sequences
Table 3-5. Cacheable/Non-Cacheable Instruction Fetches from a 32-Bit Bus
Table 3-6. Cacheable/Non-Cacheable Instruction Fetches from a 16-Bit Bus
Table 3-7. Cacheable/Non-Cacheable Instruction Fetches from an 8-Bit Bus
Table 3-8. Cacheable/Non-Cacheable Data Reads from a 32-Bit Bus
Table 3-9. Cacheable/Non-Cacheable Data Reads from a 16-Bit Bus
Table 3-10. Cacheable/Non-Cacheable Data Reads from an 8-Bit Bus
Table 3-11. Data Writes to a 32-Bit Bus
Table 3-12. Data Writes to a 16-Bit Bus
Table 3-13. Data Writes to an 8-Bit Bus
Table C-1. LPRi/SPRi New ‘Short’ Field Encodings
Table C-2. LMR/SMR ‘Short’ Field Encodings
Table D-1. Additional Address Unit Processing Time for Complex Addressing Modes
1.0 Product Introduction
The NS32532 is an extremely sophisticated microprocessor in the Series 32000 family with a full 32-bit architecture and implementation optimized for high-performance applications.
By employing a number of mainframe-like features, the device can deliver 15 MIPS peak performance with no wait states at a frequency of 30 MHz.
The NS32532 is fully software compatible with all the other Series 32000 CPUs. The architectural features of the Series 32000 family, and particularly the NS32532 CPU, are described briefly below.
Powerful Addressing Modes. Nine addressing modes available to all instructions are included to access data structures efficiently.
Data Types. The architecture provides for numerous data types, such as byte, word, doubleword, and BCD, which may be arranged into a wide variety of data structures.
Symmetric Instruction Set. While avoiding special case instructions that compilers can’t use, the Series 32000 architecture incorporates powerful instructions for control operations, such as array indexing and external procedure calls, which save considerable space and time for compiled code.
Memory-to-Memory Operations. The Series 32000 CPUs represent two-address machines. This means that each operand can be referenced by any one of the addressing modes provided.
This powerful memory-to-memory architecture permits memory locations to be treated as registers for all useful operations. This is important for temporary operands as well as for context switching.
Memory Management. The NS32532 on-chip memory management unit provides advanced operating system support functions, including dynamic address translation, virtual memory management, and memory protection.
Large, Uniform Addressing. The NS32532 has 32-bit address pointers that can address up to 4 gigabytes without requiring any segmentation; this addressing scheme provides flexible memory management without added-on expense.
Modular Software Support. Any software package for the Series 32000 family can be developed independent of all other packages, without regard to individual addressing. In addition, ROM code is totally relocatable and easy to access, which allows a significant reduction in hardware and software costs.
Software Processor Concept. The Series 32000 architecture allows future expansions of the instruction set that can be executed by special slave processors, acting as extensions to the CPU. This concept of slave processors is unique to the Series 32000 family. It allows software compatibility even for future components because the slave hardware is transparent to the software. With future advances in semiconductor technology, the slaves can be physically integrated on the CPU chip itself.
To summarize, the architectural features cited above provide three primary performance advantages and characteristics:
- High-level language support
- Easy future growth path
- Application flexibility
2.0 Architectural Description
2.1 REGISTER SET
The NS32532 CPU has 28 internal registers grouped according to functions as follows: 8 general purpose, 7 address, 1 processor status, 1 configuration, 7 memory management and 4 debug. All registers are 32 bits wide except for the module and processor status, which are each 16 bits wide. Figure 2-1 shows the NS32532 internal registers.
Address:            PC, SP0, SP1, FP, SB, INTBASE, MOD
General Purpose:    R0, R1, R2, R3, R4, R5, R6, R7
Processor Status:   PSR
Debug:              DCR, DSR, CAR, BPC
Memory Management:  PTB0, PTB1, IVAR0, IVAR1, TEAR, MCR, MSR
Configuration:      CFG
FIGURE 2-1. NS32532 Internal Registers
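The register inventory of Section 2.1 can be cross-checked with a short Python sketch. The group names and register names come from the text and Figure 2-1; the dictionary layout and the helper names `register_width` and `total_registers` are illustrative assumptions, not part of the datasheet.

```python
# Register groups as listed in Section 2.1 and Figure 2-1.
# The data structure itself is only an illustration.
REGISTER_GROUPS = {
    "general purpose": ["R0", "R1", "R2", "R3", "R4", "R5", "R6", "R7"],
    "address": ["PC", "SP0", "SP1", "FP", "SB", "INTBASE", "MOD"],
    "processor status": ["PSR"],
    "configuration": ["CFG"],
    "memory management": ["PTB0", "PTB1", "IVAR0", "IVAR1", "TEAR", "MCR", "MSR"],
    "debug": ["DCR", "DSR", "CAR", "BPC"],
}

# The text states that only the module and processor status registers
# are 16 bits wide; every other register is 32 bits wide.
SIXTEEN_BIT_REGISTERS = {"MOD", "PSR"}


def register_width(name: str) -> int:
    """Return the width in bits of the named register."""
    return 16 if name in SIXTEEN_BIT_REGISTERS else 32


def total_registers() -> int:
    """Count the registers across all functional groups."""
    return sum(len(names) for names in REGISTER_GROUPS.values())
```

Summing the groups reproduces the count of 28 internal registers given in the text.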
2.1.1 General Purpose Registers
There are eight registers (R0-R7) used for satisfying the high speed general storage requirements, such as holding temporary variables and addresses. The general purpose registers are free for any use by the programmer. They are 32 bits in length. If a general purpose register is specified for an operand that is eight or 16 bits long, only the low part of the register is used; the high part is not referenced or modified.
2.1.2 Address Registers
The seven address registers are used by the processor to implement specific address functions. A description of them follows.
PC - Program Counter. The PC register is a pointer to the first byte of the instruction currently being executed. The PC is used to reference memory in the program section.
SP0, SP1 - Stack Pointers. The SP0 register points to the lowest address of the last item stored on the INTERRUPT STACK. This stack is normally used only by the operating system. It is used primarily for storing temporary data, and holding return information for operating system subroutines and interrupt and trap service routines. The SP1 register points to the lowest address of the last item stored on the USER STACK. This stack is used by normal user programs to hold temporary data and subroutine return information.
When a reference is made to the selected Stack Pointer (see PSR S-bit), the terms ‘SP Register’ or ‘SP’ are used. SP refers to either SP0 or SP1, depending on the setting of the S bit in the PSR register. If the S bit in the PSR is 0, SP refers to SP0. If the S bit in the PSR is 1, SP refers to SP1.
The NS32532 also allows the SP1 register to be directly loaded and stored using privileged forms of the LPRi and SPRi instructions, regardless of the setting of the PSR S-bit. When SP1 is accessed in this manner, it is referred to as ‘USP Register’ or simply ‘USP’.
Stacks in the Series 32000 family grow downward in memory. A Push operation pre-decrements the Stack Pointer by the operand length. A Pop operation post-increments the Stack Pointer by the operand length.
FP - Frame Pointer. The FP register is used by a procedure to access parameters and local variables on the stack. The FP register is set up on procedure entry with the ENTER instruction and restored on procedure termination with the EXIT instruction.
The frame pointer holds the address in memory occupied by the old contents of the frame pointer.
SB - Static Base. The SB register points to the global variables of a software module. This register is used to support relocatable global variables for software modules. The SB register holds the lowest address in memory occupied by the global variables of a module.
INTBASE - Interrupt Base. The INTBASE register holds the address of the dispatch table for interrupts and traps (Section 3.2.1).
MOD - Module. The MOD register holds the address of the module descriptor of the currently executing software module. The MOD register is 16 bits long, therefore the module table must be contained within the first 64 kbytes of memory.
2.1.3 Processor Status Register
The Processor Status Register (PSR) holds status information for the microprocessor.
The PSR is sixteen bits long, divided into two eight-bit halves. The low order eight bits are accessible to all programs, but the high order eight bits are accessible only to programs executing in Supervisor Mode.
   15 ... 8  |  7 ... 0
   I P S U   |  N Z F V L T C
FIGURE 2-2. Processor Status Register (PSR)
C  The C bit indicates that a carry or borrow occurred after an addition or subtraction instruction. It can be used with the ADDC and SUBC instructions to perform multiple-precision integer arithmetic calculations. It may have a setting of 0 (no carry or borrow) or 1 (carry or borrow).
T  The T bit causes program tracing. If this bit is set to 1, a TRC trap is executed after every instruction (Section 3.3.1).
L  The L bit is altered by comparison instructions. In a comparison instruction the L bit is set to "1" if the second operand is less than the first operand, when both operands are interpreted as unsigned integers. Otherwise, it is set to "0". In Floating-Point comparisons, this bit is always cleared.
V  The V bit enables generation of a trap (OVF) when an integer arithmetic operation overflows.
F  The F bit is a general condition flag, which is altered by many instructions (e.g., integer arithmetic instructions use it to indicate overflow).
Z  The Z bit is altered by comparison instructions. In a comparison instruction the Z bit is set to "1" if the second operand is equal to the first operand; otherwise it is set to "0".
N  The N bit is altered by comparison instructions. In a comparison instruction the N bit is set to "1" if the second operand is less than the first operand, when both operands are interpreted as signed integers. Otherwise, it is set to "0".
U  If the U bit is "1" no privileged instructions may be executed. If the U bit is "0" then all instructions may be executed. When U = 0 the processor is said to be in Supervisor Mode; when U = 1 the processor is said to be in User Mode. A User Mode program is restricted from executing certain instructions and accessing certain registers which could interfere with the operating system. For example, a User Mode program is prevented from changing the setting of the flag used to indicate its own privilege mode. A Supervisor Mode program is assumed to be a trusted part of the operating system, hence it has no such restrictions.
F
Floating-point instruction set. This bit indicates  
whether a floating-point unit (FPU) is present to exe-  
cute floating-point instructions. If this bit is 0 when the  
CPU executes  
a floating-point instruction, a Trap  
(UND) occurs. If this bit is 1, then the CPU transfers  
the instruction and any necessary operands to the  
FPU using the slave-processor protocol described in  
Section 3.1.4.1.  
S
The S bit specifies whether the SP0 register or SP1  
register is used as the Stack Pointer. The bit is automat-  
ically cleared on interrupts and traps. It may have a  
setting of 0 (use the SP0 register) or 1 (use the SP1  
register).  
M
Memory management instruction set. This bit en-  
ables the execution of memory management instruc-  
tions. If this bit is 0 when the CPU executes an LMR,  
SMR, RDVAL, or WRVAL instruction, a Trap (UND)  
occurs. If this bit is 1, the CPU executes LMR, SMR,  
RDVAL, and WRVAL instructions using the on-chip  
MMU.  
P
I
The P bit prevents a TRC trap from occuring more than  
once for an instruction (Section 3.3.1). It may have a  
setting of 0 (no trace pending) or 1 (trace pending).  
C
Custom instruction set. This bit indicates whether a  
custom slave processor is present to execute custom  
instructions. If this bit is 0 when the CPU executes a  
custom instruction, a Trap (UND) occurs. If this bit is  
1, the CPU transfers the instruction and any neces-  
sary operands to the custom slave processor using  
the slave-processor protocol described in Section  
3.1.4.1.  
e
e
0,  
If I  
1, then all interrupts will be accepted. If I  
only the NMI interrupt is accepted. Trap enables are not  
affected by this bit.  
2.1.4 Configuration Register  
The Configuration Register (CFG) is 32 bits wide, of which  
ten bits are implemented. The implemented bits enable vari-  
ous operating modes for the CPU, including vectoring of  
interrupts, execution of slave instructions, and control of the  
on-chip caches. In the NS32332 bits 4 through 7 of the CFG  
register selected between the 16-bit and 32-bit slave proto-  
cols and between 512-byte and 4-Kbyte page sizes. The  
NS32532 supports only the 32-bit slave protocol and  
4-Kbyte page size: consequently these bits are forced to 1.  
The format of the CFG register is shown in Figure 2-3. The
various control bits are described below.
I    Interrupt vectoring. This bit controls whether maskable
     interrupts are handled in nonvectored (I = 0) or
     vectored (I = 1) mode. Refer to Section 3.2.3 for more
     information.
DE   Direct-Exception mode enable. This bit enables the
     Direct-Exception mode for processing exceptions.
     When this mode is selected, the CPU response time
     to interrupts and other exceptions is significantly
     improved. Refer to Section 3.2.1 for more information.
DC   Data Cache enable. This bit enables the on-chip Data
     Cache to be accessed for data reads and writes.
     Refer to Section 3.4.2 for more information.
LDC  Lock Data Cache. This bit controls whether the contents
     of the on-chip Data Cache are locked to fixed
     memory locations (LDC = 1), or updated when a data
     read is missing from the cache (LDC = 0).
IC   Instruction Cache enable. This bit enables the on-chip
     Instruction Cache to be accessed for instruction
     fetches. Refer to Section 3.4.1 for more information.
LIC  Lock Instruction Cache. This bit controls whether the
     contents of the on-chip Instruction Cache are locked
     to fixed memory locations (LIC = 1), or updated when
     an instruction fetch is missing from the cache
     (LIC = 0).
PF   Pipelined Floating-point execution. This bit indicates
     whether the floating-point unit uses the pipelined
     slave protocol. When PF is 1 the pipelined protocol is
     selected. PF is ignored if the F bit is 0. Refer to
     Section 3.1.4.2 for more information.
When the CFG register is loaded using the LPRi instruction,
bits 14 through 31 should be set to 0. Bits 4 through 7 are
ignored during loading, and are always returned as 1's when
CFG is stored via the SPRi instruction. When the SETCFG
instruction is executed, bits 0 through 3 of the CFG register
are loaded from the instruction's short field, bits 4 through 7
are ignored, and bits 8 through 13 are forced to 0.

 31       14 13                           8 7               0
| Reserved  | PF | LIC | IC | LDC | DC | DE | 1 | 1 | 1 | 1 | C | M | F | I |

FIGURE 2-3. Configuration Register (CFG). Bits 14 to 31 are
Reserved; Bits 4 to 7 are Forced to 1.
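The load and store rules for CFG can be sketched in C as follows. This is a minimal model, not the device's implementation: the function names `cfg_after_lpr` and `cfg_after_setcfg` are hypothetical, and the bit positions are the ones drawn in Figure 2-3. The sketch asserts that bits 14 through 31 of an LPRi operand are zero, since the text only says they should be 0 and does not define what happens otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as drawn in Figure 2-3. */
#define CFG_I    (1u << 0)
#define CFG_F    (1u << 1)
#define CFG_M    (1u << 2)
#define CFG_C    (1u << 3)
#define CFG_DE   (1u << 8)
#define CFG_DC   (1u << 9)
#define CFG_LDC  (1u << 10)
#define CFG_IC   (1u << 11)
#define CFG_LIC  (1u << 12)
#define CFG_PF   (1u << 13)

#define CFG_FORCED_ONES (0xFu << 4)  /* bits 4-7 always read as 1 */
#define CFG_IMPLEMENTED 0x3FFFu      /* bits 0-13 */

/* Hypothetical model: value read back by SPRi after loading CFG with LPRi.
 * Bits 4-7 of the operand are ignored and returned as 1's. */
uint32_t cfg_after_lpr(uint32_t operand)
{
    /* The text requires bits 14-31 to be 0; other values are not modeled. */
    assert((operand & ~CFG_IMPLEMENTED) == 0);
    return operand | CFG_FORCED_ONES;
}

/* Hypothetical model: value read back by SPRi after SETCFG.
 * Bits 0-3 come from the short field; bits 8-13 are forced to 0. */
uint32_t cfg_after_setcfg(uint32_t short_field)
{
    return (short_field & 0xFu) | CFG_FORCED_ONES;
}
```

For example, `cfg_after_lpr(CFG_DC | CFG_IC | CFG_I)` enables both caches and vectored interrupts in this model, and the value read back also has bits 4 through 7 set.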
2.0 Architectural Description (Continued)  
2.1.5 Memory Management Registers
The NS32532 provides 7 registers to support memory management
functions. They are accessed by means of the LMR and SMR
instructions. All of them can be read and written except IVAR0
and IVAR1, which are write-only. A description of the memory
management registers is given in the following sections.
PTB0, PTB1 - Page Table Base Pointers. The PTBn registers
hold the physical addresses of the level-1 page tables
used in address translation. The least significant 12 bits are
permanently zero, so that each register always points to a
4-Kbyte boundary in memory.
When either PTB0 or PTB1 is loaded by executing an LMR
instruction, the MMU automatically invalidates all entries in
the TLB that had been translated using the old value in the
selected PTBn register.
The format of the PTBn registers is shown in Figure 2-4.

 31                      12 11             0
|       Base Address       | 000000000000   |

FIGURE 2-4. Page Table Base Registers (PTBn)
IVAR0, IVAR1 - Invalidate Virtual Address. The Invalidate
Virtual Address registers are write-only registers. When a
virtual address is written to IVAR0 or IVAR1 using the LMR
instruction, the translation for that virtual address is purged,
if present, from the TLB. This must be done whenever a
Page Table Entry has been changed in memory, since the
TLB might otherwise contain an incorrect translation value.
Another technique for purging TLB entries is to load a PTBn
register. Turning off translation (clearing the MCR TU and/
or TS bits) does not purge any entries from the TLB.
TEAR - Translation Exception Address Register. The
TEAR register is loaded by the on-chip MMU when a
translation exception occurs. It contains the 32-bit virtual
address that caused the translation exception.
TEAR is not updated if a page fault is detected while
prefetching an instruction that is not executed because the
previous instruction caused a trap.
MCR - Memory Management Control. The MCR register
controls the operation of the MMU. Only four bits are
implemented. Bits 4 to 31 are reserved for future use and must
be loaded with zeroes.
When MCR is read as a 32-bit word, bits 4 to 31 are
returned as zeroes. The format of MCR is shown in Figure 2-5.
Details on the control bits are given below.

 31                        4 3               0
|         Reserved          | AO | DS | TS | TU |

FIGURE 2-5. Memory Management Control Register (MCR)
TU   Translate User. While this bit is 1, address translation
     is enabled for User-Mode memory references. While
     this bit is 0, address translation is disabled for
     User-Mode memory references.
TS   Translate Supervisor. While this bit is 1, address
     translation is enabled for Supervisor-Mode memory
     references. While this bit is 0, address translation is
     disabled for Supervisor-Mode memory references.
DS   Dual Space. While this bit is 1, PTB1 contains the
     level-1 page table base address of all addresses
     specified in User Mode, and PTB0 contains the level-1
     page table base address of all addresses specified in
     Supervisor Mode. While this bit is 0, PTB0 contains
     the level-1 page table base address of all addresses
     specified in both User and Supervisor Modes.
AO   Access Level Override. When this bit is set to 1,
     User-Mode accesses are given Supervisor-Mode privilege.
MSR - Memory Management Status. The MSR register
provides status information related to the occurrence of a
translation exception. Only eight bits are implemented. Bits
8 to 31 are ignored when MSR is loaded and are returned
as zeroes when it is read as a 32-bit word. MSR is only
updated by the MMU when a protection violation or page
fault is detected while translating an address for a reference
required to execute an instruction. It is not updated if a page
fault is detected during either an operand or an instruction
prefetch when the data being prefetched is not needed due to
a change in the instruction execution sequence. The format of
MSR is shown in Figure 2-6. Details on the function of each
bit are given below.
TEX  Translation Exception. This two-bit field specifies the
     cause of the current address translation exception
     (Trap (ABT)). Combinations appearing in this field
     are summarized below.
     00 No Translation Exception
     01 First Level PTE Invalid
     10 Second Level PTE Invalid
     11 Protection Violation
     During address translation, if a protection violation
     and an invalid PTE are detected at the same time,
     the TEX field is set to indicate a protection violation.
DDT  Data Direction. This bit indicates the direction of the
     transfer that the CPU was attempting when the
     translation exception occurred.
     DDT = 0  Read Cycle
     DDT = 1  Write Cycle
UST  User/Supervisor. This bit indicates whether the
     Translation Exception was caused by a User-Mode
     or Supervisor-Mode reference. If UST is 1, then the
     exception was caused by a User-Mode reference;
     otherwise it was caused by a Supervisor-Mode reference.
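As an illustration of how an abort handler might interpret MSR, the following C sketch splits a stored MSR value into its fields. It assumes the field positions drawn in Figure 2-6 (TEX in bits 0-1, DDT in bit 2, UST in bit 3, STT in bits 4-7); the type `msr_fields` and the functions `msr_decode` and `msr_tex_name` are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the implemented MSR fields. */
typedef struct {
    unsigned tex;  /* bits 0-1: cause of the translation exception */
    unsigned ddt;  /* bit 2:    0 = read cycle, 1 = write cycle     */
    unsigned ust;  /* bit 3:    1 = User-Mode reference             */
    unsigned stt;  /* bits 4-7: CPU status at the exception         */
} msr_fields;

/* Split a 32-bit MSR value; bits 8-31 read as zeroes and are ignored. */
msr_fields msr_decode(uint32_t msr)
{
    msr_fields f;
    f.tex = msr & 0x3u;
    f.ddt = (msr >> 2) & 0x1u;
    f.ust = (msr >> 3) & 0x1u;
    f.stt = (msr >> 4) & 0xFu;
    return f;
}

/* Map a TEX value to the description given in the text. */
const char *msr_tex_name(unsigned tex)
{
    static const char *const names[4] = {
        "No Translation Exception",
        "First Level PTE Invalid",
        "Second Level PTE Invalid",
        "Protection Violation",
    };
    assert(tex < 4);
    return names[tex];
}
```

A handler would typically read TEAR for the faulting virtual address and use `msr_decode` on the stored MSR to tell a page fault (TEX 01 or 10) from a protection violation (TEX 11).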
 31                      8 7         4 3               0
|        Reserved         |    STT    | UST | DDT | TEX |

FIGURE 2-6. Memory Management Status Register (MSR)
STT  CPU Status. This four-bit field is set on an address
     translation exception according to the following
     encodings.
     1000 Sequential Instruction Fetch
     1001 Non-Sequential Instruction Fetch
     1010 Data Transfer
     1011 Read Read-Modify-Write Operand
     1100 Read for Effective Address
     If a reference for an Interrupt-Acknowledge or
     End-of-Interrupt bus cycle (either Master or Cascaded)
     causes a Translation Exception, then the value of
     the STT field is undefined.
2.1.6 Debug Registers
The NS32532 contains 4 registers dedicated for debugging
functions.
These registers are accessed using privileged forms of the
LPRi and SPRi instructions.
DCR - Debug Condition Register. The DCR Register
enables detection of debug conditions. The format of the DCR
is shown in Figure 2-7; the various bits are described below.
A debug condition is enabled when the related bit is set to 1.
CBE0 Compare Byte Enable 0; when set, BYTE0 of an
     aligned double-word is included in the address
     comparison
CBE1 Compare Byte Enable 1; when set, BYTE1 of an
     aligned double-word is included in the address
     comparison
CBE2 Compare Byte Enable 2; when set, BYTE2 of an
     aligned double-word is included in the address
     comparison
CBE3 Compare Byte Enable 3; when set, BYTE3 of an
     aligned double-word is included in the address
     comparison
VNP  Compare virtual address (VNP = 1) or physical
     address (VNP = 0)
CWR  Address-compare enable for write references
CRD  Address-compare enable for read references
CAE  Address-compare enable
TR   Enable Trap (DBG) when a debug condition is detected
PCE  PC-match enable
UD   Enable debug conditions in User-Mode
SD   Enable debug conditions in Supervisor Mode
DEN  Enable debug conditions
The following 2 bits control testing features that can be
used during initial system debugging. These features are
unique to the NS32532 implementation of the Series 32000
architecture; as such, they may not be supported in future
implementations. For normal operation these 2 bits should
be set to 0.
SI   Single-Instruction mode enable. This bit, when set
     to 1, inhibits the overlapping of instruction execution.
BCP  Branch Condition Prediction disable. When this bit is
     1, the branch prediction mechanism is disabled. See
     Section 3.1.3.1.

Bits 31 to 16: Reserved, DEN, SD, UD, PCE, TR, BCP and SI
               (positions as drawn in the original figure)
 15                  8 7                                       0
|      Reserved       | CAE | CRD | CWR | VNP | CBE3 | CBE2 | CBE1 | CBE0 |

FIGURE 2-7. Debug Condition Register (DCR)
DSR - Debug Status Register. The DSR Register indicates
debug conditions that have been detected. When the CPU
detects an enabled debug condition, it sets the corresponding
bit (BPC, BEX, BCA) in the DSR to 1. When an
address-compare condition is detected, the RD bit is loaded to
indicate whether a read or write reference was performed.
Software must clear all the bits in the DSR when appropriate.
The format of the DSR is shown in Figure 2-8; the various
fields are described below.
RD   Indicates whether the last address-compare condition
     was for a read (RD = 1) or write (RD = 0) reference
BPC  PC-match condition detected
BEX  External condition detected
BCA  Address-compare condition detected
Note 1: The content of the DSR register is not defined if a debug condition
was detected on a floating-point instruction in pipelined mode and a
trap was generated by a previous floating-point instruction.
Note 2: If an address compare is detected on a read and a write for the
same instruction then the RD-bit will remain clear.

 31               28 27                           0
| RD | BPC | BEX | BCA |         Reserved          |

FIGURE 2-8. Debug Status Register (DSR)
CAR - Compare Address Register. The CAR Register
contains the address that is compared to operand reference
addresses to detect an address-compare condition. The
address must be double-word aligned; that is, the two
least-significant bits must be 0. The CAR is 32 bits wide.
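The interaction of CAR with the CBE, CRD, CWR and CAE bits can be illustrated with a small C sketch. This is a simplified interpretation for a single byte reference, not a description of the hardware: it ignores VNP, the DEN, UD and SD gating, and operands wider than one byte. The function name `address_compare_hit` is hypothetical, and the bit positions are the low-order DCR positions shown in Figure 2-7.

```c
#include <assert.h>
#include <stdint.h>

/* Low-order DCR bits as drawn in Figure 2-7. */
#define DCR_CBE0 (1u << 0)
#define DCR_CBE1 (1u << 1)
#define DCR_CBE2 (1u << 2)
#define DCR_CBE3 (1u << 3)
#define DCR_VNP  (1u << 4)
#define DCR_CWR  (1u << 5)
#define DCR_CRD  (1u << 6)
#define DCR_CAE  (1u << 7)

/* Simplified model: returns 1 if a one-byte reference to `address`
 * would satisfy the address-compare condition, 0 otherwise. */
int address_compare_hit(uint32_t dcr, uint32_t car,
                        uint32_t address, int is_write)
{
    /* CAR must be double-word aligned. */
    assert((car & 0x3u) == 0);

    if (!(dcr & DCR_CAE))
        return 0;                       /* address compare disabled */
    if (is_write && !(dcr & DCR_CWR))
        return 0;                       /* writes not monitored */
    if (!is_write && !(dcr & DCR_CRD))
        return 0;                       /* reads not monitored */
    if ((address & ~0x3u) != car)
        return 0;                       /* different double-word */

    /* CBEn selects byte n of the aligned double-word. */
    return (int)((dcr >> (address & 0x3u)) & 1u);
}
```

With `dcr = DCR_CAE | DCR_CRD | DCR_CBE1` and `car = 0x1000`, this model reports a hit only for a read of address 0x1001.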
BPC - Breakpoint Program Counter. The BPC Register
contains the address that is compared with the PC contents
to detect a PC-match condition. The BPC Register is 32 bits
wide.
2.2 MEMORY ORGANIZATION
The NS32532 implements full 32-bit virtual addresses. This
allows the CPU to access up to 4 Gbytes of virtual memory.
The memory is a uniform linear address space. Memory
locations are numbered sequentially starting at zero and
ending at 2^32 - 1. The number specifying a memory location is
called an address. The contents of each memory location is
a byte consisting of eight bits. Unless otherwise noted,
diagrams in this document show data stored in memory with
the lowest address on the right and the highest address on
the left. Also, when data is shown vertically, the lowest
address is at the top of a diagram and the highest address at
the bottom of the diagram. When bits are numbered in a
diagram, the least significant bit is given the number zero,
and is shown at the right of the diagram. Bits are numbered
in increasing significance and toward the left.

  7        0
|     A     |
Byte at Address A

Two contiguous bytes are called a word. Except where noted,
the least significant byte of a word is stored at the lower
address, and the most significant byte of the word is stored
at the next higher address. In memory, the address of a
word is the address of its least significant byte, and a word
may start at any address.

 15       8 7        0
|   A+1    |    A     |
    MSB        LSB
Word at Address A

Two contiguous words are called a double-word. Except
where noted, the least significant word of a double-word is
stored at the lowest address and the most significant word
of the double-word is stored at the address two higher. In
memory, the address of a double-word is the address of its
least significant byte, and a double-word may start at any
address.

 31      24 23      16 15       8 7        0
|   A+3    |   A+2    |   A+1    |    A     |
    MSB                              LSB
Double-Word at Address A

Although memory is addressed as bytes, it is actually
organized as double-words. Note that access time to a word or a
double-word depends upon its address, e.g. double-words
that are aligned to start at addresses that are multiples of
four will be accessed more quickly than those not so
aligned. This also applies to words that cross a double-word
boundary.
2.2.1 Address Mapping
Figure 2-9 shows the NS32532 address mapping.
The NS32532 supports the use of memory-mapped peripheral
devices and coprocessors. Such memory-mapped devices
can be located at arbitrary locations in the address
space except for the upper 8 Mbytes of virtual memory
(addresses between FF800000 (hex) and FFFFFFFF (hex),
inclusive), which are reserved by National Semiconductor
Corporation. Nevertheless, it is recommended that
high-performance peripheral devices and coprocessors be located
in a specific 8-Mbyte region of virtual memory (addresses
between FF000000 (hex) and FF7FFFFF (hex), inclusive),
that is dedicated for memory-mapped I/O. This is because
the NS32532 detects references to the dedicated locations
and serializes reads and writes. See Section 3.1.3.3. When
making I/O references to addresses outside the dedicated
region, external hardware must indicate to the NS32532
that special handling is required.
In this case a small performance degradation will also
result. Refer to Section 3.1.3.2 for more information on
memory-mapped I/O.

Address (Hex)
00000000   Memory and I/O
FF000000   Memory-Mapped I/O
FF800000   Reserved by NSC
FFFFFE00   Interrupt Control
FFFFFFFF

FIGURE 2-9. NS32532 Address Mapping
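The byte ordering and address map described above can be sketched in C. This is an illustrative model only: the names `load_double_word`, `is_double_word_aligned`, `classify_address` and the `ns_region` enumeration are hypothetical, and `memory` stands for any byte array used to simulate memory contents.

```c
#include <assert.h>
#include <stdint.h>

/* Regions of the address map in Figure 2-9 (hypothetical names). */
typedef enum {
    REGION_MEMORY_AND_IO,      /* 00000000 to FEFFFFFF */
    REGION_MEMORY_MAPPED_IO,   /* FF000000 to FF7FFFFF */
    REGION_RESERVED            /* FF800000 to FFFFFFFF */
} ns_region;

ns_region classify_address(uint32_t address)
{
    if (address >= 0xFF800000u)
        return REGION_RESERVED;
    if (address >= 0xFF000000u)
        return REGION_MEMORY_MAPPED_IO;
    return REGION_MEMORY_AND_IO;
}

/* A double-word at address A has its least significant byte at A
 * and its most significant byte at A + 3. */
uint32_t load_double_word(const uint8_t *memory, uint32_t a)
{
    return (uint32_t)memory[a]
         | ((uint32_t)memory[a + 1] << 8)
         | ((uint32_t)memory[a + 2] << 16)
         | ((uint32_t)memory[a + 3] << 24);
}

/* Double-words at multiples of four are accessed more quickly. */
int is_double_word_aligned(uint32_t a)
{
    return (a & 0x3u) == 0;
}
```

Placing a peripheral inside `REGION_MEMORY_MAPPED_IO` lets the CPU detect and serialize the references itself, as the text notes; elsewhere, external hardware has to signal the special handling.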
2.3 MODULAR SOFTWARE SUPPORT
The NS32532 provides special support for software modules
and modular programs.
Each module in an NS32532 software environment consists
of three components:
1. Program Code Segment.
   This segment contains the module's code and constant
   data.
2. Static Data Segment.
   Used to store variables and data that may be accessed
   by all procedures within the module.
3. Link Table.
   This component contains two types of entries: Absolute
   Addresses and Procedure Descriptors.
   An Absolute Address is used in the external addressing
   mode, in conjunction with a displacement and the current
   MOD Register contents to compute the effective address
   of an external variable belonging to another module.
   The Procedure Descriptor is used in the call external
   procedure (CXP) instruction to compute the address of an
   external procedure.
Normally, the linker program specifies the locations of the
three components. The Static Data and Link Table typically
reside in RAM; the code component can be either in RAM or
in ROM. The three components can be mapped into
non-contiguous locations in memory, and each can be
independently relocated. Since the Link Table contains the absolute
addresses of external variables, the linker need not assign
absolute memory addresses for these in the module itself;
they may be assigned at load time.
To handle the transfer of control from one module to another,
the NS32532 uses a module table in memory and two
registers in the CPU.
The Module Table is located within the first 64 Kbytes of
virtual memory. This table contains a Module Descriptor
(also called a Module Table Entry) for each module in the
address space of the program. A Module Descriptor has
four 32-bit entries corresponding to each component of a
module:
 - The Static Base entry contains the address of the
   beginning of the module's static data segment.
 - The Link Table Base points to the beginning of the
   module's Link Table.
 - The Program Base is the address of the beginning of the
   code and constant data for the module.
 - A fourth entry is currently unused but reserved.
The MOD Register in the CPU contains the address of the
Module Descriptor for the currently executing module.
The Static Base Register (SB) contains a copy of the Static
Base entry in the Module Descriptor of the currently
executing module, i.e., it points to the beginning of the current
module's static data area.
This register is implemented in the CPU for efficiency
purposes. By having a copy of the static base entry on chip, the
CPU can avoid reading it from memory each time a data
item in the static data segment is accessed.
In an NS32532 software environment modules need not be
linked together prior to loading. As modules are loaded, a
linking loader simply updates the Module Table and fills the
Link Table entries with the appropriate values. No
modification of a module's code is required. Thus, modules may be
stored in read-only memory and may be added to a system
independently of each other, without regard to their
individual addressing. Figure 2-10 shows a typical NS32532
run-time environment.
TL/EE/9354–2
Note: Dashed lines indicate information copied to registers during transfer of control between modules.
FIGURE 2-10. NS32532 Run-Time Environment
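The relationship between the Module Table, the MOD register and the SB register can be sketched as a C data structure. This is a hedged illustration: the struct and function names are hypothetical, the order of the four entries follows the order in which the text lists them (an assumption, since the text does not state the memory layout), and `table_address` stands for wherever the loader placed the table.

```c
#include <assert.h>
#include <stdint.h>

/* One Module Descriptor: four 32-bit entries. Field order follows the
 * order of the list in the text, which is an assumption of this sketch. */
typedef struct {
    uint32_t static_base;      /* start of the static data segment */
    uint32_t link_table_base;  /* start of the Link Table           */
    uint32_t program_base;     /* start of code and constant data   */
    uint32_t reserved;         /* currently unused                  */
} module_descriptor;

/* The two CPU registers involved in a transfer between modules. */
typedef struct {
    uint32_t mod;  /* address of the current Module Descriptor */
    uint32_t sb;   /* copy of that descriptor's Static Base     */
} module_registers;

/* Model of the register updates when control enters module `index`. */
void enter_module(module_registers *regs, const module_descriptor *table,
                  uint32_t table_address, uint32_t index)
{
    uint32_t descriptor_address =
        table_address + index * (uint32_t)sizeof(module_descriptor);

    /* The Module Table lies within the first 64 Kbytes of virtual memory. */
    assert(descriptor_address + sizeof(module_descriptor) <= 0x10000u);

    regs->mod = descriptor_address;
    regs->sb  = table[index].static_base;
}
```

Keeping the Static Base copy in `sb` mirrors the efficiency argument in the text: static data accesses use the register rather than re-reading the descriptor from memory.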
2.4 MEMORY MANAGEMENT
The Memory Management Unit of the NS32532 provides
support for demand-paged virtual memory. The MMU
translates 32-bit virtual addresses into 32-bit physical addresses.
The page size is 4096 bytes.
The mapping from virtual to physical addresses is defined
by means of sets of tables in physical memory. These tables
are found by the MMU using one of its two Page Table Base
registers: PTB0 or PTB1. Which register is used depends on
the currently selected address space. See Section 2.4.2.
Translation efficiency is improved by means of an on-chip
64-entry translation look-aside buffer (TLB). Refer to
Section 3.4.4 for details.
If the MMU detects a protection violation or page fault while
translating an address for a reference required to execute
an instruction, a translation exception (Trap (ABT)) will
result.
2.4.1 Page Tables Structure
The page tables are arranged in a two-level structure, as
shown in Figure 2-11. Each of the MMU's PTBn registers
may point to a Level-1 page table. Each entry of the Level-1
page table may in turn point to a Level-2 page table. Each
Level-2 page table entry contains translation information for
one page of the virtual space.
The Level-1 page table must remain in physical memory
while the PTBn register contains its address and translation
is enabled. Level-2 Page Tables need not reside in physical
memory permanently, but may be swapped into physical
memory on demand as is done with the pages of the virtual
space.
The Level-1 Page Table contains 1024 32-bit Page Table
Entries (PTE's) and therefore occupies 4 Kbytes. Each entry
of the Level-1 Page Table contains a field used to construct
the physical base address of a Level-2 Page Table. This
field is a 20-bit PFN field, providing bits 12–31 of the
physical address. The remaining bits (0–11) are assumed zero,
placing a Level-2 Page Table always on a 4-Kbyte (page)
boundary.
Level-2 Page Tables contain 1024 32-bit Page Table
entries, and so occupy 4 Kbytes (1 page). Each Level-2 Page
Table Entry points to a final 4-Kbyte physical page frame. In
other words, its PFN provides the Page Frame Number
portion (bits 12–31) of the translated address (Figure 2-13).
The OFFSET field of the translated address is taken directly
from the corresponding field of the virtual address.
2.4.2 Virtual Address Spaces
When the Dual Space option is selected for address
translation in the MCR (Section 2.1.5) the on-chip MMU uses two
maps: one for translating addresses presented to it in
Supervisor Mode and another for User Mode addresses. Each
map is referenced by the MMU using one of the two Page
Table Base registers: PTB0 or PTB1. The MMU determines
the map to be used by applying the following rules.
1) While the CPU is in Supervisor Mode (U/S pin = 0), the
   CPU is said to be generating virtual addresses belonging
   to Address Space 0, and the MMU uses the PTB0
   register as its reference for looking up translations from
   memory.
2) While the CPU is in User Mode (U/S pin = 1), and the
   MCR DS bit is set to enable Dual Space translation, the
   CPU is said to be generating virtual addresses belonging
   to Address Space 1, and the MMU uses the PTB1
   register to look up translations.
3) If Dual Space translation is not selected in the MCR,
   there is no Address Space 1, and all virtual addresses
   generated in both Supervisor and User modes are
   considered by the MMU to be in Address Space 0. The privilege
   level of the CPU is used then only for access level
   checking.
Note: When the CPU executes a Dual-Space Move instruction (MOVUSi or
MOVSUi), it temporarily enters User Mode by switching the state of
the U/S pin. Accesses made by the CPU during this time are treated
by the MMU as User-Mode accesses for both mapping and access
level checking. It is possible, however, to force the MMU to assume
Supervisor Mode privilege on such accesses by setting the Access
Override (AO) bit in the MCR (Section 2.1.5).
TL/EE/9354–3
FIGURE 2-11. Two-Level Page Tables
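The three address-space rules above, together with the TU and TS enable bits of the MCR, reduce to two small predicates. The following C sketch is illustrative only: the function names are hypothetical, and the MCR bit positions are those drawn in Figure 2-5 (TU in bit 0, TS in bit 1, DS in bit 2, AO in bit 3).

```c
#include <assert.h>
#include <stdint.h>

/* MCR bit positions as drawn in Figure 2-5. */
#define MCR_TU (1u << 0)  /* Translate User            */
#define MCR_TS (1u << 1)  /* Translate Supervisor      */
#define MCR_DS (1u << 2)  /* Dual Space                */
#define MCR_AO (1u << 3)  /* Access Level Override     */

/* Returns 1 if address translation applies to this reference.
 * `user_mode` is nonzero when the U/S pin is 1. */
int mmu_translation_enabled(uint32_t mcr, int user_mode)
{
    return user_mode ? (mcr & MCR_TU) != 0 : (mcr & MCR_TS) != 0;
}

/* Returns the Address Space (0 or 1); the MMU uses PTB0 for
 * Address Space 0 and PTB1 for Address Space 1. */
int mmu_address_space(uint32_t mcr, int user_mode)
{
    return (user_mode && (mcr & MCR_DS)) ? 1 : 0;
}
```

An operating system that gives each user process its own map would set DS and reload PTB1 on a context switch, leaving the supervisor map in PTB0 untouched.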
2.4.3 Page Table Entry Formats
Figure 2-12 shows the formats of Level-1 and Level-2 Page
Table Entries (PTE's).
The bits are defined as follows:
V    Valid. The V bit is set and cleared only by software.
     V = 1  The PTE is valid and may be used for
            translation by the MMU.
     V = 0  The PTE does not represent a valid translation.
            Any attempt to use this PTE to translate an
            address will cause the MMU to generate an
            Abort trap.
PL   Protection Level. This two-bit field establishes the
     types of accesses permitted for the page in both User
     Mode and Supervisor Mode, as shown in Table 2-1.
     The PL field is modified only by software. In a Level-1
     PTE, it limits the maximum access level allowed for all
     pages mapped through that PTE.

TABLE 2-1. Access Protection Levels
                        Protection Level Bits (PL)
Mode        U/S   00           01           10           11
User         1    no access    no access    read only    full access
Supervisor   0    read only    full access  full access  full access

NU   Not Used. These bits are reserved by National for
     future enhancements. Their values should be set to
     zero.
CI   Cache Inhibit. This bit appears only in Level-2 PTE's.
     It is used to specify non-cacheable pages.
R    Referenced. This is a status bit, set by the MMU and
     cleared by the operating system, that indicates
     whether the page mapped by this PTE has been
     referenced within a period of time determined by the
     operating system. It is intended to assist in
     implementing memory allocation strategies. In a Level-1
     PTE, the R bit indicates only that the Level-2 Page
     Table has been referenced for a translation, without
     necessarily implying that the translation was
     successful. In a Level-2 PTE, it indicates that the page
     mapped by the PTE has been successfully referenced.
     R = 1  The page has been referenced since the
            R bit was last cleared.
     R = 0  The page has not been referenced since
            the R bit was last cleared.
M    Modified. This is a status bit, set by the MMU
     whenever a write cycle is successfully performed to the
     page mapped by this PTE. It is initialized to zero by
     the operating system when the page is brought into
     physical memory.
     M = 1  The page has been modified since it was
            last brought into physical memory.
     M = 0  The page has not been modified since it
            was last brought into physical memory.
     In Level-1 Page Table Entries, this bit position is
     undefined, and is unaltered.
USR  User bits. These bits are ignored by the MMU and
     their values are not changed.
     They can be used by the user software.
PFN  Page Frame Number. This 20-bit field provides bits
     12–31 of the physical address. See Figure 2-13.

 31              12 11   9 8                          0
|       PFN        |  USR  | NU | R |  NU  | PL | V |
First Level PTE

 31              12 11   9 8                          0
|       PFN        |  USR  | M | R | CI | NU | PL | V |
Second Level PTE

FIGURE 2-12. Page Table Entries (PTE's)
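Table 2-1 can be expressed as a small lookup in C. The names `access_rights`, `page_access` and `access_permitted` are hypothetical and introduced only for this sketch; the table contents are taken directly from Table 2-1.

```c
#include <assert.h>

/* Access types listed in Table 2-1 (hypothetical names). */
typedef enum {
    ACCESS_NONE,
    ACCESS_READ_ONLY,
    ACCESS_FULL
} access_rights;

/* Look up the rights granted by protection level `pl` (0 to 3).
 * `user_mode` is nonzero for U/S = 1. */
access_rights page_access(unsigned pl, int user_mode)
{
    static const access_rights table[2][4] = {
        /* Supervisor, U/S = 0 */
        { ACCESS_READ_ONLY, ACCESS_FULL, ACCESS_FULL,      ACCESS_FULL },
        /* User, U/S = 1 */
        { ACCESS_NONE,      ACCESS_NONE, ACCESS_READ_ONLY, ACCESS_FULL },
    };
    assert(pl < 4);
    return table[user_mode ? 1 : 0][pl];
}

/* Returns 1 if a read (is_write == 0) or write (is_write != 0)
 * is permitted, 0 otherwise. */
int access_permitted(unsigned pl, int user_mode, int is_write)
{
    access_rights rights = page_access(pl, user_mode);
    if (rights == ACCESS_FULL)
        return 1;
    if (rights == ACCESS_READ_ONLY)
        return !is_write;
    return 0;
}
```

The table is equivalent to a single comparison: an access is permitted exactly when PL is greater than or equal to the two-bit value formed from the User bit (high) and the write bit (low), which is the comparison against the access level used in the translation algorithm of Section 2.4.5.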
2.4.4 Physical Address Generation
When a virtual address is presented to the MMU and the
translation information is not in the TLB, the MMU performs
a page table lookup in order to generate the physical
address.
The Page Table structure is traversed by the MMU using
fields taken from the virtual address. This sequence is
diagrammed in Figure 2-13.
Bits 12–31 of the virtual address hold the 20-bit Page
Number, which in the course of the translation is replaced with
the 20-bit Page Frame Number of the physical address. The
virtual Page Number field is further divided into two fields,
INDEX 1 and INDEX 2.
Bits 0–11 constitute the OFFSET field, which identifies a
byte's position within the accessed page. Since the byte
position within a page does not change with translation, this
value is not used, and is simply echoed by the MMU as bits
0–11 of the final physical address.
The 10-bit INDEX 1 field of the virtual address is used as an
index into the Level-1 Page Table, selecting one of its 1024
entries. The address of the entry is computed by adding
INDEX 1 (scaled by 4) to the contents of the current Page
Table Base register. The PFN field of that entry gives the
base address of the selected Level-2 Page Table.
The INDEX 2 field of the virtual address (10 bits) is used as
the index into the Level-2 Page Table, by adding it (scaled
by 4) to the base address taken from the Level-1 Page
Table Entry. The PFN field of the selected entry provides the
entire Page Frame Number of the translated address.
The offset field of the virtual address is then appended to
this frame number to generate the final physical address.
TL/EE/9354–4
FIGURE 2-13. Virtual to Physical Address Translation
2.4.5 Address Translation Algorithm
The MMU either translates the 32-bit virtual address to a
32-bit physical address or generates an abort trap to report a
translation error. The algorithm used by the MMU to perform
the translation is compatible with that of the NS32382.
Refer to Appendix C for differences between the two MMUs.
In the description that follows, the symbol 'U' takes the
value 1 for a User-Mode memory reference. A reference is a
User-Mode reference in the following cases:
1. The reference is performed while executing in User-Mode.
2. The reference is for the source operand of a MOVUS
   instruction.
3. The reference is for the destination operand of a MOVSU
   instruction.
The following notations are used in the algorithm.
A || B    A concatenated with B
A.B       B is a field inside register A
(A)       object pointed to by address A
(A).B     B field of the object pointed to by address A
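The lookup described in Section 2.4.4 can be sketched in C against a small simulated physical memory. This is a simplified illustration, not the MMU's implementation: it performs only the address arithmetic and ignores the V, PL, R, M and CI bits, the TLB, and all exception handling. The names `sim_mem`, `sim_read32`, `sim_write32` and `walk_page_tables` are hypothetical, and the simulated memory covers only physical addresses 0 to 1FFF (hex).

```c
#include <assert.h>
#include <stdint.h>

#define SIM_WORDS 2048u                  /* 8 Kbytes of simulated memory */
static uint32_t sim_mem[SIM_WORDS];

uint32_t sim_read32(uint32_t pa)
{
    assert((pa & 0x3u) == 0 && pa / 4u < SIM_WORDS);
    return sim_mem[pa / 4u];
}

void sim_write32(uint32_t pa, uint32_t value)
{
    assert((pa & 0x3u) == 0 && pa / 4u < SIM_WORDS);
    sim_mem[pa / 4u] = value;
}

/* Fields of the virtual address. */
uint32_t va_index1(uint32_t va) { return va >> 22; }             /* bits 22-31 */
uint32_t va_index2(uint32_t va) { return (va >> 12) & 0x3FFu; }  /* bits 12-21 */
uint32_t va_offset(uint32_t va) { return va & 0xFFFu; }          /* bits 0-11  */

/* Two-level lookup: PTB -> Level-1 PTE -> Level-2 PTE -> physical address.
 * Status and protection bits in the PTEs are masked off and ignored. */
uint32_t walk_page_tables(uint32_t ptb, uint32_t va)
{
    uint32_t pte1_address = (ptb & 0xFFFFF000u) + va_index1(va) * 4u;
    uint32_t pte1 = sim_read32(pte1_address);

    uint32_t pte2_address = (pte1 & 0xFFFFF000u) + va_index2(va) * 4u;
    uint32_t pte2 = sim_read32(pte2_address);

    return (pte2 & 0xFFFFF000u) | va_offset(va);
}
```

For instance, with a Level-1 table at physical address 0 whose entry 1 points to a Level-2 table at 1000 (hex), and that table's entry 1 holding frame 00ABC000 (hex), the virtual address 00401ABC (hex) maps to 00ABCABC (hex).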
Each access is associated with one of two Address Spaces
(AS), defined as follows:
   AS = U AND MCR.DS
If AS = 1, Page Table Base Register 1 (PTB1) is used to
select the first-level page table. If AS = 0, PTB0 is used to
select the first-level page table.
The access-level is a 2-bit value used to specify the
privilege level of an access. It is determined as follows:
   BIT1 = U AND (NOT(MCR.AO))
   BIT0 = 1 for write, or read with 'RMW' status
          0 otherwise
START TRANSLATION:
If (U = 0 AND MCR.TS = 0 OR U = 1 AND MCR.TU = 0)
then
   /* address translation disabled */
   (physical address <- virtual address; CIOUT pin <- 0);
else BEGIN /* (see also Figure 2-13) */
1. Select PTB:
   If (MCR.DS = 1 AND U = 1) then
      - PTB = PTB1,
      - AS = 1;
   else (PTB = PTB0, AS = 0);
2. Fetch first level PTE:
   PTE Pointer = PTB.BASE ADDRESS || INDEX1 || 00;
   PTE <- (PTE Pointer); /* Fetch PTE1 */
   Effective PL <- PTE.PL
3. Validate First Level PTE:
   If (PTE.PL < access level) then
      /* Protection Exception */
      - TEAR <- virtual address,
      - clock MSR with MSR.TEX = 11,
      - terminate translation;
   If (PTE.V = 0) then
      /* PTE1 Invalid */
      - TEAR <- virtual address,
      - clock MSR with MSR.TEX = 01,
      - terminate translation;
   If (PTE.R = 0) then
      - Write a Byte (PTE Pointer).R = 1;
   Effective PL <- PTE.PL
4. Fetch second level PTE:
   PTE Pointer = PTE.PFN || INDEX2 || 00;
   PTE <- (PTE Pointer); /* Fetch PTE2 */
   If (PTE.PL < effective PL) then
      - Effective PL <- PTE.PL;
5. Validate Second Level PTE:
   If (PTE.PL < access level) then
      /* Protection Exception */
      - TEAR <- virtual address,
      - clock MSR with MSR.TEX = 11,
      - terminate translation;
   If (PTE.V = 0) then
      /* PTE2 Invalid */
      - TEAR <- virtual address,
      - clock MSR with MSR.TEX = 10,
      - terminate translation;
   If ((read AND NOT interlocked) AND PTE.R = 0) then
      Read-Modify-Write a double-word interlocked
      (PTE Pointer).R = 1;
   If ((write OR interlocked read) AND (PTE.R = 0 OR
      PTE.M = 0)) then Read-Modify-Write a double-word
      interlocked (PTE Pointer).R = 1, (PTE Pointer).M = 1;
6. Generate Physical address:
   physical address <- PTE.PFN || OFFSET
   CIOUT pin <- PTE.CI
   /* Note: CIOUT = 0 in all MMU generated accesses */
7. Update Translation Buffer:
   Select entry for replacement;
   TLB.Virtual Page Number <- INDEX1 || INDEX2;
   TLB.AS <- AS;
   TLB.Physical Frame Number <- PTE.PFN
   TLB.PL <- Effective PL
   TLB.CI <- PTE.CI
   TLB.M <- (PTE Pointer).M
   Enable entry
END
Note 1: The TEAR and MSR are only updated when a Trap (ABT) occurs. It
is possible that the MMU detects a page fault or protection violation
on a reference for an instruction that is not executed, for example
on a prefetch. In that event, Trap (ABT) does not occur, and the
TEAR and MSR are not updated.
Note 2: If the MMU is translating a virtual address to check protection while
executing a RDVAL or WRVAL instruction, then Trap (ABT) occurs
only if the level-1 PTE is invalid and the access is permitted by the
PL-field. These instructions will not generate an abort if the F bit
value can be determined from Level-1 PTE.
2.5 INSTRUCTION SET
2.5.1 General Instruction Format
Figure 2-14 shows the general format of a Series 32000
instruction. The Basic Instruction is one to three bytes long
and contains the Opcode and up to two 5-bit General
Addressing Mode (‘‘Gen’’) fields. Following the Basic
Instruction field is a set of optional extensions, which may appear
depending on the instruction and the addressing modes
selected.
TL/EE/9354–5
FIGURE 2-14. General Instruction Format
Index Bytes appear when either or both Gen fields specify
Scaled Index. In this case, the Gen field specifies only the
Scale Factor (1, 2, 4 or 8), and the Index Byte specifies
which General Purpose Register to use as the index, and
which addressing mode calculation to perform before
indexing. See Figure 2-15.
TL/EE/9354–6
FIGURE 2-15. Index Byte Format
Following Index Bytes come any displacements (addressing
constants) or immediate values associated with the
selected addressing modes. Each Disp/Imm field may contain
one or two displacements, or one immediate value. The size
of a Displacement field is encoded with the top bits of that
field, as shown in Figure 2-16, with the remaining bits
interpreted as a signed (two's complement) value. The size of an
immediate value is determined from the Opcode field. Both
Displacement and Immediate fields are stored most
significant byte first. Note that this is different from the memory
representation of data (Section 2.2).
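Decoding a displacement field can be sketched in C as follows. The sketch assumes the Series 32000 displacement encodings that Figure 2-16 depicts: a leading 0 bit selects a one-byte field holding a 7-bit signed value, leading bits 10 select a two-byte field holding a 14-bit value, and leading bits 11 select a four-byte field holding a 30-bit value. The function name `decode_displacement` is hypothetical, and returning 0 for the reserved pattern 11100000 is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one displacement starting at p (stored most significant byte
 * first). Stores the signed value in *value and returns the field length
 * in bytes (1, 2 or 4), or 0 for the reserved pattern 11100000. */
size_t decode_displacement(const uint8_t *p, int32_t *value)
{
    uint8_t first = p[0];
    int32_t v;

    if ((first & 0x80u) == 0) {            /* 0xxxxxxx: 7-bit value */
        v = first & 0x7F;
        if (v & 0x40)
            v -= 0x80;                     /* sign-extend from bit 6 */
        *value = v;
        return 1;
    }
    if ((first & 0xC0u) == 0x80u) {        /* 10xxxxxx: 14-bit value */
        v = ((first & 0x3F) << 8) | p[1];
        if (v & 0x2000)
            v -= 0x4000;                   /* sign-extend from bit 13 */
        *value = v;
        return 2;
    }
    if (first == 0xE0u)                    /* reserved by National */
        return 0;

    /* 11xxxxxx: 30-bit value */
    v = ((int32_t)(first & 0x3F) << 24) | (p[1] << 16) | (p[2] << 8) | p[3];
    if (v & 0x20000000)
        v -= 0x40000000;                   /* sign-extend from bit 29 */
    *value = v;
    return 4;
}
```

Because the byte E0 (hex) is excluded as a first byte, the most negative double-word displacement this decoder accepts begins with E1 (hex), which is why the usable lower limit is smaller in magnitude than the full 30-bit range would allow.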
TL/EE/9354–7
Byte Displacement: Range -64 to +63
FIGURE 2-16. Displacement Encodings
*Note: The pattern ‘‘11100000’’ for the most significant byte of the
displacement is reserved by National for future enhancements. Therefore, it
should never be used by the user program. This causes the lower
limit of the displacement range to be -(2^29 - 2^24) instead of -2^29.
Some instructions require additional, ‘‘implied’’ immediates
and/or displacements, apart from those associated with
addressing modes. Any such extensions appear at the end of
the instruction, in the order that they appear within the list of
operands in the instruction definition (Section 2.5.3).
2.5.2 Addressing Modes
The CPU generally accesses an operand by calculating its
Effective Address based on information available when the
operand is to be accessed. The method to be used in
performing this calculation is specified by the programmer as
an ‘‘addressing mode.’’
Addressing modes are designed to optimally support
high-level language accesses to variables. In nearly all cases, a
variable access requires only one addressing mode, within
the instruction that acts upon that variable. Extraneous data
movement is therefore minimized.
Addressing Modes fall into nine basic types:
Register: The operand is available in one of the eight
General Purpose Registers. In certain Slave Processor
instructions, an auxiliary set of eight registers may be referenced
instead.
Register Relative: A General Purpose Register contains an
address to which is added a displacement value from the
instruction, yielding the Effective Address of the operand in
memory.
Memory Space: Identical to Register Relative above,
except that the register used is one of the dedicated registers
PC, SP, SB or FP. These registers point to data areas
generally needed by high-level languages.
Memory Relative: A pointer variable is found within the memory space pointed to by the SP, SB or FP register. A displacement is added to that pointer to generate the Effective Address of the operand.
Immediate: The operand is encoded within the instruction. This addressing mode is not allowed if the operand is to be written.
Absolute: The address of the operand is specified by a displacement field in the instruction.
External: A pointer value is read from a specified entry of the current Link Table. To this pointer value is added a displacement, yielding the Effective Address of the operand.
Top of Stack: The currently-selected Stack Pointer (SP0 or SP1) specifies the location of the operand. The operand is pushed or popped, depending on whether it is written or read.
Scaled Index: Although encoded as an addressing mode, Scaled Indexing is an option on any addressing mode except Immediate or another Scaled Index. It has the effect of calculating an Effective Address, then multiplying any General Purpose Register by 1, 2, 4 or 8 and adding it into the total, yielding the final Effective Address of the operand.
Table 2-2 is a brief summary of the addressing modes. For a complete description of their actions, see the Instruction Set Reference Manual.
2.5.3 Instruction Set Summary
Table 2-3 presents a brief description of the NS32532 instruction set. The Format column refers to the Instruction Format tables (Appendix A). The Instruction column gives the instruction as coded in assembly language, and the Description column provides a short description of the function provided by that instruction. Further details of the exact operations performed by each instruction may be found in the Instruction Set Reference Manual.
Notations:
i = Integer length suffix: B = Byte, W = Word, D = Double Word
f = Floating Point length suffix: F = Standard Floating, L = Long Floating
gen = General operand. Any addressing mode can be specified.
short = A 4-bit value encoded within the Basic Instruction (see Appendix A for encodings).
imm = Implied immediate operand. An 8-bit value appended after any addressing extensions.
disp = Displacement (addressing constant): 8, 16 or 32 bits. All three lengths legal.
reg = Any General Purpose Register: R0-R7.
areg = Any Processor Register: Address, Debug, Status, Configuration.
mreg = Any Memory Management Register.
creg = A Custom Slave Processor Register (Implementation Dependent).
cond = Any condition code, encoded as a 4-bit field within the Basic Instruction (see Appendix A for encodings).
TABLE 2-2. NS32532 Addressing Modes
ENCODING  MODE                      ASSEMBLER SYNTAX      EFFECTIVE ADDRESS
Register
00000     Register 0                R0, F0, L0            None: Operand is in the specified register.
00001     Register 1                R1, F1, L1
00010     Register 2                R2, F2, L2
00011     Register 3                R3, F3, L3
00100     Register 4                R4, F4, L4
00101     Register 5                R5, F5, L5
00110     Register 6                R6, F6, L6
00111     Register 7                R7, F7, L7
Register Relative
01000     Register 0 relative       disp(R0)              Disp + Register.
01001     Register 1 relative       disp(R1)
01010     Register 2 relative       disp(R2)
01011     Register 3 relative       disp(R3)
01100     Register 4 relative       disp(R4)
01101     Register 5 relative       disp(R5)
01110     Register 6 relative       disp(R6)
01111     Register 7 relative       disp(R7)
Memory Relative
10000     Frame memory relative     disp2(disp1(FP))      Disp2 + Pointer; Pointer found at address Disp1 + Register.
10001     Stack memory relative     disp2(disp1(SP))      "SP" is either SP0 or SP1, as selected in PSR.
10010     Static memory relative    disp2(disp1(SB))
Reserved
10011     (Reserved for Future Use)
Immediate
10100     Immediate                 value                 None. Operand is input from instruction queue.
Absolute
10101     Absolute                  @disp                 Disp.
External
10110     External                  EXT(disp1) + disp2    Disp2 + Pointer; Pointer is found at Link Table Entry number Disp1.
Top of Stack
10111     Top of stack              TOS                   Top of current stack, using either User or Interrupt Stack Pointer, as selected in PSR. Automatic Push/Pop included.
Memory Space
11000     Frame memory              disp(FP)              Disp + Register; "SP" is either SP0 or SP1, as selected in PSR.
11001     Stack memory              disp(SP)
11010     Static memory             disp(SB)
11011     Program memory            * + disp
Scaled Index
11100     Index, bytes              mode[Rn:B]            EA (mode) + Rn.
11101     Index, words              mode[Rn:W]            EA (mode) + 2 × Rn.
11110     Index, double words       mode[Rn:D]            EA (mode) + 4 × Rn.
11111     Index, quad words         mode[Rn:Q]            EA (mode) + 8 × Rn.
"Mode" and "n" are contained within the Index Byte. EA (mode) denotes the effective address generated using mode.
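The Effective Address column of Table 2-2 can be read as a handful of small formulas. The sketch below is illustrative only: the function names, the `read_dword` callback standing in for a memory read, and the wrap-around at 32 bits are all assumptions made for the example and are not taken from the data sheet.

```python
MASK32 = 0xFFFFFFFF  # assumption: effective addresses wrap at 32 bits

# Scale factors selected by the index size suffix in "mode[Rn:B/W/D/Q]".
SCALE = {"B": 1, "W": 2, "D": 4, "Q": 8}


def ea_register_relative(register: int, disp: int) -> int:
    """Register Relative and Memory Space: Disp + Register."""
    return (register + disp) & MASK32


def ea_memory_relative(read_dword, register: int, disp1: int, disp2: int) -> int:
    """Memory Relative: Disp2 + Pointer, Pointer found at Disp1 + Register."""
    pointer = read_dword((register + disp1) & MASK32)
    return (pointer + disp2) & MASK32


def ea_scaled_index(base_ea: int, rn: int, size: str) -> int:
    """Scaled Index: EA(mode) + scale * Rn, with scale 1, 2, 4 or 8."""
    return (base_ea + SCALE[size] * rn) & MASK32
```

For example, with a pointer `0x2000` stored at address `0x1008`, the operand `4(8(FP))` with FP = `0x1000` resolves to `0x2004`, and indexing a base address of `0x3000` by double words with Rn = 5 gives `0x3014`.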
TABLE 2-3. NS32532 Instruction Set Summary  
MOVES
Format  Operation  Operands        Description
4       MOVi       gen,gen         Move a value.
2       MOVQi      short,gen       Extend and move a signed 4-bit constant.
7       MOVMi      gen,gen,disp    Move Multiple: disp bytes (1 to 16).
7       MOVZBW     gen,gen         Move with zero extension.
7       MOVZiD     gen,gen         Move with zero extension.
7       MOVXBW     gen,gen         Move with sign extension.
7       MOVXiD     gen,gen         Move with sign extension.
4       ADDR       gen,gen         Move Effective Address.
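The difference between the zero-extending (MOVZ), sign-extending (MOVX) and quick (MOVQ) moves listed above comes down to how a narrow value is widened. The following is a minimal sketch of that arithmetic; the helper names are hypothetical and the code models only the value conversion, not operand access or flags.

```python
def zero_extend(value: int, from_bits: int) -> int:
    """Widen by filling the upper bits with zeros (as in MOVZBW, MOVZiD)."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen by replicating the sign bit (as in MOVXBW, MOVXiD, and the
    signed 4-bit constant of MOVQi). Returns the to_bits-wide bit pattern."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):   # negative in the narrow format
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)
```

Under this model a byte `0x80` becomes the word `0x0080` when zero-extended and `0xFF80` when sign-extended, and the 4-bit constant `0xF` (that is, -1) becomes `0xFFFFFFFF` as a double word.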
INTEGER ARITHMETIC
Format  Operation  Operands        Description
4       ADDi       gen,gen         Add.
2       ADDQi      short,gen       Add signed 4-bit constant.
4       ADDCi      gen,gen         Add with carry.
4       SUBi       gen,gen         Subtract.
4       SUBCi      gen,gen         Subtract with carry (borrow).
6       NEGi       gen,gen         Negate (2's complement).
6       ABSi       gen,gen         Take absolute value.
7       MULi       gen,gen         Multiply.
7       QUOi       gen,gen         Divide, rounding toward zero.
7       REMi       gen,gen         Remainder from QUO.
7       DIVi       gen,gen         Divide, rounding down.
7       MODi       gen,gen         Remainder from DIV (Modulus).
7       MEIi       gen,gen         Multiply to Extended Integer.
7       DEIi       gen,gen         Divide Extended Integer.
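The two division pairs in the table differ only in rounding: QUO/REM round the quotient toward zero, while DIV/MOD round it down. The sketch below illustrates the arithmetic on unbounded Python integers; it does not model operand sizes, overflow, or the Divide-By-Zero trap (Python raises `ZeroDivisionError` instead), and the function names are hypothetical.

```python
def quo(dividend: int, divisor: int) -> int:
    """Quotient rounded toward zero (the QUOi rounding rule)."""
    magnitude = abs(dividend) // abs(divisor)
    return -magnitude if (dividend < 0) != (divisor < 0) else magnitude


def rem(dividend: int, divisor: int) -> int:
    """Remainder paired with quo(); takes the sign of the dividend."""
    return dividend - quo(dividend, divisor) * divisor


def div(dividend: int, divisor: int) -> int:
    """Quotient rounded down (the DIVi rounding rule)."""
    return dividend // divisor


def mod(dividend: int, divisor: int) -> int:
    """Remainder paired with div() (modulus); takes the sign of the divisor."""
    return dividend - div(dividend, divisor) * divisor
```

For -7 divided by 2, the first pair gives a quotient of -3 with remainder -1, and the second pair gives a quotient of -4 with modulus 1. The two pairs agree whenever the operands have the same sign or the division is exact.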
PACKED DECIMAL (BCD) ARITHMETIC
Format  Operation  Operands        Description
6       ADDPi      gen,gen         Add Packed.
6       SUBPi      gen,gen         Subtract Packed.
INTEGER COMPARISON
Format  Operation  Operands        Description
4       CMPi       gen,gen         Compare.
2       CMPQi      short,gen       Compare to signed 4-bit constant.
7       CMPMi      gen,gen,disp    Compare Multiple: disp bytes (1 to 16).
LOGICAL AND BOOLEAN
Format  Operation  Operands        Description
4       ANDi       gen,gen         Logical AND.
4       ORi        gen,gen         Logical OR.
4       BICi       gen,gen         Clear selected bits.
4       XORi       gen,gen         Logical Exclusive OR.
6       COMi       gen,gen         Complement all bits.
6       NOTi       gen,gen         Boolean complement: LSB only.
2       Scondi     gen             Save condition code (cond) as a Boolean variable of size i.
SHIFTS
Format  Operation  Operands        Description
6       LSHi       gen,gen         Logical Shift, left or right.
6       ASHi       gen,gen         Arithmetic Shift, left or right.
6       ROTi       gen,gen         Rotate, left or right.
BITS
Format  Operation  Operands        Description
4       TBITi      gen,gen         Test bit.
6       SBITi      gen,gen         Test and set bit.
6       SBITIi     gen,gen         Test and set bit, interlocked.
6       CBITi      gen,gen         Test and clear bit.
6       CBITIi     gen,gen         Test and clear bit, interlocked.
6       IBITi      gen,gen         Test and invert bit.
8       FFSi       gen,gen         Find first set bit.
BIT FIELDS
Bit fields are values in memory that are not aligned to byte boundaries. Examples are PACKED arrays and records used in Pascal. "Extract" instructions read and align a bit field. "Insert" instructions write a bit field from an aligned source.
Format  Operation  Operands            Description
8       EXTi       reg,gen,gen,disp    Extract bit field (array oriented).
8       INSi       reg,gen,gen,disp    Insert bit field (array oriented).
7       EXTSi      gen,gen,imm,imm     Extract bit field (short form).
7       INSSi      gen,gen,imm,imm     Insert bit field (short form).
8       CVTP       reg,gen,gen         Convert to Bit Field Pointer.
ARRAYS
Format  Operation  Operands        Description
8       CHECKi     reg,gen,gen     Index bounds check.
8       INDEXi     reg,gen,gen     Recursive indexing step for multiple-dimensional arrays.
STRINGS
String instructions assign specific functions to the General Purpose Registers:
R4 - Comparison Value
R3 - Translation Table Pointer
R2 - String 2 Pointer
R1 - String 1 Pointer
R0 - Limit Count
Options on all string instructions are:
B (Backward):     Decrement string pointers after each step rather than incrementing.
U (Until match):  End instruction if String 1 entry matches R4.
W (While match):  End instruction if String 1 entry does not match R4.
All string instructions end when R0 decrements to zero.
Format  Operation  Operands        Description
5       MOVSi      options         Move String 1 to String 2.
        MOVST      options         Move string, translating bytes.
5       CMPSi      options         Compare String 1 to String 2.
        CMPST      options         Compare translating, String 1 bytes.
5       SKPSi      options         Skip over String 1 entries.
        SKPST      options         Skip, translating bytes for Until/While.
JUMPS AND LINKAGE
Format  Operation  Operands          Description
3       JUMP       gen               Jump.
0       BR         disp              Branch (PC Relative).
0       Bcond      disp              Conditional branch.
3       CASEi      gen               Multiway branch.
2       ACBi       short,gen,disp    Add 4-bit constant and branch if non-zero.
3       JSR        gen               Jump to subroutine.
1       BSR        disp              Branch to subroutine.
1       CXP        disp              Call external procedure.
3       CXPD       gen               Call external procedure using descriptor.
1       SVC                          Supervisor Call.
1       FLAG                         Flag Trap.
1       BPT                          Breakpoint Trap.
1       ENTER      [reg list],disp   Save registers and allocate stack frame (Enter Procedure).
1       EXIT       [reg list]        Restore registers and reclaim stack frame (Exit Procedure).
1       RET        disp              Return from subroutine.
1       RXP        disp              Return from external procedure call.
1       RETT       disp              Return from trap. (Privileged)
1       RETI                         Return from interrupt. (Privileged)
CPU REGISTER MANIPULATION
Format  Operation  Operands        Description
1       SAVE       [reg list]      Save General Purpose Registers.
1       RESTORE    [reg list]      Restore General Purpose Registers.
2       LPRi       areg,gen        Load Processor Register. (Privileged if PSR, INTBASE, USP, CFG or Debug Registers).
2       SPRi       areg,gen        Store Processor Register. (Privileged if PSR, INTBASE, USP, CFG or Debug Registers).
3       ADJSPi     gen             Adjust Stack Pointer.
3       BISPSRi    gen             Set selected bits in PSR. (Privileged if not Byte length)
3       BICPSRi    gen             Clear selected bits in PSR. (Privileged if not Byte length)
5       SETCFG     [option list]   Set Configuration Register. (Privileged)
FLOATING POINT
Format  Operation  Operands        Description
11      MOVf       gen,gen         Move a Floating Point value.
9       MOVLF      gen,gen         Move and shorten a Long value to Standard.
9       MOVFL      gen,gen         Move and lengthen a Standard value to Long.
9       MOVif      gen,gen         Convert any integer to Standard or Long Floating.
9       ROUNDfi    gen,gen         Convert to integer by rounding.
9       TRUNCfi    gen,gen         Convert to integer by truncating, toward zero.
9       FLOORfi    gen,gen         Convert to largest integer less than or equal to value.
11      ADDf       gen,gen         Add.
11      SUBf       gen,gen         Subtract.
11      MULf       gen,gen         Multiply.
11      DIVf       gen,gen         Divide.
11      CMPf       gen,gen         Compare.
11      NEGf       gen,gen         Negate.
11      ABSf       gen,gen         Take absolute value.
12      POLYf      gen,gen         Polynomial Step.
12      DOTf       gen,gen         Dot Product.
12      SCALBf     gen,gen         Binary Scale.
12      LOGBf      gen,gen         Binary Log.
12      SQRTf      gen,gen         Square Root.
12      MACf       gen,gen         Multiply and Accumulate.
9       LFSR       gen             Load FSR.
9       SFSR       gen             Store FSR.
MEMORY MANAGEMENT
Format  Operation  Operands        Description
14      LMR        mreg,gen        Load Memory Management Register. (Privileged)
14      SMR        mreg,gen        Store Memory Management Register. (Privileged)
14      RDVAL      gen             Validate address for reading. (Privileged)
14      WRVAL      gen             Validate address for writing. (Privileged)
8       MOVSUi     gen,gen         Move a value from Supervisor Space to User Space. (Privileged)
8       MOVUSi     gen,gen         Move a value from User Space to Supervisor Space. (Privileged)
MISCELLANEOUS
Format  Operation  Operands        Description
1       NOP                        No Operation.
1       WAIT                       Wait for interrupt.
1       DIA                        Diagnose. Single-byte "Branch to Self" for hardware breakpointing. Not for use in programming.
14      CINV       [options],gen   Cache Invalidate. (Privileged)
CUSTOM SLAVE
Format  Operation  Operands        Description
15.5    CCAL0c     gen,gen         Custom Calculate.
15.5    CCAL1c     gen,gen
15.5    CCAL2c     gen,gen
15.5    CCAL3c     gen,gen
15.5    CMOV0c     gen,gen         Custom Move.
15.5    CMOV1c     gen,gen
15.5    CMOV2c     gen,gen
15.5    CMOV3c     gen,gen
15.5    CCMP0c     gen,gen         Custom Compare.
15.5    CCMP1c     gen,gen
15.1    CCV0ci     gen,gen         Custom Convert.
15.1    CCV1ci     gen,gen
15.1    CCV2ci     gen,gen
15.1    CCV3ic     gen,gen
15.1    CCV4DQ     gen,gen
15.1    CCV5QD     gen,gen
15.1    LCSR       gen             Load Custom Status Register.
15.1    SCSR       gen             Store Custom Status Register.
15.0    LCR        creg,gen        Load Custom Register. (Privileged)
15.0    SCR        creg,gen        Store Custom Register. (Privileged)
3.0 Functional Description  
This chapter provides details on the functional characteristics of the NS32532 microprocessor.
The chapter is divided into five main sections: Instruction Execution, Exception Processing, Debugging, On-Chip Caches and System Interface.
3.1 INSTRUCTION EXECUTION
To execute an instruction, the NS32532 performs the following operations:
• Fetch the instruction
• Read source operands, if any (1)
• Calculate results
• Write result operands, if any
• Modify flags, if necessary
• Update the program counter
Under most circumstances, the CPU can be thought of as executing instructions by completing the operations above in strict sequence for one instruction and then beginning the sequence of operations for the next instruction. However, due to the internal instruction pipelining, as well as the occurrence of exceptions, the sequence of operations performed during the execution of an instruction may be altered. Furthermore, exceptions also break the sequentiality of the instructions executed by the CPU.
Details on the effects of the internal pipelining, as well as the occurrence of exceptions on the instruction execution, are provided in the following sections.
Note: 1 In this and following sections, memory locations read by the CPU to calculate effective addresses for Memory-Relative and External addressing modes are considered like source operands, even if the effective address is being calculated for an operand with access class of write.
3.1.1 Operating States
The CPU has five operating states regarding the execution of instructions and the processing of exceptions: Reset, Executing Instructions, Processing An Exception, Waiting-For-An-Interrupt, and Halted. The various states and transitions between them are shown in Figure 3-1.
TL/EE/9354–8
FIGURE 3-1. Operating States
Whenever the RST signal is asserted, the CPU enters the Reset state. The CPU remains in the Reset state until the RST signal is driven inactive, at which time it enters the Executing-Instructions state. In the Reset state the contents of certain registers are initialized. Refer to Section 3.5.3 for details.
In the Executing-Instructions state, the CPU executes instructions. It exits this state when an exception is recognized or a WAIT instruction is encountered, at which time it enters the Processing-An-Exception state or the Waiting-For-An-Interrupt state, respectively.
While in the Processing-An-Exception state, the CPU saves the PC, PSR and MOD register contents on the stack and reads the new PC and module linkage information to begin execution of the exception service procedure (see note).
Following the completion of all data references required to process an exception, the CPU enters the Executing-Instructions state.
In the Waiting-For-An-Interrupt state, the CPU is idle. A special status identifying this state is presented on the system interface (Section 3.5). When an interrupt or a debug condition is detected, the CPU enters the Processing-An-Exception state.
The CPU enters the Halted state when a bus error or abort is detected while the CPU is processing an exception, thereby preventing the transfer of control to an appropriate exception service procedure. The CPU remains in the Halted state until reset occurs. A special status identifying this state is presented on the system interface.
Note: When the Direct-Exception mode is enabled, the CPU does not save the MOD Register contents nor does it read the module linkage information for the exception service procedure. Refer to Section 3.2 for details.
3.1.2 Instruction Endings
The NS32532 checks for exceptions at various points while executing instructions. Certain exceptions, like interrupts, are in most cases recognized between instructions. Other exceptions, like Divide-By-Zero Trap, are recognized during execution of an instruction. When an exception is recognized during execution of an instruction, the instruction ends in one of four possible ways: completed, suspended, terminated, or partially completed. Each type of exception causes a particular ending, as specified in Section 3.2.
3.1.2.1 Completed Instructions
When an exception is recognized after an instruction is completed, the CPU has performed all of the operations for that instruction and for all other instructions executed since the last exception occurred. Result operands have been written, flags have been modified, and the PC saved on the Interrupt Stack contains the address of the next instruction to execute. The exception service procedure can, at its conclusion, execute the RETT instruction (or the RETI instruction for vectored interrupts), and the CPU will begin executing the instruction following the completed instruction.
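The five operating states of Section 3.1.1 and the transitions described there can be summarized as a small state machine. The sketch below follows the prose only, since Figure 3-1 is not reproduced here; the event names are hypothetical labels chosen for the example, and any state and event pair the text does not mention is assumed to leave the state unchanged.

```python
from enum import Enum, auto


class State(Enum):
    RESET = auto()
    EXECUTING_INSTRUCTIONS = auto()
    PROCESSING_AN_EXCEPTION = auto()
    WAITING_FOR_AN_INTERRUPT = auto()
    HALTED = auto()


class Event(Enum):  # hypothetical event labels, not data sheet terms
    RST_ASSERTED = auto()
    RST_RELEASED = auto()
    EXCEPTION_RECOGNIZED = auto()
    WAIT_INSTRUCTION = auto()
    EXCEPTION_REFERENCES_DONE = auto()
    BUS_ERROR_OR_ABORT = auto()
    INTERRUPT_OR_DEBUG_CONDITION = auto()


TRANSITIONS = {
    (State.RESET, Event.RST_RELEASED): State.EXECUTING_INSTRUCTIONS,
    (State.EXECUTING_INSTRUCTIONS, Event.EXCEPTION_RECOGNIZED):
        State.PROCESSING_AN_EXCEPTION,
    (State.EXECUTING_INSTRUCTIONS, Event.WAIT_INSTRUCTION):
        State.WAITING_FOR_AN_INTERRUPT,
    (State.PROCESSING_AN_EXCEPTION, Event.EXCEPTION_REFERENCES_DONE):
        State.EXECUTING_INSTRUCTIONS,
    (State.PROCESSING_AN_EXCEPTION, Event.BUS_ERROR_OR_ABORT): State.HALTED,
    (State.WAITING_FOR_AN_INTERRUPT, Event.INTERRUPT_OR_DEBUG_CONDITION):
        State.PROCESSING_AN_EXCEPTION,
}


def next_state(state: State, event: Event) -> State:
    """Return the state that follows `state` when `event` occurs."""
    if event is Event.RST_ASSERTED:
        return State.RESET  # asserting RST forces Reset from any state
    return TRANSITIONS.get((state, event), state)
```

Note that the Halted state has no outgoing entry in the table: as the text says, the CPU leaves it only through reset.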
3.1.2.2 Suspended Instructions
An instruction is suspended when one of several trap conditions or a restartable bus error is detected during execution of the instruction. A suspended instruction has not been completed, but all other instructions executed since the last exception occurred have been completed. Result operands and flags due to be affected by the instruction may have been modified, but only modifications that allow the instruction to be executed again and completed can occur. For certain exceptions (Trap (ABT), Trap (UND), Trap (ILL), and bus errors) the CPU clears the P-flag in the PSR before saving the copy that is pushed on the Interrupt Stack. The PC saved on the Interrupt Stack contains the address of the suspended instruction.
For example, the RESTORE instruction pops up to 8 general-purpose registers from the stack. If an invalid page table entry is detected on one of the references to the stack, then the instruction is suspended. The general-purpose registers due to be loaded by the instruction may have been modified, but the stack pointer still holds the same value that it did when the instruction began.
To complete a suspended instruction, the exception service procedure takes either of two actions:
1. The service procedure can simulate the suspended instruction's execution. After calculating and writing the instruction's results, the flags in the PSR copy saved on the Interrupt Stack should be modified, and the PC saved on the Interrupt Stack should be updated to point to the next instruction to execute. The service procedure can then execute the RETT instruction, and the CPU begins executing the instruction following the suspended instruction. This is the action taken when floating-point instructions are simulated by software in systems without a hardware floating-point unit.
2. The suspended instruction can be executed again after the service procedure has eliminated the trap condition that caused the instruction to be suspended. The service procedure should execute the RETT instruction at its conclusion; then the CPU begins executing the suspended instruction again. This is the action taken by a debugger when it encounters a BPT instruction that was temporarily placed in another instruction's location in order to set a breakpoint.
Note 1: Although the NS32532 allows a suspended instruction to be executed again and completed, the CPU may have read a source operand for the instruction from a memory-mapped peripheral port before the exception was recognized. In such a case, the characteristics of the peripheral device may prevent correct reexecution of the instruction.
Note 2: It may be necessary for the exception service procedure to alter the P-flag in the PSR copy saved on the Interrupt Stack: If the exception service procedure simulates the suspended instruction and the P-flag was cleared by the CPU before saving the PSR copy, then the saved T-flag must be copied to the saved P-flag (like the floating-point instruction simulation described above). Or if the exception service procedure executes the suspended instruction again and the P-flag was not cleared by the CPU before saving the PSR copy, then the saved P-flag must be cleared (like the breakpoint trap described above). Otherwise, no alteration to the saved P-flag is necessary.
3.1.2.3 Terminated Instructions
An instruction being executed is terminated when reset or a nonrestartable bus error occurs. Any result operands and flags due to be affected by the instruction are undefined, as are the contents of the Stack Pointers. The result operands of other instructions executed since the last serializing operation may not have been written to memory. A terminated instruction cannot be completed.
3.1.2.4 Partially Completed Instructions
When a restartable bus error, interrupt, abort, or debug condition is recognized during execution of a string instruction, the instruction is said to be partially completed. A partially completed instruction has not completed, but all other instructions executed since the last exception occurred have been completed. Result operands and flags due to be affected by the instruction may have been modified, but the values stored in the string pointers and other general-purpose registers used during the instruction's execution allow the instruction to be executed again and completed.
The CPU clears the P-flag in the PSR before saving the copy that is pushed on the Interrupt Stack. The PC saved on the Interrupt Stack contains the address of the partially completed instruction. The exception service procedure can, at its conclusion, simply execute the RETT instruction (or the RETI instruction for vectored interrupts), and the CPU will resume executing the partially completed instruction.
3.1.3 Instruction Pipeline
The NS32532 executes instructions in a heavily pipelined fashion. This allows a significant performance enhancement since the operations of several instructions are performed simultaneously rather than in a strictly sequential manner.
The CPU provides a four-stage internal instruction pipeline. As shown in Figure 3-2, a write buffer, that can hold up to two operands, is also provided to allow write operations to be performed off-line.
TL/EE/9354–9
FIGURE 3-2. NS32532 Internal Instruction Pipeline
Due to the pipelining, operations like fetching one instruction, reading the source operands of a second instruction, calculating the results of a third instruction and storing the results of a fourth instruction, can all occur in parallel.
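The P-flag adjustment rule in Note 2 of Section 3.1.2.2 is easy to get backwards, so it may help to see it written as a decision procedure. This is a minimal sketch under stated assumptions: the saved PSR copy is modeled as two booleans rather than as real PSR bit positions, and the names `SavedPSR`, `Action` and `fix_saved_p_flag` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    SIMULATE = auto()    # service procedure simulates the suspended instruction
    REEXECUTE = auto()   # service procedure lets the CPU execute it again


@dataclass
class SavedPSR:
    """The T and P flags of the PSR copy saved on the Interrupt Stack."""
    t_flag: bool
    p_flag: bool


def fix_saved_p_flag(psr: SavedPSR, action: Action,
                     p_cleared_by_cpu: bool) -> SavedPSR:
    """Apply the Note 2 rule and return the adjusted saved PSR copy."""
    if action is Action.SIMULATE and p_cleared_by_cpu:
        # Simulation case: copy the saved T-flag into the saved P-flag.
        return SavedPSR(t_flag=psr.t_flag, p_flag=psr.t_flag)
    if action is Action.REEXECUTE and not p_cleared_by_cpu:
        # Re-execution case: clear the saved P-flag.
        return SavedPSR(t_flag=psr.t_flag, p_flag=False)
    return psr  # otherwise no alteration is necessary
```

The first branch corresponds to the floating-point simulation example and the second to the debugger breakpoint example given in the text.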
The order of memory references performed by the CPU may also differ from that related to a strictly sequential instruction execution. In fact, when an instruction is being executed, some of the source operands may be read from memory before the instruction is completely fetched. For example, the CPU may read the first source operand for an instruction before it has fetched a displacement used in calculating the address of the second source operand. The CPU, however, always completes fetching an instruction and reading its source operands before writing its results. When more than one source operand must be read from memory to execute an instruction, the operands may be read in any order. Similarly, when more than one result operand is written to memory to execute an instruction, the operands may be written in any order.
An instruction is fetched only after all previous instructions have been completely fetched. However, the CPU may begin fetching an instruction before all of the source operands have been read and results written for previous instructions.
The source operands for an instruction are read only after all previous instructions have been fetched and their source operands read. A source operand for an instruction may be read before all results of previous instructions have been written, except when the source operand's value depends on a result not yet written. The CPU compares the physical address and length of a source operand with those of any results not yet written, and delays reading the source operand until after writing all results on which the source operand depends. Also, the CPU ensures that the interlocked read and write references to execute an SBITIi or CBITIi instruction occur after writing all results of previous instructions and before reading any source operands for subsequent instructions.
The result operands for an instruction are written after all results of previous instructions have been written.
The description above is summarized in Figure 3-3, which shows the precedence of memory references for two consecutive instructions.
TL/EE/9354–10
FIGURE 3-3. Memory References for Consecutive Instructions
(An arrow from one reference to another indicates that the first reference always precedes the second.)
Another consequence of overlapping the operations for several instructions is that the CPU may fetch an instruction and read its source operands, even though the instruction is not executed (e.g., due to the occurrence of an exception). In such a case, the MMU may update the R-bit in Page Table Entries used in referring to the fetched instruction and its source operands.
Special care is needed in the handling of memory-mapped I/O devices. The CPU provides special mechanisms to ensure that the references to these devices are always performed in the order implied by the program. Refer to Section 3.1.3.2 for details.
Note also that the CPU does not check for dependencies between the fetching of an instruction and the writing of previous instructions' results. Therefore, special care is required when executing self-modifying code.
3.1.3.1 Branch Prediction
One problem inherent to all pipelined machines is what is called "Pipeline Breakage". This occurs every time the sequentiality of the instructions is broken, due to the execution of certain instructions or the occurrence of exceptions.
The result of a pipeline breakage is a performance degradation, due to the fact that a certain portion of the pipeline must be flushed and new data must be brought in.
The NS32532 provides a special mechanism, called branch prediction, that helps minimize this performance penalty. When a conditional branch instruction is decoded in the early stages of the pipeline, a prediction on the execution of the instruction is performed.
More precisely, the prediction mechanism predicts backward branches as taken and forward branches as not taken, except for the branch instructions BLE and BNE that are always predicted as taken.
Thus, the resulting probability of correct prediction is fairly high, especially for branch instructions placed at the end of loops.
The sequence of operations performed by the loader and execution units in the CPU is given below:
• Loader detects branches and calculates destination addresses
• Loader uses branch opcode and direction to select between sequential and non-sequential streams
• Loader saves address for alternate stream
• Execution unit resolves branch decision
Due to the branch prediction, some special care is required when writing self-modifying code. Refer to the appropriate section in Appendix B for more information on this subject.
3.1.3.2 Memory-Mapped I/O
The characteristics of certain peripheral devices and the overlapping of instruction execution in the pipeline of the NS32532 require that special handling be applied to memory-mapped I/O references. I/O references differ from memory references in two significant ways, imposing the following requirements:
1. Reading from a peripheral port can alter the value read on the next reference to the same port or another port in the same device. (A characteristic called here "destructive-reading".) Serial communication controllers and FIFO buffers commonly operate in this manner. As explained in "Instruction Pipeline" above, the NS32532 can read the source operands for one instruction while the previous instruction is executing. Because the previous instruction may cause a trap, an interrupt may be recognized, or the flow of control may be otherwise altered, it is a requirement that destructive-reading of source operands before the execution of an instruction be avoided.
3.0 Functional Description (Continued)  
2. Writing to a peripheral port can alter the value read from  
another port of the same device. (A characteristic called  
here ‘‘side-effects of writing’’). For example, before read-  
ing the counter’s value from the NS32202 Interrupt Con-  
trol Unit it is first necessary to freeze the value by writing  
to another control register.  
serializing operation takes place. This is necessary since  
the privilege level might have changed and the instructions  
following the LPRW instruction must be fetched again with  
the new privilege level and possibly with a different MMU  
mapping. See Section 2.4.2.  
The CPU serializes instruction execution after executing one  
of the following instructions: BICPSRW, BISPSRW, BPT,  
CINV, DIA, FLAG (trap taken), LMR, LPR (CFG, INTBASE,  
PSR, UPSR, DCR, BPC, DSR, and CAR only), RETT, RETI,  
and SVC. Figure 3-4 shows the memory references after  
serialization.  
However, as mentioned above, the NS32532 can read the  
source operands for one instruction before writing the re-  
sults of previous instructions unless the addresses indicate  
a dependency between the read and write references.
Consequently, it is a requirement that read and write references
to peripherals that exhibit side-effects of writing must occur in
the order dictated by the instructions.
Note 1: LPRB UPSR can be executed in User Mode to serialize instruction  
execution.  
Note 2: After an instruction that writes a result to memory is executed, the  
updating of the result’s memory location may be delayed until the  
next serializing operation.  
The NS32532 supports 2 methods for handling memory-  
mapped I/O. The first method is more general; it satisfies  
both requirements listed above and places no restriction on  
the location of memory-mapped peripheral devices. The  
second method satisfies only the requirement for side ef-  
fects of writing, and it restricts the location of memory-  
mapped I/O devices, but it is more efficient for devices that  
do not have destructive-read ports.  
Note 3: When reset or a nonrestartable bus error exception occurs, the CPU  
discards any results that have not yet been written to memory.  
The first method for handling memory-mapped I/O uses two  
signals: IOINH and IODEC. When the NS32532 generates a  
read bus cycle, it asserts the output signal IOINH if either of  
the I/O requirements listed above is not satisfied. That is,  
IOINH is asserted during a read bus cycle when (1) the read  
reference is for an instruction that may not be executed or  
(2) the read reference occurs while a write reference is  
pending for a previous instruction. When the read reference  
is to a peripheral device that implements ports with destruc-  
tive-reading or side-effects of writing, the input signal  
IODEC must be asserted; in addition, the device must not  
be selected if IOINH is active. When the CPU detects that  
the IODEC input signal is active while the IOINH output sig-  
nal is also active, it discards the data read during the bus  
cycle and serializes instruction execution. See the next sec-  
tion for details on serializing operations. The CPU then gen-  
erates the read bus cycle again, this time satisfying the re-  
quirements for I/O and driving IOINH inactive.  
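The first method can be pictured as a small decision procedure. The Python sketch below is only an illustrative model of the behavior described above, not a description of the hardware; the function names, the boolean inputs and the action strings are assumptions introduced here.

```python
def ioinh_asserted(speculative: bool, write_pending: bool) -> bool:
    """Model of the IOINH output during a read bus cycle.

    speculative:   the read is for an instruction that may not be executed.
    write_pending: a write for a previous instruction is still pending.
    """
    return speculative or write_pending


def read_cycle(speculative: bool, write_pending: bool, iodec: bool) -> list:
    """Return the modeled sequence of actions for one read reference.

    iodec models the IODEC input, asserted by the system when the
    addressed device has destructive-read or write-side-effect ports.
    """
    actions = []
    if ioinh_asserted(speculative, write_pending):
        actions.append("read with IOINH active")
        if iodec:
            # The device must not have been selected; the data is discarded,
            # execution is serialized and the read is repeated.
            actions.append("discard data")
            actions.append("serialize")
            actions.append("read with IOINH inactive")
    else:
        actions.append("read with IOINH inactive")
    return actions
```

In this model an ordinary memory read (IODEC inactive) completes in a single bus cycle even when IOINH is active; only a read that hits an I/O device while IOINH is active is repeated.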
TL/EE/935411  
FIGURE 3-4. Memory References after Serialization  
3.1.4 Slave Processor Instructions  
The NS32532 recognizes two groups of instructions as being
executable by external slave processors:
- Floating Point Instructions
- Custom Slave Instructions
Each Slave Instruction Set is enabled by a bit in the Configu-  
ration Register (Section 2.1.4). Any Slave Instruction which  
does not have its corresponding Configuration Register bit  
set will trap as undefined, without any Slave Processor com-  
munication attempted by the CPU. This allows software sim-  
ulation of a non-existent Slave Processor.  
Note that the Memory Management Instructions, like Floating
Point and Custom Slave Instructions, have to be enabled
through an appropriate bit in the configuration register
in order to be executable.
However, they are not considered here as Slave Instructions,
since the NS32532 integrates the MMU on-chip and their
execution does not follow the protocol of the Slave Instructions.
The second method for handling memory-mapped I/O uses
a dedicated region of virtual memory. The NS32532 treats
all references to the memory range from address FF000000
to address FFFFFFFF inclusive in a special manner.
While a write to a location in this range is pending, reads  
from locations in the same range are delayed. However,  
reads from locations with addresses lower than FF000000  
may occur. Similarly, reads from locations in the above  
range may occur while writes to locations outside of the  
range are pending.  
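The ordering rule for the dedicated region can be sketched as follows. This is a minimal model, assuming a list of pending write addresses is available; the helper names are hypothetical. It covers only the region rule stated above, not the general address-dependency check between reads and writes.

```python
IO_REGION_START = 0xFF000000  # first address of the dedicated region
IO_REGION_END = 0xFFFFFFFF    # last address of the dedicated region (inclusive)


def in_io_region(addr: int) -> bool:
    """True if addr lies in the dedicated memory-mapped I/O region."""
    return IO_REGION_START <= addr <= IO_REGION_END


def read_must_wait(read_addr: int, pending_write_addrs: list) -> bool:
    """True if a read is delayed by the region rule.

    A read inside the region waits while any write inside the region is
    pending. Reads outside the region, and reads inside the region when
    only writes outside it are pending, are not delayed by this rule.
    """
    if not in_io_region(read_addr):
        return False
    return any(in_io_region(w) for w in pending_write_addrs)
```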
It is to be noted that the CPU may assert IOINH even when
the reference is within the dedicated region. Refer to
Section 3.5.8 for more information on the handling of I/O devices.
3.1.3.3 Serializing Operations
After executing certain instructions or processing an exception,
the CPU serializes instruction execution. Serializing
instruction execution means that the CPU completes writing
all previous instructions’ results to memory, then begins
fetching and executing the next instruction.
For example, when a new value is loaded into the PSR by
executing an LPRW instruction, the pipeline is flushed and a
3.1.4.1 Regular Slave Instruction Protocol
Slave Processor instructions have a three-byte Basic
Instruction field, consisting of an ID Byte followed by an
Operation Word. The ID Byte has three functions:
1) It identifies the instruction as being a Slave Processor
instruction.
2) It specifies which Slave Processor will execute it.
3) It determines the format of the following Operation Word
of the instruction.
Upon receiving a Slave Processor instruction, the CPU initiates
the sequence outlined in Figure 3-5. While applying
Status code 11111 (Broadcast ID, Section 3.5.4.1), the CPU
transfers the ID Byte on bits D24-D31, the operation
TL/EE/935412  
FIGURE 3-5. Regular Slave Instruction Protocol: CPU Actions  
Bits    Field
31-24   ID BYTE
23-16   OPCODE (LOW)
15-8    OPCODE (HIGH)
7-0     XXXXXXXX
FIGURE 3-6. ID and Operation Word
Bits    Field
31-16   ZERO
15      TS
14-8    ZERO
7       N
6       Z
5-3     0
2       L
1       0
0       Q
FIGURE 3-7. Slave Processor Status Word
word on bits D8-D23 in a swapped order of bytes and a
non-used byte XXXXXXXX (X = don’t care) on bits D0-D7
(Figure 3-6).
3.1.4.2 Pipelined Slave Instruction Protocol
In order to increase performance of floating-point instruc-  
tions while maintaining full software compatibility with the  
Series 32000 architecture, the NS32532 incorporates a  
pipelined floating-point protocol. This protocol is designed  
to operate in conjunction with the NS32580 FPC, or any  
other floating-point slave which conforms to the protocol  
and the Series 32000 architecture. The protocol is enabled  
by the PF bit in the CFG register.  
All slave processors observe the bus cycle and inspect the  
identification code. The slave selected by the identification  
code continues with the protocol; other slaves wait for the  
next slave instruction to be broadcast.  
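The layout of the broadcast double-word (Figure 3-6) can be sketched as below. This is an illustrative model only: the function name is hypothetical, the don't-care byte is driven as zero purely by assumption, and the ID value in the usage note is an arbitrary example.

```python
def pack_slave_broadcast(id_byte: int, operation_word: int) -> int:
    """Build the 32-bit value placed on the data bus for the ID broadcast.

    Bits 31-24 carry the ID byte. The 16-bit operation word is placed on
    bits 23-8 with its bytes swapped: the low byte on bits 23-16 and the
    high byte on bits 15-8. Bits 7-0 are don't care (zero in this model).
    """
    low = operation_word & 0xFF
    high = (operation_word >> 8) & 0xFF
    return ((id_byte & 0xFF) << 24) | (low << 16) | (high << 8)
```

For example, `pack_slave_broadcast(0xBE, 0x1234)` yields `0xBE341200`: the ID byte in the top byte, followed by the operation word's low byte `0x34` and then its high byte `0x12`.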
After transferring the slave instruction, the CPU sends to the  
slave any source operands that are located in memory or  
the General-Purpose registers. The CPU then waits for the  
slave to assert SDN or FSSR. While the CPU is waiting, it  
can perform bus cycles to fetch instructions and read  
source operands for instructions that follow the slave in-  
struction being executed. If there are no bus cycles to per-  
form, the CPU is idle with a special Status indicating that it is  
waiting for a slave processor. After the slave asserts SDN or  
FSSR, the CPU follows one of the two sequences described  
below.  
The basic methods of transferring data and control informa-  
tion between the CPU and the FPC, are the same as in the  
regular slave protocol.  
However, in pipelined mode, the CPU may send a new float-  
ing-point instruction to the FPC before the previous instruc-  
tion has been completed.  
Although the CPU can advance as many as four floating-  
point instructions before receiving a completion pulse on  
SDN for the first instruction, full exception recovery is as-  
sured. This is accomplished through a FIFO mechanism  
which maintains the addresses of all the floating-point in-  
structions sent to the FPC for execution.  
If the slave asserts SDN, then the CPU checks whether the  
instruction stores any results to memory or the General-Pur-  
pose registers. The CPU reads any such results from the  
slave by means of 1 or 2 bus cycles and updates the desti-  
nation.  
Pipelined execution can occur only for instructions which do  
not require a result to be read from the FPC.  
If the slave asserts FSSR, then the NS32532 reads a 32-bit  
status word from the slave. The CPU checks bit 0 in the  
slave’s status word to determine whether to update the PSR  
flags or to process an exception. Figure 3-7 shows the for-  
mat of the slave’s status word.  
In cases where a result is to be read back, the CPU will wait  
for instruction completion before issuing the next instruc-  
tion. Floating-point instructions can be divided into two  
groups, depending on the amount of pipelining permitted.  
Group A. Fully-Pipelined Instructions  
If the Q bit in the status word is 0, the CPU updates the N, Z  
and L flags in the PSR.  
Instructions in this group can be sent to the FPC before  
previous group A instructions are completed. No instruction  
completion indication from the FPC is required in order to  
continue to another group A or group B instruction.  
If the Q bit in the status word is set to 1, the CPU processes  
either a Trap (UND) if TS is 1 or a Trap (SLAVE) if TS is 0.  
Note 1: Only the floating-point and custom compare instructions are allowed  
to return a value of 0 for the Q bit when the FSSR signal is activat-  
ed. All other instructions must always set the Q bit to 1 (to signal a  
Trap), when activating FSSR.  
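The handling of the slave status word can be sketched as follows. The position of the Q bit (bit 0) is stated in the text; the positions used here for TS, N, Z and L are read from the field order of Figure 3-7 and should be checked against the figure. The function name and the returned dictionary are assumptions made for illustration.

```python
Q_BIT = 1 << 0    # 1 = process a trap, 0 = update PSR flags
L_BIT = 1 << 2
Z_BIT = 1 << 6
N_BIT = 1 << 7
TS_BIT = 1 << 15  # trap select: 1 = Trap (UND), 0 = Trap (SLAVE)


def decode_slave_status(status: int) -> dict:
    """Model the CPU's reaction to a status word read after FSSR."""
    if status & Q_BIT:
        trap = "UND" if status & TS_BIT else "SLAVE"
        return {"action": "trap", "trap": trap}
    return {
        "action": "update_psr",
        "N": bool(status & N_BIT),
        "Z": bool(status & Z_BIT),
        "L": bool(status & L_BIT),
    }
```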
Group A contains floating-point instructions satisfying all of  
the following conditions.  
1. The destination operand is in a floating-point register.  
2. The source operand is not of type TOS or IMM.  
3. The instruction format is either 11 or 12.  
Group B. Half-Pipelined Instructions  
Note 2: While executing an LMR or CINV instruction, the CPU displays the  
operation code and source operand using slave processor write bus  
cycles, as described in the protocol above. Nevertheless, the CPU  
does not wait for SDN or FSSR to be asserted while executing  
these instructions. This information can be used to monitor the con-  
tents of the on-chip TLB, Instruction Cache, and Data Cache.  
Group B instructions can begin execution before previous  
group A instructions are completed. However, they cannot  
complete before the FPC signals completion of all the previ-  
ous floating-point instructions.  
Note 3: The slave processor must be ready to accept new slave instruction  
at any time, even while the slave is executing another instruction or  
waiting for the CPU to read results. For example, the CPU may  
terminate an instruction being executed by a slave because a non-  
restartable bus error is detected while the MMU is updating a Page  
Table Entry for an instruction being prefetched.  
Group B contains floating-point instructions satisfying at  
least one of the following conditions.  
Note 4: If a slave instruction stores a result to memory, the CPU checks  
whether Trap (ABT) would occur on the store operation before read-  
ing the result from the slave. For quad-word destination operands,  
the CPU checks that both double-words of the destination can be  
stored without an abort before reading either double-word of the  
result from the slave.  
1. The destination operand is either in memory or in a CPU  
register (this includes the CMPf instruction which modifies  
the PSR register).  
2. The source operand is of type TOS or IMM.  
3. The instruction format is 9.  
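The Group A and Group B conditions are complementary, so the classification can be sketched as a single test. The function signature and the addressing-mode strings are hypothetical; only the three conditions themselves come from the text.

```python
def classify_fp_instruction(fmt: int, dest_in_fp_register: bool,
                            source_mode: str) -> str:
    """Return "A" (fully pipelined) or "B" (half pipelined).

    fmt:                 instruction format number (9, 11 or 12).
    dest_in_fp_register: True if the destination is a floating-point register.
    source_mode:         addressing mode of the source, e.g. "REG", "TOS", "IMM".
    """
    if fmt not in (9, 11, 12):
        raise ValueError("not a floating-point instruction format")
    all_group_a_conditions = (
        dest_in_fp_register
        and source_mode not in ("TOS", "IMM")
        and fmt in (11, 12)
    )
    return "A" if all_group_a_conditions else "B"
```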
TL/EE/935473  
FIGURE 3-8. Instruction Flow in Pipelined Floating-Point Mode  
Note: Non-floating-point instructions cannot be pipelined. They can begin  
execution only after all other instructions have been completed. The  
CPU cannot proceed to other instructions before their execution is  
completed.  
The Returned Value Type and Destination column gives the  
size of any returned value and where the CPU places it. The  
PSR-Bits-Affected column indicates which PSR bits, if any,  
are updated from the Slave Processor Status Word (Figure  
3-7).  
3.1.4.3 Instruction Flow and Exceptions  
When operating in pipelined mode, the CPU will push the  
address of group A instructions into a five-entry FIFO after  
the ID, opcode and source operands have been sent to the  
FPC. The address will be pushed into the FIFO only if no  
exception is detected during the transfer of the source oper-  
ands needed for the execution of the instruction.  
Any operand indicated as being of type ‘‘f’’ will not cause a  
transfer if the Register addressing mode is specified. This is  
because the Floating Point Registers are physically on the  
Floating Point Unit and are therefore available without CPU  
assistance.  
3.1.4.5 Custom Slave Instructions  
Group A instructions are only stalled when the FIFO is full,  
in which case the CPU will wait before sending the next  
instruction. Group B instructions can begin execution while  
some entries are still in the FIFO, but cannot complete be-  
fore the FIFO is empty (i.e., before all previous instructions  
are completed). Non-floating-point instructions cannot begin  
execution until the FIFO is empty. When a normal comple-  
tion indication is received, the instruction address at the bot-  
tom of the FIFO is dropped. If a trap indication is received  
and the FIFO is not empty, the instruction address at the  
bottom of the FIFO is copied to the PC register and the  
floating-point exception is serviced. The remaining entries in  
the FIFO are discarded.  
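The FIFO behavior described above can be modeled in a few lines. This is a sketch under stated assumptions: the class and method names are hypothetical, and "the bottom of the FIFO" is taken to mean the oldest entry.

```python
from collections import deque


class FpAddressFifo:
    """Toy model of the five-entry floating-point address FIFO."""

    DEPTH = 5

    def __init__(self):
        self.entries = deque()

    def is_empty(self) -> bool:
        return not self.entries

    def can_issue_group_a(self) -> bool:
        # Group A instructions stall only while the FIFO is full.
        return len(self.entries) < self.DEPTH

    def issue_group_a(self, address: int) -> None:
        if not self.can_issue_group_a():
            raise RuntimeError("FIFO full: the CPU must wait")
        self.entries.append(address)

    def complete(self) -> int:
        """Normal completion: drop the oldest address and return it."""
        return self.entries.popleft()

    def trap(self):
        """Trap indication: return the address to load into the PC.

        The oldest address is used and the remaining entries are
        discarded. Returns None if the FIFO is already empty.
        """
        if not self.entries:
            return None
        pc = self.entries[0]
        self.entries.clear()
        return pc
```

In this model a Group B instruction would be allowed to complete, and a non-floating-point instruction to begin, only once `is_empty()` returns True.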
Provided in the NS32532 is the capability of communicating  
with a user-defined, ‘‘Custom’’ Slave Processor. The in-  
struction set provided for a Custom Slave Processor defines  
the instruction formats, the operand classes and the com-  
munication protocol. Left to the user are the interpretations  
of the Op Code fields, the programming model of the Cus-  
tom Slave and the actual types of data transferred. The pro-  
tocol specifies only the size of an operand, not its data type.  
Table 3-2 lists the relevant information for the Custom Slave  
instruction set. The designation ‘‘c’’ is used to represent an  
operand which can be a 32-bit (‘‘D’’) or 64-bit (‘‘Q’’) quantity  
in any format; the size is determined by the suffix on the  
mnemonic. Similarly, an ‘‘i’’ indicates an integer size (Byte,  
Word, Double Word) selected by the corresponding mne-  
monic suffix.  
A floating-point exception may be received and serviced at  
any time after the CPU has sent the ID and opcode for the  
first instruction and until the FPC has signalled completion  
for the last instruction.  
Any operand indicated as being of type ‘‘c’’ will not cause a  
transfer if the register addressing mode is specified. It is  
assumed in this case that the slave processor is already  
holding the operand internally.  
Other exceptions may occur while the FIFO is not empty.  
This may be the case when an interrupt is received or a  
translation exception is detected in the access of an oper-  
and needed for the execution of the next floating-point in-  
struction. These exceptions will be processed as soon as  
the FIFO becomes empty, and after any floating-point ex-  
ception has been acknowledged.  
For the instruction encodings, see Appendix A.  
3.2 EXCEPTION PROCESSING  
Exceptions are special events that alter the sequence of  
instruction execution. The CPU recognizes three basic types  
of exceptions: interrupts, traps and bus errors.  
In the event of a non-restartable bus error, the acknowledge  
will occur immediately. The CPU will flush the internal FIFO  
and will reset the FPC by performing a dummy read of the  
slave status word. This operation is performed for both the  
regular and pipelined floating-point protocol and regardless  
of whether any floating-point instruction is pending in the  
FPC instruction queue.  
An interrupt occurs in response to an event signalled by  
activating the NMI or INT input signals. Interrupts are typi-  
cally requested by peripheral devices that require the CPU’s  
attention.  
Traps occur as a result either of exceptional conditions  
(e.g., attempted division by zero) or of specific instructions  
whose purpose is to cause a trap to occur (e.g., supervisor  
call instruction).  
The CPU may cancel the last instruction sent to the FPC by  
sending another ID and opcode, before the last source op-  
erand for that instruction has been sent. Figure 3-8 shows  
the instruction flow in pipelined floating-point mode.  
A bus error exception occurs when the BER signal is acti-  
vated during an instruction fetch or data transfer required by  
the CPU to execute an instruction.  
3.1.4.4 Floating Point Instructions  
Table 3-1 gives the protocols followed for each Floating  
Point instruction. The instructions are referenced by their  
mnemonics. For the bit encodings of each instruction, see  
Appendix A.  
When an exception is recognized, the CPU saves the PC,  
PSR and optionally the MOD register contents on the inter-  
rupt stack and then it transfers control to an exception serv-  
ice procedure.  
The Operand class columns give the Access Class for each  
general operand, defining how the addressing modes are  
interpreted (see Instruction Set Reference Manual).  
Details on the operations performed in the various cases by  
the CPU to enter and exit the exception service procedure  
are given in the following sections.  
The Operand Issued columns show the sizes of the operands
issued to the Floating Point Unit by the CPU. ‘‘D’’ indicates
a 32-bit Double Word. ‘‘i’’ indicates that the instruction
specifies an integer size for the operand (B = Byte, W = Word,
D = Double Word). ‘‘f’’ indicates that the instruction
specifies a Floating Point size for the operand (F = 32-bit
Standard Floating, L = 64-bit Long Floating).
It is to be noted that the reset operation is not treated here
as an exception, even though, like any exception, it alters
the instruction execution sequence. The reason is that the
CPU handles reset in a significantly different way than it
does exceptions.
Refer to Section 3.5.3 for details on the reset operation.
TABLE 3-1. Floating Point Instruction Protocols
Mnemonic  Operand 1  Operand 2  Operand 1  Operand 2  Returned Value  PSR Bits
          Class      Class      Issued     Issued     Type and Dest.  Affected
ADDf      read.f     rmw.f      f          f          f to Op.2       none
SUBf      read.f     rmw.f      f          f          f to Op.2       none
MULf      read.f     rmw.f      f          f          f to Op.2       none
DIVf      read.f     rmw.f      f          f          f to Op.2       none
MOVf      read.f     write.f    f          N/A        f to Op.2       none
ABSf      read.f     write.f    f          N/A        f to Op.2       none
NEGf      read.f     write.f    f          N/A        f to Op.2       none
CMPf      read.f     read.f     f          f          N/A             N, Z, L
FLOORfi   read.f     write.i    f          N/A        i to Op.2       none
TRUNCfi   read.f     write.i    f          N/A        i to Op.2       none
ROUNDfi   read.f     write.i    f          N/A        i to Op.2       none
MOVFL     read.F     write.L    F          N/A        L to Op.2       none
MOVLF     read.L     write.F    L          N/A        F to Op.2       none
MOVif     read.i     write.f    i          N/A        f to Op.2       none
LFSR      read.D     N/A        D          N/A        N/A             none
SFSR      N/A        write.D    N/A        N/A        D to Op.2       none
POLYf     read.f     read.f     f          f          f to F0         none
DOTf      read.f     read.f     f          f          f to F0         none
SCALBf    read.f     rmw.f      f          f          f to Op.2       none
LOGBf     read.f     write.f    f          N/A        f to Op.2       none
SQRTf     read.f     write.f    f          N/A        f to Op.2       none
MACf      read.f     read.f     f          f          f to F1         none
TABLE 3-2. Custom Slave Instruction Protocols
Mnemonic  Operand 1  Operand 2  Operand 1  Operand 2  Returned Value  PSR Bits
          Class      Class      Issued     Issued     Type and Dest.  Affected
CCAL0c    read.c     rmw.c      c          c          c to Op.2       none
CCAL1c    read.c     rmw.c      c          c          c to Op.2       none
CCAL2c    read.c     rmw.c      c          c          c to Op.2       none
CCAL3c    read.c     rmw.c      c          c          c to Op.2       none
CMOV0c    read.c     write.c    c          N/A        c to Op.2       none
CMOV1c    read.c     write.c    c          N/A        c to Op.2       none
CMOV2c    read.c     write.c    c          N/A        c to Op.2       none
CMOV3c    read.c     write.c    c          N/A        c to Op.2       none
CCMP0c    read.c     read.c     c          c          N/A             N,Z,L
CCMP1c    read.c     read.c     c          c          N/A             N,Z,L
CCV0ci    read.c     write.i    c          N/A        i to Op.2       none
CCV1ci    read.c     write.i    c          N/A        i to Op.2       none
CCV2ci    read.c     write.i    c          N/A        i to Op.2       none
CCV3ic    read.i     write.c    i          N/A        c to Op.2       none
CCV4DQ    read.D     write.Q    D          N/A        Q to Op.2       none
CCV5QD    read.Q     write.D    Q          N/A        D to Op.2       none
LCSR      read.D     N/A        D          N/A        N/A             none
SCSR      N/A        write.D    N/A        N/A        D to Op.2       none
LCR*      read.D     N/A        D          N/A        N/A             none
SCR*      write.D    N/A        N/A        N/A        D to Op.1       none
Note:
D = Double Word
i = Integer size (B,W,D) specified in mnemonic.
c = Custom size (D:32 bits or Q:64 bits) specified in mnemonic.
* = Privileged instruction: will trap if CPU is in User Mode.
N/A = Not Applicable to this instruction.
3.2.1 Exception Acknowledge Sequence
When an exception is recognized, the CPU goes through
three major steps:
1) Adjustment of Registers. Depending on the source of the
exception, the CPU may restore and/or adjust the contents
of the Program Counter (PC), the Processor Status
Register (PSR) and the currently-selected Stack Pointer
(SP). A copy of the PSR is made, and the PSR is then set
to reflect Supervisor Mode and selection of the Interrupt
Stack. Trap (TRC) and Trap (OVF) are always disabled.
Maskable interrupts are also disabled if the exception is
caused by an interrupt, Trap (DBG), Trap (ABT) or bus
error.
2) Vector Acquisition. A vector is either obtained from the
data bus or is supplied internally by default.
3) Service Call. The CPU performs one of two sequences
common to all exceptions to complete the acknowledge
process and enter the appropriate service procedure.
The selection between the two sequences depends on
whether the Direct-Exception mode is disabled or enabled.
Direct-Exception Mode Disabled
The Direct-Exception mode is disabled while the DE bit in
the CFG register is 0 (Section 2.1.4). In this case the CPU
first pushes the saved PSR copy along with the contents of
the MOD and PC registers on the interrupt stack. Then it
reads the double-word entry from the Interrupt Dispatch
table at address ‘INTBASE + vector × 4’. See Figures 3-9
and 3-10. The CPU uses this entry to call the exception
service procedure, interpreting the entry as an external
procedure descriptor.
A new module number is loaded into the MOD register from
the least-significant word of the descriptor, and the
static-base pointer for the new module is read from memory and
loaded into the SB register. Then the program-base pointer
for the new module is read from memory and added to the
most-significant word of the module descriptor, which is
interpreted as an unsigned value. Finally, the result is loaded
into the PC register.
Direct-Exception Mode Enabled
The Direct-Exception mode is enabled when the DE bit in
the CFG register is set to 1. In this case the CPU first
pushes the saved PSR copy along with the contents of the
PC register on the Interrupt Stack. The word stored on the
Interrupt Stack between the saved PSR and PC register is
reserved for future use; its contents are undefined. The CPU
then reads the double-word entry from the Interrupt Dispatch
Table at address ‘INTBASE + vector × 4’. The CPU
uses this entry to call the exception service procedure,
interpreting the entry as an absolute address that is simply
loaded into the PC register. Figure 3-11 provides a pictorial of
the acknowledge sequence. It is to be noted that while the
TL/EE/935413  
FIGURE 3-9. Interrupt Dispatch Table  
TL/EE/935414  
TL/EE/935415  
FIGURE 3-10. Exception Acknowledge Sequence.  
Direct-Exception Mode Disabled.  
TL/EE/935416  
TL/EE/935417  
FIGURE 3-11. Exception Acknowledge Sequence.  
Direct-Exception Mode Enabled.  
direct-exception mode is enabled, the CPU can respond  
more quickly to interrupts and other exceptions because  
fewer memory references are required to process an excep-  
tion. The MOD and SB registers, however, are not initialized  
before the CPU transfers control to the service procedure.  
Consequently, the service procedure is restricted from exe-  
cuting any instructions, such as CXP, that use the contents  
of the MOD or SB registers in effective address calcula-  
tions.  
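The Service Call step can be illustrated with a short sketch of the address arithmetic. The function names are hypothetical, and `program_base` stands in for the program-base pointer that the CPU reads from memory for the new module; the memory accesses themselves are not modeled.

```python
MASK32 = 0xFFFFFFFF


def dispatch_entry_address(intbase: int, vector: int) -> int:
    """Address of the Dispatch Table entry: INTBASE + vector * 4."""
    return (intbase + vector * 4) & MASK32


def split_descriptor(entry: int) -> tuple:
    """Split an external procedure descriptor into (MOD, offset).

    The least-significant word is the new module number; the
    most-significant word is an unsigned offset from the program base.
    """
    return entry & 0xFFFF, (entry >> 16) & 0xFFFF


def service_pc(entry: int, direct_exception: bool, program_base: int = 0) -> int:
    """Compute the PC of the service procedure from a dispatch entry."""
    if direct_exception:
        # DE = 1: the entry is an absolute address loaded into the PC.
        return entry & MASK32
    _mod, offset = split_descriptor(entry)
    return (program_base + offset) & MASK32
```

With the Direct-Exception mode enabled the entry is used directly, which is why fewer memory references are needed than in the descriptor-based sequence.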
mode procedures, RETT can also adjust the Stack Pointer  
(SP) to discard a specified number of bytes from the original  
stack as surplus parameter space.  
RETI is used to return from a maskable interrupt service
procedure. Unlike RETT, RETI also informs any
external interrupt control units that interrupt service has
completed. Since interrupts are generally asynchronous
external events, RETI does not discard parameters from the
stack.
Both of the above instructions always restore the Program  
Counter (PC) and the Processor Status Register from the  
interrupt stack. If the Direct-Exception mode is disabled,  
they also restore the MOD and SB register contents. Fig-  
ures 3-12 and 3-13 show the RETT and RETI instruction  
flows when the Direct-Exception mode is disabled.  
3.2.2 Returning from an Exception Service Procedure  
To return control to an interrupted program, one of two in-  
structions can be used: RETT (Return from Trap) and RETI  
(Return from Interrupt).  
RETT is used to return from any trap, non-maskable inter-  
rupt or bus error service procedure. Since some traps are  
often used deliberately as a call mechanism for supervisor  
TL/EE/935418  
FIGURE 3-12. Return from Trap (RETT n) Instruction Flow.  
Direct-Exception Mode Disabled.  
3.2.3 Maskable Interrupts  
3.2.3.2 Vectored Mode: Non-Cascaded Case  
The INT pin is a level-sensitive input. A continuous low level  
is allowed for generating multiple interrupt requests. The in-  
put is maskable, and is therefore enabled to generate inter-  
rupt requests only while the Processor Status Register I bit  
is set. The I bit is automatically cleared during service of an  
INT, NMI, Trap (DBG), Trap (ABT) or Bus Error request, and  
is restored to its original setting upon return from the inter-  
rupt service routine via the RETT or RETI instruction.  
In the Vectored mode, the CPU uses an Interrupt Control  
Unit (ICU) to prioritize many interrupt requests. Upon receipt  
of an interrupt request on the INT pin, the CPU performs an  
‘‘Interrupt Acknowledge, Master’’ bus cycle (Section  
3.5.4.6) reading a vector value from the low-order byte of  
the Data Bus. This vector is then used as an index into the  
Dispatch Table in order to find the External Procedure De-  
scriptor for the proper interrupt service procedure. The serv-  
ice procedure eventually returns via the Return from Inter-  
rupt (RETI) instruction, which performs an End of Interrupt  
bus cycle, informing the ICU that it may re-prioritize any in-  
terrupt requests still pending. The ICU provides the vector  
number again, which the CPU uses to determine whether it  
needs also to inform a Cascaded ICU (see below).  
The INT pin may be configured via the SETCFG instruction
as either Non-Vectored (CFG Register bit I = 0) or
Vectored (bit I = 1).
3.2.3.1 Non-Vectored Mode  
In the Non-Vectored mode, an interrupt request on the INT  
pin will cause an Interrupt Acknowledge bus cycle, but the  
CPU will ignore any value read from the bus and use instead  
a default vector of zero. This mode is useful for small sys-  
tems in which hardware interrupt prioritization is unneces-  
sary.  
In a system with only one ICU (16 levels of interrupt), the  
vectors provided must be in the range of 0 through 127; that  
is, they must be positive numbers in eight bits. By providing  
TL/EE/935419  
FIGURE 3-13. Return from Interrupt (RETI) Instruction Flow.  
Direct-Exception Mode Disabled.  
a negative vector number, an ICU flags the interrupt source  
as being a Cascaded ICU (see below).  
Note: During a return from interrupt the CPU looks at bit 7 of the vector
number from the master ICU. If bit 7 is 0, bits 0 through 6 are ignored.
‘‘Interrupt Acknowledge, Master’’ bus cycle (Section
3.5.4.6) when processing of this interrupt actually begins.
The Interrupt Acknowledge cycle differs from that provided
for Maskable Interrupts in that the address presented is
FFFFFF00 (hexadecimal). The vector value used for the
Non-Maskable Interrupt is taken as 1, regardless of the value
read from the bus.
3.2.3.3 Vectored Mode: Cascaded Case  
In order to allow more levels of interrupt, provision is made  
in the CPU to transparently support cascading. Note that  
the Interrupt output from a Cascaded ICU goes to an Inter-  
rupt Request input of the Master ICU, which is the only ICU  
which drives the CPU INT pin. Refer to the ICU data sheet  
for details.  
The service procedure returns from the Non-Maskable In-  
terrupt using the Return from Trap (RETT) instruction. No  
special bus cycles occur on return.  
3.2.5 Traps  
In a system which uses cascading, two tasks must be per-  
formed upon initialization:  
Traps are processing exceptions that are generated as di-  
rect results of the execution of an instruction.  
1) For each Cascaded ICU in the system, the Master ICU  
must be informed of the line number on which it receives  
the cascaded requests.  
The return address saved on the stack by any trap except
Trap (TRC) and Trap (DBG) is the address of the first byte of
the instruction during which the trap occurred.
2) A Cascade Table must be established in memory. The  
Cascade Table is located in a NEGATIVE direction from  
the location indicated by the CPU Interrupt Base (INT-  
BASE) Register. Its entries are 32-bit addresses, pointing  
to the Vector Registers of each of up to 16 Cascaded  
ICUs.  
When a trap is recognized, maskable interrupts are not dis-  
abled except for the case of Trap (ABT) and Trap (DBG).  
There are 11 trap conditions recognized by the NS32532 as  
described below.  
Trap (ABT): An abort trap occurs when an invalid page ta-  
ble entry or a protection level violation is detected for any of  
the memory references required to execute an instruction.  
Figure 3-9 illustrates the position of the Cascade Table. To  
find the Cascade Table entry for a Cascaded ICU, take its  
Master ICU line number (0 to 15) and subtract 16 from it,  
Trap (SLAVE): An exceptional condition was detected by  
the Floating Point Unit or another Slave Processor during  
the execution of a Slave Instruction. This trap is requested  
via the Status Word returned as part of the Slave Processor  
Protocol (Section 3.1.4.1).  
giving an index in the range −16 to −1. Multiply this value  
by 4, and add the resulting negative number to the contents  
of the INTBASE Register. The 32-bit entry at this address  
must be set to the address of the Hardware Vector Register  
of the Cascaded ICU. This is referred to as the ‘‘Cascade  
Address.’’  
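The Cascade Table address calculation described above can be sketched as follows. This is a minimal illustration, not part of the device specification; the function name `cascade_entry_address` is hypothetical, and addresses are assumed to wrap at 32 bits.

```python
def cascade_entry_address(intbase: int, master_line: int) -> int:
    """Return the address of the Cascade Table entry for a Cascaded ICU.

    intbase     -- contents of the INTBASE register
    master_line -- Master ICU line number (0 to 15) used by the Cascaded ICU
    """
    if not 0 <= master_line <= 15:
        raise ValueError("Master ICU line number must be in the range 0..15")
    index = master_line - 16                  # negative index, -16 to -1
    return (intbase + 4 * index) & 0xFFFFFFFF  # entries are 32 bits wide
```

With INTBASE at 1000 (hex), line 0 maps to the entry 64 bytes below INTBASE and line 15 maps to the entry 4 bytes below it.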
Trap (ILL): Illegal operation. A privileged operation was attempted  
while the CPU was in User Mode (PSR bit U = 1).  
Trap (SVC): The Supervisor Call (SVC) instruction was exe-  
cuted.  
Upon receipt of an interrupt request from a Cascaded ICU,  
the Master ICU interrupts the CPU and provides the nega-  
tive Cascade Table index instead of a (positive) vector num-  
ber. The CPU, seeing the negative value, uses it as an index  
into the Cascade Table and reads the Cascade Address  
from the referenced entry. Applying this address, the CPU  
performs an ‘‘Interrupt Acknowledge, Cascaded’’ bus cycle,  
reading the final vector value. This vector is interpreted by  
the CPU as an unsigned byte, and can therefore be in the  
range of 0 through 255.  
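As an illustration of how the byte returned by the Master ICU is interpreted, the sketch below treats it as a signed 8-bit value: non-negative values are vector numbers, and values from −16 to −1 are Cascade Table indices. The function name `decode_master_byte` and the returned labels are hypothetical; more negative values are reported as reserved.

```python
def decode_master_byte(byte: int):
    """Classify the byte read during 'Interrupt Acknowledge, Master'."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    value = byte - 256 if byte >= 0x80 else byte   # two's-complement sign
    if value >= 0:
        return ("vector", value)        # used directly as the vector number
    if -16 <= value <= -1:
        return ("cascaded", value)      # negative Cascade Table index
    return ("reserved", value)          # more negative values are reserved
```

Note that only the Master ICU byte is signed; the final vector read from a Cascaded ICU is taken as an unsigned byte.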
Trap (DVZ): An attempt was made to divide an integer by  
zero. (The FPU trap is used for Floating Point division by  
zero.)  
Trap (FLG): The FLAG instruction detected a ‘‘1’’ in the  
PSR F bit.  
Trap (BPT): The Breakpoint (BPT) instruction was execut-  
ed.  
Trap (TRC): The instruction just completed is being traced.  
In returning from a Cascaded interrupt, the service proce-  
dure executes the Return from Interrupt (RETI) instruction,  
as it would for any Maskable Interrupt. The CPU performs  
an ‘‘End of Interrupt, Master’’ bus cycle, whereupon the  
Master ICU again provides the negative Cascade Table in-  
dex. The CPU, seeing a negative value, uses it to find the  
corresponding Cascade Address from the Cascade Table.  
Applying this address, it performs an ‘‘End of Interrupt, Cas-  
caded’’ bus cycle, informing the Cascaded ICU of the com-  
pletion of the service routine. The byte read from the Cas-  
caded ICU is discarded.  
Refer to Section 3.3.1 for details.  
Trap (UND): An Undefined-Instruction trap occurs when an  
attempt to execute an instruction is made and one or more  
of the following conditions is detected:  
1. The instruction is undefined. Refer to Appendix A for a  
description of the codes that the CPU recognizes to be  
undefined.  
2. The instruction is a floating point instruction and the F-bit  
in the CFG register is 0.  
3. The instruction is a custom slave instruction and the C-bit  
in the CFG register is 0.  
Note: If an interrupt must be masked off, the CPU can do so by setting the  
corresponding bit in the interrupt mask register of the interrupt con-  
troller.  
4. The instruction is a memory-management instruction and  
the M-bit in the CFG register is 0.  
However, if an interrupt is set pending during the CPU instruction that  
masks off that interrupt, the CPU may still perform an interrupt ac-  
knowledge cycle following that instruction since it might have sampled  
the INT line before the ICU deasserted it. This could cause the ICU to  
provide an invalid vector. To avoid this problem the above operation  
should be performed with the CPU interrupt disabled.  
5. An LMR or SMR instruction is executed while the U-flag  
in the PSR is 0 and the most significant bit of the instruc-  
tion’s short field is 0.  
6. The reserved general addressing mode encoding (10011)  
is used.  
3.2.4 Non-Maskable Interrupt  
The Non-Maskable Interrupt is triggered whenever a falling  
edge is detected on the NMI pin. The CPU performs an  
7. Immediate addressing mode is used for an operand that  
has access class different from read.  
8. Scaled Indexing is used and the basemode is also Scaled  
Indexing.  
The NS32532 does not respond to bus errors indicated for  
instructions that are not executed. For example, no bus er-  
ror exception occurs in response to asserting the BER sig-  
nal during a bus cycle to prefetch an instruction that is not  
executed because the previous instruction caused a trap.  
9. The instruction is a floating-point or custom slave instruc-  
tion that the FPU or custom slave detects to be unde-  
fined. Refer to Section 3.1.4.1 for more information.  
An exception to this rule occurs if the bus error is detected  
during an MMU write cycle to update the R-bit in a page  
table entry.  
Trap (OVF): An Integer-Overflow trap occurs when the V-bit  
in the PSR register is set to 1 and an Integer-Overflow con-  
dition is detected during the execution of an instruction. An  
Integer-Overflow condition is detected in the following cas-  
es:  
In this case the CPU recognizes the bus error and considers  
it as non-restartable even though the bus cycle that caused  
it belongs to a non-executed instruction.  
1. The F-flag is 1 following execution of an ADDi, ADDQi,  
ADDCi, SUBi, SUBCi, NEGi, ABSi, or CHECKi instruction.  
If a bus error is detected during a data transfer required for  
the processing of another exception or during the ICU read  
cycle of a RETI instruction, then the CPU considers it as a  
fatal bus error and enters the ‘HALTED’ state.  
2. The product resulting from a MULi instruction cannot be  
represented exactly in the destination operand’s location.  
3. The quotient resulting from a DEIi, DIVi, or QUOi instruc-  
tion cannot be represented exactly in the destination op-  
erand’s location.  
Note 1: If the address and control signals associated with the last bus cycle  
that caused a bus error are latched by external hardware, then the  
information they provide can be used by the service procedure for  
restartable bus errors to analyze and resolve the exception recog-  
nized by the CPU. This can be accomplished because upon detect-  
ing a restartable bus error, the NS32532 stops making memory ref-  
erences for subsequent instructions until it determines whether the  
instruction that caused the bus error is executed and the exception  
is processed.  
4. The result of an ASHi instruction cannot be represented  
exactly in the destination operand’s location.  
5. The sum of the ‘INC’ value and the ‘INDEX’ operand for  
an ACBi instruction cannot be represented exactly in the  
index operand’s location.  
Note 2: When a non-restartable bus error is recognized, the service proce-  
dure must execute the CINV and LMR instructions to invalidate the  
on-chip caches and TLB. This is necessary to maintain coherence  
between them and external memory.  
Trap (DBG): A debug trap occurs when one or more of the  
conditions selected by the settings of the bits in the DCR  
register is detected. This trap can also be requested by acti-  
vating the input signal DBG. Refer to Section 3.3.2 for more  
information.  
Note 3: If the instruction causing a non-restartable bus error is followed by a  
slave instruction, the service procedure should reset the slave by  
reading the slave status register.  
Note 1: Following execution of the WAIT instruction, a Trap (DBG) can  
be pending for a PC-match condition. In such an event, the Trap  
(DBG) is processed immediately.  
3.2.7 Priority Among Exceptions  
The CPU checks for specific exceptions at various points  
while executing an instruction. It is possible that several ex-  
ceptions occur simultaneously. In that event, the CPU re-  
sponds to the exception with highest priority.  
Note 2: If an attempt is made to execute a memory-management instruction  
while in User-Mode and the M-bit in the CFG register is 0, then Trap  
(UND) occurs.  
Note 3: If an attempt is made to execute a privileged custom instruction  
while in User-Mode and the C-bit in the CFG register is 0, then Trap  
(UND) occurs.  
Figure 3-14 shows an exception processing flowchart. A  
non-restartable bus error is assigned highest priority and is  
serviced immediately regardless of the execution state of  
the CPU.  
Note 4: While operating in User-Mode, if an attempt is made to execute a  
privileged instruction with an undefined use of a general addressing  
mode (either the reserved encoding is used or else scaled-index or  
immediate modes are incorrectly used), the Trap (UND) occurs.  
Before executing an instruction, the CPU checks for pend-  
ing Trap (DBG), interrupts, and Trap (TRC), in that order. If a  
Trap (DBG) is pending, then the CPU processes that excep-  
tion, otherwise the CPU checks for pending interrupts. At  
this point, the CPU responds to any pending interrupt requests;  
nonmaskable interrupts are recognized with higher  
priority than maskable interrupts. If no interrupts are pend-  
ing, then the CPU checks the P-flag in the PSR to determine  
whether a Trap (TRC) is pending. If the P-flag is 1, a Trap  
(TRC) is processed. If no Trap (DBG), interrupt or Trap  
(TRC) is pending, the CPU begins executing the instruction.  
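The pre-instruction check order can be summarized in a short sketch. The function and argument names are hypothetical; the maskable interrupt is assumed to be accepted only when the PSR I bit is set, as stated in Section 3.2.8.1.

```python
def select_pending_exception(dbg_pending: bool, nmi_pending: bool,
                             int_pending: bool, psr_i: bool, psr_p: bool):
    """Return the exception serviced before the next instruction, or None."""
    if dbg_pending:
        return "DBG"                  # Trap (DBG) is checked first
    if nmi_pending:
        return "NMI"                  # non-maskable outranks maskable
    if int_pending and psr_i:
        return "INT"                  # maskable interrupt, enabled by PSR I
    if psr_p:
        return "TRC"                  # P-flag set: Trace Trap pending
    return None                       # nothing pending: execute instruction
```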
Note 5: If an undefined instruction or illegal operation is detected, then no  
data references are performed for the instruction.  
Note 6: For certain instructions that are relatively long to execute, such as  
DEID, the CPU checks for pending interrupts during execution of the  
instruction. In order to reduce interrupt latency, the NS32532 can  
suspend executing the instruction and process the interrupt. Refer  
to Section B.5 in Appendix B for more information about recognizing  
interrupts in this manner.  
3.2.6 Bus Errors  
A bus error exception occurs when the BER signal is assert-  
ed in response to an instruction fetch or data transfer that is  
required to execute an instruction.  
While executing an instruction, the CPU may recognize up  
to four exceptions:  
Two types of bus errors are recognized: Restartable and  
Non-Restartable. Restartable bus errors are recognized dur-  
ing read bus cycles, except for MMU read cycles (from Page  
Tables) needed to translate the address of a result being  
stored into memory. All other bus errors are non-restartable.  
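The classification rule above can be written as a small predicate. This is only a sketch; the function name and the two boolean arguments are hypothetical, and the special cases described elsewhere in this section (for example fatal bus errors during exception processing) are not modeled.

```python
def classify_bus_error(is_read_cycle: bool,
                       mmu_read_for_result_store: bool = False) -> str:
    """Classify a bus error by the type of bus cycle on which BER was asserted.

    mmu_read_for_result_store -- True for an MMU read from the Page Tables
    needed to translate the address of a result being stored into memory.
    """
    if is_read_cycle and not mmu_read_for_result_store:
        return "restartable"
    return "non-restartable"
```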
1. trap (ABT)  
2. restartable bus error  
3. trap (DBG) or interrupt, if the instruction is interruptible  
4. one of 7 mutually exclusive traps: SLAVE, ILL, SVC, DVZ,  
FLG, BPT, UND  
The CPU responds to restartable bus errors by suspending  
the instruction that it was executing. When a non-restartable  
bus error is detected, the CPU responds immediately and  
the instruction being executed is terminated. See Section  
3.1.2.3.  
Trap (ABT) and restartable bus error have equal priority; the  
CPU responds to the first one detected.  
If no exception is detected while the instruction is executing,  
then the instruction is completed and the PC is updated to  
point to the next instruction. If a Trap (OVF) is detected,  
then it is processed at this time.  
The PC value saved on the stack is undefined.  
TL/EE/935420  
FIGURE 3-14. Exception Processing Flowchart  
While executing the instruction, the CPU checks for enabled  
debug conditions. If an enabled debug condition is met, a  
Trap (DBG) is held pending until after the instruction is com-  
pleted (see Note 3). If another exception is detected before  
the instruction is completed, the pending Trap (DBG) is re-  
moved and the DSR register is not updated.  
7. If ‘‘Byte’’ is in the range −16 through −1, then the interrupt  
source is Cascaded. (More negative values are reserved  
for future use.) Perform the following:  
a. Read the 32-bit Cascade Address from memory. The  
address is calculated as INTBASE + 4 * Byte.  
b. Read ‘‘Vector,’’ applying the Cascade Address just  
read and Status Code 00101 (Interrupt Acknowledge,  
Cascaded).  
Note 1: Trap (DBG) can be detected simultaneously with Trap (OVF). In this  
event, the Trap (OVF) is processed before the Trap (DBG).  
Note 2: An address-compare debug condition can be detected while pro-  
cessing a bus error, interrupt, or trap. In this event, the Trap (DBG)  
is held pending until after the CPU has processed the first excep-  
tion.  
8. Perform Service (Vector, Return Address), Figure 3-15.  
3.2.8.2 Abort/Restartable Bus Error Sequence  
1. Suspend instruction and restore the currently selected  
Stack Pointer to its original contents at the beginning of  
the instruction.  
Note 3: Between operations of a string instruction, the CPU responds to  
pending operand address compare and external debug conditions  
as well as interrupts. If  
a PC-match debug condition is detected  
while executing a string instruction, then Trap (DBG) is held pending  
until the instruction has completed.  
2. Clear the PSR P bit.  
3. Copy the PSR into a temporary register, then clear PSR  
bits T, V, U, S and I.  
3.2.8 Exception Acknowledge Sequences: Detailed Flow  
For purposes of the following detailed discussion of excep-  
tion acknowledge sequences, a single sequence called  
‘‘service’’ is defined in Figure 3-15.  
4. Set ‘‘Vector’’ to the value corresponding to the exception  
type:  
Abort: Vector = 2  
Upon detecting any interrupt request, trap or bus error con-  
dition, the CPU first performs a sequence dependent upon  
the type of exception. This sequence will include saving a  
copy of the Processor Status Register and establishing a  
vector and a return address. The CPU then performs the  
service sequence.  
Restartable Bus Error: Vector = 11  
5. Set ‘‘Return Address’’ to the address of the first byte of  
the suspended instruction.  
6. Perform Service (Vector, Return Address), Figure 3-15.  
3.2.8.3 SLAVE/ILL/SVC/DVZ/FLG/BPT/UND Trap  
Sequence  
3.2.8.1 Maskable/Non-Maskable Interrupt Sequence  
This sequence is performed by the CPU when the NMI pin  
receives a falling edge, or the INT pin becomes active with  
the PSR I bit set. The interrupt sequence begins either at  
the next instruction boundary or, in the case of an interrupt-  
ible instruction (e.g., string instruction), at the next interrupt-  
ible point during its execution.  
1. Restore the currently selected Stack Pointer and the  
Processor Status Register to their original values at the  
start of the trapped instruction.  
2. Set ‘‘Vector’’ to the value corresponding to the trap type.  
SLAVE: Vector = 3.  
ILL: Vector = 4.  
SVC: Vector = 5.  
DVZ: Vector = 6.  
FLG: Vector = 7.  
BPT: Vector = 8.  
UND: Vector = 10.  
1. If an interruptible instruction was interrupted and not yet  
completed:  
a. Clear the Processor Status Register P bit.  
b. Set ‘‘Return Address’’ to the address of the first byte of  
the interrupted instruction.  
Otherwise, set ‘‘Return Address’’ to the address of the  
next instruction.  
3. If Trap (ILL) or Trap (UND)  
a. Clear the Processor Status Register P bit.  
2. Copy the Processor Status Register (PSR) into a tempo-  
rary register, then clear PSR bits T, V, U, S, P and I.  
4. Copy the Processor Status Register (PSR) into a tempo-  
rary register, then clear PSR bits T, V, U, S and P.  
3. If the interrupt is Non-Maskable:  
5. Set ‘‘Return Address’’ to the address of the first byte of  
the trapped instruction.  
a. Read a byte from address FFFFFF00 (hex), applying  
Status Code 00100 (Interrupt Acknowledge, Master).  
Discard the byte read.  
6. Perform Service (Vector, Return Address), Figure 3-15.  
b. Set ‘‘Vector’’ to 1.  
c. Go to Step 8.  
3.2.8.4 Trace Trap Sequence  
1. In the Processor Status Register (PSR), clear the P bit.  
4. If the interrupt is Non-Vectored:  
2. Copy the PSR into a temporary register, then clear PSR  
bits T, V, U and S.  
a. Read a byte from address FFFFFE00 (hex), applying  
Status Code 00100 (Interrupt Acknowledge, Master).  
Discard the byte read.  
3. Set ‘‘Vector’’ to 9.  
4. Set ‘‘Return Address’’ to the address of the next instruc-  
tion.  
b. Set ‘‘Vector’’ to 0.  
c. Go to Step 8.  
5. Perform Service (Vector, Return Address), Figure 3-15.  
5. Here the interrupt is Vectored. Read ‘‘Byte’’ from address  
FFFFFE00 (hex), applying Status Code 00100 (Interrupt Acknowledge,  
Master).  
3.2.8.5 Integer-Overflow Trap Sequence  
1. Copy the PSR into a temporary register, then clear PSR  
bits T, V, U, S and P.  
6. If ‘‘Byte’’ ≥ 0, then set ‘‘Vector’’ to ‘‘Byte’’ and go to Step 8.  
2. Set ‘‘Vector’’ to 13.  
3. Set ‘‘Return Address’’ to the address of the next instruc-  
tion.  
4. Perform Service (Vector, Return Address), Figure 3-15.  
3.3 DEBUGGING SUPPORT  
The NS32532 provides several features to assist in program  
debugging.  
3.2.8.6 Debug Trap Sequence  
A debug condition can be recognized either at the next in-  
struction boundary or, in the case of an interruptible instruc-  
tion, at the next interruptible point during its execution.  
Besides the Breakpoint (BPT) instruction that can be used  
to generate soft breaks, the CPU also provides instruction  
tracing as well as debug trap (or hardware breakpoints) ca-  
pabilities. Details on these features are provided in the fol-  
lowing sub-sections.  
1. If PC-match condition, then go to Step 3.  
2. If a String instruction was interrupted and not yet com-  
pleted:  
3.3.1 Instruction Tracing  
a. Clear the Processor Status Register P bit.  
Instruction tracing is a very useful feature that can be used  
during debugging to single-step through selected portions of  
a program. Tracing is enabled by setting the T-bit in the PSR  
Register. When enabled, the CPU generates a Trace Trap  
(TRC) after the execution of each instruction.  
b. Set ‘‘Return Address’’ to the address of the first byte of  
the instruction.  
c. Go to Step 4.  
3. Set ‘‘Return Address’’ to the address of the next instruc-  
tion.  
At the beginning of each instruction, the T bit is copied into  
the PSR P (Trace ‘‘Pending’’) bit. If the P bit is set at the end  
of an instruction, then the Trace Trap is activated. If any  
other trap or interrupt request is made during a traced in-  
struction, its entire service procedure is allowed to complete  
before the Trace Trap occurs. Each interrupt and trap se-  
quence handles the P bit for proper tracing, guaranteeing  
only one Trace Trap per instruction, and guaranteeing that  
the Return Address pushed during a Trace Trap is always  
the address of the next instruction to be traced.  
4. Set ‘‘Vector’’ to 14.  
5. Copy the Processor Status Register (PSR) into a tempo-  
rary register, then clear PSR bits T, V, U, S, P and I.  
6. Perform Service (Vector, Return Address), Figure 3-15.  
Note: In case of PC-match or address-compare on write, the Trap (DBG)  
may occur before the instruction is executed.  
3.2.8.7 Non-Restartable Bus Error Sequence  
1. Set ‘‘Vector’’ to 12.  
Because some instructions can clear the T and P bits in the  
PSR, a Trace Trap may not occur at the end of the instruction  
in some cases. This happens when one of the  
privileged instructions BICPSRW or LPRW PSR is executed.  
2. Set ‘‘Return Address’’ to ‘‘Undefined’’.  
3. Copy the Processor Status Register (PSR) into a tempo-  
rary register, then clear PSR bits T, V, U, S, P and I.  
4. Perform a dummy read of the Slave Status Word to reset  
the Slave Processor.  
5. Perform Service (Vector, Return Address), Figure 3-15.  
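For reference, the vector numbers assigned in Sections 3.2.8.1 through 3.2.8.7 can be collected in one table. The sketch below is illustrative only; the short keys `NVI`, `RBE` and `NBE` (non-vectored interrupt, restartable and non-restartable bus error) are hypothetical labels, and the entry address follows the `INTBASE + vector × 4` rule of the service sequence in Figure 3-15.

```python
# Vector numbers as assigned by the exception acknowledge sequences.
VECTORS = {
    "NVI": 0,    # non-vectored maskable interrupt
    "NMI": 1,    # non-maskable interrupt
    "ABT": 2, "SLAVE": 3, "ILL": 4, "SVC": 5, "DVZ": 6,
    "FLG": 7, "BPT": 8, "TRC": 9, "UND": 10,
    "RBE": 11,   # restartable bus error
    "NBE": 12,   # non-restartable bus error
    "OVF": 13, "DBG": 14,
}

def idt_entry_address(intbase: int, exception: str) -> int:
    """Address of the 32-bit Interrupt Dispatch Table entry for an exception."""
    return (intbase + 4 * VECTORS[exception]) & 0xFFFFFFFF
```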
TABLE 3-3. Summary of Exception Processing  
Exception                      Instruction Ending    Cleared Before Saving PSR    Cleared After Saving PSR  
Restartable Bus Error          Suspended             P                            TVUSI  
Nonrestartable Bus Error       Terminated            Undefined                    TVUS  
Interrupt                      Before Instruction    None/P*                      TVUSPI  
ABT                            Suspended             P                            TVUSI  
ILL, UND                       Suspended             P                            TVUS  
SLAVE, SVC, DVZ, FLG, BPT      Suspended             None                         TVUSP  
OVF                            Completed             None                         TVUSP  
TRC                            Before Instruction    P                            TVUS  
DBG                            Before Instruction    None/P*                      TVUSPI  
*Note: The P bit of the saved PSR is cleared in case the exception is acknowledged before the instruction is completed (e.g., interrupted string instruction). This is  
to avoid a mid-instruction trace trap upon return from the Exception Service Routine.  
Service (Vector, Return Address):  
1) Push the PSR copy onto the Interrupt Stack as a 16-bit value.  
2) If Direct-Exception mode is selected, then go to step 4.  
3) Push MOD Register into the Interrupt Stack as a 16-bit value.  
4) Read 32-bit Interrupt Dispatch Table (IDT) entry at address ‘INTBASE + vector × 4’.  
5) If Direct-Exception mode is selected, then go to Step 10.  
6) Move the L.S. word of the IDT entry (Module Field) into the MOD register.  
7) Read the Program Base pointer from memory address ‘MOD + 8’, and add to it the M.S. word of the IDT entry (Offset Field), placing the result in the  
Program Counter.  
8) Read the new Static Base pointer from the memory address contained in MOD, placing it into the SB Register.  
9) Go to Step 11.  
10) Place IDT entry in the Program Counter.  
11) Push the Return Address onto the Interrupt Stack as a 32-bit quantity.  
12) Serialize: Non-sequentially fetch first instruction of Exception Service Routine.  
Note: Some of the Memory Accesses indicated in the service sequence may be performed in an order different from the one shown.  
FIGURE 3-15. Service Sequence  
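The service sequence of Figure 3-15 can be modeled in software as shown below. This is a simplified sketch under stated assumptions: the CPU state is a plain dictionary, `read32` is a caller-supplied function returning the 32-bit value at an address, and the Interrupt Stack is a Python list of (width in bits, value) pairs. These names are hypothetical. Step 12 (serialization) and the actual ordering of memory accesses are not modeled.

```python
MASK32 = 0xFFFFFFFF

def service(cpu, read32, vector, return_address, psr_copy, direct_exception):
    """Model steps 1 to 11 of the service sequence; updates cpu in place."""
    stack = cpu["interrupt_stack"]
    stack.append((16, psr_copy & 0xFFFF))                    # step 1
    if not direct_exception:
        stack.append((16, cpu["MOD"] & 0xFFFF))              # step 3
    entry = read32((cpu["INTBASE"] + 4 * vector) & MASK32)   # step 4
    if direct_exception:
        cpu["PC"] = entry & MASK32                           # step 10
    else:
        cpu["MOD"] = entry & 0xFFFF                          # step 6: Module Field
        program_base = read32((cpu["MOD"] + 8) & MASK32)     # step 7
        cpu["PC"] = (program_base + (entry >> 16)) & MASK32  # add Offset Field
        cpu["SB"] = read32(cpu["MOD"])                       # step 8
    stack.append((32, return_address & MASK32))              # step 11
    return cpu
```

In Direct-Exception mode the IDT entry is used directly as the new Program Counter, so the MOD and SB registers are left unchanged and MOD is not pushed.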
In other cases, it is still possible to guarantee that a Trace  
Trap occurs at the end of the instruction, provided that spe-  
cial care is taken before returning from the Trace Trap Serv-  
ice Procedure. In case a BICPSRB instruction has been ex-  
ecuted, the service procedure should make sure that the T  
bit in the PSR copy saved on the Interrupt Stack is set be-  
fore executing the RETT instruction to return to the program  
being traced. If the RETT or RETI instructions have to be  
traced, the Trace Trap Service Procedure should set the P  
and T bits in the PSR copy on the Interrupt Stack that is  
going to be restored in the execution of such instructions.  
higher priority trap (i.e., ABORT) is detected, the BP signal  
may or may not be asserted.  
Note 1: The assertion of BP is not affected by the setting of the TR bit in the  
DCR register.  
Note 2: While executing the MOVUS and MOVSU instructions, the com-  
pare-address condition is enabled for the User space memory refer-  
ence under control of the UD-bit in the DCR.  
Note 3: When the LPRi instruction is executed to load a new value into the  
BPC, CAR or DCR, it is undefined whether the address-compare  
and PC-match conditions, in effect while executing the instruction,  
are detected under control of the old or new contents of the loaded  
register. Therefore, any LPRi instruction that alters the control of the  
address-compare or PC-match conditions should use register or im-  
mediate addressing mode for the source operand.  
Note: If instruction tracing is enabled while the WAIT instruction is executed,  
the Trap (TRC) occurs after the next interrupt, when the interrupt  
service procedure has returned.  
Note 4: If an exception occurred during the previous instruction, trap (DBG)  
may be taken prior to instruction execution.  
3.3.2 Debug Trap Capability  
3.4 ON-CHIP CACHES  
The CPU recognizes three different conditions to generate a  
Debug Trap:  
The NS32532 provides three on-chip caches: the Instruc-  
tion Cache (IC), the Data Cache (DC) and the Translation  
Look-aside Buffer (TLB).  
1) Address Compare  
2) PC Match  
The first two are used to hold the contents of frequently  
used memory locations, while the TLB holds address-trans-  
lation information.  
3) External  
These conditions can be enabled and monitored through  
the CPU Debug Registers.  
The IC and DC can be individually enabled by setting appro-  
priate bits in the CFG Register (See Section 2.1.4); the TLB  
is automatically enabled when address-translation is en-  
abled.  
An address-compare condition is detected when certain  
memory locations are either read or written. The double-  
word address used for the comparison is specified in the  
CAR Register. The address-compare condition can be sep-  
arately enabled for each of the bytes in the specified dou-  
ble-word, under control of the CBE bits of the DCR Register.  
The VNP bit in the DCR controls whether virtual or physical  
addresses are compared. The CRD and CWR bits in the  
DCR separately enable the address compare condition for  
read and write references; the CAE bit in the DCR can be  
used to disable the compare-address condition indepen-  
dently from the other control bits. The CPU examines the  
address compare condition for all data reads and writes,  
reads of memory locations for effective address calcula-  
tions, Interrupt-Acknowledge and End-of-Interrupt bus cy-  
cles, and memory references for exception processing. An  
address-compare condition is not detected for MMU refer-  
ences to Page Table Entries.  
The CPU also provides a locking feature that allows the  
contents of the IC and DC to be locked to specific memory  
locations. This is accomplished by setting the LIC and LDC  
bits in the CFG register.  
Cache locking can be successfully used in real-time applica-  
tions to guarantee fast access to critical instruction and data  
areas.  
Details on the organization and function of each of the  
caches are provided in the following sections.  
Note: The size and organization of the on-chip caches may change in future  
Series 32000 microprocessors. This, however, will not affect software  
compatibility.  
3.4.1 Instruction Cache (IC)  
The basic structure of the instruction cache (IC) is shown in  
Figure 3-16.  
The PC-match condition is detected when the address of  
the instruction equals the value specified in the BPC regis-  
ter. The PC-match condition is enabled by the PCE bit in the  
DCR.  
The IC stores 512 bytes of code in a direct-mapped organi-  
zation with 32 sets. Direct-mapped means that each set  
contains only one block, thus each memory location can be  
loaded into the IC in only one place.  
Detection of address-compare and PC-match conditions is  
enabled for User and Supervisor Modes by the UD and SD  
bits in the DCR. The DEN-bit can be used to disable detec-  
tion of these two conditions independently from the other  
control bits.  
Each block contains a 23-bit tag, which holds the most-sig-  
nificant bits of the physical address for the locations stored  
in the block, along with 4 double-words and 4 validity bits  
(one for each double-word).  
An external condition is recognized whenever the DBG sig-  
nal is activated.  
A 4-double-word instruction buffer is also provided, which is  
loaded either from a selected cache block or from external  
memory. Instructions are read from this buffer by the loader  
unit and transferred to an 8-byte instruction queue.  
When the CPU detects an address-compare or PC-match  
condition while executing an instruction or processing an  
exception, then Trap (DBG) occurs if the TR bit in the DCR  
is 1. When an external debug condition is detected, Trap  
(DBG) occurs regardless of the TR bit. The cause of the  
Trap (DBG) is indicated in the DSR Register.  
The IC may or may not be enabled to cache an instruction  
being fetched by the CPU. It is enabled when the IC bit in  
the CFG Register is set to 1 and either the address transla-  
tion is disabled or the CI bit in the Level-2 PTE used to  
translate the virtual address of the instruction is set to 0.  
When an address-compare or PC-match condition is detect-  
ed while executing an instruction, the CPU asserts the BP  
signal at the beginning of the next instruction, synchronous-  
ly with PFS. If the instruction is not completed because a  
If the IC is disabled, the CPU bypasses it during the instruc-  
tion fetch and its contents are not affected. The instruction  
is read directly from external memory into the instruction  
buffer.  
TL/EE/935421  
FIGURE 3-16. Instruction Cache Structure  
3.4.2 Data Cache (DC)  
When the IC is enabled, the instruction address bits 4 to 8  
are used to select the IC set where the instruction may be  
stored. The tag corresponding to the single block in the set  
is compared with the 23 most-significant bits of the instruc-  
tion’s physical address. The 4 double-words in this block are  
loaded into the instruction buffer and the 4 validity bits are  
also retrieved. Bits 2 and 3 of the instruction’s physical ad-  
dress select one of these double-words and the associated  
validity bit.  
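The address fields used for the cache lookup can be illustrated with a short sketch. The function name `split_cache_address` is hypothetical; the field boundaries follow the text (bits 2 and 3 select the double-word, bits 4 to 8 select the set, and the 23 most-significant bits form the tag).

```python
def split_cache_address(physical_address: int):
    """Split a 32-bit physical address into (tag, set index, double-word)."""
    dword = (physical_address >> 2) & 0x3          # bits 2-3: 1 of 4 double-words
    set_index = (physical_address >> 4) & 0x1F     # bits 4-8: 1 of 32 sets
    tag = (physical_address >> 9) & 0x7FFFFF       # bits 9-31: 23-bit tag
    return tag, set_index, dword

# 32 sets x 1 block x 4 double-words x 4 bytes = 512 bytes
IC_SIZE_BYTES = 32 * 1 * 4 * 4
```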
The Data Cache (DC) stores 1,024 bytes of data in a two-  
way set associative organization as shown in Figure 3-17.  
Each of the 32 sets has 2 cache blocks. Each block con-  
tains a 23-bit tag, which holds the most-significant bits of  
the physical address for the locations stored in the block,  
along with 4 double-words and 4 validity bits (one for each  
double-word).  
The DC is enabled for a data read when all of the following  
conditions are satisfied.  
If the tag matches and the selected double-word is valid, a  
cache ‘hit’ occurs and the double-word is directly trans-  
ferred to the instruction queue for decoding; otherwise a  
cache ‘miss’ will result.  
• The DC bit in the CFG Register is set to 1.  
• Either the address translation is disabled or the CI bit in  
the Level-2 PTE used to translate the virtual address of  
the data reference is set to 0.  
In the latter case, if the cache is not locked, the CPU will  
take the following actions.  
• The reference is not an interlocked read resulting from  
executing a CBITI or SBITI instruction.  
First, if the tag of the selected block does not match, the tag  
is loaded with the 23 most-significant bits of the instruction  
address and all the validity bits are cleared. Then, the in-  
struction is read from external memory into the instruction  
buffer.  
If the DC is disabled, the CPU bypasses it during the data  
read and its contents are not affected. The data is read  
directly from external memory. The DC is also bypassed for  
MMU reads from Page Table entries during address transla-  
tion and for Interrupt-Acknowledge and End-of-Interrupt bus  
cycles.  
If the CIIN input signal is not active during the fetching of the  
missing instruction, then the IC is updated and the instruc-  
tion double-words fetched from memory are stored into it  
with the validity bits set.  
When the DC is enabled for a data read, the address bits 4  
to 8 are used to select the DC set where the data may be  
stored.  
If the cache is locked, its contents are not affected, as the  
CPU reads the missing instruction from external memory.  
The tags corresponding to the two blocks in the set are  
compared to the 23 most-significant bits of the physical ad-  
dress. Bits 2 and 3 of the address select one double-word in  
each block and the associated validity bit.  
Whenever the CPU accesses external memory, whether or  
not the IC is enabled, it always fetches instruction double-  
words in a non-wrap-around fashion. Refer to Sections  
3.5.4.3 and 3.5.6 for more information.  
If one of the tags matches and the selected double-word in  
the corresponding block is valid, a cache ‘hit’ occurs and  
the data is used to execute the instruction; otherwise a  
cache ‘miss’ will result. In the latter case, if the cache is not  
locked, the CPU will take the following actions.  
The contents of the instruction cache can be invalidated by  
software through the CINV instruction or by hardware  
through the appropriate cache invalidation input signals.  
Clearing the IC bit in the CFG Register also invalidates the  
instruction cache. Refer to Sections 3.5.10 and C.3 for de-  
tails.  
Note: If the IC is enabled for a certain instruction and a ‘miss’ occurs due to  
a tag mismatch, the CPU will clear all the validity bits of the selected  
tag before fetching the instruction from external memory. If the CIIN  
input signal is activated during the fetching of that instruction, the  
validity bits are not set and the IC is not updated.  
3.0 Functional Description (Continued)  
TL/EE/9354–22
FIGURE 3-17. Data Cache Structure  
First, if the tag of either block in the set matches the data  
address, that block is selected for updating. Otherwise, if  
neither tag matches, then the least recently used block is  
selected; its tag is loaded with the 23 most-significant bits of  
the data address, and all the validity bits are cleared.  
vidual pages using the CI-bit in the level-2 Page Table En-  
tries. The CINV instruction can be executed to invalidate  
entirely the Instruction Cache and/or Data Cache; the CINV
instruction can also be executed to invalidate a single
16-byte block in either or both caches.
Then, the data is read from external memory; up to 4
double-words are read into the cache in a wrap-around
fashion. Refer to Sections 3.5.4.3 and 3.5.6 for more
information.
In hardware, the use of the caches can be inhibited for indi-  
vidual locations using the CIIN input signal. A cache invali-  
dation request can cause the entire Instruction Cache and/  
or Data Cache to be invalidated; a cache invalidation re-  
quest can also cause invalidation of a single set in either or  
both caches. Refer to Section 3.5.7 for more information.  
If the CIIN and IODEC input signals are both inactive during  
the bus cycles performed to read the missing data, then the  
DC is updated, as each double-word is read from memory,  
and the corresponding validity bit is set. If the cache is  
locked, its contents are not affected, as the CPU reads the  
missing data from external memory.  
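The data read behavior described above (hit detection, block selection on a miss, and the effect of CIIN, IODEC and cache locking) can be sketched as a toy software model. This is an illustration under stated assumptions, not the device's implementation: the class and method names are hypothetical, the least-recently-used bookkeeping is a guess at a reasonable policy, and a miss is assumed to complete a full 4 double-word fill.

```python
class DataCacheModel:
    """Toy model: 32 sets x 2 blocks x 4 double-words, one validity bit per double-word."""

    def __init__(self):
        self.sets = [[{"tag": None, "valid": [False] * 4} for _ in range(2)]
                     for _ in range(32)]
        self.lru = [0] * 32  # index of the least recently used block in each set (assumed policy)

    def read(self, addr, ciin=False, iodec=False, locked=False):
        tag, s, dw = addr >> 9, (addr >> 4) & 0x1F, (addr >> 2) & 0x3
        blocks = self.sets[s]
        for i, blk in enumerate(blocks):
            if blk["tag"] == tag and blk["valid"][dw]:
                self.lru[s] = 1 - i
                return "hit"
        if locked:
            return "miss"  # locked cache: contents are not affected
        # A block whose tag already matches is selected; otherwise the LRU block is replaced.
        match = [i for i, blk in enumerate(blocks) if blk["tag"] == tag]
        i = match[0] if match else self.lru[s]
        if not match:
            blocks[i]["tag"] = tag
            blocks[i]["valid"] = [False] * 4
        if not (ciin or iodec):
            blocks[i]["valid"] = [True] * 4  # assumes the whole 4 double-word fill completes
        self.lru[s] = 1 - i
        return "miss"


cache = DataCacheModel()
results = [cache.read(0x1234), cache.read(0x1238)]
```

In this model the second read falls in the same 16-byte block as the first and therefore hits, while a read performed with CIIN active allocates the tag but leaves the validity bits clear.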
An external “Bus Watcher” circuit can also be used to help
maintain cache coherence. The Bus Watcher observes the  
CPU’s bus cycles to maintain a copy of the on-chip cache  
tags while also monitoring writes to main memory by DMA  
controllers and other microprocessors in the system. When  
the Bus Watcher detects that a location in one of the on-  
chip caches has been modified in main memory, it issues an  
invalidation request to the CPU. The CPU provides the nec-  
essary information on the system interface to help maintain  
an external copy of the on-chip tags.  
The DC is enabled for a data write whenever the DC bit in  
the CFG Register is set to 1, including interlocked writes  
resulting from executing the CBITI and SBITI instructions,  
and MMU writes to Page Table entries during address trans-  
lation.  
The DC does not use write allocation. This means that, dur-  
ing a write, if a cache ‘hit’ occurs, the DC is updated, other-  
wise it is unaffected. The data is always written through to  
external memory.  
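The write policy just described (write-through, no write allocation) can be summarized in a few lines. The sketch below is a simplification: it keys both the cache and memory by double-word address and ignores sets, tags and validity bits; the function and parameter names are hypothetical.

```python
def dc_write(cache: dict, memory: dict, addr: int, value: int) -> bool:
    """Write one double-word; returns True on a cache hit."""
    hit = addr in cache
    if hit:
        cache[addr] = value  # the cached copy is updated only on a hit
    memory[addr] = value     # the data is always written through to external memory
    return hit


cache, memory = {0x100: 1}, {}
hit_first = dc_write(cache, memory, 0x100, 7)
hit_second = dc_write(cache, memory, 0x200, 9)
```

After these two calls the cache still holds only address 0x100, because the write that missed did not allocate a new entry.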
The status codes differentiate between instruction fetches  
and data reads.  
The set, affected during the bus access (if CIOUT is low), as  
well as the tag can be determined from the address bits A4  
through A8 and A9 through A31 respectively.  
The contents of the data cache can be invalidated by soft-  
ware through the CINV instruction or by hardware through  
the appropriate cache invalidation input signals. Clearing  
the DC bit in the CFG Register also invalidates the data  
cache. Refer to Sections 3.5.10 and C.3 for details.  
During a data read the CPU also indicates, by means of the  
CASEC signal, which block in the set is being updated.  
Whenever a CINV instruction is executed, the operation  
code and operand appear on the system interface using  
slave processor bus cycles. Thus, invalidations of the on-  
chip caches by software can be monitored externally.  
Note: If the DC is enabled for a certain data reference and a ‘miss’ occurs
due to a tag mismatch, the CPU will clear all the validity bits for the least
recently used tag before reading the data from external memory. If
either CIIN or IODEC is activated during the data read bus cycles,
the validity bits are not set and the DC is not updated.
Note, however, that the software is responsible for commu-  
nicating to the external circuitry the values of the cache en-  
able and lock bits in the CFG Register, since the CPU does  
not generate any special cycle (e.g., Slave Cycle) when the  
CFG Register is loaded.  
3.4.3 Cache Coherence Support  
The NS32532 provides several mechanisms for maintaining  
coherence between the on-chip caches and external mem-  
ory. In software, the use of caches can be inhibited for indi-  
3.4.4 Translation Look-aside Buffer (TLB)  
were not already set. For these reasons, there is no need to  
replicate either the V bit or the R bit in the TLB entries.  
The Translation Look-aside Buffer is an on-chip fully asso-  
ciative memory. It provides direct virtual to physical mapping  
for 64 pages, thus minimizing the time needed to perform  
the address translation.  
Whenever a Page Table Entry in memory is altered by soft-  
ware, it is necessary to purge any matching entry from the  
TLB, otherwise the corresponding addresses would be  
translated according to obsolete information. TLB entries  
may be selectively purged by writing a virtual address to one  
of the IVARn registers using the LMR instruction. The TLB  
entry (if any) that matches that virtual address is then  
purged, and its space is made available for another transla-  
tion. Purging is also performed whenever an address space  
is remapped by altering the contents of the PTB0 or PTB1  
register. When this is done, all the TLB entries correspond-  
ing to the address space mapped by that register are  
purged. Turning translation on or off (via the MCR TU and  
TS bits) does not affect the contents of the TLB.  
The efficiency of the on-chip MMU is greatly increased by  
the TLB, which bypasses the much longer Page Table look-  
up in over 99% of the accesses made by the CPU.  
Entries in the TLB are allocated and replaced automatically;  
the operating system is not involved. The TLB entries can-  
not be read or written by software; however, they can be  
purged from it under program control.  
Figure 3-18 shows a model of the TLB. Information is
placed into the TLB whenever a Page Table lookup is
performed. If the retrieved mapping is valid (V = 1 in both
levels of the Page Tables), and the access attempted is
permitted by the protection level, an entry of the TLB is
loaded from the information retrieved from memory.
The on-chip MMU places the Virtual Page Number (VPN)
and the Address Space qualifier (AS) into the tag portion of
the TLB entry.
The value portion of the entry is loaded from the Page
Tables as follows:
• The PFN field (20 bits) as well as the CI and M bits are
loaded from the Level-2 Page Table Entry (PTE2).
• The PL field (2 bits) is loaded to reflect the most restrictive
of the protection levels imposed by the PL fields of
the Level-1 and Level-2 Page Table Entries (PTE1 and
PTE2).
Not shown in the figure is an additional bit associated with
each TLB entry which indicates whether the entry is valid.
Address translation can be either enabled or disabled for a
memory reference. If translation is disabled, then the TLB is
bypassed and the physical address is identical to the virtual
address.
It is possible to maintain an external copy of the valid
contents of the on-chip TLB by observing the CPU’s system
interface during the replacement and invalidation of TLB
entries. Whenever the CPU replaces a TLB entry, the page
tables are accessed in external memory using bus cycles
with a special Status. Because a FIFO replacement
algorithm is used, it is possible to determine which entry is being
replaced by using a 6-bit counter that is incremented
whenever a Level-1 PTE is accessed. The contents of the new
entry can be found as follows:
• VPN appears on A2 through A11 during the PTE1 and
PTE2 accesses. The most-significant 10 bits appear
during the PTE1 access, and the least-significant 10 bits
appear during the PTE2 access.
• AS can be determined from the U/S signal during the
PTE1 access.
• PFN, M and CI can be determined from the PTE2 value
read on the Data Bus. PL can be determined from the
most restrictive of the PTE1 and PTE2 values read on
the Data Bus.
Whenever a LMR instruction is executed, the operation  
code and operand appear on the system interface using  
slave processor bus cycles. Thus, the information is avail-  
able externally to determine the translation modes con-  
trolled by the MCR and to identify that a TLB entry has been  
invalidated.  
When translation is enabled and a virtual address needs to  
be translated, the high-order 20 bits (VPN) and the Address  
Space qualifier are compared associatively to the corre-  
sponding fields in all entries of the TLB.  
For a read reference, if the tag portion of a valid TLB entry
completely matches the input values, then the value portion
of the entry is used to complete the address translation and
protection checking.
When the PTB0 register is loaded by executing the ‘LMR  
PTB0 src’ instruction, the internal FIFO pointer is also reset  
to point to the first TLB entry.  
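As an illustration of how external circuitry might mirror TLB replacements from the bus activity described in this section, the following sketch tracks a 6-bit counter and rebuilds the VPN from address lines A2 through A11 of the PTE1 and PTE2 accesses. It is a hypothetical model with several assumptions: the class and method names are invented, the slot being replaced is taken to be the counter value before it is incremented, exactly one PTE1 access is assumed per replacement, and purges through the IVARn registers are not modeled.

```python
class ExternalTlbTracker:
    """Hypothetical bus-watcher model that mirrors which TLB entry is replaced."""

    def __init__(self):
        self.counter = 0          # 6-bit counter that follows the on-chip FIFO pointer
        self.entries = [None] * 64
        self._slot = None
        self._vpn_hi = None
        self._us = None

    def on_pte1_access(self, address, us):
        self._slot = self.counter                  # assumed: slot = counter before increment
        self.counter = (self.counter + 1) & 0x3F   # incremented on a Level-1 PTE access
        self._vpn_hi = (address >> 2) & 0x3FF      # A2..A11: most-significant 10 bits of VPN
        self._us = us                              # U/S level, from which AS is determined

    def on_pte2_access(self, address, pte2_value):
        vpn = (self._vpn_hi << 10) | ((address >> 2) & 0x3FF)  # A2..A11: low 10 bits of VPN
        self.entries[self._slot] = {"vpn": vpn, "us": self._us, "pte2": pte2_value}

    def on_load_ptb0(self):
        self.counter = 0          # 'LMR PTB0 src' resets the FIFO pointer to the first entry


tracker = ExternalTlbTracker()
tracker.on_pte1_access(0x00001004, us=1)
tracker.on_pte2_access(0x00002008, pte2_value=0xABC)
```

The PFN, M, CI and PL fields would be extracted from the PTE values read on the Data Bus; the sketch simply stores the raw PTE2 value.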
For a write reference, if a valid entry with a matching tag is  
present in the TLB, then the M bit is examined. If the M bit is  
1, the value portion of the entry is used to complete the  
address translation and protection checking. If the M bit is 0,  
the entry is invalidated.  
Note that the contents of the TLB maintained externally in-  
clude copies of all valid entries in the on-chip TLB, but the  
external copy may include some entries that are invalid in  
the on-chip TLB. For example, when the TLB is searched  
for a write reference and a matching entry is found with the  
M bit clear, then the on-chip entry is invalidated and a miss  
is processed. It is not possible to detect externally that the  
old matching entry on-chip has been invalidated.  
In either case, if a protection level violation is detected, a  
translation exception (Trap (ABT)) is generated. When no  
matching entry is found or a matching entry is invalidated  
because the M bit is 0 in a write reference, a Page Table  
lookup is performed. The virtual address is translated ac-  
cording to the algorithm given in Section 2.4.5 and the  
translation information is loaded into the TLB.  
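The lookup rules above (associative match on VPN and address space, invalidation when a write finds the M bit clear, and FIFO replacement after a Page Table lookup) can be sketched as follows. This is a minimal model under assumptions: protection checking is omitted, `page_table_lookup` is a hypothetical callback standing in for the algorithm of Section 2.4.5, and a new entry is always placed at the FIFO pointer.

```python
class TlbModel:
    """Toy model of a fully associative TLB with FIFO replacement."""

    def __init__(self, size=64):
        self.size = size
        self.entries = [None] * size
        self.fifo = 0  # next entry to be replaced

    def _find(self, as_, vpn):
        for i, e in enumerate(self.entries):
            if e is not None and e["as"] == as_ and e["vpn"] == vpn:
                return i
        return None

    def translate(self, as_, vaddr, write, page_table_lookup):
        vpn, offset = vaddr >> 12, vaddr & 0xFFF  # high-order 20 bits form the VPN
        i = self._find(as_, vpn)
        if i is not None and write and not self.entries[i]["m"]:
            self.entries[i] = None  # write with M = 0: the entry is invalidated
            i = None
        if i is None:
            pfn, m = page_table_lookup(as_, vpn, write)  # hypothetical callback
            i = self.fifo
            self.entries[i] = {"as": as_, "vpn": vpn, "pfn": pfn, "m": m}
            self.fifo = (self.fifo + 1) % self.size
        return (self.entries[i]["pfn"] << 12) | offset


calls = []

def walk(as_, vpn, write):
    calls.append((vpn, write))
    return vpn + 0x100, write  # illustrative mapping; M is set by a write lookup

tlb = TlbModel()
first = tlb.translate(0, 0x5123, False, walk)
second = tlb.translate(0, 0x5456, False, walk)
```

Here the second translation hits the entry loaded by the first, so the page table callback runs only once.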
3.5 SYSTEM INTERFACE  
This section provides general information on the NS32532  
interface to the external world. Descriptions of the CPU re-  
quirements as well as the various bus characteristics are  
provided here. Details on other device characteristics in-  
cluding timing are given in Chapter 4.  
The recipient entry is selected by an on-chip circuit that im-  
plements a First-In-First-Out (FIFO) algorithm.  
Note that for a translation to be loaded into the TLB it is
necessary that the Level-1 and Level-2 Page Table Entries
be valid (V bit = 1). Also, it is guaranteed that in the
process of loading a TLB entry (during a Page Table lookup)
the Level-1 and Level-2 R bits will be set in memory if they
3.5.1 Power and Grounding
The NS32532 requires a single 5-volt power supply, applied
on 21 pins. The logic voltage pins (VCCL1 to VCCL6) supply
TL/EE/9354–23
*AS represents the virtual address space qualifier.  
FIGURE 3-18. TLB Model  
the power to the on-chip logic. The buffer voltage pins
(VCCB1 to VCCB14) supply the power to the output drivers
of the chip. The bus clock power pin (VCCCLK) is the power
supply for the on-chip clock drivers. All the voltage pins
should be connected together by a power (VCC) plane on
the printed circuit board.
The NS32532 grounding connections are made on 20 pins.
The logic ground pins (GNDL1 to GNDL6) are the ground
pins for the on-chip logic. The buffer ground pins (GNDB1 to
GNDB13) are the ground pins for the output drivers of the
chip. The bus clock ground pin (GNDCLK) is the ground
connection for the on-chip clock drivers. All the ground pins
should be connected together by a ground plane on the
printed circuit board.
Both power and ground connections are shown in Figure
3-19.
3.5.2 Clocking
The NS32532 requires a single-phase input clock signal
(CLK) with frequency twice the CPU’s operating frequency.
This clock signal is internally divided by two to generate two
non-overlapping phases PHI1 and PHI2. One single-phase
clock signal BCLK in phase with PHI1 and its complement
BCLK, are also generated and output by the CPU for timing
reference.
Following power-on, the phase relationship between BCLK
and CLK is undefined. Nevertheless, in some systems it
may be necessary to synchronize the CPU bus timing to an
external reference. The SYNC input signal can be used to
initialize the phase relationship between CLK and BCLK.
SYNC can also be used to stretch BCLK (Low) while CLK is
toggling.
SYNC is sampled on each rising edge of CLK. As shown in
Figure 3-20, whenever SYNC is sampled low, BCLK stops
toggling and stays low. On the first rising edge that SYNC is
sampled high, BCLK is driven high and then toggles on each
subsequent rising edge of CLK.
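The SYNC behavior can be illustrated with a small simulation that steps through rising edges of CLK. It is a sketch only: the function name is hypothetical, and the initial BCLK level is a parameter because the phase relationship after power-on is undefined.

```python
def bclk_trace(sync_samples, bclk=0):
    """Return the BCLK level after each rising edge of CLK, given SYNC sampled at that edge."""
    stopped = False
    trace = []
    for sync in sync_samples:
        if not sync:
            bclk, stopped = 0, True   # SYNC sampled low: BCLK stops toggling and stays low
        elif stopped:
            bclk, stopped = 1, False  # first edge with SYNC sampled high: BCLK is driven high
        else:
            bclk ^= 1                 # normal operation: BCLK toggles on each rising edge of CLK
        trace.append(bclk)
    return trace


trace = bclk_trace([1, 1, 0, 0, 1, 1, 1, 1])
```

In the resulting trace BCLK is held low for the two edges on which SYNC is low, and restarts with a high phase on the next edge.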
Every rising edge of BCLK defines a transition in the timing  
state (‘‘T-State’’) of the CPU.  
One T-State represents the execution of one microinstruc-  
tion within the CPU and/or one step of an external bus  
transfer.  
Note: The CPU requirement on the maximum period of BCLK must be satis-  
fied when SYNC is asserted at times other than reset.  
3.5.3 Resetting  
The RST input pin is used to reset the NS32532. The CPU  
samples RST synchronously on the rising edge of BCLK.  
Whenever a low level is detected, the CPU responds imme-  
diately. Any instruction being executed is terminated; any  
results that have not yet been written to memory are dis-  
carded; and any pending bus errors, interrupts, and traps  
are eliminated. The internal latches for the edge-sensitive  
NMI and DBG signals are cleared.  
TL/EE/9354–24
FIGURE 3-19. Power and Ground Connections
TL/EE/9354–25
FIGURE 3-20. Bus Clock Synchronization
The CPU stores the PC contents in the R0 Register and the
PSR contents in the least-significant word of R1, leaving the
most-significant word undefined. The PC is then cleared to 0
and so are all the implemented bits in the PSR, MSR, MCR
and CFG registers. The DEN-bit in the DCR Register is also
cleared to 0. After reset, the remaining implemented bits in
DCR and the contents of all other registers are undefined.
The CPU begins executing the instruction at Address 0.
On application of power, RST must be held low for at least
50 μs after VCC is stable. This is to ensure that all on-chip
voltages are completely stable before operation. Whenever
a Reset is applied, it must also remain active for not less
than 64 BCLK cycles. See Figures 3-21 and 3-22.
While in the Reset state, the CPU drives the signals ADS,
BE0–3, BMT, CONF and HLDA inactive. The data bus is
floated and the state of all other output signals is undefined.
Note 1: If HOLD is active at the time RST is deasserted, the CPU acknowl-
edges HOLD before performing any bus cycle.
Note 2: If SYNC is asserted while the CPU is being reset, then BCLK does
not toggle. Consequently, SYNC must be high for at least 128 CLK
cycles while RST is low.
TL/EE/9354–26
FIGURE 3-21. Power-On Reset Requirements
TL/EE/9354–27
FIGURE 3-22. General Reset Timing
3.5.4 Bus Cycles
The NS32532 CPU will perform bus cycles for one of the
following reasons:
1. To fetch instructions from memory.
2. To write or read data to or from memory or peripheral
devices. Peripheral input and output are memory mapped
in the Series 32000 family.
3. To read and update Page Table Entries in memory to
perform memory management functions.
4. To acknowledge an interrupt and allow external circuitry
to provide a vector number, or to acknowledge comple-
tion of an interrupt service routine.
5. To transfer information to or from a Slave Processor.
In terms of bus timing, cases 1 through 4 above are identi-
cal. For timing specifications, see Section 4. The only exter-
nal difference between them is the 5-bit code placed on the
Bus Status pins (ST0–ST4). Slave Processor cycles differ in
that separate control signals are applied (Section 3.5.4.7).
3.5.4.1 Bus Status
The CPU presents five bits of Bus Status information on
pins ST0–ST4. The various combinations on these pins in-
dicate why the CPU is performing a bus cycle, or, if it is idle
on the bus, why it is idle.
The Bus Status pins are interpreted as a five-bit value, with
ST0 the least significant bit. Their values decode as follows:
00000 The bus is idle because the CPU does not yet need
to access the bus.
00001 The bus is idle because the CPU is waiting for an
interrupt following execution of the WAIT instruc-
tion.
00010 The bus is idle because the CPU has halted after
detecting an abort or bus error while processing an
exception.
00011 The bus is idle because the CPU is waiting for a
Slave Processor to complete executing an instruc-
tion.
00100 Interrupt Acknowledge, Master.
The CPU is reading an interrupt vector to acknowl-
edge an interrupt request.
00101 Interrupt Acknowledge, Cascaded.
The CPU is reading an interrupt vector to acknowl-
edge a maskable interrupt request from a Cascad-
ed Interrupt Control Unit.
00110 End of Interrupt, Master.
The CPU is performing a read cycle to indicate that
it is executing a Return from Interrupt (RETI) in-
struction at the completion of an interrupt’s service
procedure.
00111 End of Interrupt, Cascaded.
The CPU is performing a read cycle from a Cascad-
ed Interrupt Control Unit to indicate that it is execut-
ing a Return from Interrupt (RETI) instruction at the
completion of an interrupt’s service procedure.
01000 Sequential Instruction Fetch.
The CPU is fetching the next double-word in se-
quence from the instruction stream.
01001 Non-Sequential Instruction Fetch.
The CPU is fetching the first double-word of a new
sequence of instructions. This will occur as a result
of any JUMP or BRANCH, any exception, or after
the execution of certain instructions.
01010 Data Transfer.
The CPU is reading or writing an operand for an
instruction, or it is referring to memory while pro-
cessing an exception.
01011 Read RMW Class Operand.
The CPU is reading an operand with access class
of read-modify-write.
01100 Read for Effective Address Calculation.
The CPU is reading a pointer from memory in order
to calculate an effective address for Memory Rela-
tive or External addressing modes.
01101 Access PTE1 by MMU.
The CPU is reading or writing a Level-1 Page Table
Entry while the on-chip MMU is translating a virtual
address.
01110 Access PTE2 by MMU.  
The CPU is reading or writing a Level-2 Page Table  
Entry while the on-chip MMU is translating a virtual  
address.  
11101 Transfer Slave Processor Operand.  
The CPU is transferring an operand to or from a  
Slave Processor.  
11110 Read Slave Processor Status.  
The CPU is reading a status word from a slave  
processor after the slave processor has activated  
the FSSR signal.  
11111 Broadcast Slave Processor ID + OPCODE.
The CPU is initiating the execution of a Slave In-  
struction by transferring the first 3 bytes of the in-  
struction, which specify the Slave Processor identi-  
fication and operation.  
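For illustration, the status codes listed in this section can be decoded in software, for example in a logic-analyzer script. The sketch below assumes the pin levels are supplied as a list ordered ST0 to ST4 (ST0 is the least significant bit); the names `BUS_STATUS` and `decode_status` are hypothetical, and codes not listed in this section are reported as such rather than guessed.

```python
BUS_STATUS = {
    0b00000: "Idle: CPU does not yet need to access the bus",
    0b00001: "Idle: waiting for an interrupt after WAIT",
    0b00010: "Idle: halted after abort or bus error during exception processing",
    0b00011: "Idle: waiting for a Slave Processor",
    0b00100: "Interrupt Acknowledge, Master",
    0b00101: "Interrupt Acknowledge, Cascaded",
    0b00110: "End of Interrupt, Master",
    0b00111: "End of Interrupt, Cascaded",
    0b01000: "Sequential Instruction Fetch",
    0b01001: "Non-Sequential Instruction Fetch",
    0b01010: "Data Transfer",
    0b01011: "Read RMW Class Operand",
    0b01100: "Read for Effective Address Calculation",
    0b01101: "Access PTE1 by MMU",
    0b01110: "Access PTE2 by MMU",
    0b11101: "Transfer Slave Processor Operand",
    0b11110: "Read Slave Processor Status",
    0b11111: "Broadcast Slave Processor ID + OPCODE",
}


def decode_status(pins):
    """pins[0] is ST0 (least significant bit) ... pins[4] is ST4."""
    value = sum((bit & 1) << i for i, bit in enumerate(pins[:5]))
    return BUS_STATUS.get(value, "not listed in this section")


label = decode_status([0, 0, 0, 1, 0])  # ST3 = 1, all others 0
```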
3.5.4.2 Basic Read and Write Cycles  
The sequence of events occurring during a basic CPU ac-  
cess to either memory or peripheral device is shown in Fig-  
ure 3-23 for a read cycle, and Figure 3-24 for a write cycle.  
The cases shown assume that the selected memory or pe-  
ripheral device is capable of communicating with the CPU at  
full speed. If not, then cycle extension may be requested  
through the RDY line. See Section 3.5.4.4.  
A full speed bus cycle is performed in two cycles of the  
BCLK clock, labeled T1 and T2. For both read and write bus  
cycles the CPU asserts ADS during the first half of T1 indi-  
cating the beginning of the bus cycle. From the beginning of  
T1 until the completion of the bus cycle the CPU drives the  
Address Bus and other relevant control signals as indicated  
in the timing diagrams. For cacheable data read cycles the  
CPU also drives the CASEC signal to indicate the block in  
the DC set where the data will be stored. If the bus cycle is  
not cancelled (e.g., state T2 is entered in the next clock  
cycle), the confirm signal (CONF) is asserted in the middle  
of T1. Note that due to a bus cycle cancellation, the BMT  
signal may be asserted at the beginning of T1, and then  
deasserted before the time in which it is guaranteed valid  
(see Section 4.4.2).  
A confirmed bus cycle is completed at the end of T2, unless  
a cycle extension is requested. Following state T2 is either  
state T1 of the next bus cycle, or an idle T-state, if the CPU  
has no bus cycle to perform.  
In case of a read cycle the CPU samples the data bus at the
end of state T2.
If a bus exception is detected, the data is ignored.
TL/EE/9354–28
FIGURE 3-23. Basic Read Cycle
For write bus cycles, valid data is output from the middle of  
T1 until the end of the cycle. When a write bus cycle is  
immediately followed by another write cycle, the CPU keeps  
driving the bus with the data related to the previous cycle  
until the middle of state T1 of the second bus cycle.  
The CPU always inserts an idle state before a write cycle  
when the write immediately follows a read cycle.  
Note: The CPU can initiate a bus cycle with a T1-state and then cancel the  
cycle, such as when a TLB miss or a Cache hit occurs. In such a case,  
the CONF signal remains High and the BMT signal is driven High; the  
T1-state is followed by another T1-state or an idle T-state.  
3.5.4.3 Burst Cycles  
The NS32532 is capable of performing burst cycles in order
to increase the bus transfer rate. Burst is only available in
instruction fetch cycles and data read cycles from 32-bit wide
memories. Burst is not supported in operand write cycles or
slave cycles.
The sequence of events for burst cycles is shown in Figure  
3-25. The case shown assumes that the selected memory is  
capable of communicating with the CPU at full speed. If not,  
then cycle extension can be requested through the RDY  
line. See Section 3.5.4.4.  
A Burst cycle is composed of two parts. The first part is a  
regular cycle (opening cycle), in which the CPU outputs the  
new status and asserts all the other relevant control signals.  
In addition, the Burst Out Signal (BOUT) is activated by the  
CPU indicating that the CPU can perform Burst cycles. If the  
selected memory allows Burst cycles, it will notify the CPU  
by activating the burst in signal (BIN). BIN is sampled by the  
CPU in the middle of T2 on the falling edge of BCLK. If the  
memory does not allow burst (BIN high), the cycle will termi-  
nate at the end of T2 and BOUT will go inactive immediate-  
ly. If the memory allows burst (BIN low), and the CPU has  
not deasserted BOUT, the second part of the Burst cycle  
will be performed and BOUT will remain active until termina-  
tion of the Burst.  
The second part consists of up to 3 nibbles, labeled T2B. In
each of them a data item is read by the CPU. For each
nibble in the burst sequence the CPU forces the 2
least-significant bits of the address to 0 and increments address bits
2 and 3 to select the next double-word; all the byte enable
signals (BE0–3) are activated.
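The address sequence of a burst can be sketched in a few lines. The functions below return double-word-aligned addresses within the aligned 16-byte block: data reads wrap around within the block, while instruction fetches run only up to the last double-word of the block, as this section describes. The function names are hypothetical, and the sketch assumes the burst is not terminated early by BIN, BRT or the other conditions listed in this section.

```python
def data_burst_addresses(addr, count=4):
    """Double-word addresses of a data burst read, in wrap-around order."""
    base = addr & ~0xF         # start of the aligned 16-byte block
    first = (addr >> 2) & 0x3  # address bits 2 and 3 select the first double-word
    return [base | (((first + i) & 0x3) << 2) for i in range(count)]


def instruction_burst_addresses(addr):
    """Double-word addresses of an instruction fetch burst (no wrap-around)."""
    base = addr & ~0xF
    first = (addr >> 2) & 0x3
    return [base | (i << 2) for i in range(first, 4)]


order = [hex(a) for a in data_burst_addresses(0x104)]
```

For a source operand at hexadecimal address 104 this yields 104, 108, 10C and 100, matching the wrap-around example given for data reads in this section.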
As shown in Figures 3-25 and 4-8 (in Section 4), the CPU  
samples RDY at the end of each nibble and extends the  
access time for the burst transfer if RDY is inactive.  
The CPU initiates burst read cycles in the following cases.
1. An instruction must be fetched (Status = 01000 or
01001), and the instruction address does not fall within
the last double-word in an aligned 16-byte block (i.e.,
address bits 2 and 3 are not both equal to 1).
2. A data item must be read (Status = 01010, 01011 or
01100), and all of the following conditions are met.
• The data cache is enabled and not locked (DC = 1
and LDC = 0 in the CFG register).
• The addressed page is cacheable as indicated in the
Level-2 Page Table Entry.
• The bus cycle is not an interlocked data access
performed while executing a CBITI or SBITI instruction.
TL/EE/9354–29
FIGURE 3-24. Write Cycle
The Burst sequence will be terminated when one of the  
following events occurs.  
1. The last instruction double-word in an aligned 16-byte  
block has been fetched.  
2. The CPU detects that the instructions being prefetched  
are no longer needed due to an alteration of the flow of  
control. This happens, for example, when a Branch in-  
struction is executed or an exception occurs.  
3. 4 double-words of data have been read by the CPU. The
double-words are transferred within an aligned 16-byte
block in a wrap-around order. For example, if a source
operand is located at address 104 (hexadecimal), then the
burst read cycle transfers the double-words at 104, 108,
10C, and 100, in that order.
TL/EE/9354–30
FIGURE 3-25. Burst Read Cycles  
4. The BIN signal is deasserted.
5. BRT is asserted to signal a bus retry.
6. IODEC is asserted or the BW0–1 signals indicate a bus
width other than 32 bits. The CPU samples these signals
during state T2 of the opening cycle. During T2B-states
BW0–1 are ignored and IODEC must be kept HIGH.
The CPU uses only the values of the above signals sampled
during the last state of the transfer when the cycle is
extended. See Section 3.5.4.4.
Note: A burst sequence is not stopped by the assertion of either BER or
CIIN. See Note 3 in Section 3.5.5.
3.5.4.4 Cycle Extension
To allow sufficient access time for any speed of memory or
peripheral device, the NS32532 provides for extension of a
bus cycle. Any type of bus cycle except a slave processor
cycle can be extended.
A bus cycle can be extended by causing state T2 for a
normal cycle or state T2B for a Burst cycle to be repeated.
At the end of each T2 or T2B state, on the rising edge of
BCLK, the RDY line is sampled by the CPU. If RDY is active,
then the transfer cycle will be completed. If RDY is inactive,
then the bus cycle is extended by repeating the T-state for
another clock cycle. These additional T-states inserted by
the CPU in this manner are called ‘WAIT’ states.
During a transfer the CPU samples the input control signals
BIN, BER, BRT, BW0–1, CIIN and IODEC.
When wait states are inserted, only the values of these
signals sampled during the last wait state are significant.
Figures 3-26 and 4-8 (in Section 4) illustrate both a normal
read cycle and a Burst cycle with wait states added through
the RDY pin.
Note: If RST is asserted during a bus cycle, then the cycle is terminated
without regard to RDY.
3.5.4.5 Interlocked Bus Cycles
The NS32532 supports indivisible read-modify-write
transactions by asserting the ILO signal during consecutive read
and write operations. See Figure 4-7 in Section 4.
Interlocked transactions are always preceded and followed
by one or more idle T-states.
The ILO signal is asserted in the middle of the idle T-state
preceding state T1 of the read operation, and is deasserted
in the middle of one of the idle T-states following completion
of the write operation, including any retried bus cycles.
No other bus operations (e.g., instruction fetches) will occur
while an interlocked transaction is taking place.
Interlocked transactions are required in multiprocessor
systems to handle shared resources. The CPU uses them to
reference data while executing a CBITIi or SBITIi instruction,
during which a single byte of data is read and written. They
are also used when the on-chip MMU is updating a Level-2
Page Table Entry during a Page Table Lookup.
In this case a double-word is read and written. If the Level-2
Page Tables are located in a memory area whose width is
other than 32 bits, multiple interlocked reads followed by
multiple interlocked writes will result. The ILO signal is
always released for one or more clock cycles in the middle of
two consecutive interlocked transactions.
Note 1: If a bus error is detected during an interlocked read cycle, the
subsequent interlocked write cycle will not be performed, and ILO is
deasserted before the next bus cycle begins.
Note 2: The CPU may assert ILO before a read cycle that is cancelled (for
example, due to a TLB miss). In such a case, the CPU deasserts
ILO before performing any additional bus cycles.
3.5.4.6 Interrupt Control Cycles
The CPU generates Interrupt-Acknowledge bus cycles in
response to non-maskable interrupt and enabled maskable
interrupt requests.
The CPU also generates one or two End-of-Interrupt bus
cycles during execution of the Return-from-Interrupt (RETI)
instruction.
The timing for the interrupt control cycles is the same as for
the basic memory read cycle shown in Figure 3-23; only the
status presented on pins ST0–4 is different. These cycles
are single-byte read cycles, and they always bypass the
data cache.
Table 3-4 shows the interrupt control sequences associated
with each interrupt and with the return from its service
procedure.
3.5.4.7 Slave Processor Bus Cycles
The NS32532 performs bus cycles to transfer information to
or from slave processors while executing floating-point or
custom-slave instructions.
The CPU uses slave write bus cycles to broadcast the
identification and operation codes of a slave instruction as well
as to transfer operands from memory or general purpose
registers to a slave.
Figure 3-27 shows the timing for a slave write bus cycle.
The CPU asserts SPC during T1; the status is valid during
T1 and T2. The operation code or operand is output on the
data bus from the middle of T1 until the end of T2.
The CPU uses a slave read bus cycle to transfer a result
operand from a slave to either memory or a general purpose
register. A slave read cycle is also used to read a status
word when the FSSR signal is asserted. Figure 3-28 shows
the timing for a slave read bus cycle.
During T1 and T2 the CPU drives the status lines and
asserts SPC. The data from the slave is sampled at the end of
T2.
The CPU will never perform another slave cycle immediately
following a slave read cycle.
Slave processor data transfers are always 32 bits wide. If
the operand is a single byte, then it is transferred on D0
through D7. If it is a word, then it is transferred on D0
through D15.
When two operands are transferred, operand 1 is
transferred before operand 2. For double-precision operands, the
least-significant double-word is transferred before the
most-significant double-word.
During a slave bus cycle the output signals BE0–3 are
undefined while the input signals BW0–1 and RDY are
ignored.
BER and BRT must be kept high.
TL/EE/9354–31
FIGURE 3-26. Cycle Extension of a Basic Read Cycle
TABLE 3-4. Interrupt Sequences

                                                                Data Bus
Cycle  Status  Address          DDIN  BE3  BE2  BE1  BE0  Byte 3  Byte 2  Byte 1  Byte 0

A. Non-Maskable Interrupt Control Sequences
Interrupt Acknowledge
1      00100   FFFFFF00₁₆       0     1    1    1    0    X       X       X       X
Interrupt Return
None: Performed through Return from Trap (RETT) instruction.

B. Non-Vectored Interrupt Control Sequences
Interrupt Acknowledge
1      00100   FFFFFE00₁₆       0     1    1    1    0    X       X       X       X
Interrupt Return
1      00110   FFFFFE00₁₆       0     1    1    1    0    X       X       X       X

C. Vectored Interrupt Sequences: Non-Cascaded
Interrupt Acknowledge
1      00100   FFFFFE00₁₆       0     1    1    1    0    X       X       X       Vector: Range: 0–127
Interrupt Return
1      00110   FFFFFE00₁₆       0     1    1    1    0    X       X       X       Vector: Same as in Previous Int. Ack. Cycle

D. Vectored Interrupt Sequences: Cascaded
Interrupt Acknowledge
1      00100   FFFFFE00₁₆       0     1    1    1    0    X       X       X       Cascade Index: range −16 to −1
(The CPU here uses the Cascade Index to find the Cascade Address)
2      00101   Cascade Address  0     See Note            Vector, range 16–255; on appropriate byte of data bus.
Interrupt Return
1      00110   FFFFFE00₁₆       0     1    1    1    0    X       X       X       Cascade Index: Same as in previous Int. Ack. Cycle
(The CPU here uses the Cascade Index to find the Cascade Address)
2      00111   Cascade Address  0     See Note            X       X       X       X

Note: BE0–BE3 signals will be activated according to the cascaded ICU address.
X = Don’t Care
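The bus cycles of Table 3-4 can be written down as a small lookup, which is sometimes convenient when checking a bus trace. The following is only a sketch: the names `ACKNOWLEDGE`, `RETURN` and `interrupt_bus_cycles` are hypothetical, the string "cascade" is a placeholder for the Cascade Address that the CPU derives from the Cascade Index, and the status value 00101 for the second acknowledge cycle is an assumption taken from a hard-to-read table entry.

```python
# Sketch of Table 3-4 as data. Each cycle is (cycle number, status code, address).
# "cascade" is a placeholder for the Cascade Address, not a real address.
MASTER_ICU = 0xFFFFFE00   # address used for maskable interrupt cycles
NMI_ACK = 0xFFFFFF00      # address used for the NMI acknowledge cycle

ACKNOWLEDGE = {
    "nmi": [(1, "00100", NMI_ACK)],
    "non_vectored": [(1, "00100", MASTER_ICU)],
    "vectored": [(1, "00100", MASTER_ICU)],
    # Status 00101 for cycle 2 is an assumption (see lead-in).
    "cascaded": [(1, "00100", MASTER_ICU), (2, "00101", "cascade")],
}

RETURN = {
    "nmi": [],  # no bus cycle: return is performed through RETT
    "non_vectored": [(1, "00110", MASTER_ICU)],
    "vectored": [(1, "00110", MASTER_ICU)],
    "cascaded": [(1, "00110", MASTER_ICU), (2, "00111", "cascade")],
}


def interrupt_bus_cycles(kind, phase):
    """Return the list of bus cycles for an interrupt kind and phase.

    kind  -- "nmi", "non_vectored", "vectored" or "cascaded"
    phase -- "acknowledge" or "return"
    """
    table = {"acknowledge": ACKNOWLEDGE, "return": RETURN}[phase]
    return list(table[kind])
```

For example, `interrupt_bus_cycles("cascaded", "return")` lists two cycles, the second of which targets the cascaded ICU.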
TL/EE/9354–32
FIGURE 3-27. Slave Processor Write Cycle
TL/EE/9354–33
FIGURE 3-28. Slave Processor Read Cycle
3.5.5 Bus Exceptions
The NS32532 has the capability of handling errors occurring during the execution of a bus cycle. These errors can be either correctable or incorrectable, and the CPU can be notified of their occurrence through the input signals BRT and/or BER.
Bus Retry
If a bus error can be corrected, the CPU may be requested to repeat the erroneous bus cycle. The request is done by asserting the BRT signal. BRT is sampled at the end of state T2 or T2B.
When the CPU detects that BRT is active, it completes the bus cycle normally, but ignores the data read in case of a read cycle, and maintains a copy of the data to be written in case of a write cycle. Then, after a delay of two clock cycles, it will start executing the bus cycle again.
If the transfer cycle is multiple (e.g., for non-aligned data), only the problematic part will be repeated.
For instance, if a non-aligned double-word is being transferred and the second half of the transfer fails, only the second part will be repeated.
The same applies for a retry during a burst sequence. The repeated cycle will begin where the read operation failed (rather than the first address of the burst) and will finish the original burst.
Figures 3-29 and 4-10 (in Section 4) show the BRT timing for a basic access cycle and for burst cycles respectively.
The CPU always waits for BRT to be HIGH before repeating the bus cycle. While BRT is LOW, the CPU places all the output signals shown in Figure 4-11 in a TRI-STATE® condition.
Bus Error
If a bus error is incorrectable the CPU may be requested to interrupt the current process and branch to an appropriate procedure to handle the error. The request is performed by activating the BER signal. BER is sampled by the CPU at the end of state T2 or T2B on the rising edge of BCLK.
When BER is sampled active, the CPU completes the bus cycle normally. If a bus error occurs during a bus cycle for a reference required to execute an instruction, then a bus error exception is recognized. However, if an error occurs during an acknowledge cycle of another exception or during the ICU read cycle of a RETI instruction, the CPU interprets the event as a fatal bus error and enters the ‘halted’ state.
In this state the CPU floats its address and data buses and places a special status code on the ST0–4 lines. The CPU can exit this condition only through a hardware reset. Refer to Section 3.2.6 for more details on bus error.
Note 1: If the erroneous bus cycle is extended by means of wait states, then the CPU uses the values of BRT and/or BER sampled during the last wait state.
Note 2: If the CPU samples both BRT and BER active, BRT has higher priority. The bus error indication is ignored, and the bus cycle is repeated.
Note 3: If BER is asserted during a bus cycle of a multi-cycle data transfer, the CPU completes the entire transfer normally, but the data will be ignored. The CPU also ignores any subsequent assertion of BER during the same data transfer.
Note 4: Neither BRT nor BER should be asserted during the T2 state of a slave processor bus cycle.
3.5.6 Dynamic Bus Configuration
The NS32532 is tuned to operate with 32-bit wide memory and peripheral devices. The bus also supports 8-bit and 16-bit data widths, but at reduced efficiency. The CPU can switch from one bus width to another dynamically; the only restriction is that the bus width cannot change for locations within an aligned 16-byte block.
The CPU determines the bus width in effect for a bus cycle by using the values of the BW0 and BW1 signals sampled during the last T2 state. Values of BW0 and BW1 sampled before the last T2 state or during T2B states are ignored. Whenever a bus width other than 32-bit is detected by the CPU, two idle states are inserted before the next bus cycle is initiated. These idle states are only inserted once during an operand access, even if more than two bus cycles are needed to complete the access.
TL/EE/9354–34
FIGURE 3-29. Bus Retry During a Basic Read Cycle
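The retry and error rules of Section 3.5.5 reduce to a short decision procedure. The sketch below is illustrative only: the function name `bus_cycle_outcome` and its boolean arguments are hypothetical, and `True` means "sampled active" at the end of T2/T2B (or in the last wait state), not an electrical level.

```python
def bus_cycle_outcome(brt_active, ber_active, exception_ack_or_reti_icu_read=False):
    """Decide what follows a bus cycle from the sampled BRT and BER values.

    brt_active, ber_active -- True if the signal was sampled active
    exception_ack_or_reti_icu_read -- True if the cycle is an acknowledge
        cycle of another exception or the ICU read cycle of RETI
    """
    if brt_active:
        # BRT has priority over BER: the error indication is ignored
        # and the bus cycle is repeated.
        return "retry"
    if ber_active:
        if exception_ack_or_reti_icu_read:
            # Fatal bus error: the CPU enters the halted state and
            # leaves it only through a hardware reset.
            return "halt"
        return "bus_error_exception"
    return "complete"
```

The ordering of the tests encodes the priority stated in Note 2 of Section 3.5.5.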
The various combinations for BW0 and BW1 are shown below.

BW1  BW0
0    0    Reserved
0    1    8-Bit Bus
1    0    16-Bit Bus
1    1    32-Bit Bus

The bus width must always be 32 bits during slave cycles.
An important feature of the NS32532 is that it does not impose any restrictions on the data alignment, regardless of the bus width.
Bus accesses are performed in double-word units. Accesses of data operands that cross double-word boundaries are decomposed into two or more aligned double-word accesses.
The CPU provides four byte enable signals (BE0–3) which facilitate individual byte accessing on either a 32-bit or a 16-bit bus.
Figures 3-30 and 3-31 show the basic interfaces for 32-bit and 16-bit memories. An 8-bit memory interface (not shown) is even simpler since it does not use any of the BE0–3 signals and its single bank is always enabled whenever the memory is selected. Each byte location in this case is selected by address bits A0–31.
The NS32532 does not keep track of the bus width used in previous instruction fetches or data accesses. At the beginning of every memory transaction, the CPU always assumes that the bus is 32-bit wide and the BE0–3 signals are activated accordingly.
The BOUT signal is also asserted during instruction fetches or data reads if the conditions for bursting are satisfied. If the bus is other than 32-bit wide, the BIN signal is ignored and BOUT is deasserted at the beginning of the T state following T2, since burst cycles are not allowed for 8-bit or 16-bit buses.
The following subsections provide detailed descriptions of the access sequences performed in the various cases.
Note: Although the NS32532 ignores the BIN signal for 8-bit and 16-bit bus widths, it is recommended that BIN be asserted only if the system supports burst transfers. This is to ensure compatibility with future versions of the CPU that might support burst transfers for 8-bit and 16-bit buses.
TL/EE/9354–36
FIGURE 3-31. Basic Interface for 16-Bit Memories
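As a quick reference, the BW1/BW0 encoding and the idle-state rule of Section 3.5.6 can be expressed as follows. This is a minimal sketch with hypothetical names (`bus_width_bits`, `idle_states_before_next_cycle`); the inputs are the logic values sampled during the last T2 state.

```python
# (BW1, BW0) -> bus width in bits; the combination (0, 0) is reserved.
_BUS_WIDTHS = {(0, 1): 8, (1, 0): 16, (1, 1): 32}


def bus_width_bits(bw1, bw0):
    """Decode the bus width from the BW1 and BW0 values of the last T2 state."""
    try:
        return _BUS_WIDTHS[(bw1, bw0)]
    except KeyError:
        raise ValueError("BW1=0, BW0=0 is a reserved combination") from None


def idle_states_before_next_cycle(width_bits):
    """Idle states inserted once per operand access for a given bus width."""
    return 0 if width_bits == 32 else 2
```

Raising an error for the reserved combination is a choice made for this sketch; the text does not say how the CPU reacts to it.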
3.5.6.1 Instruction Fetch Sequences  
The CPU performs two types of instruction fetch cycles: sequential and non-sequential. These can be distinguished from each other by the differing status combinations on pins ST0–4. For non-sequential instruction fetches the CPU presents on the address bus the exact byte address of the first instruction in the instruction stream that is about to begin; for sequential instruction fetches, the address of the next aligned instruction double-word is presented on the address bus. The CPU always activates all byte enable signals (BE0–3) for both sequential and non-sequential fetches. BOUT is also asserted during T2 if the addressed double-word is not the last in an aligned 16-byte block. Tables 3-5 to 3-7 show the fetch sequence for the various bus widths.
32-Bit Bus Width  
The CPU reads the entire double-word present on the data  
bus into its internal instruction buffer.  
If BOUT and BIN are both active, the CPU reads up to 3  
consecutive double-words using burst cycles. Burst cycles  
are used for instruction fetches regardless of whether the  
accesses are cacheable.  
TL/EE/9354–35
FIGURE 3-30. Basic Interface for 32-Bit Memories
Note: The CACH signal must be asserted during cacheable read accesses.
Example: JUMP @5
• The CPU performs a fetch cycle at address 5 with BE0–3 all active.
• Two burst cycles are then performed and addresses 8 and 12 are output while BE0–3 are kept active.
16-Bit Bus Width
The word on the least-significant half of the data bus is read by the CPU. This is either the even or the odd word within the required instruction double-word, as determined by address bit 1.
The CPU then complements address bit 1, clears address bit 0 and initiates a bus cycle to read the other word, while keeping all the BE0–3 signals active.
These two words are then assembled into a double-word and transferred into the instruction buffer.
In case of a non-sequential fetch, if the access is not cacheable and the instruction address selects the odd word within the instruction double-word, the even word is not fetched.
Example: JUMP @6
• A fetch cycle is performed at address 6 with BE0–3 all active.
• The word at address 4 is then fetched if the access is cacheable.
8-Bit Bus Width
The instruction byte on the bus lines D0–7 is fetched. The CPU performs three consecutive cycles to read the remaining bytes within the required double-word, while keeping BE0–3 all active. The 4 bytes are then assembled into a double-word and transferred into the instruction buffer. For a non-sequential fetch, if the access is not cacheable, the CPU will only read the upper bytes within the instruction double-word starting with the byte at the instruction address.
Example: JUMP @7
• The CPU performs a fetch cycle at address 7 with BE0–3 all active.
• Bytes at addresses 4, 5 and 6 are then fetched consecutively if the access is cacheable.
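The three JUMP examples can be reproduced with a small model of a non-sequential instruction fetch. This is a sketch under stated assumptions, not a description of the actual hardware: the function name `fetch_cycles` is hypothetical, BIN is assumed active on a 32-bit bus (so bursts run to the end of the aligned 16-byte block), and for a cacheable fetch on an 8-bit bus the remaining bytes are assumed to be read in wrap-around order, which is what the JUMP @7 example suggests.

```python
def fetch_cycles(addr, bus_bits, cacheable):
    """List the addresses driven for a non-sequential instruction fetch."""
    dword = addr & ~3                      # aligned double-word holding addr
    if bus_bits == 32:
        block_end = (addr & ~15) + 16      # end of the aligned 16-byte block
        # First cycle at the exact address, then burst cycles for the
        # remaining double-words of the block (assumes BIN is active).
        return [addr] + list(range(dword + 4, block_end, 4))
    if bus_bits == 16:
        cycles = [addr]
        odd_word_first = bool(addr & 2)    # address bit 1 selects the word
        other_word = (addr ^ 2) & ~1       # complement bit 1, clear bit 0
        # The even word is skipped only for a non-cacheable fetch that
        # starts in the odd word.
        if cacheable or not odd_word_first:
            cycles.append(other_word)
        return cycles
    if bus_bits == 8:
        if cacheable:
            # Assumed wrap-around order within the double-word.
            return [dword + ((addr + i) & 3) for i in range(4)]
        # Non-cacheable: only the bytes from addr to the end of the double-word.
        return list(range(addr, dword + 4))
    raise ValueError("bus width must be 8, 16 or 32 bits")
```

Under these assumptions `fetch_cycles(5, 32, True)` gives the sequence 5, 8, 12 of the 32-bit example, and `fetch_cycles(7, 8, True)` gives 7, 4, 5, 6.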
TABLE 3-5. Cacheable/Non-Cacheable Instruction Fetches from a 32-Bit Bus
1. In a burst access four bytes are fetched with the L.S. bits of the address set to 00.
2. A ‘C’ on the data bus refers to cacheable fetches and indicates that the byte is placed in the instruction cache. An ‘I’ refers to non-cacheable fetches and indicates that the byte is ignored.

Number    Address  Bytes to be Fetched  Address  BE0–3    Data Bus
of Bytes  LSB                           Bus
1         11       B0  –   –   –        A        L L L L  B0   C/I  C/I  C/I
2         10       B1  B0  –   –        A        L L L L  B1   B0   C/I  C/I
3         01       B2  B1  B0  –        A        L L L L  B2   B1   B0   C/I
4         00       B3  B2  B1  B0       A        L L L L  B3   B2   B1   B0
TABLE 3-6. Cacheable/Non-Cacheable Instruction Fetches from a 16-Bit Bus  
1. A bus access marked with ‘*’ in the ‘Address Bus’ column is performed only if the fetch is cacheable.  
Number  
of Bytes  
Address  
LSB  
Address  
Bus  
Bytes to be Fetched  
BE0–3  
Data Bus  
1
2
3
4
11  
10  
01  
00  
B0  
B1  
B2  
B3  
Ð
B0  
B1  
B2  
Ð
Ð
Ð
Ð
Ð
B0  
A
L L L L  
L L L L  
Ð
Ð
Ð
B0  
C
C/I  
C
b
*A  
*A  
A
3
2
1
2
Ð
A
L L L L  
L L L L  
Ð
Ð
Ð
Ð
B1  
C
B0  
C
b
B0  
B1  
A
L L L L  
L L L L  
Ð
Ð
Ð
Ð
B0  
B2  
C/I  
B1  
a
A
L L L L  
L L L L  
Ð
Ð
Ð
Ð
B1  
B3  
B0  
B2  
a
A
TABLE 3-7. Cacheable/Non-Cacheable Instruction Fetches from an 8-Bit Bus  
Number  
of Bytes  
Address  
LSB  
Address  
Bus  
Bytes to be Fetched  
BE0–3  
Data Bus  
1
2
3
4
11  
10  
01  
00  
B0  
B1  
B2  
B3  
Ð
B0  
B1  
B2  
Ð
Ð
Ð
Ð
Ð
B0  
A
L L L L  
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
C
b
b
b
* A  
* A  
* A  
3
2
1
Ð
Ð
Ð
C
C
A
L L L L  
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
C
a
b
b
A
* A  
* A  
1
2
1
C
B0  
B1  
A
L L L L  
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
B2  
C
a
a
b
A
A
1
2
1
* A  
A
L L L L  
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
B2  
B3  
a
a
a
A
A
A
1
2
3
3.5.6.2 Data Read Sequences
The CPU starts a data read access by placing the exact address of the operand on the address bus. The byte enable lines are activated to select only the bytes required by the instruction being executed. This prevents spurious accesses to peripheral devices that might be sensitive to read accesses, such as those which exhibit the characteristic of destructive reading. If the on-chip data cache is internally enabled for the read access, the BOUT signal is asserted at the beginning of state T2. BOUT will be deasserted if the data cache is externally inhibited (through CIIN or IODEC), or the bus width is other than 32 bits. During cacheable accesses the CPU always reads all the bytes in the double-word, whether or not they are needed to execute the instruction, and stores them into the data cache. The external memory, in this case, must place the data on the bus regardless of the state of the byte enable signals.
If the data cache is either internally or externally inhibited during the access, the CPU ignores the bytes not selected by the BE0–3 signals. Data read sequences for the various bus widths are shown in tables 3-8 to 3-10.
32-Bit Bus Width
The entire double-word present on the bus is read by the CPU. If the access is cacheable and the memory allows burst accesses, the CPU reads up to 3 additional double-words within the aligned 16-byte block containing the first byte of the operand. These burst accesses are performed in a wrap-around fashion within the 16-byte block.
Example: MOVW @5, R0
• The CPU reads a double-word at address 5 while keeping BE1 and BE2 active.
• If the access is not cacheable, BOUT is deasserted and the data bytes 0 and 3 are ignored.
• If the access is cacheable, the CPU performs burst cycles with BE0–3 all active, to read the double-words at addresses 8, 12, and 0.
16-Bit Bus Width
The word on the least-significant half of the data bus is read by the CPU. The CPU can then perform another access cycle with address bit 1 complemented and address bit 0 cleared to read the other word within the addressed double-word.
If the access is cacheable, the entire double-word is read and stored into the cache.
If the access is not cacheable, the CPU ignores the bytes in the double-word not selected by BE0–3. In this case, the second access cycle is not performed, unless selected bytes are contained in the second word.
Example: MOVB @5, R0
• The CPU reads a word at address 5 while keeping BE1 active.
• If the access is not cacheable, the CPU ignores byte 0.
• If the access is cacheable, the CPU performs another access cycle, with BE0–3 all active, to read the word at address 6.
8-Bit Bus Width
The data byte on the bus lines D0–7 is read by the CPU. The CPU can then perform up to 3 access cycles to read the remaining bytes in the double-word.
If the access is cacheable, the entire double-word is read and stored into the cache.
If the access is not cacheable, the CPU will only perform those access cycles needed to read the selected bytes.
Example: MOVW @5, R0
• The CPU reads the byte at address 5 while keeping BE1 and BE2 active.
• If the access is not cacheable, the CPU activates BE2 and reads the byte at address 6.
• If the access is cacheable, the CPU performs three bus cycles with BE0–3 all active, to read the bytes at addresses 6, 7 and 4.
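The data read examples for the three bus widths follow one pattern, which the sketch below tries to capture. It is an illustration under assumptions rather than a statement of the CPU's exact behavior: `data_read_cycles` is a hypothetical name, the operand is assumed to lie inside one aligned double-word (operands that cross a double-word boundary are split into separate accesses first), and on a 32-bit bus the memory is assumed to allow burst accesses.

```python
def data_read_cycles(addr, nbytes, bus_bits, cacheable):
    """List the addresses driven while reading an operand of nbytes at addr."""
    lsb = addr & 3
    if not 1 <= nbytes <= 4 or lsb + nbytes > 4:
        raise ValueError("operand must fit inside one aligned double-word")
    dword = addr & ~3
    if bus_bits == 32:
        cycles = [addr]
        if cacheable:
            block = addr & ~15
            # Burst cycles wrap around inside the aligned 16-byte block.
            cycles += [block + (dword - block + 4 * i) % 16 for i in (1, 2, 3)]
        return cycles
    if bus_bits == 16:
        cycles = [addr]
        spans_both_words = (lsb >> 1) != ((lsb + nbytes - 1) >> 1)
        # The second word is read if the access is cacheable or if it
        # contains selected bytes.
        if cacheable or spans_both_words:
            cycles.append((addr ^ 2) & ~1)
        return cycles
    if bus_bits == 8:
        if cacheable:
            # Whole double-word, starting at addr and wrapping around.
            return [dword + ((lsb + i) & 3) for i in range(4)]
        # Only the selected bytes.
        return [addr + i for i in range(nbytes)]
    raise ValueError("bus width must be 8, 16 or 32 bits")
```

With these assumptions, `data_read_cycles(5, 2, 32, True)` returns the addresses 5, 8, 12, 0 of the MOVW example, and `data_read_cycles(5, 2, 8, False)` returns 5, 6.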
TABLE 3-8. Cacheable/Non-Cacheable Data Reads from a 32-Bit Bus
1. In a burst access four bytes are read with the L.S. bits of the address set to 00.
2. A ‘C’ on the data bus refers to cacheable reads and indicates that the byte is placed in the data cache. An ‘I’ refers to non-cacheable reads and indicates that the byte is ignored.

Number    Address  Bytes to be Read     Address  BE0–3    Data Bus
of Bytes  LSB                           Bus
1         00       –   –   –   B0       A        H H H L  C/I  C/I  C/I  B0
1         01       –   –   B0  –        A        H H L H  C/I  C/I  B0   C/I
1         10       –   B0  –   –        A        H L H H  C/I  B0   C/I  C/I
1         11       B0  –   –   –        A        L H H H  B0   C/I  C/I  C/I
2         00       –   –   B1  B0       A        H H L L  C/I  C/I  B1   B0
2         01       –   B1  B0  –        A        H L L H  C/I  B1   B0   C/I
2         10       B1  B0  –   –        A        L L H H  B1   B0   C/I  C/I
3         00       –   B2  B1  B0       A        H L L L  C/I  B2   B1   B0
3         01       B2  B1  B0  –        A        L L L H  B2   B1   B0   C/I
4         00       B3  B2  B1  B0       A        L L L L  B3   B2   B1   B0
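The byte enable patterns listed in Table 3-8 follow directly from the operand size and the two least-significant address bits. A minimal sketch, assuming that "L" marks an asserted byte enable and that the pattern is written in the order BE3 BE2 BE1 BE0 as in the table; the function name `byte_enable_pattern` is hypothetical.

```python
def byte_enable_pattern(nbytes, addr_lsb):
    """Return the BE3..BE0 pattern for a read of nbytes at address LSBs addr_lsb.

    'L' marks an asserted byte enable, 'H' a deasserted one.
    """
    if not 1 <= nbytes <= 4 or not 0 <= addr_lsb <= 3 or addr_lsb + nbytes > 4:
        raise ValueError("operand must fit inside one aligned double-word")
    selected = range(addr_lsb, addr_lsb + nbytes)   # byte lanes in use
    return " ".join("L" if lane in selected else "H" for lane in (3, 2, 1, 0))
```

For instance, a 2-byte read at address LSBs 01 selects byte lanes 1 and 2, which gives the pattern H L L H.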
TABLE 3-9. Cacheable/Non-Cacheable Data Reads from a 16-Bit Bus  
1. A bus access marked with ‘*’ in the ‘Address Bus’ column is performed only if the read is cacheable.  
Number  
of Bytes  
Address  
LSB  
Address  
Bus  
BE0–3  
Non Cach.  
Data to be Read  
Data Bus  
Cach.  
1
1
1
1
2
2
2
3
3
4
00  
01  
10  
11  
00  
01  
10  
00  
01  
00  
Ð
Ð
Ð
Ð
Ð
B0  
Ð
B0  
Ð
Ð
Ð
B0  
Ð
Ð
B0  
Ð
B0  
A
H H H L  
L L L L  
H H H L  
H H L H  
H L H H  
L H H H  
H H L L  
Ð
Ð
Ð
C/I  
C
B0  
C
a
* A  
* A  
* A  
* A  
* A  
A
2
1
2
3
2
1
2
2
1
2
Ð
A
H H L H  
L L L L  
Ð
Ð
Ð
Ð
B0  
C
C/I  
C
a
Ð
B0  
Ð
A
H L H H  
L L L L  
Ð
Ð
Ð
Ð
C/I  
C
B0  
C
b
B0  
Ð
Ð
A
L H H H  
L L L L  
Ð
Ð
Ð
Ð
B0  
C
C/I  
C
b
Ð
B1  
B0  
Ð
A
H H L L  
L L L L  
Ð
Ð
Ð
Ð
B1  
C
B0  
C
a
Ð
B1  
B0  
B2  
B1  
B2  
A
H L L H  
L L L L  
H L L H  
H L H H  
Ð
Ð
Ð
Ð
B0  
C/I  
B1  
a
C/I  
B1  
Ð
A
L L H H  
L L L L  
L L H H  
Ð
Ð
Ð
Ð
B1  
C
B0  
C
b
* A  
B1  
B0  
B1  
A
H L L L  
L L L L  
H L L L  
Ð
Ð
Ð
Ð
B1  
B0  
B2  
a
A
H L H H  
C/I  
B2  
B3  
A
L L L H  
L L L L  
L L L H  
L L H H  
Ð
Ð
Ð
Ð
B0  
B2  
C/I  
B1  
a
A
A
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
B1  
B3  
B0  
B2  
a
A
L L H H  
TABLE 3-10. Cacheable/Non-Cacheable Data Reads from an 8-Bit Bus D812  
Number  
of Bytes  
Address  
LSB  
Address  
Bus  
BE0–3  
Non Cach.  
Data to be Read  
Data Bus  
Cach.  
1
1
1
1
2
2
2
3
3
4
00  
01  
10  
11  
00  
01  
10  
00  
01  
00  
Ð
Ð
Ð
Ð
Ð
B0  
Ð
B0  
Ð
Ð
Ð
B0  
Ð
Ð
B0  
Ð
B0  
A
H H H L  
L L L L  
L L L L  
L L L L  
H H H L  
H H L H  
H L H H  
L H H H  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
C
a
a
a
*A  
*A  
*A  
1
2
3
C
C
A
H H L H  
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
C
a
a
b
*A  
*A  
*A  
1
2
1
C
C
Ð
B0  
Ð
A
H L H H  
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
C
a
b
b
*A  
*A  
*A  
1
2
1
C
C
B0  
Ð
Ð
A
L H H H  
L L L L  
L L L L  
L L L L  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
C
b
b
b
*A  
*A  
*A  
3
2
1
C
C
Ð
B1  
B0  
Ð
A
H H L L  
L L L L  
L L L L  
L L L L  
H H L L  
H H L H  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
C
a
a
a
A
*A  
*A  
1
2
3
C
Ð
B1  
B0  
B2  
B1  
B2  
A
H L L H  
L L L L  
L L L L  
L L L L  
H L L H  
H L H H  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
C
a
a
b
A
*A  
*A  
1
2
1
C
B1  
Ð
A
L L H H  
L L L L  
L L L L  
L L L L  
L L H H  
L H H H  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
C
a
b
b
A
*A  
*A  
1
2
1
C
B1  
B0  
B1  
A
H L L L  
L L L L  
L L L L  
L L L L  
H L L L  
H L L H  
H L H H  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
B2  
C
a
a
a
A
A
1
2
3
*A  
B2  
B3  
A
L L L H  
L L L L  
L L L L  
L L L L  
L L L H  
L L H H  
L H H H  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
B2  
C
a
a
b
A
A
1
2
1
*A  
A
L L L L  
L L L L  
L L L L  
L L L L  
L L L L  
L L L H  
L L H H  
L H H H  
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
Ð
B0  
B1  
B2  
B3  
a
a
a
A
A
A
1
2
3
3.5.6.3 Data Write Sequences
In a write access the CPU outputs the operand address and asserts only the byte enable lines needed to select the specific bytes to be written.
In addition, the CPU duplicates the data to be written on the appropriate bytes of the data bus in order to handle 8-bit and 16-bit buses.
The various access sequences as well as the duplication of data are summarized in tables 3-11 to 3-13.
32-Bit Bus Width
The CPU performs only one access cycle to write the selected bytes within the addressed double-word.
Example: MOVB R0, @6
• The CPU duplicates byte 2 of the data bus into byte 0 and performs a write cycle at address 6 with BE2 active.
16-Bit Bus Width
Up to two access cycles are needed to complete the write operation.
Example: MOVW R0, @5
• The CPU duplicates byte 1 of the data bus into byte 0 and performs a write cycle at address 5 with BE1 and BE2 active.
• A write at address 6 is then performed with BE2 active and the original byte 2 of the data bus placed on byte 0.
8-Bit Bus Width
Up to 4 access cycles are needed in this case to complete the write operation.
Example: MOVB R0, @7
• The CPU duplicates byte 3 of the data bus into bytes 0 and 1, and then performs a write cycle at address 7 with BE3 active.
3.5.7 Bus Access Control
The NS32532 has the capability of relinquishing its control of the bus upon request from a DMA device or another CPU. This capability is implemented with the HOLD and HLDA signals. By asserting HOLD, an external device requests access to the bus. On receipt of HLDA from the CPU, the device may perform bus cycles, as the CPU at this point has placed all the output signals shown in Figure 3-32 into the TRI-STATE condition.
To return control of the bus to the CPU, the external device sets HOLD inactive, and the CPU acknowledges return of the bus by setting HLDA inactive.
The CPU samples HOLD in the middle of each T-state on the falling edge of BCLK. If HOLD is asserted when the bus is idle between access sequences, then the bus is granted immediately (see Figure 3-32). If HOLD is asserted during an access sequence, then the bus is granted immediately after the access sequence, including any retried bus cycles, has completed (see Figure 4-13). Note that an access sequence can be composed of several bus cycles if the bus width is 8 or 16 bits.
TABLE 3-11. Data Writes to a 32-Bit Bus  
1. Bytes on the data bus marked with ‘ ’ are undefined.  
#
Number  
of Bytes  
Address  
LSB  
Address  
Data to be Written  
BE0–3  
Data Bus  
Bus  
1
1
1
1
2
2
2
3
3
4
00  
01  
10  
11  
00  
01  
10  
00  
01  
00  
Ð
Ð
Ð
Ð
Ð
B0  
Ð
B0  
Ð
Ð
Ð
B0  
Ð
Ð
B0  
Ð
B0  
A
H H H L  
H H L H  
H L H H  
L H H H  
H H L L  
H L L H  
L L H H  
H L L L  
L L L H  
L L L L  
B0  
B0  
B0  
B0  
B0  
B0  
B0  
B0  
B0  
B0  
#
#
#
#
A
B0  
#
Ð
B0  
Ð
A
B0  
#
#
B0  
Ð
Ð
A
B0  
B0  
B1  
B0  
B1  
B1  
B0  
B1  
#
Ð
B1  
B0  
Ð
A
#
#
Ð
B1  
B0  
B2  
B1  
B2  
A
B1  
B0  
B2  
B1  
B2  
#
B1  
Ð
A
B1  
B1  
B0  
B1  
A
#
B2  
B3  
A
B2  
B3  
A
TABLE 3-12. Data Writes to a 16-Bit Bus  
Address  
Number  
of Bytes  
Address  
LSB  
Data to be Written  
BE0–3  
Data Bus  
Bus  
1
1
1
1
2
2
00  
01  
10  
11  
00  
01  
Ð
Ð
Ð
B0  
Ð
Ð
Ð
Ð
B0  
Ð
Ð
B1  
Ð
B0  
Ð
B0  
Ð
Ð
Ð
B0  
Ð
A
H H H L  
H H L H  
H L H H  
L H H H  
H H L L  
B0  
B0  
B0  
B0  
B0  
#
#
#
#
#
A
B0  
A
B0  
#
#
Ð
A
B0  
B0  
B1  
#
B1  
B0  
A
#
#
A
H L L H  
H L H H  
B1  
B0  
B0  
B1  
#
#
a
A
1
#
#
2
3
10  
00  
B1  
Ð
B0  
B2  
Ð
Ð
A
L L H H  
B1  
B0  
B1  
B0  
B1  
B0  
A
H L L L  
B2  
B1  
B0  
B2  
#
#
a
A
A
A
2
1
2
H L H H  
#
#
3
4
01  
00  
B2  
B3  
B1  
B2  
B0  
B1  
Ð
A
L L L H  
L L H H  
B2  
B1  
B0  
B2  
B0  
B1  
a
#
#
B0  
A
L L L L  
B3  
B2  
B1  
B3  
B0  
B2  
a
L L H H  
#
#
TABLE 3-13. Data Writes to an 8-Bit Bus  
Number  
of Bytes  
Address  
LSB  
Address  
Bus  
Data to be Written  
BE0–3  
Data Bus  
1
1
1
1
2
00  
01  
10  
11  
00  
Ð
Ð
Ð
B0  
Ð
Ð
Ð
B0  
Ð
Ð
Ð
B0  
Ð
B0  
Ð
Ð
Ð
B0  
A
A
A
A
A
H H H L  
H H L H  
H L H H  
L H H H  
B0  
B0  
B0  
B0  
#
#
#
#
#
B0  
B0  
#
#
Ð
B0  
B0  
#
B1  
H H L L  
H H L H  
B1  
B0  
B1  
#
#
#
#
a
A
A
A
1
1
1
#
2
2
3
01  
10  
00  
Ð
B1  
Ð
B1  
B0  
B2  
B0  
Ð
Ð
Ð
A
H L L H  
H L H H  
B1  
B0  
B0  
B1  
#
#
a
#
#
A
L L H H  
L H H H  
B1  
B0  
B1  
B0  
B1  
a
#
#
#
B1  
B0  
A
H L L L  
H L L H  
H L H H  
B2  
#
#
B1  
#
#
B0  
B1  
B2  
#
#
#
a
a
A
A
1
2
3
4
01  
00  
B2  
B3  
B1  
B2  
B0  
B1  
Ð
A
L L L H  
L L H H  
L H H H  
B2  
#
#
B1  
#
#
B0  
#
#
B0  
B1  
B2  
a
a
A
A
1
2
B0  
A
L L L L  
L L L H  
L L H H  
L H H H  
B3  
#
#
B2  
#
#
B1  
#
#
B0  
B1  
B2  
B3  
a
a
a
A
A
A
1
2
3
#
#
#
TL/EE/9354–37
FIGURE 3-32. Hold Acknowledge. (Bus Initially Idle.)
Note: The status indicates ‘IDLE’ while the bus is granted. If the cause of the IDLE changes (e.g., CPU starts waiting for an interrupt), the status also changes.
The CPU will never grant the bus between interlocked read and write bus cycles.
Note: If an external device requires a very short latency to get control of the bus, the bus retry signal (BRT) can be used instead of hold. See Section 3.5.5.
3.5.8 Interfacing Memory-Mapped I/O Devices
In Section 3.1.3.2 it was mentioned that some special precautions are needed when interfacing I/O devices to the NS32532 due to its internal pipelined implementation. Two special signals are provided for this purpose: IOINH and IODEC. The CPU asserts IOINH during a read bus cycle to indicate that the bus cycle should be ignored if an I/O device is selected. The system responds by asserting IODEC to indicate to the CPU that an I/O device has been selected. IODEC is sampled by the CPU in the middle of state T2. If the cycle is extended, then the CPU uses the IODEC value sampled during the last wait state. If a bus error or a bus retry occurs, the sampled IODEC value is ignored. IODEC must be kept high during burst transfer cycles.
When IODEC is active during a bus cycle for which IOINH is asserted, the CPU discards the data and applies the special handling required for I/O devices. Figure 3-33 shows a possible implementation of an I/O device interface where the address mapping of the I/O devices is fixed.
In an open system configuration, IODEC could be generated by the decoding logic of each I/O device subsystem.
When the on-chip MMU is enabled, the CIOUT signal could also be used for this purpose, since I/O devices are located in noncacheable areas. In this case however, a small performance degradation could result, due to the fact that the special I/O handling is also applied on references to noncacheable program and/or data areas.
Note 1: When IODEC is active in response to a read bus cycle, the CPU treats the reference as noncacheable.
Note 2: IOINH is kept inactive during write cycles.
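The interaction of IOINH and IODEC during a read bus cycle can be summarized in a few lines. This is a sketch only: the name `io_read_outcome` and the returned strings are hypothetical, and the boolean arguments mean "asserted" or "sampled active", not electrical levels.

```python
def io_read_outcome(ioinh_asserted, iodec_active, bus_error=False, bus_retry=False):
    """Classify a read bus cycle from the IOINH output and the sampled IODEC input."""
    if bus_error or bus_retry:
        # The sampled IODEC value is ignored in these cases.
        return "iodec_ignored"
    if not iodec_active:
        return "normal_read"
    if ioinh_asserted:
        # An I/O device was selected on a cycle the CPU asked to be ignored:
        # the data is discarded and the special I/O handling is applied.
        return "discard_and_apply_io_handling"
    # IODEC active on a read: the reference is treated as noncacheable.
    return "noncacheable_read"
```

The function covers read cycles only, since IOINH is kept inactive during write cycles.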
INVIC, INVDC, INVSET and CIA0CIA6 are all sampled  
synchronously by the CPU on the rising edge of BCLK. The  
CPU can respond to cache invalidation requests at a rate of  
one per BCLK cycle.  
As shown in Figures 3-16 and 3-17, the validity bits of the  
on-chip caches are dual-ported. One port is used for ac-  
cessing and updating the caches, while the other port is  
used independently for invalidation requests. Consequently,  
invalidation of the on-chip caches occurs with no interfer-  
ence to on-going cache accesses or bus cycles.  
TL/EE/935438  
FIGURE 3-33. Typical I/O Device Interface  
3.5.9 Interrupt and Debug Trap Requests  
A cache invalidation request can occur during a read bus  
cycle for a location affected by the invalidation. In such a  
case, the data will be invalid in the cache if the invalidation  
request occurs after the T2- or T2B-state of the bus cycle.  
Three signals are provided by the CPU to externally request  
interrupts and/or a debug trap. INT and NMI are for maska-  
ble and non-maskable interrupts respectively. DBG is used  
for requesting an external debug trap.  
Note: In the case of the Data Cache, the cache location will also be invali-  
dated if the invalidation occurs during T2 or T2B of the read cycle.  
Refer to Figure 4-18 in Section 4 for timing details.  
The CPU samples INT and NMI on every other rising edge  
of BCLK, starting with the second rising edge of BCLK after  
RST goes high.  
3.5.11 Internal Status  
The NS32532 provides information on the system interface  
concerning its internal activity.  
NMI is edge-sensitive; a high-to-low transition on it is detect-  
ed by the CPU and stored in an internal latch, so that there  
is no need to keep it asserted until it is acknowledged.  
The U/S signal indicates the Address Space for a memory  
reference (See Section 2.4.2).  
INT is level-sensitive and, as such, once asserted, it must  
be kept asserted until it is acknowledged.  
Note that U/S does not necessarily reflect the value of the  
U bit in the PSR register. For example, U/S is high during  
the memory access used to store the destination operand of  
a MOVSU instruction.  
The DBG signal, like NMI, is edge-sensitive; it differs from  
NMI in that the CPU samples it on each rising edge of  
BCLK. DBG can be asserted asynchronously to the CPU  
clock, but it should be at least 1.5 clock cycles wide in order  
to be recognized.  
The PFS signal is asserted for one BCLK cycle when the  
CPU begins executing a new instruction. The ISF signal is  
driven High along with PFS if the new instruction does not  
follow the previous instruction in sequence. More specifical-  
ly, ISF is High along with PFS after processing an exception  
or after executing one of the following instructions: ACB  
(branch taken), Bcond (branch taken), BR, BSR, CASE,  
CXP, CXPD, DIA, JSR, JUMP, RET, RETT, RETI, and RXP.  
If DBG meets the specified setup and hold times, it will be  
recognized on the rising edge of BCLK deterministically.  
Refer to Figures 4-19 and 4-20 for more details on the tim-  
ing of the above signals.  
Note: If the NMI signal is pulsed to request a non-maskable interrupt, it may  
be necessary to keep it asserted for a minimum of two clock cycles to  
guarantee its detection, unless extra logic ensures that the pulse oc-  
curs around the BCLK sampling edge.  
The BP signal is asserted for one BCLK cycle when an ad-  
dress-compare or PC-match condition is detected. If the BP  
signal is asserted one BCLK cycle after PFS, it indicates  
that an address-compare debug condition has been detect-  
ed. If BP is asserted at any other time, it indicates that a PC-  
Match debug condition has been detected.  
3.5.10 Cache Invalidation Requests  
The contents of the on-chip Instruction and Data Caches  
can be invalidated by external requests from the system. It  
is possible to invalidate a single set or all sets in the Instruc-  
tion Cache, Data Cache or both. The input signals INVIC  
and INVDC request invalidation of the Instruction Cache  
and Data Cache respectively. The input signal INVSET indi-  
cates whether the invalidation applies to a single set (16  
bytes for the Instruction Cache and 32 bytes for the Data  
Cache) or to the entire cache. When only a single set is  
invalidated, the set number is specified on CIA0–CIA6.  
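As a rough illustration of how the invalidation inputs combine, the sketch below decodes one sampled request. It assumes the polarities given in the pin descriptions (INVIC and INVDC request invalidation when Low, INVSET Low selects a single set) and that only CIA0–4 carry the set address; the names `Invalidation` and `decode_invalidation` are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Invalidation(NamedTuple):
    caches: Tuple[str, ...]      # which caches are invalidated
    set_number: Optional[int]    # None means the entire cache


def decode_invalidation(invic, invdc, invset, cia):
    """Decode sampled pin levels (0 = Low, 1 = High) into a request.

    Assumes INVIC/INVDC are active Low and INVSET Low means
    "single set"; only CIA bits 0-4 are used as the set address.
    """
    caches = tuple(
        name
        for name, level in (("instruction", invic), ("data", invdc))
        if level == 0
    )
    if not caches:
        return None  # no invalidation requested
    set_number = (cia & 0x1F) if invset == 0 else None
    return Invalidation(caches, set_number)
```

For example, `decode_invalidation(0, 1, 0, 0b0000101)` describes a request to invalidate set 5 of the Instruction Cache only.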
While executing an LMR or CINV instruction, the CPU dis-  
plays the operation code and source operand using slave  
processor write bus cycles. This information can be used to  
monitor the contents of the on-chip TLB, Instruction Cache  
and Data Cache.  
During idle bus cycles, the signals ST0–ST4 indicate whether  
the CPU is waiting for an interrupt, waiting for a Slave  
Processor to complete executing an instruction or halted.  
4.0 Device Specifications  
TL/EE/9354–39  
FIGURE 4-1. NS32532 Interface Signals  
4.1 NS32532 PIN DESCRIPTIONS  
Descriptions of the NS32532 pins are given in the following  
sections. Included are also references to portions of the functional  
description, Section 3.  
Figure 4-1 shows the NS32532 interface signals grouped  
according to related functions.  
Note: An asterisk next to the signal name indicates a TRI-STATE condition  
for that signal when HOLD is acknowledged or during an extended  
retry.  
4.1.1 Supplies  
VCCL1–6  
Logic Power.  
+5V positive supplies for on-chip logic.  
VCCB1–14  
Buffers Power.  
+5V positive supplies for on-chip output buffers.  
VCCCLK  
Bus Clock Power.  
+5V positive supply for on-chip clock drivers.  
GNDL1–6  
Logic Ground.  
Ground references for on-chip logic.  
GNDB1–13  
Buffers Ground.  
Ground references for on-chip output buffers.  
GNDCLK  
Bus Clock Ground.  
Ground reference for on-chip clock drivers.  
4.1.2 Input Signals  
CLK  
Clock.  
Input Clock used to derive all CPU Timing.  
SYNC  
Synchronize.  
When SYNC is active, BCLK will stop toggling.  
This signal can be used to synchronize  
two or more CPUs (Section 3.5.2).  
HOLD  
Hold Request.  
When active, causes the CPU to release the  
bus for DMA or multiprocessing purposes  
(Section 3.5.7).  
Note:  
If the HOLD signal is generated asynchronously, its setup  
and hold times may be violated. In this case it is  
recommended to synchronize it with the falling edge of  
BCLK to minimize the possibility of metastable states.  
The CPU provides only one synchronization stage to  
minimize the HLDA latency. This is to avoid speed  
degradations in cases of heavy HOLD activity (i.e. DMA controller  
cycles interleaved with CPU cycles).  
RST  
Reset.  
When RST is active, the CPU is initialized to  
a known state (Section 3.5.3).  
INT  
Interrupt.  
A low level on this signal requests a maskable  
interrupt (Section 3.5.9).  
NMI  
Nonmaskable Interrupt.  
A High-to-Low transition of this signal requests  
a nonmaskable interrupt (Section 3.5.9).  
DBG  
Debug Trap Request.  
A High-to-Low transition of this signal requests  
a debug trap (Section 3.5.9).  
CIA0–6  
Cache Invalidation Address Bus.  
Bits 0 through 4 specify the set address to  
invalidate in the on-chip caches. CIA0 is the  
least significant. Bits 5 and 6 are reserved  
(Section 3.5.10).  
INVSET  
Invalidate Set.  
When Low, only a set in the on-chip cache(s)  
is invalidated; when High, the entire cache(s)  
is (are) invalidated.  
INVDC  
Invalidate Data Cache.  
When Low, the Data Cache contents are invalidated.  
INVSET determines whether a single set  
or the entire Data Cache is invalidated.  
INVIC  
Invalidate Instruction Cache.  
When Low, the Instruction Cache contents  
are invalidated. INVSET determines whether  
a single set or the entire Instruction Cache is  
invalidated.  
CIIN  
Cache Inhibit In.  
When active, indicates that the location referenced  
in the current bus cycle is not cacheable.  
CIIN must not change within an aligned  
16-byte block.  
IODEC  
I/O Decode.  
Indicates to the CPU that a peripheral device  
is addressed by the current bus cycle. The  
value of IODEC must not change within an  
aligned 16-byte block (Section 3.5.8).  
FSSR  
Force Slave Status Read.  
When asserted, indicates that the slave  
status word should be read by the CPU  
(Section 3.1.4.1). An external 10 kΩ resistor  
should be connected between FSSR and VCC.  
SDN  
Slave Done.  
Used by a slave processor to signal the completion  
of a slave instruction (Section 3.1.4.1).  
An external 10 kΩ resistor should be  
connected between SDN and VCC.  
BIN  
Burst In.  
When active, indicates to the CPU that the  
memory supports burst cycles (Section  
3.5.4.3).  
RDY  
Ready.  
While this signal is not active, the CPU extends  
the current bus cycle to support a slow  
memory or peripheral device.  
BW0–1  
Bus Width.  
These lines define the bus width (8, 16 or 32  
bits) for each data transfer; BW0 is the least  
significant bit. The bus width must not  
change within an aligned 16-byte block.  
Encodings are:  
00: Reserved  
01: 8 Bits  
10: 16 Bits  
11: 32 Bits  
BRT  
Bus Retry.  
When active, the CPU will reexecute the last  
bus cycle (Section 3.5.5).  
BER  
Bus Error.  
When active, indicates that an error occurred  
during a bus cycle. It is treated by the CPU as  
the highest priority exception after reset.  
4.1.3 Output Signals  
BCLK  
Bus Clock.  
Output clock for bus timing (Section 3.5.2).  
BCLK (inverted)  
Bus Clock Inverse.  
Inverted output clock.  
HLDA  
Hold Acknowledge.  
Activated by the CPU in response to the  
HOLD input to indicate that the CPU has  
released the bus.  
PFS  
Program Flow Status.  
A pulse on this signal indicates the beginning  
of execution for each instruction (Section  
3.5.11).  
ISF  
Internal Sequential Fetch.  
Indicates along with PFS that the instruction  
beginning execution is sequential (ISF Low)  
or non-sequential (ISF High).  
U/S  
User/Supervisor.  
User or supervisor mode status.  
BP  
Break Point.  
This signal is activated when the CPU detects  
a PC or operand-address match debug  
condition (Section 3.3.2).  
CASEC  
*Cache Section.  
For cacheable data read bus cycles indicates  
the Section of the on-chip Data Cache where  
the data will be placed; undefined for other  
bus cycles. This signal can be used for external  
monitoring of the data cache contents.  
CIOUT  
Cache Inhibit Out.  
This signal reflects the state of the CI bit in  
the second level page table entry (PTE). It is  
used to specify non-cacheable pages. It is  
held low while address translation is disabled  
and for MMU references to page table entries.  
IOINH  
I/O Inhibit.  
Indicates that the current bus cycle should  
be ignored if a peripheral device is addressed.  
SPC  
Slave Processor Control.  
Data strobe for slave processor transfers.  
BOUT  
*Burst Out.  
When active, indicates that the CPU is requesting  
to perform burst cycles.  
ILO  
Interlocked Operation.  
When active, indicates that interlocked cycles  
are being performed (Section 3.5.4.5).  
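The BW0–1 encodings listed in the Bus Width description can be captured in a small lookup. This is a minimal sketch; the helper name `bus_width` and the choice to raise an error on the reserved code are assumptions made for illustration.

```python
# Bus width in bits for each (BW1, BW0) encoding; 00 is reserved.
BUS_WIDTH_BITS = {0b01: 8, 0b10: 16, 0b11: 32}


def bus_width(bw1, bw0):
    """Return the bus width in bits for sampled BW1/BW0 levels (0 or 1)."""
    code = (bw1 << 1) | bw0  # BW0 is the least significant bit
    if code not in BUS_WIDTH_BITS:
        raise ValueError("BW encoding 00 is reserved")
    return BUS_WIDTH_BITS[code]
```

Because the bus width must not change within an aligned 16-byte block, external logic would typically derive BW0–1 from the high-order address bits only.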
DDIN  
*Data Direction.  
Indicates the direction of a data transfer. It is  
low for reads and high for writes.  
CONF  
*Confirm Bus Cycle.  
When active, indicates that a bus cycle initiated  
by ADS is valid; that is, the bus cycle has  
not been cancelled (Section 3.5.4.2).  
BMT  
*Begin Memory Transaction.  
When Stable Low indicates that the current  
bus cycle is valid; that is, the bus cycle has  
not been cancelled (Section 3.5.4.2).  
ADS  
*Address Strobe.  
When active, indicates that a bus cycle has  
begun and a valid address is on the address  
bus.  
BE0–3  
*Byte Enables.  
Used to selectively enable data transfers on  
bytes 0–3 of the data bus.  
ST0–4  
Status.  
Bus cycle status code; ST0 is the least significant.  
Encodings are:  
00000: Idle: CPU Inactive on Bus.  
00001: Idle: WAIT Instruction.  
00010: Idle: Halted.  
00011: Idle: The bus is idle while the slave  
processor is executing an instruction.  
00100: Interrupt Acknowledge, Master.  
00101: Interrupt Acknowledge, Cascaded.  
00110: End of Interrupt, Master.  
00111: End of Interrupt, Cascaded.  
01000: Sequential Instruction Fetch.  
01001: Non-Sequential Instruction Fetch.  
01010: Data Transfer.  
01011: Read Read-Modify-Write Operand.  
01100: Read for Effective Address.  
01101: Access PTE1 by MMU.  
01110: Access PTE2 by MMU.  
01111 through 11100: Reserved.  
11101: Transfer Slave Operand.  
11110: Read Slave Status Word.  
11111: Broadcast Slave ID.  
A0–31  
*Address Bus.  
Used by the CPU to output a 32-bit address  
at the beginning of a bus cycle. A0 is the  
least significant.  
4.1.4 Input/Output Signals  
D0–31  
*Data Bus.  
Used by the CPU to input or output data during  
a read or write cycle respectively.  
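For external monitoring logic or trace tools, the ST0–4 status encodings listed in Section 4.1.3 can be expressed as a lookup table. The sketch below is illustrative only; the names `ST_CODES`, `decode_status` and `is_idle` are hypothetical, and the 5-bit value is assumed to be assembled with ST0 as bit 0.

```python
# ST4..ST0 bus cycle status codes (ST0 is the least significant bit).
ST_CODES = {
    0b00000: "Idle: CPU Inactive on Bus",
    0b00001: "Idle: WAIT Instruction",
    0b00010: "Idle: Halted",
    0b00011: "Idle: Slave Processor Executing",
    0b00100: "Interrupt Acknowledge, Master",
    0b00101: "Interrupt Acknowledge, Cascaded",
    0b00110: "End of Interrupt, Master",
    0b00111: "End of Interrupt, Cascaded",
    0b01000: "Sequential Instruction Fetch",
    0b01001: "Non-Sequential Instruction Fetch",
    0b01010: "Data Transfer",
    0b01011: "Read Read-Modify-Write Operand",
    0b01100: "Read for Effective Address",
    0b01101: "Access PTE1 by MMU",
    0b01110: "Access PTE2 by MMU",
    0b11101: "Transfer Slave Operand",
    0b11110: "Read Slave Status Word",
    0b11111: "Broadcast Slave ID",
}


def decode_status(st):
    """Return a description for a 5-bit status code; 01111-11100 are reserved."""
    if not 0 <= st <= 0b11111:
        raise ValueError("status code must fit in 5 bits")
    return ST_CODES.get(st, "Reserved")


def is_idle(st):
    """True for the four idle codes 00000 through 00011."""
    return 0 <= st <= 0b00011
```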
4.2 ABSOLUTE MAXIMUM RATINGS  
If Military/Aerospace specified devices are required,  
please contact the National Semiconductor Sales  
Office/Distributors for availability and specifications.  
Case Temperature Under Bias: 0°C to +95°C  
Storage Temperature: -65°C to +150°C  
All Input or Output Voltages with Respect to GND: -0.5V to +7V  
Power Dissipation: 4 W  
Note: Absolute maximum ratings indicate limits beyond  
which permanent damage may occur. Continuous operation  
at these limits is not intended; operation should be limited to  
those conditions specified under Electrical Characteristics.  
4.3 ELECTRICAL CHARACTERISTICS  TCASE = 0 to +95°C, VCC = 5V ±5%, GND = 0V  
Symbol | Parameter | Conditions | Limits | Units  
VIH | High Level Input Voltage | | Min 2.0, Max VCC + 0.5 | V  
VIL | Low Level Input Voltage | | Min -0.5, Max 0.8 | V  
VOH | High Level Output Voltage | IOH = -400 µA | Min 2.4 | V  
VOL | Low Level Output Voltage: A0–11, D0–31, DDIN | IOL = 4 mA | Max 0.4 | V  
VOL | Low Level Output Voltage: CONF, BMT | IOL = 6 mA | Max 0.4 | V  
VOL | Low Level Output Voltage: BCLK, BCLK (inverted) | IOL = 16 mA | Max 0.4 | V  
VOL | Low Level Output Voltage: All Other Outputs | IOL = 2 mA | Max 0.4 | V  
II | Input Load Current | 0 ≤ VIN ≤ VCC | Min -20, Max 20 | µA  
IL | Leakage Current (Output and I/O pins in TRI-STATE/Input Mode) | 0.4 ≤ VIN ≤ VCC | Min -20, Max 20 | µA  
CIN | CLK Input Capacitance | | 15 | pF  
ICC | Active Supply Current | IOUT = 0, TA = 25°C, VCC = 5V | Typ 650, Max 800 @ 30 MHz | mA  
ICC | Active Supply Current | IOUT = 0, TA = 25°C, VCC = 5V | Typ 550, Max 675 @ 25 MHz | mA  
ICC | Active Supply Current | IOUT = 0, TA = 25°C, VCC = 5V | Typ 450, Max 575 @ 20 MHz | mA  
Connection Diagram  
TL/EE/9354–40  
Bottom View  
Order Number NS32532-20, NS32532-25 or NS32532-30  
FIGURE 4-2. 175-Pin PGA Package  
NS32532 Pinout Descriptions  
Desc  
Pin  
Desc  
Pin  
Desc  
Pin  
Desc  
Pin  
Desc  
Pin  
Desc  
Pin  
Reserved  
Reserved  
Reserved  
BP  
A1 D26  
B16 GNDB13 D14  
GNDL6  
VCCL5  
D13  
J14  
J15  
J16  
K1  
GNDL5  
CONF  
RDY  
N9  
A0  
R6  
R7  
A2 Reserved C1 VCCB14  
A3 Reserved C2 D23  
D15  
D16  
E1  
N10 VCCB9  
N11 CIOUT  
N12 SPC  
R8  
A4 VCCL2  
C3 IOINH  
VCCB6  
A23  
HOLD  
VCCB11  
GNDB10  
D4  
R9  
ISF  
A5 Reserved C4 ILO  
E2  
K2  
N13 BE3  
R10  
R11  
R12  
R13  
R14  
R15  
R16  
S1  
RST  
A6 PFS  
A7 SDN  
C5 GNDB3  
C6 D24  
E3  
GNDL4  
GNDB11  
D11  
K3  
N14 VCCB10  
N15 ADS  
NMI  
E14  
E15  
E16  
F1  
K14  
K15  
K16  
L1  
GNDB1  
Reserved  
VCCB2  
INVIC  
A8 Reserved C7 D22  
D6  
N16 BW1  
A9 BCLK  
A10 VCCCLK  
A11 SYNC  
C8 D20  
C9 A30  
D12  
A16  
P1  
P2  
BER  
CIIN  
D2  
A22  
VCCB7  
GNDB6  
A10  
C10 CASEC  
C11 Reserved  
C12 D21  
C13 D19  
C14 D18  
C15 A29  
F2  
A21  
L2  
P3  
Reserved (1) A12 CIA0  
F3  
VCCL3  
D8  
L3  
P4  
A13  
A8  
CIA1  
A13 CIA6  
A14 VCCL6  
A15 D29  
B1 D27  
B2 D25  
B3 U/S  
F14  
F15  
F16  
G1  
L14  
L15  
L16  
M1  
M2  
M3  
A6  
P5  
S2  
CIA4  
D9  
A2  
P6  
A5  
S3  
VCCB1  
Reserved  
VCCB4  
Reserved  
Reserved  
VCCB3  
FSSR  
D10  
ST3  
P7  
A3  
S4  
A20  
GNDB8  
VCCL4  
BE1  
P8  
A1  
S5  
C16 A31  
G2  
GNDB5  
A17  
P9  
ST2  
ST1  
ST0  
BOUT  
DDIN  
BE2  
BE0  
BMT  
BRT  
IODEC  
D1  
S6  
D1 VCCB5  
G3  
P10  
P11  
P12  
P13  
P14  
P15  
P16  
R1  
S7  
B4 Reserved D2 GNDB12 G14  
D5  
M14 GNDB9  
M15 BW0  
M16 BIN  
S8  
B5 Reserved D3 D17  
G15  
G16  
H1  
D7  
S9  
B6 GNDL3  
B7 GNDB2  
B8 DBG  
D4 D16  
D5 A27  
D6 A28  
VCCB12  
A19  
S10  
S11  
S12  
S13  
S14  
S15  
S16  
INT  
N1  
N2  
N3  
N4  
N5  
N6  
N7  
N8  
Reserved  
D0  
VCCL1  
GNDL2  
INVSET  
INVDC  
CIA3  
H2  
A18  
B9 Reserved D7 GNDB4  
H3  
A14  
D3  
B10 BCLK  
B11 GNDCLK  
B12 CLK  
D8 VCCB13  
D9 D15  
H14  
H15  
H16  
J1  
A11  
A15  
A12  
A9  
VCCB8  
GNDB7  
ST4  
R2  
D10 D14  
D11 A26  
D12 A25  
D13 A24  
R3  
CIA5  
B13 CIA2  
B14 D31  
A7  
R4  
D30  
J2  
HLDA  
A4  
R5  
D28  
B15 GNDL1  
J3  
Note 1: This pin should be grounded.  
All other reserved pins should be left open.  
4.4 SWITCHING CHARACTERISTICS  
4.4.1 Definitions  
ABBREVIATIONS:  
L.E.: leading edge     R.E.: rising edge  
T.E.: trailing edge    F.E.: falling edge  
All the timing specifications given in this section refer to  
0.8V or 2.0V on all the signals as illustrated in Figures 4-3  
and 4-4, unless specifically stated otherwise.  
TL/EE/935442  
FIGURE 4-4. Input Signals Specification Standard  
TL/EE/935441  
FIGURE 4-3. Output Signals Specification Standard  
4.4.2 Timing Tables  
4.4.2.1 Output Signals: Internal Propagation Delays, NS32532-20, NS32532-25, NS32532-30  
Maximum times assume capacitive loading of 100 pF on the clock signals and 50 pF on all the other signals. A minimum  
capacitance load of 50 pF on BCLK and BCLK is also assumed.  
#
NS32532-20  
NS32532-25  
NS32532-30  
Name  
Figure  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
4–24  
Description  
Bus Clock Period  
BCLK High Time  
BCLK Low Time  
BCLK Rise Time  
BCLK Fall Time  
BCLK High Time  
BCLK Low Time  
BCLK Rise Time  
BCLK Fall Time  
Reference/Conditions  
Units  
Min  
Max  
Min  
Max  
Min  
Max  
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
R.E., BCLK to Next  
R.E., BCLK  
BC  
p
50  
100  
40  
100  
33.3  
100  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
ns  
At 2.0V on BCLK  
(Both Edges)  
0.5 t  
BCp  
0.5 t  
BCp  
0.5 t  
BCp  
BC  
h
b
b
b
5
4
3.65  
At 0.8V on BCLK  
(Both Edges)  
0.5 t  
b
0.5 t  
b
0.5 t  
BCp  
BC  
l
BCp  
5
BCp  
4
b
3.65  
(1)  
0.8V to 2.0V on  
R.E., BCLK  
BC  
r
5
5
4
4
3
3
(1)  
2.0V to 0.8V on  
F.E., BCLK  
BC  
f
At 2.0V on BCLK  
(Both Edges)  
0.5 t  
b
0.5 t  
b
0.5 t  
BCp  
NBC  
h
BCp  
5
BCp  
4
b
3.65  
At 0.8V on BCLK  
(Both Edges)  
0.5 t  
b
0.5 t  
b
0.5 t  
BCp  
NBC  
l
BCp  
5
BCp  
4
b
3.65  
(1)  
0.8V to 2.0V on  
R.E., BCLK  
NBC  
r
5
4
3
(1)  
2.0V to 0.8V on  
F.E., BCLK  
NBC  
f
5
4
3
CLK to BCLK  
R.E. Delay  
2.0V on R.E., CLK to  
2.0V on R.E., BCLK  
CBC  
dr  
20  
20  
20  
20  
17  
17  
17  
17  
15  
15  
15  
15  
CLK to BCLK  
F.E. Delay  
2.0V on R.E., CLK to  
0.8V on F.E., BCLK  
CBC  
df  
CLK to BCLK  
R.E. Delay  
2.0V on R.E., CLK to  
0.8V on R.E., BCLK  
CNBC  
dr  
CLK to BCLK  
F.E. Delay  
2.0V on R.E., CLK to  
0.8V on F.E., BCLK  
CNBC  
df  
Bus Clocks Skew 2.0V on R.E., BCLK to  
0.8V on F.E., BCLK  
BCNBC  
rf  
b
b
a
a
b
b
a
a
b
a
a
2
2
2
2
2
2
2
2
1
1
1
1
Bus Clocks Skew 0.8V on F.E., BCLK to  
2.0V on R.E., BCLK  
BCNBC  
fr  
b
4–5, 4–6 Address Bits 031 After R.E., BCLK T1  
Valid  
A
v
11  
9
8
4–5, 4–6 Address Bits 031 After R.E., BCLK T1 or Ti  
Hold  
A
h
0
0
0
0
0
0
4–11, 412 Address Bits 031 After F.E., BCLK Ti  
Floating  
A
f
21  
17  
13  
4–11, 412 Address Bits 031 After F.E., BCLK Ti  
Not Floating  
An  
f
Note 1: Guaranteed by characterization. Due to tester conditions this parameter is not 100% tested.  
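As a worked example of the BCLK limits in this table, the sketch below evaluates the minimum BCLK high (or low) time, 0.5 tBCp minus a per-grade margin, for a given bus clock period. The dictionary and function names are hypothetical, and the margins (5, 4 and 3.65 ns) are taken from the table as read here; this is an illustration, not a substitute for the table.

```python
# Per speed grade: (minimum tBCp in ns, margin subtracted from 0.5 * tBCp in ns).
BCLK_LIMITS = {
    "NS32532-20": (50.0, 5.0),
    "NS32532-25": (40.0, 4.0),
    "NS32532-30": (33.3, 3.65),
}
MAX_BCLK_PERIOD_NS = 100.0  # maximum tBCp for all grades


def min_bclk_high_ns(grade, period_ns=None):
    """Minimum BCLK high (or low) time in ns: 0.5 * tBCp - margin.

    If period_ns is omitted, the grade's minimum bus clock period is used.
    """
    min_period, margin = BCLK_LIMITS[grade]
    period = min_period if period_ns is None else period_ns
    if not min_period <= period <= MAX_BCLK_PERIOD_NS:
        raise ValueError("bus clock period outside the specified range")
    return 0.5 * period - margin
```

At the minimum period this gives 20 ns for the NS32532-20 and 16 ns for the NS32532-25.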
4.4.2.1 Output Signals: Internal Propagation Delays, NS32532-20, NS32532-25, NS32532-30 (Continued)  
NS32532-20  
NS32532-25  
NS32532-30  
Name  
Figure  
4–8  
Description  
Reference/Conditions  
Units  
Min  
Max  
Min  
Max  
Min  
Max  
t
t
t
Address Bits A2, A3 After R.E., BCLK T2B  
Valid (Burst Cycle)  
AB  
v
11  
9
8
ns  
4–8  
Address Bits A2, A3 After R.E., BCLK T2B  
Hold (Burst Cycle)  
AB  
h
0
0
0
ns  
ns  
4–6, 415 Data Out Valid  
After R.E., BCLK T1  
0.5 t  
BCp  
a
13 ns  
0.5 t  
BCp  
a
12 ns  
0.5 t  
BCp  
a
11 ns  
DO  
v
0.5 t  
BCp  
0.5 t  
BCp  
0.5 t  
BCp  
t
t
4–6, 415 Data Out Hold  
After R.E., BCLK T1 or Ti  
Before SPC T.E.  
0
0
0
8
ns  
ns  
DO  
h
4–15  
4–7  
Data Out Setup  
(Slave Write)  
DO  
spc  
12  
10  
t
Data Bus Floating  
After R.E., BCLK  
T1 or Ti  
DO  
f
21  
17  
13  
ns  
ns  
t
4–7  
Data Bus  
After F.E., BCLK T1  
DO  
nf  
0
0
0
0
0
0
0
0
0
Not Floating  
t
t
t
t
4–5, 4–7 BMT Signal Valid  
4–5, 4–7 BMT Signal Hold  
After R.E., BCLK T1  
After R.E., BCLK T2  
30  
21  
25  
17  
21  
13  
ns  
ns  
ns  
BMT  
v
BMT  
h
4–11, 412 BMT Signal Floating After F.E., BCLK Ti  
BMT  
f
4–11, 412 BMT Signal  
Not Floating  
After F.E., BCLK Ti  
BMT  
hf  
ns  
ns  
t
4–5, 4–8 CONF Signal Active After R.E., BCLK T1  
0.5 t  
a
0.5 t  
a
0.5 t  
a
CONF  
a
BC  
11  
BC  
9
BC  
8
p
p
p
0.5 t  
BC  
0.5 t  
BC  
0.5 t  
BC  
p
p
p
t
t
t
4–5, 4–8 CONF Signal Inactive After R.E., BCLK T1 or Ti  
4–11, 412 CONF Signal Floating After F.E., BCLK Ti  
11  
9
8
ns  
ns  
CONF  
ia  
21  
17  
13  
CONF  
f
4–11, 412 CONF Signal  
Not Floating  
After F.E., BCLK Ti  
CONF  
nf  
0
0
0
ns  
t
t
t
t
t
4–5, 4–8 ADS Signal Active  
After R.E., BCLK T1  
11  
11  
9
9
8
8
ns  
ns  
ns  
ns  
ADS  
a
4–5, 4–8 ADS Signal Inactive After F.E., BCLK T1  
4–6 ADS Pulse Width At 0.8V (Both Edges)  
4–11, 412 ADS Signal Floating After F.E., BCLK Ti  
ADS  
ia  
15  
0
12  
0
10  
0
ADS  
w
21  
11  
21  
17  
9
13  
8
ADS  
f
4–11, 412 ADS Signal  
Not Floating  
After F.E., BCLK Ti  
ADS  
nf  
ns  
ns  
ns  
ns  
ns  
t
t
4–6, 4–8 BE Signals Valid  
n
After R.E., BCLK T1  
BE  
v
4–6, 4–8 BE Signals Hold  
n
After R.E., BCLK T1,  
Ti or T2B  
BE  
h
0
0
0
t
t
4–11, 4-12 BE Signals Floating After F.E., BCLK Ti  
n
17  
13  
BE  
f
4–11, 412 BE Signals  
n
Not Floating  
After F.E., BCLK Ti  
BE  
nf  
0
0
0
0
0
0
0
0
0
t
t
t
t
4–5, 4–6 DDIN Signal Valid  
4–5, 4–6 DDIN Signal Hold  
After R.E., BCLK T1  
11  
21  
9
8
ns  
ns  
ns  
DDIN  
DDIN  
DDIN  
DDIN  
v
After R.E., BCLK T1 or Ti  
h
f
4–11, 412 DDIN Signal Floating After F.E., BCLK Ti  
17  
13  
4–11, 412 DDIN Signal  
Not Floating  
After F.E., BCLK Ti  
nf  
ns  
ns  
t
4–14, 415 SPC Signal Active  
After R.E., BCLK T1  
19  
15  
12  
SPC  
a
4.4.2.1 Output Signals: Internal Propagation Delays, NS32532-20, NS32532-25, NS32532-30 (Continued)  
Name | Figure | Description | Reference/Conditions | -20 Min | -20 Max | -25 Min | -25 Max | -30 Min | -30 Max | Units  
tSPCia | 4-14, 4-15 | SPC Signal Inactive | After R.E., BCLK Ti, T1 or T2 | | 19 | | 15 | | 12 | ns  
tDDSPC (1) | 4-14 | DDIN Valid to SPC Active | Before SPC L.E. | 0 | | 0 | | 0 | | ns  
tHLDAa | 4-12, 4-13 | HLDA Signal Active | After F.E., BCLK Ti | | 15 | | 11 | | 10 | ns  
tHLDAia | 4-12 | HLDA Signal Inactive | After F.E., BCLK Ti | | 15 | | 11 | | 10 | ns  
tSTv | 4-5, 4-14 | Status (ST0–4) Valid | After R.E., BCLK T1 | | 11 | | 9 | | 8 | ns  
tSTh | 4-5, 4-14 | Status (ST0–4) Hold | After R.E., BCLK T1 or Ti | 0 | | 0 | | 0 | | ns  
tBOUTa | 4-8, 4-9 | BOUT Signal Active | After R.E., BCLK T2 | | 15 | | 12 | | 11 | ns  
tBOUTia | 4-8, 4-9 | BOUT Signal Inactive | After R.E., BCLK Last T2B, T1 or Ti | | 15 | | 12 | | 11 | ns  
tBOUTf | 4-11, 4-12 | BOUT Signal Floating | After F.E., BCLK Ti | | 21 | | 17 | | 13 | ns  
tBOUTnf | 4-11, 4-12 | BOUT Signal Not Floating | After F.E., BCLK Ti | 0 | | 0 | | 0 | | ns  
tILOa | 4-7 | Interlock Signal Active | After F.E., BCLK Ti | | 11 | | 9 | | 8 | ns  
tILOia | 4-7 | Interlock Signal Inactive | After F.E., BCLK Ti | | 11 | | 9 | | 8 | ns  
tPFSa | 4-21 | PFS Signal Active | After F.E., BCLK | | 15 | | 11 | | 10 | ns  
tPFSia | 4-21 | PFS Signal Inactive | After F.E., Next BCLK | | 15 | | 11 | | 10 | ns  
tISFa | 4-22 | ISF Signal Active | After F.E., BCLK | | 15 | | 11 | | 10 | ns  
tISFia | 4-22 | ISF Signal Inactive | After F.E., Next BCLK | | 15 | | 11 | | 10 | ns  
tBPa | 4-23 | BP Signal Active | After F.E., BCLK | | 15 | | 11 | | 10 | ns  
tBPia | 4-23 | BP Signal Inactive | After F.E., Next BCLK | | 15 | | 11 | | 10 | ns  
tUSv | 4-5 | U/S Signal Valid | After R.E., BCLK T1 | | 11 | | 9 | | 8 | ns  
tUSh | 4-5 | U/S Signal Hold | After R.E., BCLK T1 or Ti | 0 | | 0 | | 0 | | ns  
tCASv | 4-5 | CASEC Signal Valid | After F.E., BCLK T1 | | 15 | | 11 | | 10 | ns  
tCASh | 4-5 | CASEC Signal Hold | After R.E., BCLK T1 or Ti | 0 | | 0 | | 0 | | ns  
tCASf | 4-11, 4-12 | CASEC Signal Floating | After F.E., BCLK Ti | | 21 | | 17 | | 13 | ns  
tCASnf | 4-11, 4-12 | CASEC Signal Not Floating | After F.E., BCLK Ti | 0 | | 0 | | 0 | | ns  
tCIOv | 4-5 | CIOUT Signal Valid | After R.E., BCLK T1 | | 15 | | 11 | | 10 | ns  
tCIOh | 4-5 | CIOUT Signal Hold | After R.E., BCLK T1 or Ti | 0 | | 0 | | 0 | | ns  
tIOIv | 4-5 | IOINH Signal Valid | After R.E., BCLK T1 | | 15 | | 11 | | 10 | ns  
tIOIh | 4-5 | IOINH Signal Hold | After R.E., BCLK T1 or Ti | 0 | | 0 | | 0 | | ns  
4.4.2.2 Input Signal Requirements: NS32532-20, NS32532-25, NS32532-30  
Name | Figure | Description | Reference/Conditions | -20 Min | -20 Max | -25 Min | -25 Max | -30 Min | -30 Max | Units  
tCp | 4-24 | Input Clock Period | R.E., CLK to Next R.E., CLK | 25 | 50 | 20 | 50 | 16.6 | 50 | ns  
tCh | 4-24 | CLK High Time | At 2.0V on CLK (Both Edges) | 0.5 tCp - 5 | | 0.5 tCp - 5 | | 0.5 tCp - 4 | | ns  
tCl | 4-24 | CLK Low Time | At 0.8V on CLK (Both Edges) | 0.5 tCp - 5 | | 0.5 tCp - 5 | | 0.5 tCp - 4 | | ns  
tCr (1) | 4-24 | CLK Rise Time | 0.8V to 2.0V on R.E., CLK | | 5 | | 4 | | 3 | ns  
tCf (1) | 4-24 | CLK Fall Time | 2.0V to 0.8V on F.E., CLK | | 5 | | 4 | | 3 | ns  
tDIs | 4-5, 4-14 | Data In Setup | Before R.E., BCLK T1 or Ti | 12 | | 10 | | 8 | | ns  
tDIh | 4-5, 4-14 | Data In Hold | After R.E., BCLK T1 or Ti | 1 | | 1 | | 1 | | ns  
tRDYs | 4-5 | RDY Setup Time | Before R.E., BCLK T2(W), T1 or Ti | 19 | | 15 | | 12 | | ns  
tRDYh | 4-5 | RDY Hold Time | After R.E., BCLK T2(W), T1 or Ti | 1 | | 1 | | 1 | | ns  
tBWs | 4-5 | BW0–1 Setup Time | Before F.E., BCLK T2 or T2(W) | 19 | | 15 | | 12 | | ns  
tBWh | 4-5 | BW0–1 Hold Time | After F.E., BCLK T2 or T2(W) | 1 | | 1 | | 1 | | ns  
tHOLDs | 4-12, 4-13 | HOLD Setup Time | Before F.E., BCLK | 19 | | 15 | | 12 | | ns  
tHOLDh | 4-12 | HOLD Hold Time | After F.E., BCLK | 1 | | 1 | | 1 | | ns  
tBINs | 4-8 | BIN Setup Time | Before F.E., BCLK T2 or T2(W) | 18 | | 14 | | 11 | | ns  
tBINh | 4-8 | BIN Hold Time | After F.E., BCLK T2 or T2(W) | 1 | | 1 | | 1 | | ns  
tBERs | 4-6, 4-8 | BER Setup Time | Before R.E., BCLK T1 or Ti | 19 | | 15 | | 12 | | ns  
tBERh | 4-6, 4-8 | BER Hold Time | After R.E., BCLK T1 or Ti | 1 | | 1 | | 1 | | ns  
tBRTs | 4-6, 4-8 | BRT Setup Time | Before R.E., BCLK T1 or Ti | 19 | | 15 | | 12 | | ns  
tBRTh | 4-6, 4-8 | BRT Hold Time | After R.E., BCLK T1 or Ti | 1 | | 1 | | 1 | | ns  
tIODs | 4-5 | IODEC Setup Time | Before F.E., BCLK T2 or T2(W) | 18 | | 14 | | 11 | | ns  
tIODh | 4-5 | IODEC Hold Time | After F.E., BCLK T2 or T2(W) | 1 | | 1 | | 1 | | ns  
tPWR (1) | 4-26 | Power Stable to R.E. of RST | After VCC Reaches 4.5V | 50 | | 40 | | 30 | | µs  
tRSTs | 4-27 | RST Setup Time | Before R.E., BCLK | 14 | | 12 | | 11 | | ns  
tRSTw | 4-27 | RST Pulse Width | At 0.8V (Both Edges) | 64 | | 64 | | 64 | | tBCp  
Note 1: Guaranteed by characterization. Due to tester conditions this parameter is not 100% tested.  
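The RST pulse width is specified in bus clock periods (64 tBCp) rather than nanoseconds. The sketch below converts it for a given input clock, assuming the bus clock period is twice the CLK period, which is what the minimum values in the tables imply (25 ns CLK and 50 ns BCLK for the NS32532-20). The function names are hypothetical.

```python
def bclk_period_from_clk(clk_period_ns):
    """Bus clock period in ns, assuming BCLK runs at half the CLK frequency."""
    return 2.0 * clk_period_ns


def reset_pulse_min_ns(bclk_period_ns, periods=64):
    """Minimum RST pulse width in ns: 64 bus clock periods."""
    return periods * bclk_period_ns
```

For an NS32532-25 driven by a 50 MHz CLK (20 ns period), `reset_pulse_min_ns(bclk_period_from_clk(20.0))` evaluates to 2560 ns.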
4.4.2.2 Input Signal Requirements: NS32532-20, NS32532-25, NS32532-30 (Continued)  
Name | Figure | Description | Reference/Conditions | -20 Min | -20 Max | -25 Min | -25 Max | -30 Min | -30 Max | Units  
tCIIs | 4-5 | CIIN Setup Time | Before F.E., BCLK T2 | 19 | | 15 | | 12 | | ns  
tCIIh | 4-5 | CIIN Hold Time | After F.E., BCLK T2 | 1 | | 1 | | 1 | | ns  
tINTs | 4-19 | INT Setup Time | Before R.E., BCLK | 12 | | 10 | | 9 | | ns  
tINTh | 4-19 | INT Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tNMIs | 4-19 | NMI Setup Time | Before R.E., BCLK | 18 | | 15 | | 14 | | ns  
tNMIh | 4-19 | NMI Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tSDs | 4-16 | SDN Setup Time | Before R.E., BCLK | 12 | | 10 | | 9 | | ns  
tSDh | 4-16 | SDN Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tFSSRs | 4-17 | FSSR Setup Time | Before R.E., BCLK | 12 | | 10 | | 9 | | ns  
tFSSRh | 4-17 | FSSR Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tSYNCs | 4-25 | SYNC Setup Time | Before R.E., CLK | 10 | | 8 | | 7 | | ns  
tSYNCh | 4-25 | SYNC Hold Time | After R.E., CLK | 1 | | 1 | | 1 | | ns  
tCIAs | 4-18 | CIA0–6 Setup Time | Before R.E., BCLK | 12 | | 10 | | 9 | | ns  
tCIAh | 4-18 | CIA0–6 Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tINVSs | 4-18 | INVSET Setup Time | Before R.E., BCLK | 12 | | 11 | | 9 | | ns  
tINVSh | 4-18 | INVSET Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tINVIs | 4-18 | INVIC Setup Time | Before R.E., BCLK | 12 | | 10 | | 9 | | ns  
tINVIh | 4-18 | INVIC Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tINVDs | 4-18 | INVDC Setup Time | Before R.E., BCLK | 12 | | 10 | | 9 | | ns  
tINVDh | 4-18 | INVDC Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
tDBGs | 4-20 | DBG Setup Time | Before R.E., BCLK | 12 | | 10 | | 9 | | ns  
tDBGh | 4-20 | DBG Hold Time | After R.E., BCLK | 1 | | 1 | | 1 | | ns  
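The setup and hold requirements in these tables define a window around the sampling clock edge in which an input must not change. A minimal sketch of that check follows; the function name is hypothetical and all times are assumed to be in nanoseconds on a common time axis.

```python
def meets_setup_hold(change_ns, edge_ns, setup_ns, hold_ns):
    """True if a transition at change_ns respects the setup/hold window.

    The input must be stable from (edge - setup) until (edge + hold),
    so the transition has to fall outside that interval.
    """
    return change_ns <= edge_ns - setup_ns or change_ns >= edge_ns + hold_ns
```

For example, with a 12 ns setup and 1 ns hold requirement and a sampling edge at 50 ns, a transition at 35 ns satisfies the requirement, while one at 45 ns falls inside the window and may not be recognized deterministically.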
4.4.3 Timing Diagrams  
FIGURE 4-5. Basic Read Cycle Timing (TL/EE/9354–43)  
FIGURE 4-6. Write Cycle Timing (TL/EE/9354–44)  
Note: An Idle State is always inserted before a Write Cycle when the Write immediately follows a confirmed Read Cycle.  
FIGURE 4-7. Interlocked Read and Write Cycles (TL/EE/9354–45)  
FIGURE 4-8. Burst Read Cycles (TL/EE/9354–46)  
FIGURE 4-9. External Termination of Burst Cycles (TL/EE/9354–47)  
FIGURE 4-10. Bus Error or Retry During Burst Cycles (TL/EE/9354–48)  
Note: Two idle states are always inserted by the CPU following the assertion of BRT.  
FIGURE 4-11. Extended Retry Timing (TL/EE/9354–49)  
FIGURE 4-12. Hold Timing (Bus Initially Idle) (TL/EE/9354–50)  
FIGURE 4-13. HOLD Acknowledge Timing (Bus Initially Not Idle) (TL/EE/9354–51)  
FIGURE 4-14. Slave Processor Read Timing (TL/EE/9354–52)  
FIGURE 4-15. Slave Processor Write Timing (TL/EE/9354–53)  
FIGURE 4-16. Slave Processor Done (TL/EE/9354–54)  
FIGURE 4-17. FSSR Signal Timing (TL/EE/9354–55)  
FIGURE 4-18. Cache Invalidation Request (TL/EE/9354–56)  
Note 1: CIA0–6 and INVSET are only relevant when INVIC and/or INVDC are asserted.  
FIGURE 4-19. INT and NMI Signals Sampling (TL/EE/9354–57)  
Note 1: INT and NMI are sampled on every other rising edge of BCLK, starting with the second rising edge of BCLK after RST goes high.  
Note 2: INT is level sensitive, and once asserted, it should not be deasserted until it is acknowledged.  
FIGURE 4-20. Debug Trap Request (TL/EE/9354–58)  
FIGURE 4-21. PFS Signal Timing (TL/EE/9354–59)  
FIGURE 4-22. ISF Signal Timing (TL/EE/9354–60)  
FIGURE 4-23. Break Point Signal Timing (TL/EE/9354–61)  
FIGURE 4-24. Clock Waveforms (TL/EE/9354–62)  
FIGURE 4-25. Bus Clock Synchronization (TL/EE/9354–63)  
FIGURE 4-26. Power-On Reset (TL/EE/9354–64)  
FIGURE 4-27. Non-Power-On Reset (TL/EE/9354–65)  
Appendix A: Instruction Formats  
NOTATIONS:  
i = Integer Type Field  
    B = 00 (Byte)  
    W = 01 (Word)  
    D = 11 (Double Word)  
f = Floating Point Type Field  
    F = 1 (Std. Floating: 32 bits)  
    L = 0 (Long Floating: 64 bits)  
c = Custom Type Field  
    D = 1 (Double Word)  
    Q = 0 (Quad Word)  
op = Operation Code  
    Valid encodings shown with each format.  
gen, gen 1, gen 2 = General Addressing Mode Field  
    See Section 2.2 for encodings.  
reg = General Purpose Register Number  
cond = Condition Code Field  
    0000 = EQual: Z = 1  
    0001 = Not Equal: Z = 0  
    0010 = Carry Set: C = 1  
    0011 = Carry Clear: C = 0  
    0100 = HIgher: L = 1  
    0101 = Lower or Same: L = 0  
    0110 = Greater Than: N = 1  
    0111 = Less or Equal: N = 0  
    1000 = Flag Set: F = 1  
    1001 = Flag Clear: F = 0  
    1010 = LOwer: L = 0 and Z = 0  
    1011 = Higher or Same: L = 1 or Z = 1  
    1100 = Less Than: N = 0 and Z = 0  
    1101 = Greater or Equal: N = 1 or Z = 1  
    1110 = (Unconditionally True)  
    1111 = (Unconditionally False)  
short = Short Immediate value. May contain:  
    quick: Signed 4-bit value, in MOVQ, ADDQ, CMPQ, ACB.  
    cond: Condition Code (above), in Scond.  
    areg: CPU Dedicated Register, in LPR, SPR.  
        0000 = UPSR  
        0001 = DCR  
        0010 = BPC  
        0011 = DSR  
        0100 = CAR  
        0101–0111 = (Reserved)  
        1000 = FP  
        1001 = SP  
        1010 = SB  
        1011 = USP  
        1100 = CFG  
        1101 = PSR  
        1110 = INTBASE  
        1111 = MOD  
Options: in String Instructions  
    U/W | B | T  
    T = Translated  
    B = Backward  
    U/W = 00: None  
          01: While Match  
          11: Until Match  
Configuration bits, in SETCFG Instruction:  
    C | M | F | I  
mreg: MMU Register number, in LMR, SMR.  
    0000 through 0111 = Trap (UND)  
    1000 = Reserved  
    1001 = MCR  
    1010 = MSR  
    1011 = TEAR  
    1100 = PTB0  
    1101 = PTB1  
    1110 = IVAR0  
    1111 = IVAR1  
Format 0  
    Bits 7 to 0: cond | 1 0 1 0  
    Bcond (BR)  
Format 1  
    Bits 7 to 0: op | 0 0 1 0  
    BSR      -0000    ENTER   -1000  
    RET      -0001    EXIT    -1001  
    CXP      -0010    NOP     -1010  
    RXP      -0011    WAIT    -1011  
    RETT     -0100    DIA     -1100  
    RETI     -0101    FLAG    -1101  
    SAVE     -0110    SVC     -1110  
    RESTORE  -0111    BPT     -1111  
Format 2  
    Bits 15 to 0: gen | short | op | 1 1 | i  
    ADDQ   -000    ACB    -100  
    CMPQ   -001    MOVQ   -101  
    SPR    -010    LPR    -110  
    Scond  -011  
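The cond field encodings listed in the notations above can be evaluated against the PSR flags as follows. This is a sketch for illustration (for instance in a simulator or disassembler); the function name `cond_true` and the convention of passing each flag as 0 or 1 are assumptions.

```python
def cond_true(cond, z=0, c=0, l=0, n=0, f=0):
    """Evaluate a 4-bit cond field against PSR flag bits (each 0 or 1)."""
    table = {
        0b0000: z == 1,              # EQual
        0b0001: z == 0,              # Not Equal
        0b0010: c == 1,              # Carry Set
        0b0011: c == 0,              # Carry Clear
        0b0100: l == 1,              # HIgher
        0b0101: l == 0,              # Lower or Same
        0b0110: n == 1,              # Greater Than
        0b0111: n == 0,              # Less or Equal
        0b1000: f == 1,              # Flag Set
        0b1001: f == 0,              # Flag Clear
        0b1010: l == 0 and z == 0,   # LOwer
        0b1011: l == 1 or z == 1,    # Higher or Same
        0b1100: n == 0 and z == 0,   # Less Than
        0b1101: n == 1 or z == 1,    # Greater or Equal
        0b1110: True,                # Unconditionally True
        0b1111: False,               # Unconditionally False
    }
    return table[cond & 0xF]
```

The same table serves both Bcond (Format 0) and Scond (Format 2), since both carry the condition in a 4-bit cond field.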
15  
8 7  
0
gen  
op  
1 1 1 1 1  
i
Format 3  
TL/EE/935466  
CXPD  
-0000  
ADJSP  
JSR  
-1010  
-1100  
-1110  
Format 8  
BICPSR  
JUMP  
-0010  
-0100  
-0110  
EXT  
-0 00  
INDEX  
FFS  
-1 00  
-1 01  
CASE  
CVTP  
INS  
-0 01  
BISPSR  
-0 10  
Trap (UND) on XXX1, 1000  
CHECK  
MOVSU  
MOVUS  
-0 11  
e
e
-110, reg  
-110, reg  
001  
011  
15  
8 7  
0
gen 1  
gen 2  
op  
i
23  
16 15  
gen 2  
8 7  
0
Format 4  
gen 1  
op  
Format 9  
-000  
f
i
0 0 1 1 1 1 1 0  
ADD  
CMP  
BIC  
-0000  
-0001  
-0010  
-0100  
-0101  
-0110  
SUB  
-1000  
-1001  
-1010  
-1100  
-1101  
-1110  
ADDR  
AND  
MOVif  
LFSR  
ROUND  
TRUNC  
SFSR  
-100  
-101  
-110  
-111  
ADDC  
MOV  
OR  
SUBC  
TBIT  
XOR  
-001  
-010  
-011  
MOVLF  
MOVFL  
FLOOR  
23  
16 15  
8 7  
0
0 0 0 0 0 short  
0
op  
i
0 0 0 0 1 1 1 0  
TL/EE/935467  
Format 5  
Format 10  
MOVS  
CMPS  
-0000  
-0001  
SETCFG  
SKPS  
-0010  
-0011  
Trap (UND) Always  
23  
16 15  
gen 2  
8 7  
0
Trap (UND) on 1XXX, 01XX  
gen 1  
op  
0
f
1 0 1 1 1 1 1 0  
23  
16 15  
8 7  
0
gen 1  
gen 2  
op  
i
0 1 0 0 1 1 1 0  
Format 11  
-0000 DIVf  
ADDf  
-1000  
-1001  
-1010  
-1011  
-1100  
-1101  
-1110  
-1111  
Format 6  
-0000  
MOVf  
CMPf  
Note 3  
SUBf  
-0001  
-0010  
-0011  
-0100  
-0101  
-0110  
-0111  
Note 1  
Note 3  
Note 1  
MULf  
ROT  
NEG  
NOT  
-1000  
-1001  
ASH  
-0001  
-0010  
-0011  
-0100  
-0101  
-0110  
-0111  
CBIT  
Trap (UND)  
SUBP  
ABS  
-1010  
-1011  
-1100  
-1101  
-1110  
-1111  
CBITI  
Trap (UND)  
LSH  
NEGf  
Note 2  
Note 1  
ABSf  
Note 2  
Note 1  
COM  
SBIT  
IBIT  
SBITI  
ADDP  
23  
16 15  
gen 2  
8 7  
0
gen 1  
op  
i
1 1 0 0 1 1 1 0  
Format 7  
MOVM  
-0000  
MUL  
MEI  
-1000  
-1001  
CMPM  
-0001  
-0010  
-0011  
-0100  
-0101  
-0110  
-0111  
INSS  
Trap (UND)  
DEI  
-1010  
-1011  
-1100  
-1101  
-1110  
-1111  
EXTS  
MOVXBW  
MOVZBW  
MOVZiD  
MOVXiD  
QUO  
REM  
MOD  
DIV  
Format 15.1  
23  
16 15  
8
7
0
CCV3  
LCSR  
CCV5  
CCV4  
-000  
CCV2  
CCV1  
SCSR  
CCV0  
-100  
-101  
-110  
-111  
gen 1  
gen 2  
op  
0
f
1
1 1 1 1 1 1 0  
-001  
-010  
-011  
Format 12  
Note 2  
SQRTf  
POLYf  
DOTf  
-0000  
Note 2  
Note 1  
MACf  
-1000  
-1001  
-1010  
-1011  
-1100  
-1101  
-1110  
-1111  
23  
16 15  
gen 2  
8
-0001  
-0010  
-0011  
-0100  
-0101  
-0110  
-0111  
101  
gen 1  
op  
x
c
Note 1  
Note 2  
Note 1  
Note 2  
Note 1  
SCALBf  
LOGBf  
Note 2  
Note 1  
Format 15.5  
CCAL3  
CCAL0  
CMOV0  
CCMP0  
CCMP1  
CCAL1  
CMOV2  
Note 2  
Note 1  
-0000  
-0001  
-0010  
-0011  
-0100  
-0101  
-0110  
-0111  
-1000  
-1001  
-1010  
-1011  
-1100  
-1101  
-1110  
-1111  
CMOV3  
Note 3  
Note 1  
CCAL2  
CMOV1  
Note 2  
Note 1  
TL/EE/935468  
Format 13  
Trap (UND) Always  
23  
16 15  
short  
8
7
0
23  
16 15  
gen 2  
8
gen 1  
0
op  
i
0
0 0 1 1 1 1 0  
111  
gen 1  
op  
x
c
Format 14  
Format 15.7  
RDVAL  
WRVAL  
-0000  
-0001  
LMR  
SMR  
-0010  
-0011  
-1001  
Note 2  
Note 1  
Note 3  
Note 3  
Note 2  
Note 1  
Note 2  
Note 1  
-0000  
Note 2  
-1000  
-1001  
-1010  
-1011  
-1100  
-1101  
-1110  
-1111  
CINV  
Trap (UND) on 01XX, 1000, 101X, 11XX  
-0001  
-0010  
-0011  
-0100  
-0101  
-0110  
-0111  
Note 1  
Note 3  
Note 1  
Note 2  
Note 1  
Note 2  
Note 1  
23  
16 15  
8 7  
0
n n n 1 0 1 1 0  
ID Byte  
Operation Word  
If nnn = 010, 011, 100, 110 then Trap (UND) Always.
Format 15  
(Custom Slave)  
nnn  
Operation Word Format  
TL/EE/935469  
23  
16 15  
8
Format 16  
000  
gen 1  
short  
x
op  
i
Trap (UND) Always  
Format 15.0  
LCR  
SCR  
-0010  
-0011  
TL/EE/935470  
Format 17  
Trap (UND) on all others  
Trap (UND) Always  
23  
16 15  
gen 2  
8
001  
gen 1  
op  
c
i
TL/EE/935471  
Format 18
Trap (UND) Always
TL/EE/935472
Format 19
Trap (UND) Always
Implied Immediate Encodings:
  7                            0
  r7  r6  r5  r4  r3  r2  r1  r0
Register Mark, Appended to SAVE, ENTER
  7                            0
  r0  r1  r2  r3  r4  r5  r6  r7
Register Mark, Appended to RESTORE, EXIT
  7                            0
  offset          length - 1
Offset/Length Modifier Appended to INSS, EXTS
Note 1: Opcode not defined; CPU treats like MOVf or CMOVc. First operand
has access class of read; second operand has access class of write; f or c
field selects 32- or 64-bit data.
Note 2: Opcode not defined; CPU treats like ADDf or CCALc. First operand
has access class of read; second operand has access class of
read-modify-write; f or c field selects 32- or 64-bit data.
Note 3: Opcode not defined; CPU treats like CMPf or CCMPc. First operand
has access class of read; second operand has access class of read; f or c
field selects 32- or 64-bit data.
Appendix B. Compatibility Issues
The NS32532 is compatible with the Series 32000 architecture
implemented by the NS32032, NS32332, and previous
microprocessors in the family. Compatibility means that,
within certain limited constraints, programs that execute on
one of the earlier Series 32000 microprocessors will produce
identical results when executed on the NS32532.
Compatibility applies to privileged operating systems programs,
as well as to non-privileged applications programs.
This appendix explains both the restrictions on compatibility
with previous Series 32000 microprocessors and the extensions
to the architecture that are implemented by the NS32532.
B.1 RESTRICTIONS ON COMPATIBILITY
If the following restrictions are observed, then a program
that executes on an earlier Series 32000 microprocessor
will produce identical results when executed on the
NS32532 in an appropriately configured system:
1. The program is not time-dependent. For example, the
program should not use instruction loops to control
real-time delays.
2. The program does not use any encodings of instructions,
operands, addresses, or control fields identified to
be reserved or undefined. For example, if the count
operand’s value for an LSHi instruction is not within the
range specified by the Series 32000 Instruction Set
Reference Manual, then the results produced by the
NS32532 may differ from those of the NS32032.
3. Either the program does not depend on the use of a
Memory Management Unit (MMU), or it is written for
operation with the NS32382 MMU and does not use the
bus-error or debugging features of the NS32382.
4. The program does not depend on the detection of bus
errors according to the implementation of the NS32332.
For example, the NS32532 distinguishes between restartable
and nonrestartable bus errors by transferring
control to the appropriate bus-error exception service
procedure through one of two distinct entries in the
Interrupt Dispatch Table. In contrast, the NS32332 uses a
single entry in the Interrupt Dispatch Table for all bus
errors.
5. The program does not modify itself. Refer to Section B.4
for more information.
6. The program does not depend on the execution of certain
complex instructions to be non-interruptible. Refer
to Section B.5, ‘‘Memory-Mapped I/O’’, for more
information.
7. The program does not use the custom slave instructions
CATST0 and CATST1, as they are not supported by the
NS32532 and will result in a Trap (UND) when their
execution is attempted.
B.2 ARCHITECTURE EXTENSIONS
The NS32532 implements the following extensions of the
Series 32000 architecture using previously reserved control
bits, instruction encodings, and memory locations.
Extensions implemented earlier in the NS32332, such as 32-bit
addressing, are not listed.
1. The DC, LDC, IC, and LIC bits in the CFG register have
been defined to control the on-chip Instruction and Data
Caches. The DE-bit in the CFG register has been defined
to enable Direct-Exception Mode.
2. The V-flag in the PSR register has been defined to
enable the Integer-Overflow Trap.
3. The DCR, BPC, DSR, and CAR registers have been defined
to control debugging features. Access to these
registers has been added to the definition of the LPR
and SPR instructions.
4. Access to the CFG and SP1 registers has been added
to the definition of the LPR and SPR instructions.
5. The CINV instruction has been defined to invalidate
the contents of the on-chip Instruction and Data Caches.
6. Direct-Exception Mode has been added to support faster
interrupt service time and systems without module
tables.
7. A new entry has been added to the Interrupt Dispatch
Table so that separate vectors distinguish between
restartable and nonrestartable bus errors. Two additional
entries support Trap (OVF) and Trap (DBG).
8. Restrictions have been eliminated for recovery from
Trap (ABT) for operands with access class of write that
cross page boundaries. Restrictions still exist, however,
for the operands of the MOVMi instruction.
B.3 INTEGER OVERFLOW TRAP
A new trap condition is recognized for integer arithmetic
overflow. Trap (OVF) is enabled by the V-flag in the PSR.
This new trap is important because detection of integer
overflow conditions is required for certain programming
languages, such as Ada, and the PSR flags do not indicate the
occurrence of overflow for ASHi, DIVi and MULi instructions.
Appendix C. Instruction Set Extensions (Continued)  
C.1 PROCESSOR SERVICE INSTRUCTIONS  
The CFG register, User Stack Pointer (SP1), and Debug
Registers can be loaded and stored using privileged forms
of the LPRi and SPRi instructions.
  15                   8 7                    0
  gen (src)  | short (procreg) | 1 1 0 1 1 | i      LPRi
  gen (dest) | short (procreg) | 0 1 0 1 1 | i      SPRi
FIGURE C-1. LPRi/SPRi Instruction Formats
When the SETCFG instruction is executed, the CFG register
bits 0 through 3 are loaded from the instruction’s short field,
bits 4 through 7 are forced to 1, and bits 8 through 12 are
forced to 0.
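The SETCFG behaviour described in Section C.1 can be illustrated with a small model. This is only a sketch: the function name `setcfg_low_bits` is hypothetical, and it models just CFG bits 0 through 12, because the text does not say what happens to any higher bits.

```python
def setcfg_low_bits(short: int) -> int:
    """Return CFG bits 0-12 after SETCFG, per the rule stated in Section C.1.

    Hypothetical helper for illustration; bits above 12 are not modeled.
    """
    if not 0 <= short <= 0xF:
        raise ValueError("the short field is 4 bits wide")
    loaded = short & 0xF   # bits 0-3 come from the instruction's short field
    forced_ones = 0xF0     # bits 4-7 are forced to 1
    # bits 8-12 are forced to 0, so nothing is OR-ed in for them
    return forced_ones | loaded


print(format(setcfg_low_bits(0b0101), "013b"))
```

The result always has bits 4 through 7 set, whatever short field the instruction supplies.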
The contents of the on-chip Instruction Cache and Data  
Cache can be invalidated by executing the privileged in-  
struction CINV. While executing the CINV instruction, the  
CPU generates 2 slave bus cycles on the system interface  
to display the first 3 bytes of the instruction and the source  
operand. External circuitry can thereby detect the execution  
of the CINV instruction for use in monitoring the contents of  
the on-chip caches.  
TABLE C-1. LPRi/SPRi New ‘Short’ Field Encodings
Register                      procreg   short field
Debug Condition Register      DCR       0001
Breakpoint Program Counter    BPC       0010
Debug Status Register         DSR       0011
Compare Address Register      CAR       0100
User Stack Pointer            USP       1011
Configuration Register        CFG       1100
C.2 MEMORY MANAGEMENT INSTRUCTIONS
The NS32532 on-chip MMU does not implement the BAR,
BDR, BEAR, and BMR registers of the NS32382. These
registers are used in the NS32382 to support bus error and
debugging features. When an attempt is made to access
one of these 4 registers by executing an LMR or SMR
instruction, a Trap (UND) occurs. More generally, a Trap
(UND) occurs whenever an attempt is made to execute an
LMR or SMR instruction and the most-significant bit of the
short field is 0.
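The Section C.2 rule on the short field of LMR and SMR can be checked against the register codes listed in Table C-2. The sketch below is illustrative only: the names `MMU_SHORT_CODES` and `short_field_msb_clear` are hypothetical, and the function covers only the most-significant-bit rule, not the other trap conditions (M-bit, U-flag).

```python
# Short-field codes for the on-chip MMU registers, copied from Table C-2.
MMU_SHORT_CODES = {
    "MCR": 0b1001,
    "MSR": 0b1010,
    "TEAR": 0b1011,
    "PTB0": 0b1100,
    "PTB1": 0b1101,
    "IVAR0": 0b1110,
    "IVAR1": 0b1111,
}


def short_field_msb_clear(short: int) -> bool:
    """True when bit 3 of the 4-bit short field is 0 (the Trap (UND) case)."""
    if not 0 <= short <= 0xF:
        raise ValueError("the short field is 4 bits wide")
    return (short >> 3) & 1 == 0


# Every register in Table C-2 has the most-significant bit set.
print(all(not short_field_msb_clear(code) for code in MMU_SHORT_CODES.values()))
```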
While executing an LMR instruction, the CPU generates 2
slave bus cycles on the system interface to display the first
3 bytes of the instruction and the source operand. External
circuitry can thereby detect the execution of an LMR
instruction for use in monitoring the contents of the on-chip
Translation Lookaside Buffer.
As with the NS32382 MMU, the F-flag in the PSR is set and no
Trap (ABT) occurs when a RDVAL or WRVAL instruction is
executed and the Protection Level in the Level-1 Page Table
Entry indicates that the access is not allowed. In the
NS32082 MMU, an abort occurs when the Level-1 PTE is
invalid, regardless of the Protection Level.
C.3 INSTRUCTION DEFINITIONS
This section provides a description of the operations and
encodings of the new NS32532 privileged instructions.
Load and Store Processor Registers
Syntax: LPRi   procreg,  src
               short     gen
                         read.i
        SPRi   procreg,  dest
               short     gen
                         write.i
The LPRi and SPRi instructions can be used to load and
store the User Stack Pointer (USP or SP1), the Configuration
Register (CFG) and the Debug Registers in addition to
the Processor Registers supported by the previous Series
32000 CPUs. Access to these registers is privileged.
Figure C-1 and Table C-1 show the instruction formats and
the new ‘short’ field encodings for LPRi and SPRi.
Flags Affected: No flags affected by loading or storing the
USP, CFG, or Debug Registers.
Traps:
Illegal Instruction Trap (ILL) occurs if an
attempt is made to load or store the USP,
CFG or Debug Registers while the U-flag
is 1.
Cache Invalidate
Syntax: CINV   [options],  src
                           gen
                           read.D
The CINV instruction invalidates the contents of locations in
the on-chip Instruction Cache and Data Cache. The instruction
can be used to invalidate either the entire contents of
the on-chip caches or only a 16-byte block. In the latter
case, the 28 most-significant bits of the source operand
specify the physical address of the aligned 16-byte block;
the 4 least-significant bits of the source operand are
ignored. If the specified block is not located in the on-chip
caches, then the instruction has no effect. If the entire
cache contents are to be invalidated, then the source operand
is read, but its value is ignored.
Options are specified by listing the letters A (invalidate All), I
(Instruction Cache), and D (Data Cache). If neither the I nor
D option is specified, the instruction has no effect.
In the instruction encoding, the options are represented in
the A, I, and D fields as follows:
A: 0 - invalidate only a 16-byte block
   1 - invalidate the entire cache
I: 0 - do not affect the Instruction Cache
   1 - invalidate the Instruction Cache
D: 0 - do not affect the Data Cache
   1 - invalidate the Data Cache
Flags Affected: None
Traps:
Illegal Operation Trap (ILL) occurs if an
attempt is made to execute this instruction
while the U-flag is 1.
Examples:
1. CINV [A, D, I], R3     1E A7 1B
2. CINV [I], R3           1E 27 19
Example 1 invalidates the entire Instruction Cache and Data
Cache.
Example 2 invalidates the 16-byte block in the Instruction
Cache whose physical address is contained in R3.
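The two CINV examples can be reproduced from the bit layout in Figure C-2. The following is a sketch under stated assumptions, not a reference assembler: the function name `encode_cinv` is hypothetical, the bit positions are read off Figure C-2 (gen in bits 23-19, then 0 A I D, then the fixed pattern), and both the low-byte-first listing order and the gen value 00011 for R3 are inferred from the example bytes rather than stated in this section.

```python
def encode_cinv(gen: int, a: bool, i: bool, d: bool) -> bytes:
    """Assemble the 3 basic bytes of CINV (hypothetical helper)."""
    if not 0 <= gen <= 0b11111:
        raise ValueError("gen is a 5-bit field")
    word = (
        (gen << 19)           # bits 23-19: gen (source operand)
        | (int(a) << 17)      # option A: invalidate the entire cache
        | (int(i) << 16)      # option I: Instruction Cache
        | (int(d) << 15)      # option D: Data Cache
        | (0b0100111 << 8)    # fixed bits 14-8 from Figure C-2
        | 0b00011110          # fixed low byte
    )
    # Assumption: bytes are listed low byte first, as the examples suggest.
    return word.to_bytes(3, "little")


R3 = 0b00011  # assumption: gen value for register R3, inferred from the examples

print(encode_cinv(R3, a=True, i=True, d=True).hex(" ").upper())    # 1E A7 1B
print(encode_cinv(R3, a=False, i=True, d=False).hex(" ").upper())  # 1E 27 19
```

Both outputs match the byte sequences given for Examples 1 and 2, which is a useful cross-check of the reconstructed field order.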
  23             16 15                8 7              0
  gen (src) | 0 A I D (options) | 0 1 0 0 1 1 1 | 0 0 0 1 1 1 1 0      CINV
FIGURE C-2. CINV Instruction Format
  23             16 15                8 7              0
  gen (src)  | short (mmureg) | 0 0 0 1 0 1 1 | 0 0 0 1 1 1 1 0        LMR
  gen (dest) | short (mmureg) | 0 0 0 1 1 1 1 | 0 0 0 1 1 1 1 0        SMR
FIGURE C-3. LMR/SMR Instruction Formats
Load and Store Memory Management Register
Syntax: LMR    mmureg,   src
               short     gen
                         read.D
        SMR    mmureg,   dest
               short     gen
                         write.D
The LMR and SMR instructions load and store the on-chip
MMU registers as 32-bit quantities to and from any general
operand. For reasons of system security, these instructions
are privileged. In order to be executable, they must also be
enabled by setting the M bit in the CFG register.
The instruction formats and the ‘short’ field encodings
are shown in Figure C-3 and Table C-2 respectively.
Note that the IVAR0 and IVAR1 registers are write-only;
they can only be loaded by the LMR instruction.
Flags Affected: none
Traps:
Undefined Instruction Trap (UND) occurs if
an attempt is made to execute this instruction
while either of the following conditions
is true:
1. The M-bit in the CFG register is 0.
2. The U-Flag in the PSR is 0 and the
most-significant bit of the short field is 0.
Illegal Instruction Trap (ILL) occurs if an
attempt is made to execute this instruction
while the M-bit in the CFG register and the
U-flag in the PSR are both 1.
TABLE C-2. LMR/SMR ‘Short’ Field Encodings
Register                             mmureg   short field
Memory Management Control Reg        MCR      1001
Memory Management Status Reg         MSR      1010
Translation Exception Address Reg    TEAR     1011
Page Table Base Register 0           PTB0     1100
Page Table Base Register 1           PTB1     1101
Invalidate Virtual Address 0         IVAR0    1110
Invalidate Virtual Address 1         IVAR1    1111
Appendix D. Instruction
Execution Times
The NS32532 achieves its performance by using an advanced
implementation incorporating a 4-stage Instruction
Pipeline, a Memory Management Unit, an Instruction Cache
and a Data Cache into a single integrated circuit.
As a consequence of this advanced implementation, the
performance evaluation for the NS32532 is more complex
than for the previous microprocessors in the Series 32000
family. In fact, it is no longer possible to determine the
execution time for an instruction using only a set of tables for
operations and addressing modes. Rather, it is necessary to
consider dependencies between the various instructions
executing in the pipeline, as well as the occurrence of misses
for the on-chip caches.
The following sections explain the method to evaluate the
performance of the NS32532 by calculating various timing
parameters for an instruction sequence. Due to the high
degree of parallelism in the NS32532, the evaluation
techniques presented here include some simplifications and
approximations.
D.1 INTERNAL ORGANIZATION
AND INSTRUCTION EXECUTION
The NS32532 is organized internally as 8 functional units as
shown in Figure 1. The functional units operate in parallel to
execute instructions in the 4-stage pipeline. The structure of
this pipeline is shown in Figure 3-2. The Instruction Fetch
and Instruction Decode pipeline stages are implemented in
the Loader along with the 8-byte instruction queue and the
buffer for a decoded instruction. The Address Calculation
pipeline stage is implemented in the Address Unit. The
Execute pipeline stage is implemented in the Execution Unit
along with the write data buffer that holds up to two results
directed to memory.
The Address Unit and Execution Unit can process instructions
at a peak rate of 2 clock cycles per instruction, enabling
a sustained pipeline throughput at 30 MHz of
15 MIPS (million instructions per second) for sequences of
register-to-register, immediate-to-register, register-to-memory
and memory-to-register instructions. Nevertheless, the
rate of instruction execution in the pipeline falls below the
peak of 2 cycles per instruction because of the following
causes of delay:
1. Complex operations, like division, require more than 2
cycles in the Execution Unit, and complex addressing
modes, like memory relative, require more than 2 cycles
in the Address Unit.
2. Dependencies between instructions can limit the flow  
through the pipeline. A data dependency can arise when  
the result of one instruction is the source of a following  
instruction. Control dependencies arise when branching  
instructions are executed. Section D.3 describes the  
types of instruction dependencies that impact perform-  
ance and explains how to calculate the pipeline delays.  
• 2 cycles in total to process a double-precision floating-point
immediate value.
D.2.2 Address Unit Timing
The processing time of the Address Unit depends on the
instruction’s operation and the number and type of its general
addressing modes. The basic time for most instructions
is 2 cycles. A relatively small number of instructions require
additional Address Unit time, as shown in the timing tables
in Section D.5.5. Non-pipelined floating-point instructions
as well as Custom-Slave instructions require an additional
3 cycles plus 2 cycles for each quad-word operand in
memory.
For instructions with 2 general addressing modes, 2 additional
cycles are required when both addressing modes refer
to memory. Certain general addressing modes require
additional processing time, as shown in Table D-1. For example,
the instruction MOVD 4(8(FP)), TOS requires 7 cycles
in the Address Unit: 2 cycles for the basic time, an
additional 2 cycles because both modes refer to memory,
and an additional 3 cycles for Memory Relative addressing
mode.
3. Cache and TLB misses can cause the flow of instructions
through the pipeline to be delayed, as can non-aligned
references. Section D.4 explains the performance impact
for these forms of storage delays.
The effective time $T_{eff}$ needed to execute an instruction is
given by the following formula:
$$T_{eff} = T_e + T_d + T_s$$
$T_e$ is the execution time in the pipeline in the absence of
data dependencies between instructions and storage delays,
$T_d$ is the delay due to data dependencies, and $T_s$ is the
effect of storage delays.
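As a worked illustration of the formula, the sketch below adds the three components per instruction and sums them over a short sequence. The class and function names are hypothetical, summing per-instruction values is a simplifying assumption in the spirit of the approximations this appendix mentions, and the cycle counts are illustrative: 2 cycles is the peak pipeline rate from Section D.1, and 3 cycles is the base-register interlock delay described in Section D.3.1.1.

```python
from dataclasses import dataclass


@dataclass
class InstrTiming:
    """Timing components of one instruction, in clock cycles (hypothetical model)."""
    t_e: int       # execution time without dependencies or storage delays
    t_d: int = 0   # delay due to data dependencies
    t_s: int = 0   # effect of storage delays

    @property
    def t_eff(self) -> int:
        # T_eff = T_e + T_d + T_s
        return self.t_e + self.t_d + self.t_s


def sequence_time(timings: list[InstrTiming]) -> int:
    """Total cycles for a sequence, assuming the per-instruction times simply add."""
    return sum(t.t_eff for t in timings)


# ADDD R1,R0 followed by MOVD 4(R0),R2: the second instruction waits
# 3 cycles because R0 is a base register modified by its predecessor.
seq = [InstrTiming(t_e=2), InstrTiming(t_e=2, t_d=3)]
print(sequence_time(seq))  # 7
```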
D.2 BASIC EXECUTION TIMES  
Instructions flow in sequence through the pipeline stages
implemented by the Loader, Address Unit, and Execution Unit.
In almost all cases, the Loader is at least as fast at decod-  
ing an instruction as the Address Unit is at processing the  
instruction. Consequently, the effects of the Loader can be  
ignored when analyzing the smooth flow of instructions in  
the pipeline, and it is only necessary to consider the times  
for the Address Unit and Execution Unit. The time required  
by the Loader to fetch and decode instructions is significant  
only when there are control dependencies between instruc-  
tions or Instruction Cache misses, both of which are ex-  
plained later.  
TABLE D-1. Additional Address Unit Processing
Time for Complex Addressing Modes
Mode               Additional Cycles
Memory Relative    3
External           8
Scaled Indexing    2
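The Address Unit rules in Section D.2.2 and the values in Table D-1 can be combined into a small calculator. This is a sketch with hypothetical names (`address_unit_cycles`, the mode strings); it takes one mode name per operand and leaves out the instruction-specific extra time from Section D.5.5 and the floating-point and Custom-Slave adjustments.

```python
BASIC_CYCLES = 2  # basic Address Unit time for most instructions

# Additional cycles per addressing mode, from Table D-1.
EXTRA_CYCLES = {
    "memory_relative": 3,
    "external": 8,
    "scaled_indexing": 2,
}


def address_unit_cycles(modes: list[tuple[str, bool]]) -> int:
    """Estimate Address Unit cycles.

    modes: one (mode_name, refers_to_memory) pair per general operand.
    """
    cycles = BASIC_CYCLES
    # 2 extra cycles when both general addressing modes refer to memory.
    if len(modes) == 2 and all(in_memory for _, in_memory in modes):
        cycles += 2
    # Extra cycles for the complex addressing modes listed in Table D-1.
    cycles += sum(EXTRA_CYCLES.get(name, 0) for name, _ in modes)
    return cycles


# MOVD 4(8(FP)), TOS: Memory Relative source, Top Of Stack destination.
print(address_unit_cycles([("memory_relative", True), ("top_of_stack", True)]))  # 7
```

The result agrees with the 7 cycles quoted for MOVD 4(8(FP)), TOS in Section D.2.2.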
The time for the pipeline to advance from one instruction to
the next is typically determined by the longer of the times
that the Address Unit and Execution Unit need to complete
processing of the instructions on which they are operating.
For example, if the Execution Unit is completing instruction n
in 2 cycles and the Address Unit is completing instruction n+1
in 4 cycles, then the pipeline will advance in 4 cycles. For certain
instructions, such as RESTORE, the Address Unit waits until
the Execution Unit has completed the instruction before
proceeding to the next instruction. When such an instruction
is in the Execution Unit, the time for the pipeline to advance
is equal to the sum of the time for the Execution Unit to
complete instruction n and the time for the Address Unit to
complete instruction n+1. The processing times for the
Loader, Address Unit, and Execution Unit are explained below.
D.2.3 Execution Unit Timing
The Execution Unit processing times for the various
NS32532 instructions are provided in Section D.5.5. Certain
operations cause a break in the instruction flow through the
pipeline.
Some of these operations simply stop the Address Unit,
while others flush the instruction queue as well. The
information on how to evaluate the penalty resulting from
instruction flow breaks is provided in the following sections.
D.3 INSTRUCTION DEPENDENCIES
Interactions between instructions in the pipeline can cause
delays. Two types of interactions can arise, as described
below.
D.3.1 Data Dependencies  
In certain circumstances the flow of instructions in the pipe-  
line will be delayed when the result of an instruction is used  
as the source of a succeeding instruction. Such interlocks  
are automatically detected by the microprocessor and han-  
dled with complete transparency to software.  
D.2.1 Loader Timing
The Loader can process an instruction field on each clock
cycle, where a field is one of the following:
• An opcode of 1 to 3 bytes including addressing mode
  specifiers.
• Up to 2 index bytes, if scaled index addressing mode is
  used.
• A displacement.
• An immediate value of 8, 16 or 32 bits.
The Loader requires additional time in the following cases:
• 1 additional cycle when 2 consecutive double-word fields
  begin at an odd address.
D.3.1.1 Register Interlocks
When an instruction uses a base register that is the destination
of either of the previous 2 instructions, a delay occurs.
The delay is 3 cycles when, as in the following example, the
base register is modified by the immediately preceding
instruction. Modifications of the Stack Pointer resulting from
the use of TOS addressing mode do not cause any delay.
Also, there is no delay for a data dependency when the
instruction that modifies the register is one for which the
Address Unit stops.
Appendix D. Instruction Execution Times (Continued)  
n: ADDD R1,R0  
; modify R0  
The delay is 2 cycles when the memory location is modified  
2 instructions before its use as a source operand or effec-  
tive address, as shown in this example.  
n01: MOVD 4(R0),R2 ; R0 is base register,  
delay 3 cycles  
n: ADDQD 1,4(SP) ; modify 4(SP)  
The delay is 1 cycle when the register is modified 2 instruc-  
tions before its use as a base register, as shown in this  
example.  
n01: MOVD R0,R1  
; no reference to 4(SP)  
n02: CMPD 10, 4(SP); read 4(SP),  
n: ADDD R1,R0  
; modify R0  
2 cycles delay  
n01: MOVD 4(SP),R3 ; R0 not used  
n02: MOVD 4(R0),R2 ; R0 is base register,  
delay 1 cycle  
Certain sequences of read and write references can cause  
a delay of 1 cycle although there is no data dependency  
between the references. This arises because the Data  
Cache is occupied for 2 cycles on write references. In the  
absence of data dependencies, read references are given  
priority over write references. Therefore, this delay only oc-  
curs when an instruction with destination in memory is fol-  
lowed 2 instructions later by an instruction that refers to  
memory (read or write) and 3 instructions later by an instruc-  
tion that reads from memory. Here is an example:  
When an instruction uses an index register that is the desti-  
nation of the previous instruction, a delay of 1 cycle occurs,  
as shown in the example below. If the register is modified 2  
or more instructions prior to its use as an index register,  
then no delay occurs.  
n: ADDD R1,R0  
n01: MOVD 4(SP)[R0:B],R2  
; R0 is index register,  
; modify R0  
n: MOVD R0,4(SP) ; memory write  
n01: MOVD R6,R7  
; any instruction  
delay 1 cycle  
n02: MOVD 8(SP),R0 ; memory read or write  
n03: MOVD 12(SP),R1; memory read  
delayed 1 cycle  
Bypass circuitry in the Execution Unit generally avoids delay  
when a register modified by one instruction is used as the  
source operand of the following instruction, as in the follow-  
ing example.  
D.3.2 Control Dependencies  
The flow of instructions through the pipeline is delayed  
when the address from which to fetch an instruction de-  
pends on a previous instruction, such as when a conditional  
branch is excuted. The Loader includes special circuitry to  
handle branch instructions (ACB, BR, Bcond, and BSR) that  
serves to reduce such delays. When a branch instruction is  
decoded, the Loader calculates the destination address and  
selects between the sequential and non-sequential instruc-  
tion streams. The non-sequential stream is selected for un-  
conditional branches. For conditional branches the selec-  
tion is based on the branch’s direction (forward or back-  
ward) as well as the tested condition. The branch is predict-  
ed taken in any of the following cases.  
n: ADDD R1,R0  
; modify R0  
n01: MOVD R0,R2  
; R0 is source register,  
no delay  
For the uncommon case where the operand in the source  
register is larger than the destination of the previous instruc-  
tion, a delay of 2 cycles occurs. Here is an example.  
n: ADDB R1,R0  
; modify byte in R0  
; R0 dw source operand,  
2 cycle delay  
n01: MOVD R0,R2  
Note: The Address Unit does not make any differentiation between CPU  
and FPU registers. Therefore, register interlocks can occur between  
integer and floating-point instructions.  
The branch is backward.  
#
#
D.3.1.2 Memory Interlocks  
The tested condition is either NE or LE.  
When an instruction reads a source operand (or address for  
effective address calculation) from memory that depends on  
the destination of either of the previous 2 instructions, a  
delay occurs. The CPU detects a dependency between a  
read and a write reference in the following cases, which  
include some false dependencies in addition to all actual  
dependencies:  
- Either reference crosses a double-word boundary
- Address bits 0 through 11 are equal
- Address bits 2 through 11 are equal and either reference is for a word
- Address bits 2 through 11 are equal and either reference is for a double-word
Measurements have shown that the correct stream is selected
for 64% of conditional branches and 71% of total branches.
If the Loader selects the non-sequential stream, then the
destination address is transferred to the Instruction Cache.
For conditional branches, the Loader saves the address of
the alternate stream (the one not selected). When a conditional
branch instruction reaches the Execution Unit, the
condition is resolved, and the Execution Unit signals the
Loader whether or not the branch was taken. If the branch
had been incorrectly predicted, the Instruction Cache begins
fetching instructions from the correct stream.
The delay for handling a branch instruction depends on  
whether the branch is taken and whether it is predicted cor-  
rectly. Unconditional branches have the same delay as cor-  
rectly predicted, taken conditional branches.  
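The Loader's stream-selection rule described in this section can be summarized in a few lines. The following is only an illustrative Python sketch of that rule as stated in the text (unconditional branches always select the non-sequential stream; conditional branches are predicted taken when backward or when the tested condition is NE or LE). The function and parameter names are hypothetical and do not correspond to anything in the device.

```python
# Conditions that are predicted taken regardless of branch direction,
# per the cases listed in the text.
PREDICT_TAKEN_CONDITIONS = {"NE", "LE"}


def selects_nonsequential(is_conditional: bool,
                          is_backward: bool = False,
                          condition: str = "") -> bool:
    """Return True if the Loader would select the non-sequential stream.

    is_conditional -- False for BR/BSR-style unconditional branches
    is_backward    -- True if the destination is at a lower address
    condition      -- tested condition mnemonic, e.g. "GE", "NE"
    """
    if not is_conditional:
        # The non-sequential stream is always selected for unconditional branches.
        return True
    return is_backward or condition.upper() in PREDICT_TAKEN_CONDITIONS
```

Under this reading, a forward `BGE` is predicted not taken, so it incurs the misprediction penalty whenever it is actually taken.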
Another form of delay occurs when 2 consecutive conditional
branch instructions are executed. This delay of 2 cycles
arises from contention for the register that holds the
alternate stream address in the Loader.
The delay for a memory interlock is 4 cycles when, as in
the following example, the memory location is modified by
the immediately preceding instruction.
n:   ADDQD 1,4(SP)   ; modify 4(SP)
n+1: CMPD 10,4(SP)   ; read 4(SP), 4 cycle delay
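The dependency check for memory interlocks compares only part of the address, which is why some false dependencies are reported. The Python sketch below applies the four listed cases literally. It is an illustration under stated assumptions, not a description of the actual circuitry: operand sizes are given in bytes (1, 2 or 4), a reference "crosses a double-word boundary" when it spans two aligned 4-byte units, and all names are hypothetical.

```python
def crosses_double_word(addr: int, size: int) -> bool:
    """True if a reference of `size` bytes at `addr` spans two aligned double-words."""
    return (addr % 4) + size > 4


def interlock_detected(write_addr: int, write_size: int,
                       read_addr: int, read_size: int) -> bool:
    """Apply the four dependency cases from the text to one write/read pair."""
    # Case 1: either reference crosses a double-word boundary.
    if crosses_double_word(write_addr, write_size) or \
            crosses_double_word(read_addr, read_size):
        return True
    # Case 2: address bits 0 through 11 are equal.
    if (write_addr & 0xFFF) == (read_addr & 0xFFF):
        return True
    # Cases 3 and 4: address bits 2 through 11 are equal and either
    # reference is for a word (2 bytes) or a double-word (4 bytes).
    same_double_word = ((write_addr >> 2) & 0x3FF) == ((read_addr >> 2) & 0x3FF)
    return same_double_word and (write_size in (2, 4) or read_size in (2, 4))
```

Because only bits 0 through 11 take part in the comparison, two byte references that are a multiple of 4 KBytes apart are reported as dependent even though they touch different locations.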
Control dependencies also arise when JUMP, RET, and oth-  
er non-branch instructions alter the sequential execution of  
instructions.  
Appendix D. Instruction Execution Times (Continued)  
D.4 STORAGE DELAYS
The flow of instructions in the pipeline can be delayed by
off-chip memory references that result from misses in the
on-chip storage buffers and by misalignment of instructions
and operands. These considerations are explained in the
following sections. The delays reported assume no wait
states on the external bus and no interference between
instruction and data references.

D.4.1 Instruction Cache Misses
An Instruction Cache miss causes a 5 cycle gap in the fetching
of instructions. When the miss occurs for a non-sequential
instruction fetch, the pipeline is idle for the entire gap, so
the delay is 5 cycles. When the miss occurs for a sequential
fetch, the pipeline is not idle for the entire gap because
instructions that have been prefetched ahead and buffered
can be executed. The delay for misses on sequential
instruction fetches can be estimated to be approximately
half the gap, or 2.5 cycles.

D.4.2 Data Cache Misses
A Data Cache miss causes a delay of 2 cycles. When a
burst read cycle is used to fill the cache block, 3 additional
cycles are required to update the Data Cache. If a burst
cycle is used and either of the 2 instructions following
the instruction that caused the miss also reads from
memory, an additional delay occurs: 3 cycles when the
instruction that reads from memory immediately follows
the miss, and 2 cycles when the memory read occurs
2 instructions after the miss.

D.4.3 TLB Misses
There is a delay for the MMU to translate a virtual address
whenever there is a TLB miss for an instruction fetch, data
read or data write and whenever the M-bit in the Page Table
Entry (PTE) must be set for a data write that hits in the TLB.
The delay for the MMU to handle a TLB miss is 15 cycles
when no update to the PTEs is necessary. When only the
Level-1 PTE must be updated, the delay is 17 cycles; when
only the Level-2 PTE must be updated, the delay is 22 cycles.
When both PTEs must be updated, the delay is 24 cycles.

D.4.4 Instruction and Operand Alignment
When a data reference (either read or write) crosses a
double-word boundary, there is a delay of 2 cycles.
When the opcode for a non-sequential instruction crosses a
double-word boundary, there is a delay of 1 cycle. No delay
occurs in the same situation for a sequential instruction.
There is also a delay of 2 cycles when an instruction fetch is
located on a different page from the previous fetch and
there is a hit in the Instruction Cache. This delay, which is
due to the time required to translate the new page's address,
also occurs following any serializing operation.

D.5 EXECUTION TIME CALCULATIONS
This section provides the necessary information to calculate
the $T_e$ portion of the effective time required by the CPU to
execute an instruction.
The effects of data dependencies and storage delays are
not taken into account in the evaluation of $T_e$; rather, they
should be separately evaluated through a careful examination
of the instruction sequence.
The following assumptions are made:
- The entire instruction, with displacements and immediate
  operands, is present in the instruction queue when needed.
- All memory operands are available to the Execution Unit
  and Address Unit when needed.
- Memory writes are performed at full speed through the
  write buffer.
- Where possible, the values of operands are taken into
  consideration when they affect instruction timing, and a
  range of times is given. When this is not done, the worst
  case is assumed.

D.5.1 Definitions
$T_{eu}$   Time required by the Execution Unit to execute an
        instruction.
$T_{au}$   Total processing time in the Address Unit.
$T_{ad}$   Extra time needed by the Address Unit, in addition
        to the basic time, to process more complex cases.
        $T_{ad}$ can be evaluated as follows:
            $T_{ad} = T_x + T_{y1} + T_{y2}$
        $T_x$ = 2 if the instruction has two general operands
                and both of them are in memory.
        $T_x$ = 0 otherwise.
        $T_{y1}$ and $T_{y2}$ are related to operands 1 and 2
        respectively. Their values are given below.
        $T_{y(1,2)}$ = 3 if Memory Relative
                     8 if External
                     2 if Scaled Indexing
                     0 if any other addressing mode
The following parameters are only used for floating-point
execution time calculations.
$T_{anp}$  Additional Address Unit time needed to process
        floating-point instructions in non-pipelined mode
        (Section D.2.2). $T_{anp}$ may be totally hidden for
        pipelined instructions. For non-pipelined instructions
        it can be calculated as follows:
            $T_{anp} = 3 + 2 \times$ (Number of 64-bit operands in memory)
$T_{tcs}$  Time required to transfer ID and Opcode, if no operand
        needs to be transferred to the slave. Otherwise,
        it is the time needed to transfer the last 32
        bits of operand data to the slave. In the latter case
        the transfer of ID and Opcode as well as any operand
        data except the last 32 bits is included in the
        Execution Unit timing.
$T_{tsc}$  Time required by the CPU to complete the floating-point
        instruction upon receiving the DONE signal
        from the slave. This includes the time to process
        the DONE signal itself in addition to the time needed
        to read the result (if any) from the slave.
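As a worked illustration of the Address Unit parameters defined in Section D.5.1, the following Python sketch evaluates T_ad (as T_x + T_y1 + T_y2) and T_anp (as 3 plus 2 cycles per 64-bit memory operand). It is a minimal sketch assuming that reading of the definitions; the mode strings and function names are hypothetical labels chosen for this example.

```python
# Extra Address Unit cycles per operand addressing mode (T_y1 / T_y2).
# Any mode not listed here contributes 0 cycles.
T_Y = {
    "memory_relative": 3,
    "external": 8,
    "scaled_indexing": 2,
}


def t_ad(mode1: str, mode2: str = "", both_in_memory: bool = False) -> int:
    """T_ad = T_x + T_y1 + T_y2 for the two general operands."""
    # T_x is 2 only when there are two general operands and both are in memory.
    t_x = 2 if both_in_memory else 0
    return t_x + T_Y.get(mode1, 0) + T_Y.get(mode2, 0)


def t_anp(num_64bit_memory_operands: int) -> int:
    """T_anp for a non-pipelined floating-point instruction."""
    return 3 + 2 * num_64bit_memory_operands
```

For example, an instruction with a Memory Relative first operand and an External second operand, both in memory, gives T_ad = 2 + 3 + 8 = 13 cycles.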
l       This parameter is related to the floating-point operand
        size as follows:
            Standard floating (32 bits): l = 0
            Long floating (64 bits):     l = 1

D.5.2 Notes on Table Use
1. In the $T_{eu}$ column the notation n1 → n2 means n1 minimum,
   n2 maximum.
2. In the notes column, notations held within angle brackets < >
   indicate alternatives in the operand addressing
   modes which affect the execution time. A table entry
   which is affected by the operand addressing may have
   multiple values, corresponding to the alternatives. The
   addressing notations are:
       <I>   Immediate
       <R>   CPU register
       <M>   Memory
       <F>   FPU register, either 32 or 64 bits
       <m>   Memory, except Top of Stack
       <T>   Top of Stack
       <x>   Any addressing mode
       <ab>  a and b represent the addressing modes of operands
             1 and 2 respectively. Both of them can be
             any addressing mode (e.g., <MR> means
             memory to CPU register).
3. The notation 'Break K' provides pipeline status information
   after executing the instruction to which 'Break K' applies.
   The value of K is interpreted as follows:
       K = 0  The Address Unit was stopped by the instruction
              but the pipeline was not flushed. The Address
              Unit can start processing the next instruction
              immediately.
       K > 0  The pipeline was flushed by the instruction. The
              Address Unit must wait for K cycles before it can
              start processing the next instruction.
       K < 0  The Address Unit was stopped at the beginning
              of the instruction but it was restarted |K| cycles
              before the end of it. The Address Unit can start
              processing the next instruction |K| cycles before
              the end of the instruction to which 'Break K' applies.
4. Some instructions must wait for pending writes to complete
   before being able to execute. The number of cycles
   that these instructions must wait is between 6 and 7
   for the first operand in the write buffer and 2 for the
   second operand, if any.
5. The CBITIi and SBITIi instructions will execute a RMW
   access after waiting for pending writes. The extra time
   required for the RMW access is only 3 cycles since the
   read portion is overlapped with the time in the Execution
   Unit.
6. The keywords defined for the Bcond instruction have the
   following meaning:
       BTPC   Branch Taken, Predicted Correctly
       BTPI   Branch Taken, Predicted Incorrectly
       BNTPC  Branch Not Taken, Predicted Correctly
       BNTPI  Branch Not Taken, Predicted Incorrectly

D.5.3 $T_{eff}$ Evaluation
The $T_e$ portion of the effective execution time for a certain
instruction in an instruction sequence is obtained by
performing the following steps:
1. Label the current and previous instruction in the sequence
   with n and n − 1 respectively.
2. Obtain from the tables the values of $T_{eu}$ and $T_{au}$ for
   instruction n and $T_{eu}$ for instruction n − 1.
3. For floating-point instructions, obtain the values of $T_{tcs}$
   and $T_{tsc}$.
4. Use the following formula to determine the execution time $T_e$.

   $$T_e = func(T_{au}(n),\ T_{eu}(n-1),\ T_{flt}(n-1),\ Break(n-1)) + T_{eu}(n) + T_{flt}(n) + T_{dpf}(n)$$

   $T_{dpf}$ is the delay incurred before an instruction can begin
   execution. It must be considered only when the floating-point
   pipelined mode is enabled.
   For a non-floating-point instruction, it represents the time
   needed to complete all the instructions in the FIFO. For a
   floating-point instruction, it is only relevant if the FIFO is
   full, and represents the time to complete the first instruction
   in the FIFO.
   func provides the amount of processing time in the Address
   Unit that cannot be hidden. Its definition is given below.

   $$func = \begin{cases}
   0 & \text{if } T_{au}(n) \le T_{eu}(n-1) + T_{flt}(n-1) \text{ AND NOT } Break(n-1) \\
   T_{au}(n) - (T_{eu}(n-1) + T_{flt}(n-1)) & \text{if } T_{au}(n) > T_{eu}(n-1) + T_{flt}(n-1) \text{ AND NOT } Break(n-1) \\
   T_{au}(n) + K & \text{if } (T_{au}(n) + K) > 0 \text{ AND } Break(n-1) \\
   0 & \text{if } (T_{au}(n) + K) \le 0 \text{ AND } Break(n-1)
   \end{cases}$$

   K is the value associated with Break(n − 1).
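The evaluation of the T_e portion in Section D.5.3 lends itself to a short calculation. The Python sketch below assumes the following reading of func: with no Break on the previous instruction, only the part of T_au(n) that exceeds T_eu(n-1) + T_flt(n-1) remains visible; with Break K on the previous instruction, the visible part is T_au(n) + K, never less than 0. The function names, argument order and the use of `None` for "no Break" are choices made for this example only.

```python
from typing import Optional


def func(t_au_n: int, t_eu_prev: int, t_flt_prev: int = 0,
         break_prev: Optional[int] = None) -> int:
    """Address Unit time of instruction n that cannot be hidden.

    break_prev is the K of 'Break K' on instruction n-1, or None if
    instruction n-1 has no Break note.
    """
    if break_prev is None:
        # Address Unit work overlaps the previous instruction's execution.
        return max(0, t_au_n - (t_eu_prev + t_flt_prev))
    # Previous instruction stopped the Address Unit: K shifts its start time.
    return max(0, t_au_n + break_prev)


def t_e(t_au_n: int, t_eu_n: int, t_eu_prev: int,
        t_flt_n: int = 0, t_flt_prev: int = 0, t_dpf_n: int = 0,
        break_prev: Optional[int] = None) -> int:
    """T_e = func(...) + T_eu(n) + T_flt(n) + T_dpf(n)."""
    return (func(t_au_n, t_eu_prev, t_flt_prev, break_prev)
            + t_eu_n + t_flt_n + t_dpf_n)
```

For instance, a register-to-register MOVD (T_eu = 2, T_au = 2) that follows a RET (T_eu = 4, Break 4) evaluates to (2 + 4) + 2 = 8 cycles, whereas the same MOVD after another MOVD with no Break evaluates to 2 cycles.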
   $T_{flt}$ only applies to floating-point instructions and is
   always 0 for other instructions. It is evaluated as follows:
       if pipelined mode is disabled, then
           $T_{flt} = T_{tcs} + T_{fpu} + T_{tsc}$
       else
           $T_{flt} = 0$                                   if group A instruction.
           $T_{flt} = \max(T_{prv}, T_{tcs}) + T_{tsc}$     if group B instruction.
   $T_{fpu}$ is the execution time in the Floating-Point
   Unit. $T_{prv}$ is the time needed by the CPU and FPU
   to complete all the floating-point instructions in the
   FIFO.
5. Calculate the total execution time $T_{eff}$ by using the
   following formula:

   $$T_{eff} = T_e + T_d + T_s$$

   Where $T_d$ and $T_s$ are dependent on the instruction sequence,
   and can be obtained using the information provided in Section D.4.

D.5.4 Instruction Timing Example
This section presents a simple instruction timing example
for a procedure that recursively evaluates the Fibonacci
function. In this example there are no data dependencies or
storage buffer misses; only the basic instruction execution
times in the pipeline, control dependencies, and instruction
alignment are considered.
The following is the source of the procedure in C.

    unsigned fib(x)
    int x;
    {
        if (x > 2)
            return (fib(x-1) + fib(x-2));
        else
            return(1);
    }

The assembly code for the procedure with comments indicating
the execution time is shown below. The procedure
requires 26 cycles to execute when the actual parameter is
less than or equal to 2 (branch taken) and 99 cycles when
the actual parameter is equal to 3 (recursive calls).

    fib:    movd    r3,tos      ; 2 cycles
            movd    r4,tos      ; 2 cycles
            movd    r1,r3       ; 2 cycles
            cmpqd   $(2),r3     ; 2 cycles
            bge     .L1         ; 2 cycles, Break 4 If Branch Taken
            movd    r3,r1       ; 2 cycles
            addqd   $(-2),r1    ; 2 cycles
            bsr     fib         ; 3 cycles
            movd    r0,r4       ; 4 cycles + 4 cycles due to RET
            movd    r3,r1       ; 2 cycles
            addqd   $(-1),r1    ; 2 cycles
            bsr     fib         ; 3 cycles
            addd    r4,r0       ; 4 cycles + 1 cycle alignment
                                ; + 4 cycles due to RET
            movd    tos,r4      ; 2 cycles
            movd    tos,r3      ; 2 cycles
            ret     $(0)        ; 4 cycles, Break 4
            .align 4
    .L1:    movqd   $(1),r0     ; 4 cycles + 4 cycles due to BGE
            movd    tos,r4      ; 2 cycles
            movd    tos,r3      ; 2 cycles
            ret     $(0)        ; 4 cycles, Break 4
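The 26-cycle figure quoted for the branch-taken case of the Fibonacci example in Section D.5.4 can be checked by adding up the per-instruction comments along that path. The Python sketch below does only that. It assumes the "4 cycles due to BGE" penalty is charged to the first instruction at the branch target, which is one reading of the listing; the list and function names are hypothetical.

```python
# (instruction, cycles) along the path taken when the parameter is <= 2.
# Cycle counts are copied from the comments in the assembly listing.
BASE_CASE_PATH = [
    ("movd  r3,tos",    2),
    ("movd  r4,tos",    2),
    ("movd  r1,r3",     2),
    ("cmpqd $(2),r3",   2),
    ("bge   .L1",       2),      # branch taken, Break 4
    ("movqd $(1),r0",   4 + 4),  # 4 cycles + 4 cycles due to BGE (assumed attribution)
    ("movd  tos,r4",    2),
    ("movd  tos,r3",    2),
    ("ret   $(0)",      4),
]


def path_cycles(path):
    """Sum the cycle counts of every instruction on an execution path."""
    return sum(cycles for _, cycles in path)


if __name__ == "__main__":
    print(path_cycles(BASE_CASE_PATH))
```

The sum agrees with the 26 cycles stated in the text for the branch-taken case. The recursive case is not reproduced here, since it depends on how the alignment and return penalties are attributed to individual instructions.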
D.5.5 Execution Timing Tables  
The following tables provide the execution timing information for all the NS32532 instructions. The table for the floating-point  
instructions provides only the CPU portion of the total execution time. The FPU execution times can be found in the NS32381  
and NS32580 datasheets.  
D.5.5.1 Basic and Memory Management Instructions  
Mnemonic  
ABSi  
T
T
Notes  
Mnemonic  
T
T
au  
Notes  
eu  
au  
eu  
a
a
a
b
5
5
2
2
T
T
CHECKi  
10  
2
T
Break 3.  
ad  
ad  
If SRC is out  
of bounds and  
the V bit in the  
PSR is set,  
then add trap  
time.  
ACBi  
If incorrect prediction  
then Break 1  
ad  
a
a
a
a
a
ADDi  
2
2
9
2
2
2
2
2
2
4
T
T
T
T
T
ad  
ad  
ad  
ad  
ad  
ADDCi  
ADDPi  
ADDQi  
ADDR  
ADJSPi  
a
a
CINV  
10  
2
2
2
T
T
Wait for  
pending  
writes.  
ad  
Break 5  
a
a
e
e
5
3
2
2
T
T
i
i
B, W  
D
Break 0  
Break 0  
ad  
CMPi  
ad  
ad  
a
e
number  
of elements.  
Break 0  
CMPMi  
6
8 * n  
n
a
a
ANDi  
ASHi  
2
9
2
2
T
ad  
T
ad  
B
COND  
2
x
2
3
2
BTPC  
a
CMPQi  
CMPSi  
2
2
T
ad  
2
2
2
BTPI  
Break 2  
Break 2  
a
a
a
e
n number  
of elements.  
Break 0  
7
13 * n  
2
T
ad  
2
BNTPC  
2
BNTPI  
(see Note 5 in  
Section D.5.2)  
a
e
number  
CMPST  
6
20 * n  
2
T
ad  
n
of elements.  
Break 0  
a
BICi  
2
6
2
2
T
T
ad  
a
BICPSRi  
Wait for pending writes.  
Break 5  
ad  
a
a
COMi  
CVTP  
CXP  
2
5
2
4
T
T
ad  
ad  
a
BISPSRi  
BPT  
6
2
T
ad  
Wait for pending writes.  
Break 5  
17  
21  
13  
Break 5  
Break 5  
30  
21  
2
Modular  
Direct  
a
T
CXPD  
DEIi  
11  
5
ad  
2
a
a
e
i 0/4/12 for  
28  
4 * i  
T
ad  
Break 5  
B/W/D.  
Break 0  
BR  
2
2
x
x
7
3
3
2
DIA  
3
2
Break 5  
a
BSR  
CASEi  
CBITi  
3
2
T
T
ad  
a
a
e
for B/W/D  
DIVi  
(30  
x
40)  
4 * i  
2
T
i
0/4/12  
ad  
a
Break 5  
ad  
k
k
l
l
10  
14  
2
2
R
a
e
number  
ENTER  
15  
2 * n  
3
n
a
a
T
T
M
Break 0  
ad  
of registers  
saved.  
k
l
CBITIi  
18  
2
M
ad  
Wait for pending writes.  
Execute interlocked  
Break 0  
a
e
number  
EXIT  
EXTi  
8
2 * n  
2
8
n
RMW access. Break 5  
of registers  
restored  
k
k
l
l
12  
R
a
13  
8
6
T
M
ad  
ad  
b
b
Break  
3
3
k
k
l
l
EXSi  
11  
14  
6
R
a
T
M
Break  
Mnemonic  
T
T
Notes  
number  
Mnemonic  
T
T
au  
Notes  
eu  
au  
eu  
a
a
e
of bytes  
a
FFSi  
11  
3 * i  
2
T
i
MOVSVi  
9
2
2
T
Wait for  
pending writes.  
Break 5  
ad  
ad  
ad  
FLAG  
4
32  
21  
2
2
2
No trap  
a
Trap, Modular  
Trap, Direct  
If trap then:  
MOVUSi  
11  
T
Wait for  
pending writes.  
Break 5  
À
pending writes;  
wait for  
a
a
a
MOVXii  
MOVZii  
MULi  
2
2
2
2
2
T
ad  
T
ad  
T
ad  
Ó
Break 5  
a
e
0/4/12  
for B/W/D.  
General case.  
If MULD and  
13  
2 * i  
i
k
k
l
l
IBITi  
10  
14  
2
R
M
a
2
5
T
Break 0  
ad  
ad  
a
INDEXi  
INSi  
43  
T
a
a
24  
2
2
T
T
ad  
k
k
l
s
s
SRC 255  
15  
18  
8
R
0
l
a
8
6
T
M
ad  
ad  
NEGi  
NOP  
NOTi  
ORi  
2
2
ad  
k
k
l
l
INSSi  
14  
19  
6
R
2
a
T
M
a
a
a
3
2
2
T
ad  
T
ad  
T
ad  
Break 0  
Break 5  
Break 5  
2
a
a
a
JSR  
3
3
9
4
2
T
T
T
ad  
ad  
ad  
e
for B/W/D  
QUOi  
(30  
(32  
x
40) 2  
i
0/4/12  
JUMP  
LMR  
a
4 * i  
11  
Wait for  
pending writes.  
Break 5  
a
a
RDVAL  
10  
2
T
Wait for  
pending writes.  
Break 5  
ad  
ad  
a
a
e
FP,  
SP, USP, SP, MOD.  
LPRi  
6
5
2
2
T
T
CPU Reg  
e
for B/W/D  
ad  
REMi  
x
42) 2  
T
i
0/4/12  
a
4 * i  
Break 0  
CPU Reg  
INTBASE, DSR,  
BPC, UPSR.  
Wait for pending  
writes.  
a
e
number  
RESTORE  
7
2 * n  
2
n
e
CFG,  
ad  
of registers  
restored.  
Break 0  
RET  
4
3
Break 4  
Break 5  
CPU Reg  
PSR CAR. Wait for  
pending writes.  
Break 5  
RETI  
19  
13  
29  
22  
5
5
5
5
Noncascaded, Modular  
Noncascaded, Direct  
Cascaded, Modular  
Cascaded, Direct  
a
e
DCR,  
7
3
2
T
ad  
Wait for  
pending writes.  
Break 5  
a
a
LSHi  
MEIi  
2
5
T
T
ad  
a
e
0/4/12  
for B/W/D.  
13  
2 * i  
i
ad  
RETT  
14  
8
5
5
Modular  
Direct  
Break 0  
a
e
for B/W/D  
MODi  
(34  
a
x
49)  
4 * i  
2
T
i
0/4/12  
ad  
Wait for  
pending writes.  
Break 5  
a
a
MOVi  
2
2
2
T
T
ad  
a
e
number  
MOVMi  
5
4 * n  
n
a
ad  
ROTi  
7
8
3
2
2
T
T
ad  
of elements.  
Break 0  
RXP  
5
Break 5  
a
SCONDi  
SAVE  
ad  
a
MOVQi  
MOVSi  
2
2
T
ad  
a
e
number  
8
2 * n  
2
n
e
of elements.  
n
number  
of registers.  
Break 0  
a
a
a
a
12  
14  
4 * n  
8 * n  
2
2
T
ad  
T
ad  
No options.  
B, W and/or U  
Options in effect.  
k
k
l
l
SBITi  
10  
14  
2
a
R
M
2
T
ad  
Break 0  
Break 0  
a
a
e
n number  
of elements.  
MOVST  
16  
9 * n  
2
T
ad  
Break 0  
Mnemonic  
T
T
Notes  
Mnemonic  
T
T
au  
Notes  
eu  
10  
18  
au  
eu  
k
k
l
l
a
a
e
SBITIi  
2
a
R
M
SPRi  
5
3
2
2
T
T
CPU Reg  
CPU Reg  
PSR, CAR  
e
all others  
ad  
ad  
2
T
ad  
a
a
a
SUBi  
2
2
6
2
2
2
T
ad  
T
ad  
T
ad  
Wait for pending  
writes. Execute  
interlocked RMW  
access.  
SUBCi  
SUBPi  
SVC  
Break 5  
32  
21  
2
2
Modular  
Direct  
SETCFG  
SKPSi  
6
2
Break 5  
Wait for  
pending writes.  
Break 5  
a
a
e
n number of  
elements.  
Break 0  
8
6 * n  
2
2
2
T
T
T
ad  
ad  
ad  
k
k
l
l
TBITi  
WAIT  
8
11  
2
R
M
a
a
a
e
n number of  
elements.  
Break 0  
SKPST  
SMR  
6
20 * n  
a
2
T
Break 0  
ad  
3
10  
2
2
Wait for pending  
writes. Wait  
for interrupt  
7
Wait for  
pending writes.  
Break 5  
a
WRVAL  
XORi  
2
2
T
Wait for  
pending writes.  
Break 5  
ad  
ad  
a
T
D.5.5.2 Floating-Point Instructions, CPU Portion  
Mnemonic  
T
eu  
T
T
tcs  
T
tsc  
Group  
Notes  
au  
k
k
k
k
k
k
l
FF  
a
a
a
a
a
a
MOVf, NEGf,  
ABSf, SQRTf,  
LOGBf  
2
4
6
6
11  
13  
2
2
2
2
2
2
T
T
T
T
T
T
2
2
2
2
2
2
1
1
1
1
3
3
A
A
B
B
B
B
anp  
anp  
anp  
anp  
anp  
anp  
l
a
a
a
a
3 * l  
3 * l  
3 * l  
4 * l  
7 * l  
T
MF  
ad  
l
IF  
TF  
l
l
a
a
a
a
a
b
a
T
ad  
T
ad  
2 * l  
2 * l  
FM Break (1 l)  
k
l
l
IM Break (1  
a
b
a
a
MM  
,
l)  
k
k
k
k
k
k
l
a
a
a
a
a
a
ADDf, SUBf,  
MULf, DIVf,  
SCALBf  
2
2
2
2
2
2
2
T
anp  
T
anp  
T
anp  
T
anp  
T
anp  
T
anp  
2
2
2
2
2
2
1
1
1
1
3
3
A
A
B
B
B
B
FF  
MF  
l
a
a
a
a
a
4
6
6
17  
19  
3 * l  
3 * l  
3 * l  
7 * l  
10 * l  
l
IF  
TF  
l
l
a
a
a
a
b
a
T
ad  
T
ad  
2 * l  
2 * l  
FM Break (1 l)  
k
l
l
IM Break (1  
b
MM  
,
l)  
l)  
k
k
k
k
l
a
a
a
a
a
a
a
a
b
1
ROUNDfi, TRUNCfi,  
FLOORfi  
11  
11  
13  
13  
2
2
2
2
T
anp  
T
anp  
T
anp  
T
anp  
2
2
2
2
3
3
3
3
2 * l  
2 * l  
2 * l  
2 * l  
B
B
B
B
FR Break  
l
a
a
a
a
a
b
a
IR Break  
4 * l  
7 * l  
T
ad  
T
ad  
T
ad  
FM Break (1  
l)  
b
b
l
k
k
l
MR  
MM  
,
,
1
IM Break (1  
l
l
a
l
k
k
k
k
l
a
a
a
a
CMPf  
18  
20  
23  
25  
2
2
2
2
T
anp  
T
anp  
T
anp  
T
anp  
2
2
2
2
B
B
B
B
FF  
l
l
a
a
a
a
a
a
3 * l  
3 * l  
6 * l  
T
ad  
T
ad  
T
ad  
MF  
FM  
l
k
l
k
l
k
k
MM  
,
IM  
,
MI  
,
II  
II  
Break 3  
k
k
k
k
l
FF  
a
a
a
a
POLYf, DOTf,  
MACf  
2
4
6
2
2
2
2
T
anp  
T
anp  
T
anp  
T
anp  
2
2
2
2
1
1
1
1
A
A
B
A
l
a
a
a
a
3 * l  
T
ad  
MF  
IF  
l
k
l
TF  
3 * l  
,
l
FM Break (1  
a
a
b
a
l)  
11  
4 * l  
T
T
ad  
k
Break (1  
l
k
l
k
l
IM ,  
l
a
a
a
13  
7 * l  
2
T
anp  
2
1
B
MM  
,
MI  
,
ad  
b
a
l)  
k
k
k
k
l
RF  
a
a
a
a
MOVif  
LFSR  
6
13  
6
13  
2
2
2
2
T
anp  
T
anp  
T
anp  
T
anp  
2
2
2
2
1
1
B
B
B
B
l
a
a
a
b
T
ad  
T
ad  
T
ad  
RM Break  
k
1
l
l
k
IM Break (1  
l
a
a
3 * l  
7 * l  
MF  
MM  
,
,
IF  
,
TF  
l
k
l
b
l)  
k
k
k l  
l
a
a
a
a
6
2
2
2
2
T
anp  
T
anp  
T
anp  
T
anp  
2
2
2
2
1
1
1
1
B
B
B
B
R
l
a
a
a
a
6
6
6
3 * l  
3 * l  
3 * l  
T
ad  
M
I
k
l
T
a
a
a
b
Break  
SFSR  
11  
2
T
T
T
2
3
B
1
anp  
ad  
k
k
l
l
a
a
MOVFL  
4
6
2
2
T
anp  
T
anp  
2
2
1
1
B
B
FF  
MF  
k
l
k
l
TF  
,
IF  
,
ad  
k
k
l
FM Break 0  
a
a
a
a
15  
17  
2
2
T
anp  
T
anp  
T
ad  
T
ad  
2
2
B
B
l
k l  
IM Break 0  
MM  
,
k
k
l
l
a
a
MOVLF  
4
9
2
2
T
anp  
T
anp  
2
2
1
1
B
B
FF  
MF  
k
l
k
l
TF  
a
T
ad  
,
IF  
,
k
k
l
FM Break 0  
a
a
a
a
15  
20  
2
2
T
anp  
T
anp  
T
ad  
T
ad  
2
2
B
B
l
k l  
IM Break 0  
MM  
,
Lit. 114272  
Physical Dimensions inches (millimeters)  
Hermetic Pin Grid Array (U)  
Order Number NS32532-20, NS32532-25 or NS32532-30  
NS Package Number U175A  
LIFE SUPPORT POLICY  
NATIONAL’S PRODUCTS ARE NOT AUTHORIZED FOR USE AS CRITICAL COMPONENTS IN LIFE SUPPORT  
DEVICES OR SYSTEMS WITHOUT THE EXPRESS WRITTEN APPROVAL OF THE PRESIDENT OF NATIONAL  
SEMICONDUCTOR CORPORATION. As used herein:  
1. Life support devices or systems are devices or  
systems which, (a) are intended for surgical implant  
into the body, or (b) support or sustain life, and whose  
failure to perform, when properly used in accordance  
with instructions for use provided in the labeling, can  
be reasonably expected to result in a significant injury  
to the user.  
2. A critical component is any component of a life  
support device or system whose failure to perform can  
be reasonably expected to cause the failure of the life  
support device or system, or to affect its safety or  
effectiveness.  
National Semiconductor Corporation
1111 West Bardin Road
Arlington, TX 76017
Tel: 1(800) 272-9959
Fax: 1(800) 737-7018

National Semiconductor Europe
Fax: (+49) 0-180-530 85 86
Email: cnjwge@tevm2.nsc.com
Deutsch Tel: (+49) 0-180-530 85 85
English Tel: (+49) 0-180-532 78 32
Français Tel: (+49) 0-180-532 93 58
Italiano Tel: (+49) 0-180-534 16 80

National Semiconductor Hong Kong Ltd.
13th Floor, Straight Block,
Ocean Centre, 5 Canton Rd.
Tsimshatsui, Kowloon
Hong Kong
Tel: (852) 2737-1600
Fax: (852) 2736-9960

National Semiconductor Japan Ltd.
Tel: 81-043-299-2309
Fax: 81-043-299-2408
National does not assume any responsibility for use of any circuitry described, no circuit patent licenses are implied and National reserves the right at any time without notice to change said circuitry and specifications.  
