MIPS Assembly Language
The classic RISC architecture powering networking and embedded systems worldwide
MIPS (Microprocessor without Interlocked Pipelined Stages) is one of the most influential and enduring processor architectures in computing history. Originally developed by MIPS Computer Systems in the 1980s, this reduced instruction set computer (RISC) architecture has stayed relevant through decades of technological evolution. It has long been a common choice in networking equipment such as routers and switches, and it continues to serve in embedded systems, telecommunications, and computer architecture education.
Architecture Overview
MIPS embodies the fundamental principles of RISC design with a clean, orthogonal instruction set that prioritizes simplicity and performance predictability. The architecture features a load-store design where arithmetic operations work exclusively on registers, with separate instructions for memory access. This approach simplifies processor implementation while enabling high clock frequencies and efficient pipelining.
Key Design Principles
The MIPS architecture demonstrates several foundational design principles that have influenced processor development across the industry. The instruction set maintains a fixed 32-bit instruction length, simplifying instruction fetch and decode logic while enabling efficient instruction caching. Most instructions pass through the pipeline at a rate of one per clock cycle (multiplication and division are the main exceptions), making performance analysis and optimization more predictable than on complex instruction set computers (CISC).
The architecture employs a large register file with 32 general-purpose registers, reducing memory traffic and enabling efficient code generation by compilers. The simple addressing modes and regular instruction formats minimize hardware complexity while providing sufficient flexibility for diverse programming tasks. The delayed branch mechanism, while initially challenging for programmers, enables efficient pipeline implementation: the instruction in the branch delay slot always executes, so the hardware does not need to stall or flush the pipeline after a branch.
Historical Context and Evolution
MIPS emerged during the 1980s RISC revolution, competing with other pioneering architectures like SPARC and ARM. The architecture gained significant traction in workstations, servers, and embedded systems throughout the 1990s and 2000s. While x86 processors dominated desktop computing and ARM captured mobile markets, MIPS found its niche in networking infrastructure and specialized embedded applications where its predictable performance and efficient implementation proved advantageous.
The architecture has evolved through multiple generations, from the original MIPS I through MIPS V, with the modern MIPS32 and MIPS64 specifications providing contemporary features while maintaining backward compatibility. Recent developments include the MIPS32/64 Release 6 specification, which modernizes the architecture with improved instruction encoding and enhanced features for current applications.
Register Architecture
MIPS provides 32 general-purpose registers, each 32 bits wide in MIPS32 or 64 bits wide in MIPS64 implementations. The register file represents one of the architecture's key strengths, providing ample storage for compiler optimization while maintaining reasonable implementation complexity.
Register Set and Naming Conventions
Register | Number | Name | Description | Preserved |
---|---|---|---|---|
$zero | $0 | zero | Hard-wired zero | N/A |
$at | $1 | at | Assembler temporary | No |
$v0-$v1 | $2-$3 | v0-v1 | Function return values | No |
$a0-$a3 | $4-$7 | a0-a3 | Function arguments | No |
$t0-$t7 | $8-$15 | t0-t7 | Temporary registers | No |
$s0-$s7 | $16-$23 | s0-s7 | Saved registers | Yes |
$t8-$t9 | $24-$25 | t8-t9 | More temporary registers | No |
$k0-$k1 | $26-$27 | k0-k1 | Kernel registers | N/A |
$gp | $28 | gp | Global pointer | N/A |
$sp | $29 | sp | Stack pointer | Yes |
$fp | $30 | fp | Frame pointer | Yes |
$ra | $31 | ra | Return address | No |
Register Usage Conventions
The MIPS calling convention establishes specific roles for registers to ensure compatibility between different compilers and software components. Argument registers $a0-$a3 pass the first four function parameters, with additional arguments passed on the stack. Return values use $v0 for single values and $v0-$v1 for 64-bit returns on MIPS32 systems.
Temporary registers $t0-$t9 provide scratch space for computations and need not be preserved across function calls, making them ideal for intermediate calculations and temporary storage. Saved registers $s0-$s7 must be preserved by called functions, making them suitable for variables that span multiple function calls or loop iterations.
The stack pointer $sp must always point to valid stack memory and is typically adjusted by called functions to allocate local storage. The frame pointer $fp optionally maintains a fixed reference point within the current stack frame, simplifying access to local variables and parameters in complex functions. The return address register $ra holds the return address for function calls, automatically set by jump-and-link instructions.
Special Purpose Registers
Beyond the general-purpose register file, MIPS processors include several special-purpose registers for system control and status monitoring. The program counter (PC) tracks the current instruction address but is not directly accessible through normal instructions. The HI and LO registers store the results of multiplication and division operations, accessible through special move instructions.
Coprocessor 0 (CP0) registers provide system control functionality including exception handling, memory management, and processor configuration. These registers enable operating system implementation and system-level programming but require privileged access in most implementations.
Instruction Set Architecture
The MIPS instruction set provides a comprehensive foundation for general-purpose computing while maintaining the simplicity and regularity that characterize RISC architectures. The base instruction set includes arithmetic, logical, memory access, and control flow operations sufficient for implementing complex software systems.
Instruction Formats
MIPS uses three basic instruction formats that encode different types of operations while maintaining consistent field positions for common elements like register specifiers and opcodes.
R-Type Instructions (Register)
31-26 | 25-21 | 20-16 | 15-11 | 10-6 | 5-0
[opcode] | [rs] | [rt] | [rd] | [shamt] | [funct]
R-type instructions perform operations between registers, with two source registers (rs, rt) and one destination register (rd). The shamt field specifies shift amounts for shift operations, while the funct field provides additional opcode space for different operations within the same primary opcode.
assembly
add $t0, $t1, $t2 # $t0 = $t1 + $t2
sub $s0, $s1, $s2 # $s0 = $s1 - $s2
and $a0, $a1, $a2 # $a0 = $a1 & $a2
or $v0, $v1, $t0 # $v0 = $v1 | $t0
sll $t3, $t4, 4 # $t3 = $t4 << 4
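The field layout above can be checked by packing the fields by hand. The following is a minimal Python sketch, not an assembler: `encode_r` and the `REGS` table are hypothetical helpers introduced here for illustration, and the register numbers are taken from the register table earlier in this article.

```python
# Hypothetical helper table: register numbers follow the register table above.
REGS = {"$zero": 0, "$t0": 8, "$t1": 9, "$t2": 10, "$t3": 11, "$t4": 12}

def encode_r(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $t1, $t2  ->  rd=$t0, rs=$t1, rt=$t2, funct=0x20
add_word = encode_r(REGS["$t1"], REGS["$t2"], REGS["$t0"], 0, 0x20)

# sll $t3, $t4, 4    ->  rd=$t3, rt=$t4, shamt=4, funct=0x00 (rs is unused and left as 0)
sll_word = encode_r(0, REGS["$t4"], REGS["$t3"], 4, 0x00)

print(f"{add_word:#010x} {sll_word:#010x}")
```

Note that the assembly operand order (destination first) differs from the field order in the instruction word, which is why `rd` appears as the third argument.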
I-Type Instructions (Immediate)
31-26 | 25-21 | 20-16 | 15-0
[opcode] | [rs] | [rt] | [immediate]
I-type instructions operate on a register and a 16-bit immediate value. This format covers immediate arithmetic, load and store operations, and conditional branches. The immediate field is sign-extended to 32 bits for arithmetic operations.
assembly
addi $t0, $t1, 100 # $t0 = $t1 + 100
lw $s0, 8($sp) # $s0 = memory[$sp + 8]
sw $a0, 12($fp) # memory[$fp + 12] = $a0
beq $t0, $t1, label # if $t0 == $t1, branch to label
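As a rough illustration of the I-type layout and of sign extension, the sketch below packs two of the instructions above and widens a 16-bit immediate. The function names `encode_i` and `sign_extend16` are assumptions made for this example, not part of any real toolchain.

```python
def sign_extend16(imm):
    """Interpret a 16-bit field as a signed value, as done for addi, lw, sw and beq."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def encode_i(opcode, rs, rt, imm):
    """Pack opcode, rs, rt and a 16-bit immediate into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

addi_word = encode_i(0x08, 9, 8, 100)   # addi $t0, $t1, 100  ($t1 = 9, $t0 = 8)
lw_word = encode_i(0x23, 29, 16, 8)     # lw $s0, 8($sp)      ($sp = 29, $s0 = 16)
neg_word = encode_i(0x08, 9, 8, -4)     # addi $t0, $t1, -4 stores 0xFFFC in the field

print(f"{addi_word:#010x} {lw_word:#010x} {sign_extend16(neg_word & 0xFFFF)}")
```

A negative immediate is stored in two's complement in the low 16 bits and recovered by sign extension when the instruction executes.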
J-Type Instructions (Jump)
31-26 | 25-0
[opcode] | [address]
J-type instructions perform unconditional jumps with a 26-bit address field. The target address is formed by shifting the address field left by two bits and combining it with the upper four bits of the address of the instruction after the jump (PC + 4), enabling jumps within the current 256 MB-aligned region.
assembly
j target # jump to target
jal function # jump to function, save return address
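The way a jump target is assembled from the 26-bit field can be sketched in a few lines of Python. `jump_target` and `jump_field` are hypothetical names used only for this illustration.

```python
def jump_field(target):
    """26-bit field an assembler would store for a word-aligned target address."""
    return (target >> 2) & 0x03FFFFFF

def jump_target(pc, address_field):
    """Target of j/jal: upper 4 bits of (PC + 4) joined with the field shifted left by 2."""
    return ((pc + 4) & 0xF0000000) | ((address_field & 0x03FFFFFF) << 2)

field = jump_field(0x00400024)
print(f"{field:#x} -> {jump_target(0x00400000, field):#010x}")
```

Because only the low 28 bits come from the instruction, a jump cannot leave the 256 MB region that contains the instruction following it; reaching further requires `jr` with a full 32-bit address in a register.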
Arithmetic and Logical Instructions
MIPS provides a comprehensive set of arithmetic and logical instructions for both register-register and register-immediate operations. These instructions form the computational foundation for mathematical operations, bit manipulation, and address calculations.
Basic Arithmetic Operations
assembly
# Addition and subtraction
add $rd, $rs, $rt # $rd = $rs + $rt (traps on signed overflow)
addi $rt, $rs, imm # $rt = $rs + sign_extend(imm) (traps on signed overflow)
addu $rd, $rs, $rt # $rd = $rs + $rt (no overflow trap)
addiu $rt, $rs, imm # $rt = $rs + sign_extend(imm) (no overflow trap)
sub $rd, $rs, $rt # $rd = $rs - $rt (traps on signed overflow)
subu $rd, $rs, $rt # $rd = $rs - $rt (no overflow trap)
# Multiplication and division
mult $rs, $rt # HI:LO = $rs * $rt (signed)
multu $rs, $rt # HI:LO = $rs * $rt (unsigned)
div $rs, $rt # LO = $rs / $rt, HI = $rs % $rt (signed)
divu $rs, $rt # LO = $rs / $rt, HI = $rs % $rt (unsigned)
mfhi $rd # $rd = HI
mflo $rd # $rd = LO
mthi $rs # HI = $rs
mtlo $rs # LO = $rs
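To make the HI/LO behavior concrete, here is a small Python model of signed `mult` and `div` on 32-bit values. It is a sketch under stated assumptions: the helper names are invented for this example, and division by zero (whose result is unpredictable on real hardware) simply raises Python's `ZeroDivisionError`.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mult(rs, rt):
    """Signed 32x32 multiply; returns (HI, LO) as unsigned 32-bit words."""
    product = to_signed32(rs) * to_signed32(rt)
    return (product >> 32) & MASK32, product & MASK32

def div(rs, rt):
    """Signed divide; returns (HI, LO) = (remainder, quotient), truncating toward zero."""
    a, b = to_signed32(rs), to_signed32(rt)
    q = abs(a) // abs(b)          # Python's // floors, so divide magnitudes first
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b                 # remainder takes the sign of the dividend
    return r & MASK32, q & MASK32

hi, lo = mult(0xFFFFFFFE, 3)      # -2 * 3 = -6 as a 64-bit value
print(f"HI={hi:#010x} LO={lo:#010x}")
```

The model shows why `mflo` alone is enough when the product fits in 32 bits, and why `mfhi` is needed to detect or use the upper half.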
Logical Operations
assembly
# Bitwise operations
and $rd, $rs, $rt # $rd = $rs & $rt
andi $rt, $rs, imm # $rt = $rs & zero_extend(imm)
or $rd, $rs, $rt # $rd = $rs | $rt
ori $rt, $rs, imm # $rt = $rs | zero_extend(imm)
xor $rd, $rs, $rt # $rd = $rs ^ $rt
xori $rt, $rs, imm # $rt = $rs ^ zero_extend(imm)
nor $rd, $rs, $rt # $rd = ~($rs | $rt)
# Shift operations
sll $rd, $rt, shamt # $rd = $rt << shamt
sllv $rd, $rt, $rs # $rd = $rt << ($rs & 0x1F)
srl $rd, $rt, shamt # $rd = $rt >> shamt (logical)
srlv $rd, $rt, $rs # $rd = $rt >> ($rs & 0x1F) (logical)
sra $rd, $rt, shamt # $rd = $rt >> shamt (arithmetic)
srav $rd, $rt, $rs # $rd = $rt >> ($rs & 0x1F) (arithmetic)
Comparison Operations
assembly
# Set on less than
slt $rd, $rs, $rt # $rd = ($rs < $rt) ? 1 : 0 (signed)
slti $rt, $rs, imm # $rt = ($rs < sign_extend(imm)) ? 1 : 0
sltu $rd, $rs, $rt # $rd = ($rs < $rt) ? 1 : 0 (unsigned)
sltiu $rt, $rs, imm # $rt = ($rs < sign_extend(imm)) ? 1 : 0 (unsigned)
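The difference between the signed and unsigned comparisons is easiest to see on a value with the top bit set. The Python sketch below models `slt`, `sltu`, and `sltiu`; the function names mirror the mnemonics but are otherwise hypothetical helpers.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def slt(rs, rt):
    """Signed comparison."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0

def sltu(rs, rt):
    """Unsigned comparison of the raw 32-bit patterns."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0

def sltiu(rs, imm16):
    """The immediate is sign-extended first, then compared as unsigned."""
    imm16 &= 0xFFFF
    extended = (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16
    return sltu(rs, extended)

# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
print(slt(0xFFFFFFFF, 1), sltu(0xFFFFFFFF, 1), sltiu(5, 0xFFFF))
```

One consequence is that `sltiu $rt, $rs, -1` is true for every value of `$rs` except 0xFFFFFFFF, since the immediate becomes the largest unsigned value.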
Memory Access Instructions
MIPS employs a load-store architecture where arithmetic operations work exclusively on registers, with separate instructions for transferring data between registers and memory. This design simplifies processor implementation while enabling efficient caching and memory optimization.
Load Instructions
Load instructions transfer data from memory to registers, with support for different data sizes and sign extension options. The effective address is calculated by adding a 16-bit signed offset to a base register value.
assembly
# Load word (32-bit)
lw $rt, offset($rs) # $rt = memory[$rs + offset]
# Load halfword (16-bit)
lh $rt, offset($rs) # $rt = sign_extend(memory[$rs + offset][15:0])
lhu $rt, offset($rs) # $rt = zero_extend(memory[$rs + offset][15:0])
# Load byte (8-bit)
lb $rt, offset($rs) # $rt = sign_extend(memory[$rs + offset][7:0])
lbu $rt, offset($rs) # $rt = zero_extend(memory[$rs + offset][7:0])
# Load upper immediate
lui $rt, imm # $rt = imm << 16
# Load word left/right (unaligned access)
lwl $rt, offset($rs) # Load left portion of unaligned word
lwr $rt, offset($rs) # Load right portion of unaligned word
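The effect of sign extension versus zero extension on sub-word loads can be modeled with Python's `int.from_bytes`. This is only a sketch: the `load` helper and the sample memory contents are assumptions for illustration, and because MIPS can be configured as big-endian or little-endian, the byte order is an explicit parameter.

```python
def load(memory, addr, size, signed, byteorder="big"):
    """Model lb/lbu/lh/lhu/lw: read `size` bytes and widen the value to 32 bits."""
    raw = memory[addr:addr + size]
    return int.from_bytes(raw, byteorder, signed=signed) & 0xFFFFFFFF

memory = bytes([0x80, 0x01, 0xFF, 0x7F])   # hypothetical memory contents

lb = load(memory, 0, 1, signed=True)       # sign-extended byte
lbu = load(memory, 0, 1, signed=False)     # zero-extended byte
lh = load(memory, 0, 2, signed=True)       # sign-extended halfword
lhu = load(memory, 0, 2, signed=False)     # zero-extended halfword
lw = load(memory, 0, 4, signed=False)      # full word

print(f"{lb:#010x} {lbu:#010x} {lh:#010x} {lhu:#010x} {lw:#010x}")
```

Choosing `lb` or `lbu` therefore matters whenever the top bit of the loaded byte can be set, for example when reading raw binary data rather than 7-bit ASCII text.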
Store Instructions
Store instructions transfer data from registers to memory, supporting the same data sizes as load instructions. The stored value is taken from the source register, with appropriate masking for smaller data sizes.
assembly
# Store word (32-bit)
sw $rt, offset($rs) # memory[$rs + offset] = $rt
# Store halfword (16-bit)
sh $rt, offset($rs) # memory[$rs + offset][15:0] = $rt[15:0]
# Store byte (8-bit)
sb $rt, offset($rs) # memory[$rs + offset][7:0] = $rt[7:0]
# Store word left/right (unaligned access)
swl $rt, offset($rs) # Store left portion of unaligned word
swr $rt, offset($rs) # Store right portion of unaligned word
Addressing Modes
MIPS supports several addressing modes through combinations of instructions and register usage patterns. Base addressing uses a register plus offset for accessing data structures and stack variables. Indexed addressing combines two registers for array access, typically implemented using addition followed by load/store operations.
assembly
# Base addressing
lw $t0, 8($sp) # Load from stack (base + offset)
sw $a0, 12($fp) # Store to frame (base + offset)
# Indexed addressing (requires address calculation)
add $t1, $s0, $t2 # $t1 = base + byte offset ($t2 must already hold index * element size)
lw $t0, 0($t1) # Load array element
# Absolute addressing via a label (la is a pseudoinstruction, typically lui + ori)
la $t0, data_label # Load address of label
lw $t1, 0($t0) # Load data at label
Control Flow Instructions
Control flow instructions manage program execution by implementing conditional and unconditional branches, function calls, and returns. MIPS provides a rich set of branch instructions that compare registers and transfer control based on the results.
Conditional Branch Instructions
assembly
# Equality branches
beq $rs, $rt, label # if ($rs == $rt) goto label
bne $rs, $rt, label # if ($rs != $rt) goto label
# Zero comparison branches
beqz $rs, label # if ($rs == 0) goto label (pseudo)
bnez $rs, label # if ($rs != 0) goto label (pseudo)
# Comparison branches
blez $rs, label # if ($rs <= 0) goto label
bgtz $rs, label # if ($rs > 0) goto label
bltz $rs, label # if ($rs < 0) goto label
bgez $rs, label # if ($rs >= 0) goto label
# Branch and link variants
bltzal $rs, label # $ra = PC + 8 (always written); if ($rs < 0) goto label
bgezal $rs, label # $ra = PC + 8 (always written); if ($rs >= 0) goto label
Unconditional Jump Instructions
assembly
# Jump instructions
j target # goto target
jal target # $ra = PC + 8; goto target
jr $rs # goto $rs
jalr $rd, $rs # $rd = PC + 8; goto $rs
# Common jump patterns
jal function # Call function
jr $ra # Return from function
Branch Delay Slots
MIPS implements delayed branches where the instruction immediately following a branch or jump executes regardless of whether the branch is taken. This design simplifies pipeline implementation but requires careful instruction scheduling.
assembly
# Branch with delay slot
beq $t0, $t1, target
add $t2, $t3, $t4 # This instruction always executes
# Jump with delay slot
j function
nop # Delay slot (no operation)
# Optimized delay slot usage
beq $t0, $t1, skip
add $s0, $s1, $s2 # Useful work in delay slot
skip:
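The delay slot also explains two constants that appear throughout this article: branch offsets are measured from PC + 4 (the delay-slot instruction), and link instructions store PC + 8 (the instruction after the delay slot). The Python sketch below works through that arithmetic; the function names are hypothetical.

```python
def branch_offset(branch_pc, target):
    """Signed word offset stored in a branch, relative to the delay slot at PC + 4."""
    return (target - (branch_pc + 4)) >> 2

def branch_target(branch_pc, offset16):
    """Recover the target from the 16-bit offset field of a branch at branch_pc."""
    offset16 &= 0xFFFF
    signed = offset16 - 0x10000 if offset16 & 0x8000 else offset16
    return (branch_pc + 4 + (signed << 2)) & 0xFFFFFFFF

def link_address(pc):
    """Return address written by jal/jalr/bltzal/bgezal: skips the delay slot."""
    return (pc + 8) & 0xFFFFFFFF

forward = branch_offset(0x00400000, 0x00400010)
backward = branch_offset(0x00400010, 0x00400000)
print(forward, backward, hex(link_address(0x00400000)))
```

With a signed 16-bit word offset, a conditional branch can reach roughly 128 KB in either direction from the delay slot.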
Pseudoinstructions
MIPS assembly language includes numerous pseudoinstructions that expand to one or more actual instructions, providing convenient mnemonics for common operations and simplifying assembly programming.
Data Movement Pseudoinstructions
assembly
# Register operations
move $rd, $rs # add $rd, $rs, $zero
nop # sll $zero, $zero, 0
clear $rd # add $rd, $zero, $zero
# Immediate loading
li $rt, imm # Load 32-bit immediate (may use lui + ori)
la $rt, label # Load address of label
# Examples of li expansion:
li $t0, 0x1234 # ori $t0, $zero, 0x1234
li $t1, 0x12345678 # lui $t1, 0x1234; ori $t1, $t1, 0x5678
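The choice between a one-instruction and a two-instruction expansion of `li` can be sketched as follows. This is one plausible strategy written in Python for illustration; `expand_li` is a hypothetical helper, and real assemblers may pick different but equivalent sequences.

```python
def expand_li(reg, value):
    """Return a list of real instructions that load a 32-bit constant into reg."""
    value &= 0xFFFFFFFF
    upper, lower = value >> 16, value & 0xFFFF
    if upper == 0:
        # Fits in 16 bits unsigned: ori zero-extends its immediate.
        return [f"ori {reg}, $zero, {lower:#x}"]
    if upper == 0xFFFF and lower & 0x8000:
        # Small negative value: addiu sign-extends its immediate.
        return [f"addiu {reg}, $zero, {lower - 0x10000}"]
    insns = [f"lui {reg}, {upper:#x}"]          # set the upper 16 bits, clear the lower
    if lower:
        insns.append(f"ori {reg}, {reg}, {lower:#x}")
    return insns

for constant in (0x1234, 0x12345678, -1, 0x10000):
    print(expand_li("$t0", constant))
```

The `ori` in the two-instruction form is safe because `lui` leaves the lower 16 bits zero and `ori` zero-extends, so the two halves never interfere.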
Arithmetic Pseudoinstructions
assembly
# Negation
neg $rd, $rs # sub $rd, $zero, $rs
negu $rd, $rs # subu $rd, $zero, $rs
# Absolute value
abs $rd, $rs # Complex expansion with branches
# Multiplication with immediate
mul $rd, $rs, imm # Load immediate, then mult/mflo
# Division with immediate
div $rd, $rs, imm # Load immediate, then div/mflo
rem $rd, $rs, $rt # div $rs, $rt; mfhi $rd
Comparison Pseudoinstructions
assembly
# Set operations
seq $rd, $rs, $rt # Set if equal
sne $rd, $rs, $rt # Set if not equal
sge $rd, $rs, $rt # Set if greater or equal
sgt $rd, $rs, $rt # Set if greater than
sle $rd, $rs, $rt # Set if less or equal
# Branch pseudoinstructions
blt $rs, $rt, label # Branch if less than
bgt $rs, $rt, label # Branch if greater than
ble $rs, $rt, label # Branch if less or equal
bge $rs, $rt, label # Branch if greater or equal
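The branch pseudoinstructions above are commonly expanded into a `slt` that writes the assembler temporary `$at`, followed by `beq` or `bne` against `$zero`. The Python sketch below generates those two-instruction sequences; `expand_branch` is a hypothetical helper, and a given assembler may emit a different but equivalent expansion.

```python
def expand_branch(op, rs, rt, label):
    """Expand blt/bgt/ble/bge into slt + beq/bne using $at as scratch."""
    table = {
        "blt": (rs, rt, "bne"),   # taken when (rs < rt) is true
        "bgt": (rt, rs, "bne"),   # taken when (rt < rs) is true
        "ble": (rt, rs, "beq"),   # taken when (rt < rs) is false
        "bge": (rs, rt, "beq"),   # taken when (rs < rt) is false
    }
    a, b, branch = table[op]
    return [f"slt $at, {a}, {b}", f"{branch} $at, $zero, {label}"]

for op in ("blt", "bgt", "ble", "bge"):
    print(op, expand_branch(op, "$t0", "$t1", "target"))
```

This is why `$at` is reserved for the assembler: hand-written code that keeps a live value in `$at` can be silently clobbered by any of these pseudoinstructions.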
Programming Examples
Hello World Program
assembly
.data
hello_msg: .asciiz "Hello, MIPS World!\n"
.text
.globl main
main:
# Print string system call
li $v0, 4 # sys_print_string
la $a0, hello_msg # load string address
syscall # system call
# Exit program
li $v0, 10 # sys_exit
syscall # system call
Factorial Function
assembly
.text
.globl factorial
# Calculate factorial of n
# Input: $a0 = n
# Output: $v0 = n!
factorial:
# Base case: if n <= 1, return 1
li $t0, 1
ble $a0, $t0, fact_base
# Save registers
addi $sp, $sp, -8
sw $ra, 4($sp)
sw $a0, 0($sp)
# Recursive call: factorial(n-1)
addi $a0, $a0, -1
jal factorial
# Restore n and calculate n * factorial(n-1)
lw $a0, 0($sp)
lw $ra, 4($sp)
addi $sp, $sp, 8
mult $a0, $v0 # n * factorial(n-1)
mflo $v0 # result in $v0
jr $ra
fact_base:
li $v0, 1 # return 1
jr $ra
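When testing the routine above in a simulator, a reference model that mimics its 32-bit behavior is useful. The Python sketch below assumes, as the assembly does, that only the low 32 bits of each product are kept (the value read by `mflo`); `factorial32` is a hypothetical name.

```python
def factorial32(n):
    """Mirror the assembly: recurse on n - 1 and keep only the low 32 bits (LO)."""
    if n <= 1:                      # base case matches `ble $a0, $t0, fact_base`
        return 1
    return (n * factorial32(n - 1)) & 0xFFFFFFFF

for n in (5, 12, 13):
    print(n, factorial32(n))
```

The largest factorial that fits in 32 bits is 12!; from n = 13 onward the assembly routine returns a wrapped value, which the model reproduces instead of the mathematically correct result.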
Array Sum Function
assembly
.text
.globl array_sum
# Calculate sum of array elements
# Input: $a0 = array address, $a1 = array length
# Output: $v0 = sum
array_sum:
li $v0, 0 # sum = 0
li $t0, 0 # index = 0
sum_loop:
bge $t0, $a1, sum_done # if index >= length, exit
sll $t1, $t0, 2 # t1 = index * 4 (word size)
add $t2, $a0, $t1 # t2 = array + offset
lw $t3, 0($t2) # load array[index]
add $v0, $v0, $t3 # sum += array[index]
addi $t0, $t0, 1 # index++
j sum_loop
sum_done:
jr $ra # return sum in $v0
String Length Function
assembly
.text
.globl strlen
# Calculate string length
# Input: $a0 = string address
# Output: $v0 = string length
strlen:
move $t0, $a0 # save original address
li $v0, 0 # length = 0
strlen_loop:
lb $t1, 0($t0) # load byte
beqz $t1, strlen_done # if null terminator, done
addi $v0, $v0, 1 # increment length
addi $t0, $t0, 1 # advance pointer
j strlen_loop
strlen_done:
jr $ra # return length in $v0
MIPS32 and MIPS64 Extensions
MIPS64 Architecture
MIPS64 extends the base architecture to support 64-bit operations while maintaining compatibility with 32-bit code. The register file expands to 64 bits per register, and additional instructions handle 64-bit arithmetic and memory operations.
assembly
# 64-bit arithmetic (MIPS64)
dadd $rd, $rs, $rt # 64-bit addition
daddi $rt, $rs, imm # 64-bit add immediate
dsub $rd, $rs, $rt # 64-bit subtraction
dmult $rs, $rt # 64-bit multiplication
ddiv $rs, $rt # 64-bit division
# 64-bit memory operations
ld $rt, offset($rs) # Load doubleword (64-bit)
sd $rt, offset($rs) # Store doubleword (64-bit)
ldl $rt, offset($rs) # Load doubleword left
ldr $rt, offset($rs) # Load doubleword right
sdl $rt, offset($rs) # Store doubleword left
sdr $rt, offset($rs) # Store doubleword right
# 64-bit shifts
dsll $rd, $rt, shamt # Doubleword shift left logical
dsrl $rd, $rt, shamt # Doubleword shift right logical
dsra $rd, $rt, shamt # Doubleword shift right arithmetic
dsll32 $rd, $rt, shamt # Doubleword shift left logical + 32
dsrl32 $rd, $rt, shamt # Doubleword shift right logical + 32
dsra32 $rd, $rt, shamt # Doubleword shift right arithmetic + 32
Floating-Point Operations
MIPS processors typically include a floating-point coprocessor (FPU) that provides IEEE 754-compliant floating-point arithmetic. The FPU has its own register file and instruction set.
assembly
# Single-precision floating-point
add.s $f0, $f1, $f2 # $f0 = $f1 + $f2 (single)
sub.s $f0, $f1, $f2 # $f0 = $f1 - $f2 (single)
mul.s $f0, $f1, $f2 # $f0 = $f1 * $f2 (single)
div.s $f0, $f1, $f2 # $f0 = $f1 / $f2 (single)
# Double-precision floating-point
add.d $f0, $f2, $f4 # $f0:$f1 = $f2:$f3 + $f4:$f5 (double)
sub.d $f0, $f2, $f4 # $f0:$f1 = $f2:$f3 - $f4:$f5 (double)
mul.d $f0, $f2, $f4 # $f0:$f1 = $f2:$f3 * $f4:$f5 (double)
div.d $f0, $f2, $f4 # $f0:$f1 = $f2:$f3 / $f4:$f5 (double)
# Floating-point load/store
lwc1 $f0, offset($rs) # Load word to FPU register
swc1 $f0, offset($rs) # Store word from FPU register
ldc1 $f0, offset($rs) # Load doubleword to FPU register
sdc1 $f0, offset($rs) # Store doubleword from FPU register
# Floating-point comparisons
c.eq.s $f0, $f1 # Compare equal (single)
c.lt.s $f0, $f1 # Compare less than (single)
c.le.s $f0, $f1 # Compare less or equal (single)
bc1t label # Branch if FP condition true
bc1f label # Branch if FP condition false
System Programming
Exception Handling
MIPS provides a comprehensive exception handling mechanism for implementing operating systems and handling runtime errors. Exceptions transfer control to predefined handler addresses with minimal hardware overhead.
assembly
# Exception vector addresses (typical, with Status.BEV = 0)
# 0xBFC00000: Reset/NMI
# 0x80000000: TLB refill
# 0x80000180: General exception
# 0x80000200: Interrupt (when Cause.IV = 1)
.ktext 0x80000180
exception_handler:
# Save all registers (kernel exception handler)
.set noat
move $k0, $at # copy $at into $k0 before any pseudoinstruction can clobber it
.set at
# Save registers to kernel stack
la $k1, kernel_stack
sw $k0, 0($k1) # save the original $at so $k0 can be reused below
sw $v0, 4($k1)
sw $v1, 8($k1)
sw $a0, 12($k1)
# ... save all registers
# Determine exception cause
mfc0 $k0, $13 # read Cause register
andi $k0, $k0, 0x7C # isolate ExcCode (bits 6..2), which equals code * 4
# Jump to specific handler
la $k1, exception_table
addu $k1, $k1, $k0 # code * 4 is already the word offset into the table
lw $k1, 0($k1)
jr $k1
# System call handler
syscall_handler:
# $v0 contains system call number
# $a0-$a3 contain arguments
# Implement system call dispatch
beq $v0, 1, sys_print_int
beq $v0, 4, sys_print_string
beq $v0, 10, sys_exit
# ... other system calls
# Return from exception
mfc0 $k0, $14 # read EPC (return address)
addiu $k0, $k0, 4 # skip syscall instruction (addiu avoids an overflow trap on addresses)
mtc0 $k0, $14 # update EPC
# Restore registers
# ... restore all registers
eret # return from exception
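The dispatch step in the handler relies on the layout of the Cause register, where the exception code occupies bits 6..2. The Python sketch below shows the same masking arithmetic; the helper names are hypothetical, and the name table lists only a few of the standard codes.

```python
# A few standard ExcCode values (not a complete list).
EXC_NAMES = {0: "Int", 4: "AdEL", 5: "AdES", 8: "Sys", 9: "Bp", 10: "RI", 12: "Ov"}

def exc_code(cause):
    """Extract ExcCode from Cause: mask bits 6..2, then shift them down."""
    return (cause & 0x7C) >> 2

def table_offset(cause):
    """Byte offset into a table of 4-byte handler addresses."""
    return exc_code(cause) << 2

cause = 0x00000020                     # hypothetical Cause value for a syscall
print(exc_code(cause), EXC_NAMES[exc_code(cause)], table_offset(cause))
```

Because the field already starts at bit 2, the masked value `cause & 0x7C` equals the exception code multiplied by four, which is exactly the byte offset needed to index a table of word-sized handler addresses.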
Memory Management
assembly
# TLB (Translation Lookaside Buffer) management
tlb_miss_handler:
# Handle TLB miss exception
mfc0 $k0, $8 # read BadVAddr register (faulting virtual address)
# EntryHi ($10) is already set by hardware to the faulting page and current ASID
# Look up page table entry
# ... page table lookup code (assumed to leave the even/odd page entries in $k0/$k1)
# Load TLB entry
mtc0 $k0, $2 # write EntryLo0 (even page)
mtc0 $k1, $3 # write EntryLo1 (odd page)
tlbwr # write TLB entry at the index chosen by the Random register
eret # return from exception
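# Cache management
cache_flush:
# Invalidate the instruction cache by index
# Assumes a 64 KB I-cache with 32-byte lines; real code should read the geometry from Config1
li $t0, 0x80000000 # start address in KSEG0 (index operations need a valid unmapped address)
li $t1, 0x80010000 # end address = start + cache size
flush_loop:
cache 0x0, 0($t0) # index invalidate I-cache
addiu $t0, $t0, 32 # next cache line
bne $t0, $t1, flush_loop
jr $ra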
Assembly Programming Techniques
Function Calling Convention
MIPS follows a standard calling convention that ensures compatibility between different compilers and libraries. Understanding this convention is essential for writing assembly functions that interface with high-level language code.
assembly
# Standard function prologue
function_name:
# Save return address and callee-saved registers
addi $sp, $sp, -32 # allocate stack frame
sw $ra, 28($sp) # save return address
sw $fp, 24($sp) # save frame pointer
sw $s0, 20($sp) # save callee-saved register
sw $s1, 16($sp) # save callee-saved register
addi $fp, $sp, 32 # set frame pointer
# Function body
# Arguments in $a0-$a3, return value in $v0-$v1
# Use $t0-$t9 for temporary values
# Use $s0-$s7 for values that must survive function calls
# Function epilogue
lw $ra, 28($sp) # restore return address
lw $fp, 24($sp) # restore frame pointer
lw $s0, 20($sp) # restore callee-saved register
lw $s1, 16($sp) # restore callee-saved register
addi $sp, $sp, 32 # deallocate stack frame
jr $ra # return to caller
Optimized Memory Operations
assembly
# Optimized memory copy (word-aligned)
memcpy_words:
# $a0 = destination, $a1 = source, $a2 = word count
beqz $a2, copy_done # if count == 0, done
copy_loop:
lw $t0, 0($a1) # load word from source
sw $t0, 0($a0) # store word to destination
addi $a0, $a0, 4 # advance destination
addi $a1, $a1, 4 # advance source
addi $a2, $a2, -1 # decrement count
bnez $a2, copy_loop # continue if count > 0
copy_done:
jr $ra
# Unrolled loop for better performance
unrolled_copy:
# Copy 4 words at a time
andi $t0, $a2, 3 # remainder when divided by 4
srl $a2, $a2, 2 # divide count by 4
beqz $a2, copy_remainder
unroll_loop:
lw $t1, 0($a1) # load 4 words
lw $t2, 4($a1)
lw $t3, 8($a1)
lw $t4, 12($a1)
sw $t1, 0($a0) # store 4 words
sw $t2, 4($a0)
sw $t3, 8($a0)
sw $t4, 12($a0)
addi $a0, $a0, 16 # advance by 4 words
addi $a1, $a1, 16
addi $a2, $a2, -1 # decrement loop count
bnez $a2, unroll_loop
copy_remainder:
# Handle remaining words
beqz $t0, copy_done
# ... handle 1-3 remaining words
jr $ra
Data Structure Manipulation
assembly
# Linked list insertion
list_insert:
# $a0 = list head pointer address, $a1 = new node
lw $t0, 0($a0) # load current head
sw $t0, 4($a1) # new_node->next = head
sw $a1, 0($a0) # head = new_node
jr $ra
# Binary search in sorted array
binary_search:
# $a0 = array, $a1 = length, $a2 = target
# Returns: $v0 = index (-1 if not found)
li $t0, 0 # left = 0
move $t1, $a1 # right = length
search_loop:
bge $t0, $t1, not_found # if left >= right, not found
add $t2, $t0, $t1 # mid = (left + right) / 2
srl $t2, $t2, 1
sll $t3, $t2, 2 # calculate array[mid] address
add $t3, $a0, $t3
lw $t4, 0($t3) # load array[mid]
beq $t4, $a2, found # if array[mid] == target, found
blt $t4, $a2, search_right
# Search left half
move $t1, $t2 # right = mid
j search_loop
search_right:
addi $t0, $t2, 1 # left = mid + 1
j search_loop
found:
move $v0, $t2 # return index
jr $ra
not_found:
li $v0, -1 # return -1
jr $ra
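A Python rendering of the same half-open-interval loop can serve as a reference when checking the assembly in a simulator. This is a sketch that mirrors the register usage in comments; it assumes a sorted list of integers and the function name is hypothetical.

```python
def binary_search(array, target):
    """Return the index of target in a sorted list, or -1 if it is absent."""
    left, right = 0, len(array)          # $t0 = left, $t1 = right (exclusive)
    while left < right:                  # bge $t0, $t1, not_found
        mid = (left + right) >> 1        # add + srl
        if array[mid] == target:         # beq $t4, $a2, found
            return mid
        if array[mid] < target:          # blt $t4, $a2, search_right
            left = mid + 1
        else:
            right = mid
    return -1

print(binary_search([1, 3, 5, 7, 9], 7), binary_search([1, 3, 5, 7, 9], 4))
```

Keeping `right` exclusive means an empty array needs no special case: the loop condition fails immediately and the routine returns -1.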
Performance Optimization
Pipeline Optimization
MIPS processors use instruction pipelines to achieve high performance, but pipeline hazards can reduce efficiency. Understanding these hazards enables better code generation and optimization.
assembly
# Poor scheduling - data hazard
lw $t0, 0($a0)
add $t1, $t0, $t2 # load-use stall: $t0 is not available until the load completes
sw $t1, 4($a0) # depends on the add; forwarding normally covers this without a stall
# Better scheduling - avoid hazards
lw $t0, 0($a0)
lw $t3, 8($a0) # independent instruction
add $t1, $t0, $t2 # $t0 now ready
add $t4, $t3, $t2 # independent operation
sw $t1, 4($a0)
sw $t4, 12($a0)
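The scheduling difference can be quantified with a very small hazard model. The Python sketch below assumes a classic five-stage pipeline with full forwarding, where the only stall is a load immediately followed by an instruction that reads the loaded register. The tuple format and function name are assumptions for this illustration.

```python
def count_load_use_stalls(program):
    """Count instructions that read a register loaded by the instruction just before them.

    Each instruction is (opcode, destination register or None, list of source registers).
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "lw" and dest in cur[2]:
            stalls += 1
    return stalls

poor = [
    ("lw", "$t0", ["$a0"]),
    ("add", "$t1", ["$t0", "$t2"]),
    ("sw", None, ["$t1", "$a0"]),
]
better = [
    ("lw", "$t0", ["$a0"]),
    ("lw", "$t3", ["$a0"]),
    ("add", "$t1", ["$t0", "$t2"]),
    ("add", "$t4", ["$t3", "$t2"]),
    ("sw", None, ["$t1", "$a0"]),
    ("sw", None, ["$t4", "$a0"]),
]
print(count_load_use_stalls(poor), count_load_use_stalls(better))
```

Under these assumptions the reordered sequence does twice the work with no load-use stall, because each load is separated from its first use by an independent instruction.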
Branch Optimization
assembly
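# Minimize branch penalties
# Poor: many branches
beq $t0, $zero, skip1
addi $s0, $s0, 1 # s0++ only when $t0 != 0
skip1:
beq $t1, $zero, skip2
addi $s1, $s1, 1 # s1++ only when $t1 != 0
skip2:
# Better: branch-free equivalent using set-on-less-than
sltu $t2, $zero, $t0 # $t2 = ($t0 != 0) ? 1 : 0
addu $s0, $s0, $t2 # s0 += ($t0 != 0)
sltu $t3, $zero, $t1 # $t3 = ($t1 != 0) ? 1 : 0
addu $s1, $s1, $t3 # s1 += ($t1 != 0)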
# Use conditional moves when available
# Instead of: if (a > b) c = a; else c = b;
slt $t0, $a1, $a0 # $t0 = (b < a)
movn $a2, $a0, $t0 # if $t0 != 0, $a2 = $a0
movz $a2, $a1, $t0 # if $t0 == 0, $a2 = $a1
Debugging and Development Tools
GDB Integration
MIPS assembly programs can be debugged using GDB with MIPS support, providing powerful debugging capabilities for development and testing.
bash
# Compile with debug information
mips-linux-gnu-gcc -g -o program program.s
# Debug with GDB
mips-linux-gnu-gdb program
# GDB commands for MIPS
(gdb) info registers # show all registers
(gdb) info registers $t0 # show specific register
(gdb) x/10i $pc # disassemble 10 instructions at PC
(gdb) stepi # single step instruction
(gdb) break *0x400000 # set breakpoint at address
(gdb) print $v0 # print register value
Simulation and Emulation
MIPS programs can be tested using various simulators and emulators, enabling development without physical hardware.
bash
# SPIM simulator
spim -file program.s
# MARS (MIPS Assembler and Runtime Simulator)
java -jar Mars.jar program.s
# QEMU system emulation
qemu-system-mips -M malta -kernel vmlinux
# QEMU user mode emulation
qemu-mips program
Best Practices and Common Patterns
Error Handling
assembly
# Function with error checking
safe_divide:
# $a0 = dividend, $a1 = divisor
# Returns: $v0 = result, $v1 = error code (0 = success)
beqz $a1, divide_by_zero # check for division by zero
div $a0, $a1 # perform division
mflo $v0 # get quotient
li $v1, 0 # success
jr $ra
divide_by_zero:
li $v0, 0 # result = 0
li $v1, 1 # error code = 1
jr $ra
Resource Management
assembly
# Stack-based local storage
allocate_locals:
# Allocate space for local variables
addi $sp, $sp, -16 # allocate 16 bytes
# Use stack space
sw $t0, 0($sp) # local variable 1
sw $t1, 4($sp) # local variable 2
sw $t2, 8($sp) # local variable 3
sw $t3, 12($sp) # local variable 4
# ... function body
# Deallocate space
addi $sp, $sp, 16 # restore stack pointer
jr $ra