MIPS Assembly Language

The classic RISC architecture powering networking and embedded systems worldwide

MIPS (Microprocessor without Interlocked Pipelined Stages) is one of the most influential and enduring processor architectures in computing history. Originally developed by MIPS Computer Systems in the 1980s, this reduced instruction set computer (RISC) architecture has stayed relevant through decades of technological change: it has long been widely deployed in networking equipment such as routers and switches, and it continues to serve in embedded systems, telecommunications, and computer architecture education.

Architecture Overview

MIPS embodies the fundamental principles of RISC design with a clean, orthogonal instruction set that prioritizes simplicity and performance predictability. The architecture features a load-store design where arithmetic operations work exclusively on registers, with separate instructions for memory access. This approach simplifies processor implementation while enabling high clock frequencies and efficient pipelining.

Key Design Principles

The MIPS architecture demonstrates several foundational design principles that have influenced processor development across the industry. The instruction set maintains a fixed 32-bit instruction length, simplifying instruction fetch and decode logic while enabling efficient instruction caching. Most instructions pass through the pipeline at a rate of one per clock cycle, with multiplication, division, and cache-missing memory accesses as the main exceptions, making performance analysis and optimization more predictable than on complex instruction set computers (CISC).

The architecture employs a large register file with 32 general-purpose registers, reducing memory traffic and enabling efficient code generation by compilers. The simple addressing modes and regular instruction formats minimize hardware complexity while providing sufficient flexibility for diverse programming tasks. The delayed branch mechanism, while initially challenging for programmers, simplifies pipeline implementation: the instruction in the branch delay slot always executes, so the hardware does not have to discard it while the branch is being resolved.

Historical Context and Evolution

MIPS emerged during the 1980s RISC revolution, competing with other pioneering architectures like SPARC and ARM. The architecture gained significant traction in workstations, servers, and embedded systems throughout the 1990s and 2000s. While x86 processors dominated desktop computing and ARM captured mobile markets, MIPS found its niche in networking infrastructure and specialized embedded applications where its predictable performance and efficient implementation proved advantageous.

The architecture has evolved through multiple generations, from the original MIPS I through MIPS V, with the modern MIPS32 and MIPS64 specifications providing contemporary features while maintaining backward compatibility. Recent developments include the MIPS32/64 Release 6 specification, which modernizes the architecture with improved instruction encoding and enhanced features for current applications.

Register Architecture

MIPS provides 32 general-purpose registers, each 32 bits wide in MIPS32 or 64 bits wide in MIPS64 implementations. The register file represents one of the architecture's key strengths, providing ample storage for compiler optimization while maintaining reasonable implementation complexity.

Register Set and Naming Conventions

Register	Number	Name	Description	Preserved
$zero	$0	zero	Hard-wired zero	N/A
$at	$1	at	Assembler temporary	No
$v0-$v1	$2-$3	v0-v1	Function return values	No
$a0-$a3	$4-$7	a0-a3	Function arguments	No
$t0-$t7	$8-$15	t0-t7	Temporary registers	No
$s0-$s7	$16-$23	s0-s7	Saved registers	Yes
$t8-$t9	$24-$25	t8-t9	More temporary registers	No
$k0-$k1	$26-$27	k0-k1	Kernel registers	N/A
$gp	$28	gp	Global pointer	N/A
$sp	$29	sp	Stack pointer	Yes
$fp	$30	fp	Frame pointer	Yes
$ra	$31	ra	Return address	No

Register Usage Conventions

The MIPS calling convention establishes specific roles for registers to ensure compatibility between different compilers and software components. Argument registers $a0-$a3 pass the first four function parameters, with additional arguments passed on the stack. Return values use $v0 for single values and $v0-$v1 for 64-bit returns on MIPS32 systems.

Temporary registers $t0-$t9 provide scratch space for computations and need not be preserved across function calls, making them ideal for intermediate calculations and temporary storage. Saved registers $s0-$s7 must be preserved by called functions, making them suitable for variables that span multiple function calls or loop iterations.

The stack pointer $sp must always point to valid stack memory and is typically adjusted by called functions to allocate local storage. The frame pointer $fp optionally maintains a fixed reference point within the current stack frame, simplifying access to local variables and parameters in complex functions. The return address register $ra holds the return address for function calls, automatically set by jump-and-link instructions.
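
The split between temporary and saved registers can be illustrated with a minimal sketch, assuming a simulator such as SPIM or MARS with delayed branches disabled. The labels sum_of_squares and square are hypothetical names introduced only for this example; they are not part of any standard library.

assembly

# Hypothetical example: $v0 = a*a + b*b, with $a0 = a and $a1 = b
sum_of_squares:
    addi $sp, $sp, -16      # allocate frame (kept 8-byte aligned)
    sw   $ra, 12($sp)       # jal below overwrites $ra
    sw   $s0, 8($sp)        # callee-saved: must be restored before return
    sw   $s1, 4($sp)

    move $s1, $a1           # keep b in a saved register across the call
    jal  square             # $v0 = a*a
    move $s0, $v0           # keep a*a across the next call

    move $a0, $s1           # argument = b
    jal  square             # $v0 = b*b
    add  $v0, $s0, $v0      # result = a*a + b*b

    lw   $s1, 4($sp)        # restore callee-saved registers
    lw   $s0, 8($sp)
    lw   $ra, 12($sp)
    addi $sp, $sp, 16       # release frame
    jr   $ra

# Leaf function: writes only $v0 and HI/LO, so it saves nothing
square:
    mult $a0, $a0
    mflo $v0
    jr   $ra

Values that must survive a jal live in $s registers or on the stack. A value left in a $t register could legally be overwritten by the callee.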

Special Purpose Registers

Beyond the general-purpose register file, MIPS processors include several special-purpose registers for system control and status monitoring. The program counter (PC) tracks the current instruction address but is not directly accessible through normal instructions. The HI and LO registers store the results of multiplication and division operations, accessible through special move instructions.

Coprocessor 0 (CP0) registers provide system control functionality including exception handling, memory management, and processor configuration. These registers enable operating system implementation and system-level programming but require privileged access in most implementations.

Instruction Set Architecture

The MIPS instruction set provides a comprehensive foundation for general-purpose computing while maintaining the simplicity and regularity that characterize RISC architectures. The base instruction set includes arithmetic, logical, memory access, and control flow operations sufficient for implementing complex software systems.

Instruction Formats

MIPS uses three basic instruction formats that encode different types of operations while maintaining consistent field positions for common elements like register specifiers and opcodes.

R-Type Instructions (Register)

31    26 25  21 20  16 15  11 10   6 5     0
[opcode] [rs] [rt] [rd] [shamt] [funct]

R-type instructions perform operations between registers, with two source registers (rs, rt) and one destination register (rd). The shamt field specifies shift amounts for shift operations, while the funct field provides additional opcode space for different operations within the same primary opcode.

assembly

add $t0, $t1, $t2       # $t0 = $t1 + $t2
sub $s0, $s1, $s2       # $s0 = $s1 - $s2
and $a0, $a1, $a2       # $a0 = $a1 & $a2
or  $v0, $v1, $t0       # $v0 = $v1 | $t0
sll $t3, $t4, 4         # $t3 = $t4 << 4
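
As a worked illustration of the field layout, the following sketch assembles the machine word for add $t0, $t1, $t2 by hand. It assumes the MARS simulator, whose service 34 prints an integer in hexadecimal; that service is not available in every simulator.

assembly

.text
.globl main
main:
    li   $t0, 9             # rs field: $t1 is register 9
    sll  $t0, $t0, 21       # rs occupies bits 25-21
    li   $t1, 10            # rt field: $t2 is register 10
    sll  $t1, $t1, 16       # rt occupies bits 20-16
    or   $t0, $t0, $t1
    li   $t1, 8             # rd field: $t0 is register 8
    sll  $t1, $t1, 11       # rd occupies bits 15-11
    or   $t0, $t0, $t1
    ori  $a0, $t0, 0x20     # funct = 0x20 (add); opcode and shamt are 0
                            # $a0 now holds 0x012A4020

    li   $v0, 34            # MARS service 34: print integer in hexadecimal
    syscall
    li   $v0, 10            # exit
    syscall

The value 0x012A4020 is the same word an assembler emits for add $t0, $t1, $t2.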

I-Type Instructions (Immediate)

31    26 25  21 20  16 15                0
[opcode] [rs] [rt] [immediate]

I-type instructions operate on a register and a 16-bit immediate value. This format covers immediate arithmetic, load and store operations, and conditional branches. The immediate field is sign-extended to 32 bits for arithmetic operations.

assembly

addi $t0, $t1, 100      # $t0 = $t1 + 100
lw   $s0, 8($sp)        # $s0 = memory[$sp + 8]
sw   $a0, 12($fp)       # memory[$fp + 12] = $a0
beq  $t0, $t1, label    # if $t0 == $t1, branch to label
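
The effect of extending the 16-bit immediate can be seen in a short sketch. The same bit pattern 0xFFFF produces different register values depending on whether the instruction sign-extends (arithmetic and load/store offsets) or zero-extends (logical operations) the field.

assembly

addi $t0, $zero, -1        # imm = 0xFFFF, sign-extended: $t0 = 0xFFFFFFFF
ori  $t1, $zero, 0xFFFF    # imm = 0xFFFF, zero-extended: $t1 = 0x0000FFFF
lw   $t2, -4($sp)          # offset is sign-extended: address = $sp - 4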

J-Type Instructions (Jump)

31    26 25                            0
[opcode] [address]

J-type instructions perform unconditional jumps with a 26-bit address field. The target address is formed by combining the address field with the upper bits of the program counter, enabling jumps within a 256MB region.

assembly

j    target             # jump to target
jal  function           # jump to function, save return address
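
The target calculation can be sketched with ordinary instructions. The instruction address and field value below are hypothetical numbers chosen for illustration. The upper four bits come from the address of the instruction after the jump, and the 26-bit field is shifted left by two to form a 28-bit byte offset, which is where the 256MB (2^28 bytes) region comes from.

assembly

li    $t0, 0x00400018     # hypothetical address of the j instruction
li    $t1, 0x0100009      # hypothetical 26-bit address field
addiu $t0, $t0, 4         # PC + 4 supplies the upper bits
lui   $t2, 0xF000         # mask 0xF0000000 selects bits 31-28
and   $t0, $t0, $t2       # keep the upper 4 bits of PC + 4
sll   $t1, $t1, 2         # word address -> byte address (28 bits)
or    $t0, $t0, $t1       # $t0 = 0x00400024, the jump target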

Arithmetic and Logical Instructions

MIPS provides a comprehensive set of arithmetic and logical instructions for both register-register and register-immediate operations. These instructions form the computational foundation for mathematical operations, bit manipulation, and address calculations.

Basic Arithmetic Operations

assembly

# Addition and subtraction
add  $rd, $rs, $rt      # $rd = $rs + $rt (traps on signed overflow)
addi $rt, $rs, imm      # $rt = $rs + sign_extend(imm) (traps on signed overflow)
addu $rd, $rs, $rt      # $rd = $rs + $rt (no overflow trap)
addiu $rt, $rs, imm     # $rt = $rs + sign_extend(imm) (no overflow trap)
sub  $rd, $rs, $rt      # $rd = $rs - $rt (traps on signed overflow)
subu $rd, $rs, $rt      # $rd = $rs - $rt (no overflow trap)

# Multiplication and division
mult  $rs, $rt          # HI:LO = $rs * $rt (signed)
multu $rs, $rt          # HI:LO = $rs * $rt (unsigned)
div   $rs, $rt          # LO = $rs / $rt, HI = $rs % $rt (signed)
divu  $rs, $rt          # LO = $rs / $rt, HI = $rs % $rt (unsigned)
mfhi  $rd               # $rd = HI
mflo  $rd               # $rd = LO
mthi  $rs               # HI = $rs
mtlo  $rs               # LO = $rs
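
A short sketch of how HI and LO are read back after a multiply or divide, assuming SPIM or MARS, where the two-operand div is the hardware instruction. The operand values are arbitrary illustrations.

assembly

li    $t0, 0x10000       # 65536
li    $t1, 0x10000
multu $t0, $t1           # product = 2^32, too wide for one register
mfhi  $t2                # $t2 = 1 (upper 32 bits of the product)
mflo  $t3                # $t3 = 0 (lower 32 bits of the product)

li    $t0, 17
li    $t1, 5
div   $t0, $t1           # 17 / 5
mflo  $t4                # $t4 = 3 (quotient)
mfhi  $t5                # $t5 = 2 (remainder)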

Logical Operations

assembly

# Bitwise operations
and  $rd, $rs, $rt      # $rd = $rs & $rt
andi $rt, $rs, imm      # $rt = $rs & zero_extend(imm)
or   $rd, $rs, $rt      # $rd = $rs | $rt
ori  $rt, $rs, imm      # $rt = $rs | zero_extend(imm)
xor  $rd, $rs, $rt      # $rd = $rs ^ $rt
xori $rt, $rs, imm      # $rt = $rs ^ zero_extend(imm)
nor  $rd, $rs, $rt      # $rd = ~($rs | $rt)

# Shift operations
sll  $rd, $rt, shamt    # $rd = $rt << shamt
sllv $rd, $rt, $rs      # $rd = $rt << ($rs & 0x1F)
srl  $rd, $rt, shamt    # $rd = $rt >> shamt (logical)
srlv $rd, $rt, $rs      # $rd = $rt >> ($rs & 0x1F) (logical)
sra  $rd, $rt, shamt    # $rd = $rt >> shamt (arithmetic)
srav $rd, $rt, $rs      # $rd = $rt >> ($rs & 0x1F) (arithmetic)

Comparison Operations

assembly

# Set on less than
slt  $rd, $rs, $rt      # $rd = ($rs < $rt) ? 1 : 0 (signed)
slti $rt, $rs, imm      # $rt = ($rs < sign_extend(imm)) ? 1 : 0
sltu $rd, $rs, $rt      # $rd = ($rs < $rt) ? 1 : 0 (unsigned)
sltiu $rt, $rs, imm     # $rt = ($rs < sign_extend(imm)) ? 1 : 0 (unsigned)

Memory Access Instructions

MIPS employs a load-store architecture where arithmetic operations work exclusively on registers, with separate instructions for transferring data between registers and memory. This design simplifies processor implementation while enabling efficient caching and memory optimization.

Load Instructions

Load instructions transfer data from memory to registers, with support for different data sizes and sign extension options. The effective address is calculated by adding a 16-bit signed offset to a base register value.

assembly

# Load word (32-bit)
lw   $rt, offset($rs)   # $rt = memory[$rs + offset]

# Load halfword (16-bit)
lh   $rt, offset($rs)   # $rt = sign_extend(memory[$rs + offset][15:0])
lhu  $rt, offset($rs)   # $rt = zero_extend(memory[$rs + offset][15:0])

# Load byte (8-bit)
lb   $rt, offset($rs)   # $rt = sign_extend(memory[$rs + offset][7:0])
lbu  $rt, offset($rs)   # $rt = zero_extend(memory[$rs + offset][7:0])

# Load upper immediate
lui  $rt, imm           # $rt = imm << 16

# Load word left/right (unaligned access)
lwl  $rt, offset($rs)   # Load left portion of unaligned word
lwr  $rt, offset($rs)   # Load right portion of unaligned word
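
The left/right pair is normally used together to assemble one unaligned word. The sketch below assumes $a0 holds an address that is not a multiple of 4. Which instruction takes which offset depends on the byte order of the target, so both variants are shown and only one applies on a given system.

assembly

# Little-endian target (for example MARS)
lwr  $t0, 0($a0)         # low-order bytes, starting at the lowest address
lwl  $t0, 3($a0)         # high-order bytes, ending at the highest address

# Big-endian target
lwl  $t1, 0($a0)         # high-order bytes, starting at the lowest address
lwr  $t1, 3($a0)         # low-order bytes, ending at the highest address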

Store Instructions

Store instructions transfer data from registers to memory, supporting the same data sizes as load instructions. The stored value is taken from the source register, with appropriate masking for smaller data sizes.

assembly

# Store word (32-bit)
sw   $rt, offset($rs)   # memory[$rs + offset] = $rt

# Store halfword (16-bit)
sh   $rt, offset($rs)   # memory[$rs + offset][15:0] = $rt[15:0]

# Store byte (8-bit)
sb   $rt, offset($rs)   # memory[$rs + offset][7:0] = $rt[7:0]

# Store word left/right (unaligned access)
swl  $rt, offset($rs)   # Store left portion of unaligned word
swr  $rt, offset($rs)   # Store right portion of unaligned word

Addressing Modes

MIPS supports several addressing modes through combinations of instructions and register usage patterns. Base addressing uses a register plus offset for accessing data structures and stack variables. Indexed addressing combines two registers for array access, typically implemented using addition followed by load/store operations.

assembly

# Base addressing
lw $t0, 8($sp)          # Load from stack (base + offset)
sw $a0, 12($fp)         # Store to frame (base + offset)

# Indexed addressing (requires address calculation)
add $t1, $s0, $t2       # address = array base ($s0) + byte offset ($t2)
lw  $t0, 0($t1)         # Load array element

# Addressing via a label (la is a pseudoinstruction, typically lui + ori)
la  $t0, data_label     # Load address of label
lw  $t1, 0($t0)         # Load data at label

Control Flow Instructions

Control flow instructions manage program execution by implementing conditional and unconditional branches, function calls, and returns. MIPS provides a rich set of branch instructions that compare registers and transfer control based on the results.

Conditional Branch Instructions

assembly

# Equality branches
beq  $rs, $rt, label    # if ($rs == $rt) goto label
bne  $rs, $rt, label    # if ($rs != $rt) goto label

# Zero comparison branches
beqz $rs, label         # if ($rs == 0) goto label (pseudo)
bnez $rs, label         # if ($rs != 0) goto label (pseudo)

# Comparison branches
blez $rs, label         # if ($rs <= 0) goto label
bgtz $rs, label         # if ($rs > 0) goto label
bltz $rs, label         # if ($rs < 0) goto label
bgez $rs, label         # if ($rs >= 0) goto label

# Branch and link variants
bltzal $rs, label       # $ra = PC + 8; if ($rs < 0) goto label
bgezal $rs, label       # $ra = PC + 8; if ($rs >= 0) goto label

Unconditional Jump Instructions

assembly

# Jump instructions
j    target             # goto target
jal  target             # $ra = PC + 8; goto target
jr   $rs                # goto $rs
jalr $rd, $rs           # $rd = PC + 8; goto $rs

# Common jump patterns
jal  function           # Call function
jr   $ra                # Return from function

Branch Delay Slots

MIPS implements delayed branches where the instruction immediately following a branch or jump executes regardless of whether the branch is taken. This design simplifies pipeline implementation but requires careful instruction scheduling.

assembly

# Branch with delay slot
beq $t0, $t1, target
add $t2, $t3, $t4       # This instruction always executes

# Jump with delay slot
j   function
nop                     # Delay slot (no operation)

# Optimized delay slot usage
beq $t0, $t1, skip
add $s0, $s1, $s2       # Useful work in delay slot
skip:

Pseudoinstructions

MIPS assembly language includes numerous pseudoinstructions that expand to one or more actual instructions, providing convenient mnemonics for common operations and simplifying assembly programming.

Data Movement Pseudoinstructions

assembly

# Register operations
move $rd, $rs           # add $rd, $rs, $zero
nop                     # sll $zero, $zero, 0
clear $rd               # add $rd, $zero, $zero

# Immediate loading
li   $rt, imm           # Load 32-bit immediate (may use lui + ori)
la   $rt, label         # Load address of label

# Examples of li expansion:
li $t0, 0x1234          # ori $t0, $zero, 0x1234
li $t1, 0x12345678      # lui $t1, 0x1234; ori $t1, $t1, 0x5678

Arithmetic Pseudoinstructions

assembly

# Negation
neg  $rd, $rs           # sub $rd, $zero, $rs
negu $rd, $rs           # subu $rd, $zero, $rs

# Absolute value
abs  $rd, $rs           # Complex expansion with branches

# Multiplication with immediate
mul  $rd, $rs, imm      # Load immediate, then mult/mflo

# Division with immediate
div  $rd, $rs, imm      # Load immediate, then div/mflo
rem  $rd, $rs, $rt      # div $rs, $rt; mfhi $rd

Comparison Pseudoinstructions

assembly

# Set operations
seq  $rd, $rs, $rt      # Set if equal
sne  $rd, $rs, $rt      # Set if not equal
sge  $rd, $rs, $rt      # Set if greater or equal
sgt  $rd, $rs, $rt      # Set if greater than
sle  $rd, $rs, $rt      # Set if less or equal

# Branch pseudoinstructions
blt  $rs, $rt, label    # Branch if less than
bgt  $rs, $rt, label    # Branch if greater than
ble  $rs, $rt, label    # Branch if less or equal
bge  $rs, $rt, label    # Branch if greater or equal
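
The branch pseudoinstructions above have no hardware encoding. A typical assembler expands each one into slt plus beq or bne, using the assembler temporary $at; the exact expansion can vary between assemblers, so the sequences below are a sketch of the common form.

assembly

# blt $rs, $rt, label
slt  $at, $rs, $rt      # $at = 1 if $rs < $rt
bne  $at, $zero, label

# bge $rs, $rt, label
slt  $at, $rs, $rt
beq  $at, $zero, label  # branch when NOT ($rs < $rt)

# bgt $rs, $rt, label
slt  $at, $rt, $rs      # operands swapped: $at = 1 if $rt < $rs
bne  $at, $zero, label

# ble $rs, $rt, label
slt  $at, $rt, $rs
beq  $at, $zero, label  # branch when NOT ($rt < $rs)

This is why $at is reserved for the assembler: any pseudoinstruction may overwrite it without warning.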

Programming Examples

Hello World Program

assembly

.data
hello_msg:  .asciiz "Hello, MIPS World!\n"

.text
.globl main

main:
    # Print string system call
    li $v0, 4               # sys_print_string
    la $a0, hello_msg       # load string address
    syscall                 # system call
    
    # Exit program
    li $v0, 10              # sys_exit
    syscall                 # system call

Factorial Function

assembly

.text
.globl factorial

# Calculate factorial of n
# Input: $a0 = n
# Output: $v0 = n!
factorial:
    # Base case: if n <= 1, return 1
    li $t0, 1
    ble $a0, $t0, fact_base
    
    # Save registers
    addi $sp, $sp, -8
    sw $ra, 4($sp)
    sw $a0, 0($sp)
    
    # Recursive call: factorial(n-1)
    addi $a0, $a0, -1
    jal factorial
    
    # Restore n and calculate n * factorial(n-1)
    lw $a0, 0($sp)
    lw $ra, 4($sp)
    addi $sp, $sp, 8
    
    mult $a0, $v0           # n * factorial(n-1)
    mflo $v0                # result in $v0
    jr $ra
    
fact_base:
    li $v0, 1               # return 1
    jr $ra

Array Sum Function

assembly

.text
.globl array_sum

# Calculate sum of array elements
# Input: $a0 = array address, $a1 = array length
# Output: $v0 = sum
array_sum:
    li $v0, 0               # sum = 0
    li $t0, 0               # index = 0
    
sum_loop:
    bge $t0, $a1, sum_done  # if index >= length, exit
    
    sll $t1, $t0, 2         # t1 = index * 4 (word size)
    add $t2, $a0, $t1       # t2 = array + offset
    lw $t3, 0($t2)          # load array[index]
    add $v0, $v0, $t3       # sum += array[index]
    
    addi $t0, $t0, 1        # index++
    j sum_loop
    
sum_done:
    jr $ra                  # return sum in $v0

String Length Function

assembly

.text
.globl strlen

# Calculate string length
# Input: $a0 = string address
# Output: $v0 = string length
strlen:
    move $t0, $a0           # save original address
    li $v0, 0               # length = 0
    
strlen_loop:
    lb $t1, 0($t0)          # load byte
    beqz $t1, strlen_done   # if null terminator, done
    
    addi $v0, $v0, 1        # increment length
    addi $t0, $t0, 1        # advance pointer
    j strlen_loop
    
strlen_done:
    jr $ra                  # return length in $v0

MIPS32 and MIPS64 Extensions

MIPS64 Architecture

MIPS64 extends the base architecture to support 64-bit operations while maintaining compatibility with 32-bit code. The register file expands to 64 bits per register, and additional instructions handle 64-bit arithmetic and memory operations.

assembly

# 64-bit arithmetic (MIPS64)
dadd  $rd, $rs, $rt      # 64-bit addition
daddi $rt, $rs, imm      # 64-bit add immediate
dsub  $rd, $rs, $rt      # 64-bit subtraction
dmult $rs, $rt           # 64-bit multiplication
ddiv  $rs, $rt           # 64-bit division

# 64-bit memory operations
ld    $rt, offset($rs)   # Load doubleword (64-bit)
sd    $rt, offset($rs)   # Store doubleword (64-bit)
ldl   $rt, offset($rs)   # Load doubleword left
ldr   $rt, offset($rs)   # Load doubleword right
sdl   $rt, offset($rs)   # Store doubleword left
sdr   $rt, offset($rs)   # Store doubleword right

# 64-bit shifts
dsll  $rd, $rt, shamt    # Doubleword shift left logical
dsrl  $rd, $rt, shamt    # Doubleword shift right logical
dsra  $rd, $rt, shamt    # Doubleword shift right arithmetic
dsll32 $rd, $rt, shamt   # Doubleword shift left logical + 32
dsrl32 $rd, $rt, shamt   # Doubleword shift right logical + 32
dsra32 $rd, $rt, shamt   # Doubleword shift right arithmetic + 32

Floating-Point Operations

MIPS processors typically include a floating-point coprocessor (FPU) that provides IEEE 754-compliant floating-point arithmetic. The FPU has its own register file and instruction set.

assembly

# Single-precision floating-point
add.s $f0, $f1, $f2     # $f0 = $f1 + $f2 (single)
sub.s $f0, $f1, $f2     # $f0 = $f1 - $f2 (single)
mul.s $f0, $f1, $f2     # $f0 = $f1 * $f2 (single)
div.s $f0, $f1, $f2     # $f0 = $f1 / $f2 (single)

# Double-precision floating-point
add.d $f0, $f2, $f4     # $f0:$f1 = $f2:$f3 + $f4:$f5 (double)
sub.d $f0, $f2, $f4     # $f0:$f1 = $f2:$f3 - $f4:$f5 (double)
mul.d $f0, $f2, $f4     # $f0:$f1 = $f2:$f3 * $f4:$f5 (double)
div.d $f0, $f2, $f4     # $f0:$f1 = $f2:$f3 / $f4:$f5 (double)

# Floating-point load/store
lwc1  $f0, offset($rs)  # Load word to FPU register
swc1  $f0, offset($rs)  # Store word from FPU register
ldc1  $f0, offset($rs)  # Load doubleword to FPU register
sdc1  $f0, offset($rs)  # Store doubleword from FPU register

# Floating-point comparisons
c.eq.s $f0, $f1         # Compare equal (single)
c.lt.s $f0, $f1         # Compare less than (single)
c.le.s $f0, $f1         # Compare less or equal (single)
bc1t  label             # Branch if FP condition true
bc1f  label             # Branch if FP condition false
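
The comparison instructions set a condition flag that bc1t and bc1f then test. The following minimal sketch prints the larger of two single-precision values. It assumes SPIM or MARS with delayed branches disabled and uses their print_float service (code 2, value in $f12). The labels val_x, val_y, and print_max are hypothetical.

assembly

.data
val_x:  .float 1.5
val_y:  .float 2.5

.text
.globl main
main:
    la     $t0, val_x
    lwc1   $f0, 0($t0)       # $f0 = x
    la     $t0, val_y
    lwc1   $f1, 0($t0)       # $f1 = y

    mov.s  $f12, $f0         # start by assuming x is the maximum
    c.lt.s $f0, $f1          # FP condition flag = (x < y)
    bc1f   print_max         # flag false: keep x
    mov.s  $f12, $f1         # flag true: y is larger

print_max:
    li     $v0, 2            # print_float service prints $f12
    syscall
    li     $v0, 10           # exit
    syscall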

System Programming

Exception Handling

MIPS provides a comprehensive exception handling mechanism for implementing operating systems and handling runtime errors. Exceptions transfer control to predefined handler addresses with minimal hardware overhead.

assembly

# Exception vector addresses (typical, with Status.BEV = 0)
# 0xBFC00000: Reset/NMI
# 0x80000000: TLB refill
# 0x80000180: General exception
# 0x80000200: Interrupt (when Cause.IV = 1)

.ktext 0x80000180
exception_handler:
    # Save all registers (kernel exception handler)
    .set noat
    move $k0, $at           # copy $at before pseudoinstructions can clobber it
    .set at
    
    # Save registers to kernel stack
    la $k1, kernel_stack
    sw $k0, 0($k1)          # store the saved $at; $k0 is reused below
    sw $v0, 4($k1)
    sw $v1, 8($k1)
    sw $a0, 12($k1)
    # ... save all registers
    
    # Determine exception cause
    mfc0 $k0, $13           # read Cause register
    andi $k0, $k0, 0x7C     # extract exception code
    srl $k0, $k0, 2         # shift to get code
    
    # Jump to specific handler
    la $k1, exception_table
    sll $k0, $k0, 2         # multiply by 4 for word offset
    add $k1, $k1, $k0
    lw $k1, 0($k1)
    jr $k1
    
# System call handler
syscall_handler:
    # $v0 contains system call number
    # $a0-$a3 contain arguments
    
    # Implement system call dispatch
    beq $v0, 1, sys_print_int
    beq $v0, 4, sys_print_string
    beq $v0, 10, sys_exit
    # ... other system calls
    
    # Return from exception
    mfc0 $k0, $14           # read EPC (return address)
    addiu $k0, $k0, 4       # skip syscall instruction (no overflow trap)
    mtc0 $k0, $14           # update EPC
    
    # Restore registers
    # ... restore all registers
    
    eret                    # return from exception

Memory Management

assembly

# TLB (Translation Lookaside Buffer) management
tlb_miss_handler:
    # Handle TLB miss exception
    mfc0 $k0, $8            # read BadVAddr register
    mfc0 $k1, $10           # read EntryHi register
    
    # Look up page table entry
    # ... page table lookup code (assumed to leave the even/odd entries in $k0/$k1)
    
    # Load TLB entry
    mtc0 $k0, $2            # write EntryLo0
    mtc0 $k1, $3            # write EntryLo1
    # EntryHi ($10) is set by hardware on a TLB miss, so it is not rewritten here
    tlbwr                   # write TLB entry
    
    eret                    # return from exception

# Cache management
cache_flush:
    # Flush instruction cache
    li $t0, 0x10000         # cache size
    li $t1, 32              # cache line size
    
flush_loop:
    cache 0x0, 0($t0)       # index invalidate I-cache
    subu $t0, $t0, $t1      # next cache line
    bgez $t0, flush_loop
    
    jr $ra

Assembly Programming Techniques

Function Calling Convention

MIPS follows a standard calling convention that ensures compatibility between different compilers and libraries. Understanding this convention is essential for writing assembly functions that interface with high-level language code.

assembly

# Standard function prologue
function_name:
    # Save return address and callee-saved registers
    addi $sp, $sp, -32      # allocate stack frame
    sw $ra, 28($sp)         # save return address
    sw $fp, 24($sp)         # save frame pointer
    sw $s0, 20($sp)         # save callee-saved register
    sw $s1, 16($sp)         # save callee-saved register
    addi $fp, $sp, 32       # set frame pointer
    
    # Function body
    # Arguments in $a0-$a3, return value in $v0-$v1
    # Use $t0-$t9 for temporary values
    # Use $s0-$s7 for values that must survive function calls
    
    # Function epilogue
    lw $ra, 28($sp)         # restore return address
    lw $fp, 24($sp)         # restore frame pointer
    lw $s0, 20($sp)         # restore callee-saved register
    lw $s1, 16($sp)         # restore callee-saved register
    addi $sp, $sp, 32       # deallocate stack frame
    jr $ra                  # return to caller

Optimized Memory Operations

assembly

# Optimized memory copy (word-aligned)
memcpy_words:
    # $a0 = destination, $a1 = source, $a2 = word count
    beqz $a2, copy_done     # if count == 0, done
    
copy_loop:
    lw $t0, 0($a1)          # load word from source
    sw $t0, 0($a0)          # store word to destination
    addi $a0, $a0, 4        # advance destination
    addi $a1, $a1, 4        # advance source
    addi $a2, $a2, -1       # decrement count
    bnez $a2, copy_loop     # continue if count > 0
    
copy_done:
    jr $ra

# Unrolled loop for better performance
unrolled_copy:
    # Copy 4 words at a time
    andi $t0, $a2, 3        # remainder when divided by 4
    srl $a2, $a2, 2         # divide count by 4
    beqz $a2, copy_remainder
    
unroll_loop:
    lw $t1, 0($a1)          # load 4 words
    lw $t2, 4($a1)
    lw $t3, 8($a1)
    lw $t4, 12($a1)
    
    sw $t1, 0($a0)          # store 4 words
    sw $t2, 4($a0)
    sw $t3, 8($a0)
    sw $t4, 12($a0)
    
    addi $a0, $a0, 16       # advance by 4 words
    addi $a1, $a1, 16
    addi $a2, $a2, -1       # decrement loop count
    bnez $a2, unroll_loop
    
copy_remainder:
    # Handle remaining words
    beqz $t0, copy_done
    # ... handle 1-3 remaining words
    jr $ra

Data Structure Manipulation

assembly

# Linked list insertion
list_insert:
    # $a0 = list head pointer address, $a1 = new node
    lw $t0, 0($a0)          # load current head
    sw $t0, 4($a1)          # new_node->next = head
    sw $a1, 0($a0)          # head = new_node
    jr $ra

# Binary search in sorted array
binary_search:
    # $a0 = array, $a1 = length, $a2 = target
    # Returns: $v0 = index (-1 if not found)
    
    li $t0, 0               # left = 0
    move $t1, $a1           # right = length
    
search_loop:
    bge $t0, $t1, not_found # if left >= right, not found
    
    add $t2, $t0, $t1       # mid = (left + right) / 2
    srl $t2, $t2, 1
    
    sll $t3, $t2, 2         # calculate array[mid] address
    add $t3, $a0, $t3
    lw $t4, 0($t3)          # load array[mid]
    
    beq $t4, $a2, found     # if array[mid] == target, found
    blt $t4, $a2, search_right
    
    # Search left half
    move $t1, $t2           # right = mid
    j search_loop
    
search_right:
    addi $t0, $t2, 1        # left = mid + 1
    j search_loop
    
found:
    move $v0, $t2           # return index
    jr $ra
    
not_found:
    li $v0, -1              # return -1
    jr $ra

Performance Optimization

Pipeline Optimization

MIPS processors use instruction pipelines to achieve high performance, but pipeline hazards can reduce efficiency. Understanding these hazards enables better code generation and optimization.

assembly

# Poor scheduling - load-use data hazard
lw $t0, 0($a0)
add $t1, $t0, $t2       # stall: loaded $t0 not ready yet
sw $t1, 4($a0)          # usually no stall: $t1 is forwarded from the add

# Better scheduling - avoid hazards
lw $t0, 0($a0)
lw $t3, 8($a0)          # independent instruction
add $t1, $t0, $t2       # $t0 now ready
add $t4, $t3, $t2       # independent operation
sw $t1, 4($a0)
sw $t4, 12($a0)

Branch Optimization

assembly

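# Minimize branch penalties
# Poor: many branches
beq $t0, $zero, skip1
addi $s0, $s0, 1        # if ($t0 != 0) $s0++
skip1:
beq $t1, $zero, skip2
addi $s1, $s1, 1        # if ($t1 != 0) $s1++
skip2:

# Better: branch-free equivalent
sltu $t2, $zero, $t0    # $t2 = ($t0 != 0) ? 1 : 0
add $s0, $s0, $t2       # $s0 += $t2
sltu $t3, $zero, $t1    # $t3 = ($t1 != 0) ? 1 : 0
add $s1, $s1, $t3       # $s1 += $t3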

# Use conditional moves when available
# Instead of: if (a > b) c = a; else c = b;
slt $t0, $a1, $a0       # $t0 = (b < a)
movn $a2, $a0, $t0      # if $t0 != 0, $a2 = $a0
movz $a2, $a1, $t0      # if $t0 == 0, $a2 = $a1

Debugging and Development Tools

GDB Integration

MIPS assembly programs can be debugged using GDB with MIPS support, providing powerful debugging capabilities for development and testing.

bash

# Compile with debug information
mips-linux-gnu-gcc -g -o program program.s

# Debug with GDB
mips-linux-gnu-gdb program

# GDB commands for MIPS
(gdb) info registers        # show all registers
(gdb) info registers $t0    # show specific register
(gdb) x/10i $pc            # disassemble 10 instructions at PC
(gdb) stepi                # single step instruction
(gdb) break *0x400000      # set breakpoint at address
(gdb) print $v0            # print register value

Simulation and Emulation

MIPS programs can be tested using various simulators and emulators, enabling development without physical hardware.

bash

# SPIM simulator
spim -file program.s

# MARS (MIPS Assembler and Runtime Simulator)
java -jar Mars.jar program.s

# QEMU system emulation
qemu-system-mips -M malta -kernel vmlinux

# QEMU user mode emulation
qemu-mips program

Best Practices and Common Patterns

Error Handling

assembly

# Function with error checking
safe_divide:
    # $a0 = dividend, $a1 = divisor
    # Returns: $v0 = result, $v1 = error code (0 = success)
    
    beqz $a1, divide_by_zero    # check for division by zero
    
    div $a0, $a1                # perform division
    mflo $v0                    # get quotient
    li $v1, 0                   # success
    jr $ra
    
divide_by_zero:
    li $v0, 0                   # result = 0
    li $v1, 1                   # error code = 1
    jr $ra

Resource Management

assembly

# Stack-based local storage
allocate_locals:
    # Allocate space for local variables
    addi $sp, $sp, -16          # allocate 16 bytes
    
    # Use stack space
    sw $t0, 0($sp)              # local variable 1
    sw $t1, 4($sp)              # local variable 2
    sw $t2, 8($sp)              # local variable 3
    sw $t3, 12($sp)             # local variable 4
    
    # ... function body
    
    # Deallocate space
    addi $sp, $sp, 16           # restore stack pointer
    jr $ra
