NumPy Cheatsheet

Installation

| Platform | Command |
| --- | --- |
| Ubuntu/Debian | `sudo apt update && sudo apt install python3-numpy` |
| Ubuntu/Debian (pip) | `pip install numpy` |
| macOS (pip) | `pip3 install numpy` |
| macOS (Homebrew) | `brew install python && pip3 install numpy` |
| Windows (pip) | `pip install numpy` |
| Anaconda (all platforms) | `conda install numpy` |
| Specific version | `pip install numpy==1.24.3` |
| With Intel MKL (conda-forge; NumPy has no `mkl` pip extra) | `conda install -c conda-forge numpy "libblas=*=*mkl"` |
| Virtual environment (Linux/macOS) | `python -m venv myenv && source myenv/bin/activate && pip install numpy` |
| Verify installation | `python -c "import numpy as np; print(np.__version__)"` |

Basic Commands

Array Creation

| Command | Description |
| --- | --- |
| `np.array([1, 2, 3])` | Create 1D array from list |
| `np.array([[1, 2], [3, 4]])` | Create 2D array from nested list |
| `np.zeros((3, 4))` | Create 3×4 array filled with zeros |
| `np.ones((2, 3))` | Create 2×3 array filled with ones |
| `np.full((3, 3), 7)` | Create 3×3 array filled with value 7 |
| `np.empty((2, 2))` | Create 2×2 uninitialized array (arbitrary contents; faster than `np.zeros`) |
| `np.arange(0, 10, 2)` | Create array [0, 2, 4, 6, 8] (stop value excluded) |
| `np.linspace(0, 1, 5)` | Create 5 evenly spaced values from 0 to 1 (both endpoints included) |
| `np.eye(3)` | Create 3×3 identity matrix |
| `np.zeros_like(arr)` | Create zeros array with same shape and dtype as arr |
| `np.random.rand(3, 4)` | Create 3×4 array with random values in [0, 1) |
| `np.random.randint(0, 10, (3, 4))` | Create 3×4 array with random integers in [0, 10) |
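
The difference between `np.arange` and `np.linspace` at the upper bound, and what `np.zeros_like` inherits, can be sketched as follows. The variable names (`stepped`, `spaced`, `template`, `blank`) are illustrative only.

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default.
stepped = np.arange(0, 10, 2)       # [0 2 4 6 8]
spaced = np.linspace(0, 1, 5)       # [0.   0.25 0.5  0.75 1.  ]

# zeros_like copies both the shape and the dtype of an existing array.
template = np.array([[1, 2], [3, 4]], dtype=np.int32)
blank = np.zeros_like(template)

print(stepped, spaced)
print(blank.shape, blank.dtype)     # (2, 2) int32
```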

Array Properties

| Command | Description |
| --- | --- |
| `arr.shape` | Get dimensions of array (e.g., (3, 4)) |
| `arr.ndim` | Get number of dimensions |
| `arr.size` | Get total number of elements |
| `arr.dtype` | Get data type of elements |
| `arr.itemsize` | Get size of each element in bytes |
| `arr.nbytes` | Get total bytes consumed by array |
| `arr.T` | Get transposed array |
| `len(arr)` | Get length of first dimension |

Array Indexing & Slicing

| Command | Description |
| --- | --- |
| `arr[0]` | Access first element |
| `arr[-1]` | Access last element |
| `arr[1:4]` | Slice elements at indices 1, 2, 3 |
| `arr[::2]` | Get every other element |
| `arr[::-1]` | Reverse array |
| `arr[0, 1]` | Access element at row 0, column 1 (2D) |
| `arr[0:2, 1:3]` | Slice rows 0-1, columns 1-2 (2D) |
| `arr[arr > 5]` | Boolean indexing: get elements > 5 |
| `arr[[0, 2, 4]]` | Fancy indexing: get elements at indices 0, 2, 4 |
| `np.where(arr > 5)` | Get indices where condition is True |
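
A short sketch of how these indexing forms relate. It also shows that basic slices are views, while boolean and fancy indexing return copies. The sample values are made up for illustration.

```python
import numpy as np

arr = np.array([3, 8, 1, 9, 4, 7])

mask = arr > 5                 # boolean array, same shape as arr
selected = arr[mask]           # values: [8 9 7]
indices = np.where(mask)[0]    # positions: [1 3 5]
picked = arr[[0, 2, 4]]        # fancy indexing: [3 1 4]

# Basic slices are views; boolean and fancy indexing return copies.
view = arr[1:4]
view[0] = 100                  # also changes arr[1]
selected[0] = -1               # does not change arr

print(arr)                     # [  3 100   1   9   4   7]
```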

Basic Mathematical Operations

| Command | Description |
| --- | --- |
| `arr + 5` | Add scalar to all elements |
| `arr * 2` | Multiply all elements by scalar |
| `arr1 + arr2` | Element-wise addition of arrays |
| `arr1 * arr2` | Element-wise multiplication |
| `arr1 / arr2` | Element-wise division |
| `arr ** 2` | Square all elements |
| `np.sqrt(arr)` | Square root of all elements |
| `np.abs(arr)` | Absolute value of all elements |
| `np.sum(arr)` | Sum of all elements |
| `np.mean(arr)` | Mean of all elements |
| `np.std(arr)` | Standard deviation |
| `np.min(arr)` | Minimum value |
| `np.max(arr)` | Maximum value |
| `np.argmin(arr)` | Index of minimum value |
| `np.argmax(arr)` | Index of maximum value |
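
The reductions above operate on the whole array unless an `axis` is given. The following sketch, using a small made-up 2×3 array, shows the per-axis variants and how to turn the flat index from `np.argmax` back into a row and column.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

total = np.sum(arr)                  # 21, over all elements
col_sums = np.sum(arr, axis=0)       # [5 7 9], one value per column
row_means = np.mean(arr, axis=1)     # [2. 5.], one value per row

# argmax returns an index into the flattened array by default.
flat_idx = np.argmax(arr)            # 5
row, col = np.unravel_index(flat_idx, arr.shape)   # (1, 2)

print(total, col_sums, row_means, (row, col))
```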

Array Reshaping

| Command | Description |
| --- | --- |
| `arr.reshape(3, 4)` | Reshape to 3×4 (must have same total elements) |
| `arr.flatten()` | Convert to 1D array (always a copy) |
| `arr.ravel()` | Convert to 1D array (a view when possible, otherwise a copy) |
| `arr.T` | Transpose array (reverse the order of dimensions) |
| `np.expand_dims(arr, axis=0)` | Add new dimension at specified axis |
| `np.squeeze(arr)` | Remove dimensions of length 1 |
| `arr.resize((3, 4))` | Resize array in-place (can change size) |
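
Whether `ravel` returns a view depends on the memory layout. A minimal sketch, using `np.shares_memory` to check and illustrative variable names:

```python
import numpy as np

arr = np.arange(12)
grid = arr.reshape(3, -1)          # -1 lets NumPy infer the second dimension (4)

flat_copy = grid.flatten()         # always a copy
flat_view = grid.ravel()           # a view here, because grid is contiguous

print(np.shares_memory(grid, flat_copy))   # False
print(np.shares_memory(grid, flat_view))   # True

# A transposed array is not C-contiguous, so ravel has to copy.
transposed = grid.T
print(np.shares_memory(transposed, transposed.ravel()))  # False
```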

Array Concatenation & Splitting

| Command | Description |
| --- | --- |
| `np.concatenate([arr1, arr2])` | Concatenate arrays along an existing axis (default `axis=0`) |
| `np.vstack([arr1, arr2])` | Stack arrays vertically (row-wise) |
| `np.hstack([arr1, arr2])` | Stack arrays horizontally (column-wise) |
| `np.column_stack([arr1, arr2])` | Stack 1D arrays as columns |
| `np.split(arr, 3)` | Split array into 3 equal parts (raises an error if not evenly divisible) |
| `np.vsplit(arr, 2)` | Split array vertically into 2 parts |
| `np.hsplit(arr, 3)` | Split array horizontally into 3 parts |
| `np.array_split(arr, 3)` | Split into 3 parts (allows unequal splits) |
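
The sketch below joins two small made-up arrays along each axis and contrasts `np.split` with `np.array_split` on a length that does not divide evenly.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate([a, b], axis=0)   # shape (4, 2), same result as vstack
cols = np.concatenate([a, b], axis=1)   # shape (2, 4), same result as hstack

seq = np.arange(10)
parts = np.array_split(seq, 3)          # lengths 4, 3, 3

try:
    np.split(seq, 3)                    # 10 is not divisible by 3
except ValueError as exc:
    print(f"np.split failed: {exc}")

print(rows.shape, cols.shape, [len(p) for p in parts])
```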

Advanced Usage

Linear Algebra

| Command | Description |
| --- | --- |
| `np.dot(a, b)` | Dot product of two arrays |
| `a @ b` | Matrix multiplication (Python 3.5+) |
| `np.matmul(a, b)` | Matrix multiplication (explicit) |
| `np.linalg.inv(matrix)` | Inverse of matrix |
| `np.linalg.det(matrix)` | Determinant of matrix |
| `np.linalg.eig(matrix)` | Eigenvalues and eigenvectors |
| `np.linalg.svd(matrix)` | Singular Value Decomposition |
| `np.linalg.solve(A, b)` | Solve linear system Ax = b |
| `np.linalg.norm(arr)` | Compute vector/matrix norm |
| `np.linalg.matrix_rank(matrix)` | Rank of matrix |
| `np.linalg.qr(matrix)` | QR decomposition |
| `np.linalg.cholesky(matrix)` | Cholesky decomposition |
| `np.trace(matrix)` | Sum of diagonal elements |
| `np.diag(arr)` | Extract diagonal or create diagonal matrix |
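
As a small worked example, the system 3x + y = 9, x + 2y = 8 (chosen here purely for illustration) can be solved and checked like this. The last lines verify the eigen-decomposition returned by `np.linalg.eig`.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

# solve is generally preferable to computing inv(A) @ b explicitly.
x = np.linalg.solve(A, b)
print(x)                           # approximately [2. 3.]
print(np.allclose(A @ x, b))       # True

# eig returns eigenvalues and a matrix whose columns are the eigenvectors.
values, vectors = np.linalg.eig(A)
print(np.allclose(A @ vectors, vectors * values))  # True
```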

Statistical Functions

| Command | Description |
| --- | --- |
| `np.median(arr)` | Median value |
| `np.var(arr)` | Variance |
| `np.percentile(arr, 75)` | 75th percentile |
| `np.quantile(arr, 0.75)` | 0.75 quantile (same as 75th percentile) |
| `np.corrcoef(arr1, arr2)` | Correlation coefficient matrix |
| `np.cov(arr1, arr2)` | Covariance matrix |
| `np.histogram(arr, bins=10)` | Compute histogram |
| `np.bincount(arr)` | Count occurrences of each non-negative integer |
| `np.average(arr, weights=w)` | Weighted average |
| `np.cumsum(arr)` | Cumulative sum |
| `np.cumprod(arr)` | Cumulative product |
| `np.diff(arr)` | Discrete difference between consecutive elements |
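
A brief sketch on made-up data. Note that `np.var` and `np.std` compute the population statistic by default (`ddof=0`); pass `ddof=1` for the sample version.

```python
import numpy as np

arr = np.array([2, 4, 4, 4, 5, 5, 7, 9])

pop_var = np.var(arr)             # 4.0, divides by n
sample_var = np.var(arr, ddof=1)  # divides by n - 1, so slightly larger

# percentile takes a value in [0, 100], quantile a value in [0, 1].
p75 = np.percentile(arr, 75)
q75 = np.quantile(arr, 0.75)

running = np.cumsum(arr)          # [ 2  6 10 14 19 24 31 40]
steps = np.diff(arr)              # [2 0 0 1 0 2 2]
counts = np.bincount(arr)         # counts[4] is 3

print(pop_var, sample_var, p75, q75)
```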

Broadcasting & Advanced Indexing

| Command | Description |
| --- | --- |
| `arr + np.array([1, 2, 3])` | Broadcasting: add array to each row |
| `np.broadcast_to(arr, (3, 3))` | Explicitly broadcast array to shape |
| `np.newaxis` | Add new axis for broadcasting (e.g., `arr[:, np.newaxis]`) |
| `np.take(arr, [0, 2, 4])` | Take elements at specified indices |
| `np.put(arr, [0, 2], [99, 88])` | Put values at specified indices |
| `np.select([cond1, cond2], [val1, val2])` | Choose values based on conditions |
| `np.choose([0, 1, 0], [arr1, arr2, arr3])` | Choose elements from multiple arrays |
| `np.compress(condition, arr)` | Select elements using boolean array |
| `np.extract(condition, arr)` | Extract elements satisfying condition |
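
Broadcasting aligns shapes from the trailing dimension, so a length-3 vector matches the columns of a (2, 3) array, while a length-2 vector needs an extra axis to match the rows. A minimal sketch with illustrative values, followed by an `np.select` example:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)     # shape (2, 3)
row = np.array([10, 20, 30])            # shape (3,)
col = np.array([100, 200])              # shape (2,)

by_row = matrix + row                   # row is added to every row
by_col = matrix + col[:, np.newaxis]    # (2, 1) column is added to every column

# np.select uses the choice for the first condition that is True.
scores = np.array([35, 62, 88])
tiers = np.select([scores < 50, scores < 80], [0, 1], default=2)

print(by_row)
print(by_col)
print(tiers)                            # [0 1 2]
```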

Universal Functions (ufuncs)

| Command | Description |
| --- | --- |
| `np.sin(arr)`, `np.cos(arr)`, `np.tan(arr)` | Trigonometric functions (input in radians) |
| `np.arcsin(arr)`, `np.arccos(arr)`, `np.arctan(arr)` | Inverse trigonometric functions |
| `np.exp(arr)` | Exponential (e^x) |
| `np.log(arr)` | Natural logarithm |
| `np.log10(arr)`, `np.log2(arr)` | Base-10 and base-2 logarithms |
| `np.power(arr, 3)` | Raise to power (element-wise) |
| `np.ceil(arr)`, `np.floor(arr)` | Round up/down to nearest integer |
| `np.round(arr, decimals=2)` | Round to specified decimals |
| `np.clip(arr, lo, hi)` | Clip values to range [lo, hi] |
| `np.sign(arr)` | Sign of elements (-1, 0, or 1) |
| `np.maximum(arr1, arr2)` | Element-wise maximum of two arrays |
| `np.minimum(arr1, arr2)` | Element-wise minimum of two arrays |
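
A few of these ufuncs applied to a made-up array. One detail worth knowing is that `np.round` rounds exact halves to the nearest even value rather than always rounding up.

```python
import numpy as np

arr = np.array([-2.5, -0.4, 0.0, 1.6, 3.2])

clipped = np.clip(arr, -1.0, 2.0)     # [-1.  -0.4  0.   1.6  2. ]
signs = np.sign(arr)                  # [-1. -1.  0.  1.  1.]
relu = np.maximum(arr, 0.0)           # the scalar broadcasts against arr

# Halves are rounded to the nearest even value.
halves = np.round(np.array([0.5, 1.5, 2.5]))   # [0. 2. 2.]

print(clipped, signs, relu, halves)
```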

Random Number Generation

| Command | Description |
| --- | --- |
| `np.random.seed(42)` | Set random seed for reproducibility (legacy API) |
| `np.random.rand(3, 4)` | Random floats in [0, 1) with shape (3, 4) |
| `np.random.randn(3, 4)` | Random standard normal distribution |
| `np.random.randint(0, 10, (3, 4))` | Random integers in [0, 10) |
| `np.random.random((3, 4))` | Random floats in [0, 1) |
| `np.random.choice(arr, size=5)` | Random sample from array |
| `np.random.shuffle(arr)` | Shuffle array in-place |
| `np.random.permutation(arr)` | Random permutation (returns copy) |
| `np.random.uniform(0, 10, size=100)` | Uniform distribution [0, 10) |
| `np.random.normal(0, 1, size=100)` | Normal distribution (mean=0, std=1) |
| `np.random.exponential(2.0, size=100)` | Exponential distribution |
| `np.random.binomial(10, 0.5, size=100)` | Binomial distribution |
| `rng = np.random.default_rng(42)` | Create new random generator (modern API) |
| `rng.random((3, 4))` | Generate random floats with new API |
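
For code that uses the modern API, most legacy functions have a direct counterpart on the `Generator` object. The mapping below is a sketch of the common cases, not an exhaustive list.

```python
import numpy as np

rng = np.random.default_rng(42)

floats = rng.random((3, 4))                 # replaces np.random.rand(3, 4)
normals = rng.standard_normal((3, 4))       # replaces np.random.randn(3, 4)
ints = rng.integers(0, 10, size=(3, 4))     # replaces np.random.randint(0, 10, (3, 4))
sample = rng.choice(np.arange(10), size=5, replace=False)  # no repeated values
shuffled = rng.permutation(np.arange(5))    # returns a shuffled copy

# Re-creating the generator with the same seed repeats the stream.
again = np.random.default_rng(42).random((3, 4))
print(np.array_equal(floats, again))        # True
```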

Memory & Performance

| Command | Description |
| --- | --- |
| `arr.copy()` | Create an independent copy of the array data |
| `arr.view()` | Create view (shares memory with original) |
| `arr.astype(np.float32)` | Convert array to different data type (returns a new array) |
| `np.ascontiguousarray(arr)` | Return contiguous array in memory |
| `arr.flags` | Get memory layout information |
| `np.shares_memory(arr1, arr2)` | Check if arrays share memory |
| `np.may_share_memory(arr1, arr2)` | Check if arrays might share memory (faster, may give false positives) |
| `arr.nbytes` | Get total bytes consumed |
| `sys.getsizeof(arr)` | Get size including overhead (import sys) |
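
The following sketch illustrates the view/copy distinction and the effect of dtype on memory use. The array size and variable names are arbitrary.

```python
import numpy as np

base = np.arange(1_000_000, dtype=np.float64)
window = base[::2]           # strided view into base
snapshot = base.copy()       # independent data

print(np.shares_memory(base, window))     # True
print(np.shares_memory(base, snapshot))   # False

# Halving the precision halves the memory footprint.
small = base.astype(np.float32)
print(base.nbytes, small.nbytes)          # 8000000 4000000

# A strided view is not contiguous; ascontiguousarray makes a packed copy.
packed = np.ascontiguousarray(window)
print(window.flags["C_CONTIGUOUS"], packed.flags["C_CONTIGUOUS"])  # False True
```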

Configuration

Data Types

NumPy supports various data types for memory optimization:

# Integer types
np.int8      # -128 to 127
np.int16     # -32,768 to 32,767
np.int32     # -2^31 to 2^31-1
np.int64     # -2^63 to 2^63-1
np.uint8     # 0 to 255
np.uint16    # 0 to 65,535

# Float types
np.float16   # Half precision
np.float32   # Single precision
np.float64   # Double precision (default)

# Complex types
np.complex64  # Two 32-bit floats
np.complex128 # Two 64-bit floats

# Boolean
np.bool_     # True or False

# String types
np.str_      # Unicode string
np.bytes_    # Byte string

# Create array with specific dtype
arr = np.array([1, 2, 3], dtype=np.float32)
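
Smaller integer types save memory but wrap around silently when an array operation exceeds their range. A minimal sketch, using made-up pixel values, that inspects the limits with `np.iinfo` and avoids the wraparound by widening the dtype first:

```python
import numpy as np

limits = np.iinfo(np.int8)
print(limits.min, limits.max)          # -128 127

pixels = np.array([200, 250], dtype=np.uint8)

# uint8 + uint8 stays uint8, so results are taken modulo 256.
wrapped = pixels + pixels              # [144 244]

# Widening one operand first gives the arithmetic result.
widened = pixels.astype(np.uint16) + pixels   # [400 500]

print(wrapped, widened)
```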

Print Options

Configure how arrays are displayed:

# Set print options
np.set_printoptions(
    precision=3,        # Decimal places
    suppress=True,      # Suppress scientific notation
    threshold=1000,     # Max elements before summarizing
    edgeitems=3,        # Items at start/end when summarizing
    linewidth=120       # Characters per line
)

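# Example: suppress scientific notation
np.set_printoptions(suppress=True)
print(np.array([1e-10, 1e10]))  # fixed-point output; 1e-10 is displayed as 0.

# Reset to defaults (calling set_printoptions() with no arguments changes nothing)
np.set_printoptions(precision=8, suppress=False, threshold=1000,
                    edgeitems=3, linewidth=75)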
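
When the display settings are only needed for one block of output, the `np.printoptions` context manager applies them temporarily and restores the previous values on exit. A minimal sketch; `np.array2string` is used here only to capture the formatted text for inspection.

```python
import numpy as np

values = np.array([1.23456789, 1e-10])

before = np.get_printoptions()["precision"]
with np.printoptions(precision=3, suppress=True):
    inside = np.array2string(values)   # formatted with the temporary settings
after = np.get_printoptions()["precision"]

print(inside)
print(before == after)                 # True, the global setting is untouched
```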

Error Handling

Configure how NumPy handles numerical errors:

# Set error handling
np.seterr(
    divide='warn',    # Division by zero: 'ignore', 'warn', 'raise', 'call'
    over='warn',      # Overflow: 'ignore', 'warn', 'raise', 'call'
    under='ignore',   # Underflow
    invalid='warn'    # Invalid operation (e.g., sqrt(-1))
)

# Context manager for temporary settings
with np.errstate(divide='ignore'):
    result = np.array([1, 2]) / 0  # No warning in this block

# Check for errors after operations
np.seterr(all='ignore')
result = np.array([1]) / 0
if np.isinf(result).any():
    print("Infinity detected")
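
Setting a category to `'raise'` turns the floating-point condition into a `FloatingPointError`, which can then be handled like any other exception. A small sketch, scoped with `np.errstate` so the global settings are left alone:

```python
import numpy as np

settings_before = np.geterr()

message = None
with np.errstate(divide="raise", invalid="raise"):
    try:
        np.array([1.0, 2.0]) / 0.0
    except FloatingPointError as exc:
        message = str(exc)             # describes the divide-by-zero condition

settings_after = np.geterr()

print(message)
print(settings_before == settings_after)   # True
```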

Random Number Generator Configuration

# Legacy API (older code)
np.random.seed(42)

# Modern API (recommended)
rng = np.random.default_rng(seed=42)

# Use different algorithms
from numpy.random import PCG64, Philox, MT19937
rng = np.random.Generator(PCG64(seed=42))  # Default bit generator
rng = np.random.Generator(Philox(seed=42))  # Parallel streams
rng = np.random.Generator(MT19937(seed=42)) # Legacy compatibility
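
For parallel work, one common approach is to derive independent child seeds from a single `SeedSequence` rather than picking seeds by hand. The sketch below assumes three workers; the names `workers` and `draws` are illustrative.

```python
import numpy as np

# One root seed, spawned into statistically independent child seeds.
root = np.random.SeedSequence(42)
child_seeds = root.spawn(3)

# Each worker gets its own generator built from its child seed.
workers = [np.random.default_rng(seed) for seed in child_seeds]
draws = [rng.random(2) for rng in workers]

for i, values in enumerate(draws):
    print(f"worker {i}: {values}")
```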

Common Use Cases

Use Case 1: Data Normalization

Normalize data to zero mean and unit variance (standardization):

import numpy as np

# Sample data
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

# Standardize: (x - mean) / std
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)
normalized = (data - mean) / std

print(normalized)
# [[-1.22474487 -1.22474487 -1.22474487]
#  [ 0.          0.          0.        ]
#  [ 1.22474487  1.22474487  1.22474487]]

# Min-Max normalization: (x - min) / (max - min)
min_val = np.min(data, axis=0)
max_val = np.max(data, axis=0)
normalized_minmax = (data - min_val) / (max_val - min_val)
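
Both formulas divide by a per-column spread, which is zero for a constant column. One possible guard, shown here for standardization on made-up data, replaces a zero standard deviation with 1 so the column becomes all zeros instead of NaN. This is an illustrative choice, not the only reasonable one.

```python
import numpy as np

# Second column is constant, so its standard deviation is 0.
data = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])

mean = data.mean(axis=0)
std = data.std(axis=0)

# Replace zero spread with 1.0 to avoid 0/0.
safe_std = np.where(std == 0, 1.0, std)
standardized = (data - mean) / safe_std

print(standardized)
```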

Use Case 2: Image Processing

Manipulate images as NumPy arrays:

import numpy as np
from PIL import Image

# Load image as array
img = np.array(Image.open('photo.jpg'))
print(f"Shape: {img.shape}")  # (height, width, channels)

# Convert to grayscale
gray = np.mean(img, axis=2).astype(np.uint8)

# Crop image (top-left 100x100 pixels)
cropped = img[:100, :100]

# Flip image vertically
flipped = np.flipud(img)

# Rotate 90 degrees
rotated = np.rot90(img)

# Adjust brightness (widen the dtype first so uint8 values do not wrap around)
brighter = np.clip(img.astype(np.int16) + 50, 0, 255).astype(np.uint8)

# Apply threshold
threshold = 128
binary = np.where(gray > threshold, 255, 0).astype(np.uint8)

# Save result
Image.fromarray(binary).save('processed.jpg')

Use Case 3: Time Series Analysis

Calculate moving averages and statistics:

import numpy as np

# Sample time series data
prices = np.array([100, 102, 101, 105, 107, 106, 108, 110, 109, 111])

# Simple moving average (window size 3)
window = 3
moving_avg = np.convolve(prices, np.ones(window)/window, mode='valid')
print(f"Moving average: {moving_avg}")

# Calculate returns (percentage change)
returns = np.diff(prices) / prices[:-1] * 100
print(f"Returns: {returns}")

# Cumulative returns
cumulative_returns = np.cumprod(1 + returns/100) - 1
print(f"Cumulative returns: {cumulative_returns}")

# Volatility (rolling standard deviation)
def rolling_std(arr, window):
    return np.array([np.std(arr[i:i+window]) for i in range(len(arr)-window+1)])

volatility = rolling_std(returns, window=3)
print(f"Volatility: {volatility}")

# Detect outliers (values > 2 std from mean)
mean = np.mean(prices)
std = np.std(prices)
outliers = np.abs(prices - mean) > 2 * std
print(f"Outliers at indices: {np.where(outliers)[0]}")
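
The rolling statistics above can also be computed without a Python loop. The sketch below uses `sliding_window_view` (available in NumPy 1.20 and later) on the same price series and checks the result against the loop and convolution versions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

prices = np.array([100, 102, 101, 105, 107, 106, 108, 110, 109, 111], dtype=float)
window = 3

# Each row of `windows` is one length-3 window; no data is copied.
windows = sliding_window_view(prices, window_shape=window)   # shape (8, 3)
vectorized_mean = windows.mean(axis=1)
vectorized_std = windows.std(axis=1)

# Reference results from the loop and convolution approaches.
loop_std = np.array([np.std(prices[i:i + window])
                     for i in range(len(prices) - window + 1)])
conv_mean = np.convolve(prices, np.ones(window) / window, mode="valid")

print(np.allclose(vectorized_std, loop_std))    # True
print(np.allclose(vectorized_mean, conv_mean))  # True
```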

Use Case 4: Matrix Operations for Machine Learning

Implement basic neural network operations:

import numpy as np

# Initialize weights and biases
np.random.seed(42)
input_size, hidden_size, output_size = 4, 5, 3
W1 = np.random.randn(input_size, hidden_size) * 0.01
b1 = np.zeros((1, hidden_size))
W2 = np.random.randn(hidden_size, output_size) * 0.01
b2 = np.zeros((1, output_size))

# Sample input (batch of 10 samples)
X = np.random.randn(10, input_size)

# Forward pass
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Hidden layer
z1 = np.dot(X, W1) + b1
a1 = sigmoid(z1)

# Output layer
z2 = np.dot(a1, W2) + b2
a2 = softmax(z2)

print(f"Output shape: {a2.shape}")  # (10, 3)
print(f"Output probabilities sum to 1: {np.allclose(np.sum(a2, axis=1), 1)}")

# Calculate loss (cross-entropy)
y_true = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])  # True labels
y_one_hot = np.eye(output_size)[y_true]
loss = -np.mean(np.sum(y_one_hot * np.log(a2 + 1e-8), axis=1))
print(f"Loss: {loss}")
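
The one-hot multiplication in the loss picks out the predicted probability of the true class for each sample. The same value can be obtained by integer indexing, which avoids building the one-hot matrix. The probabilities below are made up for illustration.

```python
import numpy as np

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])

# One-hot formulation, as in the example above (without the epsilon term).
one_hot = np.eye(3)[labels]
loss_one_hot = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

# Equivalent: index the probability of the true class directly.
true_class_probs = probs[np.arange(len(labels)), labels]   # [0.7 0.8]
loss_indexed = -np.mean(np.log(true_class_probs))

print(np.isclose(loss_one_hot, loss_indexed))   # True
```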

Use Case 5: Statistical Analysis

Perform hypothesis testing and statistical calculations:

```python
import numpy as np

# Sample data: test scores from two groups
group_a = np.array([85, 88, 90, 92, 87, 89, 91, 86, 88, 90])
group_b = np.array([78, 82, 80, 85, 83, 81, 84, 79, 82, 80])

# Descriptive statistics
print(f"Group A - Mean: {np.mean(group_a):.2f}, Std: {np.std(group_a):.2f}")
print(f"Group B - Mean: {np.mean(group_b):.2f}, Std: {np.std(group_b):.2f}")

# T-test (manual calculation)
n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = np.mean(group_a), np.mean(group_b)
var_a, var_b = np.var(group_a, ddof=1), np.var(group_b, ddof=1)

# Pooled standard deviation
pooled_std = np.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
t_statistic = (mean_a - mean_b) / (pooled_std * np.sqrt(1/n_a + 1/n_b))
print(f"T-statistic: {t_statistic:.3f}")

# Correlation analysis
correlation = np.corrcoef(group_a, group_b)[0, 1]
print(f"Correlation: {correlation:.3f}")

# Bootstrap confidence interval
n_bootstrap = 10000
bootstrap_means = np.array([