NumPy Cheatsheet¶

Installation¶

Platform	Command
Ubuntu/Debian	`sudo apt update && sudo apt install python3-numpy`
Ubuntu/Debian (pip)	`pip install numpy`
macOS (pip)	`pip3 install numpy`
macOS (Homebrew)	`brew install python && pip3 install numpy`
Windows (pip)	`pip install numpy`
Anaconda (all platforms)	`conda install numpy`
Specific version	`pip install numpy==1.24.3`
MKL-backed BLAS (conda-forge)	`conda install -c conda-forge numpy "libblas=*=*mkl"`
Virtual environment (Linux/macOS)	`python -m venv myenv && source myenv/bin/activate && pip install numpy`
Verify installation	`python -c "import numpy as np; print(np.__version__)"`

Basic Commands¶

Array Creation¶

Command	Description
`np.array([1, 2, 3])`	Create 1D array from list
`np.array([[1, 2], [3, 4]])`	Create 2D array from nested list
`np.zeros((3, 4))`	Create 3×4 array filled with zeros
`np.ones((2, 3))`	Create 2×3 array filled with ones
`np.full((3, 3), 7)`	Create 3×3 array filled with value 7
`np.empty((2, 2))`	Create 2×2 uninitialized array (arbitrary contents; faster than `np.zeros`)
`np.arange(0, 10, 2)`	Create array [0, 2, 4, 6, 8] with step
`np.linspace(0, 1, 5)`	Create 5 evenly spaced values from 0 to 1
`np.eye(3)`	Create 3×3 identity matrix
`np.zeros_like(arr)`	Create zeros array with same shape as arr
`np.random.rand(3, 4)`	Create 3×4 array with random values [0,1)
`np.random.randint(0, 10, (3, 4))`	Create 3×4 array with random integers [0,10)
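A short sketch of how a few of these constructors behave; the array names are illustrative only.

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default.
steps = np.arange(0, 10, 2)
points = np.linspace(0, 1, 5)

# zeros_like copies both shape and dtype from the template array.
template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
blank = np.zeros_like(template)

print(steps)                     # [0 2 4 6 8]
print(points)                    # [0.   0.25 0.5  0.75 1.  ]
print(blank.shape, blank.dtype)  # (2, 3) int32
```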

Array Properties¶

Command	Description
`arr.shape`	Get dimensions of array (e.g., (3, 4))
`arr.ndim`	Get number of dimensions
`arr.size`	Get total number of elements
`arr.dtype`	Get data type of elements
`arr.itemsize`	Get size of each element in bytes
`arr.nbytes`	Get total bytes consumed by array
`arr.T`	Get transposed array
`len(arr)`	Get length of first dimension

Array Indexing & Slicing¶

Command	Description
`arr[0]`	Access first element
`arr[-1]`	Access last element
`arr[1:4]`	Slice elements at indices 1, 2, 3
`arr[::2]`	Get every other element
`arr[::-1]`	Reverse array
`arr[0, 1]`	Access element at row 0, column 1 (2D)
`arr[0:2, 1:3]`	Slice rows 0-1, columns 1-2 (2D)
`arr[arr > 5]`	Boolean indexing: get elements > 5
`arr[[0, 2, 4]]`	Fancy indexing: get elements at indices 0, 2, 4
`np.where(arr > 5)`	Get indices where condition is True
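Basic slices return views, while boolean and fancy indexing return copies. A minimal sketch of the difference, using an illustrative `arr`:

```python
import numpy as np

arr = np.arange(10)

window = arr[1:4]        # basic slice: a view onto arr
window[0] = 99           # writes through to arr

picked = arr[[0, 2, 4]]  # fancy indexing: an independent copy
picked[0] = -1           # arr is unaffected

large = arr[arr > 5]         # boolean mask: also a copy
rows = np.where(arr > 5)[0]  # np.where returns a tuple of index arrays

print(arr)    # [ 0 99  2  3  4  5  6  7  8  9]
print(large)  # [99  6  7  8  9]
print(rows)   # [1 6 7 8 9]
```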

Basic Mathematical Operations¶

Command	Description
`arr + 5`	Add scalar to all elements
`arr * 2`	Multiply all elements by scalar
`arr1 + arr2`	Element-wise addition of arrays
`arr1 * arr2`	Element-wise multiplication
`arr1 / arr2`	Element-wise division
`arr ** 2`	Square all elements
`np.sqrt(arr)`	Square root of all elements
`np.abs(arr)`	Absolute value of all elements
`np.sum(arr)`	Sum of all elements
`np.mean(arr)`	Mean of all elements
`np.std(arr)`	Standard deviation
`np.min(arr)`	Minimum value
`np.max(arr)`	Maximum value
`np.argmin(arr)`	Index of minimum value
`np.argmax(arr)`	Index of maximum value
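Most reductions accept an `axis` argument. The sketch below, on an illustrative 2×3 array `m`, shows how the axis changes the result and how a flat `argmax` index maps back to a row and column.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(m)               # 21: reduces over every element
col_sums = np.sum(m, axis=0)    # [5 7 9]: collapses the rows
row_means = np.mean(m, axis=1)  # [2. 5.]: collapses the columns

# argmax returns an index into the flattened array;
# unravel_index converts it back to (row, column).
flat_idx = np.argmax(m)
row, col = np.unravel_index(flat_idx, m.shape)
```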

Array Reshaping¶

Command	Description
`arr.reshape(3, 4)`	Reshape to 3×4 (must have same total elements)
`arr.flatten()`	Convert to 1D array (copy)
`arr.ravel()`	Convert to 1D array (returns a view when possible, otherwise a copy)
`arr.T`	Transpose array (swap dimensions)
`np.expand_dims(arr, axis=0)`	Add new dimension at specified axis
`np.squeeze(arr)`	Remove single-dimensional entries
`arr.resize((3, 4))`	Resize array in-place (can change size)
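Whether a reshaping call copies data depends on the memory layout. A small check with `np.shares_memory`, using an illustrative `grid`:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

flat_copy = grid.flatten()       # always a copy
flat_view = grid.ravel()         # a view here, because grid is contiguous
from_transpose = grid.T.ravel()  # transposed data is not C-contiguous, so a copy

print(np.shares_memory(grid, flat_copy))       # False
print(np.shares_memory(grid, flat_view))       # True
print(np.shares_memory(grid, from_transpose))  # False

# -1 asks reshape to infer that dimension from the total size.
print(grid.reshape(2, -1).shape)  # (2, 6)
```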

Array Concatenation & Splitting¶

Command	Description
`np.concatenate([arr1, arr2])`	Concatenate arrays along existing axis
`np.vstack([arr1, arr2])`	Stack arrays vertically (row-wise)
`np.hstack([arr1, arr2])`	Stack arrays horizontally (column-wise)
`np.column_stack([arr1, arr2])`	Stack 1D arrays as columns
`np.split(arr, 3)`	Split array into 3 equal parts (raises `ValueError` if not evenly divisible)
`np.vsplit(arr, 2)`	Split array vertically into 2 parts
`np.hsplit(arr, 3)`	Split array horizontally into 3 parts
`np.array_split(arr, 3)`	Split into 3 parts (allows unequal splits)
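The stacking helpers differ mainly in the shape they produce. A brief sketch with two illustrative 1D arrays, plus the difference between `np.split` and `np.array_split`:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.vstack([a, b]).shape)        # (2, 3)
print(np.hstack([a, b]).shape)        # (6,)
print(np.column_stack([a, b]).shape)  # (3, 2)

seq = np.arange(10)
parts = np.array_split(seq, 3)        # uneven parts are allowed
print([p.tolist() for p in parts])    # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

refused = False
try:
    np.split(seq, 3)                  # 10 elements cannot be split into 3 equal parts
except ValueError:
    refused = True
print(refused)                        # True
```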

Advanced Usage¶

Linear Algebra¶

Command	Description
`np.dot(a, b)`	Dot product of two arrays
`a @ b`	Matrix multiplication (Python 3.5+)
`np.matmul(a, b)`	Matrix multiplication (explicit)
`np.linalg.inv(matrix)`	Inverse of matrix
`np.linalg.det(matrix)`	Determinant of matrix
`np.linalg.eig(matrix)`	Eigenvalues and eigenvectors
`np.linalg.svd(matrix)`	Singular Value Decomposition
`np.linalg.solve(A, b)`	Solve linear system Ax = b
`np.linalg.norm(arr)`	Compute vector/matrix norm
`np.linalg.matrix_rank(matrix)`	Rank of matrix
`np.linalg.qr(matrix)`	QR decomposition
`np.linalg.cholesky(matrix)`	Cholesky decomposition
`np.trace(matrix)`	Sum of diagonal elements
`np.diag(arr)`	Extract diagonal or create diagonal matrix
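For a linear system, `np.linalg.solve` is generally preferred over forming the inverse explicitly. A minimal sketch on an illustrative 2×2 system:

```python
import numpy as np

# 3x + y = 9 and x + 2y = 8, so x = 2 and y = 3.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)
x_via_inv = np.linalg.inv(A) @ b  # same answer, but more work and less accurate in general

print(x)                          # [2. 3.]
print(np.allclose(A @ x, b))      # True
print(np.allclose(x, x_via_inv))  # True
```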

Statistical Functions¶

Command	Description
`np.median(arr)`	Median value
`np.var(arr)`	Variance
`np.percentile(arr, 75)`	75th percentile
`np.quantile(arr, 0.75)`	0.75 quantile (same as 75th percentile)
`np.corrcoef(arr1, arr2)`	Correlation coefficient matrix
`np.cov(arr1, arr2)`	Covariance matrix
`np.histogram(arr, bins=10)`	Compute histogram
`np.bincount(arr)`	Count occurrences of each non-negative integer
`np.average(arr, weights=w)`	Weighted average
`np.cumsum(arr)`	Cumulative sum
`np.cumprod(arr)`	Cumulative product
`np.diff(arr)`	Discrete difference between consecutive elements
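`np.std` and `np.var` compute population statistics by default; passing `ddof=1` gives the sample versions. A short sketch on illustrative data:

```python
import numpy as np

scores = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

print(np.percentile(scores, 75))  # 5.5
print(np.quantile(scores, 0.75))  # 5.5

population_std = np.std(scores)      # divides by n
sample_var = np.var(scores, ddof=1)  # divides by n - 1

print(population_std)  # 2.0
```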

Broadcasting & Advanced Indexing¶

Command	Description
`arr + np.array([1, 2, 3])`	Broadcasting: add array to each row
`np.broadcast_to(arr, (3, 3))`	Explicitly broadcast array to shape
`np.newaxis`	Add new axis for broadcasting (e.g., `arr[:, np.newaxis]`)
`np.take(arr, [0, 2, 4])`	Take elements at specified indices
`np.put(arr, [0, 2], [99, 88])`	Put values at specified indices
`np.select([cond1, cond2], [val1, val2])`	Choose values based on conditions
`np.choose([0, 1, 0], [arr1, arr2, arr3])`	Choose elements from multiple arrays
`np.compress(condition, arr)`	Select elements using boolean array
`np.extract(condition, arr)`	Extract elements satisfying condition
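Broadcasting aligns shapes from the trailing dimension, so adding a per-row vector needs an explicit new axis. A sketch with illustrative arrays, followed by a small `np.select` example:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)
row = np.array([10, 20, 30])   # shape (3,): matches the last axis
col = np.array([100, 200])     # shape (2,): needs a new axis to line up

print(matrix + row)
# [[10 21 32]
#  [13 24 35]]
print(matrix + col[:, np.newaxis])
# [[100 101 102]
#  [203 204 205]]

# np.select picks the value for the first condition that is True.
temps = np.array([-5, 12, 31])
codes = np.select([temps < 0, temps < 25], [0, 1], default=2)
print(codes)  # [0 1 2]
```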

Universal Functions (ufuncs)¶

Command	Description
`np.sin(arr)`, `np.cos(arr)`, `np.tan(arr)`	Trigonometric functions
`np.arcsin(arr)`, `np.arccos(arr)`, `np.arctan(arr)`	Inverse trigonometric functions
`np.exp(arr)`	Exponential (e^x)
`np.log(arr)`	Natural logarithm
`np.log10(arr)`, `np.log2(arr)`	Base-10 and base-2 logarithms
`np.power(arr, 3)`	Raise to power (element-wise)
`np.ceil(arr)`, `np.floor(arr)`	Round up/down to nearest integer
`np.round(arr, decimals=2)`	Round to specified decimals
`np.clip(arr, min, max)`	Clip values to range [min, max]
`np.sign(arr)`	Sign of elements (-1, 0, or 1)
`np.maximum(arr1, arr2)`	Element-wise maximum of two arrays
`np.minimum(arr1, arr2)`	Element-wise minimum of two arrays

Random Number Generation¶

Command	Description
`np.random.seed(42)`	Set random seed for reproducibility
`np.random.rand(3, 4)`	Random floats in [0, 1) with shape (3, 4)
`np.random.randn(3, 4)`	Random standard normal distribution
`np.random.randint(0, 10, (3, 4))`	Random integers in [0, 10)
`np.random.random((3, 4))`	Random floats in [0, 1)
`np.random.choice(arr, size=5)`	Random sample from array
`np.random.shuffle(arr)`	Shuffle array in-place
`np.random.permutation(arr)`	Random permutation (returns copy)
`np.random.uniform(0, 10, size=100)`	Uniform distribution [0, 10)
`np.random.normal(0, 1, size=100)`	Normal distribution (mean=0, std=1)
`np.random.exponential(2.0, size=100)`	Exponential distribution
`np.random.binomial(10, 0.5, size=100)`	Binomial distribution
`rng = np.random.default_rng(42)`	Create new random generator (modern API)
`rng.random((3, 4))`	Generate random floats with new API
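Most legacy `np.random.*` functions have a counterpart on the `Generator` object. A minimal sketch of the common ones (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

floats = rng.random((3, 4))                    # like np.random.rand / random
ints = rng.integers(0, 10, size=(3, 4))        # like np.random.randint, range [0, 10)
normal = rng.normal(loc=0.0, scale=1.0, size=100)
sample = rng.choice(np.arange(10), size=5, replace=False)

deck = np.arange(5)
rng.shuffle(deck)                              # in-place, like np.random.shuffle

# The same seed reproduces the same stream.
same = np.array_equal(np.random.default_rng(42).random((3, 4)), floats)
print(same)  # True
```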

Memory & Performance¶

Command	Description
`arr.copy()`	Create an independent copy of the array data (elements of object arrays are not deep-copied)
`arr.view()`	Create view (shares memory with original)
`arr.astype(np.float32)`	Convert array to different data type
`np.ascontiguousarray(arr)`	Return contiguous array in memory
`arr.flags`	Get memory layout information
`np.shares_memory(arr1, arr2)`	Check if arrays share memory
`np.may_share_memory(arr1, arr2)`	Check if arrays might share memory
`arr.nbytes`	Get total bytes consumed
`sys.getsizeof(arr)`	Get size including overhead (import sys)
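A quick sketch of how dtype and layout show up in these attributes, using an illustrative array:

```python
import numpy as np

arr = np.zeros((100, 200))      # float64 by default
half = arr.astype(np.float32)   # astype returns a new array

print(arr.nbytes, half.nbytes)  # 160000 80000

alias = arr.view()
print(alias.base is arr)        # True: the view borrows arr's buffer

transposed = arr.T              # a view with swapped strides
print(transposed.flags["C_CONTIGUOUS"])                        # False
print(np.ascontiguousarray(transposed).flags["C_CONTIGUOUS"])  # True
```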

Configuration¶

Data Types¶

NumPy supports various data types for memory optimization:

# Integer types
np.int8      # -128 to 127
np.int16     # -32,768 to 32,767
np.int32     # -2^31 to 2^31-1
np.int64     # -2^63 to 2^63-1
np.uint8     # 0 to 255
np.uint16    # 0 to 65,535

# Float types
np.float16   # Half precision
np.float32   # Single precision
np.float64   # Double precision (default)

# Complex types
np.complex64  # Two 32-bit floats
np.complex128 # Two 64-bit floats

# Boolean
np.bool_     # True or False

# String types
np.str_      # Unicode string
np.bytes_    # Byte string

# Create array with specific dtype
arr = np.array([1, 2, 3], dtype=np.float32)
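Narrow integer types save memory but wrap around silently in array arithmetic. A small sketch using `np.iinfo` to inspect limits and an illustrative `pixels` array to show the wraparound:

```python
import numpy as np

print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)  # -128 127
print(np.iinfo(np.uint16).max)                       # 65535

pixels = np.array([200, 250], dtype=np.uint8)
wrapped = pixels + np.uint8(100)         # stays uint8, wraps modulo 256
widened = pixels.astype(np.int16) + 100  # wider type has room for the true sums

print(wrapped)  # [44 94]
print(widened)  # [300 350]
```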

Print Options¶

Configure how arrays are displayed:

# Set print options
np.set_printoptions(
    precision=3,        # Decimal places
    suppress=True,      # Suppress scientific notation
    threshold=1000,     # Max elements before summarizing
    edgeitems=3,        # Items at start/end when summarizing
    linewidth=120       # Characters per line
)

# Example: suppress scientific notation
np.set_printoptions(suppress=True)
print(np.array([1e-10, 1e10]))  # [0. 10000000000.]

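# Reset to defaults (calling set_printoptions() with no arguments changes nothing)
np.set_printoptions(precision=8, suppress=False, threshold=1000,
                    edgeitems=3, linewidth=75)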
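For temporary changes, the `np.printoptions` context manager restores the previous settings on exit. A minimal sketch with illustrative values:

```python
import numpy as np

values = np.array([1.23456789, 1e-10])
before = np.get_printoptions()

# Options apply only inside the block.
with np.printoptions(precision=3, suppress=True):
    inside = np.array2string(values)

after = np.get_printoptions()
print(inside)
print(before == after)  # True: settings were restored
```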

Error Handling¶

Configure how NumPy handles numerical errors:

# Set error handling
np.seterr(
    divide='warn',    # Division by zero: 'ignore', 'warn', 'raise', 'call'
    over='warn',      # Overflow: 'ignore', 'warn', 'raise', 'call'
    under='ignore',   # Underflow
    invalid='warn'    # Invalid operation (e.g., sqrt(-1))
)

# Context manager for temporary settings
with np.errstate(divide='ignore'):
    result = np.array([1, 2]) / 0  # No warning in this block

# Check for errors after operations
np.seterr(all='ignore')
result = np.array([1]) / 0
if np.isinf(result).any():
    print("Infinity detected")
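Two patterns that often come up alongside these settings, sketched with illustrative arrays: turning floating-point problems into exceptions for a single block, and dividing safely with the `where` argument.

```python
import numpy as np

values = np.array([1.0, 0.0, -1.0])
caught = None
try:
    # log(0) is a divide error and log(-1) an invalid operation; both now raise.
    with np.errstate(divide="raise", invalid="raise"):
        np.log(values)
except FloatingPointError as exc:
    caught = exc

# Divide only where the denominator is non-zero; other slots keep the `out` value.
num = np.array([1.0, 2.0, 3.0])
den = np.array([2.0, 0.0, 4.0])
ratio = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
```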

Random Number Generator Configuration¶

# Legacy API (older code)
np.random.seed(42)

# Modern API (recommended)
rng = np.random.default_rng(seed=42)

# Use different algorithms
from numpy.random import PCG64, Philox, MT19937
rng = np.random.Generator(PCG64(seed=42))  # Default bit generator
rng = np.random.Generator(Philox(seed=42))  # Parallel streams
rng = np.random.Generator(MT19937(seed=42)) # Legacy compatibility
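When several workers each need their own stream, one option is to spawn child seeds from a single `SeedSequence`. A sketch, with the number of streams chosen arbitrarily:

```python
import numpy as np
from numpy.random import SeedSequence

# Derive three independent child seeds from one root seed.
children = SeedSequence(42).spawn(3)
streams = [np.random.default_rng(child) for child in children]
draws = [stream.random(4) for stream in streams]

# Spawning from the same root seed again reproduces the same child streams.
replay = np.random.default_rng(SeedSequence(42).spawn(3)[0]).random(4)
print(np.array_equal(draws[0], replay))  # True
```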

Common Use Cases¶

Use Case 1: Data Normalization¶

Normalize data to zero mean and unit variance (standardization):

import numpy as np

# Sample data
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

# Standardize: (x - mean) / std
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)
normalized = (data - mean) / std

print(normalized)
# [[-1.22474487 -1.22474487 -1.22474487]
#  [ 0.          0.          0.        ]
#  [ 1.22474487  1.22474487  1.22474487]]

# Min-Max normalization: (x - min) / (max - min)
min_val = np.min(data, axis=0)
max_val = np.max(data, axis=0)
normalized_minmax = (data - min_val) / (max_val - min_val)
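Both formulas divide by zero when a column is constant. One possible guard, sketched on illustrative data with a constant middle column, replaces zero standard deviations with 1 before dividing:

```python
import numpy as np

data = np.array([[1.0, 5.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 5.0, 9.0]])

mean = data.mean(axis=0)
std = data.std(axis=0)

# A constant column has std 0; dividing by 1 instead leaves it at exactly 0.
safe_std = np.where(std == 0, 1.0, std)
standardized = (data - mean) / safe_std

print(np.isfinite(standardized).all())  # True
```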

Use Case 2: Image Processing¶

Manipulate images as NumPy arrays:

import numpy as np
from PIL import Image

# Load image as array
img = np.array(Image.open('photo.jpg'))
print(f"Shape: {img.shape}")  # (height, width, channels)

# Convert to grayscale
gray = np.mean(img, axis=2).astype(np.uint8)

# Crop image (top-left 100x100 pixels)
cropped = img[:100, :100]

# Flip image vertically
flipped = np.flipud(img)

# Rotate 90 degrees
rotated = np.rot90(img)

# Adjust brightness (widen the dtype first: uint8 addition would wrap around past 255)
brighter = np.clip(img.astype(np.int16) + 50, 0, 255).astype(np.uint8)

# Apply threshold
threshold = 128
binary = np.where(gray > threshold, 255, 0).astype(np.uint8)

# Save result
Image.fromarray(binary).save('processed.jpg')
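The same operations can be tried without an image file. The sketch below builds a small synthetic RGB array and, as a swapped-in alternative to the plain channel mean, converts to grayscale with the commonly used BT.601 luma weights. The image size and seed are arbitrary.

```python
import numpy as np

# Synthetic 4x6 RGB image standing in for a loaded photo.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# Weighted sum over the channel axis (R, G, B).
weights = np.array([0.299, 0.587, 0.114])
gray = np.round(img @ weights).astype(np.uint8)

# Threshold to a black-and-white mask.
binary = np.where(gray > 128, 255, 0).astype(np.uint8)

print(gray.shape, gray.dtype)  # (4, 6) uint8
```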

Use Case 3: Time Series Analysis¶

Calculate moving averages and statistics:

import numpy as np

# Sample time series data
prices = np.array([100, 102, 101, 105, 107, 106, 108, 110, 109, 111])

# Simple moving average (window size 3)
window = 3
moving_avg = np.convolve(prices, np.ones(window)/window, mode='valid')
print(f"Moving average: {moving_avg}")

# Calculate returns (percentage change)
returns = np.diff(prices) / prices[:-1] * 100
print(f"Returns: {returns}")

# Cumulative returns
cumulative_returns = np.cumprod(1 + returns/100) - 1
print(f"Cumulative returns: {cumulative_returns}")

# Volatility (rolling standard deviation)
def rolling_std(arr, window):
    return np.array([np.std(arr[i:i+window]) for i in range(len(arr)-window+1)])

volatility = rolling_std(returns, window=3)
print(f"Volatility: {volatility}")

# Detect outliers (values > 2 std from mean)
mean = np.mean(prices)
std = np.std(prices)
outliers = np.abs(prices - mean) > 2 * std
print(f"Outliers at indices: {np.where(outliers)[0]}")
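A vectorized alternative to the list comprehension is `sliding_window_view` (available in NumPy 1.20 and later). The sketch below applies it to the same prices:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

prices = np.array([100, 102, 101, 105, 107, 106, 108, 110, 109, 111], dtype=float)
window = 3

# Each row is one window of consecutive prices; no data is copied.
windows = sliding_window_view(prices, window)
rolling_mean = windows.mean(axis=1)
rolling_std = windows.std(axis=1)

print(windows.shape)  # (8, 3)
```

The rolling mean computed this way matches the `np.convolve` result with `mode='valid'`.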

Use Case 4: Matrix Operations for Machine Learning¶

Implement basic neural network operations:

import numpy as np

# Initialize weights and biases
np.random.seed(42)
input_size, hidden_size, output_size = 4, 5, 3
W1 = np.random.randn(input_size, hidden_size) * 0.01
b1 = np.zeros((1, hidden_size))
W2 = np.random.randn(hidden_size, output_size) * 0.01
b2 = np.zeros((1, output_size))

# Sample input (batch of 10 samples)
X = np.random.randn(10, input_size)

# Forward pass
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Hidden layer
z1 = np.dot(X, W1) + b1
a1 = sigmoid(z1)

# Output layer
z2 = np.dot(a1, W2) + b2
a2 = softmax(z2)

print(f"Output shape: {a2.shape}")  # (10, 3)
print(f"Output probabilities sum to 1: {np.allclose(np.sum(a2, axis=1), 1)}")

# Calculate loss (cross-entropy)
y_true = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])  # True labels
y_one_hot = np.eye(output_size)[y_true]
loss = -np.mean(np.sum(y_one_hot * np.log(a2 + 1e-8), axis=1))
print(f"Loss: {loss}")
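The one-hot multiplication can also be written with integer indexing, which picks out each sample's true-class probability directly. A sketch with hypothetical probabilities (not the network output above):

```python
import numpy as np

# Hypothetical predicted probabilities for 3 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])
y_true = np.array([0, 1, 2])

# Select the probability assigned to each sample's true class.
picked = probs[np.arange(len(y_true)), y_true]
loss_indexed = -np.mean(np.log(picked + 1e-8))

# Same quantity via the one-hot formulation.
one_hot = np.eye(probs.shape[1])[y_true]
loss_one_hot = -np.mean(np.sum(one_hot * np.log(probs + 1e-8), axis=1))

print(np.isclose(loss_indexed, loss_one_hot))  # True
```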

Use Case 5: Statistical Analysis¶

Perform hypothesis testing and statistical calculations:

```python
import numpy as np

# Sample data: test scores from two groups
group_a = np.array([85, 88, 90, 92, 87, 89, 91, 86, 88, 90])
group_b = np.array([78, 82, 80, 85, 83, 81, 84, 79, 82, 80])

# Descriptive statistics
print(f"Group A - Mean: {np.mean(group_a):.2f}, Std: {np.std(group_a):.2f}")
print(f"Group B - Mean: {np.mean(group_b):.2f}, Std: {np.std(group_b):.2f}")

# T-test (manual calculation)
n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = np.mean(group_a), np.mean(group_b)
var_a, var_b = np.var(group_a, ddof=1), np.var(group_b, ddof=1)

# Pooled standard deviation
pooled_std = np.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
t_statistic = (mean_a - mean_b) / (pooled_std * np.sqrt(1/n_a + 1/n_b))
print(f"T-statistic: {t_statistic:.3f}")

# Correlation analysis
correlation = np.corrcoef(group_a, group_b)[0, 1]
print(f"Correlation: {correlation:.3f}")

# Bootstrap confidence interval
n_bootstrap = 10000
bootstrap_means = np.array([