NumPy Cheatsheet
Installation
| Platform | Command |
|---|---|
| Ubuntu/Debian | sudo apt update && sudo apt install python3-numpy |
| Ubuntu/Debian (pip) | pip install numpy |
| macOS (pip) | pip3 install numpy |
| macOS (Homebrew) | brew install python && pip3 install numpy |
| Windows (pip) | pip install numpy |
| Anaconda (all platforms) | conda install numpy |
| Specific version | pip install numpy==1.24.3 |
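| MKL-linked build | conda install numpy (Anaconda's default channel typically ships MKL-linked builds; NumPy's PyPI package defines no mkl extra) |
| Virtual environment (Linux/macOS) | python -m venv myenv && source myenv/bin/activate && pip install numpy |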
| Verify installation | python -c "import numpy as np; print(np.__version__)" |
Basic Commands
Array Creation
| Command | Description |
|---|---|
| `np.array([1, 2, 3])` | Create 1D array from list |
| `np.array([[1, 2], [3, 4]])` | Create 2D array from nested list |
| `np.zeros((3, 4))` | Create 3×4 array filled with zeros |
| `np.ones((2, 3))` | Create 2×3 array filled with ones |
| `np.full((3, 3), 7)` | Create 3×3 array filled with value 7 |
| `np.empty((2, 2))` | Create 2×2 uninitialized array (arbitrary contents; skips the fill step, so faster) |
| `np.arange(0, 10, 2)` | Create array [0, 2, 4, 6, 8] with step 2 (stop is excluded) |
| `np.linspace(0, 1, 5)` | Create 5 evenly spaced values from 0 to 1 (both endpoints included) |
| `np.eye(3)` | Create 3×3 identity matrix |
| `np.zeros_like(arr)` | Create zeros array with same shape and dtype as arr |
| `np.random.rand(3, 4)` | Create 3×4 array with random values in [0, 1) |
| `np.random.randint(0, 10, (3, 4))` | Create 3×4 array with random integers in [0, 10) |
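As a quick check of the creation routines in the table above, the following sketch builds a few arrays and prints their values and shapes. The variable names are illustrative only.

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default
evens = np.arange(0, 10, 2)
grid = np.linspace(0, 1, 5)

# Shape arguments are passed as a tuple
zeros = np.zeros((3, 4))
sevens = np.full((3, 3), 7)

# zeros_like copies shape and dtype from an existing array
template = np.array([[1, 2], [3, 4]])
blank = np.zeros_like(template)

print(evens)        # [0 2 4 6 8]
print(grid)         # [0.   0.25 0.5  0.75 1.  ]
print(zeros.shape)  # (3, 4)
print(blank.dtype == template.dtype)  # True
```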
Array Properties
| Command | Description |
|---|---|
| `arr.shape` | Get dimensions of array (e.g., (3, 4)) |
| `arr.ndim` | Get number of dimensions |
| `arr.size` | Get total number of elements |
| `arr.dtype` | Get data type of elements |
| `arr.itemsize` | Get size of each element in bytes |
| `arr.nbytes` | Get total bytes consumed by array |
| `arr.T` | Get transposed array |
| `len(arr)` | Get length of first dimension |
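To see how these attributes relate to one another, here is a small sketch on a 3×4 `float32` array; the array itself is an arbitrary example.

```python
import numpy as np

arr = np.zeros((3, 4), dtype=np.float32)

print(arr.shape)     # (3, 4)
print(arr.ndim)      # 2
print(arr.size)      # 12
print(arr.itemsize)  # 4 (bytes per float32 element)
print(arr.nbytes)    # 48, i.e. size * itemsize
print(arr.T.shape)   # (4, 3)
print(len(arr))      # 3, the length of the first axis
```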
Array Indexing & Slicing
| Command | Description |
|---|---|
| `arr[0]` | Access first element |
| `arr[-1]` | Access last element |
| `arr[1:4]` | Slice elements at indices 1, 2, 3 |
| `arr[::2]` | Get every other element |
| `arr[::-1]` | Reverse array |
| `arr[0, 1]` | Access element at row 0, column 1 (2D) |
| `arr[0:2, 1:3]` | Slice rows 0-1, columns 1-2 (2D) |
| `arr[arr > 5]` | Boolean indexing: get elements > 5 |
| `arr[[0, 2, 4]]` | Fancy indexing: get elements at indices 0, 2, 4 |
| `np.where(arr > 5)` | Get indices where condition is True |
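One detail the table does not show is which indexing forms return views and which return copies. The sketch below illustrates the difference on a small example array; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(10)          # [0 1 2 ... 9]

# Basic slices are views: writing through them changes the original
window = arr[1:4]
window[0] = 100
print(arr[1])                # 100

# Boolean and fancy indexing return copies
picked = arr[[0, 2, 4]]
picked[0] = -1
print(arr[0])                # 0 (unchanged)

# np.where with only a condition returns a tuple of index arrays
idx = np.where(arr > 5)
print(idx[0])                # [1 6 7 8 9]
```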
Basic Mathematical Operations
| Command | Description |
|---|---|
| `arr + 5` | Add scalar to all elements |
| `arr * 2` | Multiply all elements by scalar |
| `arr1 + arr2` | Element-wise addition of arrays |
| `arr1 * arr2` | Element-wise multiplication |
| `arr1 / arr2` | Element-wise division |
| `arr ** 2` | Square all elements |
| `np.sqrt(arr)` | Square root of all elements |
| `np.abs(arr)` | Absolute value of all elements |
| `np.sum(arr)` | Sum of all elements |
| `np.mean(arr)` | Mean of all elements |
| `np.std(arr)` | Standard deviation (population form; pass `ddof=1` for the sample form) |
| `np.min(arr)` | Minimum value |
| `np.max(arr)` | Maximum value |
| `np.argmin(arr)` | Index of minimum value (into the flattened array) |
| `np.argmax(arr)` | Index of maximum value (into the flattened array) |
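The reductions above all accept an `axis` argument, which the table leaves out. The following sketch shows the effect on a small 2D array and how a flattened `argmax` index can be mapped back to row and column positions; `m`, `flat_index`, and `row_col` are illustrative names.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.sum(m))             # 21 (all elements)
print(np.sum(m, axis=0))     # [5 7 9] (collapse rows, one value per column)
print(np.mean(m, axis=1))    # [2. 5.] (collapse columns, one value per row)

# argmax on a multi-dimensional array indexes the flattened array
flat_index = np.argmax(m)
print(flat_index)            # 5

# unravel_index converts the flat index back to (row, column)
row_col = np.unravel_index(flat_index, m.shape)
print(row_col)
```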
Array Reshaping
| Command | Description |
|---|---|
| `arr.reshape(3, 4)` | Reshape to 3×4 (must have same total elements) |
| `arr.flatten()` | Convert to 1D array (always a copy) |
| `arr.ravel()` | Convert to 1D array (a view when possible, otherwise a copy) |
| `arr.T` | Transpose array (reverse the order of axes) |
| `np.expand_dims(arr, axis=0)` | Add new dimension at specified axis |
| `np.squeeze(arr)` | Remove axes of length 1 |
| `arr.resize((3, 4))` | Resize array in-place (can change total size) |
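The view-versus-copy behavior of `ravel` and `flatten` can be checked with `np.shares_memory`. This is a minimal sketch on an example array of 12 elements; the names are illustrative.

```python
import numpy as np

arr = np.arange(12)

# -1 lets NumPy infer one dimension from the total size
grid = arr.reshape(3, -1)
print(grid.shape)                      # (3, 4)

# ravel returns a view when the memory layout allows it; flatten always copies
print(np.shares_memory(grid, grid.ravel()))    # True
print(np.shares_memory(grid, grid.flatten()))  # False

# Raveling a transposed (non-contiguous) array forces a copy
print(np.shares_memory(grid, grid.T.ravel()))  # False

# expand_dims and squeeze add or remove length-1 axes
batched = np.expand_dims(grid, axis=0)
print(batched.shape)                   # (1, 3, 4)
print(np.squeeze(batched).shape)       # (3, 4)
```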
Array Concatenation & Splitting
| Command | Description |
|---|---|
| `np.concatenate([arr1, arr2])` | Concatenate arrays along existing axis |
| `np.vstack([arr1, arr2])` | Stack arrays vertically (row-wise) |
| `np.hstack([arr1, arr2])` | Stack arrays horizontally (column-wise) |
| `np.column_stack([arr1, arr2])` | Stack 1D arrays as columns |
| `np.split(arr, 3)` | Split array into 3 equal parts |
| `np.vsplit(arr, 2)` | Split array vertically into 2 parts |
| `np.hsplit(arr, 3)` | Split array horizontally into 3 parts |
| `np.array_split(arr, 3)` | Split into 3 parts (allows unequal splits) |
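The sketch below illustrates the stacking shapes and the difference between `np.split` and `np.array_split` when the length does not divide evenly. The arrays `a` and `b` are arbitrary examples.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(np.vstack([a, b]).shape)               # (4, 2)
print(np.hstack([a, b]).shape)               # (2, 4)
print(np.concatenate([a, b], axis=1).shape)  # (2, 4), same as hstack for 2D

# np.split needs an even division; np.array_split tolerates a remainder
parts = np.array_split(np.arange(10), 3)
print([p.tolist() for p in parts])   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

split_failed = False
try:
    np.split(np.arange(10), 3)
except ValueError as exc:
    split_failed = True
    print("np.split failed:", exc)
```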
Advanced Usage
Linear Algebra
| Command | Description |
|---|---|
| `np.dot(a, b)` | Dot product of two arrays |
| `a @ b` | Matrix multiplication (Python 3.5+) |
| `np.matmul(a, b)` | Matrix multiplication (explicit) |
| `np.linalg.inv(matrix)` | Inverse of matrix |
| `np.linalg.det(matrix)` | Determinant of matrix |
| `np.linalg.eig(matrix)` | Eigenvalues and eigenvectors |
| `np.linalg.svd(matrix)` | Singular Value Decomposition |
| `np.linalg.solve(A, b)` | Solve linear system Ax = b |
| `np.linalg.norm(arr)` | Compute vector/matrix norm |
| `np.linalg.matrix_rank(matrix)` | Rank of matrix |
| `np.linalg.qr(matrix)` | QR decomposition |
| `np.linalg.cholesky(matrix)` | Cholesky decomposition |
| `np.trace(matrix)` | Sum of diagonal elements |
| `np.diag(arr)` | Extract diagonal or create diagonal matrix |
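As an illustration of a few of these routines together, the following sketch solves a small 2×2 system and verifies the result. The matrix `A` and vector `b` are made-up example values.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Prefer solve over inv(A) @ b: it avoids forming the inverse explicitly
x = np.linalg.solve(A, b)
print(x)                          # [2. 3.]
print(np.allclose(A @ x, b))      # True

# For 2D arrays, @, np.matmul and np.dot agree
print(np.allclose(A @ A, np.dot(A, A)))   # True

print(np.linalg.det(A))           # approximately 5.0
print(np.linalg.matrix_rank(A))   # 2
```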
Statistical Functions
| Command | Description |
|---|---|
| `np.median(arr)` | Median value |
| `np.var(arr)` | Variance |
| `np.percentile(arr, 75)` | 75th percentile |
| `np.quantile(arr, 0.75)` | 0.75 quantile (same as 75th percentile) |
| `np.corrcoef(arr1, arr2)` | Correlation coefficient matrix |
| `np.cov(arr1, arr2)` | Covariance matrix |
| `np.histogram(arr, bins=10)` | Compute histogram |
| `np.bincount(arr)` | Count occurrences of each non-negative integer |
| `np.average(arr, weights=w)` | Weighted average |
| `np.cumsum(arr)` | Cumulative sum |
| `np.cumprod(arr)` | Cumulative product |
| `np.diff(arr)` | Discrete difference between consecutive elements |
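A short worked example may help when comparing these functions. The sketch below uses a small made-up sample and also shows the `ddof` parameter, which switches `np.std` and `np.var` from the population form (the default) to the sample form.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

print(np.mean(data))            # 5.0
print(np.std(data))             # 2.0 (population, ddof=0 is the default)
print(np.std(data, ddof=1))     # sample standard deviation, about 2.138
print(np.median(data))          # 4.5
print(np.cumsum(data))          # [ 2  6 10 14 19 24 31 40]
print(np.diff(data))            # [2 0 0 1 0 2 2]
print(np.bincount(data))        # [0 0 1 0 3 2 0 1 0 1]
```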
Broadcasting & Advanced Indexing
| Command | Description |
|---|---|
| `arr + np.array([1, 2, 3])` | Broadcasting: add array to each row |
| `np.broadcast_to(arr, (3, 3))` | Explicitly broadcast array to shape |
| `np.newaxis` | Add new axis for broadcasting (e.g., `arr[:, np.newaxis]`) |
| `np.take(arr, [0, 2, 4])` | Take elements at specified indices |
| `np.put(arr, [0, 2], [99, 88])` | Put values at specified indices (in place) |
| `np.select([cond1, cond2], [val1, val2])` | Choose values based on conditions |
| `np.choose([0, 1, 0], [arr1, arr2, arr3])` | Choose elements from multiple arrays |
| `np.compress(condition, arr)` | Select elements using boolean array |
| `np.extract(condition, arr)` | Extract elements satisfying condition |
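Broadcasting aligns shapes starting from the trailing axis, which is why a row vector adds cleanly to a matrix while a column vector needs an extra axis. The following sketch shows both cases plus `np.select`; all arrays are illustrative examples.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)     # shape (2, 3)
row = np.array([10, 20, 30])            # shape (3,)
col = np.array([100, 200])              # shape (2,)

# Shapes are compared from the trailing axis: (2, 3) with (3,) works
print(matrix + row)
# [[10 21 32]
#  [13 24 35]]

# (2, 3) with (2,) does not line up; add an axis to make it (2, 1)
print(matrix + col[:, np.newaxis])
# [[100 101 102]
#  [203 204 205]]

# np.select picks the value of the first matching condition
x = np.array([-2, 0, 3])
labels = np.select([x < 0, x > 0], [-1, 1], default=0)
print(labels)                            # [-1  0  1]
```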
Universal Functions (ufuncs)
| Command | Description |
|---|---|
| `np.sin(arr)`, `np.cos(arr)`, `np.tan(arr)` | Trigonometric functions (input in radians) |
| `np.arcsin(arr)`, `np.arccos(arr)`, `np.arctan(arr)` | Inverse trigonometric functions |
| `np.exp(arr)` | Exponential (e^x) |
| `np.log(arr)` | Natural logarithm |
| `np.log10(arr)`, `np.log2(arr)` | Base-10 and base-2 logarithms |
| `np.power(arr, 3)` | Raise to power (element-wise) |
| `np.ceil(arr)`, `np.floor(arr)` | Round up/down to nearest integer |
| `np.round(arr, decimals=2)` | Round to specified decimals |
| `np.clip(arr, a_min, a_max)` | Clip values to range [a_min, a_max] |
| `np.sign(arr)` | Sign of elements (-1, 0, or 1) |
| `np.maximum(arr1, arr2)` | Element-wise maximum of two arrays |
| `np.minimum(arr1, arr2)` | Element-wise minimum of two arrays |
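The sketch below applies several of these functions to a small example array and contrasts `np.maximum` (element-wise, two inputs) with `np.max` (reduction over one array). It also shows the `out` argument that many ufuncs accept; `x` and `result` are illustrative names.

```python
import numpy as np

x = np.array([-1.5, 0.0, 2.4])

print(np.abs(x))                 # [1.5 0.  2.4]
print(np.sign(x))                # [-1.  0.  1.]
print(np.ceil(x), np.floor(x))   # [-1.  0.  3.] [-2.  0.  2.]
print(np.clip(x, -1.0, 1.0))     # [-1.  0.  1.]

# np.maximum compares element-wise; np.max reduces a single array
print(np.maximum(x, 0.0))        # [0.  0.  2.4]
print(np.max(x))                 # 2.4

# Many ufuncs accept an `out` array to avoid allocating a new result
result = np.empty_like(x)
np.multiply(x, 2.0, out=result)
print(result)                    # [-3.   0.   4.8]
```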
Random Number Generation
| Command | Description |
|---|---|
| `np.random.seed(42)` | Set random seed for reproducibility |
| `np.random.rand(3, 4)` | Random floats in [0, 1) with shape (3, 4) |
| `np.random.randn(3, 4)` | Random standard normal distribution |
| `np.random.randint(0, 10, (3, 4))` | Random integers in [0, 10) |
| `np.random.random((3, 4))` | Random floats in [0, 1) |
| `np.random.choice(arr, size=5)` | Random sample from array |
| `np.random.shuffle(arr)` | Shuffle array in-place |
| `np.random.permutation(arr)` | Random permutation (returns copy) |
| `np.random.uniform(0, 10, size=100)` | Uniform distribution [0, 10) |
| `np.random.normal(0, 1, size=100)` | Normal distribution (mean=0, std=1) |
| `np.random.exponential(2.0, size=100)` | Exponential distribution |
| `np.random.binomial(10, 0.5, size=100)` | Binomial distribution |
| `rng = np.random.default_rng(42)` | Create new random generator (modern API) |
| `rng.random((3, 4))` | Generate random floats with new API |
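Most rows in this table use the legacy `np.random.*` functions. As a rough guide to the modern `Generator` equivalents, the sketch below maps a few of them; the seeds and variable names are arbitrary, and exact drawn values are not shown because they depend on the NumPy version.

```python
import numpy as np

# Two generators built from the same seed produce identical streams
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
print(np.array_equal(rng_a.random(3), rng_b.random(3)))   # True

rng = np.random.default_rng(0)

# Generator equivalents of common legacy calls
ints = rng.integers(0, 10, size=(3, 4))      # like np.random.randint
normal = rng.normal(0, 1, size=100)          # like np.random.normal
sample = rng.choice(np.arange(10), size=5, replace=False)

# permutation returns a shuffled copy; shuffle works in place
deck = np.arange(5)
shuffled = rng.permutation(deck)
print(sorted(shuffled.tolist()))             # [0, 1, 2, 3, 4]
print(deck)                                  # [0 1 2 3 4] (unchanged)
```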
Memory & Performance
| Command | Description |
|---|---|
| `arr.copy()` | Create a copy with its own data buffer |
| `arr.view()` | Create view (shares memory with original) |
| `arr.astype(np.float32)` | Convert array to different data type (returns a new array) |
| `np.ascontiguousarray(arr)` | Return contiguous array in memory |
| `arr.flags` | Get memory layout information |
| `np.shares_memory(arr1, arr2)` | Check if arrays share memory (exact, can be slow) |
| `np.may_share_memory(arr1, arr2)` | Check if arrays might share memory (fast bounds check, may give false positives) |
| `arr.nbytes` | Get total bytes consumed |
| `sys.getsizeof(arr)` | Get size including overhead (import sys) |
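The difference between a view and a copy is easiest to see by writing through one and checking the other. This sketch assumes a small `int64` example array; the names are illustrative.

```python
import numpy as np

base = np.arange(6, dtype=np.int64)

view = base.view()       # new array object, same underlying buffer
copy = base.copy()       # independent buffer

view[0] = 99
print(base[0])                           # 99
print(np.shares_memory(base, view))      # True
print(np.shares_memory(base, copy))      # False

# astype allocates a new array by default; here it halves memory use
small = base.astype(np.int32)
print(base.nbytes, small.nbytes)         # 48 24

# A strided slice is a view but not C-contiguous
strided = base[::2]
print(strided.flags['C_CONTIGUOUS'])                         # False
print(np.ascontiguousarray(strided).flags['C_CONTIGUOUS'])   # True
```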
Configuration
Data Types
NumPy supports various data types for memory optimization:
```python
import numpy as np

# Integer types
np.int8        # -128 to 127
np.int16       # -32,768 to 32,767
np.int32       # -2^31 to 2^31-1
np.int64       # -2^63 to 2^63-1
np.uint8       # 0 to 255
np.uint16      # 0 to 65,535

# Float types
np.float16     # Half precision
np.float32     # Single precision
np.float64     # Double precision (default)

# Complex types
np.complex64   # Two 32-bit floats
np.complex128  # Two 64-bit floats

# Boolean
np.bool_       # True or False

# String types
np.str_        # Unicode string
np.bytes_      # Byte string

# Create array with specific dtype
arr = np.array([1, 2, 3], dtype=np.float32)
```
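The ranges listed above can be queried at runtime with `np.iinfo` and `np.finfo`. The sketch below also shows why the range matters: fixed-width integer arrays wrap around silently when a result does not fit. The `pixels` array is a made-up example.

```python
import numpy as np

# iinfo and finfo report the limits of integer and float types
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)    # -128 127
print(np.iinfo(np.uint16).max)                         # 65535
print(np.finfo(np.float32).eps)                        # about 1.19e-07

# Fixed-width integers wrap around silently in array arithmetic
pixels = np.array([250, 251], dtype=np.uint8)
wrapped = pixels + pixels
print(wrapped)                                         # [244 246]

# Widen before the arithmetic if the result may exceed the range
widened = pixels.astype(np.int16) + pixels
print(widened)                                         # [500 502]
```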
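Print Options
Configure how arrays are displayed:
```python
import numpy as np

# Set print options
np.set_printoptions(
    precision=3,      # Decimal places
    suppress=True,    # Suppress scientific notation
    threshold=1000,   # Max elements before summarizing
    edgeitems=3,      # Items at start/end when summarizing
    linewidth=120     # Characters per line
)

# Example: suppress scientific notation
np.set_printoptions(suppress=True)
print(np.array([1e-10, 1e10]))  # [0. 10000000000.] (the first entry is left-padded in the actual output)

# Reset to defaults. Calling np.set_printoptions() with no arguments
# changes nothing, so the default values are passed explicitly.
np.set_printoptions(precision=8, suppress=False, threshold=1000,
                    edgeitems=3, linewidth=75)
```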
Error Handling
Configure how NumPy handles numerical errors:
```python
import numpy as np

# Set error handling
np.seterr(
    divide='warn',    # Division by zero: 'ignore', 'warn', 'raise', 'call'
    over='warn',      # Overflow: 'ignore', 'warn', 'raise', 'call'
    under='ignore',   # Underflow
    invalid='warn'    # Invalid operation (e.g., sqrt(-1))
)

# Context manager for temporary settings
with np.errstate(divide='ignore'):
    result = np.array([1, 2]) / 0  # No warning in this block

# Check for errors after operations
np.seterr(all='ignore')
result = np.array([1]) / 0
if np.isinf(result).any():
    print("Infinity detected")
```
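The `'raise'` mode mentioned in the comments above turns floating-point problems into exceptions. The following sketch contrasts it with `'ignore'`, using `np.errstate` so that the global settings are left untouched; `values` and `ratio` are illustrative names.

```python
import numpy as np

values = np.array([1.0, 0.0, -1.0])

# With 'raise', floating-point problems surface as FloatingPointError
raised = False
try:
    with np.errstate(divide='raise'):
        np.array([1.0]) / 0.0
except FloatingPointError as exc:
    raised = True
    print("caught:", exc)

# With 'ignore', the operation completes and yields inf/nan to inspect
with np.errstate(divide='ignore', invalid='ignore'):
    ratio = values / 0.0

print(ratio)                      # [ inf  nan -inf]
print(np.isfinite(ratio).any())   # False
```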
Random Number Generator Configuration
```python
import numpy as np
from numpy.random import PCG64, Philox, MT19937

# Legacy API (older code)
np.random.seed(42)

# Modern API (recommended)
rng = np.random.default_rng(seed=42)

# Use different algorithms
rng = np.random.Generator(PCG64(seed=42))    # Default bit generator behind default_rng
rng = np.random.Generator(Philox(seed=42))   # Counter-based; suits parallel streams
rng = np.random.Generator(MT19937(seed=42))  # Mersenne Twister, as in the legacy API
```
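For parallel work, one common approach is to derive several independent generators from a single root seed with `SeedSequence.spawn`. This is a minimal sketch; the number of streams (3) and the variable names are assumptions for illustration.

```python
import numpy as np
from numpy.random import SeedSequence, default_rng

# Spawn independent child seeds from one root seed, one per worker
root = SeedSequence(42)
children = root.spawn(3)
streams = [default_rng(child) for child in children]

# Each stream yields its own sequence of values
draws = [rng.random(2) for rng in streams]
for i, values in enumerate(draws):
    print(f"stream {i}: {values}")
```

Rebuilding the streams from the same root seed reproduces the same draws, which keeps parallel runs repeatable.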
Common Use Cases
Use Case 1: Data Normalization
Normalize data to zero mean and unit variance (standardization):
```python
import numpy as np

# Sample data
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

# Standardize: (x - mean) / std
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)
normalized = (data - mean) / std
print(normalized)
# [[-1.22474487 -1.22474487 -1.22474487]
#  [ 0.          0.          0.        ]
#  [ 1.22474487  1.22474487  1.22474487]]

# Min-Max normalization: (x - min) / (max - min)
min_val = np.min(data, axis=0)
max_val = np.max(data, axis=0)
normalized_minmax = (data - min_val) / (max_val - min_val)
```
Use Case 2: Image Processing
Manipulate images as NumPy arrays:
```python
import numpy as np
from PIL import Image

# Load image as array
img = np.array(Image.open('photo.jpg'))
print(f"Shape: {img.shape}")  # (height, width, channels) for a color image

# Convert to grayscale (simple channel average)
gray = np.mean(img, axis=2).astype(np.uint8)

# Crop image (top-left 100x100 pixels)
cropped = img[:100, :100]

# Flip image vertically
flipped = np.flipud(img)

# Rotate 90 degrees (counterclockwise)
rotated = np.rot90(img)

# Adjust brightness (add value to all pixels).
# Widen to int16 first: uint8 + 50 would wrap around before clip runs.
brighter = np.clip(img.astype(np.int16) + 50, 0, 255).astype(np.uint8)

# Apply threshold
threshold = 128
binary = np.where(gray > threshold, 255, 0).astype(np.uint8)

# Save result
Image.fromarray(binary).save('processed.jpg')
```
Use Case 3: Time Series Analysis
Calculate moving averages and statistics:
```python
import numpy as np

# Sample time series data
prices = np.array([100, 102, 101, 105, 107, 106, 108, 110, 109, 111])

# Simple moving average (window size 3)
window = 3
moving_avg = np.convolve(prices, np.ones(window)/window, mode='valid')
print(f"Moving average: {moving_avg}")

# Calculate returns (percentage change)
returns = np.diff(prices) / prices[:-1] * 100
print(f"Returns: {returns}")

# Cumulative returns
cumulative_returns = np.cumprod(1 + returns/100) - 1
print(f"Cumulative returns: {cumulative_returns}")

# Volatility (rolling standard deviation)
def rolling_std(arr, window):
    return np.array([np.std(arr[i:i+window]) for i in range(len(arr)-window+1)])

volatility = rolling_std(returns, window=3)
print(f"Volatility: {volatility}")

# Detect outliers (values > 2 std from mean)
mean = np.mean(prices)
std = np.std(prices)
outliers = np.abs(prices - mean) > 2 * std
print(f"Outliers at indices: {np.where(outliers)[0]}")
```
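The list-comprehension `rolling_std` above is easy to read but loops in Python. One possible vectorized alternative, assuming NumPy 1.20 or later, uses `sliding_window_view`; the sketch below compares the two on made-up return values and is not part of the original example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

returns = np.array([2.0, -1.0, 4.0, 2.0, -1.0])   # illustrative values

def rolling_std_loop(arr, window):
    # Same logic as rolling_std above: one np.std call per window
    return np.array([np.std(arr[i:i + window]) for i in range(len(arr) - window + 1)])

# Each row of `windows` is one length-3 window, exposed as a read-only view
windows = sliding_window_view(returns, window_shape=3)
rolling_std_vec = windows.std(axis=1)

print(windows.shape)    # (3, 3)
print(np.allclose(rolling_std_vec, rolling_std_loop(returns, 3)))   # True
```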
Use Case 4: Matrix Operations for Machine Learning
Implement basic neural network operations:
```python
import numpy as np

# Initialize weights and biases
np.random.seed(42)
input_size, hidden_size, output_size = 4, 5, 3
W1 = np.random.randn(input_size, hidden_size) * 0.01
b1 = np.zeros((1, hidden_size))
W2 = np.random.randn(hidden_size, output_size) * 0.01
b2 = np.zeros((1, output_size))

# Sample input (batch of 10 samples)
X = np.random.randn(10, input_size)

# Forward pass
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Hidden layer
z1 = np.dot(X, W1) + b1
a1 = sigmoid(z1)

# Output layer
z2 = np.dot(a1, W2) + b2
a2 = softmax(z2)
print(f"Output shape: {a2.shape}")  # (10, 3)
print(f"Output probabilities sum to 1: {np.allclose(np.sum(a2, axis=1), 1)}")

# Calculate loss (cross-entropy)
y_true = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])  # True labels
y_one_hot = np.eye(output_size)[y_true]
loss = -np.mean(np.sum(y_one_hot * np.log(a2 + 1e-8), axis=1))
print(f"Loss: {loss}")
```
Use Case 5: Statistical Analysis
Perform hypothesis testing and statistical calculations:
```python
import numpy as np

# Sample data: test scores from two groups
group_a = np.array([85, 88, 90, 92, 87, 89, 91, 86, 88, 90])
group_b = np.array([78, 82, 80, 85, 83, 81, 84, 79, 82, 80])

# Descriptive statistics
print(f"Group A - Mean: {np.mean(group_a):.2f}, Std: {np.std(group_a):.2f}")
print(f"Group B - Mean: {np.mean(group_b):.2f}, Std: {np.std(group_b):.2f}")

# T-test (manual calculation)
n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = np.mean(group_a), np.mean(group_b)
var_a, var_b = np.var(group_a, ddof=1), np.var(group_b, ddof=1)

# Pooled standard deviation
pooled_std = np.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
t_statistic = (mean_a - mean_b) / (pooled_std * np.sqrt(1/n_a + 1/n_b))
print(f"T-statistic: {t_statistic:.3f}")

# Correlation analysis
correlation = np.corrcoef(group_a, group_b)[0, 1]
print(f"Correlation: {correlation:.3f}")

# Bootstrap confidence interval
n_bootstrap = 10000
bootstrap_means = np.array([