SciPy Cheatsheet
Installation
| Platform | Command |
|---|---|
| Ubuntu/Debian | sudo apt-get install python3-scipy or pip install scipy |
| macOS | pip3 install scipy or brew install python3 && pip3 install scipy |
| Windows | pip install scipy or conda install scipy (recommended) |
| Conda (All platforms) | conda install scipy or conda install -c conda-forge scipy |
| Virtual Environment | python -m venv env && source env/bin/activate && pip install scipy |
| Specific Version | pip install scipy==1.11.4 |
| Upgrade | pip install --upgrade scipy |
| Verify Installation | python -c "import scipy; print(scipy.__version__)" |
Basic Commands
Import and Setup
| Command | Description |
|---|---|
| import scipy | Import the SciPy library |
| import numpy as np | Import NumPy (required by SciPy) |
| from scipy import optimize | Import the optimization module |
| from scipy import stats | Import the statistics module |
| from scipy import linalg | Import the linear algebra module |
| scipy.__version__ | Check the SciPy version |
| scipy.show_config() | Display SciPy build configuration |
Optimization Basics
| Command | Description |
|---|---|
| minimize(func, x0) | Find the minimum of a scalar function |
| minimize(func, x0, method='BFGS') | Minimize using the BFGS algorithm |
| root(func, x0) | Find a root of a function |
| curve_fit(func, xdata, ydata) | Fit a curve to data points |
| fsolve(func, x0) | Solve a system of nonlinear equations |
| minimize_scalar(func) | Minimize a single-variable function |
| linprog(c, A_ub, b_ub) | Linear programming optimization |
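For example, a minimal minimize call (the quadratic objective below is illustrative):

```python
import numpy as np
from scipy.optimize import minimize

# Quadratic bowl with its minimum at x = [1, 2.5]
def f(x):
    return (x[0] - 1)**2 + (x[1] - 2.5)**2

res = minimize(f, x0=[0.0, 0.0], method='BFGS')
print(res.x)  # approximately [1.0, 2.5]
```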
Integration
| Command | Description |
|---|---|
| quad(func, a, b) | Integrate a function from a to b |
| dblquad(func, a, b, gfun, hfun) | Double integration |
| odeint(func, y0, t) | Solve ordinary differential equations (legacy API) |
| solve_ivp(func, t_span, y0) | Modern ODE solver with event support |
| trapezoid(y, x) | Trapezoidal-rule integration (trapz in SciPy < 1.14) |
| simpson(y, x=x) | Simpson's-rule integration (simps in SciPy < 1.14) |
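A short sketch of quad and solve_ivp on a toy problem (the decay rate and interval are illustrative):

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

# Definite integral of sin(x) on [0, pi]; the exact value is 2
value, abserr = quad(np.sin, 0, np.pi)

# Exponential decay dy/dt = -0.5 * y with y(0) = 1
sol = solve_ivp(lambda t, y: -0.5 * y, t_span=(0, 10), y0=[1.0])
print(value, sol.y[0, -1])
```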
Interpolation
| Command | Description |
|---|---|
| interp1d(x, y) | 1-D interpolation function |
| interp1d(x, y, kind='cubic') | Cubic spline interpolation |
| griddata(points, values, xi) | Interpolate scattered data |
| UnivariateSpline(x, y) | Univariate smoothing spline |
| interp2d(x, y, z) | 2-D interpolation (removed in SciPy 1.14; use RegularGridInterpolator) |
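For example, building and evaluating an interp1d interpolant (the sample data is illustrative):

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.linspace(0, 10, 11)
y = np.sin(x)

f_cubic = interp1d(x, y, kind='cubic')  # returns a callable
print(f_cubic(2.5))                     # interpolated value between samples
```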
Statistics
| Command | Description |
|---|---|
| stats.norm.rvs(loc, scale, size) | Generate normal random variates |
| stats.ttest_1samp(data, popmean) | One-sample t-test |
| stats.ttest_ind(data1, data2) | Independent two-sample t-test |
| stats.pearsonr(x, y) | Pearson correlation coefficient |
| stats.norm.pdf(x, loc, scale) | Normal probability density function |
| stats.norm.cdf(x, loc, scale) | Normal cumulative distribution function |
| stats.kstest(data, 'norm') | Kolmogorov-Smirnov test |
| stats.chi2_contingency(table) | Chi-square test of independence |
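A quick sketch of the distribution and test APIs (the sample size and parameters are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = stats.norm.rvs(loc=0, scale=1, size=100, random_state=rng)

# One-sample t-test against a population mean of 0
t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)
print(t_stat, p_value)
```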
Linear Algebra
| Command | Description |
|---|---|
| linalg.solve(A, b) | Solve the linear system Ax = b |
| linalg.inv(A) | Calculate the matrix inverse |
| linalg.det(A) | Calculate the determinant |
| linalg.eig(A) | Compute eigenvalues and eigenvectors |
| linalg.svd(A) | Singular value decomposition |
| linalg.lstsq(A, b) | Least-squares solution |
| linalg.norm(A) | Matrix or vector norm |
| linalg.qr(A) | QR decomposition |
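For example, solving a small system and computing its eigendecomposition (the 2x2 matrix is illustrative):

```python
import numpy as np
from scipy import linalg

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = linalg.solve(A, b)  # solves Ax = b
w, v = linalg.eig(A)    # eigenvalues (w) and eigenvectors (columns of v)
print(x, w.real)
```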
Signal Processing
| Command | Description |
|---|---|
| signal.butter(N, Wn, btype) | Butterworth filter design |
| signal.filtfilt(b, a, data) | Zero-phase digital filtering |
| signal.find_peaks(data) | Find peaks in a signal |
| signal.correlate(sig1, sig2) | Cross-correlation of signals |
| signal.convolve(sig1, sig2) | Convolution of signals |
| signal.spectrogram(data, fs) | Compute a spectrogram |
| signal.resample(data, num) | Resample a signal to num samples |
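A minimal filtering sketch (the sampling rate, cutoff, and test signal are illustrative):

```python
import numpy as np
from scipy import signal

fs = 500                    # sampling frequency in Hz
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.randn(t.size)

# 4th-order low-pass Butterworth at 20 Hz, applied with zero phase
b, a = signal.butter(4, 20, btype='low', fs=fs)
smoothed = signal.filtfilt(b, a, x)
peaks, _ = signal.find_peaks(smoothed, height=0.5)
```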
Sparse Matrices
| Command | Description |
|---|---|
| sparse.csr_matrix(data) | Create a compressed sparse row matrix |
| sparse.csc_matrix(data) | Create a compressed sparse column matrix |
| sparse.eye(n) | Sparse identity matrix |
| sparse.diags(diagonals, offsets) | Construct a sparse diagonal matrix |
| matrix.toarray() | Convert sparse to a dense ndarray |
| matrix.todense() | Convert sparse to a dense matrix (np.matrix; prefer toarray) |
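For example, round-tripping between dense and sparse representations (the toy matrix is illustrative):

```python
import numpy as np
from scipy import sparse

dense = np.array([[1, 0, 0], [0, 0, 2], [0, 3, 0]])
A = sparse.csr_matrix(dense)  # stores only the 3 nonzero entries
print(A.nnz)                  # number of stored nonzeros
print(A.toarray())            # back to a dense ndarray
```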
Advanced Usage
Advanced Optimization
| Command | Description |
|---|---|
| minimize(func, x0, method='SLSQP', constraints=cons, bounds=bnds) | Constrained optimization with bounds |
| differential_evolution(func, bounds) | Global optimization using a genetic algorithm |
| basinhopping(func, x0, minimizer_kwargs) | Global optimization with basin-hopping |
| least_squares(func, x0, bounds) | Nonlinear least squares with bounds |
| minimize(func, x0, jac=gradient_func) | Optimization with an analytical gradient |
| shgo(func, bounds) | Simplicial homology global optimization |
| dual_annealing(func, bounds) | Dual annealing global optimization |
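A minimal global-optimization sketch (the multimodal objective and bounds are illustrative):

```python
import numpy as np
from scipy.optimize import differential_evolution

# 1-D objective with many local minima
def f(x):
    return x[0]**2 + 10 * (1 - np.cos(3 * x[0]))

result = differential_evolution(f, bounds=[(-5, 5)], seed=42)
print(result.x, result.fun)
```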
Advanced Integration
| Command | Description |
|---|---|
| solve_ivp(func, t_span, y0, method='RK45', events=event_func) | ODE solver with event detection |
| solve_ivp(func, t_span, y0, dense_output=True) | ODE with a continuous solution |
| nquad(func, ranges) | N-dimensional integration |
| odeint(func, y0, t, Dfun=jacobian) | ODE with a Jacobian matrix |
| solve_bvp(func, bc, x, y) | Boundary value problem solver |
| quad(func, a, b, weight='cos') | Integration with weight functions |
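For example, event detection with solve_ivp (the free-fall model and initial height are illustrative):

```python
from scipy.integrate import solve_ivp

# Free fall: stop integration when the height reaches zero
def fall(t, y):             # y = [height, velocity]
    return [y[1], -9.81]

def hit_ground(t, y):
    return y[0]             # event fires when this crosses zero
hit_ground.terminal = True  # stop the solver at the event

sol = solve_ivp(fall, (0, 10), [100.0, 0.0], events=hit_ground)
print(sol.t_events[0])      # time of impact (~4.52 s)
```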
Advanced Interpolation
| Command | Description |
|---|---|
| RBFInterpolator(points, values) | Radial basis function interpolation |
| BSpline(t, c, k) | B-spline basis functions |
| make_interp_spline(x, y, k=3) | Make an interpolating B-spline |
| Akima1DInterpolator(x, y) | Akima interpolation (smooth curves) |
| PchipInterpolator(x, y) | PCHIP 1-D monotonic cubic interpolation |
| RegularGridInterpolator((x, y), z) | Interpolation on a regular grid |
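For example, monotonicity-preserving interpolation with PchipInterpolator (the data points are illustrative):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 2.0, 2.1, 5.0])

# PCHIP preserves monotonicity and avoids the overshoot a cubic spline can show
pchip = PchipInterpolator(x, y)
print(pchip(1.5))
```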
Advanced Statistics
| Command | Description |
|---|---|
| stats.gaussian_kde(data) | Kernel density estimation |
| stats.multivariate_normal.rvs(mean, cov, size) | Multivariate normal samples |
| stats.bootstrap((data,), statistic, n_resamples=n) | Bootstrap confidence intervals |
| stats.anderson(data, dist='norm') | Anderson-Darling test |
| stats.mannwhitneyu(x, y) | Mann-Whitney U test |
| stats.kruskal(*groups) | Kruskal-Wallis H-test |
| stats.wilcoxon(x, y) | Wilcoxon signed-rank test |
| stats.spearmanr(x, y) | Spearman rank correlation |
| stats.linregress(x, y) | Linear regression analysis |
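A minimal bootstrap sketch (the sample and resample count are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)

# Bootstrap a 95% confidence interval for the mean
res = stats.bootstrap((data,), np.mean, n_resamples=1000, random_state=rng)
print(res.confidence_interval)
```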
Advanced Linear Algebra
| Command | Description |
|---|---|
| linalg.cholesky(A) | Cholesky decomposition |
| linalg.lu(A) | LU decomposition |
| linalg.schur(A) | Schur decomposition |
| linalg.expm(A) | Matrix exponential |
| linalg.logm(A) | Matrix logarithm |
| linalg.sqrtm(A) | Matrix square root |
| linalg.solve_triangular(A, b) | Solve a triangular system |
| linalg.pinv(A) | Moore-Penrose pseudoinverse |
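For example, Cholesky and LU factorizations of a small symmetric positive-definite matrix (the 2x2 example is illustrative):

```python
import numpy as np
from scipy import linalg

# Cholesky requires a symmetric positive-definite matrix
A = np.array([[4.0, 2.0], [2.0, 3.0]])
L = linalg.cholesky(A, lower=True)
print(np.allclose(L @ L.T, A))  # True

P, Lm, U = linalg.lu(A)         # general LU factorization: A = P @ Lm @ U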
Advanced Signal Processing
| Command | Description |
|---|---|
| signal.stft(data, fs) | Short-time Fourier transform |
| signal.istft(Zxx, fs) | Inverse STFT |
| signal.welch(data, fs) | Power spectral density (Welch's method) |
| signal.coherence(x, y, fs) | Coherence between signals |
| signal.hilbert(data) | Hilbert transform (analytic signal) |
| signal.savgol_filter(data, window, polyorder) | Savitzky-Golay filter |
| signal.detrend(data) | Remove a linear trend from data |
| signal.wiener(data) | Wiener filter for noise reduction |
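A short detrend-then-smooth sketch (the trend, noise level, and window settings are illustrative):

```python
import numpy as np
from scipy import signal

t = np.linspace(0, 1, 1000)
noisy = np.sin(2 * np.pi * 3 * t) + 0.4 * np.random.randn(t.size) + 2 * t

# Remove the linear trend, then smooth with a Savitzky-Golay filter
detrended = signal.detrend(noisy)
smoothed = signal.savgol_filter(detrended, window_length=51, polyorder=3)
```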
Advanced Sparse Operations
| Command | Description |
|---|---|
| sparse.linalg.spsolve(A, b) | Solve a sparse linear system |
| sparse.linalg.eigsh(A, k) | Find k eigenvalues/eigenvectors of a symmetric sparse matrix |
| sparse.linalg.svds(A, k) | Sparse SVD for k singular values |
| sparse.linalg.cg(A, b) | Conjugate gradient solver |
| sparse.linalg.gmres(A, b) | GMRES iterative solver |
| sparse.kron(A, B) | Kronecker product of sparse matrices |
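For example, solving a sparse tridiagonal system (the 1-D Laplacian setup is illustrative):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

# Sparse tridiagonal system (e.g., a 1-D Laplacian)
n = 5
A = sparse.diags([-1, 2, -1], offsets=[-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)
x = spsolve(A, b)
print(x)
```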
Spatial Operations
| Command | Description |
|---|---|
| spatial.distance.pdist(X, metric='euclidean') | Pairwise distances between observations |
| spatial.distance.cdist(XA, XB) | Distances between two collections |
| spatial.KDTree(data) | k-d tree for fast nearest-neighbor lookup |
| spatial.ConvexHull(points) | Convex hull of points |
| spatial.Delaunay(points) | Delaunay triangulation |
| spatial.Voronoi(points) | Voronoi diagram |
| spatial.distance_matrix(x, y) | Distance matrix computation |
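A minimal nearest-neighbor and hull sketch (the random point cloud is illustrative):

```python
import numpy as np
from scipy import spatial

rng = np.random.default_rng(0)
points = rng.random((100, 2))

tree = spatial.KDTree(points)
dist, idx = tree.query([0.5, 0.5], k=3)  # 3 nearest neighbors of a query point
hull = spatial.ConvexHull(points)
print(idx, hull.volume)                  # in 2-D, .volume is the hull area
```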
Image Processing
| Command | Description |
|---|---|
| ndimage.gaussian_filter(image, sigma) | Gaussian smoothing filter |
| ndimage.median_filter(image, size) | Median filter for noise reduction |
| ndimage.rotate(image, angle) | Rotate an image by angle degrees |
| ndimage.zoom(image, zoom) | Zoom an image by a factor |
| ndimage.binary_erosion(image) | Binary morphological erosion |
| ndimage.binary_dilation(image) | Binary morphological dilation |
| ndimage.label(image) | Label connected components |
| ndimage.sobel(image) | Sobel edge detection |
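For example, a tiny blur-threshold-label sequence (the synthetic square image is illustrative):

```python
import numpy as np
from scipy import ndimage

image = np.zeros((64, 64))
image[20:30, 20:30] = 1.0                  # a synthetic bright square

blurred = ndimage.gaussian_filter(image, sigma=1.5)
labeled, n = ndimage.label(blurred > 0.5)  # connected components above threshold
print(n)                                   # number of objects found
```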
Configuration
NumPy/SciPy Configuration
```python
# Display build and configuration information
import scipy
scipy.show_config()

# Set NumPy print options (affects SciPy output)
import numpy as np
np.set_printoptions(precision=4, suppress=True, linewidth=100)

# Configure warning filters
import warnings
warnings.filterwarnings('ignore', category=RuntimeWarning)
```
Optimization Configuration
```python
# Configure optimization options (available keys depend on the method)
from scipy.optimize import minimize

options = {
    'maxiter': 1000,  # Maximum iterations
    'disp': True,     # Display convergence messages
    'gtol': 1e-8      # Gradient tolerance (BFGS; use ftol for e.g. L-BFGS-B)
}
result = minimize(func, x0, method='BFGS', options=options)
```
Integration Tolerances
```python
from scipy.integrate import quad, solve_ivp

# Configure integration accuracy
result, error = quad(func, a, b,
                     epsabs=1e-10,  # Absolute error tolerance
                     epsrel=1e-10,  # Relative error tolerance
                     limit=100)     # Subdivision limit

# Configure the ODE solver
sol = solve_ivp(func, t_span, y0,
                method='RK45',
                rtol=1e-6,     # Relative tolerance
                atol=1e-9,     # Absolute tolerance
                max_step=0.1)  # Maximum step size
```
Sparse Matrix Formats
```python
from scipy import sparse

# Choose the format based on the use case
# CSR: efficient row slicing and matrix-vector products
A_csr = sparse.csr_matrix(data)
# CSC: efficient column slicing and matrix-vector products
A_csc = sparse.csc_matrix(data)
# COO: efficient construction and format conversion
A_coo = sparse.coo_matrix(data)
# LIL: efficient incremental construction
A_lil = sparse.lil_matrix((1000, 1000))
```
Random Number Generation
```python
from scipy import stats
import numpy as np

# Seed the legacy global state for reproducibility
np.random.seed(42)

# Prefer a dedicated generator over the global state;
# SciPy's random_state accepts Generator and RandomState objects
rng = np.random.default_rng(42)
data = stats.norm.rvs(loc=0, scale=1, size=1000, random_state=rng)
```
Common Use Cases
Use Case 1: Curve Fitting and Model Selection
```python
import numpy as np
from scipy.optimize import curve_fit

# Generate noisy data
x = np.linspace(0, 10, 100)
y_true = 2.5 * np.exp(-0.5 * x) + 1.0
y_noisy = y_true + 0.2 * np.random.normal(size=len(x))

# Define the model
def exponential_model(x, a, b, c):
    return a * np.exp(-b * x) + c

# Fit the curve
params, covariance = curve_fit(exponential_model, x, y_noisy)
y_fit = exponential_model(x, *params)

# Calculate R-squared
residuals = y_noisy - y_fit
ss_res = np.sum(residuals**2)
ss_tot = np.sum((y_noisy - np.mean(y_noisy))**2)
r_squared = 1 - (ss_res / ss_tot)
print(f"Parameters: a={params[0]:.3f}, b={params[1]:.3f}, c={params[2]:.3f}")
print(f"R-squared: {r_squared:.4f}")
```
Use Case 2: Signal Filtering and Analysis
```python
import numpy as np
from scipy import signal

# Create a noisy signal
fs = 1000  # Sampling frequency in Hz
t = np.linspace(0, 1, fs, endpoint=False)
clean_signal = np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 120 * t)
noisy_signal = clean_signal + 0.5 * np.random.normal(size=len(t))

# Design and apply a Butterworth low-pass filter
sos = signal.butter(10, 100, btype='low', fs=fs, output='sos')
filtered_signal = signal.sosfilt(sos, noisy_signal)

# Find peaks
peaks, properties = signal.find_peaks(filtered_signal,
                                      height=0.5,
                                      distance=20)

# Compute the power spectral density
f, Pxx = signal.welch(filtered_signal, fs, nperseg=256)
print(f"Found {len(peaks)} peaks")
print(f"Dominant frequency: {f[np.argmax(Pxx)]:.2f} Hz")
```
Use Case 3: Statistical Hypothesis Testing
```python
import numpy as np
from scipy import stats

# Generate two sample datasets
np.random.seed(42)
group1 = stats.norm.rvs(loc=100, scale=15, size=50)
group2 = stats.norm.rvs(loc=105, scale=15, size=50)

# Test for normality
_, p_norm1 = stats.shapiro(group1)
_, p_norm2 = stats.shapiro(group2)

# Test for equal variances
_, p_var = stats.levene(group1, group2)

# Perform the appropriate t-test
if p_var > 0.05:
    # Equal variances
    t_stat, p_value = stats.ttest_ind(group1, group2)
    test_type = "Independent t-test (equal variances)"
else:
    # Unequal variances (Welch's t-test)
    t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=False)
    test_type = "Welch's t-test (unequal variances)"

# Calculate the effect size (Cohen's d), using sample standard deviations
pooled_std = np.sqrt((np.std(group1, ddof=1)**2 + np.std(group2, ddof=1)**2) / 2)
cohens_d = (np.mean(group1) - np.mean(group2)) / pooled_std
print(f"Test: {test_type}")
print(f"t-statistic: {t_stat:.4f}, p-value: {p_value:.4f}")
print(f"Cohen's d: {cohens_d:.4f}")
```
Use Case 4: Optimization with Constraints
```python
import numpy as np
from scipy.optimize import minimize

# Portfolio optimization: maximize return, minimize risk
def portfolio_objective(weights, returns, cov_matrix, risk_aversion=1.0):
    portfolio_return = np.sum(returns * weights)
    portfolio_risk = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
    return -(portfolio_return - risk_aversion * portfolio_risk)

# Sample data
n_assets = 4
returns = np.array([0.10, 0.12, 0.15, 0.08])
cov_matrix = np.array([[0.005, -0.001, 0.001, 0.000],
                       [-0.001, 0.008, 0.002, 0.001],
                       [0.001, 0.002, 0.012, 0.003],
                       [0.000, 0.001, 0.003, 0.004]])

# Constraint: weights sum to 1
constraints = {'type': 'eq', 'fun': lambda x: np.sum(x) - 1}
# Bounds: each weight between 0 and 1
bounds = tuple((0, 1) for _ in range(n_assets))
# Initial guess: equal weights
x0 = np.array([1 / n_assets] * n_assets)

# Optimize
result = minimize(portfolio_objective, x0,
                  args=(returns, cov_matrix),
                  method='SLSQP',
                  bounds=bounds,
                  constraints=constraints)
optimal_weights = result.x
print("Optimal portfolio weights:")
for i, weight in enumerate(optimal_weights):
    print(f"Asset {i+1}: {weight*100:.2f}%")
```
Use Case 5: Image Processing Pipeline
```python
import numpy as np
from scipy import ndimage

# Load or create an image (grayscale)
image = np.random.rand(256, 256) * 255

# Step 1: Denoise with a Gaussian filter
denoised = ndimage.gaussian_filter(image, sigma=2)

# Step 2: Edge detection
edges_x = ndimage.sobel(denoised, axis=0)
edges_y = ndimage.sobel(denoised, axis=1)
edges = np.hypot(edges_x, edges_y)

# Step 3: Threshold to create a binary image
threshold = np.mean(edges) + np.std(edges)
binary = edges > threshold

# Step 4: Morphological operations
struct = ndimage.generate_binary_structure(2, 2)
cleaned = ndimage.binary_opening(binary, structure=struct)
cleaned = ndimage.binary_closing(cleaned, structure=struct)

# Step 5: Label connected components
labeled, num_features = ndimage.label(cleaned)

# Step 6: Calculate properties
sizes = ndimage.sum(cleaned, labeled, range(num_features + 1))
centers = ndimage.center_of_mass(cleaned, labeled, range(1, num_features + 1))
print(f"Found {num_features} objects")
print(f"Average object size: {np.mean(sizes[1:]):.2f} pixels")
```
Best Practices
- Choose the Right Method: select optimization algorithms based on problem characteristics (gradient-based methods for smooth functions, global optimizers for multimodal problems).
- Vectorize Operations: use NumPy array operations instead of loops for better performance; SciPy functions are optimized for array inputs.
- Handle Numerical Stability: use appropriate tolerances (rtol, atol) for integration and optimization; be aware of condition numbers in linear algebra operations.
- Leverage Sparse Matrices: for large matrices with many zeros, use sparse formats (csr_matrix, csc_matrix) to save memory and improve computation speed.
- Provide Good Initial Guesses: optimization and root-finding algorithms converge faster from reasonable starting points; use domain knowledge when possible.
- Use Appropriate Statistical Tests: verify assumptions (normality, equal variance) before applying parametric tests; use non-parametric alternatives when assumptions are violated.
- Set Random Seeds: ensure reproducibility in stochastic algorithms by seeding with np.random.seed() or, preferably, a dedicated np.random.default_rng() generator.
- Profile Performance: use %timeit in Jupyter or cProfile to identify bottlenecks; consider Numba or Cython for critical sections if SciPy functions are insufficient.
- Check Convergence: always examine optimization results (result.success, result.message) and integration error estimates before trusting outputs.
- Document Units and Scales: clearly document physical units and data scales; normalize data when appropriate to improve numerical stability.