SciPy Cheatsheet

Installation

| Platform | Command |
|----------|---------|
| Ubuntu/Debian | sudo apt-get install python3-scipy or pip install scipy |
| macOS | pip3 install scipy or brew install python3 && pip3 install scipy |
| Windows | pip install scipy or conda install scipy (recommended) |
| Conda (all platforms) | conda install scipy or conda install -c conda-forge scipy |
| Virtual environment | python -m venv env && source env/bin/activate && pip install scipy |
| Specific version | pip install scipy==1.11.4 |
| Upgrade existing install | pip install --upgrade scipy |
| Verify installation | python -c "import scipy; print(scipy.__version__)" |

Basic Commands

Import and Setup

| Command | Description |
|---------|-------------|
| import scipy | Import the SciPy library |
| import numpy as np | Import NumPy (required alongside SciPy) |
| from scipy import optimize | Import the optimization module |
| from scipy import stats | Import the statistics module |
| from scipy import linalg | Import the linear algebra module |
| scipy.__version__ | Check the SciPy version |
| scipy.show_config() | Display SciPy build configuration |

Optimization Basics

| Command | Description |
|---------|-------------|
| minimize(func, x0) | Find the minimum of a scalar function |
| minimize(func, x0, method='BFGS') | Minimize using the BFGS algorithm |
| root(func, x0) | Find a root of a function |
| curve_fit(func, xdata, ydata) | Fit a curve to data points |
| fsolve(func, x0) | Solve a system of nonlinear equations |
| minimize_scalar(func) | Minimize a single-variable function |
| linprog(c, A_ub, b_ub) | Linear programming optimization |
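
A minimal sketch (the quadratic and the starting points are illustrative):

import numpy as np
from scipy.optimize import minimize, root

# Minimize f(x) = (x - 3)^2; any smooth scalar function works the same way
res = minimize(lambda x: (x[0] - 3.0)**2, x0=[0.0])
print(res.x)        # ~[3.]

# Find where g(x) = x^2 - 2 crosses zero, starting near x = 1
sol = root(lambda x: x**2 - 2, x0=[1.0])
print(sol.x)        # ~[1.4142]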

Integration

| Command | Description |
|---------|-------------|
| quad(func, a, b) | Integrate a function from a to b |
| dblquad(func, a, b, gfun, hfun) | Double integration |
| odeint(func, y0, t) | Solve ordinary differential equations (legacy interface) |
| solve_ivp(func, t_span, y0) | Modern ODE solver with event support |
| trapezoid(y, x) | Trapezoidal-rule integration (the old trapz alias was removed in SciPy 1.14) |
| simpson(y, x=x) | Simpson's-rule integration (the old simps alias was removed in SciPy 1.14) |
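
A quick sketch of the two workhorses, quad and solve_ivp:

import numpy as np
from scipy.integrate import quad, solve_ivp

# Integrate sin(x) over [0, pi]; the exact value is 2
value, abs_err = quad(np.sin, 0, np.pi)

# Solve dy/dt = -2y with y(0) = 1 over t in [0, 5]
sol = solve_ivp(lambda t, y: -2 * y, (0, 5), [1.0])
print(value, sol.y[0, -1])   # ~2.0 and ~exp(-10)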

Interpolation

| Command | Description |
|---------|-------------|
| interp1d(x, y) | 1-D interpolation function (legacy API) |
| interp1d(x, y, kind='cubic') | Cubic spline interpolation |
| griddata(points, values, xi) | Interpolate scattered data |
| UnivariateSpline(x, y) | Univariate spline approximation |
| interp2d(x, y, z) | 2-D interpolation (removed in SciPy 1.14; use RegularGridInterpolator or RBFInterpolator) |
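
For example, interpolating a coarsely sampled sine wave:

import numpy as np
from scipy.interpolate import interp1d

x = np.linspace(0, 10, 11)
y = np.sin(x)
f = interp1d(x, y, kind='cubic')   # callable interpolant
print(f(2.5))                      # value between the samples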

Statistics

| Command | Description |
|---------|-------------|
| stats.norm.rvs(loc, scale, size) | Generate normal random variates |
| stats.ttest_1samp(data, popmean) | One-sample t-test |
| stats.ttest_ind(data1, data2) | Independent two-sample t-test |
| stats.pearsonr(x, y) | Pearson correlation coefficient |
| stats.norm.pdf(x, loc, scale) | Normal probability density function |
| stats.norm.cdf(x, loc, scale) | Normal cumulative distribution function |
| stats.kstest(data, 'norm') | Kolmogorov-Smirnov test |
| stats.chi2_contingency(table) | Chi-square test of independence |
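
For instance, drawing a sample and testing it against a hypothesized mean:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = stats.norm.rvs(loc=5, scale=2, size=100, random_state=rng)
t_stat, p_value = stats.ttest_1samp(sample, popmean=5)
print(t_stat, p_value)   # large p-value: no evidence the mean differs from 5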

Linear Algebra

| Command | Description |
|---------|-------------|
| linalg.solve(A, b) | Solve the linear system Ax = b |
| linalg.inv(A) | Matrix inverse |
| linalg.det(A) | Determinant |
| linalg.eig(A) | Eigenvalues and eigenvectors |
| linalg.svd(A) | Singular value decomposition |
| linalg.lstsq(A, b) | Least-squares solution |
| linalg.norm(A) | Matrix or vector norm |
| linalg.qr(A) | QR decomposition |
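
A small 2x2 sketch of solving and decomposing:

import numpy as np
from scipy import linalg

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)     # solves Ax = b
w, v = linalg.eig(A)       # eigenvalues in w, eigenvectors as columns of v
print(x, w.real)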

Signal Processing

| Command | Description |
|---------|-------------|
| signal.butter(N, Wn, btype) | Butterworth filter design |
| signal.filtfilt(b, a, data) | Zero-phase digital filtering |
| signal.find_peaks(data) | Find peaks in a signal |
| signal.correlate(sig1, sig2) | Cross-correlation of signals |
| signal.convolve(sig1, sig2) | Convolution of signals |
| signal.spectrogram(data, fs) | Compute a spectrogram |
| signal.resample(data, num) | Resample a signal to num samples |
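
A minimal filter-then-detect sketch (the signal and cutoff are illustrative):

import numpy as np
from scipy import signal

fs = 500                                          # sampling frequency in Hz
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.normal(size=t.size)

b, a = signal.butter(4, 20, btype='low', fs=fs)   # 4th-order low-pass at 20 Hz
y = signal.filtfilt(b, a, x)                      # zero-phase filtering
peaks, _ = signal.find_peaks(y, height=0.5)
print(len(peaks))                                 # ~5 peaks of the 5 Hz tone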

Sparse Matrices

| Command | Description |
|---------|-------------|
| sparse.csr_matrix(data) | Create a compressed sparse row matrix |
| sparse.csc_matrix(data) | Create a compressed sparse column matrix |
| sparse.eye(n) | Sparse identity matrix |
| sparse.diags(diagonals, offsets) | Construct a sparse diagonal matrix |
| matrix.toarray() | Convert sparse to a dense ndarray |
| matrix.todense() | Convert sparse to a dense np.matrix |
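
For example, round-tripping a mostly-zero array:

import numpy as np
from scipy import sparse

dense = np.array([[1, 0, 0], [0, 0, 2], [0, 3, 0]])
A = sparse.csr_matrix(dense)   # stores only the three nonzero entries
print(A.nnz)                   # 3
print(A.toarray())             # back to a dense ndarray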

Advanced Usage

Advanced Optimization

| Command | Description |
|---------|-------------|
| minimize(func, x0, method='SLSQP', constraints=cons, bounds=bnds) | Constrained optimization with bounds |
| differential_evolution(func, bounds) | Global optimization using differential evolution (an evolutionary algorithm) |
| basinhopping(func, x0, minimizer_kwargs) | Global optimization with basin-hopping |
| least_squares(func, x0, bounds) | Nonlinear least squares with bounds |
| minimize(func, x0, jac=gradient_func) | Optimization with an analytical gradient |
| shgo(func, bounds) | Simplicial homology global optimization |
| dual_annealing(func, bounds) | Dual annealing global optimization |
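
A sketch of global optimization on the multimodal Rastrigin function (the test function is illustrative):

import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    return 20 + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

result = differential_evolution(rastrigin, bounds=[(-5, 5), (-5, 5)], seed=1)
print(result.x, result.fun)   # near [0, 0] with value ~0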

Advanced Integration

| Command | Description |
|---------|-------------|
| solve_ivp(func, t_span, y0, method='RK45', events=event_func) | ODE solver with event detection |
| solve_ivp(func, t_span, y0, dense_output=True) | ODE with a continuous (dense) solution |
| nquad(func, ranges) | N-dimensional integration |
| odeint(func, y0, t, Dfun=jacobian) | ODE with a user-supplied Jacobian |
| solve_bvp(func, bc, x, y) | Boundary value problem solver |
| quad(func, a, b, weight='cos', wvar=omega) | Integration with a weight function |
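
Event detection in action: stopping a free-fall integration when it hits the ground (initial height and gravity value are illustrative):

from scipy.integrate import solve_ivp

def fall(t, s):               # state s = [height, velocity]
    return [s[1], -9.81]

def hit_ground(t, s):         # event fires when height crosses zero
    return s[0]
hit_ground.terminal = True    # stop the integration at the event
hit_ground.direction = -1     # only trigger on downward crossings

sol = solve_ivp(fall, (0, 10), [10.0, 0.0], events=hit_ground)
print(sol.t_events[0])        # impact time, ~1.43 s for a 10 m drop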

Advanced Interpolation

| Command | Description |
|---------|-------------|
| RBFInterpolator(points, values) | Radial basis function interpolation |
| BSpline(t, c, k) | B-spline from knots t, coefficients c, and degree k |
| make_interp_spline(x, y, k=3) | Build an interpolating B-spline |
| Akima1DInterpolator(x, y) | Akima interpolation (smooth curves) |
| PchipInterpolator(x, y) | PCHIP 1-D monotonic cubic interpolation |
| RegularGridInterpolator((x, y), z) | Interpolation on a regular grid |
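
PCHIP is a good default when overshoot matters; a monotone-data sketch:

import numpy as np
from scipy.interpolate import PchipInterpolator

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 1.0, 2.0])   # monotone data with a flat segment
p = PchipInterpolator(x, y)          # no overshoot between the points
print(p(1.5), p.derivative()(1.5))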

Advanced Statistics

| Command | Description |
|---------|-------------|
| stats.gaussian_kde(data) | Kernel density estimation |
| stats.multivariate_normal.rvs(mean, cov, size) | Multivariate normal samples |
| stats.bootstrap((data,), statistic, n_resamples) | Bootstrap confidence intervals |
| stats.anderson(data, dist='norm') | Anderson-Darling test |
| stats.mannwhitneyu(x, y) | Mann-Whitney U test |
| stats.kruskal(*groups) | Kruskal-Wallis H-test |
| stats.wilcoxon(x, y) | Wilcoxon signed-rank test |
| stats.spearmanr(x, y) | Spearman rank correlation |
| stats.linregress(x, y) | Linear regression analysis |
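
A short sketch combining density estimation and regression (the synthetic data is illustrative):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=200)

kde = stats.gaussian_kde(x)              # smooth density estimate
print(kde(np.linspace(-2, 2, 5)))        # estimated pdf on a grid

res = stats.linregress(x, 2 * x + rng.normal(scale=0.1, size=200))
print(res.slope, res.rvalue**2)          # slope ~2, R^2 near 1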

Advanced Linear Algebra

| Command | Description |
|---------|-------------|
| linalg.cholesky(A) | Cholesky decomposition |
| linalg.lu(A) | LU decomposition |
| linalg.schur(A) | Schur decomposition |
| linalg.expm(A) | Matrix exponential |
| linalg.logm(A) | Matrix logarithm |
| linalg.sqrtm(A) | Matrix square root |
| linalg.solve_triangular(A, b) | Solve a triangular system |
| linalg.pinv(A) | Moore-Penrose pseudoinverse |
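
For instance, factoring a symmetric positive-definite matrix:

import numpy as np
from scipy import linalg

A = np.array([[4.0, 2.0], [2.0, 3.0]])   # symmetric positive definite
L = linalg.cholesky(A, lower=True)       # A = L @ L.T
print(np.allclose(L @ L.T, A))           # True

print(linalg.expm(np.zeros((2, 2))))     # expm of the zero matrix is I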

Advanced Signal Processing

| Command | Description |
|---------|-------------|
| signal.stft(data, fs) | Short-time Fourier transform |
| signal.istft(Zxx, fs) | Inverse STFT |
| signal.welch(data, fs) | Power spectral density (Welch method) |
| signal.coherence(x, y, fs) | Coherence between signals |
| signal.hilbert(data) | Hilbert transform |
| signal.savgol_filter(data, window, polyorder) | Savitzky-Golay filter |
| signal.detrend(data) | Remove a linear trend from data |
| signal.wiener(data) | Wiener filter for noise reduction |
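
A smoothing-and-envelope sketch (window length and noise level are illustrative):

import numpy as np
from scipy import signal

t = np.linspace(0, 1, 1000, endpoint=False)
x = np.sin(2 * np.pi * 3 * t) + 0.4 * np.random.normal(size=t.size)

smooth = signal.savgol_filter(x, window_length=51, polyorder=3)
detrended = signal.detrend(x)            # remove any linear trend
envelope = np.abs(signal.hilbert(x))     # instantaneous amplitude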

Advanced Sparse Operations

| Command | Description |
|---------|-------------|
| sparse.linalg.spsolve(A, b) | Solve a sparse linear system |
| sparse.linalg.eigsh(A, k) | Find k eigenvalues of a sparse symmetric matrix |
| sparse.linalg.svds(A, k) | Sparse SVD for k singular values |
| sparse.linalg.cg(A, b) | Conjugate gradient solver |
| sparse.linalg.gmres(A, b) | GMRES iterative solver |
| sparse.kron(A, B) | Kronecker product of sparse matrices |
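
For example, solving a sparse tridiagonal system directly:

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

# 100x100 tridiagonal matrix, stored sparsely (CSC suits direct solvers)
A = sparse.diags([[-1.0] * 99, [2.0] * 100, [-1.0] * 99],
                 offsets=[-1, 0, 1], format='csc')
b = np.ones(100)
x = spsolve(A, b)
print(x[:3])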

Spatial Operations

| Command | Description |
|---------|-------------|
| spatial.distance.pdist(X, metric='euclidean') | Pairwise distances between observations |
| spatial.distance.cdist(XA, XB) | Distances between two collections |
| spatial.KDTree(data) | k-d tree for fast nearest-neighbor queries |
| spatial.ConvexHull(points) | Convex hull of a point set |
| spatial.Delaunay(points) | Delaunay triangulation |
| spatial.Voronoi(points) | Voronoi diagram |
| spatial.distance_matrix(x, y) | Distance matrix computation |
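
A nearest-neighbor and hull sketch on random points:

import numpy as np
from scipy import spatial

rng = np.random.default_rng(0)
points = rng.random((100, 2))

tree = spatial.KDTree(points)
dist, idx = tree.query([0.5, 0.5], k=3)   # 3 nearest neighbors of (0.5, 0.5)

hull = spatial.ConvexHull(points)
print(idx, hull.volume)                    # in 2-D, .volume is the hull area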

Image Processing

| Command | Description |
|---------|-------------|
| ndimage.gaussian_filter(image, sigma) | Gaussian smoothing filter |
| ndimage.median_filter(image, size) | Median filter for noise reduction |
| ndimage.rotate(image, angle) | Rotate an image by an angle |
| ndimage.zoom(image, zoom) | Zoom an image by a factor |
| ndimage.binary_erosion(image) | Binary morphological erosion |
| ndimage.binary_dilation(image) | Binary morphological dilation |
| ndimage.label(image) | Label connected components |
| ndimage.sobel(image) | Sobel edge detection |
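
A tiny synthetic-image sketch (the square is just a stand-in for real data):

import numpy as np
from scipy import ndimage

image = np.zeros((64, 64))
image[20:40, 20:40] = 1.0                  # one bright square

blurred = ndimage.gaussian_filter(image, sigma=3)
rotated = ndimage.rotate(image, 45, reshape=False)
labels, n = ndimage.label(image > 0.5)
print(n)                                    # 1 connected component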

Configuration

NumPy/SciPy Configuration

# Display build and configuration information
import scipy
scipy.show_config()

# Set NumPy print options (affects SciPy output)
import numpy as np
np.set_printoptions(precision=4, suppress=True, linewidth=100)

# Configure warning filters
import warnings
warnings.filterwarnings('ignore', category=RuntimeWarning)

Optimization Configuration

# Configure optimization options
from scipy.optimize import minimize

options = {
    'maxiter': 1000,      # Maximum iterations
    'disp': True,         # Display convergence messages
    'ftol': 1e-8,        # Function tolerance
    'gtol': 1e-8         # Gradient tolerance
}

result = minimize(func, x0, method='BFGS', options=options)

Integration Tolerances

from scipy.integrate import quad, solve_ivp

# Configure integration accuracy
result, error = quad(func, a, b, 
                    epsabs=1e-10,  # Absolute error tolerance
                    epsrel=1e-10,  # Relative error tolerance
                    limit=100)     # Subdivision limit

# Configure ODE solver
sol = solve_ivp(func, t_span, y0,
               method='RK45',
               rtol=1e-6,      # Relative tolerance
               atol=1e-9,      # Absolute tolerance
               max_step=0.1)   # Maximum step size

Sparse Matrix Format Selection

from scipy import sparse

# Choose format based on use case
# CSR: efficient row slicing, matrix-vector products
A_csr = sparse.csr_matrix(data)

# CSC: efficient column slicing; preferred input for sparse direct solvers
A_csc = sparse.csc_matrix(data)

# COO: efficient construction, conversion
A_coo = sparse.coo_matrix(data)

# LIL: efficient incremental construction
A_lil = sparse.lil_matrix((1000, 1000))

Random Number Generation

from scipy import stats
import numpy as np

# Seed the legacy global generator (affects np.random.* calls)
np.random.seed(42)

# Prefer an explicit Generator; SciPy distributions accept it via random_state
rng = np.random.default_rng(42)
data = stats.norm.rvs(loc=0, scale=1, size=1000, random_state=rng)

Common Use Cases

Use Case 1: Curve Fitting and Model Selection

import numpy as np
from scipy.optimize import curve_fit
from scipy import stats
import matplotlib.pyplot as plt

# Generate noisy data
x = np.linspace(0, 10, 100)
y_true = 2.5 * np.exp(-0.5 * x) + 1.0
y_noisy = y_true + 0.2 * np.random.normal(size=len(x))

# Define model
def exponential_model(x, a, b, c):
    return a * np.exp(-b * x) + c

# Fit curve
params, covariance = curve_fit(exponential_model, x, y_noisy)
y_fit = exponential_model(x, *params)

# Calculate R-squared
residuals = y_noisy - y_fit
ss_res = np.sum(residuals**2)
ss_tot = np.sum((y_noisy - np.mean(y_noisy))**2)
r_squared = 1 - (ss_res / ss_tot)

print(f"Parameters: a={params[0]:.3f}, b={params[1]:.3f}, c={params[2]:.3f}")
print(f"R-squared: {r_squared:.4f}")

Use Case 2: Signal Filtering and Analysis

import numpy as np
from scipy import signal
from scipy.fft import fft, fftfreq

# Create noisy signal
fs = 1000  # Sampling frequency
t = np.linspace(0, 1, fs, endpoint=False)  # exactly 1 s of samples at fs Hz
clean_signal = np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 120 * t)
noisy_signal = clean_signal + 0.5 * np.random.normal(size=len(t))

# Design and apply Butterworth filter
sos = signal.butter(10, 100, btype='low', fs=fs, output='sos')
filtered_signal = signal.sosfilt(sos, noisy_signal)

# Find peaks
peaks, properties = signal.find_peaks(filtered_signal, 
                                     height=0.5, 
                                     distance=20)

# Compute power spectral density
f, Pxx = signal.welch(filtered_signal, fs, nperseg=256)

print(f"Found {len(peaks)} peaks")
print(f"Dominant frequency: {f[np.argmax(Pxx)]:.2f} Hz")

Use Case 3: Statistical Hypothesis Testing

import numpy as np
from scipy import stats

# Generate two sample datasets
np.random.seed(42)
group1 = stats.norm.rvs(loc=100, scale=15, size=50)
group2 = stats.norm.rvs(loc=105, scale=15, size=50)

# Test for normality
_, p_norm1 = stats.shapiro(group1)
_, p_norm2 = stats.shapiro(group2)

# Test for equal variances
_, p_var = stats.levene(group1, group2)

# Perform appropriate t-test
if p_var > 0.05:
    # Equal variances
    t_stat, p_value = stats.ttest_ind(group1, group2)
    test_type = "Independent t-test (equal variances)"
else:
    # Unequal variances (Welch's t-test)
    t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=False)
    test_type = "Welch's t-test (unequal variances)"

# Calculate effect size (Cohen's d) using sample standard deviations
pooled_std = np.sqrt((np.std(group1, ddof=1)**2 + np.std(group2, ddof=1)**2) / 2)
cohens_d = (np.mean(group1) - np.mean(group2)) / pooled_std

print(f"Test: {test_type}")
print(f"t-statistic: {t_stat:.4f}, p-value: {p_value:.4f}")
print(f"Cohen's d: {cohens_d:.4f}")

Use Case 4: Optimization with Constraints

import numpy as np
from scipy.optimize import minimize

# Portfolio optimization: maximize return, minimize risk
def portfolio_objective(weights, returns, cov_matrix, risk_aversion=1.0):
    portfolio_return = np.sum(returns * weights)
    portfolio_risk = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
    return -(portfolio_return - risk_aversion * portfolio_risk)

# Sample data
n_assets = 4
returns = np.array([0.10, 0.12, 0.15, 0.08])
cov_matrix = np.array([[0.005, -0.001, 0.001, 0.000],
                       [-0.001, 0.008, 0.002, 0.001],
                       [0.001, 0.002, 0.012, 0.003],
                       [0.000, 0.001, 0.003, 0.004]])

# Constraints: weights sum to 1
constraints = {'type': 'eq', 'fun': lambda x: np.sum(x) - 1}

# Bounds: each weight between 0 and 1
bounds = tuple((0, 1) for _ in range(n_assets))

# Initial guess
x0 = np.array([1/n_assets] * n_assets)

# Optimize
result = minimize(portfolio_objective, x0, 
                 args=(returns, cov_matrix),
                 method='SLSQP',
                 bounds=bounds,
                 constraints=constraints)

optimal_weights = result.x
print("Optimal portfolio weights:")
for i, weight in enumerate(optimal_weights):
    print(f"Asset {i+1}: {weight*100:.2f}%")

Use Case 5: Image Processing Pipeline

import numpy as np
from scipy import ndimage
from scipy import signal

# Load or create image (grayscale)
image = np.random.rand(256, 256) * 255

# Step 1: Denoise with Gaussian filter
denoised = ndimage.gaussian_filter(image, sigma=2)

# Step 2: Edge detection
edges_x = ndimage.sobel(denoised, axis=0)
edges_y = ndimage.sobel(denoised, axis=1)
edges = np.hypot(edges_x, edges_y)

# Step 3: Threshold to create binary image
threshold = np.mean(edges) + np.std(edges)
binary = edges > threshold

# Step 4: Morphological operations
struct = ndimage.generate_binary_structure(2, 2)
cleaned = ndimage.binary_opening(binary, structure=struct)
cleaned = ndimage.binary_closing(cleaned, structure=struct)

# Step 5: Label connected components
labeled, num_features = ndimage.label(cleaned)

# Step 6: Calculate properties
sizes = ndimage.sum(cleaned, labeled, range(num_features + 1))
centers = ndimage.center_of_mass(cleaned, labeled, range(1, num_features + 1))

print(f"Found {num_features} objects")
print(f"Average object size: {np.mean(sizes[1:]):.2f} pixels")

Best Practices

  • Choose the Right Method: Select optimization algorithms based on problem characteristics (gradient-based for smooth functions, global optimizers for multimodal problems)

  • Vectorize Operations: Use NumPy array operations instead of loops for better performance; SciPy functions are optimized for array inputs

  • Handle Numerical Stability: Use appropriate tolerances (rtol, atol) for integration and optimization; be aware of condition numbers in linear algebra operations

  • Leverage Sparse Matrices: For large matrices with many zeros, use sparse matrix formats (csr_matrix, csc_matrix) to save memory and improve computation speed

  • Provide Good Initial Guesses: Optimization and root-finding algorithms converge faster with reasonable starting points; use domain knowledge when possible

  • Use Appropriate Statistical Tests: Verify assumptions (normality, equal variance) before applying parametric tests; use non-parametric alternatives when assumptions are violated

  • Set Random Seeds: Ensure reproducibility in stochastic algorithms by setting random seeds with np.random.seed() or using RandomState objects

  • Profile Performance: Use %timeit in Jupyter or cProfile to identify bottlenecks; consider Numba or Cython for critical sections if SciPy functions are insufficient

  • Check Convergence: Always examine optimization results (result.success, result.message) and integration error estimates before trusting outputs; a minimal sketch follows this list

  • Document Units and Scales: Clearly document physical units and data scales; normalize data when appropriate to improve numerical stability
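
A minimal convergence-check sketch (the objective is illustrative):

from scipy.optimize import minimize

result = minimize(lambda x: (x[0] - 2.0)**2, x0=[10.0])
if not result.success:
    raise RuntimeError(f"Optimization failed: {result.message}")
print(result.x, result.nit)   # solution and iteration count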