5. Profiling

If you want to make code faster, always profile first. Spend your time on the parts that are actually slow, and design only those important pieces of your code to go fast; everything else should be designed to be clean and maintainable.

5.1. Line profiler

There is a built-in cProfile module (one of the very, very few modules to still have the “classic” Python 2 naming scheme), but instead, let’s use line_profiler and its IPython magic. Let’s start with this simple file that contains a Monte Carlo (MC) sampler:

# This notebook does not work on WebAssembly (no line profiler)

%%writefile temp_prof.py

import numpy as np

def norm(loc: float, scale: float, x: np.ndarray) -> np.ndarray:
    y = (x - loc) / scale
    return np.exp(-y**2/2)/np.sqrt(2*np.pi) / scale

def sampler(
    data: np.ndarray,
    samples: int,
    *,
    mu_init: float = 0.5,
    proposal_width: float = 0.5,
    mu_prior_mu: float = 0,
    mu_prior_sd: float = 1.0,
) -> np.ndarray:
    
    mu_current = mu_init
    posterior = [mu_current]
    for i in range(samples):
        # Suggest new position
        mu_proposal = np.random.normal(mu_current, proposal_width)

        # Compute likelihood by multiplying probabilities of each data point
        likelihood_current = np.prod(norm(mu_current, 1, data))
        likelihood_proposal = np.prod(norm(mu_proposal, 1, data))

        # Compute prior probability of current and proposed mu
        prior_current = norm(mu_prior_mu, mu_prior_sd, mu_current)
        prior_proposal = norm(mu_prior_mu, mu_prior_sd, mu_proposal)

        p_current = likelihood_current * prior_current
        p_proposal = likelihood_proposal * prior_proposal

        # Accept proposal?
        p_accept = p_proposal / p_current

        # The proposal is symmetric, so the Metropolis-Hastings proposal ratio cancels and is omitted
        accept = np.random.rand() < p_accept

        if accept:
            # Update position
            mu_current = mu_proposal

        posterior.append(mu_current)

    return np.asarray(posterior)

Writing temp_prof.py

np.random.seed(123)
data = np.random.randn(20)

from temp_prof import sampler

%load_ext line_profiler

%lprun -f sampler posterior = sampler(data, samples=10_000, mu_init=1.)
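
For comparison, the built-in cProfile mentioned at the start of this section reports totals per function rather than per line. The following is a minimal sketch and not part of the original notebook: `toy_likelihood` is a hypothetical stand-in for the sampler's inner loop, and `norm` is rewritten here with the standard-library math module (scalars only) so the block runs without NumPy or line_profiler.

```python
import cProfile
import io
import math
import pstats
import random


def norm(loc, scale, x):
    # Gaussian density for a single float (scalar analogue of norm() in temp_prof.py)
    y = (x - loc) / scale
    return math.exp(-(y**2) / 2) / math.sqrt(2 * math.pi) / scale


def toy_likelihood(data, repeats):
    # Hypothetical stand-in for the sampler loop: one likelihood product per iteration
    total = 0.0
    for _ in range(repeats):
        total += math.prod(norm(0.0, 1.0, x) for x in data)
    return total


random.seed(123)
data = [random.gauss(0.0, 1.0) for _ in range(20)]

# Profile only the call of interest
profiler = cProfile.Profile()
profiler.enable()
result = toy_likelihood(data, 1_000)
profiler.disable()

# Write the report into a string, sorted by cumulative time per function
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

The per-function view helps to find which function dominates the run time; a line profiler can then show which lines inside that function are responsible.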

5.2. Memory profiling

Please see Fil. At the time of writing, it does not yet support Python 3.10 and does not work with notebooks, so it is not included here.
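
As a rough standard-library substitute (this is tracemalloc, not Fil, and it only tracks allocations made through Python's memory allocator), the sketch below measures current and peak memory around a single call. `build_chain` is a hypothetical stand-in for a function that grows a list, similar to the `posterior` list in the sampler.

```python
import tracemalloc


def build_chain(samples):
    # Hypothetical stand-in: grows a list of floats one element per iteration
    chain = [0.0]
    for i in range(samples):
        chain.append(chain[-1] + 0.5 * i)
    return chain


tracemalloc.start()
chain = build_chain(100_000)
# Bytes currently allocated and the peak since tracemalloc.start()
current, peak = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

print(f"current: {current / 1e6:.2f} MB, peak: {peak / 1e6:.2f} MB")

# Show the three source lines responsible for the most allocated memory
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```

The numbers depend on the Python version and platform, so treat them as relative indicators rather than exact measurements.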