Performance study
I've written an interesting look at 1D and 2D histogram performance here: https://iscinumpy.gitlab.io/post/histogram-speeds-in-python/
Bokeh
The plotting library Bokeh has just reached version 1.0. We'll use that today instead of matplotlib just to make things interesting. It's not as old and respected as matplotlib, but is prettier in a browser.
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
output_notebook()
import numpy as np
Fourier series
Our goal will be to decompose (periodic) functions into Fourier components. A few definitions:
$T$ is the period. $\omega = \frac{2\pi}{T}$ is the true frequency. The Fourier series is then written:
$$ y(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos n \omega t + b_n \sin n \omega t \right) \tag{1} $$We can compute it by integrating over the function of interest and computing the components:
$$ \left(\begin{matrix}a_n \\ b_n\end{matrix}\right) = \frac{2}{T} \int^{T}_{0} \left(\begin{matrix}\cos n \omega t \\ \sin n \omega t \end{matrix}\right) y(t) \, dt \tag{2} $$Useful tips:
- $a_0 = \left<y(t)\right>$
- For odd or even functions, we can drop cos or sin (respectively) and integrate over 1/2 the range.
Example: Step function
Let's define a step function valid from $-L$ to $L$ (it repeats outside that range). We have:
$$ f(t) = \begin{cases}1 & t \ge 0 \\ -1 & t < 0\end{cases}. $$This is odd, so we only need to integrate from $0$ to $L=T/2$ and multiply by 2. The average value is 0, so we can drop $a_0$. We have $\omega = 2\pi / T = \pi / L$ So equation 2 becomes:
$$ a_n =2 \frac{1}{L} \int^{L}_0 \sin \frac{n \pi t}{L} \, dt. $$Since $\int_0^A \sin a t \, dt = \frac{1 - \cos(a A)}{a}$, we have:
$$ a_n = \frac{2}{L} \frac{L}{n \pi} \left( 1 - \cos n \pi \right), $$which is non-zero only for odd n, giving the final formula:
$$ a_n = \frac{4}{n \pi}, \quad n\ \textrm{odd}. $$Combining with equation 1 gives:
$$ f(t) = \frac{4}{\pi} \sum^{\infty}_{n=1,3,5,\ldots} \frac{1}{n} \sin\left( \frac{n \pi t}{L} \right). $$t = np.linspace(-1.25 * np.pi, 1.25 * np.pi, 2000)
y = np.zeros_like(t)
for n in range(1, 100, 2):
y += 1 / n * np.sin(n * t)
y *= 4 / np.pi
p = figure(width=500, height=300)
p.line(t, y)
show(p)
The function is a best fit to the sawtooth, but it always "overshoots" the edges - this is the Gibbs phenomenon.
We can also look at the spectrum of frequencies:
x = np.arange(1, 100)
s = 4 / (np.pi * x)
s[1::2] = 0
p = figure(width=500, height=300)
p.vbar(x - 0.5, 1, s)
show(p)
x = np.linspace(-2.5 * np.pi, 2.5 * np.pi, 2000)
y = np.zeros_like(x)
L = 2 * np.pi
for n in range(1, 50):
y += 1 / n * (-1) ** (n + 1) * np.sin(2 * n * np.pi * x / L)
y *= 2 / np.pi
p = figure(width=500, height=300)
p.line(x, y)
show(p)
We can also look at the spectrum of frequencies:
x = np.arange(1, 50)
s = (-1) ** (x + 1) * 2 / (np.pi * x)
p = figure(width=500, height=300)
p.vbar(x - 0.5, 1, s)
show(p)
Fourier transform
Now briefly look at the Fourier transform, using the complex exponential function. Computationally, we will be converting the integrals to series anyway, so this becomes equivalent to the Fourier series.
$$ y(t) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} e^{i \omega t} Y(\omega) \, d\omega. \tag{3} $$And its inverse:
$$ Y(\omega) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} e^{- i \omega t} y(t) \, dt. \tag{4} $$You'll notice that different conventions exist for the sign and the location of the scaling factor - just be consistent and you'll be okay. We will follow the book's lead here.
We can apply the transform and the inverse together to obtain the Dirac delta function:
$$ \int_{-\infty}^{\infty} e^{i (\omega' - \omega) t}\, dt \equiv 2 \pi \delta (\omega' - \omega) $$This "function" is useful analytically because it can select on point out of an integration range, but ugly computationally. You should do it by hand before coding, rather than later.
We start with a signal $y(t)$ sampled in $N$ time intervals ($N+1$ measurements). We can assume constant time spacing $h \equiv \Delta t$. The total time is $T$ (carefully selected to look like a period as well, we'll see more later). We have a sampling rate $s \equiv N/T=1/h$ So:
$$ y_k \equiv y(t_k) \\ t_k \equiv k h $$At this point, we can view this Fourier transform as a Fourier Series, since we now have a periodicity imposed; the final measurement must be equal to the first $y_0 = y_N$.
Since we only have $N$ data points (left), we can only determine $N$ independent output Fourier components. We can write all the frequencies as multiples of the base frequency:
$$ \omega_n = n \omega_1 = n \frac{2 \pi}{N h} $$Forward transform:
$$ Y_n \equiv \frac{1}{h} Y(\omega_n) \approx \sum_{k=1}^{N} y_k \frac{e^{-2 \pi i k n / N}}{\sqrt{2 \pi}} \tag{5} $$And inverse:
$$ y(t) \approx \sum_{n=1}^{N} \frac{2 \pi}{N h} Y_n \frac{e^{i \omega_n t}}{\sqrt{2 \pi}} \tag{6} $$Let's add one more trick to make this easier to compute: we will introduce $Z=e^{-2\pi i / N}$, and rewrite the above expressions as
$$ y_k = \frac{\sqrt{2 \pi}}{N} \sum_{n=1}^{N} Z^{-n k} Y_n $$$$ Y_n = \frac{1}{\sqrt{2 \pi}} \sum_{k=1}^{N} Z^{n k} y_k $$N = 50
x = np.linspace(0, 2 * np.pi, N + 1)
signal = 30 * np.cos(3 * x) + 20 * np.sin(7 * x) # Change these values
h = 2 * np.pi / N
p = figure(width=500, height=300)
p.circle(x, signal)
show(p)
n = np.arange(N).reshape(-1, 1)
k = np.arange(N).reshape(1, -1)
zexpo = 1j * 2 * np.pi * k * n / N
zsum = np.sum(signal[:N] * np.exp(-zexpo), axis=1)
dftz = zsum / np.sqrt(2 * np.pi)
Note: Formula taken from the book. I think there may be a missing element on the end - but due to symmetric nature, the end element would be the same as the beginning element, so we don't lose anything.
p = figure(width=500, height=300)
p.line(k.ravel(), dftz.real)
p.line(k.ravel(), dftz.imag, color="firebrick")
show(p)
zfft = np.fft.fft(signal)
p = figure(width=500, height=300)
p.line(np.arange(N + 1), zfft.real)
p.line(np.arange(N + 1), zfft.imag, color="firebrick")
show(p)
N = 600
T = 1.0 / 800.0
x = np.linspace(0.0, N * T, N)
y = np.sin(50.0 * 2.0 * np.pi * x) + 0.5 * np.sin(80.0 * 2.0 * np.pi * x)
yf = np.fft.fft(y)
xf = np.linspace(0.0, 1.0 / (2.0 * T), N)
norm_yf = 2.0 / N * np.abs(yf)
p = figure(width=500, height=300)
p.line(x, y)
show(p)
p = figure(width=500, height=300)
p.line(xf[: N // 2 + 1], norm_yf[: N // 2 + 1])
show(p)
Let's look at that in more detail: (the description here is good).
t = np.linspace(0, 1, 1000)
f0 = 13 # Sampling frequency
f1 = 3 # First frequency
k = 1 # Any integer
f2 = f1 + k * f0 # Identical under the sampling frequency!
tp = np.linspace(0, 1, f0 + 1)
y1 = np.sin(2 * f1 * np.pi * t)
y2 = np.sin(2 * f2 * np.pi * t)
ytp = np.sin(2 * f2 * np.pi * tp)
p = figure(width=700, height=300)
p.line(t, y1)
p.line(t, y2, color="firebrick")
p.circle(tp, ytp, color="black")
show(p)