import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg
Matrix multiplication
All operations on an array are element-wise. NumPy used to have a "matrix" mode, where all operations were "matrix-wise"; that is, something like * would do matrix multiplication instead of element-wise multiplication. This was ugly and messy, and it has been replaced in Python 3.5+ with a dedicated matrix multiply operator, @. (Older Python: use np.matmul() or the .dot() method.)
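For example, with a small throwaway array m (just for illustration), the two operators give very different results:
m = np.array([[1, 2], [3, 4]])
m * m  # element-wise: [[1, 4], [9, 16]]
m @ m  # matrix product: [[7, 10], [15, 22]]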
Let's first look at the dimension rules for matrix multiplication:
[a, b] @ [b, c] = [a, c]
(np.ones([3, 4]) @ np.ones([4, 5])).shape
The "inner" dimensions go away. This works for 1D arrays, too:
[a] @ [a] = scalar
(np.ones([4]) @ np.ones([4])).shape
One of the two is allowed to have more than 2 dimensions, in which case it behaves like "stacks" of arrays:
[a,b,c] @ [c,d] = [a,b,d]
(np.ones([2, 3, 4]) @ np.ones([4, 5])).shape
Normal "prepend 1" broadcasting rules apply.
np.array([1, 2, 3]) @ np.array([[1, 2, 3]]).T
Power user: Einstein summation notation
You can use Einstein summation notation for full control:
a = np.arange(25).reshape(5, 5)
np.trace(a)                 # trace
np.einsum("ii", a)          # trace: the repeated index "ii" is summed over
a.T                         # transpose
np.einsum("ji", a)          # transpose: output axes come back in "ij" order
a @ a                       # matrix multiplication
np.einsum("ij,jk", a, a)    # matrix multiplication: the shared index "j" is summed over
np.sum(a * a)               # sum of all squared elements
np.einsum("ij,ij", a, a)    # same: both indices repeat, so everything is summed
np.einsum("ij->", a ** 2)   # same again, with an explicit (empty) output signature
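As a further sketch, the "stacked" matrix multiplication from earlier can also be written explicitly in einsum notation, with b labeling the batch axis:
np.einsum("bij,jk->bik", np.ones([2, 3, 4]), np.ones([4, 5])).shape  # (2, 3, 5), same as the @ version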
b = np.array([1, 2, 3])
print(b)
A = np.array([[1, 2, 3], [22, 32, 42], [55, 66, 100]])
print(A)
np.linalg.inv(A) @ b
Note that for these equations, 1D vectors really should be 2D column vectors! @ and solve handle 1D vectors pretty well, so we are safe here, but be careful.
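For instance, a quick illustrative check: the result shape depends on whether you pass a 1D vector or an explicit column.
print((np.linalg.inv(A) @ b).shape)                 # (3,)  - 1D in, 1D out
print((np.linalg.inv(A) @ b.reshape(-1, 1)).shape)  # (3, 1) - column in, column out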
Computing the inverse is slow - there are faster algorithms when you just want to solve a single system, available as solve, which internally uses the LAPACK linear-algebra library. If we use scipy.linalg.solve instead, we can even tell solve that we know something special about our matrix, like if we have a diagonal matrix!
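A minimal sketch of that idea, assuming a SciPy version with the assume_a keyword (the names spd and rhs are just for illustration):
spd = np.array([[4.0, 1.0], [1.0, 3.0]])  # a small symmetric positive-definite matrix
rhs = np.array([1.0, 2.0])
scipy.linalg.solve(spd, rhs, assume_a="pos")  # lets LAPACK use a Cholesky-based routine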
x = np.linalg.solve(A, b)
A @ x - b
x
A = np.array([[4, -2, 1], [3, 6, -4], [2, 1, 8]])
np.linalg.inv(A) - 1 / 263 * np.array([[52, 17, 2], [-32, 30, 19], [-9, -8, +30]])
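A handy way to confirm the difference above is essentially zero is np.allclose, which compares to floating-point tolerance:
np.allclose(np.linalg.inv(A), 1 / 263 * np.array([[52, 17, 2], [-32, 30, 19], [-9, -8, 30]]))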
x1 = np.array([12, -25, 32])
x2 = np.array([4, -10, 22])
x3 = np.array([20, -30, 40])
print(np.linalg.inv(A) @ x1)
print(np.linalg.inv(A) @ x2)
print(np.linalg.inv(A) @ x3)
print(np.linalg.solve(A, x1))
print(np.linalg.solve(A, x2))
print(np.linalg.solve(A, x3))
For such a tiny problem, inv beats solve by a hair. But if you invert only once, you can reuse that one inverse to solve many different right-hand sides!
%%timeit
np.linalg.solve(A, x1)
np.linalg.solve(A, x2)
np.linalg.solve(A, x3)
%%timeit
np.linalg.inv(A) @ x1
np.linalg.inv(A) @ x2
np.linalg.inv(A) @ x3
%%timeit
Ai = np.linalg.inv(A)
Ai @ x1
Ai @ x2
Ai @ x3
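If you really do need many right-hand sides, the usual trick is to factor the matrix once and reuse the factorization; a sketch with scipy.linalg.lu_factor / lu_solve (lu_piv is just a name for the returned factorization):
lu_piv = scipy.linalg.lu_factor(A)   # LU-factor once
scipy.linalg.lu_solve(lu_piv, x1)    # reuse the factorization for each right-hand side
scipy.linalg.lu_solve(lu_piv, x2)
scipy.linalg.lu_solve(lu_piv, x3)
np.linalg.solve also accepts a 2D right-hand side whose columns are separate b vectors, solving them all in one call.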
Let's look at the Hilbert matrix, whose entries are 1 / (i + j - 1) (with 1-based indices i and j); scipy.linalg can build it for us:
a = scipy.linalg.hilbert(8)
Or, given the formula in the book, we could have produced it ourselves:
i, j = np.ogrid[1 : len(a) + 1, 1 : len(a) + 1]
a_ours = 1 / (i + j - 1)
np.all(a == a_ours)
We need b
, which is just the first row of the matrix:
b = a[0]
Let's try solve. When you try 100x100, does this still work?
np.linalg.solve(a, b)
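Since the Hilbert matrix is symmetric, b is also its first column, so the exact solution is just [1, 0, 0, ...]. One quick sanity check (not a rigorous error analysis):
x = np.linalg.solve(a, b)
np.abs(x - np.eye(len(a))[0]).max()  # worst-case deviation from the exact solution; grows rapidly with size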
We can use the invhilbert function to make an inverse Hilbert matrix. You can pass exact=True to return integers instead of double-precision floats. Note that this matrix will overflow 64-bit integers at 14x14, and will therefore become an inefficient Python-object integer array.
scipy.linalg.invhilbert(8) @ b
We can also take the inverse ourselves. What happens when this becomes larger?
np.linalg.inv(a) @ b
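The underlying issue is conditioning: np.linalg.cond reports how much the matrix amplifies relative errors, and for the Hilbert matrix it blows up quickly (the comments give rough orders of magnitude):
np.linalg.cond(a)                         # roughly 1e10 already at 8x8
np.linalg.cond(scipy.linalg.hilbert(13))  # far beyond 1/eps for double precision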
np.linalg.eig computes eigenvalues and eigenvectors; the columns of the returned matrix are the eigenvectors. Let's try it on a symmetric matrix:
I = 1 / 12 * np.array([[8, -3, -3], [-3, 8, -3], [-3, -3, 8]])
λs, ωs = np.linalg.eig(I)
λs
ωs
ω = ωs[:, 0]
print("ω:", ω)
print("Iω", I @ ω)
print("λω", λs[0] * ω)
print(I @ ωs - λs * ωs)
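Since this matrix is symmetric, np.linalg.eigh is also worth knowing about (a small aside; λs_h and ωs_h are just names here): it returns real eigenvalues in ascending order and orthonormal eigenvectors.
λs_h, ωs_h = np.linalg.eigh(I)
print(λs_h)
print(np.allclose(ωs_h.T @ ωs_h, np.eye(3)))  # the eigenvector matrix is orthogonal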
Mathematics tends to think of stacked vectors as column vectors - which is the opposite of what's natural to write in a text file, like [[row1], [row2]]. This is why matrix-oriented languages tend to be column-major. You can stack in rows and use .T when needed for matrix manipulations. You might want to start in "F" (Fortran, column-major) order if performance is important.