How to solve a linear system of equations?

While doing maths expertes (mainly Linear Algebra and Matrices) this year, I got to revisit some of the most fundamental questions in Linear Algebra. It’s the problem of solving a system of equations.

System of linear equations

A linear system of n equations with n unknowns is the following:

$$ \begin{array}{ccccccc} a_{0,0}x_{0} & + & \cdots & + & a_{0, n-1}x_{n-1} & = & b_0\\ & & \vdots & & \\ a_{i,0}x_{0} & + & \cdots & + & a_{i, n-1}x_{n-1} & = & b_i\\ & & \vdots & & \\ a_{n-1,0}x_{0} & + & \cdots & + & a_{n-1, n-1}x_{n-1} & = & b_{n-1}\\ \end{array} $$

It can be written in a matrix form $$Ax = b$$

where $A = (a_{i,j})$, $x = (x_i)$ and $b = (b_i)$.

This is a powerful technique for approaching the problem from a linear algebra perspective.

How to solve this problem?

One naive solution one can think of is to multiply this equation on the left by the inverse of which gives: $A^{-1}Ax = A^{-1}b$ and $x = A^{-1}b$.

For a small 2x2 matrix its easy to compute its exact inverse and solve the equation.

Inverse the matrix A

For example for a 2x2 matrix

import numpy as np
import scipy as sp

A = np.matrix([2., 3., 4., 9.]).reshape((2, 2))
A

matrix([[2., 3.],
        [4., 9.]])

b = np.matrix([6., 15.]).reshape((2, 1))
b

matrix([[ 6.],
        [15.]])

detA = np.linalg.det(A)
detA

6.0

invA = 1/detA * np.matrix([[9., -3.], [-4., 2.]])
invA

matrix([[ 1.5       , -0.5       ],
        [-0.66666667,  0.33333333]])

The solution is

x = invA @ b
x

matrix([[1.5],
        [1. ]])

In general, we can still compute the inverse of the matrix A

For example for a 3x3 matrix

A = np.array([2., 1., -1., -3., -1., 2., -2., 1., 2.]).reshape((3,3))
A

array([[ 2.,  1., -1.],
       [-3., -1.,  2.],
       [-2.,  1.,  2.]])

b = np.array([8, -11., -3.]).reshape((3, 1))
b

array([[  8.],
       [-11.],
       [ -3.]])

invA = np.linalg.inv(A)
invA

array([[ 4.,  3., -1.],
       [-2., -2.,  1.],
       [ 5.,  4., -1.]])

The solution is

x = invA @ b
x

array([[ 2.],
       [ 3.],
       [-1.]])

The downside of this approach is that it’s very costly and complicated to compute the inverse of a matrix even numerically. What about a large matrix say a 100x100 matrix of even a 1000x1000 matrix?

Matrix decomposition methods to the rescue

More advanced methods rely on matrix decomposition methods. We factorise the matrix A generally into a product of two or three matrices with good properties thus making it easy to solve the problem.

Forward and backward substitution

# Solve Ux=b where U is upper triangular
def backward_subt(U, b):
    m, n = U.shape
    assert(m == n)
    x = np.zeros(n)
    for i in reversed(range(n)):
        s = 0.
        for j in range(i, n):
            s += U[i,j] * x[j]
        x[i] = (b[i] - s) / U[i,i]
    return x

# Solve Lx=b where L is lower triangular
def forward_subt(L, b):
    m, n = L.shape
    assert(m == n)
    x = np.zeros(n)
    for i in range(n):
        s = 0.
        for j in range(0, i):
            s += L[i,j] * x[j]
        x[i] = (b[i] - s) / L[i,i]
    return x

QR decomposition

Decompose $A = QR$ where $Q$ is orthogonal($Q^{-1}=Q^{T}$) and $R$ is upper triangular

m, n = A.shape
Q, R = np.linalg.qr(A)

array([[-0.48507125,  0.41166465,  0.77151675],
       [ 0.72760688, -0.29939248,  0.6172134 ],
       [ 0.48507125,  0.86075337, -0.15430335]])

array([[-4.12310563, -0.72760688,  2.9104275 ],
       [ 0.        ,  1.5718105 ,  0.71105713],
       [ 0.        ,  0.        ,  0.15430335]])

From $Ax=b$, we have $QRx=b$, and $Rx=Q^Tb$

Solve $z = Q^Tb$ using backward subtitution

Q.T @ b

array([[-13.33945938],
       [  4.00437436],
       [ -0.15430335]])

z = (Q.T @ b).flatten()
x = backward_subt(R, z)
x

array([ 2.,  3., -1.])

LU decomposition

Decompose a matrix into a product $A = PLU$ and solve $PLU x= b$, which is $LUx = P^Tb$

Solve $Lz = t$ using forward substitution where $z = Ux$ and $t = P^T b$
Solve $Ux = z$ using backward subtitution

P, L, U = sp.linalg.lu(A)

t = (P.T @ b).flatten()
z = forward_subt(L, t)
z

array([-11.        ,   4.33333333,  -0.2       ])

x = backward_subt(U, z)
x

array([ 2.,  3., -1.])

SVD decomposition

Decompose $A = U \Sigma V^T$, the solution is $x = V \Sigma^{-1} U^T b$

U, S, Vt = np.linalg.svd(A)

array([[-0.45463329, -0.45500496, -0.76568862],
       [ 0.72562679,  0.30930747, -0.61465002],
       [ 0.51650202, -0.83504454,  0.18954231]])

array([5.10343603, 1.71519373, 0.11424157])

Vt

array([[-0.80713287, -0.13006101,  0.57586514],
       [-0.09785674, -0.93246433, -0.34775615],
       [-0.58220322,  0.3370377 , -0.73989526]])

array([5.10343603, 1.71519373, 0.11424157])

invS = (np.ones(3) / S)

def to_dense(x):
    n = x.shape[0]
    y = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            y[i,i] = x[i]
    return y

invS = np.ones(n) / S
invS

array([0.19594642, 0.58302452, 8.75338148])

x = Vt.T @ to_dense(invS) @ U.T @ b
x = x.flatten()
x

array([ 2.,  3., -1.])

Conclusion

It might be straightforward to try to inverse the matrix A when solving a linear system of n equations with n unknowns written in matrix form as $Ax = b$, but this approach is costly in practice. One must always use a matrix decomposition method and especially the QR decomposition which is very efficient in practice and numerically stable.