How to solve a linear system of equations?

Marc

2024/11/28

While doing maths expertes (mainly Linear Algebra and Matrices) this year, I got to revisit some of the most fundamental questions in Linear Algebra. It’s the problem of solving a system of equations.

System of linear equations

A linear system of n equations with n unknowns is the following:

$$ \begin{array}{ccccccc} a_{0,0}x_{0} & + & \cdots & + & a_{0, n-1}x_{n-1} & = & b_0\\ & & \vdots & & \\ a_{i,0}x_{0} & + & \cdots & + & a_{i, n-1}x_{n-1} & = & b_i\\ & & \vdots & & \\ a_{n-1,0}x_{0} & + & \cdots & + & a_{n-1, n-1}x_{n-1} & = & b_{n-1}\\ \end{array} $$

It can be written in a matrix form $$Ax = b$$

where \(A = (a_{i,j})\), \(x = (x_i)\) and \(b = (b_i)\).

This is a powerful technique for approaching the problem from a linear algebra perspective.

How to solve this problem?

One naive solution one can think of is to multiply this equation on the left by the inverse of which gives: \(A^{-1}Ax = A^{-1}b\) and \(x = A^{-1}b\).

For a small 2x2 matrix its easy to compute its exact inverse and solve the equation.

Inverse the matrix A

For example for a 2x2 matrix

import numpy as np
import scipy as sp
A = np.matrix([2., 3., 4., 9.]).reshape((2, 2))
A
matrix([[2., 3.],
        [4., 9.]])
b = np.matrix([6., 15.]).reshape((2, 1))
b
matrix([[ 6.],
        [15.]])
detA = np.linalg.det(A)
detA
6.0
invA = 1/detA * np.matrix([[9., -3.], [-4., 2.]])
invA
matrix([[ 1.5       , -0.5       ],
        [-0.66666667,  0.33333333]])

The solution is

x = invA @ b
x
matrix([[1.5],
        [1. ]])

In general, we can still compute the inverse of the matrix A

For example for a 3x3 matrix

A = np.array([2., 1., -1., -3., -1., 2., -2., 1., 2.]).reshape((3,3))
A
array([[ 2.,  1., -1.],
       [-3., -1.,  2.],
       [-2.,  1.,  2.]])
b = np.array([8, -11., -3.]).reshape((3, 1))
b
array([[  8.],
       [-11.],
       [ -3.]])
invA = np.linalg.inv(A)
invA
array([[ 4.,  3., -1.],
       [-2., -2.,  1.],
       [ 5.,  4., -1.]])

The solution is

x = invA @ b
x
array([[ 2.],
       [ 3.],
       [-1.]])

The downside of this approach is that it’s very costly and complicated to compute the inverse of a matrix even numerically. What about a large matrix say a 100x100 matrix of even a 1000x1000 matrix?

Matrix decomposition methods to the rescue

More advanced methods rely on matrix decomposition methods. We factorise the matrix A generally into a product of two or three matrices with good properties thus making it easy to solve the problem.

Forward and backward substitution

# Solve Ux=b where U is upper triangular
def backward_subt(U, b):
    m, n = U.shape
    assert(m == n)
    x = np.zeros(n)
    for i in reversed(range(n)):
        s = 0.
        for j in range(i, n):
            s += U[i,j] * x[j]
        x[i] = (b[i] - s) / U[i,i]
    return x
# Solve Lx=b where L is lower triangular
def forward_subt(L, b):
    m, n = L.shape
    assert(m == n)
    x = np.zeros(n)
    for i in range(n):
        s = 0.
        for j in range(0, i):
            s += L[i,j] * x[j]
        x[i] = (b[i] - s) / L[i,i]
    return x

QR decomposition

Decompose \(A = QR\) where \(Q\) is orthogonal(\(Q^{-1}=Q^{T}\)) and \(R\) is upper triangular

m, n = A.shape
Q, R = np.linalg.qr(A)
Q
array([[-0.48507125,  0.41166465,  0.77151675],
       [ 0.72760688, -0.29939248,  0.6172134 ],
       [ 0.48507125,  0.86075337, -0.15430335]])
R
array([[-4.12310563, -0.72760688,  2.9104275 ],
       [ 0.        ,  1.5718105 ,  0.71105713],
       [ 0.        ,  0.        ,  0.15430335]])

From \(Ax=b\), we have \(QRx=b\), and \(Rx=Q^Tb\)

Solve \(z = Q^Tb\) using backward subtitution

Q.T @ b
array([[-13.33945938],
       [  4.00437436],
       [ -0.15430335]])
z = (Q.T @ b).flatten()
x = backward_subt(R, z)
x
array([ 2.,  3., -1.])

LU decomposition

Decompose a matrix into a product \(A = PLU\) and solve \(PLU x= b\), which is \(LUx = P^Tb\)

P, L, U = sp.linalg.lu(A)
t = (P.T @ b).flatten()
z = forward_subt(L, t)
z
array([-11.        ,   4.33333333,  -0.2       ])
x = backward_subt(U, z)
x
array([ 2.,  3., -1.])

SVD decomposition

Decompose \(A = U \Sigma V^T\), the solution is \(x = V \Sigma^{-1} U^T b\)

U, S, Vt = np.linalg.svd(A)
U
array([[-0.45463329, -0.45500496, -0.76568862],
       [ 0.72562679,  0.30930747, -0.61465002],
       [ 0.51650202, -0.83504454,  0.18954231]])
S
array([5.10343603, 1.71519373, 0.11424157])
Vt
array([[-0.80713287, -0.13006101,  0.57586514],
       [-0.09785674, -0.93246433, -0.34775615],
       [-0.58220322,  0.3370377 , -0.73989526]])
S
array([5.10343603, 1.71519373, 0.11424157])
invS = (np.ones(3) / S) 
def to_dense(x):
    n = x.shape[0]
    y = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            y[i,i] = x[i]
    return y
invS = np.ones(n) / S
invS
array([0.19594642, 0.58302452, 8.75338148])
x = Vt.T @ to_dense(invS) @ U.T @ b
x = x.flatten()
x
array([ 2.,  3., -1.])

Conclusion

It might be straightforward to try to inverse the matrix A when solving a linear system of n equations with n unknowns written in matrix form as \(Ax = b\), but this approach is costly in practice. One must always use a matrix decomposition method and especially the QR decomposition which is very efficient in practice and numerically stable.