# Matrices

One of the most fundamental objects in linear algebra (in fact, in all applied mathematics) is a matrix, and the most basic core problem is solving systems of linear equations. So, we start with a brief review of matrices and methods for solving systems of linear equations.

> [!Definition]
> A matrix is a rectangular array of numbers:
> $$A=\begin{bmatrix}a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{mn}\end{bmatrix}$$

Here, $m$ is the number of rows in $A$ and $n$ is the number of columns. The matrix $A$ is square if $m=n$.

## Matrix Arithmetic

### Matrix Addition:

$$C=A+B \iff c_{ij}=a_{ij}+b_{ij},\qquad A,B,C\in M_{m\times n}$$

It has the usual properties of addition (commutativity and associativity).

### Scalar Multiplication:

$$B=\alpha A \iff b_{ij}=\alpha a_{ij},\qquad \alpha\in\mathbb{R}$$

### Matrix Multiplication:

$$C=AB \iff c_{ij}=\sum_{k=1}^{n}a_{ik}b_{kj},\qquad A\in M_{m\times n},\ B\in M_{n\times p},\ C\in M_{m\times p}$$

Matrix multiplication has the usual properties, such as associativity and distributivity, except commutativity: in general, $AB\neq BA$.
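As an illustration, here is a minimal sketch using NumPy (the specific matrices are arbitrary examples) showing all three operations, including the failure of commutativity:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

C_add = A + B       # entrywise sum: c_ij = a_ij + b_ij
C_scaled = 2.5 * A  # scalar multiplication: b_ij = alpha * a_ij
C_prod = A @ B      # matrix product: c_ij = sum_k a_ik * b_kj

print(C_prod)
print(np.array_equal(A @ B, B @ A))  # False: AB != BA for these A, B
```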

## Linear Equations

The matrix multiplication defined above immediately allows us to rewrite a general linear system of $m$ equations with $n$ unknowns:

$$\begin{cases}a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n=b_1\\ \quad\vdots\\ a_{m1}x_1+a_{m2}x_2+\cdots+a_{mn}x_n=b_m\end{cases}$$

in a compact matrix form: $Ax=b$, where $A=[a_{ij}]$, $x=\begin{bmatrix}x_1\\\vdots\\ x_n\end{bmatrix}$, $b=\begin{bmatrix}b_1\\\vdots\\ b_m\end{bmatrix}$.
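For example, a small system can be solved directly (a minimal sketch assuming NumPy; `numpy.linalg.solve` uses LU factorization, i.e., Gaussian elimination with partial pivoting, under the hood):

```python
import numpy as np

# 2x2 system:  x1 + 2*x2 = 5,  3*x1 + 4*x2 = 11
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 11.0])

x = np.linalg.solve(A, b)
print(x)                      # [1. 2.]
print(np.allclose(A @ x, b))  # True: the residual is (numerically) zero
```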

### Curve Fitting:

Suppose we have a set of data points $(x_1,y_1),\dots,(x_n,y_n)$, where $(x_i,y_i)$ are measurements in a certain experiment. Suppose we want to fit a polynomial to the data, i.e., find a polynomial $y=p(x)$ that passes through these points. Given $n$ points, it is enough to consider polynomials of degree $n-1$. Let $p(x)=a_0+a_1x+a_2x^2+\cdots+a_{n-1}x^{n-1}$, where $a_i\in\mathbb{R}$. So the problem is to find the coefficients $a_0,\dots,a_{n-1}$ such that:

$$\begin{cases}p(x_1)=a_0+a_1x_1+a_2x_1^2+\cdots+a_{n-1}x_1^{n-1}=y_1\\ \quad\vdots\\ p(x_n)=a_0+a_1x_n+a_2x_n^2+\cdots+a_{n-1}x_n^{n-1}=y_n\end{cases}\quad\Longleftrightarrow\quad\begin{bmatrix}1 & x_1 & \cdots & x_1^{n-1}\\ 1 & x_2 & \cdots & x_2^{n-1}\\ \vdots & \vdots & & \vdots\\ 1 & x_n & \cdots & x_n^{n-1}\end{bmatrix}\begin{bmatrix}a_0\\ a_1\\ \vdots\\ a_{n-1}\end{bmatrix}=\begin{bmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{bmatrix}$$

The more data points we have, the larger the system.
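A minimal sketch of this interpolation in NumPy (the data points here are made up for illustration): we assemble the coefficient matrix above, known as the Vandermonde matrix, and solve for the coefficients.

```python
import numpy as np

# Hypothetical data points (x_i, y_i)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 5.0, 16.0])

n = len(x)
# Vandermonde matrix: row i is [1, x_i, x_i^2, ..., x_i^(n-1)]
V = np.vander(x, n, increasing=True)

a = np.linalg.solve(V, y)        # coefficients a_0, ..., a_{n-1}
p = np.polynomial.Polynomial(a)  # p(x) = a_0 + a_1 x + ... + a_{n-1} x^{n-1}
print(np.allclose(p(x), y))      # True: p passes through every data point
```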

### Solving Differential Equations:

Very few differential equations can be solved analytically (an intuitive reason: very few functions can be integrated analytically). In most applications, numerical solutions are required. Consider the following equation (the Poisson equation):

$$-\frac{d^2u}{dx^2}=f(x)$$
with boundary conditions:

$$u(0)=0=u(1)$$

This describes many simple physical phenomena:

  • Temperature distribution $u(x)$ in a bar with a heat source $f(x)$
  • Deformation of an elastic bar
  • Deformation of a string under tension

In applications, the "source term" $f$ may not even be known in closed form; we may only be able to measure $f$ at any given point $x$. To solve this problem we need to discretize it.

Let us subdivide the interval $[0,1]$ into $n+1$ equal subintervals:

$$x_i=ih,\quad i=0,\dots,n+1,\qquad h=\frac{1}{n+1}\ll 1$$
Let $u_i=u(x_i)$. From the boundary conditions, we know that $u_0=0$, $u_{n+1}=0$. If we find $u_1,\dots,u_n$, this will give us an approximation of $u(x)$. The first step is to approximate $\frac{d^2u}{dx^2}$. Assuming that $h$ is small ($n$ is large), we approximate $\frac{du}{dx}$ by a central difference, which can be obtained as the average of two more direct approximations, $\frac{u(x+h)-u(x)}{h}$ and $\frac{u(x)-u(x-h)}{h}$:

$$\frac{du}{dx}\approx\frac{u(x+h)-u(x-h)}{2h},\qquad \frac{d^2u}{dx^2}\approx\frac{u'\!\left(x+\frac{h}{2}\right)-u'\!\left(x-\frac{h}{2}\right)}{h}\approx\frac{\frac{u(x+h)-u(x)}{h}-\frac{u(x)-u(x-h)}{h}}{h}=\frac{u(x+h)-2u(x)+u(x-h)}{h^2}$$
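A quick numerical sanity check of the last approximation (a minimal sketch assuming NumPy; $u(x)=\sin x$ is chosen only because its second derivative, $-\sin x$, is known exactly):

```python
import numpy as np

u = np.sin       # test function with known u''(x) = -sin(x)
x, h = 0.7, 1e-3

approx = (u(x + h) - 2 * u(x) + u(x - h)) / h**2
exact = -np.sin(x)
print(approx, exact)        # agree to roughly O(h^2)
print(abs(approx - exact))  # ~5e-8 for h = 1e-3
```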

Therefore, $-\frac{d^2u}{dx^2}=f(x)$ leads to the difference equation $-u_{i+1}+2u_i-u_{i-1}=h^2f_i$, where $f_i=f(x_i)$, $i=1,\dots,n$.
This system of difference equations can be written in matrix form:

$$\begin{bmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & & \\ 0 & -1 & 2 & \ddots & 0\\ \vdots & & \ddots & \ddots & -1 \\ 0 & \cdots & 0 & -1 & 2 \end{bmatrix}\begin{bmatrix}u_1 \\\vdots\\ u_n\end{bmatrix}=h^2\begin{bmatrix}f_1\\\vdots\\ f_n\end{bmatrix}$$

To obtain an accurate approximation, the discretization step $h$ should be small $\Rightarrow n$ should be large. Numerical schemes for PDEs arising in fluid and solid mechanics, weather prediction, image and video processing, molecular dynamics, chemical processes, etc., often require $n\sim10^6$ and more. The design of efficient numerical algorithms for solving large systems is an active area of research.

## Square Matrices

### Inverses:

The inverse of a matrix is an analog of the reciprocal of a number.

> [!Definition]
> Let $A\in M_{n\times n}$. The inverse of $A$, denoted $A^{-1}$, is an $n\times n$ matrix that satisfies $AA^{-1}=A^{-1}A=I_n$, where $I_n=\begin{bmatrix}1 & & 0 \\ & \ddots & \\ 0 & & 1\end{bmatrix}$ is the $n \times n$ identity matrix.

Let us recall a few important properties of matrix inverses.

> [!info]-
> If $A\in M_{m\times n}$, we can define a right inverse of $A$ as a matrix $X\in M_{n\times m}$ such that $AX=I_m$, or a left inverse of $A$ as a matrix $Y\in M_{n\times m}$ such that $YA=I_n$.

1) If $A^{-1}$ exists, then it is unique.
   Proof: Suppose $AX=XA=I$ and $AY=YA=I$. Then $X=XI=X(AY)=(XA)Y=IY=Y$.

2) $(A^{-1})^{-1}=A$.
   Proof: All we need to check is that the definition is satisfied: $A^{-1}\cdot A=I_n$ and, similarly, $A\cdot A^{-1}=I_n$.

3) If $A,B\in M_{n\times n}$ are invertible $\Rightarrow (AB)^{-1}=B^{-1}A^{-1}$.
   Proof: Simple check.
   $(AB)\cdot(B^{-1}A^{-1})=A(BB^{-1})A^{-1}=AA^{-1}=I$
   $(B^{-1}A^{-1})\cdot(AB)=B^{-1}(A^{-1}A)B=B^{-1}B=I$

From the theory of matrices, it follows that requiring only $AA^{-1}=I$ or only $A^{-1}A=I$ in the definition is enough:

4) If $AX=I_n$, then $XA=I_n$ and thus $X=A^{-1}$. Similarly, if $XA=I_n$, then $AX=I_n$ and thus $X=A^{-1}$.

Not all square matrices have an inverse (but most do!). See below.

### Singularity:

> [!Definition]
> A noninvertible matrix $A$ is called singular. If $A^{-1}$ exists $\Rightarrow A$ is nonsingular.

The original motivation for introducing the matrix inverse is that it allows us to write the solution of any linear system in a compact way: if $A$ is nonsingular, then the unique solution of $Ax=b$ is $x=A^{-1}b$. However, finding the inverse $A^{-1}$ (using, e.g., the Gauss-Jordan method) is computationally inefficient compared to direct Gaussian elimination, which provides a systematic method for solving linear systems. Nevertheless, $A^{-1}$ is of great theoretical importance and provides insights into the design of practical algorithms.
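To make this concrete, here is a minimal sketch (assuming NumPy; the source term $f(x)=\pi^2\sin(\pi x)$ is chosen only because the exact solution $u(x)=\sin(\pi x)$ is then known) that assembles the tridiagonal system from the Poisson example above and solves it by Gaussian elimination via `numpy.linalg.solve`, rather than by forming $A^{-1}$:

```python
import numpy as np

n = 100
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)  # interior points x_1, ..., x_n

# Tridiagonal matrix: 2 on the diagonal, -1 on the off-diagonals
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

f = np.pi**2 * np.sin(np.pi * x)  # exact solution is u(x) = sin(pi*x)
u = np.linalg.solve(A, h**2 * f)  # Gaussian elimination, not A^{-1} b

print(np.max(np.abs(u - np.sin(np.pi * x))))  # error ~ O(h^2)
```

Replacing the `solve` call with `np.linalg.inv(A) @ (h**2 * f)` produces the same answer but requires roughly three times as much arithmetic, which illustrates the practical point made above.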