System of Linear Equations
In this part, we will be studying a system of linear equations. In general, a system of linear equations is a set of \(\displaystyle m\) equations in \(\displaystyle n\) unknowns:
\[ \begin{aligned} a_{11} x_{1} +\cdots +a_{1n} x_{n} & =b_{1}\\ \vdots & \vdots \\ a_{m1} x_{1} +\cdots +a_{mn} x_{n} & =b_{m} \end{aligned} \]
Collecting the coefficients into an \(\displaystyle m\times n\) matrix \(\displaystyle \mathbf{A}\), the unknowns into the vector \(\displaystyle \mathbf{x} \in \mathbb{R}^{n}\), and the entries on the RHS into the vector \(\displaystyle \mathbf{b} \in \mathbb{R}^{m}\), we can represent the system as:
\[ \mathbf{Ax} =\mathbf{b} \]
We can also interpret the system in a slightly different form. Let us treat each column of \(\displaystyle \mathbf{A}\) as a column vector in \(\displaystyle \mathbb{R}^{m}\), and call the \(\displaystyle i^{\text{th}}\) column \(\displaystyle \mathbf{a}_{i}\):
\[ \mathbf{A} =\begin{bmatrix} | & & |\\ \mathbf{a}_{1} & \cdots & \mathbf{a}_{n}\\ | & & | \end{bmatrix} \]
Then, the system \(\displaystyle \mathbf{Ax} =\mathbf{b}\) can be equivalently represented as:
\[ x_{1}\mathbf{a}_{1} +\cdots +x_{n}\mathbf{a}_{n} =\mathbf{b} \]
The LHS is a linear combination of the columns of \(\displaystyle \mathbf{A}\). From this, we see that we are looking for multipliers \(\displaystyle x_{1} ,\cdots ,x_{n}\) that express \(\displaystyle \mathbf{b}\) as a linear combination of the vectors \(\displaystyle \{\mathbf{a}_{1} ,\cdots ,\mathbf{a}_{n}\}\). This representation has the advantage that it lets us bring the tools of vector spaces and linear maps to bear on the problem.
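The equivalence of the two views can be checked numerically. The following is a minimal sketch using NumPy; the matrix `A` and the vector `x` are made-up illustrative values, not taken from the text.

```python
import numpy as np

# Hypothetical 3x2 example (m = 3 equations, n = 2 unknowns).
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])
x = np.array([2.0, -1.0])

# Matrix view: the product Ax.
b_product = A @ x

# Column view: x_1 * a_1 + ... + x_n * a_n, built one column at a time.
b_combination = np.zeros(A.shape[0])
for i in range(A.shape[1]):
    b_combination += x[i] * A[:, i]

# Both views give the same right-hand side.
assert np.allclose(b_product, b_combination)
```

Here `A[:, i]` plays the role of \(\displaystyle \mathbf{a}_{i}\), so the loop is a literal transcription of the linear combination above.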
One can immediately see that a system of equations has a solution if and only if \(\displaystyle \mathbf{b}\) is in the span of the columns of \(\displaystyle \mathbf{A}\). A system that has a solution is called consistent, while a system that doesn’t have a solution is called inconsistent. In the next few sections we will first study consistent systems. Before taking up the general case of \(\displaystyle \mathbf{Ax} =\mathbf{b}\), it is instructive to look at \(\displaystyle \mathbf{Ax} =\mathbf{0}\), which is called a homogeneous system.
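One way to test consistency in practice, sketched here under assumptions: \(\displaystyle \mathbf{b}\) lies in the span of the columns of \(\displaystyle \mathbf{A}\) exactly when appending \(\displaystyle \mathbf{b}\) as an extra column does not increase the rank. The helper name `is_consistent` and the example data are hypothetical, and `np.linalg.matrix_rank` computes a numerical rank with a tolerance, so the result is only reliable for well-scaled inputs.

```python
import numpy as np

def is_consistent(A, b):
    """Return True if b is (numerically) in the span of the columns of A.

    Hypothetical helper: compares rank(A) with the rank of the
    augmented matrix [A | b].
    """
    augmented = np.column_stack([A, b])
    return bool(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(augmented))

# Illustrative data: the columns of A span the plane x_3 = 0 in R^3.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
b_in_span = np.array([2.0, 3.0, 0.0])   # lies in that plane
b_outside = np.array([2.0, 3.0, 1.0])   # has a component outside it

consistent_case = is_consistent(A, b_in_span)
inconsistent_case = is_consistent(A, b_outside)
```

Note that the homogeneous system \(\displaystyle \mathbf{Ax} =\mathbf{0}\) always passes this test, since \(\displaystyle \mathbf{x} =\mathbf{0}\) is a solution.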
Having done this, we will turn to the situation when \(\displaystyle \mathbf{b}\) is not in the span of the columns of \(\displaystyle \mathbf{A}\). This means that \(\displaystyle \mathbf{Ax} =\mathbf{b}\) doesn’t have a solution. This situation is quite common in practice. When we don’t have the luxury of perfect solutions, we have to settle for compromises. So we will study the system \(\displaystyle \mathbf{Ax} \approx \mathbf{b}\). Can we find an approximate solution? We will define in a precise manner what it means for a solution to be approximate.
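As a preview, and only as an assumption about what "approximate" might mean, one common choice is the least-squares criterion: pick \(\displaystyle \mathbf{x}\) that makes \(\displaystyle \| \mathbf{Ax} -\mathbf{b} \|\) as small as possible. The sketch below uses NumPy's `np.linalg.lstsq` on made-up data; the precise definition adopted later may differ.

```python
import numpy as np

# Illustrative inconsistent system: b is not in the span of the columns of A.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
b = np.array([2.0, 3.0, 1.0])

# Assumption: "approximate" is taken in the least-squares sense.
# lstsq returns (solution, residuals, rank, singular values).
x_hat, _, _, _ = np.linalg.lstsq(A, b, rcond=None)

# Size of the mismatch between A x_hat and b.
residual_norm = np.linalg.norm(A @ x_hat - b)
```

In this example the third equation reads \(\displaystyle 0 = 1\), so no choice of \(\displaystyle \mathbf{x}\) can make the residual vanish; the best one can do is match the first two entries of \(\displaystyle \mathbf{b}\).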
To summarize, we will study these three systems in this order:
- \(\displaystyle \mathbf{Ax} = \mathbf{0}\)
- \(\displaystyle \mathbf{Ax} =\mathbf{b}\)
- \(\displaystyle \mathbf{Ax} \approx \mathbf{b}\)
Along the way, we will revisit some of the ideas that we are already familiar with from a first course in linear algebra to help us cleanly formulate the theory.