Practical problems in many fields of study—such as biology, business, chemistry, computer science, economics, electronics, engineering, physics and the social sciences—can often be reduced to solving a system of linear equations. Linear algebra arose from attempts to find systematic methods for solving these systems, so it is natural to begin this book by studying linear equations.
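If $a$, $b$, and $c$ are real numbers, the graph of an equation of the form $ax + by = c$ is a straight line (if $a$ and $b$ are not both zero), so such an equation is called a linear equation in the variables $x$ and $y$. More generally, an equation of the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$ is called a linear equation in the $n$ variables $x_1, x_2, \dots, x_n$; the numbers $a_1, a_2, \dots, a_n$ are called the coefficients and $b$ is called the constant term.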
is a linear equation; the coefficients of
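Given a linear equation $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$, a sequence $s_1, s_2, \dots, s_n$ of $n$ numbers is called a solution to the equation if $a_1s_1 + a_2s_2 + \cdots + a_ns_n = b$,
that is, if the equation is satisfied when the substitutions $x_1 = s_1, x_2 = s_2, \dots, x_n = s_n$ are made.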
A system may have no solution at all, or it may have a unique solution, or it may have an infinite family of solutions. For instance, the system
Show that, for arbitrary values of
is a solution to the system
Simply substitute these values of
Because both equations are satisfied, it is a solution for all choices of
The quantities
When only two variables are involved, the solutions to systems of linear equations can be described geometrically because the graph of a linear equation
In particular, if the system consists of just one equation, there must be infinitely many solutions because there are infinitely many points on a line. If the system has two equations, there are three possibilities for the corresponding straight lines:
- The lines intersect at a single point. Then the system has a unique solution corresponding to that point.
- The lines are parallel (and distinct) and so do not intersect. Then the system has no solution.
- The lines are identical. Then the system has infinitely many solutions—one for each point on the (common) line.
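The three cases can also be told apart by a short computation. The following is a minimal sketch, assuming each line is written as a*x + b*y = c with a and b not both zero. The function name `classify_two_lines` is hypothetical, and the unique intersection point is found with Cramer's rule, a technique this section does not otherwise use.

```python
from fractions import Fraction


def classify_two_lines(eq1, eq2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Each equation is a tuple (a, b, c) with a and b not both zero
    (an assumption of this sketch, so that each graph really is a line).
    Returns a pair (kind, point); point is None unless kind == "unique".
    """
    a1, b1, c1 = map(Fraction, eq1)
    a2, b2, c2 = map(Fraction, eq2)
    det = a1 * b2 - a2 * b1
    if det != 0:
        # The lines cross at exactly one point (Cramer's rule).
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        return "unique", (x, y)
    # det == 0: the lines are parallel or identical. They are identical
    # exactly when the constants are in the same ratio as the coefficients.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return "infinitely many", None
    return "none", None


# Illustrative systems (not taken from the text):
crossing = classify_two_lines((1, 1, 2), (1, -1, 0))   # x + y = 2, x - y = 0
parallel = classify_two_lines((1, 1, 1), (1, 1, 2))    # x + y = 1, x + y = 2
identical = classify_two_lines((1, 1, 1), (2, 2, 2))   # x + y = 1, 2x + 2y = 2
```

Exact fractions are used rather than floating-point numbers so that the test `det != 0` is reliable.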
With three variables, the graph of an equation
Before describing the method, we introduce a concept that simplifies the computations involved. Consider the following system
of three equations in four variables. The array of numbers
occurring in the system is called the augmented matrix of the system. Each row of the matrix consists of the coefficients of the variables (in order) from the corresponding equation, together with the constant term. For clarity, the constants are separated by a vertical line. The augmented matrix is just a different way of describing the system of equations. The array of coefficients of the variables
is called the coefficient matrix of the system and
Elementary Operations
The algebraic method for solving systems of linear equations is described as follows. Two such systems are said to be equivalent if they have the same set of solutions. A system is solved by writing a series of systems, one after the other, each equivalent to the previous system. Each of these systems has the same set of solutions as the original one; the aim is to end up with a system that is easy to solve. Each system in the series is obtained from the preceding system by a simple manipulation chosen so that it does not change the set of solutions.
As an illustration, we solve the system
First, subtract twice the first equation from the second. The resulting system is
which is equivalent to the original. At this stage we obtain
Finally, we subtract twice the second equation from the first to get another equivalent system.
Now this system is easy to solve! And because it is equivalent to the original system, it provides the solution to that system.
Observe that, at each stage, a certain operation is performed on the system (and thus on the augmented matrix) to produce an equivalent system.
The following operations, called elementary operations, can routinely be performed on systems of linear equations to produce equivalent systems.
- Interchange two equations.
- Multiply one equation by a nonzero number.
- Add a multiple of one equation to a different equation.
Suppose that a sequence of elementary operations is performed on a system of linear equations. Then the resulting system has the same set of solutions as the original, so the two systems are equivalent.
Elementary operations performed on a system of equations produce corresponding manipulations of the rows of the augmented matrix. Thus, multiplying a row of a matrix by a number
In hand calculations (and in computer programs) we manipulate the rows of the augmented matrix rather than the equations. For this reason we restate these elementary operations for matrices.
The following are called elementary row operations on a matrix.
- Interchange two rows.
- Multiply one row by a nonzero number.
- Add a multiple of one row to a different row.
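The three row operations are simple to express in code. The following is a minimal sketch, assuming a matrix is stored as a list of rows and rows are numbered from 0. The function names and the small system used to try them out are hypothetical and are not taken from the text.

```python
from fractions import Fraction


def swap_rows(m, i, j):
    """Interchange rows i and j; returns a new matrix."""
    result = [row[:] for row in m]
    result[i], result[j] = result[j], result[i]
    return result


def scale_row(m, i, k):
    """Multiply row i by the nonzero number k; returns a new matrix."""
    if k == 0:
        raise ValueError("k must be nonzero")
    result = [row[:] for row in m]
    result[i] = [k * x for x in result[i]]
    return result


def add_multiple(m, src, dst, k):
    """Add k times row src to a different row dst; returns a new matrix."""
    if src == dst:
        raise ValueError("src and dst must be different rows")
    result = [row[:] for row in m]
    result[dst] = [d + k * s for d, s in zip(result[dst], result[src])]
    return result


# Hypothetical augmented matrix for x + 2y = 4, 2x + 5y = 9.
m = [[Fraction(1), Fraction(2), Fraction(4)],
     [Fraction(2), Fraction(5), Fraction(9)]]
m = add_multiple(m, 0, 1, -2)  # subtract twice row 1 from row 2
m = add_multiple(m, 1, 0, -2)  # subtract twice row 2 from row 1
# m now equals [[1, 0, 2], [0, 1, 1]], that is, x = 2 and y = 1.
```

Each function returns a new matrix instead of changing its argument, so every intermediate system in a reduction can be kept and inspected.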
In the illustration above, a series of such operations led to a matrix of the form
where the asterisks represent arbitrary numbers. In the case of three equations in three variables, the goal is to produce a matrix of the form
This does not always happen, as we will see in the next section. Here is an example in which it does happen.
Solution:
The augmented matrix of the original system is
To create a
The upper left
Next subtract
This completes the work on column 1. We now use the
Note that the last two manipulations did not affect the first column (the second row has a zero there), so our previous effort there has not been undermined. Finally we clean up the third column. Begin by multiplying row 3 by
Now subtract
The corresponding equations are
The algebraic method introduced in the preceding section can be summarized as follows: Given a system of linear equations, use a sequence of elementary row operations to carry the augmented matrix to a “nice” matrix (meaning that the corresponding equations are easy to solve). In Example 1.1.3, this nice matrix took the form
The following definitions identify the nice matrices that arise in this process.
A matrix is said to be in row-echelon form (and will be called a row-echelon matrix) if it satisfies the following three conditions:
- All zero rows (consisting entirely of zeros) are at the bottom.
- The first nonzero entry from the left in each nonzero row is a 1, called the leading 1 for that row.
- Each leading 1 is to the right of all leading 1s in the rows above it.
A row-echelon matrix is said to be in reduced row-echelon form (and will be called a reduced row-echelon matrix) if, in addition, it satisfies the following condition:
4. Each leading 1 is the only nonzero entry in its column.
The row-echelon matrices have a “staircase” form, as indicated by the following example (the asterisks indicate arbitrary numbers).
The leading
The importance of row-echelon matrices comes from the following theorem.
Every matrix can be brought to (reduced) row-echelon form by a sequence of elementary row operations.
In fact we can give a step-by-step procedure for actually finding a row-echelon matrix. Observe that while there are many sequences of row operations that will bring a matrix to row-echelon form, the one we use is systematic and is easy to program on a computer. Note that the algorithm deals with matrices in general, possibly with columns of zeros.
Step 1. If the matrix consists entirely of zeros, stop—it is already in row-echelon form.
Step 2. Otherwise, find the first column from the left containing a nonzero entry (call it $a$), and move the row containing that entry to the top position.
Step 3. Now multiply the new top row by $1/a$ to create a leading 1.
Step 4. By subtracting multiples of that row from rows below it, make each entry below the leading 1 zero.
Step 5. Repeat steps 1–4 on the matrix consisting of the remaining rows.
The process stops when either no rows remain at step 5 or the remaining rows consist entirely of zeros.
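The steps above can be sketched in a few lines of code. This is a minimal illustration rather than a definitive implementation: the function name `row_echelon` is hypothetical, the matrix is assumed to be a list of rows, exact fractions are used to avoid rounding, and a loop over the columns takes the place of the repetition in step 5.

```python
from fractions import Fraction


def row_echelon(matrix):
    """Carry a matrix (a list of rows) to row-echelon form."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows = len(m)
    cols = len(m[0]) if rows else 0
    top = 0  # index of the first row of the part still to be processed
    for col in range(cols):
        if top == rows:
            break  # no rows remain
        # Step 2: look for a nonzero entry in this column, at or below `top`,
        # and move its row to the top of the remaining rows.
        pivot = next((r for r in range(top, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column is zero in the remaining rows
        m[top], m[pivot] = m[pivot], m[top]
        # Step 3: divide the new top row by its first nonzero entry
        # to create a leading 1.
        a = m[top][col]
        m[top] = [x / a for x in m[top]]
        # Step 4: subtract multiples of that row to clear the entries below.
        for r in range(top + 1, rows):
            k = m[r][col]
            if k != 0:
                m[r] = [x - k * y for x, y in zip(m[r], m[top])]
        # Step 5: continue with the rows below the one just finished.
        top += 1
    return m


# An illustrative matrix (not one of the text's examples):
result = row_echelon([[0, 2, 4], [1, 1, 1], [2, 4, 6]])
# result equals [[1, 1, 1], [0, 1, 2], [0, 0, 0]]
```

Step 1 needs no separate test here: if the matrix (or the part of it that remains) consists entirely of zeros, no column yields a nonzero entry and the loop leaves those rows unchanged.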
Observe that the gaussian algorithm is recursive: When the first leading 1 has been obtained, the procedure is repeated on the remaining rows of the matrix.
Solution:
The corresponding augmented matrix is
Create the first leading one by interchanging rows 1 and 2
Now subtract
Now subtract row 2 from row 3 to obtain
This means that the following reduced system of equations
is equivalent to the original system. In other words, the two have the same solutions. But this last system clearly has no solution (the last equation requires that
To solve a linear system, the augmented matrix is carried to reduced row-echelon form, and the variables corresponding to the leading ones are called leading variables. Because the matrix is in reduced form, each leading variable occurs in exactly one equation, so that equation can be solved to give a formula for the leading variable in terms of the nonleading variables. It is customary to call the nonleading variables “free” variables, and to label them by new variables $s, t, \dots$, called parameters.
To solve a system of linear equations proceed as follows:
- Carry the augmented matrix to a reduced row-echelon matrix using elementary row operations.
- If a row of the form $[\,0 \ 0 \ \cdots \ 0 \mid 1\,]$ occurs, the system is inconsistent.
- Otherwise, assign the nonleading variables (if any) as parameters, and use the equations corresponding to the reduced row-echelon matrix to solve for the leading variables in terms of the parameters.
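The procedure above can be sketched in code. This is a minimal illustration under stated assumptions: the names `rref` and `analyze` are hypothetical, the augmented matrix is a list of rows whose last entry is the constant term, and variables are numbered from 0. The sketch reports whether the system is consistent and which variables are leading or free; it does not print the parametric formulas.

```python
from fractions import Fraction


def rref(matrix):
    """Carry a matrix to reduced row-echelon form with row operations."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    top = 0
    for col in range(cols):
        if top == rows:
            break
        pivot = next((r for r in range(top, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]
        a = m[top][col]
        m[top] = [x / a for x in m[top]]  # create the leading 1
        for r in range(rows):
            # Clear every other entry in this column, above and below.
            if r != top and m[r][col] != 0:
                k = m[r][col]
                m[r] = [x - k * y for x, y in zip(m[r], m[top])]
        top += 1
    return m


def analyze(augmented):
    """Return (status, leading, free) for the given augmented matrix."""
    reduced = rref(augmented)
    n = len(reduced[0]) - 1  # number of variables
    leading = []
    for row in reduced:
        first = next((j for j, x in enumerate(row) if x != 0), None)
        if first is None:
            continue  # zero row
        if first == n:
            # Leading 1 in the constants column: a row [0 0 ... 0 | 1].
            return "inconsistent", [], []
        leading.append(first)
    free = [j for j in range(n) if j not in leading]
    return "consistent", leading, free


# Illustrative system (not from the text):
#   x1 + 2*x2 - x3 = 1
# 2*x1 + 4*x2 + x3 = 5
status = analyze([[1, 2, -1, 1], [2, 4, 1, 5]])
# status == ("consistent", [0, 2], [1]); the reduced matrix is
# [[1, 2, 0, 2], [0, 0, 1, 1]], so x2 = t, x1 = 2 - 2t, x3 = 1.
```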
There is a variant of this procedure in which the augmented matrix is carried only to row-echelon form. The nonleading variables are assigned as parameters as before. Then the last equation (corresponding to the row-echelon form) is used to solve for the last leading variable in terms of the parameters. This last leading variable is then substituted into all the preceding equations. Next, the second-to-last equation yields the second-to-last leading variable, which is also substituted back. The process continues until the general solution is obtained. This procedure is called back-substitution; it can be shown to be numerically more efficient, and so it is important when solving very large systems.
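Back-substitution is easiest to see in the special case where every variable is a leading variable, so no parameters are needed. The following minimal sketch covers only that case. The function name `back_substitute` is hypothetical, and the input is assumed to be a row-echelon augmented matrix whose first n rows have their leading 1s in columns 0 to n - 1.

```python
from fractions import Fraction


def back_substitute(ref):
    """Solve a system from its row-echelon augmented matrix.

    Assumes (for this sketch) that there are n variables and that row i
    has its leading 1 in column i for i = 0, ..., n - 1. Any rows after
    the first n are taken to be zero rows and are ignored.
    """
    n = len(ref[0]) - 1
    x = [Fraction(0)] * n
    # Work upward from the last equation to the first.
    for i in range(n - 1, -1, -1):
        row = ref[i]
        # The variables after x[i] are already known; move them to the right.
        known = sum(Fraction(row[j]) * x[j] for j in range(i + 1, n))
        x[i] = Fraction(row[n]) - known
    return x


# Illustrative row-echelon system (not from the text):
#   x1 + x2 +   x3 = 6
#        x2 + 2*x3 = 8
#               x3 = 3
solution = back_substitute([[1, 1, 1, 6], [0, 1, 2, 8], [0, 0, 1, 3]])
# solution equals [1, 2, 3]
```

When nonleading variables are present, the same upward sweep applies, but each leading variable becomes a formula in the parameters rather than a single number.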
Rank
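It can be proven that the reduced row-echelon form of a matrix $A$ is uniquely determined by $A$: no matter which series of row operations is used to carry $A$ to a reduced row-echelon matrix, the result is always the same matrix. The rank of $A$ is the number of leading 1s in any row-echelon matrix to which $A$ can be carried by row operations.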
Compute the rank of
Solution:
The reduction of
Because this row-echelon matrix has two leading 1s, the rank is 2.
Suppose that rank
Proof:
The fact that the rank of the augmented matrix is
Theorem 1.2.2 shows that, for any system of linear equations, exactly three possibilities exist:
- No solution. This occurs when a row of the form $[\,0 \ 0 \ \cdots \ 0 \mid 1\,]$ occurs in the row-echelon form. This is the case where the system is inconsistent.
- Unique solution. This occurs when the system is consistent and every variable is a leading variable.
- Infinitely many solutions. This occurs when the system is consistent and there is at least one nonleading variable, so at least one parameter is involved.
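The three possibilities can be decided from ranks alone. The following is a minimal sketch with hypothetical function names `rank` and `count_solutions`. It relies on the standard fact (used here as an assumption, since the text phrases the test in terms of rows of the echelon form) that a system is inconsistent exactly when the augmented matrix has larger rank than the coefficient matrix.

```python
from fractions import Fraction


def rank(matrix):
    """Count the leading entries produced by forward elimination.

    Rows are not scaled to make the leading entries equal to 1,
    because scaling does not change how many leading entries there are.
    """
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for col in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            k = m[i][col] / m[r][col]
            m[i] = [x - k * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


def count_solutions(coefficients, constants):
    """Return (kind, number_of_parameters) for the linear system."""
    n = len(coefficients[0])  # number of variables
    augmented = [list(row) + [c] for row, c in zip(coefficients, constants)]
    r = rank(augmented)
    if rank(coefficients) < r:
        return "no solution", None
    if r == n:
        return "unique solution", 0
    return "infinitely many solutions", n - r


# Illustrative systems (not from the text):
case_none = count_solutions([[1, 1], [1, 1]], [1, 2])
case_unique = count_solutions([[1, 1], [1, -1]], [2, 0])
case_many = count_solutions([[1, 1], [2, 2]], [1, 2])
```

In the consistent case the second component is the number of parameters, which is the number of variables minus the rank.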
//www.geogebra.org/m/cwQ9uYCZ
Please answer these questions after you open the webpage:
1. For the given linear system, what does each equation represent in the graph?
2. Based on the graph, what can we say about the solutions? Does the system have one solution, no solution or infinitely many solutions? Why?
3. Change the constant term in every equation to 0. What changes in the graph?
4. For the following linear system:
Can you solve it using Gaussian elimination? When you look at the graph, what do you observe?
Many important problems involve linear inequalities rather than linear equations. For example, a condition on the variables
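A system of equations in the variables $x_1, x_2, \dots, x_n$ is called homogeneous if all the constant terms are zero, that is, if each equation of the system has the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0$.
Clearly $x_1 = 0, x_2 = 0, \dots, x_n = 0$ is a solution to such a system; it is called the trivial solution. Any solution in which at least one variable has a nonzero value is called a nontrivial solution.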
Our chief goal in this section is to give a useful condition for a homogeneous system to have nontrivial solutions. The following example is instructive.
Show that the following homogeneous system has nontrivial solutions.
Solution:
The reduction of the augmented matrix to reduced row-echelon form is outlined below.
The leading variables are
The existence of a nontrivial solution in Example 1.3.1 is ensured by the presence of a parameter in the solution. This is due to the fact that there is a nonleading variable (
If a homogeneous system of linear equations has more variables than equations, then it has a nontrivial solution (in fact, infinitely many).
Proof:
Suppose there are
Note that the converse of Theorem 1.3.1 is not true: if a homogeneous system has nontrivial solutions, it need not have more variables than equations (the system
Theorem 1.3.1 is very useful in applications. The next example provides an illustration from geometry.
Solution:
Let the coordinates of the five points be
This gives five equations, one for each
Linear Combinations and Basic Solutions
As for rows, two columns are regarded as equal if they have the same number of entries and corresponding entries are the same. Let
A sum of scalar multiples of several columns is called a linear combination of these columns. For example,
Solution:
For
Equating corresponding entries gives a system of linear equations
Turning to
leading to equations
Our interest in linear combinations comes from the fact that they provide one of the best ways to describe the general solution of a homogeneous system of linear equations. When solving such a system with
Example 1.3.1 is
saying that the general solution is
Now let
In fact, suppose that a typical equation in the system is
Hence
A similar argument shows that Statement 1.1 is true for linear combinations of more than two solutions.
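For two solutions, the argument can be written out for one typical equation. This is a sketch under assumed notation: $a_1x_1 + \cdots + a_nx_n = 0$ stands for a typical equation of the homogeneous system, $(x_1, \dots, x_n)$ and $(y_1, \dots, y_n)$ for two solutions, and $s$, $t$ for arbitrary scalars. These symbols are illustrative and may differ from the ones used elsewhere in the text.

```latex
\begin{aligned}
a_1(sx_1 + ty_1) + \cdots + a_n(sx_n + ty_n)
  &= s\,(a_1x_1 + \cdots + a_nx_n) + t\,(a_1y_1 + \cdots + a_ny_n) \\
  &= s \cdot 0 + t \cdot 0 \\
  &= 0 .
\end{aligned}
```

Since this holds for every equation of the system, the linear combination with entries $sx_i + ty_i$ is again a solution. The step uses the fact that every constant term is zero, so the argument does not carry over to systems that are not homogeneous.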
The remarkable thing is that every solution to a homogeneous system is a linear combination of certain particular solutions and, in fact, these solutions are easily computed using the gaussian algorithm. Here is an example.
Solve the homogeneous system with coefficient matrix
Solution:
The reduction of the augmented matrix to reduced form is
so the solutions are
Here
The solutions
The gaussian algorithm systematically produces solutions to any homogeneous linear system, called basic solutions, one for every parameter.
Moreover, the algorithm gives a routine way to express every solution as a linear combination of basic solutions as in Example 1.3.5, where the general solution
Hence by introducing a new parameter
For this reason:
Any nonzero scalar multiple of a basic solution will still be called a basic solution.
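The way the algorithm produces one basic solution per parameter can be sketched in code. This is a minimal illustration, not a definitive implementation: the function name `basic_solutions` is hypothetical, only the coefficient matrix is reduced (the constants column of a homogeneous system is zero and stays zero under row operations), and each basic solution is obtained by setting one parameter to 1 and the others to 0.

```python
from fractions import Fraction


def basic_solutions(coefficients):
    """Basic solutions of the homogeneous system with this coefficient matrix."""
    m = [[Fraction(x) for x in row] for row in coefficients]
    rows, n = len(m), len(m[0])
    leading = []  # leading[i] is the column of the leading 1 in row i
    top = 0
    # Carry the matrix to reduced row-echelon form.
    for col in range(n):
        if top == rows:
            break
        pivot = next((r for r in range(top, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]
        a = m[top][col]
        m[top] = [x / a for x in m[top]]
        for r in range(rows):
            if r != top and m[r][col] != 0:
                k = m[r][col]
                m[r] = [x - k * y for x, y in zip(m[r], m[top])]
        leading.append(col)
        top += 1
    free = [j for j in range(n) if j not in leading]
    solutions = []
    for f in free:
        x = [Fraction(0)] * n
        x[f] = Fraction(1)  # this parameter is 1, all other parameters are 0
        for i, col in enumerate(leading):
            # Row i reads: x[col] + (sum over free j of m[i][j] * x[j]) = 0.
            x[col] = -m[i][f]
        solutions.append(x)
    return solutions


# Illustrative coefficient matrix (not from the text):
basics = basic_solutions([[1, 2, -1], [2, 4, 1]])
# basics equals [[-2, 1, 0]]: one parameter, hence one basic solution.
```

If the system has only the trivial solution, there are no nonleading variables and the function returns an empty list.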
In the same way, the gaussian algorithm produces basic solutions to every homogeneous system, one for each parameter (there are no basic solutions if the system has only the trivial solution). Moreover every solution is given by the algorithm as a linear combination of these basic solutions (as in Example 1.3.5). If
Find basic solutions of the homogeneous system with coefficient matrix
Solution:
The reduction of the augmented matrix to reduced row-echelon form is
so the general solution is
Hence basic solutions are