The Inverse and Transpose of a Matrix

I will now address the question of whether, for a given m × n matrix A, there exists an n × m matrix B such that, with y = Ax, By = x. If so, the action of A is undone by B; that is, B moves y back to the original position x.

If m < n, there is no way to undo the mapping y = Ax. In other words, there does not exist an n × m matrix B such that By = x. To see this, consider the 1 × 2 matrix A = (2, 1). Then, with x as in (I.12), Ax = 2x₁ + x₂ = y, but if we know y and A we only know that x is located on the line x₂ = y − 2x₁; however, there is no way to determine where on this line.
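To make this concrete, here is a minimal numerical sketch (plain Python; the two example points are mine, not from the text) of why the 1 × 2 matrix A = (2, 1) cannot be undone: two different points on the line x₂ = y − 2x₁ have the same image y.

```python
# A = (2, 1) maps a 2-vector x = (x1, x2) to the scalar y = 2*x1 + x2.
def apply_A(x):
    x1, x2 = x
    return 2 * x1 + x2

x_a = (1.0, 3.0)  # one point on the line x2 = y - 2*x1 for y = 5
x_b = (2.0, 1.0)  # a different point on the same line

# Both points have the same image, so knowing y and A cannot identify x.
assert apply_A(x_a) == apply_A(x_b) == 5.0
assert x_a != x_b
```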

If m = n in (I.14), and thus the matrix A involved is a square matrix, we can undo the mapping y = Ax if the columns³ of the matrix A are linearly independent. Take for example the matrix A in (I.11) and the vector y in (I.13), and let B be the matrix satisfying BA = I₂. Then By = BAx = x,

³ Here and in the sequel the columns of a matrix are interpreted as vectors.

and thus this matrix B moves the point y = Ax back to x. Such a matrix is called the inverse of A and is denoted by A⁻¹. Note that, for an invertible n × n matrix A, A⁻¹A = Iₙ, where Iₙ is the n × n unit matrix:

     | 1 0 0 … 0 |
     | 0 1 0 … 0 |
Iₙ = | 0 0 1 … 0 |
     | ⋮ ⋮ ⋮ ⋱ ⋮ |
     | 0 0 0 … 1 |

Note that a unit matrix is a special case of a diagonal matrix, that is, a square matrix with all off-diagonal elements equal to zero.
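As a small sketch (plain Python; the function names are mine), the unit matrix can be built as a nested list, and applying it to a vector leaves the vector unchanged:

```python
def unit_matrix(n):
    # I_n: ones on the diagonal, zeros everywhere else.
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matvec(M, x):
    # Matrix-vector product: (Mx)_i = sum_j M[i][j] * x[j].
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

I3 = unit_matrix(3)
x = [4.0, -1.0, 2.5]
assert matvec(I3, x) == x  # I_n x = x for every x
```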

We have seen that the inverse of A is a matrix A⁻¹ such that A⁻¹A = I.⁴ But what about AA⁻¹? Does the order of multiplication matter? The answer is no:

Theorem I.1: If A is invertible, then AA⁻¹ = I; that is, A is the inverse of A⁻¹,

because it is trivial that

Theorem I.2: If A and B are invertible matrices, then (AB)⁻¹ = B⁻¹A⁻¹.
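Theorems I.1 and I.2 can be verified on a small example. The sketch below is plain Python; the 2 × 2 matrices and the closed-form 2 × 2 inverse are my own illustrative choices, with determinant 1 so that the floating-point arithmetic is exact:

```python
def matmul(A, B):
    # (AB)[i][j] = sum_k A[i][k] * B[k][j]
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    # Closed-form inverse of a 2 x 2 matrix (assumes a nonzero determinant).
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [0.0, 1.0]]
I = [[1.0, 0.0], [0.0, 1.0]]

assert matmul(A, inv2(A)) == I  # Theorem I.1: A A^-1 = I
assert matmul(inv2(A), A) == I  # and A^-1 A = I
assert inv2(matmul(A, B)) == matmul(inv2(B), inv2(A))  # Theorem I.2
```

Note the reversed order B⁻¹A⁻¹ on the right-hand side of the last check: undoing AB means undoing B's action first, then A's.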

Now let us give a formal proof of our conjecture that

Theorem I.3: A square matrix is invertible if and only if its columns are linearly independent.

Proof: Let A be the n × n matrix involved. I will show first that

(a) The columns a₁,…,aₙ of A are linearly independent if and only if for every b ∈ Rⁿ the system of n linear equations Ax = b has a unique solution.

To see this, suppose that there exists another solution y: Ay = b. Then A(x − y) = 0 and x − y ≠ 0, which imply that the columns a₁,…,aₙ of A are linearly dependent. Similarly, if for every b ∈ Rⁿ the system Ax = b has a unique solution, then the columns a₁,…,aₙ of A must be linearly independent because, if not, there exists a vector c ≠ 0 in Rⁿ such that Ac = 0; hence, if x is a solution of Ax = b, then so is x + c.
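The "if not" step can be illustrated numerically. In this sketch (plain Python; the singular matrix and the vectors are my own illustrative choices), Ac = 0 for a nonzero c, so x and x + c solve the same system:

```python
def matvec(M, x):
    # Matrix-vector product: (Mx)_i = sum_j M[i][j] * x[j].
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

# A singular 2 x 2 matrix: its second column is twice its first.
A = [[1.0, 2.0], [2.0, 4.0]]

# A nonzero c with Ac = 0 encodes the dependence between the columns.
c = [2.0, -1.0]
assert matvec(A, c) == [0.0, 0.0]

# If x solves Ax = b, then so does x + c: the solution is not unique.
x = [1.0, 1.0]
b = matvec(A, x)
x_alt = [x[0] + c[0], x[1] + c[1]]
assert matvec(A, x_alt) == b
```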

Next, I will show that

(b) A is invertible if and only if for every b ∈ Rⁿ the system of n linear equations Ax = b has a unique solution.

⁴ Here and in the sequel I denotes a generic unit matrix.

First, if A is invertible, then the solution of Ax = b is x = A⁻¹b, which for each b ∈ Rⁿ is unique. Second, let b = eᵢ be the ith column of the unit matrix Iₙ, and let xᵢ be the unique solution of Axᵢ = eᵢ. Then the matrix X with columns x₁,…,xₙ satisfies

AX = A(x₁,…,xₙ) = (Ax₁,…,Axₙ) = (e₁,…,eₙ) = Iₙ;

hence, A is the inverse of X: A = X⁻¹. It now follows from Theorem I.1 that X is the inverse of A: X = A⁻¹. Q.E.D.
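The construction in part (b), which builds the inverse column by column from Axᵢ = eᵢ, can be sketched for a 2 × 2 example (plain Python; the matrix and the Cramer's-rule solver are my own illustrative choices):

```python
def solve2(A, b):
    # Solve the 2 x 2 system Ax = b by Cramer's rule (assumes det != 0).
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    return [(b[0] * a22 - a12 * b[1]) / det,
            (a11 * b[1] - b[0] * a21) / det]

def matmul(A, B):
    # (AB)[i][j] = sum_k A[i][k] * B[k][j]
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2.0, 1.0], [1.0, 1.0]]

# Column i of X is the unique solution of A x_i = e_i ...
x1 = solve2(A, [1.0, 0.0])
x2 = solve2(A, [0.0, 1.0])
X = [[x1[0], x2[0]], [x1[1], x2[1]]]

# ... so AX = (e_1, e_2) = I_2, i.e., X = A^-1.
assert matmul(A, X) == [[1.0, 0.0], [0.0, 1.0]]
```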

If the columns of a square matrix A are linearly dependent, then Ax maps the point x into a lower-dimensional space, namely, the subspace spanned by the columns of A. Such a mapping is called a singular mapping, and the corresponding matrix A is therefore called singular. Consequently, a square matrix with linearly independent columns is described as nonsingular. It follows from Theorem I.3 that nonsingularity is equivalent to invertibility and singularity is equivalent to the absence of invertibility.

If m > n in (I.14), and thus the matrix A involved has more rows than columns, we can also undo the action of A if the columns of the matrix A are linearly independent, as follows. First, consider the transpose⁵ Aᵀ of the matrix A in (I.14);

that is, Aᵀ is formed by filling its columns with the elements of the corresponding rows of A. Note that

Theorem I.4: (AB)ᵀ = BᵀAᵀ. Moreover, if A and B are square and invertible, then (Aᵀ)⁻¹ = (A⁻¹)ᵀ, ((AB)⁻¹)ᵀ = (B⁻¹A⁻¹)ᵀ = (A⁻¹)ᵀ(B⁻¹)ᵀ = (Aᵀ)⁻¹(Bᵀ)⁻¹, and similarly, ((AB)ᵀ)⁻¹ = (BᵀAᵀ)⁻¹ = (Aᵀ)⁻¹(Bᵀ)⁻¹ = (A⁻¹)ᵀ(B⁻¹)ᵀ.

Proof: Exercise.
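As a partial check on the exercise, the identities (AB)ᵀ = BᵀAᵀ and (Aᵀ)⁻¹ = (A⁻¹)ᵀ can be verified on a small example (plain Python; the non-symmetric 2 × 2 matrices with determinant 1 are my own choices, so the arithmetic is exact):

```python
def matmul(A, B):
    # (AB)[i][j] = sum_k A[i][k] * B[k][j]
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    # Row i of M becomes column i of M^T.
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

def inv2(M):
    # Closed-form inverse of a 2 x 2 matrix (assumes det != 0).
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1.0, 2.0], [0.0, 1.0]]
B = [[1.0, 1.0], [0.0, 1.0]]

assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))  # (AB)^T = B^T A^T
assert inv2(transpose(A)) == transpose(inv2(A))                       # (A^T)^-1 = (A^-1)^T
```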

Because a vector can be interpreted as a matrix with only one column, the transpose operation also applies to vectors. In particular, the transpose of the vector x in (I.2) is xᵀ = (x₁, x₂, …, xₙ), which may be interpreted as a 1 × n matrix.

⁵ The transpose of a matrix A is also denoted in the literature by A′.

Now if y = Ax, then Aᵀy = AᵀAx, where AᵀA is an n × n matrix. If AᵀA is invertible, then (AᵀA)⁻¹Aᵀy = x, and thus the action of the matrix A is undone by the n × m matrix (AᵀA)⁻¹Aᵀ. Consequently, it remains to be shown that

Theorem I.5: AᵀA is invertible if and only if the columns of the matrix A are linearly independent.

Proof: Let a₁,…,aₙ be the columns of A. Then Aᵀa₁,…,Aᵀaₙ are the columns of AᵀA; thus, the columns of AᵀA are linear combinations of the columns of Aᵀ. Suppose that the columns of AᵀA are linearly dependent. Then there exist coefficients cⱼ, not all equal to zero, such that c₁Aᵀa₁ + ⋯ + cₙAᵀaₙ = 0. This equation can be rewritten as Aᵀ(c₁a₁ + ⋯ + cₙaₙ) = 0. Because a₁,…,aₙ are linearly independent, we have c₁a₁ + ⋯ + cₙaₙ ≠ 0; hence, the columns of Aᵀ are linearly dependent. However, this is impossible because of the next theorem. Therefore, if the columns of A are linearly independent, then so are the columns of AᵀA. Thus, the theorem under review follows from Theorem I.3 and Theorem I.6 below.

Theorem I.6: The dimension of the subspace spanned by the columns of a matrix A is equal to the dimension of the subspace spanned by the columns of its transpose Aᵀ.

The proof of Theorem I.6 has to be postponed because it requires results from the next sections. In particular, Theorem I.6 follows from Theorems I.11, I.12, and I.13.

Definition I.7: The dimension of the subspace spanned by the columns of a matrix A is called the rank of A.

Thus, a square matrix is invertible if and only if its rank equals its size, and if a matrix is invertible, then so is its transpose.
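Finally, the left-inverse construction (AᵀA)⁻¹Aᵀ for the case m > n can be sketched numerically (plain Python; the 3 × 2 matrix and the helper functions are my own illustrative choices, and the final comparison uses a small tolerance to absorb floating-point round-off):

```python
def matmul(A, B):
    # (AB)[i][j] = sum_k A[i][k] * B[k][j]
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def transpose(M):
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

def inv2(M):
    # Closed-form inverse of a 2 x 2 matrix (assumes det != 0).
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# A 3 x 2 matrix with linearly independent columns.
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
x = [3.0, -2.0]
y = matvec(A, x)  # y = Ax lives in R^3

At = transpose(A)
AtA = matmul(At, A)               # 2 x 2, and invertible here
left_inv = matmul(inv2(AtA), At)  # (A^T A)^-1 A^T, an n x m matrix

# The left inverse undoes the action of A: (A^T A)^-1 A^T y = x.
recovered = matvec(left_inv, y)
assert all(abs(r - xi) < 1e-9 for r, xi in zip(recovered, x))
```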