
6. Polynomials and functions of matrices

Let $f$ be a polynomial of degree $k$,

$$f(x) = b_k x^k + b_{k-1} x^{k-1} + \cdots + b_1 x + b_0,$$

and let $A = (a_{ij})_{n \times n}$. Define

$$f(A) = b_k A^k + b_{k-1} A^{k-1} + \cdots + b_1 A + b_0 I.$$
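The definition of f(A) can be evaluated with Horner's rule, which needs only k matrix products. The following is a minimal sketch, not part of the original notes: the helper names `mat_mul` and `poly_of_matrix`, the sample polynomial and the sample matrix are all illustrative assumptions, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_of_matrix(coeffs, A):
    """Evaluate f(A) = b_k A^k + ... + b_1 A + b_0 I by Horner's rule.

    coeffs holds b_k, ..., b_1, b_0 (highest degree first).
    """
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for b in coeffs:
        # One Horner step: result <- result * A + b * I
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += b
    return result


# Hypothetical example: f(x) = 2x^2 - x + 3 and a 2 x 2 matrix A.
A = [[Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(3)]]
F = poly_of_matrix([2, -1, 3], A)  # F == [[9, 9], [0, 18]]
```

Note that the constant term enters as b_0 I, not as a matrix with b_0 in every entry; the loop adds each coefficient to the diagonal only.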

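If $A$ is diagonalizable, $A = TDT^{-1}$ where $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then

$$\begin{aligned}
f(A) &= b_k TD^kT^{-1} + b_{k-1} TD^{k-1}T^{-1} + \cdots + b_1 TDT^{-1} + b_0 TT^{-1} \\
&= T \left[ b_k D^k + b_{k-1} D^{k-1} + \cdots + b_1 D + b_0 I \right] T^{-1} \\
&= T f(D) T^{-1}.
\end{aligned} \tag{1}$$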

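If $p$ is the characteristic polynomial of $A$, then $p(D) = 0$, because the eigenvalues are the roots of $p$, that is, $p(\lambda_i) = 0$, and

$$D^k = \operatorname{diag}(\lambda_1^k, \ldots, \lambda_n^k), \quad k = 0, 1, \ldots, n$$

$$\Rightarrow \quad p(D) = \begin{pmatrix} p(\lambda_1) & & 0 \\ & \ddots & \\ 0 & & p(\lambda_n) \end{pmatrix} = 0_{n \times n}.$$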

It now follows from (1) that $p(A) = 0$ (when $A$ is diagonalizable). More generally, we have

the Cayley-Hamilton theorem: If $p$ is the characteristic polynomial of a square matrix $A$, then $p(A) = 0$; that is, $A$ satisfies its own characteristic equation.
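The theorem also covers matrices that are not diagonalizable, which the argument via (1) does not reach. A small numerical check can make this concrete. The sketch below is an illustration only, assuming the 2 x 2 case, where the characteristic polynomial is x^2 - tr(A) x + det(A); the function name and the two test matrices are hypothetical choices.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_at_matrix_2x2(A):
    """Return p(A) for a 2 x 2 matrix A, where p(x) = x^2 - tr(A) x + det(A)."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    A2 = mat_mul(A, A)
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - trace * A[i][j] + det * identity[i][j]
             for j in range(2)] for i in range(2)]


J = [[3, 1], [0, 3]]  # double eigenvalue 3, not diagonalizable
M = [[1, 2], [3, 4]]  # two distinct real eigenvalues

print(char_poly_at_matrix_2x2(J))  # [[0, 0], [0, 0]]
print(char_poly_at_matrix_2x2(M))  # [[0, 0], [0, 0]]
```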

Example 1: Cayley-Hamilton theorem

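Let $p$ be the characteristic polynomial of $A$. The characteristic equation is

$$p(\lambda) = (-1)^n \lambda^n + b_{n-1} \lambda^{n-1} + \cdots + b_1 \lambda + b_0 = 0, \tag{2}$$

and by the Cayley-Hamilton theorem

$$(-1)^n A^n + b_{n-1} A^{n-1} + \cdots + b_1 A + b_0 I = 0 \tag{3}$$

$$\Rightarrow \quad A^n = (-1)^{n+1} \left[ b_{n-1} A^{n-1} + \cdots + b_1 A + b_0 I \right]$$

$$\Rightarrow \quad A^{n+1} = (-1)^{n+1} \left[ b_{n-1} A^n + \cdots + b_1 A^2 + b_0 A \right],$$

where we can substitute $A^n$ from the preceding equation. Repeating this step, it follows that all the powers $A^k$, $k \ge n$, can be written in terms of $A^0, A^1, \ldots, A^{n-1}$.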
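The reduction of high powers can be sketched for n = 2. There the Cayley-Hamilton relation reads A^2 = tr(A) A - det(A) I, so every power has the form A^k = alpha A + beta I, and multiplying by A gives the recurrence used below. This is an illustrative sketch under that assumption; `reduce_power` and the sample matrix are hypothetical names and data, not taken from the notes.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def reduce_power(k, trace, det):
    """Return (alpha, beta) with A^k = alpha * A + beta * I, for k >= 1.

    Valid for a 2 x 2 matrix A with the given trace and determinant.
    """
    alpha, beta = 1, 0  # A^1 = 1 * A + 0 * I
    for _ in range(k - 1):
        # A^(j+1) = alpha * A^2 + beta * A, with A^2 = trace * A - det * I
        alpha, beta = alpha * trace + beta, -alpha * det
    return alpha, beta


A = [[1, 2], [3, 4]]
trace, det = 5, -2

alpha, beta = reduce_power(5, trace, det)
reduced = [[alpha * A[i][j] + beta * (i == j) for j in range(2)]
           for i in range(2)]

# Direct computation of A^5 for comparison.
direct = A
for _ in range(4):
    direct = mat_mul(direct, A)

print(reduced == direct)  # True
```

Only two scalars are updated per step, so no matrix product is needed once the trace and determinant are known.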

Example 2: Powers of a matrix

If $A^{-1}$ exists, that is, if $\lambda = 0$ is not an eigenvalue of $A$ (so that $\det A = p(0) = b_0 \neq 0$), it follows from (3) that

$$I = -\frac{b_1}{b_0} A - \cdots - \frac{b_{n-1}}{b_0} A^{n-1} - \frac{(-1)^n}{b_0} A^n,$$

and thus, multiplying by $A^{-1}$,

$$A^{-1} = -\frac{b_1}{b_0} I - \frac{b_2}{b_0} A - \cdots - \frac{b_{n-1}}{b_0} A^{n-2} - \frac{(-1)^n}{b_0} A^{n-1}. \tag{4}$$
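As a worked illustration of formula (4), take a 2 x 2 matrix chosen here purely as an example (it does not come from the notes). For

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad p(\lambda) = \det(A - \lambda I) = \lambda^2 - 5\lambda - 2,$$

we have $n = 2$, $b_1 = -5$ and $b_0 = -2 \neq 0$, so (4) gives

$$A^{-1} = -\frac{b_1}{b_0} I - \frac{(-1)^2}{b_0} A = -\frac{5}{2} I + \frac{1}{2} A = \begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix}.$$

Multiplying out confirms that $A A^{-1} = I$.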

Next consider functions $f(A)$ of matrices, where $f : \mathbb{C} \to \mathbb{C}$ can be expressed as a power series

$$f(z) = \sum_{k=0}^{\infty} c_k z^k. \tag{5}$$

We have already mentioned the functions $e^A$, $\sin A$ and $\cos A$ (Section 5.1 and exercises). Define

$$f(A) = \sum_{k=0}^{\infty} c_k A^k \qquad (A^0 = I) \tag{6}$$

if the series converges. The following result can be shown:

Proposition 1. If the radius of convergence of the power series (5) is $r$ (that is, $f$ is given by (5) for all $z$ with $|z| < r$), then the matrix series (6) converges if the spectral radius $\rho(A)$ of $A$ satisfies $\rho(A) < r$. Here $\rho(A) = \max\{|\lambda_1|, \ldots, |\lambda_n|\}$.
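A familiar instance of the proposition is the geometric series with all coefficients equal to 1, which has radius of convergence 1 and sums to 1/(1 - z). For a matrix with spectral radius below 1 the partial sums I + A + A^2 + ... then approach the inverse of I - A. The sketch below illustrates this numerically; the triangular matrix (eigenvalues 1/2 and 1/3), the number of terms and the name `partial_sum` are assumptions made for the example.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def partial_sum(A, terms):
    """Return A^0 + A^1 + ... + A^(terms - 1)."""
    n = len(A)
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    total = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
        power = mat_mul(power, A)
    return total


# Upper triangular, so the eigenvalues 1/2 and 1/3 sit on the diagonal
# and the spectral radius is 1/2 < 1.
A = [[0.5, 0.25],
     [0.0, 1.0 / 3.0]]

S = partial_sum(A, 60)

# Inverse of I - A, computed by hand with the 2 x 2 inverse formula.
exact = [[2.0, 0.75],
         [0.0, 1.5]]

error = max(abs(S[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(error < 1e-12)  # True
```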

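Since all the powers $A^k$, $k \ge n$, can be written in terms of $A^0, A^1, \ldots, A^{n-1}$, it follows from (6) that

$$f(A) = d_0 I + d_1 A + d_2 A^2 + \cdots + d_{n-1} A^{n-1}. \tag{7}$$

The coefficients $d_0, \ldots, d_{n-1}$ depend on $A$. They satisfy the equations

$$f(\lambda_i) = d_0 + d_1 \lambda_i + d_2 \lambda_i^2 + \cdots + d_{n-1} \lambda_i^{n-1}, \qquad i = 1, \ldots, n,$$

where the $\lambda_i$ are the eigenvalues of $A$. Indeed, each eigenvalue satisfies $p(\lambda_i) = 0$, so the powers $\lambda_i^k$, $k \ge n$, reduce in exactly the same way as the powers of $A$:

$$f(\lambda_i) = \sum_{k=0}^{\infty} c_k \lambda_i^k, \quad p(\lambda_i) = 0 \quad \Rightarrow \quad f(\lambda_i) = d_0 + d_1 \lambda_i + \cdots + d_{n-1} \lambda_i^{n-1}.$$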
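For n = 2 and two distinct eigenvalues, the eigenvalue equations form a 2 x 2 linear system that can be solved by hand: d1 is a difference quotient of f and d0 follows by back-substitution. The sketch below applies this to the exponential function and compares the result with a truncated power series. The function names, the sample matrix (triangular, with eigenvalues 2 and 3) and the truncation at 40 terms are illustrative assumptions.

```python
import math


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_via_eigenvalue_equations(A, lam1, lam2):
    """e^A = d0 * I + d1 * A for a 2 x 2 matrix with eigenvalues lam1 != lam2."""
    # Solve exp(lam_i) = d0 + d1 * lam_i for i = 1, 2.
    d1 = (math.exp(lam1) - math.exp(lam2)) / (lam1 - lam2)
    d0 = math.exp(lam1) - d1 * lam1
    return [[d0 * (i == j) + d1 * A[i][j] for j in range(2)]
            for i in range(2)]


def exp_series(A, terms=40):
    """Truncated power series: sum of A^k / k! for k = 0, ..., terms - 1."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 / 0!
    total = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
        product = mat_mul(term, A)
        term = [[product[i][j] / (k + 1) for j in range(n)]
                for i in range(n)]
    return total


A = [[2.0, 1.0],
     [0.0, 3.0]]

closed_form = exp_via_eigenvalue_equations(A, 2.0, 3.0)
series = exp_series(A)
gap = max(abs(closed_form[i][j] - series[i][j])
          for i in range(2) for j in range(2))
print(gap < 1e-9)  # True
```

The difference quotient breaks down when the two eigenvalues coincide, which is why a repeated eigenvalue needs a separate equation.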

Example 3: sin A

Note. If $q$ is a polynomial with $q(\lambda) = 0$ for the eigenvalues $\lambda$ of $A$, it is not necessarily true that $q(A) = 0$. There exists a polynomial $m$ of minimal degree, called the minimal polynomial of $A$, such that $m(A) = 0$.

If the multiplicity of an eigenvalue $\lambda_i$ is greater than one, we may also use the derivative equation

$$f'(\lambda_i) = d_1 + 2 d_2 \lambda_i + \cdots + (n-1) d_{n-1} \lambda_i^{n-2}.$$
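As a worked illustration of the repeated-eigenvalue case, consider a matrix chosen here only as an example, with the double eigenvalue $\lambda = 3$, and $f(z) = e^z$:

$$A = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}, \qquad f(A) = d_0 I + d_1 A.$$

The eigenvalue equation and the derivative equation give

$$e^3 = d_0 + 3 d_1, \qquad e^3 = d_1 \quad \Rightarrow \quad d_1 = e^3, \quad d_0 = -2e^3,$$

and therefore

$$e^A = -2e^3 I + e^3 A = e^3 \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

This agrees with the power series: writing $A = 3I + N$ with $N^2 = 0$ gives $e^A = e^3 (I + N)$.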

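If $A$ is diagonalizable, then

$$\begin{aligned}
f(A) = \sum_{k=0}^{\infty} c_k A^k &= \sum_{k=0}^{\infty} c_k TD^kT^{-1} = T \left[ \sum_{k=0}^{\infty} c_k \operatorname{diag}(\lambda_1^k, \ldots, \lambda_n^k) \right] T^{-1} \\
&= T \left[ \sum_{k=0}^{\infty} \operatorname{diag}(c_k \lambda_1^k, \ldots, c_k \lambda_n^k) \right] T^{-1} \\
&= T \operatorname{diag}\left( \sum_{k=0}^{\infty} c_k \lambda_1^k, \ldots, \sum_{k=0}^{\infty} c_k \lambda_n^k \right) T^{-1} \\
&= T \operatorname{diag}(f(\lambda_1), \ldots, f(\lambda_n)) T^{-1}
\end{aligned}$$

and thus

$$f(A) = T \operatorname{diag}(f(\lambda_1), \ldots, f(\lambda_n)) T^{-1}. \tag{8}$$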
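Formula (8) translates directly into code once a diagonalization is known. The sketch below is an illustration only: it assumes the symmetric matrix A with rows (2, 1) and (1, 2), whose eigenvalues 1 and 3 and eigenvectors (1, -1) and (1, 1) were worked out by hand, and the name `function_of_matrix` is hypothetical. The eigenvectors form the columns of T.

```python
import math


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def function_of_matrix(T, eigenvalues, T_inv, f):
    """Return T diag(f(lambda_1), ..., f(lambda_n)) T^{-1}."""
    n = len(eigenvalues)
    f_of_D = [[f(eigenvalues[i]) if i == j else 0.0 for j in range(n)]
              for i in range(n)]
    return mat_mul(mat_mul(T, f_of_D), T_inv)


# Diagonalization of A = [[2, 1], [1, 2]], computed by hand.
T = [[1.0, 1.0],
     [-1.0, 1.0]]
T_inv = [[0.5, -0.5],
         [0.5, 0.5]]
eigenvalues = [1.0, 3.0]

# Sanity check with f(z) = z^2, which must reproduce A * A.
A_squared = function_of_matrix(T, eigenvalues, T_inv, lambda z: z * z)
print(A_squared)  # [[5.0, 4.0], [4.0, 5.0]]

# The matrix exponential of A via (8).
exp_A = function_of_matrix(T, eigenvalues, T_inv, math.exp)
```

Here the function f is applied only to the n eigenvalues, so no series has to be summed; the cost lies in finding T and its inverse.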


Exercises: E60, E61, E62, E63, E64