Category Archives: linear algebra

Rotations and reflections

A brief  note.


A vector space of uncountable dimension

Let V=\{ f: {\bf N}\rightarrow R\} be the real vector space of all functions from the natural numbers to the reals. Then V has uncountable dimension. To see this, for each a>0, let f_a be the function such that f_a(n)=a^n.

Claim: The f_a are linearly independent.

Proof: It suffices to show that any finite collection \{f_{a_i}\}_{i=1}^n with a strictly increasing sequence of a_i are linearly independent.

We prove this by induction on n, the fact being clear for n=1. Suppose \sum_{i=1}^nc_if_{a_i}=0 with n>1. The equation means


for all m. Thus, \sum_{i=1}^{n-1}c_i(a_i/a_n)^m+c_n=0 for all m. Now let m\rightarrow \infty. This shows that c_n=0. Thus, by induction, all c_i=0.

We have displayed an uncountable linearly independent collection of functions in V. Now, let \{b_i\}_{i\in I} be a basis for V. For each f_a there is then a unique finite set I(a)\subset I of indices such that f_a can be written  as a linear combination with non-zero coefficients of \{b_i\}_{i\in I(a)}. The linear span of any given finite set of the b_i is finite-dimensional. Hence, for any finite subset S\subset I, there are at most finitely many a such that I(a)=S. That is, the map a\mapsto I(a) is a finite-to-one map from the positive reals R_{>0} to the finite subsets of I. Hence, the set of finite subsets of I must be uncountable. But then I itself must be uncountable. (I leave it as an exercise to show that the set of finite subsets of a countable set is itself countable. You should really write out the proof if you’ve never done it before.)

I might point out that before the tutorials, I was a bit confused myself. That is, the first bit about the f_a‘s being an uncountable linearly independent set is rather easy. However, I started hesitating: Still, why can’t there be a countable set of elements in terms of which we can express all of them? After all, the set of coefficients we can use for the expressions is uncountable… So think through again clearly: how is this resolved above?

As a final remark, note that this proves that V is not isomorphic to R[x]. This is perhaps the first example you’ve seen where you can prove that two vector spaces of *infinite* dimensions are not isomorphic by simply counting the dimensions and comparing them.

Linear independence of polynomials

One of the exercises this week asked for a proof of linear independence for the set

\{x^i\}_{i\in {\bf N}}

inside the polynomials R[x] with real coefficients. However, note that the polynomials here are regarded as *functions* from R to R. Thus, it amounts to showing that if

c_0+c_1x+\cdots c_nx^n=0

as a function, then all c_i have to be zero. This does require proof. One quick way to do this is to note that all polynomial functions are differentiable. And if

f(x)=c_0+c_1x+\cdots c_nx^n

is the zero function, then so are all its derivatives. In particular,


for all i. But f^{(i)}(0)=i!c_i. Thus, c_i=0 for all i.

One possible reason for confusion is that there is another ‘formal’ definition of R[x] by simply identifying a polynomial with its sequence of coefficients. That is, you can think of an element of R[x] as a function f:N \rightarrow R that has *finite support* in that f(i)=0 for all but finitely many i. With this definition, the polynomial x^i becomes identified with the function e_i that sends i to 1 and everything else to zero. If you take this approach, the linear independence also becomes formal. But in this problem, you are defining R[x] as a function in its variable. This of course is the natural definition you’ve been familiar with at least since secondary school.

Here are two questions:

1. If you think of two polynomials f and g as functions from N to R with finite support, what is a nice way to write the product fg?

2. What is the advantage of this formal definition?

Quadratic forms

A quadratic form is essentially the same thing as a symmetric bilinear form, at least over a field of characteristic different from 2. A while ago, I wrote a brief note on quadratic forms dealing with diagonalization and geometry. I hope it’s useful for HT2013, sheet 4, problem 8, in particular.

Orthogonal bases

I was a bit confused during tutorials today about orthogonal bases, so I wrote up a short note on how to find them. This method is illustrated with problem 5 on sheet 4 for Linear Algebra II.


I have written some notes on symmetries following the Group Theory tutorials this week.

Linear algebra II, week 4

Here are some brief remarks on the questions.

For the question on the determinant of

A=\left( \begin{array}{cc} U & 0 \\ W & X\end{array} \right)

it is probably best to write at once

A=\left( \begin{array}{cc} U & 0 \\ W & I_{n_2}\end{array} \right)\left( \begin{array}{cc} I_{n_1} & 0 \\ 0 & X\end{array} \right)

so that

\det(A)=\det \left( \begin{array}{cc} U & 0 \\ W & I_{n_2}\end{array} \right) \det \left( \begin{array}{cc} I_{n_1} & 0 \\ 0 & X\end{array} \right)

by the multiplicativity of the determinant. Then

\det \left( \begin{array}{cc} U & 0 \\ W & I_{n_2}\end{array} \right)=\det(U)


\det \left( \begin{array}{cc} I_{n_1} & 0 \\ 0 & X\end{array} \right)=\det(X)

are immediate, by induction on n_1 and n_2, for example, using a column expansion. [A word on induction: I’m sure you can see these identities right away by now and you might not need to write down the inductive proof explicitly. But even so, since you’re essentially beginning mathematics, it’s good to be aware that a careful proof of something of this sort often requires induction.]

There are three important properties of the determinant that you need to internalize once and for all:

(1) \det(A)\neq 0 iff the rows of A are linearly independent iff the columns of A are linearly independent. A clear application of this property permeates many of the problems in sheets 2 and 3.

(2) Multiplicativity: \det(AB)=\det(A)\det(B).

(3) An n\times n matrix A can be view as n column vectors

A=(A_1, A_2, \ldots, A_n).

Viewed as a function of those vectors, the determinant is multilinear, that is, linear in each argument separately when the others are kept fixed. For example,

\det (cA_1+c'A_1', A_2, A_3, \ldots, A_n)  =   c\det (A_1, A_2, A_3, \ldots, A_n)+c'\det (A_1', A_2, A_3, \ldots, A_n).

One warning is that definitely

\det(A+B)\neq \det(A)+\det(B)

in general. That is, the determinant is definitely not linear as a function of the whole matrix. It is just linear as a function of any given column.

These three properties need to be proved once, more or less using the definitions. However, it’s not very useful to go back to the complicated definition too often. You will see such a phenomenon quite frequently in mathematics: A rather difficult definition that then needs to be used to establish basic foundational properties. It is then these properties that end up the most serviceable.

We discussed at the meeting the expression

\det (xI-A)=\det(xe_1-A_1, xe_2-A_2, \ldots, xe_n-A_n)

for the characteristic polynomial. I realized afterwards that actually both parts of problem 6 in sheet 3 can be done fairly easily using the multilinearity. The multilinearity of the determinant means that we can expand this expression essentially as if it were a product of the n arguments, giving rise to 2^n terms. Now try to collect those terms in powers of x, and you will find


terms that make up the coefficient of


I leave it to you to compute those terms. You will see the various (n-i)\times (n-i) minors emerging naturally. The only subtle point ends up being the sign in front of various determinants.

Tutorials, Hilary term, week 2

Here are some remarks on problem 7 (i) of Linear Algebra II, sheet 1. Let me know if you have a simpler proof.


Here are some further remarks. Recall that the key statement is this:

For \rho \in Sym(n), if

\rho=\tau_1\tau_2\cdots \tau_m,


m\geq n-c(\rho).

It is perhaps better to write it in the form

c(\rho)\geq n-m.

This is giving a lower bound for the number c(\rho) of cycles in \rho, saying that if m is small, then c(\rho) has to be big. I hope this makes intuitive sense: with just a few transpositions, you can’t create long cycles. Hence, there has to be many of them.

Note also that the strategy of proof (I learned from Matei’s answer) changes the perspective of the problem in a nice way. Instead of trying to prove a lower bound for the number of transpositions given a cycle decomposition, one tries to prove a lower bound for the number of cycles, given an expression in terms of transpositions. Possibly, this is easier because the cycle decomposition is essentially unique.

Collections, Hilary Term, 2012

I’ve marked the scripts for the algebra portion of the Merton collections for Mods Pure Maths, Part A AC1 and Part A AC2.
I am including here brief remarks on the questions. Later, if I have the energy, I will also upload comments on the Banach Spaces paper for Part B students.

More Jordan bases and canonical forms

Dear Professor,

I have a couple of questions about the Jordan canonical form section of the course if you get a spare moment.
1. I understand that if are considering linear maps from a vector space over the complex numbers to another vector space over the complex numbers, then eigenvalues always exist. When we find a Jordan basis, we use these eigenvalues. Does this mean we are assuming that the field in question is the complex numbers? Surely if not, then these eigenvalues aren’t guaranteed to exist so we can’t always find a Jordan basis?

2. After finding a pre-Jordan basis, we have to replace some basis vectors so as to satisfy the condition that if b_i belongs to V_i(\lambda), then (L -\lambda)b_i belongs to V_{i-1}(\lambda) for i greater than 1. I have noticed that sometimes there is a choice as to which basis vector in V_{i-1}(\lambda) to replace. Does it matter? Is there a convention? Similarly, when extending our basis

B_1\cup B_2\cup ...\cup B_{i-1 }



to be a basis for V_i(\lambda), there may be a choice as to which basis vector we choose to add. Does it matter? Is there a convention? My questions above lead me to believe that there are actually infinitely many Jordan bases for any one linear map / matrix, but that their Jordan canonical form is unique up to permutation of the Jordan blocks. Am I correct in thinking this?

3. Finally, say we have a linear map with two eigenvalues, \alpha and \beta. Say the characteristic polynomial is

(X - \alpha)^a (X - \beta)^b

and the minimal polynomial is

(X - \alpha)^c (X - \beta)^d

with, obviously, c less than or equal to a, d less than or equal to b. I don’t understand how we know without proof that dim V_c(\alpha) = a. I have chosen a two eigenvalue case for simplicity – obviously I’m interested in all several eigenvalue cases.

Thanks very much!


1. You are right that that eigenvalues need to be contained in the field of definition for a linear map to admit a Jordan canonical form. For example, the rotation matrix

\left( \begin{array}{cc} 0 & -1 \\ 1 & 0  \end{array} \right)

will not have a Jordan canonical forms over R. Over C, however, it is diagonalizable, and the diagonal form *is* the Jordan canonical form. (What is it?)

2. There is an error in the condition you write, which could be minor or serious. To be a Jordan basis for a linear map L with one eigenvalue \lambda, the requirement is that

(L-\lambda)B_i\subset B_{i-1}

for i>1. Look carefully to see and understand the difference from what you wrote. (Actually, from the overall understanding reflected in your message, I suspect your mistake was just a misprint. But I wrote the above for the general reader.)

As to your question, you are write there there are many choices involved. This is the point people often find confusing, not just in this topic, but in many basic mathematical problems: when there is not a unique solution. We just have to understand the material well enough to feel relaxed about the choices. There is no general convention I can think of regarding `good’ choices, other than obvious demands of economy like simple numbers and as many zero entries as possible.

All of your remaining observations are correct. I wouldn’t be too surprised if a study of the *space of all possible Jordan bases* would yield some insight on good choices, at least in some natural special situations.

3. The general discussion is an obvious generalization of the case with two eigenvalues, so let’s stick to your question as it stands.

For v \in V_c(\alpha), we have

(L-\alpha )v \in V_{c-1}(\alpha)\subset V_c(\alpha)

and hence,

Lv\in \langle v \rangle+V_c(\alpha)=V_c(\alpha).

Therefore, V_c(\alpha) is stabilized (that is, taken to itself) by L. Similarly, V_d(\beta) is stabilized by L. By considering the shape of the Jordan canonical form for


(the one eigenvalue case) we see that


for s=dim V_c(\alpha). Similarly,


for t=V_d(\beta). But we have

V=V_c(\alpha)\oplus V_d(\beta)

by the primary decomposition theorem. So





c=s=dim V_c(\alpha)


d=t=dim V_d(\beta) .