Tensors
Last time I got into labelling the coordinates of vectors and covectors with indices in a particular pattern:
- for a vector coordinate, the index is superscript, such as $v^i$, whereas
- for a covector coordinate, the index is subscript, such as $w_i$.
These of course describe ordinary numbers (scalars). By saying $v^2$, for example, we just mean the second coordinate of the vector $\mathbf{v}$, which is a plain number.
But this presupposes that we’ve chosen a set of basis vectors to be scaled by these coordinates. Coordinates are not fundamental. That is, we don’t necessarily want to say that a vector is its coordinates. So we also need a corresponding notation for basis (co)vectors, and we swap the placement of the indices like this:
- for a basis vector, the index is subscript, such as $\mathbf{e}_i$, whereas
- for a basis covector, the index is superscript, such as $\mathbf{e}^i$.
Why swap them? Because we’re going to follow a simple rule of multiplying things with indices in opposite positions, and this configuration allows us to follow that rule even when scaling basis vectors by coordinates:

$$\mathbf{v} = \sum_i v^i \mathbf{e}_i$$
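To see that rule in action numerically, here is a small sketch in Python with NumPy, using a made-up two-dimensional basis (the particular numbers have no significance):

```python
import numpy as np

# An arbitrary (non-orthonormal) basis for a 2D space, one basis vector per row.
e = np.array([[1.0, 0.0],
              [1.0, 1.0]])

# Coordinates v^i of a vector expressed in that basis.
v_coords = np.array([2.0, 3.0])

# v = sum_i v^i e_i : scale each basis vector by its coordinate and add them up.
v = sum(v_coords[i] * e[i] for i in range(2))

print(v)  # [5. 3.] -- the same geometric vector, written out in ambient coordinates
```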
I also talked about a bit of machinery for linearly mapping a vector to a corresponding covector (or the reverse), and how this is something else we have to configure for our vector space. The basis vectors and covectors are already paired up by how they are defined, but for anything else there is no natural pairing. We use a matrix to provide the missing information, and there is a significance to how we position the indices used to label its numeric elements:
- for a matrix that maps from vector to covector coordinates, the indices are subscript: $g_{ij}$, whereas
- for a matrix that maps from covector to vector coordinates, the indices are superscript: $g^{ij}$.
The arithmetic of doing this kind of mapping, e.g. to get coordinates of the covector corresponding to some vector coordinates, is matrix multiplication:

$$w_i = \sum_j g_{ij} v^j$$

The input vector has its index “up” and the result has its index “down”, from which we can tell at a glance that this expression takes a vector and produces a covector.
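As a sketch of that arithmetic, assuming nothing more than an arbitrary made-up symmetric matrix standing in for $g_{ij}$:

```python
import numpy as np

# A made-up symmetric matrix standing in for g_ij.
g = np.array([[2.0, 1.0],
              [1.0, 3.0]])

v = np.array([1.0, 4.0])           # vector coordinates v^j (index up)

# w_i = sum_j g_ij v^j : the covector corresponding to v under this choice of g.
w = g @ v                          # plain matrix multiplication
print(w)                           # [ 6. 13.]

# einsum spells out the index bookkeeping explicitly: 'ij,j->i'
print(np.einsum('ij,j->i', g, v))  # same result
```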
We can also invent linear mappings from vectors to vectors: things like rotation, reflection, skewing, but within the same vector space. They will also be described by a matrix with two indices, but we need to be careful how we place those indices, to be clear about the types of the input and output:

$$u^i = \sum_j A^i{}_j v^j$$

See how the two $j$ indices sit in opposite positions, so they are the pair being summed away, while the $i$ index is left over in the result, telling us that the output is another vector.
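Here is a small illustrative example of such a vector-to-vector mapping, using a rotation matrix as the operator $A^i{}_j$ (the choice of rotation is just for familiarity):

```python
import numpy as np

theta = np.pi / 2                       # rotate by 90 degrees
# A rotation matrix playing the role of A^i_j: one index up, one down.
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])                # v^j, a vector

# u^i = sum_j A^i_j v^j : vector in, vector out.
u = np.einsum('ij,j->i', A, v)
print(np.round(u, 6))                   # [0. 1.] -- the x axis rotated onto the y axis
```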
This whole pattern is extremely general and powerful. We can define calculating machines that work on a particular number of vector (or covector) inputs. The matrix $g_{ij}$, for example, can be read as a machine that accepts two vectors, one for each of its lower indices, and produces a scalar.
The general name for these calculation machines is tensors. They include vectors and covectors, appropriately configured matrices representing operators, but also in some contexts, much larger collections of numbers that have to be labelled with three or four indices. For example, the Riemann curvature tensor $R^{\rho}{}_{\sigma\mu\nu}$ needs four indices, one up and three down.
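As a rough sketch of a machine with more input slots, here is a made-up rank-3 array of random components fed with three vectors (the names `T`, `u`, `v`, `w` are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up machine T_ijk with three covector-style slots: feed it three
# vectors and it returns a scalar.
T = rng.standard_normal((3, 3, 3))

u = rng.standard_normal(3)   # u^i
v = rng.standard_normal(3)   # v^j
w = rng.standard_normal(3)   # w^k

# scalar = sum_ijk T_ijk u^i v^j w^k
s = np.einsum('ijk,i,j,k->', T, u, v, w)
print(s)
```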
The ultimate constraint on tensors is that they can be contracted with other tensors (be they vectors, covectors or something more complex) until all we have left is a scalar. The key point is that if we change the orientation of our basis vectors, the coordinates in all our tensors will change accordingly, but they will still represent the same thing, and we can be sure this is the case if we contract down to a scalar, because we’ll always get the same scalar from the same system of tensors regardless of the basis we choose. Indeed, the traditional way tensors are defined is as systems of coordinates that transform in a particular way under a change of basis (one often-used but unhelpfully glib definition is “a tensor is something that transforms like a tensor.”)
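The basis-independence of a fully contracted scalar is easy to check numerically. In this sketch `P` is an arbitrary invertible change-of-basis matrix; vector coordinates transform one way, covector coordinates the other, and the contracted scalar comes out the same:

```python
import numpy as np

rng = np.random.default_rng(1)

v = rng.standard_normal(3)        # vector coordinates v^i in the old basis
w = rng.standard_normal(3)        # covector coordinates w_i in the old basis

# An arbitrary invertible change-of-basis matrix P (new basis vectors
# expressed in terms of the old ones; random, so invertible in practice).
P = rng.standard_normal((3, 3))

# Vector coordinates transform with the inverse of P, covector coordinates
# with P itself (transposed here because of how rows and columns line up).
v_new = np.linalg.inv(P) @ v
w_new = P.T @ w

# The fully contracted scalar w_i v^i is the same in either basis.
print(np.einsum('i,i->', w, v))
print(np.einsum('i,i->', w_new, v_new))   # same number, up to rounding
```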
But, just as we saw with vectors, it pays to think about these concepts in more ways than one. A tensor can be thought of as a geometrical object. Okay, it’s not as easy to visualise as an arrow. But just as a vector can be thought of as an arrow, which is a distinct kind of geometric object, not just a collection of numbers (even though it can be described with numbers if you choose a basis for doing so), any tensor is also a geometrical object of a distinct kind, not just a collection of numbers.
Given this, the index notation, which up to now we’ve been interpreting as a way to label collections of numbers, can instead be interpreted as a way to describe abstract machines that operate on geometrical objects - ultimately vectors and covectors - to produce scalars whose values are completely independent of any choice of basis. Some authors call it abstract index notation, others call it slot-naming index notation. (One convention I’ve seen in textbooks uses the Greek alphabet for indices when the notation is to be interpreted as numerical realisations, and the Latin alphabet for abstract notation.)
Another notational point: Einstein noticed that his hand got quite tired from endlessly writing summation signs, so he adopted the convention that whenever the same index appears twice in a term, once up and once down, the sum over it is implied and the $\sum$ is simply dropped. So we can write a fully contracted expression like this:

$$g_{ij}\,A^{j}{}_{k}\,v^{k}\,u^{i}$$
Four different tensors: the metric $g_{ij}$, an operator $A^{j}{}_{k}$ and two vectors $v^{k}$ and $u^{i}$, with every index paired with an oppositely-positioned partner, so the whole thing boils down to a single scalar.
In examples where we want the output to be something more than a scalar, i.e. a tensor with at least one index, the equation will still be perfectly unambiguous without any explicit summation symbols, because there will be some left-over indices that have no oppositely-positioned partner and therefore are not being summed over. They survive in the output. With this abbreviation, a square matrix multiplying with a column matrix (or, interpreted abstractly, an operator acting on a vector) is simply written:

$$u^{i} = A^{i}{}_{j} v^{j}$$
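Incidentally, NumPy’s `einsum` function is named after exactly this convention: you spell out the index pattern, repeated letters are summed, and leftover letters survive into the output. A quick sketch with made-up components:

```python
import numpy as np

rng = np.random.default_rng(2)
g = rng.standard_normal((3, 3))
g = g @ g.T                        # a made-up symmetric "metric" g_ij
A = rng.standard_normal((3, 3))    # an operator A^j_k
v = rng.standard_normal(3)         # v^k
u = rng.standard_normal(3)         # u^i

# g_ij A^j_k v^k u^i : every index is paired, so the result is a scalar.
print(np.einsum('ij,jk,k,i->', g, A, v, u))

# A^i_j v^j : the index i is left over, so the result is a vector (one index up).
print(np.einsum('ij,j->i', A, v))
```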
Abstract index notation takes care of many notational duties that are sometimes performed in other ways. We say of an expression like $v^{a} w^{b}$ that it is the tensor product of the two vectors, something index-free notation has to write with its own dedicated symbol, $v \otimes w$. The same symbol is used to combine whole vector spaces, the result of which is a space of higher-rank tensors, with one index slot contributed by each factor.
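For instance, the outer product of two vectors can be written directly in index form; a tiny sketch:

```python
import numpy as np

v = np.array([1.0, 2.0])
w = np.array([3.0, 4.0, 5.0])

# T^ab = v^a w^b : in index notation the tensor product is just juxtaposition;
# index-free notation needs the dedicated symbol v (x) w for the same thing.
T = np.einsum('a,b->ab', v, w)     # equivalently np.outer(v, w)
print(T.shape)                     # (2, 3) -- one slot from each factor
print(T)
```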
Not yet regretting the time you've spent here?