Empirical Orthogonal Functions

One of the most ubiquitous uses of eigenanalysis in data analysis is the construction of empirical orthogonal functions (EOFs), the topic of this section. EOFs are a transform of the data; the original set of numbers is transformed into a different set with some desirable properties. In this sense the EOF transform is similar to other transforms such as the Fourier or Laplace transforms. In all these cases, we project the original data onto a set of orthogonal functions, thus replacing the original data with the set of projection coefficients on the basis vectors. However, the choice of the specific basis set varies from case to case.

In the Fourier case, for example, the choice is a set of sines and cosines of various frequencies. This is motivated by the desire to identify the principal modes of oscillation of the system. Thus if the signal projects strongly on sine waves of two frequencies, we will say that the signal is approximately a linear combination of these two frequencies. We will then attribute the remainder to other processes that are more weakly represented in the signal (the signal has low projection on them), and are thus assumed unimportant for the signal. Another important property for a basis is orthogonality (like that of sines of various frequencies); we would like to account for a certain component of the signal only once. An alternative to the sine/cosine set is a set of orthogonal polynomials, such as those named after Legendre. (Orthogonality often holds only over a specific interval, and sometimes requires `weighting functions'. These are related to the choice of metric, which we will talk about a bit.)
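The projection idea can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the synthetic signal (frequencies 3 and 7, amplitudes 2 and 1, weak noise), the sampling, and the choice of ten sine basis vectors. It checks that the sampled sines are mutually orthogonal and then reads off the projection coefficients.

```python
import numpy as np

# Hypothetical signal: two sine waves plus weak noise (all values assumed).
n = 512
t = np.arange(n) / n                      # one unit interval, sampled uniformly
rng = np.random.default_rng(0)
signal = 2.0 * np.sin(2 * np.pi * 3 * t) + 1.0 * np.sin(2 * np.pi * 7 * t)
signal += 0.05 * rng.standard_normal(n)

# Basis: sines of integer frequencies 1..10 sampled on the same grid.
freqs = np.arange(1, 11)
basis = np.sin(2 * np.pi * freqs[:, None] * t[None, :])   # shape (10, n)

# Orthogonality check: the Gram matrix is diagonal, with n/2 on the diagonal.
gram = basis @ basis.T
assert np.allclose(gram, (n / 2) * np.eye(len(freqs)), atol=1e-8)

# Projection coefficients: inner product with each basis vector, normalized.
coeffs = basis @ signal / (n / 2)

# The two frequencies with the largest |coefficient| dominate the signal.
dominant = np.sort(freqs[np.argsort(np.abs(coeffs))[-2:]])
print(dominant, np.round(coeffs, 2))
```

Because the basis vectors are orthogonal, each coefficient can be computed independently of the others, and the part of the signal carried by one frequency is not counted again under another.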

The representation of the signal in terms of the projection coefficients on a basis set is often very useful for cleanly separating various scales. For example, if our data is the sea surface temperature of a given ocean basin, we can think of the projection on the lowest frequency wave (the one which has one crest and one trough within the spatial extent of the domain) as representing the ocean's `large-scale', while that on wavelengths of order 10-100 km as `eddies'.
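As a rough sketch of this scale separation, the Python example below splits a synthetic one-dimensional temperature transect into a large-scale part and an eddy part by zeroing Fourier coefficients. The domain length (4000 km), grid spacing, amplitudes, the 50 km eddy wavelength, and the 500 km cutoff are all assumed values chosen only for illustration, and the transect is taken to be periodic so that the discrete Fourier basis applies exactly.

```python
import numpy as np

# Hypothetical 1-D "SST" transect (all numbers assumed for illustration).
length_km = 4000.0
n = 800                                   # 5 km grid spacing
x = np.arange(n) * (length_km / n)
large_scale = 2.0 * np.sin(2 * np.pi * x / length_km)   # one crest, one trough
eddies = 0.3 * np.sin(2 * np.pi * x / 50.0)             # 50 km wavelength
sst = 15.0 + large_scale + eddies

# Fourier coefficients and the wavenumber associated with each one.
coeffs = np.fft.rfft(sst)
wavenumber = np.fft.rfftfreq(n, d=length_km / n)        # cycles per km

# Keep only wavelengths longer than the assumed 500 km cutoff.
# The mean (wavenumber 0) stays with the large-scale part.
keep = wavenumber < 1.0 / 500.0
smooth = np.fft.irfft(np.where(keep, coeffs, 0.0), n=n)
small = sst - smooth                      # what remains: the eddy scales
```

Here `smooth` recovers the mean plus the basin-wide wave and `small` recovers the 50 km signal, because the two components project onto different basis vectors.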

In EOF analysis we also project the original data on a set of orthogonal basis vectors. However, the choice of the basis is different. Here, the first EOF is chosen to be the pattern, without the constraint of a particular analytic form, on which the data project most strongly. In other words, the leading EOF (sometimes called the `gravest', or `leading', mode) is the pattern most frequently realized. The second mode is the one most commonly realized under the constraint of orthogonality to the first one, the third is the most frequently realized pattern that is orthogonal to both preceding modes, and so on. Hence the term `empirical'; we still have an orthogonal basis, like the Fourier or Legendre bases, but its members are chosen not on analytic considerations but by maximizing the projection of the data on them.
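A minimal Python sketch of this construction follows. It uses the singular value decomposition, which is one common way to obtain such a basis and not necessarily the route these notes take. The data matrix is synthetic: the two spatial patterns, their amplitudes (3 and 1), the noise level, and the layout (rows are times, columns are spatial points) are all assumptions for illustration.

```python
import numpy as np

# Hypothetical data matrix: rows are time samples, columns are spatial points.
rng = np.random.default_rng(1)
n_time, n_space = 200, 30
x = np.linspace(0.0, 1.0, n_space)

# Two assumed spatial patterns with different amplitudes, plus weak noise.
pattern_a = np.sin(np.pi * x)
pattern_b = np.sin(2 * np.pi * x)
data = (3.0 * rng.standard_normal((n_time, 1)) * pattern_a
        + 1.0 * rng.standard_normal((n_time, 1)) * pattern_b
        + 0.1 * rng.standard_normal((n_time, n_space)))

# Remove the time mean at each spatial point so the modes describe variability.
anomalies = data - data.mean(axis=0)

# SVD: the rows of vt are orthonormal spatial patterns, ordered by decreasing
# singular value, i.e. by how strongly the data project on them.
u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
eofs = vt                                 # eofs[0] is the leading mode
pcs = anomalies @ eofs.T                  # projection coefficients in time
var_frac = s**2 / np.sum(s**2)            # fraction of variance per mode

print(np.round(var_frac[:3], 3))
```

No analytic form is imposed on the rows of `eofs`; in this synthetic case the leading one ends up close to `pattern_a` (up to sign) only because that pattern carries most of the variance in the data.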
 
 


Matrix and EOFs:
 

EOF Examples: