American Institute of Aeronautics and Astronautics
Analysis (PCA). PCA performs an orthogonal basis transformation on a multivariate dataset. The resulting
transformed variables are called principal components. This transformation is accomplished as follows:
$$C = XM$$
where X is the original dataset (oriented such that there is a row for each observation and a column for each
variable), C is the transformed data, and M is the new basis matrix. PCA chooses this new basis such that each
principal component captures as much of the variability of the data as possible while still being orthogonal to all
preceding components. It can be shown that this is accomplished by choosing the columns of M to be the
eigenvectors of the covariance matrix $X^{T}X$, arranged in descending order of the corresponding eigenvalues.
PCA has several useful properties. First, the principal components are all linearly uncorrelated with one another.
Additionally, PCA is often used to reduce the dimensionality of a dataset while retaining as much information as
possible.⁹
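The eigenvector construction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the helper name `pca_basis` and the synthetic data are assumptions made for the example, and the data are mean-centered before the decomposition.

```python
import numpy as np

def pca_basis(X):
    """Return (M, eigvals) for mean-centered data X (rows = observations).

    Hypothetical helper for illustration: the columns of M are the
    eigenvectors of X^T X, sorted by descending eigenvalue.
    """
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]            # eigh returns ascending order
    return eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(0)
# Synthetic data (assumption): 200 observations of 3 correlated variables.
A = np.array([[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.5, 0.3, 0.2]])
X = rng.normal(size=(200, 3)) @ A
X = X - X.mean(axis=0)                           # mean-center each column

M, eigvals = pca_basis(X)
C = X @ M                                        # transformed data, same shape as X

# The principal components are uncorrelated: C^T C is diagonal,
# and its diagonal holds the eigenvalues.
gram = C.T @ C
off_diag = gram - np.diag(np.diag(gram))
```

Here `off_diag` is zero up to floating-point error, which is the numerical counterpart of the statement that the principal components are linearly uncorrelated.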
As presented above, C has the same dimensions as X, and no information has been lost. Suppose, however,
that only the first N components are retained, and the rest are discarded. The formula for this truncated
transformation is given by:
$$C_N = X M_N$$
where $M_N$ is the truncated basis matrix formed by the first N columns of M, and $C_N$ contains the truncated
representation of the data using only the first N principal components. Geometrically, this corresponds to a
projection of the multivariate data onto a lower-dimensional hyperplane. If the columns of X are linearly
independent, this transformation necessarily loses some of the original information. Geometrically, these
losses are the differences between the original data points and their projections. Recall that each
succeeding principal component captures no more of the total variation of the dataset than any preceding component. As
a consequence, the truncated principal component representation minimizes the amount of information lost by
the reduction in dimensions. More precisely, the choice of the truncated basis minimizes the sum of squared
errors between the original data and its projection.⁹
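The truncation and its squared-error property can be checked numerically. The sketch below uses synthetic data and an arbitrary choice of N = 2, both assumptions for illustration. It compares the projection error of the truncated principal component basis against one randomly chosen orthonormal basis of the same dimension.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic, mean-centered data (assumption): 100 observations, 4 variables.
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))
X = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(X.T @ X)
order = np.argsort(eigvals)[::-1]
eigvals, M = eigvals[order], eigvecs[:, order]

N = 2
M_N = M[:, :N]             # first N columns of M
C_N = X @ M_N              # truncated representation, shape (100, N)
X_hat = C_N @ M_N.T        # projection expressed in the original coordinates

# Sum of squared errors between the data and its projection.
sse = np.sum((X - X_hat) ** 2)
# For this basis, the error equals the sum of the discarded eigenvalues.
discarded = eigvals[N:].sum()

# Any other N-dimensional orthonormal basis does no better.
Q, _ = np.linalg.qr(rng.normal(size=(4, N)))
sse_other = np.sum((X - X @ Q @ Q.T) ** 2)
```

The single random basis `Q` is only a spot check of the minimization claim, not a proof of it.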
The amount of information lost in this dimensional reduction depends on the amount of correlation between
different variables of the multivariate dataset. In the worst case, if all dimensions of X are linearly uncorrelated with
one another, then PCA provides no additional benefit, and the truncated transformation is equivalent to retaining
only those dimensions with the most variance. In practice, however, there is often significant correlation
between different variables of a dataset, so that the vast majority of the variability is captured by a much
smaller number of principal components.
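The contrast between the uncorrelated and the correlated case can be illustrated with two synthetic datasets. The helper `explained_fraction`, the variances, and the noise level are all assumptions chosen for the example.

```python
import numpy as np

def explained_fraction(X, N):
    """Fraction of total variance captured by the first N principal components."""
    X = X - X.mean(axis=0)
    eigvals = np.linalg.eigvalsh(X.T @ X)[::-1]   # descending order
    return eigvals[:N].sum() / eigvals.sum()

rng = np.random.default_rng(2)
n = 5000

# Case 1: three uncorrelated variables with standard deviations 3, 2, 1.
X_uncorr = rng.normal(size=(n, 3)) * np.array([3.0, 2.0, 1.0])

# Case 2: three variables driven by one shared latent factor plus small noise.
latent = rng.normal(size=(n, 1))
X_corr = latent @ np.array([[3.0, 2.0, 1.0]]) + 0.1 * rng.normal(size=(n, 3))

frac_uncorr = explained_fraction(X_uncorr, 1)
frac_corr = explained_fraction(X_corr, 1)

# Fraction obtained by simply keeping the highest-variance original column.
var = X_uncorr.var(axis=0)
frac_best_column = var.max() / var.sum()
```

In the uncorrelated case `frac_uncorr` is nearly identical to `frac_best_column`, so PCA gains essentially nothing over keeping the highest-variance variable. In the correlated case a single component captures almost all of the variability.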
Before PCA can be applied to trajectory modeling, it is necessary to obtain a dataset that describes the empirical
aircraft trajectories in a consistent manner. For the purposes of our analysis, only three possible aircraft operations
are allowed: a departure, an arrival, or a missed approach. Because the trajectories of these three operations are very
distinct from one another, trajectories are grouped into separate datasets for each operation. Different aircraft types
may have very different performance characteristics that lead to distinct trajectories. Therefore, the data are
additionally segregated by aircraft type. In order to make consistent comparisons between observations, a common
reference time was necessary. This reference time was chosen to be the throttle up time for departures, the
touchdown time for arrivals, and the moment of missed approach initiation for missed approaches.
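The grouping and time alignment described above might be organized as in the following sketch. The record layout, the field names, and the placeholder aircraft types `TYPE_A` and `TYPE_B` are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict

# Hypothetical track records: each has an operation, an aircraft type,
# a reference time (throttle up, touchdown, or missed approach initiation),
# and the observation times in seconds.
tracks = [
    {"operation": "departure", "aircraft_type": "TYPE_A",
     "reference_time": 100.0, "times": [98.0, 100.0, 101.0, 102.0]},
    {"operation": "departure", "aircraft_type": "TYPE_A",
     "reference_time": 250.0, "times": [249.0, 250.0, 251.0]},
    {"operation": "arrival", "aircraft_type": "TYPE_A",
     "reference_time": 500.0, "times": [497.0, 498.0, 500.0]},
    {"operation": "missed_approach", "aircraft_type": "TYPE_B",
     "reference_time": 40.0, "times": [40.0, 41.0]},
]

def align_and_group(tracks):
    """Group tracks by (operation, aircraft type) and shift each track's
    times so that zero corresponds to its reference time."""
    groups = defaultdict(list)
    for track in tracks:
        key = (track["operation"], track["aircraft_type"])
        relative = [t - track["reference_time"] for t in track["times"]]
        groups[key].append(relative)
    return dict(groups)

groups = align_and_group(tracks)
```

Each resulting group holds trajectories of one operation and one aircraft type on a shared time axis, so that observations at the same relative time can be compared across flights.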
The departure trajectory models from Eckstein¹ used observed groundspeeds at one-second intervals
relative to this reference time to define a “speed profile” for each empirical trajectory. If all departures are assumed
In this discussion of PCA, the dataset X is assumed to have a mean of zero. Mean centering the data before
applying PCA ensures that this condition is met.
In practice, the principal components can typically be computed more quickly using the Singular Value
Decomposition (SVD) of the data. However, the eigenvalue decomposition method described here is traditionally
associated with PCA and provides a more intuitive explanation of the process.⁹
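The agreement between the two routes can be verified on a small synthetic dataset (an assumption for illustration). After mean-centering, the right singular vectors of X match the eigenvectors of $X^{T}X$ up to sign, and the squared singular values match the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 3))
X = X - X.mean(axis=0)                 # mean-center so each column has mean zero

# Eigenvalue decomposition route.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
order = np.argsort(eigvals)[::-1]
eigvals, M_eig = eigvals[order], eigvecs[:, order]

# SVD route: X = U S V^T, with singular values already in descending order.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
M_svd = Vt.T

# Eigenvectors are defined only up to sign, so compare |dot product| per column.
alignment = np.abs(np.sum(M_eig * M_svd, axis=0))
```

The SVD route works on X directly and never forms the product $X^{T}X$.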
“Throttle up time” is defined as the moment at which the aircraft begins its rapid acceleration for takeoff.