About Python: Principal Component Analysis


Task #3 Principal Component Analysis
[Subject: Applied Econometrics]
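Principal Component Analysis, or PCA for short, is a widely used method in applied
econometrics. Let the random vector $X$ have the multivariate normal distribution
$N_n(\mu, \Sigma)$, where the covariance matrix $\Sigma$ is positive definite. The
multivariate normal distribution is a natural extension of the univariate normal
distribution. Its probability density function (p.d.f.) is, for any $x \in \mathbb{R}^n$,
$$f(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu) \right\}.$$
Please note that when $n = 1$ and $\Sigma = \sigma^2$, the p.d.f. is
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\}.$$
The spectral decomposition of $\Sigma$ is written as $\Sigma = \Gamma' \Lambda \Gamma$. Here, the
columns $v_1, v_2, \ldots, v_n$ of $\Gamma'$ are the eigenvectors corresponding to the eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$, which form the main diagonal of the matrix $\Lambda$.
Assume without loss of generality that the eigenvalues are in decreasing order; i.e.,
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0$.
Define a new random vector as $Y = \Gamma (X - \mu)$. By an important theorem in
statistics [1], we know that $Y$ has a $N_n(0, \Lambda)$ distribution. Hence the components
$Y_1, Y_2, \ldots, Y_n$ are independent random variables and, for $i = 1, 2, \ldots, n$, $Y_i$ has a
$N(0, \lambda_i)$ distribution. The random vector $Y$ is called the vector of principal components.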
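The distribution of the principal components can be checked with a short moment calculation. The sketch below assumes the notation $\Sigma = \Gamma' \Lambda \Gamma$ with $\Gamma$ orthogonal and $Y = \Gamma (X - \mu)$; it is a sanity check, not a substitute for the cited theorem.

```latex
\begin{aligned}
E[Y] &= \Gamma \,(E[X] - \mu) = 0, \\
\operatorname{Cov}(Y) &= \Gamma \Sigma \Gamma'
  = \Gamma (\Gamma' \Lambda \Gamma) \Gamma'
  = (\Gamma \Gamma') \Lambda (\Gamma \Gamma')
  = \Lambda .
\end{aligned}
```

Because $Y$ is a linear transformation of a normal vector, it is itself normal, and a diagonal covariance matrix under joint normality means its components are independent.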
You are required to complete the following questions.

  1. The total variation ($TV$) of a random vector is the sum of the variances of its
    components. For the random vector $X$ and its vector of principal components $Y$,
    prove that $TV(X) = TV(Y)$, where
    $TV(Y) = \sum_{i=1}^{n} \lambda_i$.
  2. The first component of $Y$ is given by $Y_1 = v_1' (X - \mu)$, where $v_1$ is the
    eigenvector associated with the largest eigenvalue $\lambda_1$. This is a linear
    combination of the components of $X - \mu$ with the property
    $\|v_1\|^2 = \sum_{j=1}^{n} v_{1j}^2 = 1$, because the matrix of eigenvectors is orthogonal.
    Consider any other linear combination of $X - \mu$, say $a' (X - \mu)$, such that
    $\|a\|^2 = 1$. Show that there are scalars $a_1, a_2, \ldots, a_n$ such that
    $$a = \sum_{j=1}^{n} a_j v_j$$
    and $\operatorname{Var}(a' X) \le \lambda_1 = \operatorname{Var}(Y_1)$.
  3. Write a MATLAB script that finds the principal components of any given
    random vector $X$. Compare your results with MATLAB's pca function.
  4. One important use of PCA is dimensionality reduction. For example, the CEO of
    a given firm may have multiple characteristics, such as education, confidence
    and social connections. Scholars often prefer a single indicator that measures
    a CEO's overall ability. PCA can provide one through the first principal
    component, which is the component with the largest variance.
    Let’s construct a Management Quality Factor measure with PCA using Chinese
    public companies. Use the CEO data from CSMAR to construct a measure of
    CEO quality.

[1] For the exact statement of the theorem, see Theorem B.10 in Greene's well-known
econometrics textbook, Econometric Analysis.
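For the computational questions, the following is a minimal Python sketch, with NumPy standing in for MATLAB. It computes principal components from the eigendecomposition of the sample covariance matrix, cross-checks the eigenvalues against a singular value decomposition, and takes the first component as a single index. The function name `pca_eig`, the variable names, and the simulated trait data are all hypothetical; nothing here comes from CSMAR.

```python
import numpy as np


def pca_eig(X):
    """Principal components via the eigendecomposition of the sample covariance.

    X is an (observations x variables) array. Returns the eigenvalues in
    decreasing order, the matching eigenvectors as columns, and the scores.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)           # subtract each variable's sample mean
    cov = np.cov(centered, rowvar=False)    # sample covariance (divides by n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs             # sample analogue of Y = Gamma (X - mu)
    return eigvals, eigvecs, scores


# Simulated (hypothetical) data: 200 CEOs, three traits driven by one latent factor
rng = np.random.default_rng(0)
ability = rng.normal(size=200)
traits = np.column_stack([
    ability + 0.5 * rng.normal(size=200),   # stand-in for education
    ability + 0.5 * rng.normal(size=200),   # stand-in for confidence
    ability + 0.5 * rng.normal(size=200),   # stand-in for social connections
])

# Standardize so that no trait dominates merely because of its units
z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)

eigvals, eigvecs, scores = pca_eig(z)
quality_index = scores[:, 0]                  # first principal component as the index
share_explained = eigvals[0] / eigvals.sum()  # fraction of total variation it carries

# Cross-check: squared singular values of the centered data, divided by n - 1,
# should reproduce the covariance eigenvalues
centered = z - z.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)
eigvals_svd = singular_values**2 / (len(z) - 1)

print("eigenvalues (eig):", eigvals)
print("eigenvalues (svd):", eigvals_svd)
print("share explained by first component:", share_explained)
```

The sign of each eigenvector is arbitrary, so the index may need to be multiplied by -1 so that higher values mean higher quality. In MATLAB, the analogous comparison would be against the `coeff`, `score` and `latent` outputs of `pca`.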
