Principal Component Analysis (PCA)
What is PCA?
PCA is a technique for reducing high-dimensional data to a lower-dimensional representation (dimensionality reduction). It finds a new set of mutually orthogonal basis axes that preserve as much of the data's variance as possible. PCA builds new dimensions as linear combinations of the original features, called components, and ranks the components by how much they contribute to the patterns in the data. This makes PCA useful for dramatically reducing data complexity and for visualizing data in fewer dimensions. The unit vectors of the axes onto which the data can be projected are called principal components; there are as many of them as there are dimensions, and they are mutually orthogonal.
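As a quick illustration of this idea, the sketch below reduces four-dimensional data to two principal components and inspects how much variance each component explains. The use of scikit-learn's PCA and the Iris dataset is an assumption for illustration; neither appears in the original post.

```python
# A minimal sketch of PCA for dimensionality reduction using scikit-learn.
# The dataset (Iris) and the choice of 2 components are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                  # shape (150, 4): four original features

pca = PCA(n_components=2)             # keep the two components with the largest variance
X_reduced = pca.fit_transform(X)      # shape (150, 2): data projected onto the PCs

# Each component is a linear combination of the original features ...
print(pca.components_)                # principal axes (unit vectors), mutually orthogonal
# ... and the components are ranked by how much of the total variance they explain.
print(pca.explained_variance_ratio_)  # for Iris, the first PC explains roughly 92%
```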
Figure A) shows the raw data, with each person represented as a dot. The Pearson correlation is r = .69.
Figure B) shows the same data, mean-centered, and with the two principal components (the eigenvectors of the covariance matrix) drawn on top. Notice that the eigenvector associated with the larger eigenvalue points along the direction of the linear relationship between the two variables.
Figure C) shows the same data but redrawn in PC space. Because PCs are orthogonal, the PC space is a pure rotation of the original data space. Therefore, the data projected through the PCs are decorrelated.
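The steps shown in Figures B and C can be reproduced numerically. The sketch below is a minimal illustration with synthetic correlated data (the actual data behind the figures is not available): it mean-centers the data, takes the eigenvectors of the covariance matrix as the PCs, projects the data into PC space, and confirms that the projected variables are decorrelated.

```python
# A small sketch of the steps in Figures B and C, using synthetic data
# generated here for illustration only.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.7 * x + rng.normal(scale=0.7, size=500)   # correlated with x
X = np.column_stack([x, y])                      # shape (500, 2)

# Figure B: mean-center, then take the eigenvectors of the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)           # columns of eigvecs are the PCs
order = np.argsort(eigvals)[::-1]                # sort PCs by descending eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Figure C: project onto the PCs (a pure rotation of the centered data).
scores = Xc @ eigvecs

# The projected data are decorrelated: their covariance matrix is (near) diagonal,
# with the eigenvalues on the diagonal.
print(np.round(np.cov(scores, rowvar=False), 3))
```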
Related posts:
2022.10.14 - [AI/Machine Learning] - [ML] PCA / Correlation vs. Covariance
2022.10.13 - [AI/Machine Learning] - [ML] Machine Learning Learning Methods (Supervised Learning, Unsupervised Learning, Reinforcement Learning)