### normal -> the eigenvectors NumPy returns are normalized (unit length), always
# a vector v is normal iff the length of v is 1
np.sqrt(np.sum(evecs[:, 0]**2))
1.0
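(evecs and A_sym come from an earlier cell that isn't shown here; a hypothetical stand-in setup, using eigh since the matrix is symmetric, would look like this:)

import numpy as np

# made-up symmetric matrix standing in for the earlier A_sym
A_sym = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

evals, evecs = np.linalg.eigh(A_sym)   # eigh: eigendecomposition for symmetric matrices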
### because A_sym is symmetric, the eigenvectors are also orthogonal
# vectors a and b are orthogonal iff the angle between them is 90 degrees (dot product = 0)
np.dot(evecs[:, 0], evecs[:, 1])
-2.7755575615628914e-17
## if A*B = I -> A and B are inverses
## evecs is orthogonal, so evecs.T equals np.linalg.inv(evecs)
np.round(evecs.dot(evecs.T), 2)
### eigenvecs and eigenvals redescribe your space
### if the matrix is symmetric, its eigenvecs form an orthonormal basis
### so...why is this important for dimensionality reduction?
PCA
Calculate the covariance matrix of your data, C -> symmetric
Calculate the eigenvectors and eigenvalues of the covariance matrix
eigenvectors are normal and orthogonal
Project the data onto the eigenvectors with the largest eigenvalues, i.e. the directions that capture the most variance
feature engineering (kinda) on the variance (see the sketch below)
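A minimal NumPy sketch of those steps (X, n_components, and the other names here are made up for illustration):

import numpy as np

# made-up data: 100 samples, 5 features, with some correlation injected
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=100)

# 1. covariance matrix of the centered data -> symmetric
X_centered = X - X.mean(axis=0)
C = np.cov(X_centered, rowvar=False)

# 2. eigendecomposition of the covariance matrix (eigh is for symmetric matrices)
evals, evecs = np.linalg.eigh(C)

# 3. the eigenvectors come back orthonormal; keep the ones with the largest eigenvalues
order = np.argsort(evals)[::-1]
n_components = 2
top_vecs = evecs[:, order[:n_components]]

# 4. project the data onto those directions
X_reduced = X_centered @ top_vecs   # shape (100, 2)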
Pros
Reducing dimensionality gives faster processing times
Can get more accurate results, at times
Don't have to drop features
Visualize your data
Cons
lose ALL interpretability
The reduction itself can be computationally expensive
lose relationships that live in the dropped dimensions
Decomposition for non-square matrices
SVD (Singular Value Decomposition)
Linear Discriminant Analysis
SVD works on any matrix, not just square or symmetric ones (see the sketch below)
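A quick illustration of SVD on a non-square matrix (the 4x2 matrix below is made up):

import numpy as np

M = np.random.default_rng(1).normal(size=(4, 2))   # non-square, made-up matrix

# compact SVD: M = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(M, full_matrices=False)

np.allclose(M, U @ np.diag(s) @ Vt)   # True: the factors reconstruct M exactly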
Assessment
use PCA when predicting with distance-based ML models or when feature interpretability is unimportant
learned that PCA features are linear combinations of the original features (demonstrated below)
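To make that concrete, a small sketch (using scikit-learn here purely for illustration; the notebook may have done this differently):

import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 5))   # made-up data

pca = PCA(n_components=2).fit(X)

# each row of components_ holds the weights of one PCA feature,
# so every PCA feature is a weighted combination of the 5 original features
pca.components_.shape   # (2, 5)

# rebuild the first PCA feature of the first sample by hand
manual = (X[0] - pca.mean_) @ pca.components_[0]
np.isclose(manual, pca.transform(X)[0, 0])   # True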