melike


In biology we use PCA extensively, and PCA plots appear in almost every article. Here I will show a trick that will hopefully make a PCA plot easier to interpret.

We will use the USArrests dataset, which is available in R. For more information, try ?USArrests.

head(USArrests)
##            Murder Assault UrbanPop Rape
## Alabama      13.2     236       58 21.2
## Alaska       10.0     263       48 44.5
## Arizona       8.1     294       80 31.0
## Arkansas      8.8     190       50 19.5
## California    9.0     276       91 40.6
## Colorado      7.9     204       78 38.7

Here we run PCA on the scaled variables:

pcax = prcomp(USArrests, scale. = TRUE)
head(pcax$x)
##                   PC1        PC2         PC3          PC4
## Alabama    -0.9756604  1.1220012 -0.43980366  0.154696581
## Alaska     -1.9305379  1.0624269  2.01950027 -0.434175454
## Arizona    -1.7454429 -0.7384595  0.05423025 -0.826264240
## Arkansas    0.1399989  1.1085423  0.11342217 -0.180973554
## California -2.4986128 -1.5274267  0.59254100 -0.338559240
## Colorado   -1.4993407 -0.9776297  1.08400162  0.001450164
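Before plotting, it can help to look at how much of the total variance each component carries, since that is what the trick below relies on. A minimal sketch, assuming the same `prcomp` call as above; `pc_var` and `var_share` are names introduced here for illustration only:

```r
pcax <- prcomp(USArrests, scale. = TRUE)

# The variance of each PC is the square of its standard deviation
pc_var <- pcax$sdev^2

# Share of the total variance explained by each PC
var_share <- pc_var / sum(pc_var)
names(var_share) <- colnames(pcax$x)
round(var_share, 2)
```

These should be the same numbers that `summary(pcax)` reports in its "Proportion of Variance" row, up to rounding.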

We can use R's built-in plot function directly to see the results:

plot(pcax$x)

Or we could even use the biplot function, which also shows how the original variables contribute to the spread of the data on a PCA plot. However, my main point in writing this post is to comment on the axis lengths. See the following two figures:

plot(pcax$x)

plot(pcax$x)

Are they the same? The same plot, yes, but would you react the same way to both of them? No, right? The first one puts more weight on the x-axis. Here is my suggestion:

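library(tidyverse)
# Proportion of variance explained by PC1 and PC2
varexplained = summary(pcax)$importance[2, 1:2]
# How many times more variance PC1 explains than PC2
varratio = unname(varexplained[1]/varexplained[2])
# coord_fixed(ratio = y/x): draw one unit of PC2 at 1/varratio the length of one unit of PC1
data.frame(pcax$x) %>% ggplot(aes(x = PC1, y = PC2)) + geom_point() + 
    theme_bw() + coord_fixed(ratio = 1/varratio)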

Here, I believe, we can interpret the dispersion better, because the scale of each axis now reflects the variance explained by its PC. PC1 explains 62% and PC2 around 25% of the variance, so the ratio passed to coord_fixed makes one unit on the x-axis about 2.5 times as long as one unit on the y-axis.
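A possible companion to this, sketched here as a suggestion rather than as part of the original recipe, is to write the percentages into the axis labels so the reader can see why the axes are scaled differently. The `axis_labels` variable and the label format are assumptions made for this example:

```r
library(ggplot2)

pcax <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance explained by PC1 and PC2
varexplained <- summary(pcax)$importance["Proportion of Variance", 1:2]
varratio <- unname(varexplained[1] / varexplained[2])

# Axis labels that carry the percentage for each PC
axis_labels <- sprintf("%s (%.0f%%)", names(varexplained), 100 * varexplained)

p <- ggplot(data.frame(pcax$x), aes(x = PC1, y = PC2)) +
  geom_point() +
  labs(x = axis_labels[1], y = axis_labels[2]) +
  theme_bw() +
  # Same scaling as above: a PC2 unit is drawn at 1/varratio of a PC1 unit
  coord_fixed(ratio = 1 / varratio)
p
```

Only ggplot2 is loaded here because the pipe is not needed; `library(tidyverse)` would work just as well.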
