Q&A 11 How do you interpret GWAS model results with PCA covariates?

11.1 Explanation

Once a GWAS model is fitted using a phenotype (e.g., Plant height), a SNP, and population structure covariates (e.g., PC1–PC3), we interpret the results using the model summary. The key values to look for are:

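  • Estimate: The effect size of each variable, i.e., the expected change in the trait per one-unit increase in that variable with the other covariates held fixed
  • Pr(>|t|): The p-value for the test that the coefficient is zero, used to determine significance
  • R-squared: The proportion of variation in the trait explained by the whole model (SNP and PCs together)
  • Residuals: The errors left unexplained by the model; their spread is summarized by the residual standard error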

This example tests the association between SNP id1000007 and Plant height, adjusting for PC1 to PC3.
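
As a minimal sketch of how such a model might be fitted and read in R, the block below uses simulated data rather than the dataset behind this example. The column names (height, snp, PC1 to PC3), the 0/1/2 genotype coding, the sample size, and the effect sizes are all assumptions made for illustration, so the numbers it prints will not match the summary shown in 11.2.

```r
# Simulated data only: sample size, effect sizes, and column names are illustrative.
set.seed(1)
n <- 383

dat <- data.frame(
  PC1 = rnorm(n, sd = 30),
  PC2 = rnorm(n, sd = 20),
  PC3 = rnorm(n, sd = 12),
  snp = rbinom(n, size = 2, prob = 0.3)   # hypothetical additive 0/1/2 coding
)

# Trait driven by the PCs plus noise; no true SNP effect in this simulation
dat$height <- 115 + 0.2 * dat$PC1 - 0.2 * dat$PC2 - 0.3 * dat$PC3 +
  rnorm(n, sd = 18)

# Trait ~ SNP + population-structure covariates
fit <- lm(height ~ snp + PC1 + PC2 + PC3, data = dat)
s <- summary(fit)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(s)
snp_effect <- coef_table["snp", "Estimate"]
snp_pvalue <- coef_table["snp", "Pr(>|t|)"]

# Model-level fit statistics
r2     <- s$r.squared       # R-squared
adj_r2 <- s$adj.r.squared   # Adjusted R-squared
rse    <- s$sigma           # Residual standard error
res_df <- fit$df.residual   # Residual degrees of freedom (n minus 5 coefficients)

print(round(coef_table, 4))
```

Pulling values out of the summary object instead of reading them off the printed output makes it easier to repeat the same test over many SNPs and collect the p-values in one table.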

11.2 R Model Output Summary

Coefficient    Estimate     Std. Error   t value   Pr(>|t|)
(Intercept)    115.83448    1.04707      110.63    < 2e-16     ***
id1000007        2.17767    1.54100        1.413   0.158
PC1              0.20502    0.02761        7.426   7.49e-13    ***
PC2             -0.19534    0.04436       -4.404   1.39e-05    ***
PC3             -0.29738    0.07399       -4.019   7.05e-05    ***
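
The columns of this table are linked: the t value is the estimate divided by its standard error, and the p-value is the two-sided tail probability of that t value. The sketch below recomputes both for the id1000007 row from the reported numbers, using the 378 residual degrees of freedom listed under Model Fit. The variable names are illustrative, and the confidence interval is an addition that does not appear in the original output.

```r
# Values copied from the id1000007 row of the summary table
estimate  <- 2.17767
std_error <- 1.54100
res_df    <- 378   # residual degrees of freedom reported under Model Fit

# t value = estimate / standard error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = res_df)

# 95% confidence interval for the SNP effect (not part of the original output)
ci <- estimate + c(-1, 1) * qt(0.975, df = res_df) * std_error

print(round(c(t = t_value, p = p_value), 3))
print(round(ci, 2))
```

The interval spans zero, which is another way of seeing that the SNP effect is not distinguishable from zero at the 5% level.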

Model Fit:

  • Residual standard error: 18.63
  • Residual degrees of freedom: 378
  • R-squared: 0.2277
  • Adjusted R-squared: 0.2195
  • F-statistic: 27.86 on 4 and 378 DF
  • Overall p-value: < 2.2e-16
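
These fit statistics can be cross-checked against each other. The worked equations below assume a model with an intercept and k = 4 predictors (the SNP plus PC1 to PC3), so the 378 residual degrees of freedom imply n = 378 + 5 = 383 observations used in the fit. The symbols n and k are introduced here for illustration.

```latex
\[
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
                 = 1 - (1 - 0.2277)\,\frac{382}{378}
                 \approx 0.2195
\]
\[
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
  = \frac{0.2277 / 4}{0.7723 / 378}
  \approx 27.86
\]
```

Both values agree with the reported adjusted R-squared and F-statistic, so the summary is internally consistent.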

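Takeaway: SNP id1000007 is not significantly associated with plant height after adjusting for PC1 to PC3 (p = 0.158), whereas all three PCs are strongly associated with the trait (p < 0.001). Because population structure is itself associated with plant height, controlling for it is essential to avoid false signals in GWAS.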