Applied GWAS Analysis
Preface
What this guide is about
Who this guide is for
0.1 Note on data used in this guide
How the guide is organized
What you will gain from this guide
How to use this guide effectively
Reproducibility and scientific responsibility
Support and updates
1 Introduction to GWAS
1.1 What is a genome-wide association study?
1.2 What GWAS can and cannot tell you
1.2.1 What GWAS can do
1.2.2 What GWAS cannot do
1.3 The core idea behind GWAS
1.4 GWAS as a workflow, not a single test
1.5 Why GWAS requires careful reasoning
1.6 Key takeaways
2 GWAS Study Design and Traits
2.1 Why study design matters
2.2 Types of traits in GWAS
2.2.1 Binary traits
2.2.2 Quantitative traits
2.3 Phenotype definition
2.4 Case-control studies
2.5 Quantitative trait studies
2.6 Covariates and confounders
2.7 Sample size and power
2.8 Key takeaways
3 Genotype and Phenotype Data Structures
3.1 Why data structure matters in GWAS
3.2 Genotype data at a high level
3.3 Common genotype representations
3.4 Phenotype data
3.5 Linking genotypes and phenotypes
3.6 Covariates as structured data
3.7 Metadata and provenance
3.8 Key takeaways
4 Quality Control Decisions
4.1 Why quality control is central to GWAS
4.2 QC as a decision system
4.3 Sample-level quality control
4.4 Variant-level quality control
4.5 Order of QC steps matters
4.6 Thresholds are context-dependent
4.7 Inspecting QC signals in the demo dataset
4.7.1 Basic alignment checks
4.7.2 Missingness per sample and per variant
4.7.3 Visualizing missingness
4.7.4 Example flags using illustrative thresholds
4.7.5 Recording QC decisions
4.8 QC and reproducibility
4.9 Key takeaways
5 Population Structure and Relatedness
5.1 Why population structure matters in GWAS
5.2 Population stratification
5.3 Relatedness and family structure
5.4 Inspecting population structure in the demo dataset
5.5 Ancestry components as summaries
5.6 Visualizing ancestry components
5.7 Structure as covariates
5.8 Structure versus signal
5.9 Key takeaways
6 Association Testing Models
6.1 What we test in GWAS
6.2 Load the demo dataset
6.3 Merge phenotypes and genotypes
6.4 Baseline association model for one SNP
6.5 Minimal association summary
6.6 Scaling up: test many SNPs (teaching subset)
6.7 QQ plot
6.8 Key takeaways
Foundations Complete
Congratulations on completing the GWAS foundations
What you have accomplished
From understanding to application
What changes in the applied track
A note on expectations
You are ready to continue
Appendix
A Reproducibility and Reference Material
A.1 Purpose of this appendix
A.2 Reproducibility principles
A.3 Software environment
A.4 Data conventions used in this guide
A.5 On thresholds and defaults
A.6 Reporting GWAS results
A.7 Further reading
References