Author: Sahir Bhatnagar

Notes:

  • This vignette was built with R markdown and knitr. The source code for this vignette can be found here.

  • In the interactive manhattan, qq and volcano plots below, we only use a subset (random sample of SNPs on chromosomes 4 to 7) of the HapMap data included in this package. This is to reduce the size of the compiled vignette.


Introduction

Manhattan, Q-Q and volcano plots are popular graphical methods for visualizing results from high-dimensional data analyses such as (epi)genome-wide association studies (GWAS or EWAS). In a Manhattan plot, p-values, Z-scores or other test statistics are plotted on a scatter plot against their genomic position; it is used to visualize potential regions of interest in the genome that are associated with a phenotype. Q-Q plots compare the observed test statistics with the distribution expected under the null hypothesis, and so tell us whether the distributional assumptions hold. Volcano plots show the negative log10 p-values plotted against their effect size, odds ratio or log fold-change. They are used to identify clinically meaningful markers in genomic experiments, i.e., markers that are statistically significant and have an effect size greater than some threshold.

Interactive manhattan, Q-Q and volcano plots allow the inspection of specific values (e.g. rs number or gene name) by hovering the mouse over a point, as well as zooming into a region of the genome (e.g. a chromosome) by dragging a rectangle around the relevant area.

This package creates interactive Q-Q, manhattan and volcano plots that are usable from the R console, the RStudio viewer pane, R Markdown documents, Dash apps and Shiny apps. The plots can also be embedded in websites and exported as .png files. By hovering the mouse over a point, you can see annotation information such as the SNP identifier and gene name. You can also drag a rectangle to zoom in on a region of interest and then export the image as a .png file.

This work is based on the qqman R package and the plotly.js engine. It produces manhattan and Q-Q plots similar to those of the qqman::manhattan and qqman::qq functions; the main differences are the ability to interact with the plot, the extra annotation information and seamless integration with HTML.



Installation

You can install manhattanly from CRAN:

install.packages("manhattanly")

Alternatively, you can install the development version of manhattanly from GitHub with:

if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("sahirbhatnagar/manhattanly")



Quick Start

Manhattan plot

The manhattanly package ships with an example dataset called HapMap. See help(HapMap) for more details about how this dataset was created. Here is what the HapMap dataset looks like:

# load the manhattanly library
library(manhattanly)
## See example usage at http://sahirbhatnagar.com/manhattanly/
set.seed(12345)
HapMap.subset <- subset(HapMap, CHR %in% 4:7)
# for highlighting SNPs of interest
significantSNP <- sample(HapMap.subset$SNP, 20)
head(HapMap.subset)
##      CHR      BP         P        SNP ZSCORE EFFECTSIZE   GENE DISTANCE
## 3300   4  336758 0.7869011  rs6821220 0.2703    -0.1437 ZNF141        0
## 3301   4  992125 0.5116233  rs6855233 0.6563    -0.0128 FGFRL1     3634
## 3302   4 1155741 0.7977199   rs922697 0.2563    -0.1170  SPON2        0
## 3303   4 1302267 0.3590100 rs10025665 0.9173     0.1079   MAEA        0
## 3304   4 1388897 0.1824547 rs11731672 1.3332     0.1005 CRIPAK     9116
## 3305   4 1814818 0.4673265  rs6599401 0.7268     0.0625  LETM1        0
dim(HapMap.subset)
## [1] 3410    8

The columns required to create a manhattan plot are the chromosome, base-pair position and p-value. By default, the manhattanly function assumes these columns are named CHR, BP and P, but you can specify other names if your data uses them.
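As a minimal sketch of overriding those defaults, suppose the required columns carry different names. The data frame my_gwas and the column names chrom, pos and pval are hypothetical and exist only for this illustration. The chr, bp and p arguments are the ones described in help(manhattanr), and the sketch assumes manhattanly forwards them to that function; check help(manhattanly) for your installed version.

```r
library(manhattanly)

# Hypothetical data: a copy of the example subset whose three required
# columns have been renamed away from the defaults CHR, BP and P.
my_gwas <- subset(HapMap, CHR %in% 4:7)
names(my_gwas)[names(my_gwas) == "CHR"] <- "chrom"
names(my_gwas)[names(my_gwas) == "BP"]  <- "pos"
names(my_gwas)[names(my_gwas) == "P"]   <- "pval"

# Tell manhattanly which columns hold the chromosome, position and p-value.
p_custom <- manhattanly(my_gwas, chr = "chrom", bp = "pos", p = "pval")
p_custom
```

If your data already uses CHR, BP and P, none of these arguments are needed.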

Create an interactive manhattan plot using one command:

manhattanly(HapMap.subset, snp = "SNP", gene = "GENE")

The arguments snp = "SNP" and gene = "GENE" specify that we want to add SNP and gene information to each point. This information is found in the columns named "SNP" and "GENE" in the HapMap dataset. See help(manhattanly) for a full list of options.

Q-Q plot

Similarly, we can create an interactive Q-Q plot using one command (see help(qqly) for a full list of options):

qqly(HapMap.subset, snp = "SNP", gene = "GENE")

You can then save the plot as a .png file by clicking on the camera icon in the toolbar, which appears when you hover your mouse over the plot.
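To make concrete what a Q-Q plot of p-values displays, here is a base-R sketch of the usual construction: observed -log10 p-values against the values expected if the p-values were uniformly distributed. The p-values are simulated for illustration, and this is not necessarily the exact code qqly runs internally.

```r
set.seed(1)

# Simulated p-values standing in for real association results (hypothetical data).
p <- runif(1000)

# Observed values: -log10 of the p-values sorted from smallest to largest.
observed <- -log10(sort(p))

# Expected values under the null: -log10 of evenly spaced uniform quantiles.
# ppoints(n) returns n increasing probabilities strictly between 0 and 1.
expected <- -log10(ppoints(length(p)))

# Points close to the line y = x suggest the p-values follow the null distribution.
plot(expected, observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1, col = "red")
```

Because the simulated p-values here are uniform, the points should fall near the reference line; systematic departures in real data are what the plot is meant to reveal.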

Volcano plot

You can also make a volcano plot, which by default highlights the points that lie beyond both the genomewideline and effect_size_line thresholds:

volcanoly(HapMap.subset, snp = "SNP", gene = "GENE", effect_size = "EFFECTSIZE")
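The highlighting rule can be sketched in base R. The data below are simulated, and the threshold values are assumptions chosen for this illustration; the variable names simply mirror the genomewideline and effect_size_line arguments, so see help(volcanoly) for the actual defaults and behaviour.

```r
set.seed(2)

# Simulated results (hypothetical data, not the HapMap example).
res <- data.frame(
  P          = runif(500)^3,        # cubing skews the p-values toward 0
  EFFECTSIZE = rnorm(500, sd = 1)
)

# Illustrative thresholds (assumed values for this sketch).
genomewideline   <- -log10(1e-5)    # horizontal significance line
effect_size_line <- c(-1, 1)        # vertical effect-size lines

# A point is flagged when it lies above the significance line AND
# outside the band between the two effect-size lines.
significant  <- -log10(res$P) > genomewideline
large_effect <- res$EFFECTSIZE < effect_size_line[1] |
                res$EFFECTSIZE > effect_size_line[2]
res$flagged  <- significant & large_effect

# Count flagged versus unflagged points.
table(res$flagged)
```

Requiring both conditions is what separates markers that are merely statistically significant from those that also have a meaningfully large effect.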