Subset of HapMap data with simulated GWAS results

A subset of the draft release 2 data for genome-wide SNP genotyping in DNA samples from 11 human populations (sometimes referred to as the "HapMap 3" samples). Only the PLINK .map file was used. Approximately 2.5% of the SNPs on each chromosome were retained. The p-values, z-scores, and effect sizes were simulated using random distributions in R. Annotation information (nearest gene and distance to nearest gene) was obtained from the UCSC genome annotation database for the Mar. 2006 GenBank freeze assembled by NCBI (hg18, Build 36.1).
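The description above does not say which random distributions or seed were used, so the following is only a minimal sketch of how such a subset could be built. The toy map table, the names `toy_map` and `keep_frac`, the seed, and the choice of uniform p-values and normal z-scores and effect sizes are all assumptions made for illustration, not the procedure actually used.

```r
set.seed(42)  # arbitrary seed, used only to make this sketch reproducible

# Hypothetical stand-in for a PLINK .map file: chromosome, SNP id, position
toy_map <- data.frame(
  CHR = rep(1:3, each = 400),
  SNP = paste0("rs", seq_len(1200)),
  BP  = unlist(lapply(1:3, function(i) sort(sample.int(1e6, 400)))),
  stringsAsFactors = FALSE
)

# Retain roughly 2.5% of the SNPs within each chromosome
keep_frac <- 0.025
rows_by_chr <- split(seq_len(nrow(toy_map)), toy_map$CHR)
keep <- unlist(lapply(rows_by_chr, function(idx) {
  sort(sample(idx, size = round(length(idx) * keep_frac)))
}), use.names = FALSE)
toy <- toy_map[keep, ]

# Assumed distributions: uniform p-values, normal z-scores and effect sizes
toy$P          <- runif(nrow(toy))
toy$ZSCORE     <- rnorm(nrow(toy))
toy$EFFECTSIZE <- rnorm(nrow(toy), mean = 0, sd = 0.1)

str(toy)
```

Sampling within each chromosome, rather than across the whole table, keeps the retained fraction approximately equal on every chromosome.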

HapMap

Format

A data frame with 14412 rows and 8 variables:

CHR: chromosome number. Autosomes coded 1 through 22, and 23 is the X chromosome (integer)
BP: genomic base-pair position (integer)
P: p-value (numeric)
SNP: rs number or other SNP identifier (character)
ZSCORE: z-score (numeric)
EFFECTSIZE: effect size (numeric)
GENE: nearest gene to the SNP (character)
DISTANCE: distance between the SNP and GENE. If DISTANCE = 0, the SNP is located within GENE (integer)
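As an illustration of how these columns can be used together, the sketch below builds a three-row data frame with the same eight variables and applies a few common operations. The object `toy_hapmap` and every value in it are made up for this example; they are not rows from the real data.

```r
# Hypothetical rows laid out like the HapMap data frame (values are invented)
toy_hapmap <- data.frame(
  CHR        = c(1L, 1L, 23L),
  BP         = c(1000L, 5000L, 2500L),
  P          = c(0.5, 1e-4, 0.03),
  SNP        = c("rs_a", "rs_b", "rs_c"),
  ZSCORE     = c(0.2, -1.5, 2.1),
  EFFECTSIZE = c(0.01, -0.2, 0.15),
  GENE       = c("GENE1", "GENE2", "GENE3"),
  DISTANCE   = c(0L, 1200L, 0L),
  stringsAsFactors = FALSE
)

# SNPs located inside their nearest gene have DISTANCE equal to 0
in_gene <- toy_hapmap[toy_hapmap$DISTANCE == 0, ]

# Chromosome 23 is the X chromosome; relabel it for display
chr_label <- ifelse(toy_hapmap$CHR == 23, "X", as.character(toy_hapmap$CHR))

# -log10 transform of the p-values, as typically plotted for GWAS results
toy_hapmap$LOGP <- -log10(toy_hapmap$P)
```

The same expressions should apply unchanged to the real `HapMap` data frame once it is loaded, since they rely only on the column names and types listed above.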

Source

ftp://ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2009-01_phaseIII/plink_format/

http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/