GIANT consortium data files

Revision as of 14:51, 2 May 2019 by HannahW (talk | contribs)

We are releasing the summary data from our meta-analyses of Genome-Wide Association Studies (GWAS) in order to enable other researchers to examine particular variants or loci for their evidence of association with anthropometric traits. The files include p-values and direction of effect at over 2 million directly genotyped or imputed single nucleotide polymorphisms (SNPs). To prevent the possibility of identification of individuals from these summary results, we are not releasing allele frequency data from our samples.

2018 WHR Exome for Public Release

File Name Structure: PublicRelease.WHRadjBMI.*.*.*.txt.gz

Abbreviations in filenames:

C-Combined Sexes

M-Men

W-Women

All- All Ancestries

Eur- European descent only

Add- Additive genetic model

Rec- Recessive genetic model
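As a minimal sketch of how the abbreviations above can be decoded, the following Python snippet looks up each of the three wildcard fields of a file name. The order of the three fields is not stated on this page, so the code identifies each token by its value rather than its position; the example file name is hypothetical.

```python
# Abbreviation table copied from the list above.
ABBREVIATIONS = {
    "C": ("sex", "combined sexes"),
    "M": ("sex", "men"),
    "W": ("sex", "women"),
    "All": ("ancestry", "all ancestries"),
    "Eur": ("ancestry", "European descent only"),
    "Add": ("model", "additive genetic model"),
    "Rec": ("model", "recessive genetic model"),
}

PREFIX = "PublicRelease.WHRadjBMI."
SUFFIX = ".txt.gz"


def decode_filename(name):
    """Return a dict with 'sex', 'ancestry' and 'model' for a release file name."""
    if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
        raise ValueError(f"unexpected file name: {name}")
    tokens = name[len(PREFIX):-len(SUFFIX)].split(".")
    decoded = {}
    for token in tokens:
        if token not in ABBREVIATIONS:
            raise ValueError(f"unknown abbreviation: {token}")
        field, meaning = ABBREVIATIONS[token]
        decoded[field] = meaning
    # Exactly one token per field is expected (assumption based on the pattern).
    if len(tokens) != 3 or len(decoded) != 3:
        raise ValueError(f"expected one sex, ancestry and model token: {name}")
    return decoded


# Hypothetical file name, used only to exercise the function.
print(decode_filename("PublicRelease.WHRadjBMI.C.All.Add.txt.gz"))
```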


File headers:

1. snpname - dbSNP rsID

2. chr - chromosome

3. pos - position

4. markername - chr:pos

5. ref - reference allele (hg19 + strand)

6. alt - alternate allele (hg19 + strand)

7. beta - beta

8. se - standard error

9. pvalue - P value

10. n - sample size

11. gmaf/eur_maf - alternate allele frequency in 1000 Genomes Combined/European Ancestries

12. exac_maf/exac_nfe_maf - alternate allele frequency in ExAC Combined/Non-Finnish European Ancestries
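A minimal sketch of reading one of these files with the Python standard library is shown below. It assumes the files are tab-delimited with a header row using the column names listed above; the delimiter is not stated on this page, and the sample row is made up for illustration. The z statistic and two-sided p-value are recomputed from beta and se as a quick consistency check against the pvalue column.

```python
import gzip
import io
import math


def read_summary_rows(handle):
    """Yield one dict per variant from a text handle with a header row.

    Assumes tab-delimited columns (an assumption, not stated in the release notes).
    """
    header = handle.readline().rstrip("\n").split("\t")
    for line in handle:
        line = line.rstrip("\n")
        if line:
            yield dict(zip(header, line.split("\t")))


def z_score(row):
    """Wald z statistic implied by the beta and se columns."""
    return float(row["beta"]) / float(row["se"])


def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up one-variant file, compressed in memory to mimic a .txt.gz download.
sample = (
    "snpname\tchr\tpos\tmarkername\tref\talt\tbeta\tse\tpvalue\tn\n"
    "rs0000001\t1\t12345\t1:12345\tA\tG\t0.02\t0.01\t0.046\t100000\n"
)
with gzip.open(io.BytesIO(gzip.compress(sample.encode())), "rt") as handle:
    for row in read_summary_rows(handle):
        z = z_score(row)
        print(row["snpname"], round(z, 2), round(two_sided_p(z), 3))
```

For a real download, replace the in-memory buffer with `gzip.open(path, "rt")`.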



2018 Exome Array Summary Statistics

WHR Exome Array Summary Statistics

WHR Data Files:


BMI Exome Array Summary Statistics

BMI Data Files:

If you use these data, please cite: Turcot V, Lu Y, Highland HM, Schurmann C, Justice AE, Fine RS, Bradfield JP, Esko T, Giri A, Graff M, Guo X, Hendricks AE, et al. (2018). Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure in obesity. Nat Genet. 2018 Jan;50(1):26-41. doi: 10.1038/s41588-017-0011-x. Epub 2017 Dec 22. PMID: 29273807


Height Exome Array Summary Statistics

Height Data Files:


2018 GIANT and UK BioBank Meta Analysis

WHR GIANT and UK BioBank Meta-analysis Summary Statistics

WHR UK BioBank Meta Analysis for Public Release:

Access link for GWAS data for all ~27M sites: https://zenodo.org/record/1251813#.XCLJ7vZKhE4 DOI (10.5281/zenodo.1251813)


BMI and Height GIANT and UK BioBank Meta-analysis Summary Statistics

Please Note: We discovered that the BMI files for the meta-analysis of UK Biobank and GIANT originally uploaded did not reflect the full sample size and have now been corrected. If you downloaded these files prior to June 25, 2018, please download them again. Our apologies for any inconvenience.

If you use these data, please cite: Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, Frayling TM, Hirschhorn J, Yang J, Visscher PM, GIANT Consortium. (2018). Meta-analysis of genome-wide association studies for height and body mass index in ~700,000 individuals of European ancestry. bioRxiv.


2017 Gene x Environment Summary Statistics

Summary Statistics for Models Adjusting for Smoking Status

Column headers:

1. chromosome

2. rs_id: dbSNP ID

3. markername: chr:pos

4. allele_1: effect allele

5. allele_2: other allele

6. freq_Allele1_HapMapCEU: The allele frequency of Allele1 in the HapMap CEU population

7. effect: beta

8. stderr: standard error

9. p_value: p-value after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis

10. N: sample size


BMI Data Files


WCadjBMI:


WHRadjBMI:



Summary Statistics for Smoking Stratified Models

Column headers:

1. Chromosome

2. rs_id: dbSNP ID

3. markername: chr:pos

4. position_hg18: base pair position on build hg18

5. Effect_allele

6. Other_allele

7. EAF_HapMapCEU: The frequency of the effect allele in the HapMap CEU population

8. N_SMK: sample size for smokers

9. Effect_SMK: beta in smokers

10. StdErr_SMK: standard error in smokers

11. P_value_SMK: p-value for smokers after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis

12. N_NONSMK: sample size for nonsmokers

13. Effect_NonSMK: beta in nonsmokers

14. StdErr_NonSMK: standard error in nonsmokers

15. P_value_NonSMK: p-value for nonsmokers after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis
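One common way to use the stratified columns above is to compare the smoker and nonsmoker effects for a single variant. The sketch below is a generic difference-of-effects z test, not necessarily the method used in the associated publication. It assumes the two strata are independent samples, and the function name is hypothetical.

```python
import math


def strata_difference(effect_smk, stderr_smk, effect_nonsmk, stderr_nonsmk):
    """Compare two stratum-specific betas, assuming independent strata.

    Returns (difference, standard error of the difference, z, two-sided p).
    """
    diff = effect_smk - effect_nonsmk
    se_diff = math.sqrt(stderr_smk ** 2 + stderr_nonsmk ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, z, p


# Made-up values standing in for Effect_SMK, StdErr_SMK, Effect_NonSMK, StdErr_NonSMK.
print(strata_difference(0.05, 0.03, 0.01, 0.04))
```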


BMI:


WCadjBMI:


WHRadjBMI:


Summary Statistics for Gene x Physical Activity

  • Column1: rsid
  • Column2: Chromosome
  • Column3: Position_hg19
  • Column4: Effect_allele (positive strand)
  • Column5: Other_allele (positive strand)
  • Column6: EAF_HapMapCEU (frequency of effect allele based on the HapMap CEU reference population)
  • Column7: Sample_size
  • Column8: Effect (effect estimate)
  • Column9: Stderr (standard error)
  • Column10: Pvalue (p-value after meta-analysis in METAL based on regression coefficients: effect and standard error)
  • Column11: HetIsq (heterogeneity I² statistic, as reported by METAL)
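The Pvalue column above is described as coming from a METAL meta-analysis based on effects and standard errors, which corresponds to fixed-effect inverse-variance weighting. The sketch below shows that calculation for one variant across several studies, along with an I² heterogeneity estimate derived from Cochran's Q. It is an illustration of the technique with made-up inputs, not a reproduction of METAL itself.

```python
import math


def inverse_variance_meta(betas, stderrs):
    """Fixed-effect inverse-variance meta-analysis of per-study estimates.

    Returns (pooled beta, pooled standard error, two-sided p, I^2 in percent).
    """
    weights = [1.0 / se ** 2 for se in stderrs]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    # Cochran's Q measures the spread of study estimates around the pooled beta.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return beta, se, p, i_squared


# Made-up per-study betas and standard errors for a single variant.
print(inverse_variance_meta([0.1, 0.3], [0.1, 0.1]))
```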


File naming scheme:

  • Trait: BMI, WaistadjBMI (Waist circumference adjusted for BMI), WHRadjBMI (Waist-to-hip ratio adjusted for BMI)
  • Physical activity: ACTIVE (individuals defined as active), INACTIVE (individuals defined as inactive), SNPadjPA (all individuals, added covariate of physical activity level)
  • Gender: MEN, WOMEN, ALL (Men and Women)
  • Ancestry: European (only European cohorts included in meta-analysis), All Ancestry (all cohorts included)


BMI Data Files


WAISTadjBMI Data Files


WHRadjBMI Data Files


GIANT Consortium 2016 Exome Array Data is Available Here for Download

2016 Data File Description:

Each file consists of the following information for each SNP and its association to the specified trait based on meta-analysis in the respective publication. Significant digits for the p values, betas and standard errors are limited to two digits to further limit the possibility of identifiability.

  • Column 1. CHR: chromosome
  • Column 2. POS: position of the variant (hg19)
  • Column 3. REF: reference allele (hg19 + strand)
  • Column 4. ALT: alternate allele (hg19 + strand)
  • Column 5. SNPNAME: dbSNP name of the genetic marker
  • Column 6. (one among these)
    • GMAF: Non-reference allele and frequency of existing variant in 1000 Genomes
    • AFR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined African population
    • AMR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined American population
    • EUR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined European population
    • EAS_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined East Asian population
    • SAS_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined South Asian population
  • Column 7. (one among these)
    • ExAC_MAF: Frequency of existing variant in ExAC combined population
    • ExAC_AFR_MAF: Frequency of existing variant in ExAC African/African American population
    • ExAC_AMR_MAF: Frequency of existing variant in ExAC American population
    • ExAC_EAS_MAF: Frequency of existing variant in ExAC East Asian population
    • ExAC_NFE_MAF: Frequency of existing variant in ExAC Non-Finnish European population
    • ExAC_SAS_MAF: Frequency of existing variant in ExAC South Asian population
  • Column 8. beta: beta
  • Column 9. se: standard error
  • Column 10. Pvalue: p-value after meta-analysis using regression coefficients (beta and standard error)


GIANT Consortium 2012-2015 GWAS Summary Statistics

2012-2015 Data File Description:

Each file consists of the following information for each SNP and its association to the specified trait based on meta-analysis in the respective publication. Significant digits for the p values, betas and standard errors are limited to two digits to further limit the possibility of identifiability.

  • MarkerName: The dbSNP name of the genetic marker
  • Allele1: The first allele (hg19 + strand). Where the regression coefficients (betas) are provided, the first allele is the effect allele. Where betas are not provided (typically the 2010 data), the first allele is the trait-increasing allele.
  • Allele2: The second allele (hg19 + strand)
  • Freq.Allele1.HapMapCEU: The allele frequency of Allele1 in the HapMap CEU population
  • b: beta
  • SE: standard error
  • p: p-value after meta-analysis using regression coefficients (beta and standard error), and after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis
  • N: Number of observations
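Because Allele1 is the effect allele wherever betas are provided, combining these files with other data often requires re-expressing the effect for the other allele. The sketch below flips the sign of the beta and complements the frequency when the requested allele is Allele2. The function name is hypothetical, and since both alleles are reported on the + strand no strand handling is attempted.

```python
def align_to_allele(allele1, allele2, freq_allele1, beta, target):
    """Re-express a row so that `target` is the effect allele.

    Returns (effect allele, other allele, effect allele frequency, beta).
    """
    a1, a2, target = allele1.upper(), allele2.upper(), target.upper()
    if target == a1:
        return a1, a2, freq_allele1, beta
    if target == a2:
        # Swapping the effect allele negates beta and complements the frequency.
        return a2, a1, 1.0 - freq_allele1, -beta
    raise ValueError(f"allele {target} not present in {a1}/{a2}")


# Made-up row: Allele1=a, Allele2=g, Freq.Allele1.HapMapCEU=0.25, b=0.5.
print(align_to_allele("a", "g", 0.25, 0.5, "G"))
```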


For the Height DEPICT Gene Set Enrichment Analysis file, the columns are as follows:

  • A: the ID of the predefined gene set (before reconstitution by DEPICT);
  • B: the name of the gene set;
  • C: the DEPICT P-value for enrichment;
  • D: the false discovery rate for enrichment;
  • E: the genes in the gene set that overlap height-associated loci


GWAS Age-/Sex-Stratified 2015 BMI and WHR Summary Statistics

If you use these data, please cite: Winkler TW*, Justice AE*, Graff M*, Barata L*, Feitosa MF, Chu S, Czajkowski J, Esko T, Fall T, Kilpeläinen TO, Lu Y, Mägi R et al. (2015). The influence of age and sex on genetic associations with adult body size and shape: a large-scale genome-wide interaction study. PLoS Genetics. In press.


GWAS Anthropometric 2015 BMI Summary Statistics

If you use these BMI data, please cite: Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, Powell C, Vedantam S, Buchkovich ML, Yang J, Croteau-Chonka DC, Esko T et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197-206.


GWAS Anthropometric 2015 Waist Summary Statistics

HIP:

HIPadjBMI:

WC:

WCadjBMI:

WHR:

WHRadjBMI:

If you use these Waist data, please cite: Shungin D, Winkler TW, Croteau-Chonka DC, Ferreira T, Locke AE, Magi R, Strawbridge R, Pers TH, Fischer K, Justice AE, Workalemahu T, Wu JM, et al. (2015). New genetic loci link adipose and insulin biology to body fat distribution. Nature 518: 187-196.


GWAS Anthropometric 2014 Height Summary Statistics

If you use these Height data, please cite: Wood AR, Esko T, Yang J, Vedantam S, Pers TH, Gustafsson S et al. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nature Genetics 46(11):1173-86.


Variability in BMI and Height Summary Statistics

If you use these Body Mass Index or Height data, please cite: Yang J, Loos RJ, Powell JE, Medland SE, Speliotes EK, Chasman DI, Rose LM, Thorleifsson G, Steinthorsdottir V, Mägi R, et al. (2012). FTO genotype is associated with phenotypic variability of body mass index. Nature 490:267-272.


Sex Stratified Anthropometrics Summary Statistics

If you use these data, please cite: Randall JC, Winkler TW, Kutalik Z, Berndt SI, Jackson AU, Monda KL, Kilpeläinen TO, Esko T, Mägi R, Li S, et al. (2013). Sex-stratified genome-wide association studies including 270,000 individuals show sexual dimorphism in genetic loci for anthropometric traits. PLoS Genet 9: e1003500.


Extremes of Anthropometric Traits Summary Statistics

If you use these data, please cite: Berndt SI, Gustafsson S, Mägi R, Ganna A, Wheeler E, Feitosa MF, Justice AE, Monda KL, Croteau-Chonka DC, Day FR, et al. (2013). Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture. Nature Genetics 45:501-512.


GIANT Consortium 2010 GWAS Summary Statistics

2010 Data File Description:

Each file consists of the following information for each SNP and its association to the specified trait based on meta-analysis in the respective publication. SNPs where N < 50% of the maximum have been excluded.

  • MarkerName: The dbSNP name of the genetic marker
  • Allele1: The first allele, by definition the trait-increasing allele (hg18 + strand)
  • Allele2: The second allele (hg18 + strand)
  • Freq.Allele1.HapMapCEU: The allele frequency of Allele1 in the HapMap CEU population
  • P: P value after meta-analysis using regression coefficients (beta and standard error), and after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis
  • N: Number of observations
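The 2010 files carry no beta or standard error, but because Allele1 is by definition the trait-increasing allele, a signed z statistic can be recovered from the P column. The sketch below assumes P is a two-sided p-value from a normally distributed test statistic. Since the P values are already genomic-control corrected, the result reflects the corrected statistic. The function name is hypothetical.

```python
from statistics import NormalDist


def z_from_p(p_value):
    """Return the non-negative z statistic for Allele1 implied by a two-sided P."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError(f"p-value out of range: {p_value}")
    # Use the lower tail so that very small p-values keep their precision.
    return -NormalDist().inv_cdf(p_value / 2.0)


# Made-up P value; the sign is positive because Allele1 increases the trait.
print(round(z_from_p(0.05), 3))
```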


GWAS 2010 BMI Summary Statistics

BMI (download GZIP)

MD5 (GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq.txt -- 79 MB; 2,471,517 lines) = 38c836542807a3830101bcf48bb34472

If you use these Body Mass Index data, please cite: Speliotes, E.K., Willer, C.J., Berndt, S.I., Monda, K.L., Thorleifsson, G., Jackson, A.U., Allen, H.L., Lindgren, C.M., Luan, J., Magi, R., et al. (2010). Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet 42, 937-948.


GWAS 2010 Height Summary Statistics

Height (download GZIP)

MD5 (GIANT_HEIGHT_LangoAllen2010_publicrelease_HapMapCeuFreq.txt -- 82 MB; 2,469,636 lines) = b51b4c4ff1f03bd33c4b2dfd6b10cb82

If you use these height data, please cite: Lango Allen, H., Estrada, K., Lettre, G., Berndt, S.I., Weedon, M.N., Rivadeneira, F., Willer, C.J., Jackson, A.U., Vedantam, S., Raychaudhuri, S., et al. (2010). Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832-838.


GWAS 2010 WHRadjBMI Summary Statistics

WHRadjBMI (download GZIP)

MD5 (GIANT_WHRadjBMI_Heid2010_publicrelease_HapMapCeuFreq.txt -- 75 MB; 2,483,326 lines) = 8f7e2ca61c33a120db9e7bfe51e3c053

If you use these waist-hip ratio adjusted for BMI data, please cite: Heid, I.M., Jackson, A.U., Randall, J.C., Winkler, T.W., Qi, L., Steinthorsdottir, V., Thorleifsson, G., Zillikens, M.C., Speliotes, E.K., Magi, R., et al. (2010). Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nat Genet 42, 949-960.
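The MD5 lines for the three 2010 files can be checked after download with a short script such as the sketch below. The checksums are listed against the .txt file names, so the code assumes they apply to the decompressed text files; this page does not say so explicitly. The function names are hypothetical.

```python
import hashlib

# Checksums copied from the MD5 lines above.
EXPECTED_MD5 = {
    "GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq.txt":
        "38c836542807a3830101bcf48bb34472",
    "GIANT_HEIGHT_LangoAllen2010_publicrelease_HapMapCeuFreq.txt":
        "b51b4c4ff1f03bd33c4b2dfd6b10cb82",
    "GIANT_WHRadjBMI_Heid2010_publicrelease_HapMapCeuFreq.txt":
        "8f7e2ca61c33a120db9e7bfe51e3c053",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, filename):
    """Compare a local file against the checksum published for `filename`."""
    return md5_of_file(path) == EXPECTED_MD5[filename]
```

For example, `matches_published_md5("GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq.txt", "GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq.txt")` would return True for an intact decompressed BMI file under the assumption above.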