Difference between revisions of "GIANT consortium data files"

From Giant Consortium
(23 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
We are releasing the summary data from our meta-analyses of Genome-Wide Association Studies (GWAS) in order to enable other researchers to examine particular variants or loci for their evidence of association with anthropometric traits. The files include p-values and direction of effect at over 2 million directly genotyped or imputed single nucleotide polymorphisms (SNPs). To prevent the possibility of identification of individuals from these summary results, we are not releasing allele frequency data from our samples.  
 
  
=2018 WHR Exome for Public Release=
+
This page is being migrated, so during this time some files are being made available at the following site: [https://www.joelhirschhornlab.org/giant-consortium-results].
 +
This includes summary association results from Ried et al. ''Nature Communications'' 2016 (GWAS of body shape) and Yengo et al. 2022 (GWAS of height).
  
File Name Structure: PublicRelease.WHRadjBMI.*.*.*.txt.gz
+
='''2022 GWAS Summary Statistics and Polygenic Score (PGS) Weights'''=
 +
==Height GWAS Summary Statistics and PGS Weights==
  
'''Abbreviations in filenames:'''
+
GWAS summary statistics and PGS weights from Yengo et al. 2022 are temporarily available at the following sites: [https://www.joelhirschhornlab.org/giant-consortium-results] and [https://cnsgenomics.com/data/giant_2022/].
  
C-Combined Sexes
+
='''2018 Exome Array Summary Statistics'''=
 
+
==WHR Exome Array Summary Statistics==
M-Men
 
 
 
W-Women
 
 
 
All- All Ancestries
 
 
 
Eur- European descent only
 
 
 
Add- Additive genetic model
 
 
 
Rec- Recessive genetic model
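The file name structure and abbreviations above can be decoded programmatically. The following is a minimal sketch, not part of the release. It assumes that the three wildcard fields appear in the order sex, ancestry, genetic model, which the page does not state explicitly. The names `parse_release_filename` and `ReleaseFile` are hypothetical.

```python
import re
from typing import NamedTuple

# Abbreviations exactly as listed on this page.
SEX = {"C": "Combined sexes", "M": "Men", "W": "Women"}
ANCESTRY = {"All": "All ancestries", "Eur": "European descent only"}
MODEL = {"Add": "Additive genetic model", "Rec": "Recessive genetic model"}

# ASSUMPTION: the wildcard fields are ordered sex.ancestry.model.
PATTERN = re.compile(
    r"^PublicRelease\.WHRadjBMI\.([^.]+)\.([^.]+)\.([^.]+)\.txt\.gz$"
)


class ReleaseFile(NamedTuple):
    sex: str
    ancestry: str
    model: str


def parse_release_filename(name: str) -> ReleaseFile:
    """Expand the abbreviations in a WHRadjBMI public-release file name."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    sex, ancestry, model = match.groups()
    try:
        return ReleaseFile(SEX[sex], ANCESTRY[ancestry], MODEL[model])
    except KeyError as exc:
        raise ValueError(
            f"unknown abbreviation {exc.args[0]!r} in {name!r}"
        ) from None
```

If the actual field order differs, only `PATTERN` and the unpacking line need to change.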
 
 
 
 
 
'''File headers:'''
 
 
 
1. snpname - dbSNP rsID
 
 
 
2. chr - chromosome
 
   
 
3. pos - position
 
 
 
4. markername - chr:pos
 
   
 
5. ref - reference allele (hg19 + strand)
 
 
 
6. alt - alternate allele (hg19 + strand)
 
 
 
7. beta - regression coefficient (beta)
 
 
 
8. se - standard error
 
 
 
9. pvalue - P value
 
 
 
10. n - sample size
 
 
 
11. gmaf/eur_maf - alternate allele frequency in 1000 Genomes (combined / European ancestries)
 
 
 
12. exac_maf/exac_nfe_maf - alternate allele frequency in ExAC (combined / Non-Finnish European ancestries)
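The file headers listed above can be read by name rather than by position. This is a minimal sketch under two assumptions that the page does not confirm: that the files are tab-delimited with a header row, and that missing values are written as an empty string or `NA`. The function names are hypothetical.

```python
import csv
import gzip
import io
from typing import Iterator, Optional, TextIO

# Header names taken from the list above; these are converted to numbers.
NUMERIC_FIELDS = ("beta", "se", "pvalue", "n")


def to_float(text: str) -> Optional[float]:
    # ASSUMPTION: missing values appear as "" or "NA".
    return None if text in ("", "NA") else float(text)


def read_summary_rows(handle: TextIO) -> Iterator[dict]:
    """Yield one dict per variant, keyed by the header names."""
    # ASSUMPTION: tab-delimited text with a single header row.
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        for field in NUMERIC_FIELDS:
            row[field] = to_float(row[field])
        yield row


def open_summary_file(path: str) -> io.TextIOWrapper:
    """Open a .txt.gz summary file as text."""
    return gzip.open(path, "rt", newline="")
```

A typical use would be `for row in read_summary_rows(open_summary_file(path)): ...`, filtering on `row["pvalue"]` or `row["markername"]` as needed.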
 
 
 
 
 
 
 
=2018 Exome Array Summary Statistics=
 
== '''WHR Exome Array Summary Statistics''' ==
 
  
 
[[Media:Data_File_Description_2018_Exome_Array_Summary_Statistics_WHR_Exome_Array_Summary_Statistics.docx | Download WHR Exome Array Data File Description ]]
 
Line 71: Line 30:
  
  
== '''BMI Exome Array Summary Statistics''' ==
+
==BMI Exome Array Summary Statistics==
 
'''BMI Data Files:'''
 
 
*[[Media:BMI_African_American.fmt.gzip|BMI_African_American gzip]]
 
Line 83: Line 42:
  
  
== '''Height Exome Array Summary Statistics''' ==
+
==Height Exome Array Summary Statistics==
  
 
[[Media:Data_File_Description_2018_Exome_Array_Summary_Statistics_Height_Exome_Array_Summary_Statistics.docx‎ | Download Height Exome Array Data File Description]]
 
Line 97: Line 56:
  
  
=2018 GIANT and UK BioBank Meta-analysis=
+
='''2018 GIANT and UK BioBank Meta-analysis'''=
  
== '''WHR GIANT and UK BioBank Meta-analysis Summary Statistics''' ==
+
==WHR GIANT and UK BioBank Meta-analysis Summary Statistics==
  
 
'''WHR UK BioBank Meta Analysis for Public Release:'''
 
Line 114: Line 73:
  
  
== '''BMI and Height GIANT and UK BioBank Meta-analysis Summary Statistics''' ==
+
==BMI and Height GIANT and UK BioBank Meta-analysis Summary Statistics==
  
 
Please Note: We discovered that the BMI files for the meta-analysis of UK Biobank and GIANT originally uploaded did not reflect the full sample size and have now been corrected.  If you downloaded these files prior to June 25, 2018, please download them again. Our apologies for any inconvenience.
 
Line 128: Line 87:
  
  
=2017 Gene x Environment Summary Statistics=
+
='''2017 Gene x Environment Summary Statistics'''=
  
== '''Summary Statistics for Models Adjusting for Smoking Status''' ==
+
==Summary Statistics for Models Adjusting for Smoking Status==
  
  
Line 149: Line 108:
 
*[[Media: WHRadjBMI.SNPadjSMK.zip | Download WHRadjBMI GZIP]]
 
  
== '''Summary Statistics for Smoking Stratified Models''' ==
+
 
 +
 
 +
==Summary Statistics for Smoking Stratified Models==
  
  
Line 170: Line 131:
  
  
== '''Summary Statistics for Gene x Physical Activity''' ==
+
==Summary Statistics for Gene x Physical Activity==
  
  
*Column1: rsid
 
*Column2: Chromosome
 
*Column3: Position_hg19
 
*Column4: Effect_allele (positive strand)
 
*Column5: Other_allele (positive strand)
 
*Column6: EAF_HapMapCEU (frequency of effect allele based on the HapMap CEU reference population)
 
*Column7: Sample_size
 
*Column8: Effect (effect estimate)
 
*Column9: Stderr (standard error)
 
*Column10: Pvalue (p-value after meta-analysis in METAL based on regression coefficients: effect and standard error)
 
*Column11: HetIsq (I² heterogeneity statistic from the METAL meta-analysis)
 
  
 +
[[Media:Data_File_Description_2017_Gene_x_Environment_Summary_Statistics_Summary_Statistics_for_Gene_x_Physical_Activity.docx | Download Gene x Physical Activity Data File Description]]
  
'''File naming scheme:'''
 
*Trait: BMI, WaistadjBMI (Waist circumference adjusted for BMI), WHRadjBMI (Waist-to-hip ratio adjusted for BMI)
 
 
*Physical activity: ACTIVE (individuals defined as active), INACTIVE (individuals defined as inactive), SNPadjPA (all individuals, added covariate of physical activity level)
 
 
*Gender: MEN, WOMEN, ALL (Men and Women)
 
 
*Ancestry: European (only European cohorts included in meta-analysis), All Ancestry (all cohorts included)
 
  
  
Line 268: Line 211:
  
  
= GIANT Consortium 2016 Exome Array Data is Available Here for Download =
+
='''2016 GIANT Body Shape Meta-analysis'''=
  
  
===2016 Data File Description:===
+
Summary association results from Ried et al. ''Nature Communications'' 2016 are temporarily available here: [https://www.joelhirschhornlab.org/giant-consortium-results].
Each file consists of the following information for each SNP and its association to the specified trait based on meta-analysis in the respective publication. Significant digits for the p values, betas and standard errors are limited to two digits to further limit the possibility of identifiability.  
 
  
*Column 1. CHR: chromosome
+
='''GIANT Consortium 2012-2015 GWAS Summary Statistics'''=
*Column 2. POS: position of the variant (hg19)
 
*Column 3. REF: reference allele (hg19 + strand)
 
*Column 4. ALT: alternate allele (hg19 + strand)
 
*Column 5. SNPNAME: dbSNP name of the genetic marker
 
*Column 6. (one among these)
 
**GMAF: Non-reference allele and frequency of existing variant in 1000 Genomes
 
**AFR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined African population
 
**AMR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined American population
 
**EUR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined European population
 
**EAS_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined East Asian population
 
**SAS_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined South Asian population
 
*Column 7. (one among these)
 
**ExAC_MAF: Frequency of existing variant in ExAC combined population
 
**ExAC_AFR_MAF: Frequency of existing variant in ExAC African/African American population
 
**ExAC_AMR_MAF: Frequency of existing variant in ExAC American population
 
**ExAC_EAS_MAF: Frequency of existing variant in ExAC East Asian population
 
**ExAC_NFE_MAF: Frequency of existing variant in ExAC Non-Finnish European population
 
**ExAC_SAS_MAF: Frequency of existing variant in ExAC South Asian population
 
*Column 8. beta: regression coefficient (beta)
 
*Column 9. se: standard error
 
*Column 10. Pvalue: p-value after meta-analysis using regression coefficients (beta and standard error)
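Column 10 is described as derived from the regression coefficients. One common way to obtain a p-value from a beta and its standard error is a two-sided Wald test, sketched below. This is an illustration under that assumption, not a statement of the exact procedure used for these files. Because the released values are limited to two significant digits, a recomputed p-value will generally not match the released one exactly. The function name `wald_p_value` is hypothetical.

```python
import math


def wald_p_value(beta: float, se: float) -> float:
    """Two-sided p-value for z = beta / se under a standard normal null."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Using `erfc` rather than `1 - cdf` avoids loss of precision for the very small p-values typical of genome-wide significant variants.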
 
  
  
= GIANT Consortium 2012-2015 GWAS Summary Statistics =
 
  
===2012-2015 Data File Description:===
+
[[Media:Data_File_Description_2012-2015_GIANT_GWAS_Summary_Statistics.docx | Download 2012-2015 GIANT GWAS Summary Statistics Data File Description]]
Each file consists of the following information for each SNP and its association to the specified trait based on meta-analysis in the respective publication. Significant digits for the p values, betas and standard errors are limited to two digits to further limit the possibility of identifiability.
 
*'''MarkerName''': The [http://www.ncbi.nlm.nih.gov/SNP/ dbSNP] name of the genetic marker
 
*'''Allele1''': The first allele (hg19 + strand). Where the regression coefficients (betas) are provided, the first allele is the effect allele.  Where betas are not provided (typically the 2010 data), the first allele is the trait-increasing allele.
 
*'''Allele2''': The second allele (hg19 + strand)
 
*'''Freq.Allele1.HapMapCEU''': The allele frequency of Allele1 in the [http://www.hapmap.org HapMap] CEU population
 
*'''b''': regression coefficient (beta)
 
*'''SE''': standard error
 
*'''p''': p-value after meta-analysis using regression coefficients (beta and standard error), and after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis
 
*'''N''': Number of observations
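The '''p''' column above is described as corrected for inflation using genomic control. As a rough illustration of what such a correction involves, the sketch below estimates the inflation factor from a set of p-values and rescales a standard error. It assumes 1-degree-of-freedom test statistics and the common convention of correcting only when the factor exceeds 1. It is not the consortium's pipeline, and the function names are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Estimate lambda from two-sided p-values, each in the range (0, 1]."""
    # Convert each p-value back to a 1-df chi-square statistic (z squared).
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


def gc_correct_se(se, lam):
    """Inflate a standard error by sqrt(lambda).

    ASSUMPTION: no correction is applied when lambda is below 1.
    """
    return se * math.sqrt(max(lam, 1.0))
```

Since the released p-values have already been corrected, applying this again to the downloaded files would over-correct; the sketch is meant only to show the calculation.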
 
 
 
 
 
For the Height DEPICT Gene Set Enrichment Analysis file, the columns are as follows:
 
*'''A''': the ID of the predefined gene set (before reconstitution by DEPICT);
 
*'''B''': the name of the gene set;
 
*'''C''': the DEPICT P-value for enrichment;
 
*'''D''': the false discovery rate for enrichment;
 
*'''E''': the genes in the gene set that overlap height-associated loci
 
  
  
Line 474: Line 377:
  
  
=GIANT Consortium 2010 GWAS Summary Statistics=
 
  
===2010 Data File Description:===
+
='''GIANT Consortium 2010 GWAS Summary Statistics'''=
 +
 
 +
 
  
Each file consists of the following information for each SNP and its association to the specified trait based on meta-analysis in the respective publication. SNPs where N < 50% of the maximum have been excluded.
+
[[Media:Data_File_Description_2010_GIANT_Consortium_2010_GWAS_Summary_Statistics.docx | Download GIANT Consortium 2010 GWAS Summary Statistics Data File Description]]
  
*'''MarkerName''': The [http://www.ncbi.nlm.nih.gov/SNP/ dbSNP] name of the genetic marker
 
*'''Allele1''': The first allele, by definition the trait-increasing allele (hg18 + strand)
 
*'''Allele2''': The second allele (hg18 + strand)
 
*'''Freq.Allele1.HapMapCEU''': The allele frequency of Allele1 in the [http://www.hapmap.org HapMap] CEU population
 
*'''P''': P value after meta-analysis using regression coefficients (beta and standard error), and after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis
 
*'''N''': Number of observations
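The exclusion rule stated above (SNPs where N is below 50% of the maximum are excluded) can be reproduced on any table with an '''N''' column. The sketch below assumes the rows have already been loaded as dictionaries with a numeric `N` value; the function name is hypothetical.

```python
def filter_by_sample_size(rows, min_fraction=0.5):
    """Keep rows whose N is at least min_fraction of the largest N."""
    rows = list(rows)
    if not rows:
        return []
    n_max = max(row["N"] for row in rows)
    # Rows with N < min_fraction * n_max are dropped, as described above.
    return [row for row in rows if row["N"] >= min_fraction * n_max]
```

The released 2010 files have already had this filter applied, so the function is mainly useful for applying the same rule to other data sets.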
 
  
  

Revision as of 16:16, 12 August 2022

We are releasing the summary data from our meta-analyses of Genome-Wide Association Studies (GWAS) in order to enable other researchers to examine particular variants or loci for their evidence of association with anthropometric traits. The files include p-values and direction of effect at over 2 million directly genotyped or imputed single nucleotide polymorphisms (SNPs). To prevent the possibility of identification of individuals from these summary results, we are not releasing allele frequency data from our samples.

This page is being migrated, so during this time some files are being made available at the following site: https://www.joelhirschhornlab.org/giant-consortium-results. This includes summary association results from Ried et al. Nature Communications 2016 (GWAS of body shape) and Yengo et al. 2022 (GWAS of height).

2022 GWAS Summary Statistics and Polygenic Score (PGS) Weights

Height GWAS Summary Statistics and PGS Weights

GWAS summary statistics and PGS weights from Yengo et al. 2022 are temporarily available at the following sites: https://www.joelhirschhornlab.org/giant-consortium-results and https://cnsgenomics.com/data/giant_2022/.

2018 Exome Array Summary Statistics

WHR Exome Array Summary Statistics

Download WHR Exome Array Data File Description


WHR Data Files:


BMI Exome Array Summary Statistics

BMI Data Files:

If you use these data, please cite: Turcot V, Lu Y, Highland HM, Schurmann C, Justice AE, Fine RS, Bradfield JP, Esko T, Giri A, Graff M, Guo X, Hendricks AE, et al. (2018). Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure in obesity. Nat Genet. 2018 Jan;50(1):26-41. doi: 10.1038/s41588-017-0011-x. Epub 2017 Dec 22. PMID: 29273807


Height Exome Array Summary Statistics

Download Height Exome Array Data File Description


Height Data Files:


2018 GIANT and UK BioBank Meta-analysis

WHR GIANT and UK BioBank Meta-analysis Summary Statistics

WHR UK BioBank Meta Analysis for Public Release:

Access link for GWAS data for all ~27M sites: https://zenodo.org/record/1251813 (DOI: 10.5281/zenodo.1251813)


BMI and Height GIANT and UK BioBank Meta-analysis Summary Statistics

Please Note: We discovered that the BMI files for the meta-analysis of UK Biobank and GIANT originally uploaded did not reflect the full sample size and have now been corrected. If you downloaded these files prior to June 25, 2018, please download them again. Our apologies for any inconvenience.

If you use these data, please cite: Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, Frayling TM, Hirschhorn J, Yang J, Visscher PM, GIANT Consortium. (2018). Meta-analysis of genome-wide association studies for height and body mass index in ~700,000 individuals of European ancestry. bioRxiv.


2017 Gene x Environment Summary Statistics

Summary Statistics for Models Adjusting for Smoking Status

Download Models Adjusting for Smoking Status Data File Description


BMI:


WCadjBMI:


WHRadjBMI:


Summary Statistics for Smoking Stratified Models

Download Smoking Stratified Models Data File Description


BMI:


WCadjBMI:


WHRadjBMI:


Summary Statistics for Gene x Physical Activity

Download Gene x Physical Activity Data File Description


BMI Data Files:


WAISTadjBMI Data Files:


WHRadjBMI Data Files:


2016 GIANT Body Shape Meta-analysis

Summary association results from Ried et al. Nature Communications 2016 are temporarily available here: https://www.joelhirschhornlab.org/giant-consortium-results.

GIANT Consortium 2012-2015 GWAS Summary Statistics

Download 2012-2015 GIANT GWAS Summary Statistics Data File Description


GWAS Age-/Sex-Stratified 2015 BMI and WHR Summary Statistics

If you use these data, please cite: Winkler TW*, Justice AE*, Graff M*, Barata L*, Feitosa MF, Chu S, Czajkowski J, Esko T, Fall T, Kilpeläinen TO, Lu Y, Mägi R et al. (2015). The influence of age and sex on genetic associations with adult body size and shape: a large-scale genome-wide interaction study. PLoS Genetics. In press.


GWAS Anthropometric 2015 BMI Summary Statistics

If you use these BMI data, please cite: Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, Powell C, Vedantam S, Buchkovich ML, Yang J, Croteau-Chonka DC, Esko T et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197-206.


GWAS Anthropometric 2015 Waist Summary Statistics

HIP:

HIPadjBMI:

WC:

WCadjBMI:

WHR:

WHRadjBMI:

If you use these Waist data, please cite: Shungin D, Winkler TW, Croteau-Chonka DC, Ferreira T, Locke AE, Magi R, Strawbridge R, Pers TH, Fischer K, Justice AE, Workalemahu T, Wu JM, et al. (2015). New genetic loci link adipose and insulin biology to body fat distribution. Nature 518: 187-196.


GWAS Anthropometric 2014 Height Summary Statistics

If you use these Height data, please cite: Wood AR, Esko T, Yang J, Vedantam S, Pers TH, Gustafsson S et al. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nature Genetics 46(11):1173-86.


Variability in BMI and Height Summary Statistics

If you use these Body Mass Index or Height data, please cite: Yang J, Loos RJ, Powell JE, Medland SE, Speliotes EK, Chasman DI, Rose LM, Thorleifsson G, Steinthorsdottir V, Mägi R, et al. (2012). FTO genotype is associated with phenotypic variability of body mass index. Nature 490:267-272.


Sex Stratified Anthropometrics Summary Statistics

If you use these data, please cite: Randall JC, Winkler TW, Kutalik Z, Berndt SI, Jackson AU, Monda KL, Kilpeläinen TO, Esko T, Mägi R, Li S, et al. (2013). Sex-stratified genome-wide association studies including 270,000 individuals show sexual dimorphism in genetic loci for anthropometric traits. PLoS Genet 9: e1003500.


Extremes of Anthropometric Traits Summary Statistics

If you use these data, please cite: Berndt SI, Gustafsson S, Mägi R, Ganna A, Wheeler E, Feitosa MF, Justice AE, Monda KL, Croteau-Chonka DC, Day FR, et al. (2013). Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture. Nature Genetics 45:501-512.


GIANT Consortium 2010 GWAS Summary Statistics

Download GIANT Consortium 2010 GWAS Summary Statistics Data File Description


GWAS 2010 BMI Summary Statistics

If you use these Body Mass Index data, please cite: Speliotes, E.K., Willer, C.J., Berndt, S.I., Monda, K.L., Thorleifsson, G., Jackson, A.U., Allen, H.L., Lindgren, C.M., Luan, J., Magi, R., et al. (2010). Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet 42, 937-948.


GWAS 2010 Height Summary Statistics

If you use these Height data, please cite: Lango Allen, H., Estrada, K., Lettre, G., Berndt, S.I., Weedon, M.N., Rivadeneira, F., Willer, C.J., Jackson, A.U., Vedantam, S., Raychaudhuri, S., et al. (2010). Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832-838.


GWAS 2010 WHRadjBMI Summary Statistics

If you use these waist-hip ratio adjusted for BMI data, please cite: Heid, I.M., Jackson, A.U., Randall, J.C., Winkler, T.W., Qi, L., Steinthorsdottir, V., Thorleifsson, G., Zillikens, M.C., Speliotes, E.K., Magi, R., et al. (2010). Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nat Genet 42, 949-960.