Difference between revisions of "GIANT consortium data files"


Revision as of 13:10, 1 February 2017

We are releasing the summary data from our 2010-2016 meta-analyses of Genome-wide Association (GWA) data, in order to enable other researchers to examine particular variants or loci for their evidence of association with anthropometric traits. The files include p-values and direction of effect at over 2 million directly genotyped or imputed single nucleotide polymorphisms (SNPs). To prevent the possibility of identification of individuals from these summary results, we are not releasing allele frequency data from our samples.

GIANT Consortium 2016 Exome Array Data is Available Here for Download

2016 Data File Description:

Each file contains the following information for each SNP and its association with the specified trait, based on the meta-analysis in the respective publication. The p-values, betas and standard errors are reported to two significant digits to further limit the possibility of identifiability.

  • Column 1. CHR: chromosome
  • Column 2. POS: position of the variant (hg19)
  • Column 3. REF: reference allele (hg19 + strand)
  • Column 4. ALT: alternate allele (hg19 + strand)
  • Column 5. SNPNAME: dbSNP name of the genetic marker
  • Column 6. (one among these)

      • GMAF: Non-reference allele and frequency of existing variant in 1000 Genomes
      • AFR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined African population
      • AMR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined American population
      • EUR_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined European population
      • EAS_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined East Asian population
      • SAS_MAF: Non-reference allele and frequency of existing variant in 1000 Genomes combined South Asian population

  • Column 7. (one among these)

      • ExAC_MAF: Frequency of existing variant in ExAC combined population
      • ExAC_AFR_MAF: Frequency of existing variant in ExAC African/American population
      • ExAC_AMR_MAF: Frequency of existing variant in ExAC American population
      • ExAC_EAS_MAF: Frequency of existing variant in ExAC East Asian population
      • ExAC_NFE_MAF: Frequency of existing variant in ExAC Non-Finnish European population
      • ExAC_SAS_MAF: Frequency of existing variant in ExAC South Asian population

  • Column 8. beta: beta
  • Column 9. se: standard error
  • Column 10. Pvalue: p-value after meta-analysis using regression coefficients (beta and standard error)
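
The ten-column layout above could be read with a short Python sketch like the one below. It is only an illustration under several assumptions that the description does not state: that the files are tab-delimited, that they begin with a header row, and that every numeric field is present. The function name `read_exome_rows` is hypothetical, and the sample row is synthetic, not real GIANT data. Columns 6 and 7 are kept as strings because their header names, and possibly their value format, vary from file to file.

```python
import csv
import io


def read_exome_rows(handle, delimiter="\t"):
    """Yield one dict per variant from a 10-column 2016 exome array file.

    Assumes a header row and tab-delimited text; neither is stated in the
    file description, so adjust `delimiter` if the real files differ.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    if len(header) != 10:
        raise ValueError(f"expected 10 columns, found {len(header)}")
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        row[header[1]] = int(fields[1])  # column 2: POS (hg19)
        for i in (7, 8, 9):  # columns 8-10: beta, se, Pvalue
            row[header[i]] = float(fields[i])
        yield row


# Synthetic one-variant example (illustrative values only).
sample = io.StringIO(
    "CHR\tPOS\tREF\tALT\tSNPNAME\tGMAF\tExAC_MAF\tbeta\tse\tPvalue\n"
    "1\t12345\tA\tG\trs0000001\tG:0.12\tG:0.11\t0.021\t0.0050\t2.7e-05\n"
)
rows = list(read_exome_rows(sample))
```

Values are addressed by position for the conversions, so the sketch does not depend on the exact spelling of the header names.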

GIANT consortium 2012-2015 GWAS Metadata is Available Here for Download

2012-2015 Data File Description:

Each file contains the following information for each SNP and its association with the specified trait, based on the meta-analysis in the respective publication. The p-values, betas and standard errors are reported to two significant digits to further limit the possibility of identifiability.

  • MarkerName: The dbSNP name of the genetic marker
  • Allele1: The first allele (hg19 + strand). Where the regression coefficients (betas) are provided, the first allele is the effect allele. Where betas are not provided (typically the 2010 data), the first allele is the trait-increasing allele.
  • Allele2: The second allele (hg19 + strand)
  • Freq.Allele1.HapMapCEU: The allele frequency of Allele1 in the HapMap CEU population
  • b: beta
  • SE: standard error
  • p: p-value after meta-analysis using regression coefficients (beta and standard error), and after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis
  • N: Number of observations
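
Because Allele1 is the effect allele wherever betas are provided, a beta can be re-expressed for Allele2 by flipping its sign. The Python sketch below illustrates this, together with a z-score computed as b / SE. It assumes whitespace-delimited files whose header row uses the column names listed above; the helper names (`load_gwas`, `beta_for_allele`, `z_score`) are hypothetical, and the sample row is synthetic.

```python
import io


def load_gwas(handle):
    """Read whitespace-delimited rows into dicts keyed by the header names."""
    header = handle.readline().split()
    return [dict(zip(header, line.split())) for line in handle if line.strip()]


def beta_for_allele(row, allele):
    """Return the beta expressed for `allele`, which must be Allele1 or Allele2."""
    beta = float(row["b"])
    allele = allele.upper()
    if allele == row["Allele1"].upper():
        return beta  # Allele1 is the effect allele
    if allele == row["Allele2"].upper():
        return -beta  # switching the effect allele flips the sign
    raise ValueError(f"{allele} is not an allele of {row['MarkerName']}")


def z_score(row):
    """Z statistic implied by the reported beta and standard error."""
    return float(row["b"]) / float(row["SE"])


# Synthetic example row (illustrative values only, not GIANT data).
sample = io.StringIO(
    "MarkerName Allele1 Allele2 Freq.Allele1.HapMapCEU b SE p N\n"
    "rs0000001 A G 0.25 0.5 0.25 0.046 100000\n"
)
gwas_rows = load_gwas(sample)
```

Since the released values are limited to two significant digits, a z-score recomputed this way will only approximate the statistic behind the reported p-value.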


For the Height DEPICT Gene Set Enrichment Analysis file, the columns are as follows:

  • A: the ID of the predefined gene set (before reconstitution by DEPICT);
  • B: the name of the gene set;
  • C: the DEPICT P-value for enrichment;
  • D: the false discovery rate for enrichment;
  • E: the genes in the gene set that overlap height-associated loci


GWAMA Age-/Sex-Stratified 2015 BMI and WHR

If you use these data, please cite: Winkler TW*, Justice AE*, Graff M*, Barata L*, Feitosa MF, Chu S, Czajkowski J, Esko T, Fall T, Kilpeläinen TO, Lu Y, Mägi R et al. (2015). The influence of age and sex on genetic associations with adult body size and shape: a large-scale genome-wide interaction study. PLoS Genetics. In press.

GWAS Anthropometric 2015 BMI

If you use these BMI data, please cite: Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, Powell C, Vedantam S, Buchkovich ML, Yang J, Croteau-Chonka DC, Esko T et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197-206.

GWAS Anthropometric 2015 Waist

HIP:

HIPadjBMI:

WC:

WCadjBMI:

WHR:

WHRadjBMI:

If you use these Waist data please cite: Shungin D, Winkler TW, Croteau-Chonka DC, Ferreira T, Locke AE, Magi R, Strawbridge R, Pers TH, Fischer K, Justice AE, Workalemahu T, Wu JM, et al. (2015). New genetic loci link adipose and insulin biology to body fat distribution. Nature 518: 187-196.

GWAS Anthropometric 2014 Height

If you use these Height data, please cite: Wood AR, Esko T, Yang J, Vedantam S, Pers TH, Gustafsson S et al. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nature Genetics 46(11):1173-86.

Variability in BMI and Height

If you use these Body Mass Index or Height data, please cite: Yang J, Loos RJ, Powell JE, Medland SE, Speliotes EK, Chasman DI, Rose LM, Thorleifsson G, Steinthorsdottir V, Mägi R, et al. (2012). FTO genotype is associated with phenotypic variability of body mass index. Nature 490:267-272.

Sex Stratified Anthropometrics

If you use these data, please cite: Randall JC, Winkler TW, Kutalik Z, Berndt SI, Jackson AU, Monda KL, Kilpeläinen TO, Esko T, Mägi R, Li S, et al. (2013). Sex-stratified genome-wide association studies including 270,000 individuals show sexual dimorphism in genetic loci for anthropometric traits. PLoS Genet 9: e1003500.

Extremes of Anthropometric Traits

If you use these data, please cite: Berndt SI, Gustafsson S, Mägi R, Ganna A, Wheeler E, Feitosa MF, Justice AE, Monda KL, Croteau-Chonka DC, Day FR, et al. (2013). Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture. Nature Genetics 45:501-512.

GIANT Consortium 2010 GWAS Metadata is Available Here for Download

2010 Data File Description:

Each file consists of the following information for each SNP and its association to the specified trait based on meta-analysis in the respective publication. SNPs where N < 50% of the maximum have been excluded.

  • MarkerName: The dbSNP name of the genetic marker
  • Allele1: The first allele, by definition the trait-increasing allele (hg18 + strand)
  • Allele2: The second allele (hg18 + strand)
  • Freq.Allele1.HapMapCEU: The allele frequency of Allele1 in the HapMap CEU population
  • P: P value after meta-analysis using regression coefficients (beta and standard error), and after correction for inflation of test statistics using genomic control both at the individual study level and again after meta-analysis
  • N: Number of observations
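
The sample-size exclusion described above (dropping SNPs where N is below 50% of the maximum) can be sketched in Python as follows. This is an illustration rather than the consortium's actual procedure: it assumes that "the maximum" means the largest N among the rows supplied, and the function name `filter_by_sample_size` and the example rows are hypothetical.

```python
def filter_by_sample_size(rows, min_fraction=0.5):
    """Keep rows whose N is at least `min_fraction` of the largest N.

    `rows` is a list of dicts with an "N" entry. Rows with N exactly at
    the threshold are kept, since only N < 50% of the maximum is excluded.
    """
    if not rows:
        return []
    n_max = max(float(row["N"]) for row in rows)
    threshold = min_fraction * n_max
    return [row for row in rows if float(row["N"]) >= threshold]


# Synthetic rows (illustrative values only).
example = [
    {"MarkerName": "rs0000001", "N": "100000"},
    {"MarkerName": "rs0000002", "N": "50000"},
    {"MarkerName": "rs0000003", "N": "40000"},
]
kept = filter_by_sample_size(example)
```

The released 2010 files already have this exclusion applied, so the sketch is mainly useful for applying the same rule to other files.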

BMI (download GZIP)

MD5 (GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq.txt -- 79 MB; 2,471,517 lines) = 38c836542807a3830101bcf48bb34472

If you use these Body Mass Index data, please cite: Speliotes, E.K., Willer, C.J., Berndt, S.I., Monda, K.L., Thorleifsson, G., Jackson, A.U., Allen, H.L., Lindgren, C.M., Luan, J., Magi, R., et al. (2010). Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet 42, 937-948.

Height (download GZIP)

MD5 (GIANT_HEIGHT_LangoAllen2010_publicrelease_HapMapCeuFreq.txt -- 82 MB; 2,469,636 lines) = b51b4c4ff1f03bd33c4b2dfd6b10cb82

If you use these height data, please cite: Lango Allen, H., Estrada, K., Lettre, G., Berndt, S.I., Weedon, M.N., Rivadeneira, F., Willer, C.J., Jackson, A.U., Vedantam, S., Raychaudhuri, S., et al. (2010). Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832-838.

WHRadjBMI (download GZIP)

MD5 (GIANT_WHRadjBMI_Heid2010_publicrelease_HapMapCeuFreq.txt -- 75 MB; 2,483,326 lines) = 8f7e2ca61c33a120db9e7bfe51e3c053

If you use these waist-hip ratio adjusted for BMI data, please cite: Heid, I.M., Jackson, A.U., Randall, J.C., Winkler, T.W., Qi, L., Steinthorsdottir, V., Thorleifsson, G., Zillikens, M.C., Speliotes, E.K., Magi, R., et al. (2010). Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nat Genet 42, 949-960.
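
The MD5 checksums listed above can be checked after download with Python's standard `hashlib` module, as in the sketch below. It assumes that each checksum refers to the uncompressed .txt file named on the MD5 line (so the GZIP download is decompressed first) and that the file keeps its original name; the helper names `md5_of_file` and `matches_published_md5` are hypothetical.

```python
import hashlib
import os

# File names and checksums copied from the MD5 lines above.
EXPECTED_MD5 = {
    "GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq.txt":
        "38c836542807a3830101bcf48bb34472",
    "GIANT_HEIGHT_LangoAllen2010_publicrelease_HapMapCeuFreq.txt":
        "b51b4c4ff1f03bd33c4b2dfd6b10cb82",
    "GIANT_WHRadjBMI_Heid2010_publicrelease_HapMapCeuFreq.txt":
        "8f7e2ca61c33a120db9e7bfe51e3c053",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path):
    """True if the file at `path` matches the checksum listed for its name."""
    expected = EXPECTED_MD5[os.path.basename(path)]
    return md5_of_file(path) == expected
```

For example, `matches_published_md5("GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq.txt")` would return `True` only for an intact copy of the uncompressed BMI file in the current directory.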