Samples were sequenced at multiple centers, including CIDR, UW, Mayo, Mt. Sinai, and CHOP, on the PGRN-Seq targeted exome platform with Illumina HiSeq technology. Each site aligned its reads to the GRCh37 build with decoy sequences. The aligned reads were then jointly called with GATK HaplotypeCaller, version 3.3-0, following the GATK best practices. Although both INDELs and SNPs were called, SPHINX currently displays only the SNP calls.
Raw variant calls were dropped if they met any of the following conditions: QUAL < 50; ABHet > 0.75; QD < 5.0. Raw genotype calls were also dropped if they met any of the following conditions: GQ < 50; heterozygous call with AB > 0.75. Allele frequencies are shown for the overall dataset and stratified by European ancestry and African ancestry (the two largest groups available in the dataset). Because the dataset also contains a small number of individuals from other ancestry groups, the stratified frequencies may not add up to the overall dataset frequency.
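The variant-level and genotype-level filters described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the pipeline actually used to produce the SPHINX data: the class and function names (VariantCall, GenotypeCall, variant_passes, genotype_passes) are hypothetical, and treating a missing ABHet or AB annotation as "not failing" is an assumption that the text above does not address.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the filter description above.
QUAL_MIN = 50.0   # variant dropped if QUAL < 50
ABHET_MAX = 0.75  # variant dropped if ABHet > 0.75
QD_MIN = 5.0      # variant dropped if QD < 5.0
GQ_MIN = 50       # genotype dropped if GQ < 50
AB_MAX = 0.75     # heterozygous genotype dropped if AB > 0.75


@dataclass
class VariantCall:
    """Hypothetical container for site-level annotations."""
    qual: float
    qd: float
    abhet: Optional[float] = None  # may be absent, e.g. no heterozygous calls


@dataclass
class GenotypeCall:
    """Hypothetical container for per-sample annotations."""
    gq: int
    is_het: bool
    ab: Optional[float] = None  # allele balance, if annotated


def variant_passes(v: VariantCall) -> bool:
    """Return True if the variant survives all three site-level filters."""
    if v.qual < QUAL_MIN or v.qd < QD_MIN:
        return False
    # Assumption: a missing ABHet annotation does not fail the filter.
    return v.abhet is None or v.abhet <= ABHET_MAX


def genotype_passes(g: GenotypeCall) -> bool:
    """Return True if the genotype survives both genotype-level filters."""
    if g.gq < GQ_MIN:
        return False
    # The allele-balance filter applies to heterozygous calls only.
    # Assumption: a missing AB annotation does not fail the filter.
    if g.is_het and g.ab is not None and g.ab > AB_MAX:
        return False
    return True
```

Because the thresholds are strict inequalities, a call sitting exactly on a boundary (for example QUAL = 50 or AB = 0.75) is kept in this sketch.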