Welcome to Myers Laboratory


  


  
HaploBuild
Introduction
  The analysis of large amounts of SNP data creates difficulties for the analysis of haplotypes and their association with traits of interest. Commonly, fairly simple methods such as two- or three-SNP sliding windows are used to create haplotypes across large regions, but these may be of limited value when adjacent SNPs are in strong linkage disequilibrium (LD) and provide redundant information. We have created a novel program, “HaploBuild”, for constructing and testing haplotypes from SNPs that are in close physical proximity to one another but not necessarily contiguous. Furthermore, the number of SNPs in a haplotype is not restricted, permitting the evaluation of complex haplotype structures.
 
Algorithm
  The HaploBuild algorithm defines a heuristic for choosing markers that are combined into a haplotype and tested for association with a disease phenotype. Given a set of genotyped markers, the algorithm works in three steps. The first step tests for association with all two-marker haplotypes in which the markers are within a physical distance d of each other (typically 50 kb). If the P-value for association of any of these two-marker haplotypes is less than a specified alpha level, the pair of markers is saved for step 2. The second step builds a graph from each two-marker haplotype that reached significance in step 1. The goal of the graph construction is to add markers to the haplotype iteratively, one at a time and in a depth-first manner, with each addition increasing the overall significance of the haplotype association. In this context, the source node represents the base haplotype, and each of its children corresponds to a successful addition of a marker that increases the haplotype length from n to n + 1 markers. Consequently, the sink nodes of a completed tree represent full-length haplotypes: no marker within a distance d of the haplotype markers can be added that would strengthen the P-value for the association test.
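  The first two steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not HaploBuild's own Perl code: the Marker type, the function names, and the p_value callback are hypothetical, and the callback stands in for the real association test (FBAT is listed below as required software). The sketch also assumes that "within a distance d of the haplotype markers" means within d of every marker already in the haplotype, and that a marker is added only if it strictly lowers the P-value.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Set


@dataclass(frozen=True)
class Marker:
    name: str
    position: int  # base-pair position on the chromosome


Haplotype = FrozenSet[Marker]
PValueFn = Callable[[Haplotype], float]  # hypothetical stand-in for the association test


def seed_pairs(markers: Sequence[Marker], p_value: PValueFn,
               alpha: float, d: int = 50_000) -> List[Haplotype]:
    """Step 1: keep two-marker haplotypes within distance d whose P-value is below alpha."""
    seeds = []
    for a, b in combinations(markers, 2):
        if abs(a.position - b.position) <= d:
            pair = frozenset((a, b))
            if p_value(pair) < alpha:
                seeds.append(pair)
    return seeds


def extend(haplotype: Haplotype, markers: Sequence[Marker],
           p_value: PValueFn, d: int = 50_000) -> Set[Haplotype]:
    """Step 2: depth-first extension; returns haplotypes that no nearby marker improves."""
    current_p = p_value(haplotype)
    leaves: Set[Haplotype] = set()
    for m in markers:
        if m in haplotype:
            continue
        # Assumption: the new marker must lie within d of every marker in the haplotype.
        if all(abs(m.position - h.position) <= d for h in haplotype):
            candidate = haplotype | {m}
            if p_value(candidate) < current_p:  # the addition strengthens the association
                leaves |= extend(candidate, markers, p_value, d)
    # No child improved the P-value, so this node is a sink (full-length haplotype).
    return leaves or {haplotype}


def build_haplotypes(markers: Sequence[Marker], p_value: PValueFn,
                     alpha: float, d: int = 50_000) -> Set[Haplotype]:
    """Run step 1, then grow every significant pair as in step 2."""
    result: Set[Haplotype] = set()
    for seed in seed_pairs(markers, p_value, alpha, d):
        result |= extend(seed, markers, p_value, d)
    return result


# Toy data: positions and the P-value function are invented for illustration only.
toy_markers = [Marker("A", 1_000), Marker("B", 20_000), Marker("C", 40_000),
               Marker("D", 45_000), Marker("E", 200_000)]
causal = {"A", "B", "C"}


def toy_p(h: Haplotype) -> float:
    n_causal = sum(m.name in causal for m in h)
    return min(1.0, 0.1 ** n_causal * 2.0 ** (len(h) - n_causal))


toy_seeds = seed_pairs(toy_markers, toy_p, alpha=0.05)
toy_result = build_haplotypes(toy_markers, toy_p, alpha=0.05)
```

  With these toy inputs, the three pairs drawn from A, B and C pass step 1, and each of them grows into the same three-marker haplotype in step 2. Returning a set removes the duplicates that arise when different paths reach the same haplotype. The recursion can revisit the same candidate many times, so a real implementation would likely cache P-values.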
 
Required software/modules that need to be installed to run HaploBuild
  Perl Modules
 

. Tree::Nary
. GraphViz
. Getopt::Std
. Chart::Graph::Gnuplot qw(gnuplot)
. Math::Random

  Software
 
. FBAT
. GraphViz (optional for graph drawing)
. R Project for Statistical Computing
. qvalue package in R
. Haploview (optional for LD calculations). For Linux and Mac users, the Haploview.jar file must be in the directory from which HaploBuild is executed.
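  Before a first run, it can help to confirm that the requirements listed above are present. The following Python sketch is one way to do that and is not part of HaploBuild. The executable names ("perl", "fbat", "R", "dot") are assumptions about how the tools are installed on a given system, and the R qvalue package is not checked here. Perl modules are probed by asking perl to load them with its standard -M switch.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Module names taken from the list above.
PERL_MODULES = ["Tree::Nary", "GraphViz", "Getopt::Std",
                "Chart::Graph::Gnuplot", "Math::Random"]
# Assumed executable names; adjust to match the local installation.
REQUIRED_EXECUTABLES = ["perl", "fbat", "R"]
OPTIONAL_EXECUTABLES = ["dot"]  # GraphViz, used only for graph drawing


def perl_module_installed(module: str) -> bool:
    """Return True if `perl -MModule -e 1` exits with status 0."""
    try:
        result = subprocess.run(["perl", f"-M{module}", "-e", "1"],
                                capture_output=True)
    except OSError:  # perl itself is missing or cannot be executed
        return False
    return result.returncode == 0


def missing_items(items: Iterable[str],
                  is_present: Callable[[str], bool]) -> List[str]:
    """Return the items for which the presence check fails, in order."""
    return [item for item in items if not is_present(item)]


def on_path(name: str) -> bool:
    return shutil.which(name) is not None


def check_environment(workdir: str = ".") -> Dict[str, object]:
    """Collect missing requirements; Haploview.jar is looked up in workdir."""
    return {
        "missing_perl_modules": missing_items(PERL_MODULES, perl_module_installed),
        "missing_required": missing_items(REQUIRED_EXECUTABLES, on_path),
        "missing_optional": missing_items(OPTIONAL_EXECUTABLES, on_path),
        "haploview_jar_present": (Path(workdir) / "Haploview.jar").is_file(),
    }
```

  Calling check_environment() from the directory where HaploBuild will be run returns a dictionary of anything that could not be found. An empty list under each "missing" key suggests the corresponding requirements are in place.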
 
Download
  HaploBuild is available on request by email; please use the contact given at the bottom of this page to ask for a copy.
User Manual
  HaploBuild_user_manual.pdf
Example Files
  example_file.zip
Planned Updates
 
. Use the batch file option of FBAT to greatly increase the speed of HaploBuild
. Include a chromosome number in the marker information file to accommodate multiple analyses with one set of files
. Allow for case-control study designs by using the R package haplo.stats
Comments? Please contact Jason M Laramie