We are committed to releasing free open-source software to the research community in a timely manner.  Our current open-source software projects are listed below.

I. BioSymphony (BioSym)

See www.BioSymphony.org for more information.

Solving complex biomedical problems increasingly relies on collaboration among investigators with different skills and expertise.  The most useful collaborations often result from chance interactions among investigators from different departments at the same institution or from different institutions in the same region. The goal of the BioSymphony project is to develop and make freely available software for facilitating biomedical research collaborations. We are establishing a database of biomedical investigators at Dartmouth and throughout Northern New England, consisting of annotated information about each investigator mined from PubMed, that can be used to predict fruitful collaborations.  Our hope is that this resource will result in meaningful collaborations that otherwise might happen only by chance. The BioSymphony (BioSym) database and software are in early alpha testing at Dartmouth Medical School.  Check the BioSym web page, www.BioSymphony.org, or our blog (Epistasis Blog) for updates.

Development of BioSym is supported by generous funds from the Norris-Cotton Cancer Center.

II. Exploratory Visual Analysis (EVA)

See www.exploratoryvisualanalysis.org for more information.

EVA is a database and GUI for the exploratory visual analysis of statistical results (not raw data) from high-throughput genetic and genomic experiments.  How often have you been handed an Excel spreadsheet with >30,000 Affymetrix gene IDs and p-values from a statistical analysis and been left with the daunting challenge of extracting something biologically meaningful?  The EVA system allows you to store these results in a database alongside knowledge about each gene from public databases such as Entrez Gene.  The GUI allows you to visually explore the p-values in the context of Gene Ontology, biochemical pathway, protein domain, chromosomal location, or phenotype, thus facilitating biological interpretation. The first paper describing EVA was published in the 2005 proceedings of the Pacific Symposium on Biocomputing.  Example applications of EVA can be found in recent publications in Oncology Reports and in Diabetes.  The prototype EVA database was programmed in Oracle while the prototype EVA GUI was programmed in Visual Basic.  An open-source version of EVA in Java is under development and is available upon request.  Check this web page, www.exploratoryvisualanalysis.org, or our blog (Epistasis Blog) for updates.
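The idea of storing p-values next to gene annotation can be illustrated with a small sketch. This is not the EVA schema: it uses an in-memory SQLite database in place of Oracle, and the table names, column names, gene IDs, and annotation terms are all made up for illustration.

```python
import sqlite3

# Hypothetical two-table layout: one table of statistical results,
# one table linking genes to annotation terms (e.g. Gene Ontology).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (gene_id TEXT PRIMARY KEY, p_value REAL);
    CREATE TABLE annotation (gene_id TEXT, go_term TEXT);
""")

# Toy values only; these are not real genes or real results.
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [("gene_a", 0.001), ("gene_b", 0.03), ("gene_c", 0.40), ("gene_d", 0.75)],
)
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?)",
    [("gene_a", "term_x"), ("gene_b", "term_x"),
     ("gene_c", "term_y"), ("gene_d", "term_y"), ("gene_a", "term_y")],
)


def summarize_by_term(conn, alpha=0.05):
    """Per annotation term: gene count, count with p < alpha, smallest p-value."""
    return conn.execute(
        """
        SELECT a.go_term,
               COUNT(*)             AS n_genes,
               SUM(r.p_value < ?)   AS n_significant,
               MIN(r.p_value)       AS min_p
        FROM annotation AS a
        JOIN results AS r ON r.gene_id = a.gene_id
        GROUP BY a.go_term
        ORDER BY a.go_term
        """,
        (alpha,),
    ).fetchall()


for row in summarize_by_term(conn):
    print(row)
```

A summary like this is the kind of table a GUI could then display per annotation category, so that the p-values are read in biological context rather than gene by gene.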

Development of EVA is supported by generous funds from the Norris-Cotton Cancer Center.  The prototype for EVA was supported by NIH grant P20-LM007613.

III. Multifactor Dimensionality Reduction (MDR)

The open-source MDR software package can be freely downloaded from Sourceforge.Net.

See www.multifactordimensionalityreduction.org for more information.

MDR is a nonparametric and genetic model-free data mining alternative to logistic regression for detecting and characterizing nonlinear interactions among discrete genetic and environmental attributes.  The MDR method combines attribute selection, attribute construction, and classification with cross-validation and permutation testing to provide a comprehensive and powerful approach to detecting nonlinear interactions.  See our 2006 paper in the Journal of Theoretical Biology for a recent review.  See also the MDR entry in Wikipedia for a description of the basic method.  Additional MDR publications can be found by searching PubMed, Google, or Google Scholar for MDR.  See the publications page on this website for a comprehensive list of our MDR papers.  See our blog or www.multifactordimensionalityreduction.org for updates and news about the latest developments with MDR.
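The attribute construction step can be sketched in a few lines. This is a minimal illustration rather than the MDR package itself: it pools each multilocus genotype combination into a high-risk or low-risk group by comparing its case:control ratio to a threshold, and it leaves out attribute selection, cross-validation, and permutation testing. The function names, the default threshold of 1.0, the treatment of ties as low-risk, and the toy data are all assumptions.

```python
from collections import Counter


def mdr_pool(genotypes, labels, threshold=1.0):
    """Map each genotype combination to 1 (high-risk) or 0 (low-risk).

    genotypes: one tuple of genotype codes per subject
    labels:    1 for cases, 0 for controls
    """
    cases, controls = Counter(), Counter()
    for combo, y in zip(genotypes, labels):
        (cases if y == 1 else controls)[combo] += 1
    # Cross-multiplied form of cases/controls > threshold, which avoids
    # dividing by zero when a combination has no controls.
    return {
        combo: int(cases[combo] > threshold * controls[combo])
        for combo in set(cases) | set(controls)
    }


def accuracy(risk, genotypes, labels):
    """Training accuracy of the constructed one-dimensional attribute."""
    predictions = [risk.get(combo, 0) for combo in genotypes]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy data with a pure interaction: disease status depends on the
# combination of two SNPs but on neither SNP alone.
two_locus = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
status = [0, 0, 1, 1, 1, 1, 0, 0]
one_locus = [(a,) for a, _ in two_locus]

pair_risk = mdr_pool(two_locus, status)
single_risk = mdr_pool(one_locus, status)
print(accuracy(pair_risk, two_locus, status))     # two-locus model
print(accuracy(single_risk, one_locus, status))   # single-locus model
```

On this toy data the two-locus attribute separates cases from controls while the single-locus attribute does no better than chance, which is the kind of nonlinear interaction the method is designed to detect.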

Development of MDR is supported by NIH grants R01-AI59694, R01-HD047447, and R01-LM009012 as well as by generous funds from the Norris-Cotton Cancer Center.

IV. Symbolic Modeler (SyMod)

See www.symbolicmodeler.org for more information.

The SyMod software package will provide open-source access to two different methods.  The first method, Symbolic Discriminant Analysis (SDA), was developed by our team as a nonlinear alternative to Fisher's Linear Discriminant Analysis (LDA).  
The goal of SDA is to identify the optimal combination of attributes and mathematical functions for predicting a discrete endpoint.  Unlike LDA, SDA makes no assumptions about the functional form of the model.  Given a list of attributes (e.g. gene expression variables) and mathematical functions (e.g. +, -, *, /, log, sqrt, abs, AND, OR, <, >, etc.), SDA optimizes model discovery using any wrapper algorithm.  We have used genetic programming as a wrapper for SDA although other stochastic search methods such as simulated annealing could be used.  We have a new paper on SDA that will appear soon in a special issue of Human Heredity.  The second method that will be included in SyMod is symbolic regression.  Symbolic regression is similar to SDA but is used for continuous endpoints.  The alpha version of SyMod is ready for public testing. Check this web page, www.symbolicmodeler.org, or our blog (Epistasis Blog) for updates.
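To make the idea of combining attributes and mathematical functions concrete, here is a minimal sketch, not the SyMod implementation. A candidate model is an expression tree written as nested tuples, each sample is scored by evaluating the tree, and the score is compared to a cutoff to predict the discrete endpoint. The reduced function set, the fixed cutoff of 0, the attribute names, and the toy data are assumptions, and a short hand-written list of candidates stands in for the wrapper search (genetic programming in our work).

```python
import operator

# Reduced function set for illustration; the full method allows many more.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, sample):
    """Evaluate an expression tree on one sample (dict of attribute -> value)."""
    if isinstance(tree, str):            # leaf: attribute name
        return sample[tree]
    if isinstance(tree, (int, float)):   # leaf: numeric constant
        return tree
    op, left, right = tree               # internal node: (function, left, right)
    return FUNCTIONS[op](evaluate(left, sample), evaluate(right, sample))


def accuracy(tree, samples, labels, cutoff=0.0):
    """Fraction of samples classified correctly by thresholding the tree's score."""
    predictions = [1 if evaluate(tree, s) > cutoff else 0 for s in samples]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy expression data: class 1 whenever g1 is larger than g2.
samples = [
    {"g1": 2.0, "g2": 0.5},
    {"g1": 0.5, "g2": 2.0},
    {"g1": 3.0, "g2": 1.0},
    {"g1": 1.0, "g2": 3.0},
]
labels = [1, 0, 1, 0]

# A wrapper algorithm would generate and refine candidates like these.
candidates = [("+", "g1", "g2"), ("-", "g1", "g2"), ("*", "g1", "g2")]
best = max(candidates, key=lambda tree: accuracy(tree, samples, labels))
print(best, accuracy(best, samples, labels))
```

Swapping the classification accuracy for an error measure on a continuous endpoint would turn the same tree representation into a symbolic regression sketch.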

Development of SyMod is supported by NIH grant R01-AI59694 and by generous funds from the Norris-Cotton Cancer Center.

V. Weka-CG

The open-source Weka-CG software package is available for download.

Weka is an open-source data mining software package with a number of powerful machine learning methods such as decision trees, neural networks, and support vector machines.  A recent book about data mining with Weka is also available.  We are distributing our own version of Weka with integrated tools for computational genetics (CG).  The first new tool added to Weka-CG is our multifactor dimensionality reduction (MDR) method.  Here, MDR has been added to Weka-CG as a filter for constructive induction so that constructed attributes (i.e. SNP combinations) can be analyzed with any of the methods included in Weka (e.g. logistic regression).
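The filter idea can be sketched independently of Weka. The class below is hypothetical and does not use or mirror the Weka API: it learns the high-risk/low-risk mapping for chosen SNP columns from training data, then replaces those columns with a single constructed attribute that any downstream classifier could consume. Treating unseen genotype combinations as low-risk and using the overall case:control ratio as the threshold are assumptions.

```python
from collections import Counter


class MDRFilter:
    """Hypothetical filter that collapses selected SNP columns into one binary attribute."""

    def __init__(self, columns, default=0):
        self.columns = tuple(columns)  # indices of the SNP columns to combine
        self.default = default         # value for combinations unseen in training
        self.risk = {}

    def fit(self, rows, labels):
        cases, controls = Counter(), Counter()
        for row, y in zip(rows, labels):
            combo = tuple(row[c] for c in self.columns)
            (cases if y == 1 else controls)[combo] += 1
        n_cases, n_controls = sum(cases.values()), sum(controls.values())
        for combo in set(cases) | set(controls):
            # Cross-multiplied form of cases/controls > n_cases/n_controls.
            self.risk[combo] = int(cases[combo] * n_controls > controls[combo] * n_cases)
        return self

    def transform(self, rows):
        out = []
        for row in rows:
            combo = tuple(row[c] for c in self.columns)
            kept = [v for i, v in enumerate(row) if i not in self.columns]
            out.append(kept + [self.risk.get(combo, self.default)])
        return out


# Toy rows: [snp1, snp2, exposure]; labels are 1 for cases, 0 for controls.
train = [[0, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0],
         [0, 1, 1], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
train_labels = [0, 1, 1, 0, 1, 1, 0, 0]
new_rows = [[0, 1, 1], [1, 1, 0], [2, 2, 1]]

mdr = MDRFilter(columns=(0, 1)).fit(train, train_labels)
constructed = mdr.transform(new_rows)
print(constructed)  # each row is now [exposure, constructed MDR attribute]
```

Because the mapping is learned in `fit` and only applied in `transform`, the same filter can be fitted on a training split and applied to held-out data before a classifier such as logistic regression is run.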

Development of Weka-CG is supported by NIH grants R01-AI59694 and R01-LM009012 as well as by generous funds from the Norris-Cotton Cancer Center.

Last updated by JHM on March 2, 2008