Data Analysis Toolkit

Tools used for Head and Neck Cancer research at Baylor College of Medicine, with a goal of condensing time-consuming manual pipelines of data analysis methods into one click scanning processes.

Download Latest Release

How to Use

Download the latest open source release for your platform here, and run the program installer, application will automatically alert you to updates in the future.
Open the program, navigate to the "Analysis Pipeline" tab and select the folder containing your genetic data files, or drag and drop the folder into the program window. The program is intelligent enough to recursively scan all subfolders for genetic data files.
Assign group letters or numbers to the generated table, ensure only a total of two groups, select the upper group for calculations, a threshold for normalization, and click "Run Analysis".

Allow a few minutes for the pipeline to run. Once finished, you will be asked to save a compressed file containing the results in CSV and GCT formats, along with a report file on problematic gene edge cases encountered with appropriate context.

Brief Pipeline Summary

Input genetic result files from assorted folder and assign groups to obtain combined spreadsheet of statistical tests, adjusted for error.

Pipeline consists of recursively scanning all input files to support various genetic data folder schemas, asking user to label testing groups.Data is filtered for existing genes based on lookup list, and transformed into linear and log space data.

Further normalization is carried out using Benjamini - Hochberg Procedure. Fold changes and p values for significant genes via a user defined threshold are outputted.