Title:

Quantifying DNA-Protein Matches

Poster


Abstract

In molecular biology, determining the exact makeup of a sample is only part of the battle. What comes next is making sense of the thousands to millions of strings of A, T, C, and G, the nucleotides that make up an organism’s DNA. Protein alignment software profiles DNA by comparing a sequence against a library of known genetic structures. Currently, the most popular protein alignment software is BLAST. However, BLAST is slow: it takes an estimated 22 years to process the same number of reads that our sponsor’s product, PALADIN, processes in 31 hours. BLAST's tabular output is also the industry standard format, whereas PALADIN could only output in SAM format. For PALADIN to see more widespread use, it needs to conform to that standard. We added an option for PALADIN to output data in BLAST tabular format, without replacing or eliminating its existing SAM output. The project also covers scripts that compare PALADIN against its competitors in multiple scenarios, and we use statistical analysis to show how well PALADIN classifies different DNA samples.
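To give a rough sense of what mapping a SAM record onto BLAST-tabular-style columns involves, here is a minimal Python sketch. It is not PALADIN's implementation. The function name `sam_to_blast_tab` is hypothetical. The column order follows BLAST's default tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send). The e-value and bit score columns are left out because the mandatory SAM fields do not carry them. The sketch assumes the record has an `NM:i` edit-distance tag, and it ignores strand and translated-coordinate handling.

```python
import re

# One CIGAR operation: a run length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def sam_to_blast_tab(sam_line):
    """Map one SAM record to 10 BLAST-tabular-style columns (illustrative only).

    Returns None for unmapped records.
    """
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, rname = fields[0], int(fields[1]), fields[2]
    pos, cigar = int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":
        return None  # unmapped read: nothing to report

    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

    # Edit distance from the optional NM tag.
    # Assumption: if the tag is absent, we treat the edit distance as 0.
    nm = 0
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            nm = int(tag[5:])

    aligned = sum(n for n, op in ops if op in "M=X")
    inserted = sum(n for n, op in ops if op == "I")
    deleted = sum(n for n, op in ops if op == "D")
    skipped = sum(n for n, op in ops if op == "N")

    length = aligned + inserted + deleted        # alignment columns
    gapopen = sum(1 for _, op in ops if op in "ID")
    mismatch = max(nm - inserted - deleted, 0)   # NM counts gap bases too
    pident = 100.0 * (aligned - mismatch) / length

    # Leading soft/hard clips shift where the alignment starts on the query.
    clip = 0
    for n, op in ops:
        if op not in "SH":
            break
        clip += n
    qstart = clip + 1
    qend = clip + aligned + inserted
    sstart = pos
    send = pos + aligned + deleted + skipped - 1

    cols = [qname, rname, f"{pident:.2f}", length, mismatch, gapopen,
            qstart, qend, sstart, send]
    return "\t".join(str(c) for c in cols)


if __name__ == "__main__":
    record = "read1\t0\tprotA\t5\t60\t2S6M1I3M\t*\t0\t0\tACGTACGTACGT\t*\tNM:i:2"
    print(sam_to_blast_tab(record))
```

A real converter would also need to supply the e-value and bit score columns and account for the coordinate systems of translated reads, both of which this sketch deliberately leaves out.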

Authors

First Name Last Name
Mitchell Hersey
Sarah Hall
Mallorie Biron

File Count: 1





Submission Details

Conference: URC
Event: Interdisciplinary Science and Engineering (ISE)
Department: Computer Science (ISE)
Group: Systems
Added: April 22, 2020, 8:45 a.m.
Updated: April 22, 2020, 8:45 a.m.