Abstract
Conventional pairwise sequence alignment algorithms attempt to identify similar regions between two sequences. This project explores new global and local alignment techniques, and their applications, for identifying regions of contrast between pairs of sequences.
The global contrast alignment algorithm will measure how different two sequences are. This measure can also be used to check a conventional global alignment score. For example, a low contrast score supports a high conventional global alignment score, because sequences that really are similar should not contrast. On the other hand, a high contrast score paired with a high conventional score indicates that the sequence pair may not be as similar as the conventional algorithm suggests.
The local contrast alignment algorithm can be used to find strongly differing subsequences. One application that will be explored is comparing a disease-resistant sequence with a non-resistant one to pinpoint the elements of resistance.
Plan of Action – What is being implemented
1. Conventional global alignment algorithm.
2. Conventional local alignment algorithm.
3. Global contrast alignment algorithm.
4. Local contrast alignment algorithm.
5. Conventional global alignment with contrast alignment adjustment score.
6. Java GUI for running the algorithms on different data sets and displaying the results.
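As a starting point for item 1, the conventional global alignment score can be sketched with the standard Needleman-Wunsch dynamic program. This is a minimal sketch only: the class name GlobalAlignment and the scoring constants (match +1, mismatch -1, linear gap -2) are assumptions chosen for illustration, since the proposal does not fix a scoring scheme.

```java
class GlobalAlignment {
    // Assumed scoring values for illustration; the proposal does not specify them.
    static final int MATCH = 1;
    static final int MISMATCH = -1;
    static final int GAP = -2;

    /** Returns the optimal global alignment score of a and b (Needleman-Wunsch). */
    static int score(String a, String b) {
        int n = a.length();
        int m = b.length();
        int[][] dp = new int[n + 1][m + 1];

        // Aligning a prefix against the empty string costs one gap per character.
        for (int i = 1; i <= n; i++) dp[i][0] = i * GAP;
        for (int j = 1; j <= m; j++) dp[0][j] = j * GAP;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH;
                int diag = dp[i - 1][j - 1] + sub; // align a[i-1] with b[j-1]
                int up = dp[i - 1][j] + GAP;       // a[i-1] aligned to a gap
                int left = dp[i][j - 1] + GAP;     // b[j-1] aligned to a gap
                dp[i][j] = Math.max(diag, Math.max(up, left));
            }
        }
        return dp[n][m];
    }
}
```

With these assumed values, GlobalAlignment.score("ACGT", "AGT") evaluates to 1 (three matches and one gap). Recovering the alignment itself, rather than only its score, would require an additional traceback over the table.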
Plan of Action – What methods are being compared and where are they from?
1. Conventional global alignment will be compared to global contrast alignment.
2. Conventional local alignment will be compared to local contrast alignment.
3. Conventional global alignment will be compared to global alignment with contrast adjustment.
4. The conventional global and local alignment algorithms are taken from the lectures and from the book Algorithm Design, by Jon Kleinberg and Éva Tardos.
5. The contrast algorithms will be my own original work. They will mainly be the inverse of the conventional algorithms, with other adjustments to be determined later.
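One possible reading of "the inverse of the conventional algorithms" is to keep the local alignment recurrence and flip the substitution scores, so that mismatches are rewarded and matches are penalized. The sketch below illustrates that reading only; it is not the project's final method. The class name LocalContrast, the constants, and the decision to keep gaps penalized are all assumptions.

```java
class LocalContrast {
    // Hypothetical inverted scores: differences are rewarded, agreement is penalized.
    static final int MATCH = -1;
    static final int MISMATCH = 1;
    static final int GAP = -2; // assumption: gaps remain penalized

    /** Returns the best local contrast score using a Smith-Waterman style recurrence. */
    static int score(String a, String b) {
        int n = a.length();
        int m = b.length();
        int[][] dp = new int[n + 1][m + 1]; // row 0 and column 0 stay at 0
        int best = 0;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH;
                int cell = Math.max(dp[i - 1][j - 1] + sub,
                        Math.max(dp[i - 1][j] + GAP, dp[i][j - 1] + GAP));
                // Flooring at zero lets a contrasting region start anywhere.
                dp[i][j] = Math.max(0, cell);
                best = Math.max(best, dp[i][j]);
            }
        }
        return best;
    }
}
```

Under these assumptions, "AAAA" against "CCCC" scores 4 and "AAAA" against itself scores 0. However, "ACGT" against itself scores 3, because a diagonal shifted by one position pairs only differing characters. A plain inversion can therefore report contrast between identical sequences, which may be one of the cases the later adjustments need to handle.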
Plan of Action – Datasets
1. Data sets from homework 2.
2. Small home-brewed sequences to verify algorithm correctness.
3. Disease-resistant and non-resistant sequences from the same species.
Plan of Action – Experiments
Three main experiments will be run in this project.
The first experiment will show that conventional global and local similarity alignments are not the contrast alignments. This will dispel the idea that anyone who wants the contrasting data can simply look at the gaps and substitutions in the alignments produced by the conventional global and local algorithms.
The second experiment will show that conventional global alignment overstates sequence similarity for some sequences. Using the homework 2 data sets, I will attempt to show that the top 50 conventional global alignments do not all have the 50 lowest global contrast alignment scores. This would indicate that global alignment overstated similarity for those sequence pairs that are not among the 50 lowest contrast scores. A revised top 50 will then be output using a combination of the contrast scores and the similarity scores.
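The comparison in the second experiment could be organized roughly as follows. This is a sketch under stated assumptions: the class RankingComparison, its ScoredPair fields, and especially the combined score (similarity minus contrast) are hypothetical, since the proposal does not yet say how the two scores will be combined.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RankingComparison {
    /** One sequence pair with both scores; the field names are hypothetical. */
    static final class ScoredPair {
        final String id;
        final int similarity;
        final int contrast;

        ScoredPair(String id, int similarity, int contrast) {
            this.id = id;
            this.similarity = similarity;
            this.contrast = contrast;
        }
    }

    /** Ids of the k pairs that come first under the given ordering. */
    static Set<String> firstK(List<ScoredPair> pairs, Comparator<ScoredPair> order, int k) {
        List<ScoredPair> sorted = new ArrayList<>(pairs); // leave the input untouched
        sorted.sort(order);
        Set<String> ids = new LinkedHashSet<>();
        for (ScoredPair p : sorted.subList(0, Math.min(k, sorted.size()))) {
            ids.add(p.id);
        }
        return ids;
    }

    /** Pairs in the top k by similarity that are not among the k lowest contrast scores. */
    static Set<String> possiblyOverstated(List<ScoredPair> pairs, int k) {
        Set<String> topSimilarity = firstK(pairs,
                Comparator.comparingInt((ScoredPair p) -> p.similarity).reversed(), k);
        Set<String> lowestContrast = firstK(pairs,
                Comparator.comparingInt((ScoredPair p) -> p.contrast), k);
        Set<String> result = new LinkedHashSet<>(topSimilarity);
        result.removeAll(lowestContrast);
        return result;
    }

    /** Revised top k using an assumed adjusted score: similarity minus contrast. */
    static Set<String> revisedTopK(List<ScoredPair> pairs, int k) {
        return firstK(pairs,
                Comparator.comparingInt((ScoredPair p) -> p.similarity - p.contrast).reversed(), k);
    }
}
```

For the experiment as described, k would be 50. A non-empty result from possiblyOverstated would be the evidence the experiment is looking for, and revisedTopK would give the revised list under whatever combination rule is finally chosen.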
The third experiment will attempt to identify local contrast between disease-resistant and non-resistant sequences of the same species. The contrast will be analyzed to determine whether it has biological value.
Reading List
· S. B. Needleman, C. D. Wunsch, "A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins", Journal of Molecular Biology, Vol. 48, 1970, pp. 443-453.
· T.F. Smith, M.S. Waterman, "Identification of Common Molecular Subsequences", Journal of Molecular Biology, March 1981.
· S. Altschul et al., "Basic Local Alignment Search Tool", Journal of Molecular Biology, 1990, Vol. 215, No. 3, pp. 403-410.
· N. Bray, I. Dubchak, L. Pachter, "AVID: A Global Alignment Program", Genome Research, Vol. 13, No. 1, 2003, pp. 97-102.
· G. B. Martin, A. J. Bogdanove, G. Sessa, "Understanding the Functions of Plant Disease Resistance Proteins".
· J. Lu, H. Huang, Z. Ren, F. Bradeley, W. Hunter, "Towards Identification, Isolation and Characterization of Disease Resistant Genes from the Native North American Grape Species Vitis shuttleworthii".