Report ID
2004-12
Report Authors
S. Alireza Aghili, Divyakant Agrawal and Amr El Abbadi
Report Date
Abstract
A novel approach for similarity search on the protein structure databases is proposed. PADS (Protein Alignment by Directional shape Signatures) incorporates the three dimensional coordinates of the main atoms of each amino acid and extracts a geometrical shape signature along with the direction of the given amino acid. As a result, each protein is presented by a series of feature vectors representing local geometry, shape, direction, and secondary structure assignment of its amino acid constituents. Furthermore, a residue-to-residue distance matrix is calculated and is incorporated into a local alignment dynamic programming algorithm to find the similar portions of two given proteins and finally a sequence alignment step is used as the last filtration step. The optimal superimposition of the detected similar regions is used to assess the quality of the results. The proposed algorithm is fast and accurate and hence could be used for the analysis of large protein structure similarity. The method has been tested and compared with the results from CE, DALI, and CTSS using a representative sample of PDB structures. Several new structures not detected by other methods are detected.
Document
2004-12.pdf318.59 KB