Pasteur

Bioinformatics applications generally involve very large string databases. One example is GenBank, whose size has been shown to double every 15 months. At the same time, existing algorithms for string homology search and approximate database range queries are computationally expensive and do not scale. Over the last decade, extensive research has been conducted in this emerging field and a large number of heuristics have been proposed. However, the main obstacles are inherent in the nature of the problem and still remain. A further motivation is the encoded structure of biological sequences, which makes them an especially interesting subject of study. In this project, we investigate database techniques that fit the requirements of problems in bioinformatics, particularly approximate string search and range queries.
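
To make the cost concrete, here is a minimal sketch of an approximate range query answered by a plain linear scan. It assumes edit (Levenshtein) distance as the similarity measure. The function names and the toy sequences are illustrative only and are not taken from the project's own code or from GenBank.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic program, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def range_query(database, query, k):
    """Return every sequence within edit distance k of the query (linear scan)."""
    return [s for s in database if edit_distance(s, query) <= k]


# Toy data for illustration; real sequence databases are vastly larger.
sequences = ["ACGTAC", "ACGTTC", "TTTTTT", "ACGAC"]
print(range_query(sequences, "ACGTAC", 1))
```

Every query runs a quadratic-time comparison against every sequence in the database, which is why this baseline does not scale. Filtration techniques such as those in the publications below aim to discard most candidates cheaply before an exact comparison of this kind is run.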

* This project is based upon work supported by the National Science Foundation under Grant Nos. IIS 0209112 and IIS 0223022.

People Involved:

  • Alireza Aghili (Ph.D. Candidate)

Selected Publications:

  • Efficient Filtration of Sequence Homology Search Through Singular Value Decomposition, S.A. Aghili, O.D. Sahin, D. Agrawal, A. El Abbadi. Submitted for publication...

  • BFT: Bit Filtration Technique for Approximate String Join in Biological Databases, S.A. Aghili, D. Agrawal, A. El Abbadi. Proceedings of the 10th International Conference on String Processing and Information Retrieval (SPIRE), October 2003, Manaus, Brazil.

  • Filtration of String Proximity Search via Transformation, Alireza Aghili, Divyakant Agrawal, Amr El Abbadi. Proceedings of the IEEE Conference on Bioinformatics and Biomedical Engineering (BIBE), 149-157, March 2003, Baltimore, Bethesda. (Extended Version)

Copyright(c) 2002 DSL. All rights reserved.
dsl@cs.ucsb.edu