Projects/Research

Below are some projects I've worked on in class and through research. I've posted the source code for some of them; I ask only that you let me know about any improvements or bug fixes you make.
  • Current Projects

    • My current research focuses on developing novel methods for querying and mining spatial and spatio-temporal data, specifically problems in which data uncertainty is prevalent.
  • Towards Community Discovery in Signed Collaborative Interaction Networks

    • Collaboration is central to today's online world. The popularity of diverse collaboration-centered applications has led to unique and productive ecosystems composed of individuals who collectively annotate maps, organize photos, accumulate and author encyclopedic knowledge, and even build software systems. Understanding the community structure of such systems is a necessary step toward reducing administrative overhead and redirecting contributor effort to new content generation, as well as toward assessing the quality and objectivity of the final product. Analyzing how collaborators interact can also yield insights into the formation and architecture of ad hoc online societies that lack any explicit hierarchical structure and are held together by content generation as their ultimate goal.

      We propose a framework for discovery of collaborative community structure in Wiki-based knowledge repositories based on raw-content generation analysis. We leverage topic modelling in order to capture agreement and opposition of contributors and analyze these multi-modal relations to map communities in the contributor base. The key steps of our approach include (i) modeling of pairwise variable-strength contributor interactions that can be both positive and negative, (ii) synthesis of a global network incorporating all pairwise interactions, and (iii) detection and analysis of community structure encoded in such networks.

      The global community discovery algorithm we propose outperforms existing alternatives in identifying coherent clusters according to objective optimality criteria. Analysis of the discovered community structure reveals coalitions of common-interest editors who back each other in promoting some topics and collectively oppose other coalitions or single authors. We couple contributor interactions with content evolution and reveal the global picture of opposing themes within the self-regulated community base for both controversial and featured articles in Wikipedia.
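
      Steps (ii) and (iii) above can be illustrated with a minimal sketch. It is not the algorithm from this project: it assumes the signed pairwise strengths from step (i) are already given, and it swaps in a generic greedy local search that tries to keep positive weight inside communities and negative weight between them. The function names and the sample contributors are hypothetical.

```python
from collections import defaultdict

def build_network(interactions):
    """Step (ii): sum signed pairwise strengths into one weight per contributor pair."""
    weights = defaultdict(float)
    for a, b, strength in interactions:
        if a != b:  # ignore self-interactions
            weights[(a, b) if a < b else (b, a)] += strength
    return weights

def detect_communities(weights):
    """Step (iii), illustrative only: greedy local moves that increase the
    total signed weight falling inside communities."""
    nodes = sorted({n for pair in weights for n in pair})
    neighbors = defaultdict(dict)
    for (a, b), w in weights.items():
        neighbors[a][b] = w
        neighbors[b][a] = w
    labels = {n: i for i, n in enumerate(nodes)}  # start with singletons
    improved = True
    while improved:
        improved = False
        for v in nodes:
            # Net signed weight pulling v toward each neighboring community.
            pull = defaultdict(float)
            for u, w in neighbors[v].items():
                pull[labels[u]] += w
            current = pull.get(labels[v], 0.0)
            best_label, best_pull = max(
                pull.items(), key=lambda kv: kv[1], default=(labels[v], 0.0)
            )
            if best_pull > max(current, 0.0) + 1e-12:
                labels[v] = best_label  # join a friendlier community
                improved = True
            elif current < -1e-12:
                labels[v] = max(labels.values()) + 1  # leave a hostile one
                improved = True
    return labels

# Hypothetical contributors: positive = agreement, negative = opposition.
interactions = [
    ("alice", "bob", 2.0), ("bob", "carol", 1.0), ("alice", "carol", 1.0),
    ("dave", "erin", 3.0), ("alice", "dave", -2.0), ("carol", "erin", -1.0),
]
labels = detect_communities(build_network(interactions))
```

      Every accepted move strictly increases the within-community weight, so the loop terminates, but it only reaches a local optimum and makes no claim to match the quality of the global algorithm described above.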

  • Summarizing Probabilistic Data

    • Many real-world applications produce data with uncertainties, so it is important to provide scalable methods capable of properly managing uncertain data. In this paper, we address the problem of building a space-constrained synopsis over a probabilistic dataset whose tuples are defined over a continuous domain. The primary goal of this work is to aid exploratory tasks by providing quick approximate query results and statistical analysis with error bounds. Our approach differs from other summarization techniques in that we retain the shape of the probability distribution for each tuple. This makes the synopsis versatile: given the proper query execution engine, it can be used to approximately answer any query over the uncertain dataset, the limiting factor being the error bounds we are capable of providing. We use minimax polynomials, that is, polynomials which minimize the maximum (L∞) error, to approximate the probability distribution of each tuple, and we introduce efficient methods that reduce space further while still bounding the error of the synopsis.
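
      The idea of replacing a tuple's distribution with a few polynomial coefficients can be sketched as follows. This is an illustration, not the method from the paper: computing a true minimax polynomial needs an exchange procedure such as the Remez algorithm, so the sketch swaps in Chebyshev interpolation, which is typically close to minimax for smooth functions. The Gaussian density, the interval, and the degree are assumptions chosen for the example.

```python
import math

def chebyshev_fit(f, lo, hi, degree):
    """Coefficients of the Chebyshev interpolant of f on [lo, hi]."""
    n = degree + 1
    thetas = [math.pi * (k + 0.5) / n for k in range(n)]
    # Sample f at the Chebyshev nodes, mapped from [-1, 1] to [lo, hi].
    values = [f(0.5 * (lo + hi) + 0.5 * (hi - lo) * math.cos(t)) for t in thetas]
    return [
        2.0 / n * sum(v * math.cos(j * t) for v, t in zip(values, thetas))
        for j in range(n)
    ]

def chebyshev_eval(coeffs, lo, hi, x):
    """Evaluate the interpolant at x with the Clenshaw recurrence."""
    t = (2.0 * x - lo - hi) / (hi - lo)
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + 0.5 * coeffs[0]

def max_abs_error(f, coeffs, lo, hi, samples=2001):
    """Largest absolute error observed on a uniform grid (an empirical estimate)."""
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - chebyshev_eval(coeffs, lo, hi, x)) for x in xs)

def gaussian(x):
    """Standard normal density, standing in for one tuple's distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

coeffs = chebyshev_fit(gaussian, -4.0, 4.0, degree=20)
error = max_abs_error(gaussian, coeffs, -4.0, 4.0)
print(len(coeffs), error)
```

      If the density is approximated to within ε everywhere on the interval, the probability of any range [c, d] inside it is off by at most ε·(d − c), which is the kind of error bound a synopsis built this way could report.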
  • Opsin protein homology

    • This is a project I worked on to determine whether type-I and type-II opsins are homologous (that is, whether they share a common evolutionary origin). Here is the software we wrote to compare the proteins.
  • 3D 4x4x4 Tic-Tac-Toe

    • A minimax implementation for a 3D version of tic-tac-toe on a 4x4x4 board.
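
      For readers unfamiliar with the technique, here is a minimal sketch of minimax on a 4x4x4 board. It is not the posted source: searching the full game tree is impractical at this size, so the sketch assumes a depth limit, alpha-beta pruning, and a simple line-counting heuristic. All names are hypothetical.

```python
from itertools import product

N = 4
WIN = 10 ** 6
# 13 directions: one from each pair of opposite nonzero direction vectors.
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]

def winning_lines():
    """All 76 straight lines of four cells, as tuples of flat board indices."""
    lines = []
    for x, y, z in product(range(N), repeat=3):
        for dx, dy, dz in DIRECTIONS:
            end = (x + (N - 1) * dx, y + (N - 1) * dy, z + (N - 1) * dz)
            if all(0 <= c < N for c in end):
                lines.append(tuple(
                    (x + i * dx) + N * (y + i * dy) + N * N * (z + i * dz)
                    for i in range(N)
                ))
    return lines

LINES = winning_lines()

def evaluate(board):
    """Score from X's point of view: +/-WIN for a completed line, else a heuristic."""
    score = 0
    for line in LINES:
        xs = sum(board[i] == 'X' for i in line)
        os_ = sum(board[i] == 'O' for i in line)
        if xs == N:
            return WIN
        if os_ == N:
            return -WIN
        if os_ == 0:
            score += 10 ** xs   # line still open for X
        if xs == 0:
            score -= 10 ** os_  # line still open for O
    return score

def minimax(board, depth, maximizing, alpha=float('-inf'), beta=float('inf')):
    """Depth-limited minimax with alpha-beta pruning; returns (score, move)."""
    score = evaluate(board)
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    if abs(score) >= WIN or depth == 0 or not moves:
        return score, None
    best = float('-inf') if maximizing else float('inf')
    best_move = None
    for m in moves:
        board[m] = 'X' if maximizing else 'O'
        value, _ = minimax(board, depth - 1, not maximizing, alpha, beta)
        board[m] = ' '  # undo the trial move
        if (value > best) if maximizing else (value < best):
            best, best_move = value, m
        if maximizing:
            alpha = max(alpha, best)
        else:
            beta = min(beta, best)
        if alpha >= beta:
            break  # the opponent would never allow this branch
    return best, best_move

board = [' '] * N ** 3
for i in (0, 1, 2):
    board[i] = 'X'
for i in (16, 17, 20):
    board[i] = 'O'
score, move = minimax(board, depth=2, maximizing=True)
print(move)  # 3, which completes the row 0-1-2-3
```

      The depth limit trades playing strength for speed; a deeper search or a better heuristic would be natural next steps.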
  • Spades AI

    • A (somewhat) intelligent Spades agent we made for CS265. It uses a fairly simple rule base to decide which card to play. Feel free to give it a whirl, or to download the source.