;;;; UC Santa Barbara ArchLab ;;;; String Match Engine RuleCompiler - Release 0.8 ;;;; ;;;; Copyright (c) 2005 The Regents of the University of California. ;;;; All Rights Reserved. ;;;; ;;;; Permission to use, copy, modify, and distribute this software and its ;;;; documentation for educational, research and non-profit purposes, ;;;; without fee, and without a written agreement is hereby granted, ;;;; provided that the above copyright notice, this paragraph and the ;;;; following three paragraphs appear in all copies. ;;;; ;;;; Permission to incorporate this software into commercial products may ;;;; be obtained by contacting the University of California. For ;;;; information about obtaining such a license contact: ;;;; Tim Sherwood ;;;; ;;;; This software program and documentation are copyrighted by The Regents ;;;; of the University of California. The software program and ;;;; documentation are supplied "as is", without any accompanying services ;;;; from The Regents. The Regents does not warrant that the operation of ;;;; the program will be uninterrupted or error-free. The end-user ;;;; understands that the program was developed for research purposes and ;;;; is advised not to rely exclusively on the program for any reason. ;;;; ;;;; IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY ;;;; FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, ;;;; INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ;;;; ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ;;;; ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THE UNIVERSITY OF ;;;; CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT ;;;; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ;;;; A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" ;;;; BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE ;;;; MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. The Bit-Split Rule Compiler -------------------------------------------------------------------------------- Overview -------------------------------------------------------------------------------- The Bit-Split Rule Compiler is based on the architecture and algorithms presented in "Bit-Split String Matching Engines for Intrusion Detection and Prevention" by Lin Tan, Brett Brotherton, and Timothy Sherwood. The compiler interprets a set of keywords and converts them into a number of small binary state machines. The binary state machines are then output in a tabular format that can be mapped directly to the memory modules described in the paper. The method of keyword decomposition involves many intermediate steps, utilizing the Aho-Corasick algorithm to construct a graph of the characters and the corresponding accepting states. The resulting DFA is decomposed further to compute the final binary state machine graphs. -------------------------------------------------------------------------------- Getting Started -------------------------------------------------------------------------------- While the source code provided should work with any ANSI Common Lisp compatible compiler/interpreter, for the sake of simplicity, it is assumed the user is using clisp. The compiler can accept rules in a number of representations. Examples located at the bottom of the source code demonstrate multiple methods for inputting keywords. To run the compiler over a set of keywords located in a file, the user should redirect input from a file as follows: % clisp sme_RuleCompiler.lisp < keywords.in Where "keywords.in" is the name of an ASCII text file. Each line of the file represents a keyword to be matched. Every character on the line must be preceded with a colon (":"). If it is necessary to represent an 8-bit byte sequence that is not specified by a printable character, a colon can be followed by two hexadecimal digits in place of a single character.