The STAT project provides a framework that supports the development and the control of scenario-based sensors. By using STAT, it is possible to create a set of sensors that will operate in different domains and environments, e.g., network-based sensors, host-based sensors, and application-based sensors. In addition, STAT provides an infrastructure to communicate securely with the deployed sensors so that it is possible to collect the results of scenario-based analysis and control sensors' configuration from a central location. Centralized management systems can be composed hierarchically to achieve scalability and extended control over large sets of sensors.
The STAT project is centered around the following concepts:
The State Transition Analysis Technique is a method to describe computer penetrations as attack scenarios. Attack scenarios are represented as a sequence of transitions that characterize the evolution of the security state of a system. This characterization of attack scenarios allows for an intuitive graphic representation by means of state transition diagrams (see Figure 1).
In an attack scenario, states represent snapshots of a system's security-relevant properties and resources. A description of an attack has an "initial" starting state and at least one "compromised" ending state. States are characterized by means of assertions, which are predicates on some aspects of the security state of the system. For example, in an attack scenario describing an attempt to violate the security of an operating system, assertions would state properties such as file ownership, user identification, or user authorization. Transitions between states are annotated with signature actions: the key actions that, if omitted from the execution of an attack scenario, would prevent the attack from completing successfully. For example, in an attack scenario describing a network port scanning attempt, a typical signature action would include the TCP segments used to test the TCP ports of a host.
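As a rough illustration of the concepts above, the following Python sketch models states with assertions and transitions labeled with signature actions. It is not STATL and not part of the STAT tools; the class names, the dictionary used as the "security state", and the toy scenario (`create_link`, `execute`, `euid`, `uid`) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the security state of a system: a plain dict.
SecurityState = Dict[str, object]


@dataclass
class State:
    name: str
    assertion: Callable[[SecurityState], bool]  # predicate on the security state


@dataclass
class Transition:
    source: str
    target: str
    signature_action: str  # key action that must occur for the attack to proceed


@dataclass
class Scenario:
    states: Dict[str, State]
    transitions: List[Transition]
    initial: str
    compromised: List[str]

    def step(self, current: str, action: str, system: SecurityState) -> str:
        """Return the next state name, or the current one if no transition fires."""
        for t in self.transitions:
            if t.source == current and t.signature_action == action:
                if self.states[t.target].assertion(system):
                    return t.target
        return current


def run(scenario: Scenario, observations: List[Tuple[str, SecurityState]]) -> str:
    """Feed (action, security state) pairs to the scenario and return the final state."""
    current = scenario.initial
    for action, system in observations:
        current = scenario.step(current, action, system)
    return current


# Toy scenario (invented for illustration): a non-root user creates a link and
# then executes something that ends up running with effective uid 0.
toy = Scenario(
    states={
        "initial": State("initial", lambda s: True),
        "link_created": State("link_created", lambda s: s.get("link_owner") != "root"),
        "compromised": State(
            "compromised", lambda s: s.get("euid") == 0 and s.get("uid") != 0
        ),
    },
    transitions=[
        Transition("initial", "link_created", "create_link"),
        Transition("link_created", "compromised", "execute"),
    ],
    initial="initial",
    compromised=["compromised"],
)

final_state = run(
    toy,
    [
        ("create_link", {"link_owner": "alice"}),
        ("execute", {"euid": 0, "uid": 1001}),
    ],
)
```

If the `create_link` action is left out of the observations, the scenario never leaves its initial state, which mirrors the idea that a signature action is one whose omission prevents the attack from completing.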
The State Transition Analysis Technique was initially used as the basis for a host-based intrusion detection system called USTAT. The technique was later extended to network traffic analysis, and a network-based intrusion detection system called NetSTAT was developed. In 1998 and 1999, NetSTAT and USTAT were evaluated as part of both the MIT Lincoln Laboratory's off-line intrusion detection system evaluation and the Air Force Research Laboratory (AFRL) real-time evaluation. In the first case, USTAT and NetSTAT were used to analyze BSM logs and network traffic dumps covering several weeks of activity, looking for attack signatures. In the second case, NetSTAT and USTAT were installed on a testbed network at AFRL. In both efforts the STAT-based tools performed very well, and their combined results scored at the highest level in the evaluation.
Participating in these evaluations gave strong positive feedback on the research that had been performed so far, and it also gave new insights into the STAT approach. In particular, running NetSTAT and USTAT at the same time revealed a number of similarities in the way attack scenarios were represented and in the runtime architecture of the systems. A closer analysis of the mechanisms used by the STAT-based tools to match attack scenarios against a stream of events suggested that the STAT-based toolset could be redesigned as a family of systems.
The resulting design is based on a language, called STATL, and the corresponding runtime, called the "core", that embody the domain-independent characteristics of the STAT approach. The language and the runtime provide support for the representation of the domain-independent parts of attack scenarios and implement the domain-independent mechanisms used at runtime to match attack scenarios against a stream of events. STATL and the core are not sufficient on their own, because intrusion detection is performed in particular domains (e.g., hosts or networks) and in specific environments (e.g., Windows NT or Solaris). Therefore, the STAT framework provides a well-defined way to extend both the language and the runtime and obtain a complete intrusion detection system tailored to the characteristics of a specific domain and environment.
STATL is an extensible language that is used to represent STAT attack scenarios. The language defines the domain-independent features of the STAT technique, and it can be extended to express the characteristics of a particular domain and environment. The extension process includes the definition of the set of events that are specific to the particular domain or environment being addressed and the definition of new predicates on those events.
For example, to extend STATL to deal with events produced by the Apache Web Server, one would define one or more events that represent entries in the application logs. In this case an event would have the fields host, ident, authuser, date, request, status, and bytes, as defined by Apache's Common Log Format (CLF). After new events have been defined, it may be necessary to define predicates on those events. For example, the predicate isCGIrequest() would return true if an event is a request for a CGI script. Event and predicate definitions are grouped in a language extension. Once the event set and associated predicates for a language extension are defined, it is possible to use them in a STATL scenario description by including them with the STATL use keyword. Extensions for TCP/IP networks, Sun BSM audit records, and Windows NT event logs have been developed.
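To make the Apache example more concrete, here is a minimal Python sketch of an event type with the CLF fields and a predicate on it. Real language extensions are written for STATL and compiled to C++, so this is only an analogy. The `ApacheEvent` class, the regular expression, and in particular the rule used by `is_cgi_request` (path under `/cgi-bin/` or ending in `.cgi`) are assumptions for illustration, not the actual definition of isCGIrequest().

```python
import re
from dataclasses import dataclass
from typing import Optional

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)


@dataclass
class ApacheEvent:
    host: str
    ident: str
    authuser: str
    date: str
    request: str
    status: int
    bytes_sent: Optional[int]  # the CLF "bytes" field; None when logged as "-"


def parse_clf(line: str) -> Optional[ApacheEvent]:
    """Turn one CLF log line into an event, or return None if it does not parse."""
    m = CLF_PATTERN.match(line.strip())
    if m is None:
        return None
    raw_bytes = m.group("bytes")
    return ApacheEvent(
        host=m.group("host"),
        ident=m.group("ident"),
        authuser=m.group("authuser"),
        date=m.group("date"),
        request=m.group("request"),
        status=int(m.group("status")),
        bytes_sent=None if raw_bytes == "-" else int(raw_bytes),
    )


def is_cgi_request(event: ApacheEvent) -> bool:
    """Assumed heuristic: the requested path is under /cgi-bin/ or ends in .cgi."""
    parts = event.request.split()
    if len(parts) < 2:
        return False
    path = parts[1].split("?", 1)[0]  # drop the query string
    return path.startswith("/cgi-bin/") or path.endswith(".cgi")


cgi_event = parse_clf(
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /cgi-bin/test.cgi?x=1 HTTP/1.0" 200 2326'
)
static_event = parse_clf(
    '127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "GET /index.html HTTP/1.0" 200 -'
)
```

A scenario would then refer to the event type and the predicate by name, in the same way that a STATL scenario refers to the definitions it includes from a language extension.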
STATL scenarios are matched against a stream of events by the STAT core. For a scenario to be processed by the STAT core, it must be compiled into a Scenario Plugin, which is a shared library (e.g., a .so library in UNIX or a DLL in Windows). In addition, each language extension used by the scenario must be compiled into a Language Extension Module, which is also a shared library. Both STATL scenarios and language extensions are translated into C++ code and compiled into libraries by the STAT development tools.
The STAT Core represents the runtime of the STATL language. The STAT Core implements the domain-independent characteristics of STATL, such as the concepts of state, transition, timer, and event matching. At run-time the STAT Core performs the actual intrusion detection analysis process by matching an incoming stream of events against a number of Scenario Plugins. A running instance of the STAT Core is dynamically extended to build a STAT-based application.
STAT-based applications perform STAT analysis of one or more event streams (operating system audit records, network traffic, application logs, system calls, etc.).
The architecture of a STAT-based application is centered around the STAT Core (see Figure 2). The STAT Core is extended with a number of modules that, together, determine the application's capabilities and behavior. The configuration of a STAT-based application can be changed at run-time through control directives sent to the STAT Core. A set of initial modules can be (and usually is) defined at startup time to determine the initial configuration of an application. In the following, an incremental configuration of a STAT-based application will be described to better illustrate the role of each sensor module, provide a hint of the high configurability of applications, and describe the dependencies between the different modules.
When an application is started with no modules, it contains only an instance of the STAT Core waiting for events or control messages to be processed. This initial "bare" configuration, which is presented in Figure 2 (a), does not provide any functionality.
The first step is to provide a source of events. To do this, an Event Provider module must be loaded into the STAT Core. An Event Provider collects events from the external environment (e.g., by parsing the Apache server logs, or by obtaining packets from the network driver), creates events as defined in one or more Language Extension Modules (e.g., the Apache Language Extension Module), encapsulates these events into generic STAT events, and inserts these events into the input queue of the STAT Core. Event Providers can be dynamically added to and removed from a STAT Core, and more than one Event Provider can be active at one time. For example, both an Event Provider for Apache events and a Solaris BSM audit record provider may feed their event streams to the same STAT Core.
An Event Provider is implemented as a shared library. It is activated by sending control directives to the STAT Core to load and then activate the component. An Event Provider relies on the event definitions contained in one or more Language Extension Modules, which must be available at the application's host. Once both the Event Provider and the Language Extension Modules are loaded in the STAT Core, an Event Provider instance is activated and a dedicated thread of execution is started to run it. The Event Provider collects events from the external source, filters out those that are not of interest, transforms the remaining events into event objects (as defined by the Language Extension Module), encapsulates them into generic STAT events, and then inserts them into the STAT Core input queue. The STAT Core, in turn, consumes the events and checks whether any STAT scenarios are interested in the specific event types. At this point, no Scenario Plugins have been loaded into the core, and therefore no actual processing is carried out. This configuration is described in Figure 2 (b).
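The Event Provider pipeline described above (collect, filter, transform, encapsulate, enqueue, all on a dedicated thread) can be sketched in Python as follows. Actual Event Providers are shared libraries loaded by the STAT Core; the `StatEvent` wrapper, the `EventProvider` class, and the `collect` helper are hypothetical names introduced only for this sketch.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class StatEvent:
    """Generic wrapper; `kind` names the extension-defined event type (assumed)."""
    kind: str
    payload: object


class EventProvider(threading.Thread):
    """Reads raw items from a source and feeds wrapped events to the core queue."""

    def __init__(
        self,
        source: Iterable[str],
        parse: Callable[[str], Optional[object]],
        kind: str,
        core_queue: "queue.Queue[StatEvent]",
    ) -> None:
        super().__init__(daemon=True)
        self.source = source
        self.parse = parse
        self.kind = kind
        self.core_queue = core_queue

    def run(self) -> None:
        for raw in self.source:
            event = self.parse(raw)  # transform into an extension-defined event
            if event is None:        # filter out items that are not of interest
                continue
            self.core_queue.put(StatEvent(self.kind, event))  # encapsulate, enqueue


def collect(lines: List[str]) -> List[StatEvent]:
    """Run one provider to completion and drain the queue (stand-in for the core)."""
    core_queue: "queue.Queue[StatEvent]" = queue.Queue()
    provider = EventProvider(
        source=lines,
        parse=lambda s: s.upper() if s else None,  # toy parser: drop empty lines
        kind="demo",
        core_queue=core_queue,
    )
    provider.start()
    provider.join()
    drained = []
    while not core_queue.empty():
        drained.append(core_queue.get())
    return drained


events = collect(["first", "", "second"])
```

Several providers could share the same queue, which corresponds to the case where an Apache provider and a BSM provider feed one core.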
To start doing something useful, it is necessary to load one or more Scenario Plugins into the STAT Core. To do this, first a Scenario Plugin, in the form of a shared library, must be installed at the sensor's host. A Scenario Plugin may need the functions of one or more Language Extension Modules. These must be made available at the destination host. Then, the Scenario Plugin is loaded into the STAT Core, specifying a set of initial parameters. When a Scenario Plugin is loaded into the STAT Core an initial prototype for the scenario is created. The scenario prototype contains the data structures representing the scenario's definition in terms of states and transitions, a global environment, and a set of activation parameters. The prototype creates a first instance of the scenario. This instance is in the initial state of the corresponding attack scenario. The STAT Core analyzes the scenario definition and subscribes the instance for the events associated with the transitions that start from the scenario's initial state.
At this point the STAT Core is ready to perform event processing. The events obtained by the Event Provider are matched against the subscriptions of the initial instance. If an event matches a subscription, then the corresponding transition assertion is evaluated. If the assertion is satisfied then the destination state assertion is evaluated. If this assertion is also satisfied then the transition is fired. As a consequence of transition firing the instance may change state or a new instance may be created. Each scenario instance represents an attack in progress. This situation is presented in Figure 2 (c), where a Scenario Plugin has been loaded and there are currently four active instances of the scenario.
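The matching steps described above (subscription match, transition assertion, destination state assertion, firing) can be sketched as a small Python loop. This is a simplified model, not the STAT Core: the `consuming` flag that decides whether an instance moves or a new instance is created, the `action` hook that updates an instance's environment, and the rule that at most one transition fires per instance per event are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]
Env = Dict[str, object]


def _no_action(event: Event, env: Env) -> None:
    """Default transition action: leave the instance environment unchanged."""


@dataclass
class Transition:
    source: str
    target: str
    event_type: str                            # event the instance subscribes to
    assertion: Callable[[Event, Env], bool]    # transition assertion
    consuming: bool = True                     # assumed: False creates a new instance
    action: Callable[[Event, Env], None] = _no_action


@dataclass
class Instance:
    state: str
    env: Env = field(default_factory=dict)


class ToyCore:
    def __init__(
        self,
        transitions: List[Transition],
        state_assertions: Dict[str, Callable[[Event, Env], bool]],
        initial: str,
    ) -> None:
        self.transitions = transitions
        self.state_assertions = state_assertions
        self.instances = [Instance(initial)]  # first instance, in the initial state

    def process(self, event: Event) -> None:
        # Iterate over a snapshot so instances created now do not see this event.
        for inst in list(self.instances):
            for t in self.transitions:
                if t.source != inst.state or t.event_type != event["type"]:
                    continue  # event does not match this instance's subscriptions
                if not t.assertion(event, inst.env):
                    continue  # transition assertion not satisfied
                if not self.state_assertions[t.target](event, inst.env):
                    continue  # destination state assertion not satisfied
                if t.consuming:
                    fired = inst
                else:
                    fired = Instance(inst.state, dict(inst.env))
                    self.instances.append(fired)
                t.action(event, fired.env)
                fired.state = t.target
                break  # simplification: one firing per instance per event


def remember_source(event: Event, env: Env) -> None:
    env["src"] = event["src"]


def same_source(event: Event, env: Env) -> bool:
    return event["src"] == env.get("src")


core = ToyCore(
    transitions=[
        Transition("s0", "s1", "syn", lambda e, env: True,
                   consuming=False, action=remember_source),
        Transition("s1", "done", "ack", same_source),
    ],
    state_assertions={"s1": lambda e, env: True, "done": lambda e, env: True},
    initial="s0",
)
for ev in [{"type": "syn", "src": "A"},
           {"type": "syn", "src": "B"},
           {"type": "ack", "src": "A"}]:
    core.process(ev)

states = [(i.state, i.env.get("src")) for i in core.instances]
```

After the three events, the initial instance is still waiting in `s0`, the instance tracking source `A` has completed, and the instance tracking source `B` is still in progress, so each instance corresponds to one attack in progress.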
As a scenario evolves from state to state, it may produce some output. A typical case is the generation of an alert when a scenario completes. Another example is the creation of a synthetic event. A synthetic event is a STAT event that is generated by a Scenario Plugin and inserted in the STAT Core event queue. The event is processed like any other event and may be used to perform forward chaining of scenarios.
Apart from logging (the default action when a scenario completes) and the production of synthetic events (which are specified within the scenario definition), other types of responses can be associated with scenario states using Response Modules. Response Modules are collections of functions that can be used to perform any type of response (e.g., page the administrator, reconfigure a firewall, or shut down a connection). Response Modules are implemented as shared libraries. To activate a Response Function it is necessary to make the shared library containing the desired Response Module functionality available at the application's host, load the library into the STAT Core, and then request the association of a Response Function with a specific state in a scenario definition. This allows one to specify responses for any intermediate state in a scenario. Each time the specified state is reached by any of the instances of the scenario, the corresponding Response Function is executed. Figure 2 (d) shows a Response Module and some Response Functions associated with particular states in the scenario definition.
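The association between scenario states and Response Functions can be pictured as a lookup table keyed by scenario and state. The Python sketch below is only an illustration of that idea; `ResponseTable`, `associate`, `state_reached`, and the `page_administrator` function (which merely records a message) are hypothetical names, not part of the STAT interfaces.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Env = Dict[str, object]
ResponseFunction = Callable[[str, str, Env], None]


class ResponseTable:
    """Maps (scenario, state) pairs to the response functions to run there."""

    def __init__(self) -> None:
        self._table: DefaultDict[Tuple[str, str], List[ResponseFunction]] = (
            defaultdict(list)
        )

    def associate(self, scenario: str, state: str, fn: ResponseFunction) -> None:
        self._table[(scenario, state)].append(fn)

    def state_reached(self, scenario: str, state: str, env: Env) -> int:
        """Run every function associated with the state; return how many ran."""
        functions = self._table.get((scenario, state), [])
        for fn in functions:
            fn(scenario, state, env)
        return len(functions)


notifications: List[str] = []


def page_administrator(scenario: str, state: str, env: Env) -> None:
    # Stand-in response: a real module might send a page or reconfigure a firewall.
    notifications.append(f"{scenario} reached {state} from {env.get('src', 'unknown')}")


table = ResponseTable()
table.associate("portscan", "s1", page_administrator)

# Two different instances of the same scenario reach the intermediate state s1.
ran_first = table.state_reached("portscan", "s1", {"src": "10.0.0.5"})
ran_second = table.state_reached("portscan", "s1", {"src": "10.0.0.9"})
ran_other = table.state_reached("portscan", "s2", {})
```

Because the table is keyed by state rather than by instance, the function runs once for every instance that reaches the state, matching the behavior described above for intermediate states.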
At this point, the STAT-based application is completely configured. Event Providers, Scenario Plugins, Language Extension Modules, and Response Modules can be loaded and unloaded following the needs of the application. These reconfigurations are subject to a number of dependencies that must be satisfied in order to successfully load a component into the sensor and to have the necessary inputs and outputs available for processing. The management of these dependencies is delegated to the application or to the MetaSTAT infrastructure.
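One way to picture the dependency management mentioned above is a registry that refuses to load a component whose prerequisites are missing and refuses to unload a component that others still need. The Python sketch below is a guess at the general shape of such a check, not a description of how MetaSTAT does it; `ModuleRegistry`, `DependencyError`, and the module names are invented for the example.

```python
from typing import Dict, Iterable, List, Set


class DependencyError(Exception):
    """Raised when a load or unload would violate a dependency."""


class ModuleRegistry:
    def __init__(self) -> None:
        # Maps each loaded module name to the set of modules it requires.
        self._requires: Dict[str, Set[str]] = {}

    def load(self, name: str, requires: Iterable[str] = ()) -> None:
        needed = set(requires)
        missing = sorted(needed - set(self._requires))
        if missing:
            raise DependencyError(f"{name} requires missing modules: {missing}")
        self._requires[name] = needed

    def unload(self, name: str) -> None:
        if name not in self._requires:
            raise KeyError(name)
        dependents = sorted(m for m, req in self._requires.items() if name in req)
        if dependents:
            raise DependencyError(f"{name} is still required by: {dependents}")
        del self._requires[name]

    def loaded(self) -> List[str]:
        return sorted(self._requires)


registry = ModuleRegistry()
registry.load("apache_extension")
registry.load("apache_provider", requires=["apache_extension"])

load_error = ""
try:
    # Fails: the hypothetical "http_extension" has not been loaded.
    registry.load("cgi_scenario", requires=["apache_extension", "http_extension"])
except DependencyError as exc:
    load_error = str(exc)

unload_error = ""
try:
    # Fails: the provider still depends on the extension.
    registry.unload("apache_extension")
except DependencyError as exc:
    unload_error = str(exc)
```

In this sketch the failed operations leave the registry unchanged, so the application remains in its last consistent configuration.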
The STAT framework has been used to develop a number of STAT-based intrusion detection systems.
The MetaSTAT Infrastructure is a communication and control infrastructure to remotely control and coordinate the activities of a number of STAT-based applications.
The idea is that a protected network is instrumented with a "web of sensors" composed of distributed components integrated by means of a local communication and control infrastructure. The task of the web of sensors is to provide fine-grained surveillance inside the protected network. The web of sensors implements local surveillance against both outside attacks and local misuse by insiders in a way that is complementary to the mainstream approach where a single point of access (e.g., a gateway) is monitored for possible malicious activity. The outputs of the sensors, in the form of alerts, are collected by a number of "meta-sensor" components. Each meta-sensor is responsible for a subset of the deployed sensors, and may coordinate its activities with other meta-sensors. The meta-sensors are responsible for storing the alerts, for routing alerts to other sensors and meta-sensors (e.g., to perform correlation to identify composite attack scenarios), and for exerting control over the managed sensors. The high-level view of the architecture of the STAT-based web of sensors is given in Figure 3.
The MetaSTAT infrastructure is composed of several modules providing different functionality.
MetaSTAT components can be organized in a hierarchical structure to address large networks and cross-domain control issues.