cidf.txt

cidf.txt: Posted Aug 17, 1999; tags | paper; SHA-256 | 1e501c4a91f74721c7a394653f3096293f33eef39fb7f2f16bba10fb903bc948


CIDF Specification: Version 0.6          Page 1

                THE COMMON INTRUSION DETECTION FRAMEWORK

                           CIDF working group
               http://seclab.cs.ucdavis.edu/cidf/members.html

Contents

0: Preamble
        0.1 Introduction
        0.2 Organization of this document

1 Architecture
        1.1 Introduction
        1.2 Functional decomposition (E-boxes, A-boxes etc).
        1.3 Layering scheme
        1.4 Naming and locating components

2 Gidos and S-expressions
        2.1 Introduction to the gido format
        2.2 Gido Requirements and Rationale
        2.3 GIDO S-expression format
        2.4 Parts of a GIDO payload
        2.5 Detailed Examples
        2.6 Rules and Guidelines for Defining SIDs
        2.7 Example CIDF Module GIDO Sets
        2.8 Negotiation

3 Encoding Gidos in Bytes
        3.1 Introduction
        3.2 Gido header
        3.3 S-expression encoding

4 CIDF Communication
        4.1 CIDF message layer formats
        4.2 CIDF Message Processing
        4.3 CIDF Directory Services

5 APIs
        5.1 Introduction
        5.2 General APIs
        5.3 Crypto APIs
        5.4 Event generator API
        5.5 Event analyzer API
        5.6 GIDO database API
        5.7 Response unit API

A. Primitive Type Definitions

B. SIDS List

C. LDAP Background

D. Conformance Profiles


========================================================================
= 0.1: Introduction to CIDF
========================================================================

The goal of the Common Intrusion Detection Framework is a set of
specifications which allow
        * different intrusion detection systems to inter-operate and
          share information as richly as possible,
        * components of intrusion detection systems to be easily re-used
          in contexts different from those they were designed for.

The CIDF working group came together originally in January 1997 at the
behest of Teresa Lunt at DARPA in order to develop standards to
accomplish the goals outlined above.  She was
particularly concerned that the various intrusion detection efforts she
was funding be usable and reusable together and have lasting value to
customers of intrusion detection systems.

During the life of the effort, it became clear that this was of wider
value than just to DARPA contractors, and the group was broadened to
include representatives from a number of government, commercial, and
academic organizations.  After the first few months, membership in the
CIDF working group was open to any individuals or organizations that
wished to contribute.  No cost was involved (except to defray meeting
expenses).

Major decisions were made at regular (every few months) meetings of the
working group.  Those decisions were made by rough consensus of all
attendees.  That is, the meeting facilitator attempted to reach
consensus, but in situations where only one or two individuals were
protesting a decision, they were overruled in the interest of
efficiency.  No decisions were taken in the face of opposition from a
sizeable minority; rather, the issue was tabled for further
consideration.  Meetings were fun and the working group had a good time
doing this (well, most of them, anyway).

In between meetings, most of the writing was done by small subgroups or
individuals.  Their text was brought back for approval/changes at
meetings.  Discussions were also carried on in the working group mailing
list, but few decisions were made that way.

The CIDF working group is now seeking to become an IETF working group.


========================================================================
= 0.2: Organization of the CIDF Spec
========================================================================

This section describes the organization of the CIDF specification as it
appears in the rest of this document.  CIDF basically consists of the
following things:

        1) A set of architectural conventions for how different parts
           of intrusion detection systems can be modeled as CIDF
           components.

        2) A way to represent gidos (generalized intrusion detection
           objects).  Gidos can
                 *  describe events that have happened in the systems
                    monitored by an IDS,
                 *  instruct an IDS to carry out some action,
                 *  query an IDS as to what has happened,
                 *  describe an IDS component.

        3) A way to encode gidos into streams of bytes suitable for
           transmission over a network or storage in a file.

        4) Protocols for CIDF components to find each other over a
           network and exchange gidos.

        5) Application Programming Interfaces to re-use CIDF
           components.

Each of these major areas thus forms one section (numbered as shown
above) of this document.  The organization of each section is described
at the front of that section.

0.2.1: Format

This document complies with the requirements of RFC 1543, the format
for ASCII Internet RFCs.  In summary, this means that lines are at most
72 characters long and that they are terminated with a carriage-return,
line-feed pair.  Pages are at most 58 lines long and are terminated with
a form-feed character.  Paragraphs are single spaced and are separated
by blank lines.

Lines in the text beginning with "#" denote editorial comments which
should be removed before the final version.

The document is also divided into sections which are further divided
into subsections, subsubsections, and so on.  The numbering convention
has the form "3.4.1", which denotes the first subsubsection of the
fourth subsection of the third section.  Appendices are lettered, so an
Appendix subsection might be B.4.2.


========================================================================
========================================================================
=
=                         1: CIDF Architecture
=
========================================================================
========================================================================
=
========================================================================
= 1.1: Introduction
========================================================================

This section introduces the architectural framework that CIDF assumes
will structure an intrusion detection system.  This scheme is basically
a framework around which interfaces and the communication protocols are
organized.  CIDF-conformant intrusion detection systems need not be
organized in exactly this way, but they must support interfaces that
are so organized.

Section 1.2 introduces the kinds of components that CIDF assumes are
needed in intrusion detection systems.  Section 1.3 covers the
communication layering scheme, and section 1.4 discusses how components
are named and located.


========================================================================
= 1.2: CIDF Functional Decomposition
========================================================================

All CIDF components deal in *gidos* (generalized intrusion detection
objects) which are represented via a standard common format.  Gidos are
the data moved around within the intrusion detection system.  Gidos can
represent events that occurred in the system, analysis of those events,
prescriptions to be carried out, or queries about events.

CIDF defines four interfaces that CIDF components may implement:

             ||       Push-style       |       Pull-style
    =========++========================+=======================
             || Produces gidos when it | Produces gidos
    Producer || wants to, typically in | when queried.
             || response to events.    |
    ---------++------------------------+-----------------------
             || Mates with push-style  | Mates with pull-
    Consumer || producer.              | style producer.
             ||                        |

Each of these interfaces takes two forms: a callable form, which permits
reuse of the component, and a protocol form, which permits the component
to interoperate with other CIDF components.
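
As a rough illustration of the callable form, the four interfaces might
be sketched in Python as follows.  The class and method names
(PushProducer, subscribe, emit, deliver, query, fetch) are assumptions
made for this sketch and are not defined by CIDF; a gido is stood in
for by a plain dictionary rather than the format of sections 2 and 3.

```python
from typing import Callable, Dict, List

Gido = Dict[str, str]   # stand-in for a real gido (assumption)


class PushConsumer:
    """Accepts gidos whenever a push-style producer delivers them."""

    def __init__(self) -> None:
        self.received: List[Gido] = []

    def deliver(self, gido: Gido) -> None:
        self.received.append(gido)


class PushProducer:
    """Emits gidos on its own initiative to subscribed consumers."""

    def __init__(self) -> None:
        self._consumers: List[PushConsumer] = []

    def subscribe(self, consumer: PushConsumer) -> None:
        self._consumers.append(consumer)

    def emit(self, gido: Gido) -> None:
        for consumer in self._consumers:
            consumer.deliver(gido)


class PullProducer:
    """Holds gidos and produces them only when queried."""

    def __init__(self, gidos: List[Gido]) -> None:
        self._gidos = list(gidos)

    def query(self, wanted: Callable[[Gido], bool]) -> List[Gido]:
        return [g for g in self._gidos if wanted(g)]


class PullConsumer:
    """Mates with a pull-style producer by issuing queries to it."""

    def __init__(self, producer: PullProducer) -> None:
        self._producer = producer

    def fetch(self, wanted: Callable[[Gido], bool]) -> List[Gido]:
        return self._producer.query(wanted)


def demo() -> tuple:
    # Push path: a producer emits, a subscribed consumer receives.
    producer, sink = PushProducer(), PushConsumer()
    producer.subscribe(sink)
    producer.emit({"kind": "login", "user": "alice"})
    # Pull path: a consumer queries a producer holding those gidos.
    store = PullProducer(sink.received)
    logins = PullConsumer(store).fetch(lambda g: g["kind"] == "login")
    return sink.received, logins
```

The protocol form of each interface would carry the same exchanges
over the message layer instead of through direct method calls.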

CIDF defines several types of preferred components:

        * Event generators
        * Analyzers
        * Databases
        * Response units

Figure 1.1 presents a schematic view of these components in a
hypothetical intrusion detection system.  The solid boxes labeled E1,
E2, A1, A2, D, etc. represent the various components of some hypothetical
intrusion detection system.  It is convenient to think of these as
objects in the object-oriented programming sense (this does not dictate
an implementation in an object-oriented language or framework).


                    |        |         |
            ,-------|--------|---------|---------.
            |       |        |         |         |
            |       V        V         V         |
            |   ,------.  ,------.  ,------.     |
            |   |  E1  |  |  E2  |  |  E3  |     |
            |   `------'  `------'  `------'     |
            |     ^         ^          ^         |
            |     |         |   ,------'         |
            |     | ,-------'   |                |
            |     | | ,---------'     ,------.   |
            |     V V V       ,------>|  A1  |   |
<---------->|   ,------.      |       `------'   |
            |   |   C  |<-----'                  |
            |   |      |<-----.                  |
            |   `------'      |       ,------.   |
            |     ^  ^        `------>|  A2  |   |
            |     |  |                `------'   |
            |     |  `-----------.               |
            |     |              |               |
            |     V              V               |
            |    ,------.     ,------.           |
            |    |  D   |     |  R   |           |
            |    `------'     `------'           |
            |                    |               |
            `--------------------|---------------'
                                 |
                                 V

                 Figure 1.1: Types of CIDF components

Whether the individual components are separate processes or images, or
merely conceptually separate parts of the code in a single image is not
specified - both possibilities are covered by the CIDF specification.

CIDF allows for components to be aggregated together to masquerade as a
single component.  In other words, a large number of (possibly
distributed) components can be tied together and present themselves to
the outside world through a single CIDF interface.

#####################################################################
#
# Stuart comment:
# It is not clear at present how this last requirement is to
# be achieved.
#
#####################################################################

1.2.1 Matchmaking Service


The gray box (labelled C in Figure 1.1) represents the configuration and
directory services that tie components together via their standard CIDF
interfaces.  These are collectively termed the CIDF "matchmaker". A
component initiating communication may avoid using the matchmaker if the
component knows how to address its target directly, or if it uses
broadcast or other (non-CIDF) means to do so.  Otherwise, the matchmaker
allows a component either to look up its target by name or to derive its
communication "partners" by looking up "gido classes".

Gido classes specify types of data that may be exchanged between
components.  Components that wish to receive certain kinds of gidos
describe what they want; components producing event records describe
what it is they produce.  The matchmaker then takes care of associating
GIDO producers with appropriate GIDO consumers.  In this mode of use,
components are thus relieved of the burden of identifying or locating
their partners in the intrusion-detection system.

1.2.2 Event Generators

The boxes labelled Ei in Figure 1.1 are event generators.  Their role is
to obtain events from the larger computational environment outside the
intrusion detection system (symbolized by the fat arrows coming from
outside the dashed box), and provide them in the CIDF standard gido
format to the rest of the system.  For example, event generators might
be simple filters that take C2 audit trails and convert them into the
standard format.  Another event generator may passively monitor a
network and generate events based on the traffic thereon.  A third might
be application code in an SQL database program which generates events
describing database transactions.

It seems that event generators are likely to be reusable in that CIDF
has a standard data format, and so converting features of typical
computational environments into that format will be a task that many
groups will need to perform.  Hence, it is useful to specify a preferred
way to configure and use event generators.

Preferred event generators implement the push-style producer interface.
They create only gidos describing raw events, not gidos describing
analyses or prescriptions.

Preferred event generators provide events as soon as they occur (with
the possible exception of transport queuing). Storage of events is
handled in gido databases.

1.2.2 Event Analyzers

Analyzers are labeled by Ai in Figure 1.1.  They are the components we
typically think of in the intrusion detection context.  They obtain
gidos from other components, analyze them, and return new gidos (which
hopefully represent some kind of synthesis or summary of the input).


Thus for example, an analyzer might be a statistical profiling tool that
examines whether events being supplied to it now are statistically
unlikely to be from the same time series as events supplied to it in the
past.  Another example is a signature tool that examines sequences of
events looking for particular patterns that represent known misuse of
the system.  Another example would be a correlator that simply examines
events and attempts to determine whether they are causally related to
one another, and then puts them together into composite events which can
be further analyzed.  Simple analyzers might be just filters that throw
away events that match certain patterns, or caches that only forward
events dissimilar from recently seen events.

A preferred event analyzer implements the push-style consumer interface,
whereby it obtains input, and the push-style producer interface, whereby
it reports analyses.  The gidos it produces are analysis results, not
raw events nor prescriptions.

Again, preferred event analyzers pass gidos through immediately (apart
from some processing delay).  No provision is made for storage of gidos
by analyzers.
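
One of the simple analyzers mentioned above, a cache that forwards only
events dissimilar from recently seen ones, might look roughly like the
following Python sketch.  The names DedupAnalyzer, deliver, keys and
window are hypothetical, a gido is stood in for by a dictionary, and
"similar" is assumed to mean "equal on the chosen key fields".

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Gido = Dict[str, str]   # stand-in for a real gido (assumption)


class DedupAnalyzer:
    """Push-style consumer and producer that suppresses repeats.

    A gido is forwarded only if no gido with the same key fields is
    among the last `window` gidos forwarded.  Nothing is kept beyond
    that small window of signatures.
    """

    def __init__(self, sink: Callable[[Gido], None],
                 keys: Tuple[str, ...], window: int = 3) -> None:
        self._sink = sink
        self._keys = keys
        self._recent: Deque[Tuple[str, ...]] = deque(maxlen=window)

    def deliver(self, gido: Gido) -> None:
        signature = tuple(gido.get(k, "") for k in self._keys)
        if signature in self._recent:
            return                      # similar to a recent event
        self._recent.append(signature)
        self._sink(gido)                # pass through immediately


def demo() -> List[Gido]:
    out: List[Gido] = []
    analyzer = DedupAnalyzer(out.append, keys=("kind", "host"))
    for host in ["a", "a", "b", "a"]:
        analyzer.deliver({"kind": "probe", "host": host})
    return out
```

In this run the second and fourth probes from host "a" are dropped, so
only one gido per host reaches the sink.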

1.2.3 Event Databases

Databases are labeled by Di in Figure 1.1.  These components exist
simply to give persistence to CIDF gidos where that is necessary.  The
interfaces allow other components to pass gidos to the database, and to
query the database for gidos that it is holding.  Databases are not
expected to change or process the gidos in any way (or at least to
maintain the illusion that they don't).

A preferred gido database implements the push-style consumer interface,
whereby it receives any sort of gido, and the pull-style producer
interface, whereby it responds to queries.

It is not assumed that the database is a complex application (such as a
relational database). It may simply be a file.
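
A file-backed database of this kind could be sketched in Python as
below.  This is only an illustration: the class name FileGidoDatabase,
its methods, and the one-JSON-object-per-line storage are assumptions
of the sketch (the byte encoding of gidos is the subject of section 3),
and a gido is stood in for by a dictionary.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List

Gido = Dict[str, str]   # stand-in for a real gido (assumption)


class FileGidoDatabase:
    """Append-only gido store backed by a single file."""

    def __init__(self, path: str) -> None:
        self._path = path

    def deliver(self, gido: Gido) -> None:
        # Push-style consumer side: persist the gido unchanged.
        with open(self._path, "a", encoding="utf-8") as f:
            f.write(json.dumps(gido) + "\n")

    def query(self, wanted: Callable[[Gido], bool]) -> List[Gido]:
        # Pull-style producer side: return stored gidos on request.
        if not os.path.exists(self._path):
            return []
        with open(self._path, encoding="utf-8") as f:
            stored = [json.loads(line) for line in f if line.strip()]
        return [g for g in stored if wanted(g)]


def demo() -> List[Gido]:
    with tempfile.TemporaryDirectory() as tmp:
        db = FileGidoDatabase(os.path.join(tmp, "gidos.log"))
        db.deliver({"kind": "login", "user": "alice"})
        db.deliver({"kind": "logout", "user": "alice"})
        return db.query(lambda g: g["kind"] == "login")
```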

1.2.4 Response Units

Response units are the soldier ants of the CIDF ant-heap.  They carry
out prescriptions - gidos that instruct them to act on behalf of other
CIDF components.  This is where functionality such as killing processes,
resetting connections, etc. would reside.  Response units are not
expected to produce output except as acknowledgements.

A preferred response unit implements the push-style consumer interface,
whereby it receives prescriptions.  It may also implement the push-style
producer interface, whereby it reports on its efforts to carry out the
prescriptions.

1.2.5 Other Components


Many other useful types of component are compatible with CIDF. For
example, a subsystem may record events in a non-CIDF format, but may
implement the pull-style producer interface so that CIDF components can
query its record of events.

A component may record gidos for archival purposes, thus needing only a
push-style consumer interface.

A component may observe the world and do some analysis or filtering
before creating gidos.  Such a component implements the push-style
producer interface.

An event analyzer may consult a gido database.  The analyzer would need
a pull-style consumer interface besides the usual push-style producer and
push-style consumer interfaces.

A component may carry out responses, like a response unit, but also
produce analyses, like an event analyzer.


========================================================================
= 1.3: Communication Layers
========================================================================

1.3.1: Background

CIDF supports both interoperability and reusability of components.  As
such, a component may be communicating with another across the network,
or as part of the same executable.  In addition, to the extent feasible,
CIDF avoids specifying a particular language or choice of network
protocols.  To support this flexibility, the design is structured in
layers.  Figure 1.2 shows the layers.

                        ------------------
                        |  APIs          |
                        |----------------|
                        | Gido layer     |
                        |----------------|
                        | message        |
                        | layer          |
                        |----------------|
                        | (negotiated)   |
                        | transport      |
                        | layer          |
                        ------------------

                             Figure 1.2

1.3.2: API Layer

At the top of figure 1.2 is an API layer indicating code-based
interfaces to the layers below.  Application programmers require a clean
and uniform way to call upon functions that are either local or remote
and do not wish to bother with the details of exactly how that function
is provided.  APIs hide information and simplify a programmer's task.
If the underlying structure of one of the lower layers is changed, the
programmer does not have to rewrite the application program.

The specification is in-principle neutral regarding the language used
for APIs.  Of course, the APIs must be instantiated for any specific
language, and the instantiations will be different for different
languages.  However, the semantics of what is being passed across the
interface will be common, and to the extent feasible, the APIs will be
conceptually similar.  The APIs are discussed in detail in Section 5 of
this document.

1.3.3: Gido layer

Independent of programming language, network protocols, etc, CIDF
defines common formats for intrusion detection data.  This data comes in
discrete packages called gidos (generalized intrusion detection
objects). The organization of the data, its semantics for an IDS
component, and a way to encode it in bytes are all defined at this
level.


The rationale for this is to separate the issue of how data is organized
and what it means (gido layer) from how it gets in and out of
components (API layer) and is moved across networks (message layer). In the
case of components that are linked together into a single executable,
there may be no layer below the gido layer.  Gidos are discussed in
sections 2 and 3 of this document.

1.3.4: Message layer

Gidos must be moved across networks.  Certain features of this process
must be present for CIDF purposes and may not be provided by underlying
transport mechanisms (such as cryptography, CIDF addressing, etc). The
CIDF message layer is intended to provide this functionality.  This
layer is addressed in section 4.  Use of this layer is mandated for CIDF
components that are to be interoperable across a network.

1.3.5: Transport layer

The figure below illustrates the notion of two independently developed
CIDF modules that build to a common interface specification.  CIDF
supports

For the two modules to communicate, they are required to employ the same
transport protocols that will establish the communication channel and
handle message passing.  The introduction of the transport layer is
handled during the integration phase, as module developers negotiate and
agree upon a common transport channel.  For example, both developers may
agree that sockets will be used for this communication session.  Other
developers may decide they wish to employ secure RPC for a different
session.  CIDF provides the flexibility to use different transport
mechanisms, and a negotiation mechanism to choose amongst them.

The reason for having an independent transport layer below the message
layer is that our only requirement is that the components understand the
messages.  This is independent of the way in which messages are
transmitted.  Different applications will require different transport
mechanisms.  All components are required to support a default transport
mechanism, namely UDP. This is necessary in order to guarantee that two
components can talk at least enough to negotiate about which other
transport mechanism they might prefer.
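
The effect of the mandatory default can be sketched in Python as
follows.  This is not the CIDF negotiation protocol itself, which this
section does not define; the function name choose_transport and the
transport labels ("tcp", "secure-rpc") are hypothetical.  Only the UDP
default comes from the text above.

```python
from typing import Sequence

DEFAULT_TRANSPORT = "udp"   # the mandatory default named above


def choose_transport(ours: Sequence[str],
                     theirs: Sequence[str]) -> str:
    """Pick the first of our preferences that the peer also offers.

    Both sides implicitly support the default, so a common choice
    always exists.
    """
    offered = set(theirs) | {DEFAULT_TRANSPORT}
    for name in ours:
        if name in offered:
            return name
    return DEFAULT_TRANSPORT
```

For example, choose_transport(["secure-rpc", "tcp"], ["tcp"]) selects
"tcp", while two components with no other mechanism in common end up
on the UDP default.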

------------------------------------------------------------------------
             Interoperation Among Independently-Developed
                     Intrusion-Detection Modules


+-------------+   +---+                       +---+    +-------------+
|  Intrusion  |   | T |                       | T |    |  Intrusion  |
|  Detection  |   | R |                       | R |    |  Detection  |
|   Module X  |   | A |     communication     | A |    |   Module Y  |
|             |   | N |       interface       | N |    |             |
| Developer 1 |   | S | <-------------------->| S |    | Developer 2 |
|             |   | P |       negotiated      | P |    |             |
| Language A  |   | O |        during         | O |    |  Language B |
|    OS X     |   | R |      integration      | R |    |     OS Y    |
+-------------+   | T |         phase         | T |    +-------------+
       ^          +---+                       +---+            ^
      / \                                                     / \
       |                                                       |
       |                                                       |
       | Build-to        +------------------+         Build-to |
       |                 |                  |                  |
       +-----------------| Common Interface |------------------+
                         |  Specification   |
                         +------------------+

------------------------------------------------------------------------

                             Figure 1.3


========================================================================
= 1.4: Naming and Locating Components
========================================================================

1.4.1: Background

In an intrusion-detection system of any scale, naming components has the
potential to become a boundless headache.  Components that "know" the
identity of other components will require modification to work with
other partners if redeployed in new contexts.  Such components might not
be informed at all of changes in the system (such as the addition or
removal of components of interest to them) that could affect their own
operation.

So each component has the option of specifying classes of gidos that it
is interested in, rather than naming other components.  A producer of
gidos can announce the classes of gidos it produces.  A gido consumer
can request the classes of gidos it wants to receive.  Communication
between components is then characterized, not by the address or other
identifying information of the endpoints involved, but by some feature
of the data the communication is to carry.

Components also have the option of naming other components explicitly,
either with or without the use of the CIDF infrastructure described in
the next sections.

Finally, components may also use network broadcast to indicate their
willingness to accept specific gidos.

1.4.2: Associations

Enabling the feature-based communication described above is the role of
the CIDF Matchmaking Service, or matchmaker, which will form and
maintain associations between components.  To discuss associations
further, it is useful to divide CIDF components into gido producers (E-
boxes, A-boxes, D-boxes) and gido consumers (A-boxes, D-boxes, R-boxes).
Also note that gidos may enter the ID & R system from other sources
(e.g., humans) and leave the system bound for other recipients.

A component contacts the matchmaker to announce its presence and ask for
associates.  This call returns a set of communications endpoints
identifying potential partners for the transfer of gidos of interest to
the caller.  Each producer-consumer pair subsequently established is the
basis for an association.  The caller has the option of being notified
(via a callback) when new potential partners enter the system or when
old ones leave.  Note, though, that individual components or CIDF
platforms may choose not to support dynamic addition of associations,
e.g., due to resource constraints.  Components may also restrict the
number of concurrent associations they will enter into.
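
The announce-and-seek exchange described above might be modeled, very
roughly, by the in-memory Python sketch below.  The names Matchmaker,
announce, withdraw and seek_associates, the endpoint strings, and the
gido class label "subnet/10.1" are all assumptions of the sketch; the
real service is built on directory operations described in section 4.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Optional, Set

Endpoint = str                              # e.g. "host:port"
Callback = Callable[[str, Endpoint], None]  # (event, endpoint)


class Matchmaker:
    """In-memory stand-in for the CIDF matchmaking service."""

    def __init__(self) -> None:
        self._producers: DefaultDict[str, Set[Endpoint]] = \
            defaultdict(set)
        self._watchers: DefaultDict[str, List[Callback]] = \
            defaultdict(list)

    def announce(self, endpoint: Endpoint, gido_class: str) -> None:
        """A producer declares a class of gidos it produces."""
        self._producers[gido_class].add(endpoint)
        for notify in self._watchers[gido_class]:
            notify("joined", endpoint)

    def withdraw(self, endpoint: Endpoint, gido_class: str) -> None:
        """A producer leaves; interested consumers are told."""
        if endpoint in self._producers[gido_class]:
            self._producers[gido_class].discard(endpoint)
            for notify in self._watchers[gido_class]:
                notify("left", endpoint)

    def seek_associates(self, gido_class: str,
                        callback: Optional[Callback] = None
                        ) -> List[Endpoint]:
        """Return current producers of a class; optionally watch."""
        if callback is not None:
            self._watchers[gido_class].append(callback)
        return sorted(self._producers[gido_class])


def demo() -> tuple:
    events: List[tuple] = []
    mm = Matchmaker()
    mm.announce("e1.example:4000", "subnet/10.1")
    found = mm.seek_associates(
        "subnet/10.1", lambda ev, ep: events.append((ev, ep)))
    mm.announce("e2.example:4000", "subnet/10.1")
    mm.withdraw("e1.example:4000", "subnet/10.1")
    return found, events
```

Here the consumer first learns of the one existing producer from the
return value, and then hears about later arrivals and departures
through its callback.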


Associates are sought by a single API call, and individual associations
may be torn down with a second call.  However, each of these calls may
induce a larger number of lower-level interactions.  At the message
layer, setting up an association involves directory operations
(optionally including authentication). Maintaining a request for
associates may involve keepalive functions also implemented in the
message layer.  Negotiation between producer and consumer, to determine
what kinds of gidos the consumer will receive, occurs in the API layer.
The specification returns to these lower-level operations in later
sections.

1.4.3: Gido Classes

For feature-based communication to work on a large scale, ways of
classifying gidos must be established in advance.  Every legal class of
gido will fall under a category, which is a set of values for a
particular attribute of the gido.  The attribute need not necessarily be
an explicit field in the gido; it could be an attribute of the gido
producer, or of the host the producer monitors.  When an attribute does
form part of a gido, it corresponds to a semantic ID (SID), as defined
elsewhere in this document.

So a category can denote something like:

  * an IP subnet
  * a DNS domain or subdomain
  * a physical subnet
  * a functional grouping of hosts, like:
      + a department
      + a project

Wildcarding will be allowed, but not arbitrary wildcarding.  For
instance, only the last two elements of an IP address might be allowed
to be wildcarded.
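
A restricted wildcard rule of this kind could be checked as in the
Python sketch below.  The pattern syntax ("*" as the wildcard in a
dotted-quad IPv4 address) and the function names are assumptions; the
specification only gives the "last two elements" rule as an example.

```python
from typing import List


def parse_pattern(pattern: str) -> List[str]:
    """Validate an IPv4 pattern such as "10.1.*.*".

    Only the last two of the four elements may be "*".
    """
    parts = pattern.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated elements")
    for index, part in enumerate(parts):
        if part == "*":
            if index < 2:
                raise ValueError("wildcard not allowed here")
        elif not (part.isascii() and part.isdigit()
                  and 0 <= int(part) <= 255):
            raise ValueError("bad element: " + part)
    return parts


def matches(pattern: str, address: str) -> bool:
    """True if a dotted-quad address falls under the pattern."""
    wanted = parse_pattern(pattern)
    actual = address.split(".")
    return len(actual) == 4 and all(
        w == "*" or w == a for w, a in zip(wanted, actual))
```

With these definitions, "10.1.*.*" covers 10.1.7.9 but not 10.2.7.9,
and a pattern such as "10.*.0.0" is rejected outright.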

Each kind of category is hierarchical and will be used to organize the
directory used by the matchmaker.  Each category will be the
responsibility of a single server.  A gido class may specify more than
one category; it may also specify attributes that are not categories at
all.  These will be applied in negotiating what a given producer will
actually send to a consumer.

The matchmaker can be significantly simplified by building it atop
LDAP-compliant infrastructure.  Basically, the matchmaker then becomes a
set of LDAP-compliant servers (DSAs), plus an LDAP client (DUA) and
additional intelligence local to each component.  This "normal" client
environment will have to be replaceable where required with a simpler
equivalent.  Appendix C says more about our proposed use of LDAP.

1.4.4: Limitations


Though the matchmaker can do a great deal to make connections between
components intuitive and flexible, two key limitations of the approach
(or indeed of any approach built atop a hierarchical directory service)
should be noted.

Looking up a target in a hierarchical directory service is appropriate
if the target of the lookup is susceptible to hierarchical naming and if
its part of the directory hierarchy is believed to be trustworthy (or at
least not believed a priori to be untrustworthy). However, there are
interesting classes of gidos for which one or both of the above
assertions are not true.

First are gidos that concern things lacking hierarchical names.  Some
examples are:

  * public keys
  * programs or other bad bundles of bits
  * attack profiles, like Stephanie Forrest's tuples of system calls

Second are gidos that describe some characteristic of an attack or of an
attacker.  If one wants to know about attacks emanating from a given
subnet, or authored by a given principal, use of a hierarchical
directory service to locate related gidos would lead one to a server
operating out of the (hypothetical) attacker's domain, and hence likely
to be compromised.

In either case, to cope with gidos that describe an object lacking a
hierarchical name, or for which the name leads into an administrative
domain that cannot be trusted to provide accurate information, a
hierarchical directory service seems inappropriate.


========================================================================
= 2.1: Introduction to the gido format
========================================================================

2.1.1: Overview

This chapter specifies a standard gido format for use by CIDF
components.  These components shall use this standard for disseminating
event records, analysis results, and countermeasure directives to IDS
modules.  The chapter defines the syntactic structure of these messages
and provides a method for defining the semantic content necessary for
interpreting the various data elements embedded within the structure.

2.1.2: Organization

This section is organized as follows.  Section 2.2 discusses the
requirements for the gido format and the rationale for our choice.
Section 2.3 summarizes S-expressions as we define them and use them for
gidos.  Section 2.4 begins serious discussion of the semantic
identifiers we use, and how to put gido-sentences together.  Section 2.5
provides some detailed examples.  Section 2.6 contains some rules and
guidelines for defining new SIDs.  Section 2.7 identifies the
recommended set of GIDOs (primarily internal status information) that
all CIDF-compliant modules should be able to produce.  Section 2.8
discusses requirements for gido format negotiation protocols.

Readers will probably wish also to consult Appendices A and B which list
all the currently defined data types and SIDs.  Appendix D on
conformance profiles is also related to this chapter.


========================================================================
= 2.2: Gido Requirements and Rationale
========================================================================

Under the CIDF data sharing model, components receive an input stream,
use this input to drive their internal analytical processing, and pass
the results to other components within an overall intrusion detection
architecture.  The output of one component may be the input of another
component.  Therefore, this specification closely coordinates the
structures of event records, analysis reports, and countermeasure
prescriptions.  In many cases, current state information must also be
used in order to fully understand the meaning of events, hence this is
also encoded in gidos.  This adoption of a single standard for E-, A-,
and R-boxes provides significant advantages in the reduction of
interface complexity.  In addition, this approach provides great
flexibility as intrusion-detection objectives move from component
analysis, to systems analysis, to system of systems analysis.

However, this relationship between event records and analysis results
does not necessarily extend beyond the specification of identical
gido structures.  Event records, analysis results, and
countermeasure prescriptions remain dissimilar in significant ways:

  o Event records represent the operational activity of the
    analysis target, and may be produced in large volumes. Minor
    losses of event records, while potentially damaging, will not
    necessarily imply a significant compromise to operational security.

  o Analysis results represent significant conclusions derived from
    an analytical review of an event stream, and should represent a
    significant reduction in volume from that of the event stream.
    Minor losses of analysis results are far more critical to the
    operational security of the target system than losses of event
    records.

  o Countermeasure prescriptions likewise should be low volume and are
    sensitive to loss.

Thus, while gidos encode events, analysis results, and countermeasure
prescriptions identically, other processing layers such as transport may
handle them differently.  For example, specifications for event
transport may derive requirements that emphasize performance (e.g.,
stateless UDP transmission), while analysis results dissemination
protocols may emphasize ensured delivery and accurate reassembly over
issues of performance (e.g., TCP transmission). Protocols for event
dissemination and analysis results reporting may also handle other
issues differently, such as security requirements.

The GIDO structure contains the actual data representing the event
record, analysis results, and countermeasure directives produced by
their respective CIDF components.  The encoding scheme requires the
ability to express complex, self-defining data structures, while
providing efficient high-volume transmissions of predefined structures.
This specification uses S-expressions as the basic payload format.


S-expressions are a self-defining formatting scheme for representing
arbitrarily complex data structures.  This message encoding
specification employs a very simplified form of S-expressions for event
record, analysis report, and countermeasure directive representation.
One of the motivations for this choice is that S-expressions in general
allow for an impressive degree of reasoning and formalism.

The design goals for the gido format are:

 -- generality: Gidos should be capable of representing arbitrarily
    complex data.
 -- self-defining: Extensions to payload formatting should be
    semantically defined within the payload itself.  Consumers should
    be able to learn or adjust to alterations in the expected format
    or comprehend an entirely new payload format.
 -- simplicity: The encoding scheme should produce messages that
    do not force complex parsing logic upon IDS module developers if
    that is not necessary in their application.  The encoding scheme
    should be easily understandable and gidos should have a
    human-readable representation.
 -- efficiency: Payload expressions should represent data compactly.
    The overhead of semantic self-definitions should be removable
    when predefined messages are transported in bulk.
 -- flexibility: Payload expressions must be open to modification and
    extensions to new data types, semantic information, and new
    data structures.
 -- independence of call semantics: Payload expressions must
    support both embedded-data (call by value) messages and
    data-independent (call by reference) messages.


========================================================================
= 2.3 GIDO S-expression format
========================================================================

2.3.1 Preamble

In this section, we define how S-expressions are put together at a low
level in CIDF. This is the human readable format; the wire format is
defined in terms of this one in section 3.3.

In addition to questions of encoding format, this specification also
enumerates a set of CIDF-compliant default primitive data types and
semantic identifiers (SIDs) used when expressing individual payload
fields.  How SIDs should be combined into S-expressions that form
meaningful gidos is discussed in section 2.4.

The primitive data types, presented in Appendix A, define the available
encoding used for field representation.  Semantic-identifiers (SIDs), in
Appendix B, provide standard identifiers that gido consumers may use to
interpret the various data fields within a payload expression.

2.3.2 S-Expression Grammar

Following is the grammar for CIDF S-expressions in BNF. Terminal symbols
are represented in upper case.  Literal characters are enclosed in
quotes (").

<item-list>     ::= <item> | "(" <item-list> ")" |
     <item-list><item>
<item>     ::= "(" <sid-exp> <data-exp-list> ")" |
     "(" "def" SID <sid-exp> ")"
<sid-exp>     ::= <specifier> |
     "(" <specifier> <sid-exp-list> ")"
<sid-exp-list>  ::= <sid-exp> |
     <sid-exp-list> <sid-exp>
<specifier>     ::= SID | TYPE | NAME
<data-exp-list> ::= <data-exp> |
     <data-exp-list> <data-exp>
<data-exp>     ::= DATA |
     "(" <sid-exp> <data-exp-list> ")"

Using this grammar, data fields are coupled with semantic identifiers
parenthetically.  A SID indicates how its associated data element is
syntactically represented as well as the data element's semantic
content.  A collection of parenthetical SID/Data tuples can themselves
be grouped together in outer parentheses, indicating an explicit
*association* of the SID/Data tuples (i.e., they represent attributes of
a larger element in the expression).  SID grouping is discussed further,
with illustrations, in Section 2.3.3.

A SID is a unique token for a semantic identifier.  TYPE is one of the
primitive types specified in Appendix A. NAME identifies a named element
of a structure.  DATA is a data literal.
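
As an illustration only, the human-readable form above can be read into
nested lists with a few lines of Python.  This is a minimal sketch and
not part of the specification: the names Quoted, tokenize, and parse are
hypothetical, quoted strings are assumed to contain no embedded double
quotes, and numeric DATA is left as a bare token rather than converted.

```python
import re

class Quoted(str):
    """A quoted DATA literal, kept distinct from bare SID/TYPE/NAME tokens."""

# One token is a parenthesis, a double-quoted string, or a bare word.
TOKEN = re.compile(r'\s*(\(|\)|"[^"]*"|[^\s()"]+)')

def tokenize(text):
    """Split S-expression text into a flat list of tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"bad input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    """Parse text into a list of top-level items (nested Python lists)."""
    stack = [[]]
    for tok in tokenize(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()
            stack[-1].append(done)
        elif tok.startswith('"'):
            stack[-1].append(Quoted(tok[1:-1]))   # strip the quotes
        else:
            stack[-1].append(tok)                 # bare token, kept as text
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]
```

Each parenthesized item becomes a list whose first element is the SID
(or SID expression) and whose remaining elements are data or nested
items.  Keeping quoted literals in their own type lets a consumer tell a
string value from a bare token without re-reading the text.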


2.3.3: GIDO S-expression Examples

The following sections illustrate low-level ways of using S-expressions
to encode gido data structures.  We give these examples for
concreteness, but see the next section for more information on how to
form gidos.

2.3.3.1: Embedded Semantics and Data Payload Example

This form is used for expressing field-oriented lists of data, where the
data is embedded within the message.  The format consists of a series of
tuples, one tuple per data field.  Each tuple consists of a semantic
identifier followed by its associated data item:

   Format: (SID-1 data-exp-1)(SID-2 data-exp-2) . . . (SID-N data-exp-N)

In this format, there is a SID with each data item, providing a
self-defining message format.  A consumer can parse the message for those
SIDs it understands and desires to analyze, and discard data fields
containing unknown or unwanted SIDs.  As discussed in Appendix B, each
SID has an associated data type, which completes the self-definition of
the message.  Thus, by parsing the SID tokens, the consumer knows both
how to interpret each data element semantically, and how the data
elements are syntactically represented.
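
The consumer behaviour described above might be sketched as follows.
This is an illustration under stated assumptions: the SID_TYPES table is
a hypothetical stand-in for the SID definitions of Appendix B (only the
three SIDs shown are listed, with types guessed from the examples in
this chapter), and each tuple is assumed to arrive as a (SID, text)
pair.

```python
# Hypothetical stand-in for Appendix B: SID name -> converter for its type.
SID_TYPES = {"UserName": str, "HostName": str, "ProcessID": int}

def decode_fields(tuples, sid_types=SID_TYPES):
    """Keep fields whose SID is understood, decoded by the SID's type."""
    decoded = []
    for sid, raw in tuples:
        convert = sid_types.get(sid)
        if convert is None:
            continue                      # unknown or unwanted SID: discard
        decoded.append((sid, convert(raw)))
    return decoded

# "Mystery" is an invented SID used only to show the discard path.
message = [("UserName", "joe"), ("Mystery", "x"), ("ProcessID", "5345")]
fields = decode_fields(message)
```

Because the SID alone selects both the meaning and the representation
of each field, the consumer needs no per-message schema.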

2.3.3.2 Pre-defined Constant Payload Format

This form allows for semantics of predefined message structures to be
conveyed to consumers once.  From that point forward, consumers can
receive and interpret raw data structures without the overhead of
embedded SIDs.  This form is highly efficient for transporting high
volumes of the same message type.  This form is also used for
enumerating a pre-defined set of CIDF E/A-box messages (see Section
2.5).

A gido producer begins the message exchange by sending the consumer a
message definition statement.  The "def" defines a new SID that can be
used subsequently.  SID indicates the semantic identifier being defined.
SIDs are special identifiers in the language.  Attempting to define a
SID that is already defined is an error.  arg-list is a list of dummy
arguments that will be matched with the actual arguments in use to
evaluate the S-expression.  sid-exp-1 defines the SID in terms of SIDs
and TYPEs that are already defined.  sid-exp-1 may only contain SIDs
that have been predefined, either because they are included in an
appendix to this document or because they have been defined in a prior
definition.

   Format1: (def SID arg-list sid-exp-1)

#####################################################################
# Editor's Comment: The event subgroup has not resolved the
# issue of scope for dynamically defined SIDS.
#####################################################################


========================================================================
= 2.4 Parts of a GIDO payload
========================================================================

2.4.1 Introduction

A GIDO consists of the GIDO header--which gives information pertaining
to the encapsulation of the GIDO, such as its version number, its
length, and so forth--and the GIDO payload.  In this section, we will
describe how SIDs are put together to compose the GIDO payload using the
S-expressions described in the last section.  The GIDO header is
discussed in section 3.2.

A well-formed GIDO payload consists of one or more top-level
*sentences*.

Sentences are S-expressions that can be said to "assert" something.  A
typical sentence might describe the state of a machine at a given time,
or it might report that a given event had taken place, or it might also
recommend that an action be taken to counter an attack.

A sentence may be composed of other sentences, connected in some way;
such a sentence is called a *compound sentence*. A sentence which is not
compound is called a *simple sentence*. Broadly speaking, a simple
sentence contains a *verb*, which describes what happened, and other S-
expressions that describe who verbed what, where, when, and how, and so
forth.

In the following sections, we will examine how each of these may be
denoted and described, and finally, put together to form a complete
sentence.

2.4.2.  Verb SIDs

At the heart of a sentence is the *verb*. Normally, we think of verbs as
denoting some action (which may sound somewhat event-centric), but they
may also denote a recommendation, for instance, or description of state.
Each sentence has one main verb.  An example of a verb SID is "Execute".

Verb SIDs, unlike most other SIDs, do not take a concrete data type for
an argument.  Instead they take a sequence of one or more S-expressions.
These S-expressions describe the various "players" for the verb.  In the
case of "Execute", we would be interested in what (program) was
executed, who executed it, where and when it was executed, and so on.

2.4.3.  Role SIDs

A verb has little value until we describe who and what that verb applies
to.  This is accomplished using *role* SIDs.  A role denotes what part
an entity, or set of attributes, plays in a sentence.  Examples of roles
are "Initiator" and "Operand".


Role SIDs, like verb SIDs, take a sequence of one or more S-expressions
as argument.  These S-expressions describe the object, roughly speaking,
which is playing that role in the sentence.

Example:

    (Initiator (RealName "Joe Cool") (UserName "joe") (UserID "1618"))

denotes a user, with real name "Joe Cool", user name "joe", and user ID
"1618", acting as Initiator.  (Typically, an Initiator is someone who
causes an action to take place--such as executing a program.)

An S-expression headed by a role SID is called a *role clause*.

2.4.4.  Extension SIDs

It is not expected that any component will understand all SIDs.  A
component concerned with Unix notions will often not be worried about
X.500-related SIDs.  Nevertheless, many X.500-related SIDs have their
complements in the Unix world, and the Unix component will want to
capture this information, even if it isn't cognizant of the exact use of
this information in the X.500 world.  For instance, a user's real name
is a user's real name, although in Unix it might be the name in
/etc/passwd associated with the user's account, and in X.500 it may be a
Common Name.  If these two concepts were expressed with two completely
distinct SIDs, then we would lose much of the benefit of data sharing.

Extension SIDs are designed to address this.  Extension SIDs allow one
to specify information in a relatively generic fashion, and then give
more specialized receivers extra information about a SID that specifies
more precisely how it is to be used.  For instance, an X.500 Common Name
would be expressed as follows:

    (RealName (ExtendedBy X500CommonName) "Joe Cool")

Most components would be able to understand the RealName SID, and would
be able to capture the fact that a user with the real name "Joe Cool"
is in question here.  Additionally, any component that understands
X.500 would implement the X500CommonName extension, so that it knows
that the real name is registered as a Common Name, along with any
implications of that fact.

In general, a SID is *extended* by following it with a sequence of one
or more SID-pairs, each of which is tagged with the ExtendedBy SID. An
extension SID MUST follow the SID or extension which it extends.  For
example, the following is well-formed:

    (ObjectName (ExtendedBy DeviceName) (ExtendedBy UnixFullDeviceName)
        ...
    )

where the ellipsis indicates the data or sequence of S-expressions that
follows the extended ObjectName SID.


An extended SID always takes the same *type* as the unextended (base)
SID.  In fact, if a producer knows that a message will *only* be used by
consumers that recognize the extension, then it may omit the base SID
altogether, and refer only to the extension.  Therefore, for instance,
one could write

    (X500CommonName "Joe Cool")
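
One way a consumer might handle extensions is sketched below, purely as
an illustration.  It assumes the S-expression has already been read
into nested Python lists, with the SID first and each extension
represented as ["ExtendedBy", name]; the function names
split_extensions and most_specific are hypothetical.

```python
def split_extensions(expr):
    """Split [SID, [ExtendedBy, X]..., rest...] into (sid, extensions, rest)."""
    sid, rest = expr[0], expr[1:]
    extensions = []
    while (rest and isinstance(rest[0], list)
           and len(rest[0]) == 2 and rest[0][0] == "ExtendedBy"):
        extensions.append(rest[0][1])
        rest = rest[1:]
    return sid, extensions, rest

def most_specific(expr, understood):
    """Pick the most specific SID in the chain that the consumer understands.

    Returns (name, rest), where name is None if not even the base SID
    is understood.
    """
    sid, extensions, rest = split_extensions(expr)
    for name in reversed([sid] + extensions):
        if name in understood:
            return name, rest
    return None, rest

expr = ["RealName", ["ExtendedBy", "X500CommonName"], "Joe Cool"]
generic = most_specific(expr, {"RealName"})
x500 = most_specific(expr, {"RealName", "X500CommonName"})
```

A generic component falls back to RealName, while an X.500-aware
component sees X500CommonName; both receive the same data because an
extension never changes the type of the base SID.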

2.4.5.  Conjunction SIDs

Conjunction SIDs join sentences at the same "level" together.  Two
sentences that are simply juxtaposed together are presumed to mean that
both hold.  That is,

    <Sentence1> <Sentence2>

means that both Sentence1 and Sentence2 hold.  Other relationships are
indicated by the appropriate conjunction SID. For instance, to indicate
that Sentence1, Sentence2, and Sentence3 all had a common cause, one
writes

    (CommonCause <Sentence1> <Sentence2> <Sentence3>)

2.4.6.  Open S-Expressions

An open S-expression is one in which not all the data values are "filled
in", so to speak.  It is used to express concepts such as "<Someone>
removed <some file>." Its only currently defined usage is in the def
construct, as follows:

    (def RemoveFile ($username $filename)
        (Remove
            (Initiator (UserName $username))
            (Operand (ObjectType file) (ObjectName $filename))
        )
    )

In later usage, we can express "The user with user name joe removed the
file /etc/passwd" in this way:

    (RemoveFile "joe" "/etc/passwd")

Its general format is

    (def <NewSID> (<ListOfArguments>) <SIDExpansion>)
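
The def construct behaves much like parameter substitution, which the
following sketch illustrates.  It is not a normative algorithm: it
assumes S-expressions held as nested Python lists, dummy arguments
written as plain strings such as "$username", and a per-consumer
dictionary of definitions.  The names define, substitute, and expand are
hypothetical.

```python
def define(table, definition):
    """Record ["def", NewSID, [args...], expansion]; redefinition is an error."""
    _, name, args, expansion = definition
    if name in table:
        raise ValueError(f"SID {name!r} is already defined")
    table[name] = (args, expansion)

def substitute(expr, bindings):
    """Replace each dummy argument in expr with its actual value."""
    if isinstance(expr, list):
        return [substitute(part, bindings) for part in expr]
    return bindings.get(expr, expr)

def expand(table, use):
    """Expand [NewSID, actual...] using the recorded definition."""
    name, actuals = use[0], use[1:]
    args, expansion = table[name]
    if len(actuals) != len(args):
        raise ValueError(f"{name} expects {len(args)} argument(s)")
    return substitute(expansion, dict(zip(args, actuals)))

table = {}
define(table, ["def", "RemoveFile", ["$username", "$filename"],
               ["Remove",
                ["Initiator", ["UserName", "$username"]],
                ["Operand", ["ObjectType", "file"],
                            ["ObjectName", "$filename"]]]])
sentence = expand(table, ["RemoveFile", "joe", "/etc/passwd"])
```

Here sentence holds the full Remove sentence with "joe" and
"/etc/passwd" filled in, so a producer can send the short form while
the consumer reasons over the expanded one.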

2.4.7.  Referent SIDs

There is one last special type of SID, called the Referent SID.
Referent SIDs are described last because they are not restricted to
the construction of a single sentence, but instead allow one to link two
or more sentences together (though they are often used to refer to other
parts of the same sentence).


The two referent SIDs are ReferAs and ReferTo.  They take a string as
their data type.  A SID-pair headed by a referent SID is called a
*referent clause*. A referent clause may be placed into either a
sentence or a role clause.  Their interpretation varies depending on
where they appear:

    * If a ReferAs clause is placed into a sentence, it can be said
      to *refer* to that sentence, *except* for any ReferAs clauses.
      (It is considered bad form to use more than one ReferAs clause
      in the same sentence at the top level.)  Thereafter, a use of
      the corresponding ReferTo clause can be used in place of that
      sentence (although see warning below).

    * If a ReferAs clause is placed into a role clause, it is said
      to refer to the object described by the sequence of S-expressions
      following that role, *except* for any ReferAs clauses.  (It is
      considered bad form to use more than one ReferAs clause in the
      same role clause.)  Thereafter, a use of the corresponding
      ReferTo clause can be used in place of that object description
      (again, see warning below).

    * WARNING.  The referent SIDs MAY carry actual semantics, and are
      not simply macros.  If a ReferAs clause is placed into a sentence,
      and that sentence refers to an event (say), then the ReferTo
      clause refers specifically to that specific event, and not simply
      to an event with the same attributes (which after all may not be
      uniquely identifying).  Similarly, if a ReferAs clause is placed
      into a role clause, and that role clause describes an object (say)
      then the ReferTo clause refers specifically to the same object,
      and not simply to an object with the same attributes.

      Of course, if no specific item is denoted by the ReferAs clause,
      then this warning does not apply.  For example, if ReferAs occurs
      in an assertion of state, then it can be interpreted as simply a
      macro, since there is no unique item being denoted.

As an example, consider the following sequence:

    (Remove
        (Initiator (RealName "Joe Cool"))
        (Operand (FileName (ExtendedBy UnixPathName) "/etc/passwd"))
        (AtTime (Time "1998 Feb 25 12:40:32 PST"))
        (ReferAs "JoesDeletion")
    )

followed by

    (HelpedCause
        (ReferTo "JoesDeletion")
        (Login
            (Initiator (RealName "Mary Worth"))
            (To (HostName "host.work.com"))
            (Outcome (ReturnCode (ExtendedBy UnixErrno) 13))
        )
    )


This indicates that the act of Joe Cool deleting /etc/passwd later
helped to prevent Mary Worth from logging in to host.work.com.  Note
that this specific instance of Joe Cool deleting /etc/passwd is referred
to here.  Even if (by resetting the clock, say) Joe Cool were to delete
/etc/passwd a second time with the same attributes, this construction
would still show that it was the *first* deletion that helped prevent
Mary Worth from logging in.

Since referent SIDs act across GIDOs, and hence potentially across
multiple messages (although not necessarily so), the question of scope
arises.  The scope rule applying to Referent SIDs is as follows: The
value of a referent clause is the verb or role within which it is found
(roughly speaking), provided that that verb or role is in the same
thread.  A thread is defined as the conjunction of the originator ID and
thread ID fields in the GIDO header.  A producer MUST NOT re-use a
referent (such as "JoesDeletion") within the same thread, in
perpetuity.
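
The scope rule can be pictured as a lookup table keyed by thread and
referent name.  The sketch below is an illustration only: the class
ReferentTable is hypothetical, and the originator ID and thread ID are
treated as opaque values because their encoding belongs to the GIDO
header (section 3.2).

```python
class ReferentTable:
    """Track ReferAs names per thread, i.e. per (originator ID, thread ID)."""

    def __init__(self):
        self._targets = {}

    def refer_as(self, originator_id, thread_id, name, target):
        """Bind name to target; a name is never re-used within a thread."""
        key = (originator_id, thread_id, name)
        if key in self._targets:
            raise ValueError(f"referent {name!r} re-used within the same thread")
        self._targets[key] = target

    def refer_to(self, originator_id, thread_id, name):
        """Return the very item bound by ReferAs, not an equal-looking copy."""
        return self._targets[(originator_id, thread_id, name)]

table = ReferentTable()
deletion = ["Remove", ["Initiator", ["RealName", "Joe Cool"]]]
# "ebox-1" and 7 are made-up originator and thread identifiers.
table.refer_as("ebox-1", 7, "JoesDeletion", deletion)
same = table.refer_to("ebox-1", 7, "JoesDeletion")
```

Returning the stored object itself, rather than a sentence with
matching attributes, mirrors the warning above that a referent denotes
one specific event or object.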

2.4.8.  Guidelines for Putting SIDs Together to Form Sentences

In this section, we describe how to use verb SIDs, role SIDs,
conjunction SIDs, and other kinds of SIDs to construct sentences.

2.4.8.1.  Basic Organization

As noted above, a simple sentence is an S-expression headed by a verb
SID (which may be extended). This verb SID is followed by a sequence of
one or more S-expressions that describe the various entities that play
parts in the sentence, or qualify the verb.

The S-expressions denoting the roles of the sentence are headed by a
role SID, which may also be extended.  This role SID is again followed
by a sequence of one or more S-expressions that may describe attributes
of the entity playing that role.  It may also describe a sentence that
plays a role within the sentence.

A BNF-like grammar that specifies this structure is as follows.


<SentenceList>      ::= <Sentence> <SentenceList>
                      | <Sentence>
<Sentence>          ::= "(" <ConjunctionSID> <ExtensionList>
                          <SentenceList> ")"
                      | "(" <VerbSID> <ExtensionList>
                          <QualifierList> ")"
                      | "(" <VerbSID> <ExtensionList>
                          <ReferToClause> ")"
                      | "(" "def" <NewSID> <ArgList> <SIDExpansion> ")"
<QualifierList>     ::= <Qualifier> <QualifierList>
                      | <Qualifier>
<Qualifier>         ::= "(" <RoleSID> <ExtensionList>
                          <QualifierList> ")"
                      | "(" <RoleSID> <ExtensionList>
                          <ReferToClause> ")"
                      | "(" <AtomSID> <ExtensionList> <AtomSIDData> ")"
                      | "(" "ReferAs" <Referent> ")"
<ExtensionList>     ::= <Extension> <ExtensionList>
                      | <NULL>
<Extension>         ::= "(" "ExtendedBy" <ExtensionSID> ")"
<ArgList>           ::= <Arg> <ArgList>
                      | <NULL>
<ReferToClause>     ::= "(" "ReferTo" <Referent> ")"

In English: A GIDO payload is a SentenceList, which is a list of
Sentences.

A Sentence may be a ConjunctionSID, followed by a list of the Sentences
it conjoins, or it may be a VerbSID, followed by a list of Qualifiers of
that VerbSID.

A Qualifier may be a RoleSID, followed by a list of Qualifiers of that
RoleSID. A Qualifier may also be an AtomSID followed by its data.

Any list of Qualifiers may contain a ReferAs clause.  Thereafter, use of
the corresponding ReferTo clause may stand in for that list of
Qualifiers.

Any SID may be followed by a list of Extensions.

2.4.8.2.  Understanding Sentences and the Principle of Connectedness

The Principle of Connectedness simply states that when a component
reading a GIDO encounters a SID it does not understand, the component
must strictly ignore the S-expression that the SID heads.  The component
MUST NOT reject the GIDO on this ground.  For instance, in the example
below

    (InOrder
        (Delete
            (Initiator (FullName "Joe Hacker"))
            (Operand (ObjectType file) (ObjectName "/etc/passwd"))
        )
        (Execute
            (Initiator (UserName "sysadmin"))
            (Operand (ObjectType program) (ProgramName "SystemCheck"))
        )
    )


if a component does not understand the Delete verb SID, it may not make
use of the Initiator and Operand SIDs within that sentence, even if it
understands those, because it will not understand what they are the
Initiator and Operand *of*.

This is called the Principle of Connectedness because the portion of the
GIDO which is understood must form a connected tree.  If a parent is not
understood, its children should not be interpreted, as their relation to
the portion of the tree containing the parent is unknown.
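
A reader might apply the principle by pruning the GIDO tree from the
top down, as in the following sketch.  It is illustrative only and
assumes the GIDO has been read into nested Python lists with the SID as
the first element of each list; the function name understood_part and
the set of known SIDs are hypothetical.

```python
def understood_part(expr, known):
    """Return the connected portion of expr that is understood, or None."""
    if not isinstance(expr, list):
        return expr                       # data under an understood SID
    if not expr or not isinstance(expr[0], str) or expr[0] not in known:
        return None                       # ignore this whole S-expression
    kept = [expr[0]]
    for child in expr[1:]:
        part = understood_part(child, known)
        if part is not None:
            kept.append(part)
    return kept

gido = ["InOrder",
        ["Delete",
         ["Initiator", ["FullName", "Joe Hacker"]],
         ["Operand", ["ObjectType", "file"], ["ObjectName", "/etc/passwd"]]],
        ["Execute",
         ["Initiator", ["UserName", "sysadmin"]],
         ["Operand", ["ObjectType", "program"],
                     ["ProgramName", "SystemCheck"]]]]

# This reader does not understand the Delete verb SID.
known = {"InOrder", "Execute", "Initiator", "Operand",
         "UserName", "ObjectType", "ProgramName"}
visible = understood_part(gido, known)
```

The Delete sentence disappears as a unit, including its Initiator and
Operand clauses, while the GIDO as a whole is still accepted and the
Execute sentence remains usable.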

2.4.8.3.  Rules and Guidelines for Using SIDs

Whenever a component puts a SID into a GIDO, the SID MUST be used with
the number of arguments (usually one) that the SID's definition calls
for (see the definitions in Appendix B). The SID's argument(s) MUST have
the syntax and meaning that the SID's definition calls for.  Otherwise
the component is OUT OF CONFORMANCE with the SID's definition.

A component that generates GIDOs MUST generate them in conformance with
all of the SID definitions in this specification.

Whenever the above rule permits, a component generating a GIDO SHOULD
use a SID from this specification and SHOULD avoid the SIDs defined in
the Uninterpreted SIDs section.  If the only suitable SID in this
specification is in the Uninterpreted SIDs section, then an
implementation MAY use it or define a new SID; defining a new SID is
usually better.

If a component generating GIDOs uses a SID from a particular
specification, and if that specification defines two applicable SIDs,
one of which is strictly more specific than another, then the component
SHOULD use the more specific one.

If CIDF component X creates a sentence and CIDF component Y later has a
copy of the sentence and passes it verbatim to CIDF component Z, then Y
MAY do so even if the sentence violates the above rules and guidelines.
The sentence MUST be passed verbatim and SHOULD be clearly ascribed to
its originator.  This provision frees D-boxes and such from having to
thoroughly understand and validate every GIDO they process.

However, if the CIDF component modifies any part of the sentence, then
it is responsible for the sentence's compliance with the above rules and
guidelines.


========================================================================
= 2.5.  Detailed Examples
========================================================================

Now that the basic components of an S-expression have been presented, we
illustrate how to utilize these components to express various record
structures and messages that intrusion detection systems may wish to
express.  In the following examples, we walk the reader through the
process of translating raw event structures, analysis results, and other
candidate message structures into S-expressions.

2.5.1.  Translating a Basic Security Module Audit Record

One very well-known form of security audit record is that introduced
in Sun Microsystems' SunOS 4.1.X Basic Security Module (BSM).  There are
a variety of ways to translate BSM audit records into S-expressions,
depending on the data elements that a CIDF module may be directed to
filter or incorporate within its GIDOs.  In this example we demonstrate
the translation of a BSM audit record generated as a result of a
successful rlogin request.

2.5.1.1.  BSM Record Description

The raw BSM record describes an event in which an external user performs
a successful remote login to target.machine.com from source.machine.com.
A session is established in which the resulting real and effective user
IDs are set to thomas, the real and effective group IDs are set to
staff, terminal 6 is assigned to the session, and the process and
session IDs are set to 5345.

The event is captured by the audit daemon on target.machine.com, which
records the event as follows:

                          Raw BSM Audit Record

      [header,86,2,login - rlogin,,Sat Jul 29 20:43:01 1995,
      + 280009000 msec subject,thomas,thomas,staff,thomas,staff,
      5345,5345,0 6, source.machine.com text,successful login return,
      success,0]

2.5.1.2.  BSM to S-Expression Translation Process

Now we illustrate the underlying rationale used to translate a common
event structure such as a BSM audit record into a CIDF S-expression.  As
discussed in Section 2.4.2, we begin our S-expression construction by
first defining the verb of our sentence in its most general form.  In
this case, the operation recorded in the BSM audit record is the
establishment of a communication session between two entities via
rlogin.  As we scan the potential Verb SIDs available in Appendix B.2,
we find that the SID most closely matching the rlogin operation is the
BeginSession SID.  While BeginSession captures well the underlying action
represented in the audit record, we note that a Unix-specific extension
is available for further refinement (as discussed in Section 2.4.4).  The
resulting S-expression is as follows:


Example 2.5.1.2a BSM Rlogin S-Expression:

-->(BeginSession (ExtendedBy UnixRlogin)
                     :
                     :
                     :
-->)

The next step is to qualify the verb with supporting S-expressions that
further enumerate the attributes of the event.  In this case, the verb
BeginSession has a series of supporting role clauses that can be derived
from the BSM record (Section 2.4.3).  These role clauses include:

    o the observer from which the event was recorded
    o the initiator of the BeginSession operation
    o the entity to whom the BeginSession was directed
    o the resulting state changes or resource(s) produced or
      destroyed by the operation (in our case this involves the
      attributes of the session established by the rlogin)
    o and the outcome of the event

From the above categories of attributes we augment the S-expression
with the following relevant role clauses:

Example 2.5.1.2b BSM Rlogin S-Expression:

 (BeginSession (ExtendedBy UnixRlogin)
-->      (Observer   (S-expression ...) )
-->      (Initiator  (S-expression ...) )
-->      (To         (S-expression ...) )
-->      (Operand    (S-expression ...) )
-->      (Outcome    (S-expression ...) )
 )

Role clauses are selected for grouping associated datafields under a
common contextual usage in the S-expression sentence.  At this point, we
switch our attention to incorporating associated datafields within the
above role clauses.  Datafields that cannot correctly be associated
within the context of one of the available role clauses can still be
incorporated as independent simple sentences within the S-expression.


In our example, the Observer clause provides a contextual association
with all datafields that describe attributes of the observer, including
when, where, and through which means (i.e., BSM data) the observation
was recorded.  The Initiator clause is used to associate datafields that
describe the entity responsible for the event.  In this case, the BSM
record provides very little information, other than the hostname from
which the request was sent.  Similarly, the BSM record provides only the
hostname of the recipient, which we document in the To clause.  The
Operand clause is used to describe the object that has been affected by
the event, which in this case is the session that was created.  From the
BSM audit record, we can include under the Operand clause the session's
associated user attributes, group attributes, process/session
attributes, and the device through which the session is supported.
Lastly, we enumerate the attributes of the outcome.

Example 2.5.1.2c Final BSM Rlogin S-Expression:

                                                         Section Ref.
                                                         ------------
 (BeginSession (ExtendedBy UnixRlogin)                   -- B.2.5
     (Observer                                           -- B.3.7
-->     (AtTime (Time "Sat Jul 29 20:43:01 PDT 1995"))   -- B.3.2
-->     (HostName "target.machine.com")                  -- B.5.4
-->     (ObservationSourceType "BSM-SunOS")              -- B.5.1
     )
     (Initiator                                          -- B.3.1
-->     (HostName "source.machine.com")                  -- B.5.4
     )
     (To                                                 -- B.3.3
-->     (HostName "target.machine.com")                  -- B.5.4
     )
     (Operand                                            -- B.3.1
-->     (UnixAUserName "thomas")                         -- B.5.9.6
-->     (UnixUserName "thomas")                          --   "
-->     (UnixEUserName "thomas")                         --   "
-->     (UnixGroupName "staff")                          --   "
-->     (UnixEGroupName "staff")                         --   "
-->     (ProcessID 5345)                                 -- B.5.2
-->     (SessionID 5345)                                 -- B.5.2
-->     (Through                                         -- B.3.3
-->        (ObjectName                                   -- B.5.1
-->           (ExtendedBy UnixFullDeviceName)            -- B.5.1
-->        "/dev/tty06")
        )
     )
     (Outcome                                            -- B.3.6
-->     (Severity 3)                                     -- B.5.1
-->     (ReturnCode                                      -- B.5.1
-->        (ExtendedBy UnixErrno)                        -- B.5.1
-->        0)                                            -- B.5.1
-->     (Comment "successful login")                     -- B.5.1
     )
 )
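
The nesting of role clauses in the rlogin example maps naturally onto
nested lists.  The following Python sketch is an illustration only and
is not part of the CIDF specification: the names Sym and render, and the
list-based representation, are assumptions, and only a few of the
clauses from the example are reproduced.

```python
class Sym(str):
    """An unquoted symbol such as a SID name (illustrative helper)."""


def render(expr):
    """Render a nested list as S-expression text."""
    if isinstance(expr, list):
        return "(" + " ".join(render(e) for e in expr) + ")"
    if isinstance(expr, Sym):      # check Sym first: it subclasses str
        return str(expr)
    if isinstance(expr, str):      # plain strings are quoted data
        return '"' + expr + '"'
    return str(expr)               # integers and other atoms


S = Sym
login = [
    S("BeginSession"), [S("ExtendedBy"), S("UnixRlogin")],
    [S("Observer"),
        [S("HostName"), "target.machine.com"],
        [S("ObservationSourceType"), "BSM-SunOS"]],
    [S("Initiator"), [S("HostName"), "source.machine.com"]],
    [S("Outcome"), [S("Severity"), 3], [S("Comment"), "successful login"]],
]

print(render(login))
```

Keeping SID names and quoted data as distinct Python types lets the
renderer decide on quoting without a separate schema.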


2.5.2.  Translating a TCP/IP Packet

In the next example, we'll see how to translate the contents of an FTP
connection request captured by a TCP/IP packet sniffer.  Here the TCP/IP
packet is observed being sent from an external client to the target
host's FTP control port.  The packet is translated by a CIDF module that
attempts to describe the transaction from the perspective of analyzing
data sent to the application-layer (i.e., FTP) network service.

2.5.2.1.  TCP/IP Packet Description

The observer in this example is a CIDF E-box that parses sniffed packets
from a Sun Microsystems Solaris machine.  The observer's host platform
is named snoopmachine.machine.com, and from this machine the observer
attempts to capture and translate traffic to and from the FTP control
port of server.machine.com using the Solaris snoop(1) command:

 snoopmachine% snoop -v -d le0 -t a host server port 21

The following is an example snoop-formatted packet produced by the
observer:

                              Raw TCP/IP Packet


ETHER: ----- Ether Header -----
ETHER:
ETHER: Packet 7 arrived at 8:59:49.05
ETHER: Packet size = 70 bytes
ETHER: Destination = 0:01:02:03:04:05, Western Digital
ETHER: Source = 0:aa:bb:cc:dd:ee,
ETHER: Ethertype = 0800 (IP)
ETHER:
IP: ----- IP Header -----
IP:
IP: Version = 4
IP: Header length = 20 bytes
IP: Type of service = 0x00
IP:     xxx. .... = 0 (precedence)
IP:     ...0 .... = normal delay
IP:     .... 0... = normal throughput
IP:     .... .0.. = normal reliability
IP: Total length = 56 bytes
IP: Identification = 63187
IP: Flags = 0x4
IP:     .1.. .... = do not fragment
IP:     ..0. .... = last fragment
IP: Fragment offset = 0 bytes
IP: Time to live = 38 seconds/hops
IP: Protocol = 6 (TCP)
IP: Header checksum = 69a3
IP: Source address = 999.998.997.996, client.machine.com
IP: Destination address = 111.121.131.141, server.machine.com
IP: No options
IP:
TCP: ----- TCP Header -----
TCP:
TCP: Source port = 12406
TCP: Destination port = 21 (FTP)
TCP: Sequence number = 820300070
TCP: Acknowledgement number = 3095138926
TCP: Data offset = 20 bytes
TCP: Flags = 0x18
TCP:     ..0. .... = No urgent pointer
TCP:     ...1 .... = Acknowledgement
TCP:     .... 1... = Push
TCP:     .... .0.. = No reset
TCP:     .... ..0. = No Syn
TCP:     .... ...0 = No Fin
TCP: Window = 61320
TCP: Checksum = 0x4e8d
TCP: Urgent pointer = 0
TCP: No options
TCP:
FTP: ----- FTP: -----
FTP: "USER anonymous\r\n"


The packet consists of four layers of structure: the Ethernet header,
the IP header, the TCP header, and the FTP data portion.  Working from
the bottom up, we see that the packet represents an FTP "USER anonymous"
request, which for FTP is equivalent to a BeginSession request for an
anonymous FTP session.  Above the FTP data are the TCP fields,
containing, among other things, the source and destination ports (note
that the destination port is port 21, the FTP control port).  Above
the TCP layer are the IP and Ethernet headers, both containing
datafields that could be used to further identify the initiator and
recipient of the FTP request.
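
As a rough illustration of how an E-box might separate these layers
before translation, the following Python sketch groups snoop -v style
"LAYER: key = value" lines by their layer prefix.  The function name
group_by_layer and the dictionary layout are assumptions made for this
example; they are not defined by CIDF or by snoop.  Note that bit-flag
lines also contain " = " and would be captured under keys such as
".1.. ....".

```python
def group_by_layer(lines):
    """Group 'LAYER: key = value' lines into {layer: {key: value}}."""
    layers = {}
    for line in lines:
        layer, _, rest = line.partition(":")
        key, sep, value = rest.partition(" = ")
        if sep:  # skip banner lines, blank lines, and payload lines
            layers.setdefault(layer.strip(), {})[key.strip()] = value.strip()
    return layers


sample = [
    "ETHER: Source = 0:aa:bb:cc:dd:ee,",
    "IP: Protocol = 6 (TCP)",
    "IP: Source address = 999.998.997.996, client.machine.com",
    "TCP: ----- TCP Header -----",
    "TCP: Source port = 12406",
    "TCP: Destination port = 21 (FTP)",
]

fields = group_by_layer(sample)
print(fields["TCP"]["Destination port"])
```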

2.5.2.2.  TCP/IP Packet to S-Expression Translation Process

As with the BSM example, we begin our S-expression by defining the verb
of our sentence.  In this example, the E-box is monitoring traffic to
the FTP control port when it encounters a TCP/IP packet that contains an
FTP USER command request for anonymous access.  As a result, we again
choose BeginSession as the verb.  The resulting S-expression is as
follows:

Example 2.5.2.2a FTP BeginSession S-Expression Example:

--> (BeginSession
                     :
                     :
                     :
--> )

Next, we qualify the verb with supporting S-expressions that further
enumerate the attributes of the event.  As with BeginSession in our BSM
example, we can support a series of role clauses from the information in
our FTP packet.  These role clauses include:

 o the observer from which the event was recorded
 o the initiator of the BeginSession operation
 o the entity to whom the BeginSession was directed
 o the resulting state changes or resource(s) produced or
   destroyed by the operation (in our case, the attributes of
   the session established by the FTP request)
 o the command or tool used in the event
 o and the outcome of the event

From the above categories of attributes, we augment the S-expression
with the following relevant role-clauses:

Example 2.5.2.2b FTP BeginSession S-Expression Example:

 (BeginSession
-->             (Observer  (S-expression ...) )
-->             (Initiator (S-expression ...) )
-->             (To        (S-expression ...) )
-->             (Operand   (S-expression ...) )
-->             (Using     (S-expression ...) )
-->             (Outcome   (S-expression ...) )
 )

The Observer clause can include a variety of datafield attributes,
including the timestamp and the host platform of the sniffer.  The
attributes of the initiator of the BeginSession can also be viewed as
attributes of the location from which the request was sent.  Because the
"Initiator" and "From" roles both provide accurate context for the set
of attributes that represent the entity responsible for the BeginSession
event, we chose to include both clauses and link them using referent
SIDs (Section 3.7).  The entity responsible for the event can be
described through a variety of attributes within the packet, including
the Ethernet address, IP address, TCP port, and hostname.  The recipient
can be identified from a similar set of corresponding datafields.
Unlike the BSM record, the packet contains very little information to
describe the session, other than that the session will be associated
with the anonymous user account.  The means used in this event is the
FTP command "USER".

Lastly, we identify the outcome of this event as pending, in that at
this point we cannot determine whether the BeginSession will succeed.
The outcome will be determined in subsequent GIDOs, which require an
association with this S-expression through a common thread ID defined in
their GIDO headers.  We use the CIDFReturnCode extension of ReturnCode
to express this condition.  The GIDO recipient must consult the other
GIDOs in the thread until it encounters an Outcome with a ReturnCode
that is not pending.
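
A consumer could resolve a pending outcome roughly as follows.  This
Python sketch is illustrative only: the dictionary representation of a
GIDO, the key names thread_id and return_code, and the return code
values other than "pending" are assumptions, not part of CIDF.

```python
def resolve_outcome(gidos, thread_id):
    """Return the first non-pending return code seen in a thread.

    Returns None when every GIDO seen so far in the thread is still
    pending, or when the thread has no GIDOs at all.
    """
    for gido in gidos:
        if gido["thread_id"] != thread_id:
            continue                      # belongs to another thread
        if gido["return_code"] != "pending":
            return gido["return_code"]
    return None


received = [
    {"thread_id": 7, "return_code": "pending"},   # the USER request
    {"thread_id": 9, "return_code": "failure"},   # unrelated thread
    {"thread_id": 7, "return_code": "success"},   # later update
]

print(resolve_outcome(received, 7))
```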

Example 2.5.2.2.c Final FTP BeginSession S-Expression:


                                                          Section Ref.
                                                          ------------
 (BeginSession (ExtendedBy FtpCommand) "USER"             -- B.2.5
     (Observer                                            -- B.3.7
-->     (AtTime (Time "08:59:49.1 PDT"))                  -- B.3.2
-->     (HostName "snoopmachine.machine.com")             -- B.5.4
-->     (ObservationSourceType "Packet")                  -- B.5.1
     )
     (Initiator                                           -- B.3.1
-->     (ReferTo "the-client")                            -- B.5.1
     )
     (From
-->     (ReferAs "the-client")                            -- B.5.1
-->     (HostName "client.machine.com")                   -- B.5.5
-->     (EthernetAddress 0:aa:bb:cc:dd:ee)                -- B.5.5.1
-->     (IPv4Address 999.998.997.996)                     -- B.5.5.2
-->     (TCPPort 12406)                                   -- B.5.5.3
     )
     (To                                                  -- B.3.3
-->     (EthernetAddress 0:01:02:03:04:05)                -- B.5.5.1
-->     (IPv4Address 111.121.131.141)                     -- B.5.5.2
-->     (HostName "server.machine.com")                   -- B.5.5
-->     (TCPPort 21)                                      -- B.5.5.3
     )
     (Operand                                             -- B.3.1
-->     (UserName                                         -- B.5.4
-->        (ExtendedBy UnixUserName)                      -- B.5.9.6
-->     "anonymous")
     )
     (Using                                               -- B.3.1
-->     (FTPCommand "USER")                               -- B.5.9.5
     )
     (Outcome                                             -- B.3.6
-->     (ReturnCode                                       -- B.5.1
-->        (ExtendedBy CIDFReturnCode)                    -- B.5.1
-->     pending)
     )
 )

========================================================================
= 2.6 Rules and Guidelines for Defining SIDs
========================================================================

Other specifications MAY define SIDs for use with the CIDF framework.
If a CIDF component generates or uses those SIDs, those SIDs MUST be
defined in conformance to the rules here and SHOULD be defined in
conformance with the guidelines here.

     o Every SID MUST have a unique name.
     o Every SID's definition MUST include precise syntax.
     o Every SID's definition SHOULD include precise semantics.
     o The SID description must fully explain the intended use of
       the SID (i.e., the intended data arguments must be described).


# Editor's note: The Event Subgroup is investigating naming
# conventions and rules for SID enumeration to eliminate the
# potential for SID reuse.

Specifiers SHOULD avoid defining a SID whose meaning overlaps another,
unless one SID is strictly more specific than the other (that is, unless
the first provides all the information that the second provides, and
more).

A SID MUST be so defined that when the SID heads an S-expression, the
truth of its S-expression is independent of the peer S-expressions, the
containing S-expression's peers, the peers of the container of the
containing S-expression, and so on.

Thus, an S-expression cannot *modify* the meaning of a peer
S-expression.  It can only augment the peer S-expression.  (The
logical relationship between peer S-expressions is conjunction.)  This
is critical because a consumer may ignore some peer S-expressions.
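
Because peer S-expressions are conjoined, a consumer can drop the peers
it does not understand and still hold a true, if weaker, statement.  A
minimal Python sketch of that behavior follows; the nested-list
representation, the function name understood, and the KNOWN_SIDS
vocabulary are assumptions made for illustration.

```python
# Hypothetical vocabulary of a consumer that knows only two SIDs.
KNOWN_SIDS = {"HostName", "TCPPort"}


def understood(clause):
    """Keep the head SID and only those peers whose head SID is known."""
    head, *peers = clause
    kept = [p for p in peers if isinstance(p, list) and p[0] in KNOWN_SIDS]
    return [head, *kept]


to_clause = [
    "To",
    ["EthernetAddress", "0:01:02:03:04:05"],
    ["HostName", "server.machine.com"],
    ["TCPPort", 21],
]

# Each surviving peer is still true on its own, so the reduced clause
# remains a valid (less detailed) description of the recipient.
print(understood(to_clause))
```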

Specifiers should be wary when defining a set of closely related SIDs,
since a consumer may understand some of the SIDs and not others.  If two
data items can be properly understood together but cannot be properly
understood singly, then it is advisable to define a single SID that
takes both data items as arguments.


========================================================================
= 2.7: Example CIDF Module GIDO Sets
========================================================================

This section enumerates example sets of internal status messages that
each CIDF-compliant E-box, A-box, and R-box may choose to support.
These message sets are not mandatory, but are recommended as a
consistent way of conveying internal module information.

#####################################################################
# Editor's Comment: Recommendations for R-box message sets are
# forthcoming.
#####################################################################

2.7.1 Recommended E-Box Message Set

E-boxes can employ the following messages for basic internal information
transfer to consumers.  These messages are all formatted using pre-
defined constant payload expressions (see Section 2.4.3.3, Format1), and
contain E-box internal operation information.  (See Appendix A for the
SID to data type listing, and Section 3.2.3 for the list of Class ID
codes.)

Message ID: EB-Owner
Description: Returns the hostname of the machine where the E-box
  is running, the machine's IP address, the port number assigned to
  the E-box (-1 if NA), E-Box process ID, identification of E-box
  developer, and revision number of the E-box.
Priority: 5
Msg. Format: (def EB-Owner (struct HostName IP_Address Port PID
              DeveloperID RevisionNo))

Message ID: EB-Target
Description: Returns the hostname of the monitoring target, the IP
  address of the target, the port number assigned to the target if
  a network service, the process ID of the target, and an identifier
  indicating the type of event stream through which the target is
  being monitored.
Priority: 5
Msg. Format: (def EB-Target (struct HostName IP_Address Port PID
              EventStreamID))

Message ID: EB-Status
Description: Returns a timestamp indicating uptime for the E-box, the
  method used to transfer messages to the consumer (synchronous
  polling, asynchronous forwarding, trap, other), events parsed per
  second, bytes parsed per second, records sent since uptime, bytes
  sent since uptime, and internal E-Box errors produced since uptime.
Priority: 5
Msg. Format: (def EB-Status (struct UpTime ReportMethod RecsPerSec
              BytesPerSec SentRecsCnt SentByteCnt ErrorCount))


Message ID: EB-Transport
Description: Returns an identifier for the current transport mechanism
  being used, the revision number of the transport software,
  and the list of available transport mechanisms for this E-box.
Priority: 5
Msg. Format: (def EB-Transport (struct CurrentTrans RevisionNo
              AvailableTrans))

Message ID: EB-Error
Description: Returns an internal error code produced by the E-Box,
  a textual description of the error, and a severity code (e.g.,
  fatal, non-fatal, potential data loss).
Priority: 3
Msg. Format: (def EB-Error (struct EB-ErrorCode ErrorDesc Severity))

Message ID: EB-Warning
Description: Returns an internal warning code produced by the
  E-Box and a textual description of the warning.
Priority: 4
Msg. Format: (def EB-Warning (struct EB-WarnCode WarnDesc))

Message ID: EB-FilterStatus
Description: Returns the current filter array that identifies
  which of the available events the E-box is currently generating
  and returning to the consumer.
Priority: 5
Msg. Format: (def EB-FilterStatus CurrentFilterArray)

2.7.2 Recommended A-Box Message Set

A-boxes can employ the following messages for basic internal
information transfer to their consumers.  These messages are all
formatted using pre-defined constant payload expressions (see Section
2.4.3.3, Format1), and contain A-box internal operation information.

Message ID: AB-Owner
Description: Returns the hostname of the machine where the A-box
  is running, the machine's IP address, the port number assigned to
  the A-box (-1 if NA), A-Box process ID, identification of A-box
  developer, and revision number of the A-box.
Priority: 5
Msg. Format: (def AB-Owner (struct HostName IP_Address Port PID
              DeveloperID RevisionNo))

Message ID: AB-Target
Description: Returns the hostname of the analysis target, the IP
  address of the target, the port number assigned to the target if
  a network service, the process ID of the target, and the module
  identity of the E-box through which the target's operational activity
  is being monitored.
Priority: 5
Msg. Format: (def AB-Target (struct HostName IP_Address Port PID
              ModuleIdentity))


(Question: ModuleIdentity assumes a single E-to-A relationship.  Need
 to handle multi-E-box analyses?)

Message ID: AB-Status
Description: Returns a timestamp indicating uptime for the A-box, the
  method used to transfer messages to the consumer (synchronous
  polling, asynchronous forwarding, trap, other), event records parsed
  per second, bytes parsed per second, reports sent since uptime,
  bytes sent since uptime, and internal A-Box errors produced since
  uptime.
Priority: 5
Msg. Format: (def AB-Status (struct UpTime ReportMethod RecsPerSec
              BytesPerSec SentRecsCnt SentByteCnt ErrorCount))

Message ID: AB-Transport
Description: Returns an identifier for the current transport being
  used, the revision number of the transport software, and the list
  of available transport mechanisms for this A-box.
Priority: 5
Msg. Format: (def AB-Transport (struct CurrentTrans RevisionNo
              AvailableTrans))

Message ID: AB-Error
Description: Returns an internal error code produced by the A-Box,
  a textual description of the error, and a severity code.
Priority: 3
Msg. Format: (def AB-Error (struct AB-ErrorCode ErrorDesc Severity))

Message ID: AB-Warning
Description: Returns an internal warning code produced by the
  A-Box and a textual description of the warning.
Priority: 4
Msg. Format: (def AB-Warning (struct AB-WarnCode WarnDesc))

Message ID: AB-FilterStatus
Description: Returns the current filter array that identifies
  which of the available analysis reports the A-box is currently
  building and returning to the consumer.
Priority: 5
Msg. Format: (def AB-FilterStatus CurrentReportingArray)


========================================================================
= 2.8 Negotiation
========================================================================

We would like to enable CIDF to support adaptations in ID systems.  For
example:

 - adding or removing components (i.e., E-, A-, R-, or D-boxes) on
   the fly,
 - adding new capabilities to components via software
   modifications or adding new data such as signatures,
 - responding to specific situations such as identification of a
   possible threat.

We therefore want the components to be able to change, on the fly, the
information they are exchanging via CIDF without prearrangement. For
example, a new E-box brought into a system could broadcast its
capabilities and A-boxes could then request that the E-box start
sending them some subset of the newly available data.

The goal stated above is a research problem and is not amenable to a
near term solution.  However, there are some specific objectives that we
feel a dynamic negotiation protocol could accomplish in the near term
that would begin to address the more general problem.  These are:

 1. Identify the parties to participate in communication.
 2. Specify and distribute the packages to be used in
    communication.
 3. Specify and distribute filters to be used by producer(s).

These functions can be implemented as a preamble to normal CIDF
communication.

More details:

1.  Identify the parties to participate in communication.  We would like
to begin to address the question of how ID components locate other
components to communicate with.  This could be done by prearrangement
outside of CIDF, but we would like CIDF to address this problem.  A
component can broadcast a message providing its identity, how it can be
contacted, and a description of its capabilities and await a reply from
other components.  This can address both the situation of bringing new
components on board and components restarting after being off line.

2.  Specify and distribute the packages to be used in communication.
Packages are collections of SID definitions.  Typically, an ID component
deals in SIDs from a "small" number of packages.  The packages to be
used must be known to all producers and consumers of a collection of
gidos.  We want to specify a means by which the parties to communication
agree on the packages to be used and can obtain the necessary packages
if they do not already have them.


3.  Specify and distribute filters to be used by producer(s).  Filters
are agents that run within a producing component on behalf of a
consuming component and change a gido into a different form.  They
allow communication to be more efficient by limiting the data sent to
that which can actually be used by the consumer.  We want to provide a
means by which a consumer can specify or actually send a filter to a
producer.

################################################################
# We would like to prioritize these three functions and then
# proceed to specify a protocol to address each of them in order
# of priority. Comments and suggestions for priority are requested.
################################################################

========================================================================
========================================================================
=
=                            3: Encoding Gidos
=
========================================================================
========================================================================
=
========================================================================
= 3.1: Introduction to Gido Encoding
========================================================================

Encoding a gido into actual bytes for storage, transmission, etc.,
involves two things.  First, every gido is accompanied (in perpetuity)
by a static format header which contains basic information about that
gido.  This header format is described in Section 3.2.

Second, the S-expression which forms the payload of the gido must also
be encoded.  The method for doing this is covered in Section 3.3.


========================================================================
= 3.2: Gido Header
========================================================================

3.2.1: Introduction

The header definition, presented in this section, consists of a series
of constant fields that gido consumers can reliably parse to read basic
data common across all gidos.  The gido S-expression payload, presented
in a preceding section, contains the actual IDS component-specific data
structures, including semantic identifiers that allow gido consumers to
decode and interpret individual fields.

The gido header is used to convey information about the gido itself,
rather than details of the event, analysis report, or response
prescription (which are captured in the payload). Each CIDF-compliant
gido generated by any component MUST contain these fields in this order
(for this version). Consult Appendix A for details on type definition.

3.2.2: The Header Fields

1.  Version ID (type revision).  Indicates the format revision used
    to encode this gido.  Initially, the Version ID will indicate
    CIDF Version 1.0 (major = 1, minor = 0).  This Version ID will be
    incremented as future versions are introduced.  All current and
    future versions of this specification must reserve the first
    field of the gido header for the Version ID.  Gido consumers
    may reliably use this field to detect the format of the remainder
    of the gido.

#####################################################################
# Editor's Comment:  This field suggests that CIDF revision
# identifiers will follow a major.minor format.  The CIDF working
# group must decide if this is the proper revision format, and must
# then define the meaning of major and minor revision indicators.
#####################################################################

2.  Gido Length (4 octets, big-endian).  Indicates the byte length
    of the entire gido, including this header but excluding any
    optional digital signature.  This field may be used to cross-
    check gido completeness.

3.  Time Stamp (4 octets, big-endian).  Indicates the number of seconds
    since the Unix epoch (00:00:00 UTC, January 1, 1970).  This time
    refers to the moment that this report or request was generated.
    Specifically, it does not refer to the time that any events were
    first detected, or when they occurred; these (if they are known)
    are to be placed in the message payload itself.

4.  Thread ID (4 octets).  Used to identify gidos with some
    common thread; all gidos about a given event (e.g., first
    report followed by successive updates) would share the same
    Thread ID.


5.  Class ID (2 octets).  Indicates the category that the event,
    analysis, or response generator believes the gido falls under.
    Class IDs are defined in Section 3.2.3.  This field is intended
    to allow receivers to process high-priority gidos in a given
    field of expertise before all others.  Note that some codes are
    reserved for user-defined Class IDs; the receiver must check to
    see if prior agreement exists between sender and receiver on
    these codes.

6.  Originator ID (unknown type).  A unique identifier associated with
    the component generating this gido.

#####################################################################
# Editor's Comment:  The format and semantics of the Originator ID
# are an open issue that requires resolution by the CIDF working group.
# Specifically, how will CIDF modules be uniquely identified from other
# CIDF modules?
#####################################################################

7.  Flags (1 octet).  The bits of this flags octet are to be
    interpreted according to the following table:

    Bit         Meaning
    ---         -------
    0 (LSB)     set = optional signature present (see below).
                clear = no optional signature
    1-7 (MSB)   reserved  (MSB = most significant bit)
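
The fields whose widths are stated above can be packed with Python's
struct module, as in the following sketch.  It is illustrative only.
Version ID and Originator ID are left out because their encodings are
not fixed by this text, so the five packed fields are not contiguous in
a real header.  Big-endian order is stated only for Gido Length and
Time Stamp; using it for Thread ID and Class ID is an assumption, as are
the names FIXED_FIELDS and pack_fixed_fields.

```python
import struct

# Gido Length (4), Time Stamp (4), Thread ID (4), Class ID (2), Flags (1).
# ">" selects big-endian with no padding between fields.
FIXED_FIELDS = struct.Struct(">IIIHB")

FLAG_SIGNATURE_PRESENT = 0x01  # bit 0 (LSB) of the Flags octet


def pack_fixed_fields(gido_length, time_stamp, thread_id, class_id, signed):
    """Pack the fixed-width header fields (hypothetical helper)."""
    flags = FLAG_SIGNATURE_PRESENT if signed else 0
    return FIXED_FIELDS.pack(gido_length, time_stamp, thread_id,
                             class_id, flags)


packed = pack_fixed_fields(70, 900000000, 7, 3, signed=True)
length, stamp, thread, class_id, flags = FIXED_FIELDS.unpack(packed)
print(length, stamp, thread, class_id, bool(flags & FLAG_SIGNATURE_PRESENT))
```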

The gido payload, plain or compressed, immediately follows the header.
If bit 0 in Flags is cleared, indicating no optional signature, the gido
ends with the payload (indicated by the Gido Length header field).
Otherwise, if bit 0 is set, indicating that a digital signature of the
content is present, this signature is contained in a structure following
the gido payload.  Recall that the Gido Length header field indicates
the end of the gido payload, not including the signature structure.
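
A receiver could separate the gido from its optional signature
structure along the following lines.  This Python sketch assumes that
the Gido Length and Flags values have already been read from the header
and that the 2-octet Signature Length field is big-endian (its byte
order is not stated here).  The function name split_signature is
hypothetical.

```python
import struct


def split_signature(buf, gido_length, flags):
    """Return (gido_bytes, signature_structure_or_None)."""
    gido = buf[:gido_length]
    if not flags & 0x01:              # bit 0 clear: no optional signature
        return gido, None
    # Signature Length counts its own two octets.
    (sig_len,) = struct.unpack_from(">H", buf, gido_length)
    return gido, buf[gido_length:gido_length + sig_len]


# Ten octets of gido followed by a six-octet signature structure
# (two length octets plus four placeholder octets).
buf = b"G" * 10 + b"\x00\x06" + b"KSIG"
gido, signature = split_signature(buf, 10, 0x01)
print(len(gido), signature)
```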

The signature structure has the following fields in it:

1.  Signature Length (2 octets).  Indicates the length, including this
    field (signature length), of the signature structure, in octets.

2.  Key ID (type unknown).  Uniquely identifies the key used to
    generate the signature.  This ID may be understood only by a
    given receiver if the gido is to be sent one-to-one.  This
    field also implies the signature algorithm.

#####################################################################
# Editor's Comment: This issue is tied to that of the Originator ID.
#####################################################################


3.  Signature data.  The entire gido, as delimited by the Gido
    Length header field, is passed through a message digest function,
    resulting in a short, fixed-length quantity.  This quantity is
    then signed using the applicable encryption/signature algorithm,
    and the result of this operation is placed in this field.

3.2.3 Class ID Codes

The following default Class ID codes are defined for events and analysis
results.  Under this scheme, Class IDs 0 through 15 are reserved for
CIDF event priorities, and 16 through 31 are reserved for analysis
report priorities.  In addition, Class IDs 32 through 127 are reserved
for future CIDF extensions.  IDS developers may use the remaining range
(128 through 255) for application-specific purposes.

                          (Default Event Class IDs)
        00 - Complete Event
        01 - Intermediate Event
        02 - Incomplete Event
        03 - E-box Internal Error Report
        04 - E-box Internal Warning Report
        05 - E-box Internal Status Message
        06 - Reserved for E
        07 - Reserved for E
           :
        15 - Reserved for E
                        (Default Analysis Class IDs)
        16 - Critical Security Violation
        17 - Potential Security Violation
        18 - Suspicious Report
        19 - Warning Report
        20 - Intermediate Result
        21 - Informational Report
        22 - A-box Internal Error Report
        23 - A-box Internal Warning Report
        24 - Reserved for A
        25 - Reserved for A
           :
        31 - Reserved for A
                        (Reserved Priority Code Range)
#####################################################################
# Editor's Comment:  Class ID code range 32-48 is reserved for
# R-Box countermeasure directives.
#####################################################################
        32 - Reserved for future use
        33 - Reserved for future use
           :
        127 - Reserved for future use
                        (Undefined Priority Codes)
        128 - Undefined
           :
           : (Undefined values may be employed for
           :  application-specific purposes.)
        255 - Undefined
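
The ranges above can be checked with a small lookup such as the
following Python sketch.  The function name and the returned labels are
illustrative assumptions.  Values above 255 fit in the 2-octet field but
are not covered by the ranges listed here, so the sketch rejects them.

```python
def class_id_range(class_id):
    """Name the range from Section 3.2.3 that a Class ID falls in."""
    if not 0 <= class_id <= 255:
        raise ValueError("Class ID outside the 0-255 ranges listed above")
    if class_id <= 15:
        return "event"
    if class_id <= 31:
        return "analysis"
    if class_id <= 127:
        return "reserved"
    return "application-specific"


for cid in (3, 17, 40, 200):
    print(cid, class_id_range(cid))
```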


========================================================================
= 3.3:  Encoding S-Expressions
========================================================================

GIDO payloads consist of S-expressions.  For efficient transmission and
storage, these S-expressions are translated to an octet encoding.  This
section describes how to transform an S-expression into the appropriate
octet encoding.  This encoding is designed to meet the following
objectives:

    *   It must indicate the structure, so that a component ignorant
        of the elements within the S-expressions will still be able
        to parse the S-expressions.
    *   It must allow for pre-defined and distributed-out-of-band
        SIDs.
    *   It should be as compact as possible.

3.3.1: Octet Codes

The following codes will be used to represent various octet values in
the succeeding encoding specifications.  They are *not* S-expression
atoms.

    Code        Value       Interpretation
    ----        -----       --------------
    SEP         0xff        Used as separator.
    SOPEN       0xfe        S-expression open.
    PTR         0xfd        Pointer (referred to as @).
    SID         0xfc        Prelude to SID 2-octet code.
    TYPE        0xfb        Indicates concrete syntax type.

3.3.2: Encoding of S-Expression Grammar

What follows is the grammar for CIDF S-expressions.  After each line we
give the encoding applicable to that line.

    <item-list> ::= <item>
        E(<item-list>) = E(<item>)

    <item-list> ::= ( <item-list> )
        E(<item-list>) = E(<item-list>)

    <item-list> ::= <item-list> <item>
        E(<item-list>) = E(<item-list>) E(<item>)

    <item> ::= ( <sid-exp> <data-exp-list> )
        E(<item>) = SOPEN E(length{E(<sid-exp>) E(<data-exp-list>)})
                                   E(<sid-exp>) E(<data-exp-list>)
        E(length{X}) = var_encode(X)


    <item> ::= ( @ <sid-exp> <data-locator> )
        E(<item>) = SOPEN PTR E(<sid-exp>) E(<data-locator>)
        E(<data-locator>) = ascii_encode(<data-locator>)

    <item> ::= ( def <sid> <sid-exp> <semantics> )
        E(<item>) = SOPEN
                    E(length{E(def) E(<sid>) E(<sid-exp>)
                             E(<semantics>)})
                    E(def) E(<sid>) E(<sid-exp>) E(<semantics>)
        E(<sid>) = SID sid_encode(<sid>)
        E(<semantics>) = ascii_encode(<semantics>)

    <sid-exp> ::= <sid>
        E(<sid-exp>) = sid_encode(<sid>)

    <sid-exp> ::= '<type>:<sid>
        E(<sid-exp>) = TYPE type_encode(<type>) sid_encode(<sid>)

    <sid-exp> ::= ( <sid-exp-list> )
        E(<sid-exp>) = SOPEN E(length{E(<sid-exp-list>)})
                                      E(<sid-exp-list>)

    <sid-exp-list> ::= <sid-exp>
        E(<sid-exp-list>) = E(<sid-exp>)

    <sid-exp-list> ::= <sid-exp-list> <sid-exp>
        E(<sid-exp-list>) = E(<sid-exp-list>) E(<sid-exp>)

    <data-exp-list> ::= <data-exp>
        E(<data-exp-list>) = E(<data-exp>)

    <data-exp-list> ::= <data-exp-list> <data-exp>
        E(<data-exp-list>) = E(<data-exp-list>) E(<data-exp>)

    <data-exp> ::= <data>
        E(<data-exp>) = E(<data>)

    <data-exp> ::= ( <sid-exp> <data-exp-list> )
        E(<data-exp>) = SOPEN E(length{E(<sid-exp>) E(<data-exp-list>)})
                                       E(<sid-exp>) E(<data-exp-list>)
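
As a non-normative illustration, the encoding of the production
<item> ::= ( <sid-exp> <data-exp-list> ) can be sketched in Python.  The
sketch assumes that var_encode (Section 3.3.3) uses the fewest octets
that hold the integer, and the SID codes 0x002a and 0x002b are made-up
values, not codes taken from Appendix B.

```python
SOPEN = 0xFE  # octet code from Section 3.3.1


def var_encode(value: int) -> bytes:
    """One length octet followed by the big-endian integer (Section 3.3.3)."""
    # Assumption: the integer uses the fewest octets possible, at least one.
    n_octets = max(1, (value.bit_length() + 7) // 8)
    if n_octets > 255:
        raise ValueError("integer too long for a one-octet length")
    return bytes([n_octets]) + value.to_bytes(n_octets, "big")


def encode_item(sid_exp: bytes, data_exps: list) -> bytes:
    """SOPEN, then the var_encoded length of the body, then the body."""
    body = sid_exp + b"".join(data_exps)
    return bytes([SOPEN]) + var_encode(len(body)) + body


# Hypothetical SID code 0x002a carrying one ulong with the value 1.
inner = encode_item(b"\x00\x2a", [b"\x00\x00\x00\x01"])
# A <data-exp> may itself be an encoded ( <sid-exp> <data-exp-list> ).
outer = encode_item(b"\x00\x2b", [inner])
```

Because every parenthesized expression carries its own length, a parser
that does not recognize a SID can skip the whole expression without
understanding its contents.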

3.3.3: Auxiliary Functions

The following functions are used in the above syntax and encoding:

    ascii_encode(<string>) returns the ASCII-encoding of <string>.
    short_encode(<short>) returns the big-endian expression of
        <short>.  (E.g., short_encode(1234) = 0x04d2.)
    sid_encode(<sid>) returns the 2-octet code for <sid>.
    type_encode(<type>) returns the SEP-terminated code for <type>.

var_encode(<int>) encodes an arbitrarily long integer.  It is encoded as
follows:

    L1 | <int>


where L1 is one octet containing the length of <int> in octets, and
<int> is expressed in big-endian order.

3.3.4: Encoding Data

Data may be encoded in one of two ways.  If the applicable SID has a
fixed-length data type, then the data is encoded exactly as specified by
the type; e.g., a ulong is encoded as four octets in big-endian order.

Otherwise, the data is encoded as follows:

    var_encode(length{Data}) | Data

Thus, if Data is a variable-length data structure that is 84,000 bytes
long, then it is encoded as follows:

    03 01 48 20 xx xx xx ...
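
The two data encodings can be sketched in Python as follows.  This is an
illustration only: the function names are hypothetical, and the sketch
assumes that var_encode uses the fewest octets that hold the length.

```python
import struct


def var_encode(value: int) -> bytes:
    """One length octet (L1) followed by the big-endian integer."""
    n_octets = max(1, (value.bit_length() + 7) // 8)
    if n_octets > 255:
        raise ValueError("integer too long for a one-octet length")
    return bytes([n_octets]) + value.to_bytes(n_octets, "big")


def encode_ulong(value: int) -> bytes:
    """Fixed-length case: a ulong is four octets in big-endian order."""
    return struct.pack(">I", value)


def encode_variable(data: bytes) -> bytes:
    """Variable-length case: var_encode(length{Data}) | Data."""
    return var_encode(len(data)) + data


def decode_variable(buf: bytes, offset: int = 0):
    """Return (data, next_offset) for a variable-length datum at offset."""
    l1 = buf[offset]                      # number of octets in the length
    start = offset + 1 + l1               # first octet of Data
    length = int.from_bytes(buf[offset + 1:start], "big")
    end = start + length
    if end > len(buf):
        raise ValueError("truncated data")
    return buf[start:end], end


encoded = encode_variable(bytes(84000))
prefix = encoded[:4].hex(" ")             # the "03 01 48 20" prefix above
```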

3.3.5: SID Codes

SIDs are ordinarily encoded as 2-octet values.  A list of pre-defined
SIDs is given in Appendix B; if one exists for the purpose, it SHOULD be
used.  However, this encoding furnishes the ability to define new SIDs
should no applicable one exist, using the "def" operative.  For the
purposes of encoding, "def" is treated as a SID as well (i.e., it has
its own 2-octet code).

As noted in Section 3.3.2, this requires one to define a new SID code.
These SID codes may be unrestricted, but they should conform to the
following standard:

    * The code is a 2-octet value, as stated above.
    * The MSB (bit 7) of the first octet is the DYNAMIC bit.  If this
      bit is set, this is a dynamically-defined SID, and the code for
      the actual SID is given by bit 5 of the first octet through the
      LSB (bit 0) of the second octet.  If it is clear, this is a
      statically-defined SID, and the code for the SID is as given in
      the appendix.
    * If the DYNAMIC bit is set, the 2-octet value is followed by a
      4-octet value representing the UUID of the SID designer.  Also,
      the next bit (bit 6 of the first octet) is the EXPERIMENTAL bit.
      If *this* bit is set, then the SID is ephemeral and cannot be
      relied on in future encodings.  If it is clear, then this is a
      stable SID.
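
A reader of SID codes following this convention might be sketched as
below.  The function name and the returned dictionary are hypothetical,
and the sketch assumes that the 2-octet code is read in big-endian order
and that the 4-octet designer value is kept as opaque octets.

```python
DYNAMIC_BIT = 0x8000       # bit 7 of the first octet
EXPERIMENTAL_BIT = 0x4000  # bit 6 of the first octet
DYNAMIC_SID_MASK = 0x3FFF  # bit 5 of the first octet through bit 0 of the second


def parse_sid_code(buf: bytes) -> dict:
    """Split a SID code into its flags, SID value, and designer octets."""
    if len(buf) < 2:
        raise ValueError("a SID code needs at least 2 octets")
    code = int.from_bytes(buf[:2], "big")
    if not code & DYNAMIC_BIT:
        # Statically-defined SID: the code is looked up in the appendix.
        # The EXPERIMENTAL bit is only defined for dynamic SIDs.
        return {"dynamic": False, "experimental": None,
                "sid": code, "designer": None, "size": 2}
    if len(buf) < 6:
        raise ValueError("a dynamic SID code is followed by 4 designer octets")
    return {"dynamic": True,
            "experimental": bool(code & EXPERIMENTAL_BIT),
            "sid": code & DYNAMIC_SID_MASK,
            "designer": buf[2:6],
            "size": 6}
```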


========================================================================
========================================================================
=
=                           4: CIDF Communication
=
========================================================================
========================================================================
=
========================================================================
= 4.1: Message Layer
========================================================================

4.1.1: Rationale for Message Layer

The CIDF message layer was developed to solve problems of
synchronization (i.e., blocking vs.  non-blocking processes) and
problems of different data formats for different operating systems.  It
also solves the problem that different groups will use different
programming languages.  In other words, the use of a messaging format
achieves the following goals:

    * Independent of blocking/non-blocking processes
    * Data format independent
    * Operating system independent
    * Programming language independent

4.1.2: Objectives of the CIDF Message Layer

The top-level objectives for the CIDF message layer are to

    * Provide an open architecture.
    * Avoid imposing architectural constraints or assumptions on the
      systems or modules.
    * Allow messaging independent of language, operating system, and
      network protocol.
    * Support easy addition of new components to the CIDF.
    * Support security requirements for authentication and privacy.
    * Support devices that don't want to fully support CIDF.

4.1.3: Message Format

This message structure resides on top of the negotiated transport layer
service.  Note that all reserved fields are set to 0 on transmission and
ignored on receipt.


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    | Control Byte  |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |                   Reserved                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Length                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Time Stamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Destination Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options (variable)                         |
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Payload Data (variable)                    |
   ~                                                               ~
   |                                                               |
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |     Privacy Trailer* (variable)               |
   +-+-+-+-+-+-+-+-+                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * if privacy option is used

Options all have a common type-length-value format described below.

    * Version - 1 octet.  CIDF message-layer version (1 for this
      initial version).

    * Control Byte - 1 octet.  Used by the message layer to support
      reliable transmission, flow control, and security association
      management.
      - Acknowledgement of a delivered message (1).
      - Message received, but not delivered because of lack of
        resources (2).
      - Message received, but the supplied security association was
        not available to allow processing (4).

    * Checksum - 2 octets.  A checksum across the entire CIDF message,
      prior to application of cryptographic mechanisms (i.e., privacy
      and authentication transforms).  The checksum is computed as
      specified in the TCP standard (RFC 793).


    * Next Header - 1 octet.  Defines the type of either the next
      message layer option or application.  The following are the
      currently defined types.
      - Application Header (1)
      - Route List (4)
      - Privacy Header (50)
      - Authentication Header (51)

    * Length - 4 octets.  Length of the CIDF message, including
      message header.

    * Sequence Number - 4 octets.  Message layer sequence number used
      for message reliability (acknowledgement and duplicate removal)
      and to support protection against message replay.

    * Time Stamp - 4 octets.  Used to provide loose time
      synchronization between CIDF communicating parties and to
      support tardy delivery detection (from denial of service).

    * Destination Address - 4 octets.  IP address of the target of
      this message.  This field identifies the eventual recipient of
      the CIDF message and is used to route CIDF messages through
      intermediate CIDF nodes that cannot be traversed by normal
      network routing (e.g., firewalls).
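
The fixed part of the header (everything before the options) occupies
24 octets.  The following Python sketch packs and unpacks it.  It is an
illustration under stated assumptions: multi-octet fields are taken to
be in network byte order, which this section does not state explicitly,
the Time Stamp is treated as an opaque 32-bit value, and the function
names are hypothetical.

```python
import socket
import struct

# Version, Control Byte, Checksum, Next Header, Reserved (3 octets),
# Length, Sequence Number, Time Stamp, Destination Address.
HEADER = struct.Struct(">BBHB3sIII4s")


def pack_header(next_header: int, length: int, sequence: int,
                timestamp: int, destination: str,
                control: int = 0, checksum: int = 0,
                version: int = 1) -> bytes:
    """Build the fixed header; reserved octets are sent as zero."""
    return HEADER.pack(version, control, checksum, next_header,
                       b"\x00\x00\x00", length, sequence, timestamp,
                       socket.inet_aton(destination))


def unpack_header(buf: bytes) -> dict:
    """Parse the fixed header; the reserved octets are ignored."""
    (version, control, checksum, next_header, _reserved,
     length, sequence, timestamp, destination) = HEADER.unpack_from(buf)
    return {"version": version, "control": control, "checksum": checksum,
            "next_header": next_header, "length": length,
            "sequence": sequence, "timestamp": timestamp,
            "destination": socket.inet_ntoa(destination)}


# 192.0.2.1 is a documentation address used purely as an example.
hdr = pack_header(next_header=1, length=24, sequence=7,
                  timestamp=0x12345678, destination="192.0.2.1")
```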

4.1.4: Message Layer Protocol Options

Except for the CIDF privacy option, CIDF message options use the
following format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |    Length     |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                     Option Data (variable)                    |
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * Next Header - 1 octet.  Defines the type of either the next
      message layer option or application, with the same permitted
      values as defined above.

    * Length - 1 octet.  Specifies the number of 32-bit words for this
      option, including the Next Header and Length fields.

    * Option Data - variable length.  The option data field is always
      padded to a 32-bit aligned size.
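
A minimal Python sketch of this option format follows.  The function
names are hypothetical, and the use of zero octets for padding is an
assumption; the text only requires that the option be padded to a
32-bit aligned size.

```python
def encode_option(next_header: int, data: bytes) -> bytes:
    """Next Header | Length (in 32-bit words) | Option Data, padded."""
    unpadded = 2 + len(data)             # two header octets plus the data
    padded = (unpadded + 3) // 4 * 4     # round up to a 32-bit boundary
    words = padded // 4
    if words > 255:
        raise ValueError("option too long for a one-octet Length")
    return bytes([next_header, words]) + data + bytes(padded - unpadded)


def decode_option(buf: bytes, offset: int = 0):
    """Return (next_header, option_data_with_padding, next_offset)."""
    next_header, words = buf[offset], buf[offset + 1]
    if words == 0:
        raise ValueError("Length must cover the header octets")
    end = offset + 4 * words
    if end > len(buf):
        raise ValueError("truncated option")
    return next_header, buf[offset + 2:end], end
```

Note that the format carries no unpadded length, so a decoder sees the
option data together with its padding; any option that needs an exact
length has to encode it inside the option data.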

4.1.4.1: Route List Option


Route List is a variable length field that specifies the CIDF nodes
through which the message is to be routed for source routing, and
through which the message has been routed for recorded routing.  The
Subtype field indicates whether this is a source or record route.  The
Route List has the following format.  The route list option is used when
the message destination and source are separated by CIDF nodes that
cannot be traversed by normal network routing (e.g., firewalls).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |    Length     |    Subtype    |     Index     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Route Data (variable)                    |
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * Next Header and Length are defined above.

    * Subtype - 1 octet.  Specifies whether this is a recorded route
      or a source route.
      - Recorded Route (1)
      - Source Route (2)

    * Index - 1 octet.  Index into the array of addresses specifying
      the current address to be processed.  For source routing, this
      is the address of the next CIDF hop.  For recorded routes, this
      is the address of the last transmitting CIDF node.

    * Route Data - variable length.  This field is an array of
      Internet addresses.  Each internet address is 32 bits or 4
      octets.  For a source route, if the index is greater than the
      length, the source route is empty and the routing is to be based
      on the destination address field.  For a recorded route, if the
      index is greater than the length, the recorded route list is full.
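
The Route List layout can be sketched in Python as below.  The function
names are hypothetical.  The sketch stores and returns the Index octet
without interpreting it, because the text does not say whether the
index is counted from 0 or from 1.

```python
import socket

RECORDED_ROUTE = 1
SOURCE_ROUTE = 2


def build_route_list(next_header: int, subtype: int, index: int,
                     hops: list) -> bytes:
    """Next Header | Length | Subtype | Index | array of IPv4 addresses."""
    route_data = b"".join(socket.inet_aton(hop) for hop in hops)
    words = 1 + len(hops)                # one header word plus one per hop
    if words > 255:
        raise ValueError("too many hops for a one-octet Length")
    return bytes([next_header, words, subtype, index]) + route_data


def parse_route_list(option: bytes) -> dict:
    """Split a Route List option into its fields and dotted-quad hops."""
    next_header, words, subtype, index = option[:4]
    if words < 1 or len(option) < 4 * words:
        raise ValueError("truncated route list")
    route_data = option[4:4 * words]
    hops = [socket.inet_ntoa(route_data[i:i + 4])
            for i in range(0, len(route_data), 4)]
    return {"next_header": next_header, "subtype": subtype,
            "index": index, "hops": hops}


# Documentation addresses used purely as an example.
option = build_route_list(1, SOURCE_ROUTE, 0, ["192.0.2.1", "198.51.100.7"])
```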

4.1.4.2: Privacy Option

The CIDF privacy option supports both unicast and multicast privacy.  For
multicast privacy, one node of the multicast group is selected to
generate the keys.  The keys are then distributed to each multicast
group member.  For unicast privacy, each node generates its own privacy
keys which are distributed to the remote party.


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Key Generator Identity                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Security Parameters Index (SPI)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Payload Data* (variable)                   |
   ~                                                               ~
   |                                                               |
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |     Padding (0-255 bytes)                     |
   +-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |  Pad Length   | Next Header   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * (footnote) If the cryptographic algorithm requires use of an
      initialization vector, then that vector is placed as clear text
      between the SPI and Payload Data.

    * Key Generator Identity - 4 octets.  This value identifies
      the CIDF entity that generated the key.  The initial use of
      this field is to specify either the key generator's IP address
      or for multicast applications the multicast address for the
      multicast group using this security association.

    * Security Parameters Index (SPI) - 4 octets.  The SPI is an
      arbitrary 32-bit value that uniquely identifies the Security
      Association for this message, relative to the key generator
      identity.

    * Padding - variable length.  The transmitter may add up to 255
      bytes of padding if required to support the block size of the
      cryptographic algorithm.  Padding is required to ensure that
      after the privacy option is applied, the message ends on a
      4-byte boundary.

    * Pad Length - 1 octet.  The number of padding bytes immediately
      preceding it.  The range of valid values is 0-255, where a
      value of zero indicates that no Padding bytes are present.

    * Next Header is defined above.
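
The padding rule can be sketched in Python as follows.  This is an
illustration under stated assumptions: the function names are
hypothetical, padding octets are zero, and the Key Generator Identity,
SPI, and any initialization vector are assumed to total a multiple of 4
octets, so that aligning the encrypted portion is sufficient.  No
encryption is performed here.

```python
from math import gcd


def build_privacy_plaintext(payload: bytes, next_header: int,
                            block_size: int = 1) -> bytes:
    """Payload | Padding | Pad Length | Next Header, ready for encryption."""
    # Align to both the cipher block size and the 4-byte boundary.
    align = block_size * 4 // gcd(block_size, 4)
    pad_length = -(len(payload) + 2) % align
    if pad_length > 255:
        raise ValueError("padding does not fit in the Pad Length octet")
    return payload + bytes(pad_length) + bytes([pad_length, next_header])


def strip_privacy_trailer(plaintext: bytes):
    """Return (payload, next_header) from a decrypted privacy body."""
    if len(plaintext) < 2:
        raise ValueError("missing Pad Length and Next Header")
    pad_length, next_header = plaintext[-2], plaintext[-1]
    if pad_length > len(plaintext) - 2:
        raise ValueError("Pad Length exceeds the message")
    return plaintext[:len(plaintext) - 2 - pad_length], next_header
```

For example, a 5-octet payload with an 8-octet block cipher receives one
padding octet, giving an 8-octet body to encrypt.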

4.1.4.3: Authentication Header Option


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |    Length     |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Key Generator Identity                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Security Parameters Index (SPI)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Authentication Data (variable)                 |
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * Next Header and Length are defined above.

    * Key Generator Identity - 4 octets.  This value identifies
      the CIDF entity that generated the key.  The initial use of
      this field is to specify either the key generator's IP address
      or for multicast applications the multicast address for the
      multicast group using this security association.

    * Security Parameters Index (SPI) - 4 octets.  The SPI is an
      arbitrary 32-bit value that uniquely identifies the Security
      Association for this message, relative to the key generator
      identity.

    * Authentication Data - variable number of 32-bit words.  The data
      (e.g., digital signature or keyed hash) used to provide
      cryptographic authentication.

4.1.5: Cryptographic Mechanisms

The CIDF message layer protocol provides data integrity and source
authentication services for the negotiation phase of CIDF communication.
This enables components to reliably establish communications with
minimal security overhead.  During the negotiation phase, the client and
server determine the specific cryptographic services to be provided for
further communication.

The message layer provides the cryptographic mechanisms as options,
enabling use of lower-level services (e.g., IPSEC), CIDF-specific
mechanisms, or no cryptographic services, depending on application
requirements.

The mechanisms used are determined by the client based on the mechanisms
supported by the server.  The message layer mechanisms provide the
fields necessary to (1) determine the cryptographic services applied (if
any), (2) determine the cryptographic context, and (3) provide
timeliness and replay protection.

4.1.6: Negotiation Mechanism

4.1.6.1: Introduction


Our approach is to use the simplest reliable transport mechanism
available (i.e., reliable CIDF messaging over UDP) as the default CIDF
transport protocol.  This simple protocol can then be used to negotiate
a more or less complex protocol for those components requiring
additional transport-layer services.  This allows simple devices to
participate easily, while allowing complex devices to take full
advantage of other transport-layer mechanisms.  The message layer
provides optional services to compensate for weaknesses in the transport
layer.  The combination of the CIDF message layer with transport-layer
options provides a range of communication capabilities that can be used
to support different application requirements.  The following types of
transport/messaging are initially envisioned:

    * No assured delivery over a connection-less transport.  That is,
      the CIDF message layer without acknowledgement and
      retransmission directly over UDP.

    * Assured delivery over a connection-less transport.  That is, the
      CIDF message layer with reliable delivery (acknowledgement,
      retransmission, and duplicate removal) over UDP.

    * Assured delivery over a connection-oriented transport.  That is,
      the CIDF message layer directly over TCP.

    * Object-oriented transport.  That is, the CIDF operations over
      CORBA.

To enable support for components that must use minimal communication
infrastructure, the default transport mechanism is based on UDP. The
following sections define the default transport layer protocol, CIDF
security services, and the transport negotiation mechanisms.

4.1.6.1.1: Rationale for negotiated transport layer

The simplest approach would be to mandate the use of a single transport
protocol.  But there is no one protocol that can adapt to the varying
requirements of all anticipated CIDF applications.  Depending on whether
an application is concerned with real-time traffic or simple accrual of
a database of events, different transport mechanisms are appropriate.

Specifically, some CIDF applications require a very light-weight
communication channel that does not have the resource usage required by
current TCP implementations, while other applications require a flexible
and robust communication channel such as TCP. Other requirements include
application-specific support for multicast, which is not supported by
TCP. Therefore, we have requirements for connectionless communication,
reliable connectionless communication, and reliable connection-oriented
communication.  Additionally, we have varying requirements for security
services.  In some applications and environments, the infrastructure
provides adequate security services.  In other applications, we require
CIDF-layer security services for authentication, privacy, or both.


Nevertheless, communications clearly cannot begin between two specific
components until a channel is agreed upon.  At the very least, this
implies that if we don't agree on a single channel for all transport, we
need to agree on a single channel for transport negotiation.

This channel needs to be widely supported and freely available.
Components are allowed to share data on whatever channel they wish, but
they must support channel negotiation on the common mechanism.

To support this range of requirements we provide a protocol based on the
reliable UDP variant of CIDF that enables applications to agree upon the
desired transport protocol, plus the desired CIDF message-layer security
services.  This exchange is only necessary if the participants have not
previously agreed upon a transport mechanism through external mechanisms
(e.g., local configuration settings or through the CIDF directory
service).

4.1.6.2: Default Transport Layer

The default transport layer protocol for CIDF messages is reliable CIDF
messaging over UDP.  Other transport layer protocols may be used once
the client and server have negotiated, over this default protocol, the
protocols and services that each requires and supports.  Until we
acquire a well-known CIDF port number, we will use 0x0CDF as the CIDF
port.  The CIDF message layer will listen on the CIDF well-known port
for incoming CIDF messages.

4.1.6.3: Conformant transport options

    * CIDF message layer without acknowledgement and retransmission
      directly over UDP.

    * CIDF message layer with acknowledgement and retransmission over
      UDP.

    * CIDF message layer directly over TCP.

4.1.6.4: Option Negotiation Message Formats

The negotiation for more advanced communication services occurs over a
UDP channel using only the CIDF message layer with authentication
mechanisms enabled.  This enables components that do not support TCP to
participate in CIDF. Negotiation occurs by the client querying the
server's capabilities.  In response, the server specifies the class of
CIDF operations supported, message services supported, and whether
extensions are supported.  The client then selects the services and
message mechanisms.  This information can also be provided by the
directory server.

The CIDF transport negotiation protocol resides directly over the CIDF
message layer.  The query-response data format is shown below.  We
assume that for cryptographic services, the negotiation of the specific
algorithms and modes is handled by the key distribution mechanism.


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Type     |    Length     |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Option Request (variable)                   |
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * Type - 1 octet.  Specifies the type of request.  For option
      negotiation messages, this value is 1.

    * Length - 1 octet.  Specifies the number of 32-bit words for this
      message, including the type and length fields.

Option Requests are formatted as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Request    |    Length     |    Option     |   Selection   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Option Parameters (variable)                 |
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * Request - 1 octet.  Specifies the type of request.  The
      following request types are currently supported.
      - Want (1) - Preferred service.
      - Can (2) - Sender is capable of using this service.

    * Length - 1 octet.  Specifies the number of 32-bit words for this
      option request, including the request and length fields.

    * Option - 1 octet.  The option being negotiated.  The following
      option types are currently supported.
      - Transport (1)
      - Privacy (2)
      - Authentication (3)

    * Selection - 1 octet.  The option value being negotiated.  The
      meaning of this field depends on the option being negotiated.
      The following selection values are currently supported.

      For Transport negotiation.
      - None (0).  Used to reject communication with another CIDF node
        when no acceptable options are received.
      - UDP (1)
      - Reliable UDP (2)
      - TCP (3)


      For Privacy negotiation.
      - None (0)
      - IPSEC (1)
      - SSL (2)
      - CIDF (3)

      For Authentication negotiation.
      - None (0)
      - IPSEC (1)
      - SSL (2)
      - CIDF (3)

Currently, the only option parameter specified is the selection of
TCP/UDP port number for transport negotiation, which is formatted as
follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Type     |    Length     |     Transport Port Number     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    * Type - 1 octet.  Specifies the type of option parameter.  For port
      numbers, this value is 1.

    * Length - 1 octet.  Specifies the number of 32-bit words for this
      message, including the type and length fields.

    * Transport Port Number - 2 octets.  This specifies on which port
      number the sender of the message will listen following completion
      of negotiation.  Both ends of the channel select their own
      respective ports.
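
The negotiation formats above can be sketched in Python as follows.
The function and constant names are hypothetical, and network byte
order is assumed for the port number.

```python
import struct

WANT, CAN = 1, 2                           # Request values
TRANSPORT, PRIVACY, AUTHENTICATION = 1, 2, 3   # Option values
UDP, RELIABLE_UDP, TCP = 1, 2, 3           # Transport selections
PORT_PARAMETER = 1                         # option parameter Type


def encode_option_request(request: int, option: int, selection: int,
                          port=None) -> bytes:
    """Request | Length | Option | Selection, plus an optional port."""
    params = b""
    if port is not None:
        # Type | Length (one 32-bit word) | Transport Port Number
        params = struct.pack(">BBH", PORT_PARAMETER, 1, port)
    words = 1 + len(params) // 4
    return bytes([request, words, option, selection]) + params


def encode_negotiation(requests: list) -> bytes:
    """Type (1) | Length | Reserved, followed by the option requests."""
    body = b"".join(requests)
    words = 1 + len(body) // 4
    if words > 255:
        raise ValueError("too many requests for a one-octet Length")
    return struct.pack(">BBH", 1, words, 0) + body


# Prefer TCP on port 0x0CDF, but reliable UDP is also acceptable.
message = encode_negotiation([
    encode_option_request(WANT, TRANSPORT, TCP, port=0x0CDF),
    encode_option_request(CAN, TRANSPORT, RELIABLE_UDP),
])
```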

4.1.6.5: Protocol Description

Identification of the remote CIDF component's IP address is handled
either through manual configuration or through the CIDF directory
service.  Note that this service may also indicate the CIDF component's
capabilities (can) and preferences (want) for transport and security
services.

When Sender S wishes to communicate with Receiver R, and the two
components have not yet agreed on a transport mechanism, then S must
initiate transport mechanism negotiation.

S sends a negotiation message to R on the CIDF well-known port
indicating the services preferred (if any) and permitted.  S includes a
separate option request for each supported option, indicating the
preferred option (if any).

When R receives an option negotiation, R selects the desired value using
local preferences if supported by S, S's preferences if supported
locally, or the intersection of local and S's capabilities if the
preferences are not specified or supported.


If the local and remote capabilities do not permit communication, then R
selects a transport option of None, indicating that communication is not
feasible.

R responds with only the selected options for transport, privacy, and
authentication identified as preferred options.
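
One possible reading of R's selection rule, for a single option such as
Transport, is sketched below in Python.  The function name is
hypothetical.  The text does not say how to choose among several
common capabilities when neither preference is usable, so picking the
lowest-numbered common value is purely an assumption of this sketch.

```python
NONE = 0  # Selection value meaning that no acceptable option exists


def select_value(local_want, local_can, remote_want, remote_can) -> int:
    """Choose the selection value R returns for one negotiated option."""
    # A preferred ("want") value is also something the party can use.
    local_all = set(local_can)
    if local_want is not None:
        local_all.add(local_want)
    remote_all = set(remote_can)
    if remote_want is not None:
        remote_all.add(remote_want)

    # 1. Local preference, if S supports it.
    if local_want is not None and local_want in remote_all:
        return local_want
    # 2. S's preference, if supported locally.
    if remote_want is not None and remote_want in local_all:
        return remote_want
    # 3. Otherwise anything both sides are capable of.
    common = local_all & remote_all
    # Assumption: break ties by taking the lowest-numbered value.
    return min(common) if common else NONE
```

With Transport selections numbered as in Section 4.1.6.4, a receiver
that prefers TCP (3) and a sender that prefers reliable UDP (2) but can
also use TCP would settle on TCP.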

========================================================================
= 4.2: Message Layer Processing
========================================================================

4.2.1: Introduction

This section describes the processing of CIDF message layer messages.
The standard procedures are used for CIDF messages independent of the
transport layer.  The reliable transmission procedures describe
additional procedures to be used when the transport mechanism is
reliable UDP. CIDF privacy and authentication procedures describe the
procedures used in providing CIDF layer privacy and authentication
mechanisms, respectively.

4.2.2: Standard Procedures

Each CIDF message uses the standard CIDF header.

4.2.2.1: Outbound Message Processing

On request by the application layer to transmit a CIDF message, the CIDF
message layer shall build the message header and append the message.

If the application indicates that this message requires source routing,
then the CIDF message layer shall use the supplied source route list.

If the application indicates that this message requires recorded
routing, then the CIDF message layer shall initialize the record route
list, placing the outgoing IP address as the first entry on the route
list.

The CIDF message layer shall insert the current CIDF version number, the
application-provided destination, and the current time as the CIDF
header time-stamp.

The CIDF message layer shall insert a new sequence number.  The sequence
number is initialized to 0, and incremented for each message sent by the
CIDF message layer.

The CIDF message layer shall compute the total message length and insert
that length into the Length field.

The CIDF message layer should compute and insert the checksum prior to
message transmission.  The checksum is inserted prior to applying CIDF
privacy or authentication mechanisms.
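
The TCP checksum of RFC 793 is the 16-bit one's complement of the one's
complement sum of all 16-bit words, with an odd trailing octet padded
with a zero octet.  A Python sketch follows.  It assumes that the
Checksum field holds zero while the sum is computed and that no
pseudo-header is included; neither point is stated in this section.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"                  # pad an odd trailing octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                   # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)     # 0x220d for these octets
```

A receiver can verify a message by summing it with the transmitted
checksum in place; for an intact message the function then returns 0.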


If CIDF privacy or authentication is being used, the CIDF message layer
shall encrypt and generate the authentication data for the message based
on the current security association in use with the recipient.  If CIDF
privacy or authentication is being used and no security association
exists, then the message transmission request should be rejected.

4.2.2.2: Inbound Message Processing

If the Version field is not a valid CIDF version number (currently 1),
the CIDF message layer shall discard the message.

If CIDF privacy or authentication is being used, the CIDF message layer
shall decrypt and authenticate the message, and discard the message on
failure.  If the failure is due to the lack of a valid security
association, the CIDF message layer should send a response to the
source.  The response is the CIDF message layer header, with the Control
Byte set to 4.

If the Checksum field is not 0, the CIDF message layer shall compute the
message checksum (using the method described in RFC 793) and discard the
message if the Checksum check fails.

If the Time Stamp field indicates an unexpected delay, the CIDF message
layer should notify the application.

If the Destination Address is not the local CIDF node (i.e., the
destination does not match the local node's address or any multicast
address that the local node is using), the CIDF message layer shall
determine the next CIDF hop (using the source route, if provided) and
forward the message after adjusting the Sequence Number and Time Stamp.
If the message includes a record route option, then the CIDF message
layer shall enter its outgoing IP address if there is sufficient room in
the record route structure and increment the route index.  After
processing, the CIDF node should compute the checksum as specified in
RFC 793, and place the checksum in the Checksum field.  Finally, the
message layer shall apply the privacy and authentication transforms for
the next CIDF hop and transmit the message.

4.2.3: Reliable Transmission Procedures

4.2.3.1: Outbound Message Processing

For reliable message transmission, the CIDF message layer shall maintain
the round-trip latency and mean deviation values for each node with
which the local component communicates.  These values are used in
determining the timeout values for message transmission.  The CIDF
message layer shall use the standard TCP algorithm for computing message
layer timeouts.
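
As an illustration, the timeout computation can be sketched with the
smoothed round-trip time and mean deviation estimator commonly used by
TCP implementations.  The gains (1/8 and 1/4), the factor of 4, and the
class and method names are assumptions made for this sketch; the
specification only requires "the standard TCP algorithm".

```python
class RttEstimator:
    """Per-node round-trip latency and mean deviation tracker."""

    def __init__(self, first_sample: float):
        self.srtt = first_sample        # smoothed round-trip latency
        self.rttvar = first_sample / 2  # mean deviation

    def update(self, sample: float) -> None:
        """Fold a new round-trip measurement into the estimates."""
        err = sample - self.srtt
        self.srtt += err / 8
        self.rttvar += (abs(err) - self.rttvar) / 4

    def timeout(self) -> float:
        """Timeout to use for the next transmission to this node."""
        return self.srtt + 4 * self.rttvar
```

A message layer would keep one such estimator per remote node, for
example in a dictionary keyed by node address.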

On reliable message transmission, the CIDF message layer shall retain a
copy of the message for retransmission purposes and set a timer for the
message.  If the timer expires before the message is acknowledged, the
message layer shall retransmit the message up to a maximum of 5
retries.


On reception of an acknowledgement for the CIDF message, the CIDF
message layer shall remove the message from the retransmission queue.
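
The outbound procedure can be sketched as a small retransmission queue.
This is an illustration only: the class, the (destination, sequence
number) key, and the behaviour after the fifth retry (drop the message
and report it to the caller) are assumptions, since the text above does
not say what happens once the retries are exhausted.

```python
MAX_RETRIES = 5  # from the text above


class RetransmitQueue:
    def __init__(self, send):
        self._send = send    # callable(dest, seq, message), hypothetical
        self._pending = {}   # (dest, seq) -> [message, deadline, retries]

    def transmit(self, dest, seq, message, now, timeout):
        """Send a message and retain a copy for retransmission."""
        self._send(dest, seq, message)
        self._pending[(dest, seq)] = [message, now + timeout, 0]

    def on_ack(self, dest, seq):
        """Remove an acknowledged message from the queue."""
        self._pending.pop((dest, seq), None)

    def on_tick(self, now, timeout):
        """Retransmit expired messages; return the keys that were given up."""
        failed = []
        for key, entry in list(self._pending.items()):
            message, deadline, retries = entry
            if now < deadline:
                continue
            if retries >= MAX_RETRIES:
                del self._pending[key]  # assumption: give up and report
                failed.append(key)
                continue
            self._send(key[0], key[1], message)
            entry[1] = now + timeout
            entry[2] = retries + 1
        return failed
```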

4.2.3.2: Inbound Message Processing

On message reception, the CIDF message layer shall send a CIDF
acknowledgement to the source.  If the message layer can deliver the
message to the application layer, then the Control Byte shall be 1.
Otherwise, the Control Byte shall be 2.  The acknowledgement message is
identical to the original message header except for the Control Byte.

The CIDF message layer shall use the source IP address and the sequence
number to ensure that duplicate messages are not delivered to the
application layer.
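
A minimal sketch of the inbound side follows.  The Control Byte values
1 and 2 come from the text above.  Re-acknowledging a duplicate with
Control Byte 1 and keeping an unbounded set of seen messages are
assumptions of the sketch; a real implementation would bound that
state.

```python
ACK_DELIVERED = 1      # Control Byte when the message was delivered
ACK_NOT_DELIVERED = 2  # Control Byte otherwise


class ReliableReceiver:
    def __init__(self, deliver):
        self._deliver = deliver  # callable(payload) -> bool, hypothetical
        self._seen = set()       # (source IP address, sequence number)

    def on_message(self, source_ip, seq, payload):
        """Process one message; return the Control Byte for the ack."""
        key = (source_ip, seq)
        if key in self._seen:
            # Duplicate: acknowledge again, but do not deliver twice.
            return ACK_DELIVERED
        if self._deliver(payload):
            self._seen.add(key)
            return ACK_DELIVERED
        return ACK_NOT_DELIVERED
```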

4.2.4: CIDF Privacy Procedures

4.2.4.1: Outbound Message Processing

The CIDF message layer encapsulates the CIDF Application Data with a
CIDF privacy header and trailer, then encrypts the CIDF Application Data
and the CIDF privacy trailer.  The original CIDF Header is retained,
except for the CIDF Next Type, which is modified to indicate that this
is a CIDF encrypted message.

On message transmission, if the CIDF message layer is applying CIDF
privacy mechanisms for the message, the CIDF message layer shall
determine the security association (which determines the algorithm) for
the message target, add any required padding, compute and insert the
padding length in the trailer, insert the next header in the trailer,
perform the cryptographic transform over the resulting plain-text
message, and insert the security association identity (Key
Generator Identity and SPI) before the resulting ciphertext.  If an
initialization vector is required for the cryptographic transform, it
shall be inserted between the resulting ciphertext and the privacy
header.

The next header in the CIDF message layer is then set to 50.

4.2.4.1.1 Message Encryption

The CIDF message layer encapsulates the original CIDF application data
into the CIDF Application Data field, which includes any necessary
padding, and encrypts the result (Application Data, Padding, Pad Length,
and Next Header) using the Message Encryption Key, encryption algorithm,
and algorithm mode indicated by the security association.
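
The plain-text that is handed to the cipher can be sketched as follows.
The one-octet width of the Pad Length and Next Header fields, the
zero-valued padding, and the default block size are assumptions made
for illustration; the encryption step itself is left out because it
depends on the algorithm named by the security association.

```python
def build_privacy_plaintext(app_data: bytes, next_header: int,
                            block_size: int = 8) -> bytes:
    """Return Application Data + Padding + Pad Length + Next Header.

    The result is a multiple of `block_size` octets long, so it can be
    fed directly to a block cipher.
    """
    if not 1 <= block_size <= 256:
        raise ValueError("pad length must fit in one octet")
    pad_len = -(len(app_data) + 2) % block_size
    return app_data + bytes(pad_len) + bytes([pad_len, next_header])
```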

4.2.4.1.2 Encryption Algorithms


The security association specifies the encryption algorithm to be used.
The CIDF privacy option is designed for use with symmetric encryption
algorithms.  Because the messages may arrive out of order, each message
must carry any data required to allow the receiver to establish
cryptographic synchronization for decryption.  This data may be carried
explicitly in the Application Data field (e.g., as an IV as described
above) or the data may be derived from the message header.  Since the
CIDF privacy option makes provision for padding of the plain-text,
encryption algorithms employed with the CIDF privacy option may exhibit
either block or stream mode characteristics.

4.2.4.2 Inbound Message Processing

Upon receipt of a CIDF message containing a CIDF privacy header, the
CIDF message layer looks up the security association, and regenerates
the CIDF application data.

4.2.4.2.1 Security Association Lookup

The Security Association information is included in the CIDF privacy
header.

The CIDF message layer looks up the appropriate algorithm and Message
Encryption Key for decryption, based on the SPI and Key Generator's
identity from the CIDF privacy header.

If no valid algorithm and key exists for this message, the receiver MUST
discard the message.

4.2.4.2.2 Message Decryption

The receiver decrypts the CIDF Application Data, Padding, Pad Length,
and Next Header using the neighborhood Message Encryption Key that has
been established for this neighborhood traffic.  If an explicit IV is
present in the payload field, it is input to the decryption algorithm
per the algorithm specification.  If an implicit IV is employed, a local
version of the IV is constructed and input to the decryption algorithm
per the algorithm specification.

After decryption, the original CIDF message is reconstructed and
processed per the normal CIDF protocol specification.  At a minimum, the
Next Header field in the CIDF privacy trailer should be moved to the
Next Header field in the CIDF header.
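
The reconstruction step can be sketched as follows, assuming the
decrypted plain-text ends with a one-octet Pad Length followed by a
one-octet Next Header.  These field widths and the function name are
assumptions made for illustration.

```python
def parse_privacy_plaintext(plaintext: bytes):
    """Split decrypted data into (application data, next header).

    The returned next header value is what would be moved back into the
    Next Header field of the CIDF header.
    """
    if len(plaintext) < 2:
        raise ValueError("plain-text too short for a privacy trailer")
    pad_len, next_header = plaintext[-2], plaintext[-1]
    if pad_len > len(plaintext) - 2:
        raise ValueError("pad length exceeds message length")
    app_data = plaintext[:len(plaintext) - 2 - pad_len]
    return app_data, next_header
```

A pad length that is larger than the message is one way in which a
wrong key or a corrupted message may show up when authentication is
not in use.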

Note that there are two ways in which the decryption can "fail". The
selected security association may not be correct or the encrypted CIDF
message could be corrupted.  (The latter case would be detected if
authentication is selected for the security association, as would
tampering with the SPI.)

4.2.5: CIDF Authentication Procedures

4.2.5.1 Outbound Message Processing


On message transmission, if the CIDF message layer is applying CIDF
authentication mechanisms for the message, the CIDF message layer shall
determine the security association (which determines the algorithm) for
the message target, insert the length of the authentication header,
insert the next header in the authentication header, insert the
security association identity (Key Generator Identity and SPI) before
the resulting ciphertext, and perform the cryptographic transform over
the resulting message.

The next header in the CIDF message layer is then set to 51.

4.2.5.1.1 Integrity Check Value Calculation

The transmitter computes the Integrity Check Value (ICV) over the entire
message using the ICV key, hashing algorithm, and algorithm mode
indicated by the security association.  Since the Authentication Data is
not protected by encryption, a keyed authentication algorithm must be
employed to compute the ICV.

If privacy is selected in conjunction with CIDF authentication,
encryption is performed first, before the authentication.  The
encryption does not encompass the Authentication Data field.  This order
of processing facilitates rapid detection and rejection of replayed or
bogus messages by the receiver, prior to decrypting the message, hence
potentially reducing the impact of denial of service attacks.  It also
allows for the possibility of parallel processing of messages at the
receiver (i.e., decryption can take place in parallel with
authentication).

4.2.5.1.2 Padding

No padding is required if the default 96-bit truncated Hashed Message
Authentication Codes (HMAC) algorithm is used.  However, if another
authentication algorithm is used, padding MAY be required.

If an authentication algorithm creates an ICV whose length is not an
integral multiple of 32 bits, padding may be appended to the
Authentication Data field so that the authentication header is a
multiple of 32 bits long.  Alternatively, the ICV may be truncated to a
32-bit multiple length.

In addition, if the authentication algorithm requires a multiple of a
block size and the CIDF message with CIDF authentication header does not
meet the block size requirements, zero-valued padding MUST be appended
to the end of the CIDF message prior to ICV computation.  This padding
is not transmitted with the CIDF message.

4.2.5.1.3 Authentication Algorithms

The security association specifies the authentication algorithm used for
the ICV computation.  At the time of writing, one mandatory-to-implement
algorithm and mode has been defined for the CIDF authentication header.
It is based on the Hashed Message Authentication Code (HMAC) using the
SHA-1 hash function.  The output of the HMAC computation is truncated to
the leftmost 96 bits.
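
For illustration, the truncated HMAC can be computed with the Python
standard library as shown below.  The function name is hypothetical;
the key would be the ICV key of the security association.

```python
import hashlib
import hmac


def hmac_sha1_96(key: bytes, message: bytes) -> bytes:
    """HMAC with SHA-1, truncated to the leftmost 96 bits (12 octets)."""
    return hmac.new(key, message, hashlib.sha1).digest()[:12]
```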


4.2.5.2 Inbound Message Processing

Upon receipt of a CIDF message containing a CIDF authentication
header, the CIDF message layer looks up the Security Association and
verifies the Integrity Check Value.

4.2.5.2.1 Security Association Lookup

The Security Association information is included in the CIDF
authentication header.  The CIDF message layer looks up the appropriate
algorithm and key for ICV computation, based on the SPI and Key
Generator's identity from the CIDF authentication header.

If no valid algorithm and key exists for this message, the receiver MUST
discard the message.

4.2.5.2.2 Integrity Check Value Verification

The receiver computes the ICV of the entire CIDF message using the
specified authentication algorithm and the security association ICV key
that has been established for this security association.  If the
computed ICV matches the ICV included in the Authentication Data field
of the message, the CIDF message is valid and accepted.  If the values
do not match, the CIDF message layer MUST discard the CIDF message as
invalid.

To validate the CIDF message, the CIDF message layer saves the ICV value
in the CIDF authentication header and replaces it with zeros.  Then the
CIDF message layer performs the ICV computation over the entire message
and compares the saved ICV value with the computed ICV value.
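
The save, zero, recompute, and compare procedure can be sketched as
follows, using the 96-bit truncated HMAC with SHA-1.  The position of
the ICV inside the message (`icv_offset`) is a parameter here because
the header layout is defined elsewhere; the function names are
hypothetical.

```python
import hashlib
import hmac

ICV_LEN = 12  # 96 bits


def _compute_icv(message: bytes, icv_offset: int, key: bytes) -> bytes:
    # The ICV field is replaced with zeros before the computation.
    zeroed = (message[:icv_offset] + bytes(ICV_LEN)
              + message[icv_offset + ICV_LEN:])
    return hmac.new(key, zeroed, hashlib.sha1).digest()[:ICV_LEN]


def attach_icv(message: bytes, icv_offset: int, key: bytes) -> bytes:
    """Sender side: fill in the ICV field of an outgoing message."""
    icv = _compute_icv(message, icv_offset, key)
    return message[:icv_offset] + icv + message[icv_offset + ICV_LEN:]


def verify_icv(message: bytes, icv_offset: int, key: bytes) -> bool:
    """Receiver side: True if the message should be accepted."""
    saved = message[icv_offset:icv_offset + ICV_LEN]
    computed = _compute_icv(message, icv_offset, key)
    return hmac.compare_digest(saved, computed)
```

The comparison uses hmac.compare_digest so that the time taken does
not depend on how many leading octets match.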


========================================================================
= 4.3: CIDF Matchmaking Service
========================================================================

4.3.1: Rationale for matchmaking service

The CIDF Matchmaking Service, or matchmaker, provides a standard,
unified mechanism for CIDF components to make themselves known to other
components, and for components to locate communication "partners" with
which they can share information.  The use of a single infrastructure
for this purpose should greatly promote component re-use and ease
development of multi-component intrusion detection and response systems.

The matchmaker is meant to support different levels of directory
service, according to what a given component requests.  Though the
matchmaker will provide clear advantages to its clients in ease and
flexibility of configuration, the use of the matchmaker is optional, so
that components that do not want to use it (e.g., due to resource
constraints) are not obliged to do so.

4.3.2: Objectives for matchmaking service

The high-level objectives of the matchmaker are to:

  * allow CIDF components to contact an active CIDF installation and
    register themselves in it
  * allow components to establish associations for the exchange
    of data with other components
  * allow associates to authenticate themselves to one another as
    authorized gido producers or consumers in a given category
  * permit (but not force) associates to be designated by the type
    of data desired for exchange
  * permit (as a simplification) associates to be located directly
    by their identity
  * provide (as a further simplification) for associates to be
    located by network broadcast, without use of the matchmaker
    at all

4.3.3: Abstract Directory Model

This subsection describes the contents and use of an underlying
directory service for the CIDF Matchmaker, independent of the specifics
relative to any particular directory implementation.

4.3.3.1: Assumptions

Whatever the underlying directory service may be, we do make some
assumptions about its capabilities:

  + It provides a global and hierarchical namespace.
  + It provides for mutual authentication between the directory
    server and its clients.
  + Given the above authentication, it provides access control,
    if desired, to the level of individual principals and individual
    directory entries.


4.3.3.2: Goals of Directory Use

The underlying goal of the directory, as used by CIDF, is to ease the
process by which CIDF-compliant components locate one another, by
supporting feature-based lookup.  Rather than naming other components, a
component has the option of specifying classes of gidos that it is
interested in, and discovering what other components can produce or
consume them.

The matchmaker will use the directory to form categories, as a way of
classifying gidos for easy feature-based lookup.  A category is a set of
values for a particular attribute of a gido.  So for instance a
"fileserver" category would represent a set of values for the AtLocation
HostName attribute.

Categories are arranged in a hierarchy, but not necessarily as a tree.
In fact, the hierarchy defines only a partial ordering on the categories
involved in it.  So a given category may have multiple supercategories
in addition to having multiple subcategories.  Components may belong to
categories as either potential producers or potential consumers of the
gidos the category specifies.

Part of the point of having a hierarchy of categories is to permit the
analysis and consolidation at one level of the hierarchy of information
received from the next lower level of the hierarchy.  We assume
therefore that one very common kind of CIDF component will accept gidos
from producers in all of its subcategories, then produce gidos itself as
a result of analyzing those from its subcategories.

Finally, the directory will also be usable to locate components
accessible via a given DNS host or domain name.  Such names will be
mapped into the category hierarchy in a standardized fashion, so that
the mapping can be computed with no a priori knowledge of site-specific
directory configuration.

4.3.3.3: Directory Data

4.3.3.3.1: Naming Conventions

We noted just above that DNS names will have a well-defined mapping to
names in the directory tree.  This mapping will always be available as a
starting point for communication between DNS domains, in which the
initiator has no or limited knowledge of the contents of the foreign
domain.

The format of names in the directory tree is implementation-dependent
and hence outside the scope of this model.  However, the model does make
use of two "abstract" kinds of names characterized by the role they
fill:


  + A fully-qualified category name uniquely specifies a category,
    that is, specifies the "path" to the category from the root of
    the directory tree.  Though each category may have multiple
    supercategories, it has only one fully-qualified name.
  + A gido class identifies some group of gidos, and is based on a
    fully-qualified category name, but is augmented with further
    attributes.  These attributes may be further category names,
    or may be other attributes that do not play a structural role
    in the directory.

This implies that the structure of a given category hierarchy will not
be reflected in the structure of the directory tree where its categories
are stored.  An example may help illustrate.  (Here we imagine directory
names formatted like Unix pathnames.  Obviously this does not correspond
to any actual implementation.)

Say that the domain netlife.com defines a category
"phoenix_finance_hosts", with supercategories "finance_hosts" and
"phoenix_hosts". The fully-qualified name of the category is
"netlife.com/phoenix_finance_hosts". It is _not_
"netlife.com/phoenix_hosts/phoenix_finance_hosts". As this example
implies, however, it is good practice to name subcategories so as to
indicate their position in the local category hierarchy.  Any tools that
help automate category creation should certainly encourage this.

See Section 1.4 for further details.

4.3.3.3.2: Entry Contents

Each category will be represented by a directory entry containing the
following information:

  + A list of gido producers, that is, CIDF components that produce
    classes of gidos belonging to the category.
  + A list of gido consumers, that is, CIDF components that have
    expressed interest in consuming gidos belonging to the
    category.
  + A list of supercategories -- categories immediately above this
    one in the overall hierarchy.
  + A list of subcategories -- categories immediately below this
    one in the overall hierarchy.

For each CIDF component listed (either a producer or a consumer), a
further directory entry will be defined.  This entry will contain a set
of attributes, each of which has a (standardized) name and a value.
These attributes will include:

  + a DNS hostname
  + an optional IP address
  + a CIDF endpoint identifier
  + an optional CIDF filter expression specifying gidos
    of interest to the component
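
The two kinds of directory entry can be pictured as simple records.
The sketch below is one possible in-memory representation, not a
directory schema; all field and function names are assumptions.  It
reuses the netlife.com example from Section 4.3.3.3.1 to show that a
category with two supercategories still has a single, flat
fully-qualified name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ComponentEntry:
    hostname: str                             # DNS hostname
    endpoint_id: str                          # CIDF endpoint identifier
    ip_address: Optional[str] = None          # optional
    filter_expression: Optional[str] = None   # optional


@dataclass
class CategoryEntry:
    name: str                                 # fully-qualified name
    producers: List[str] = field(default_factory=list)
    consumers: List[str] = field(default_factory=list)
    supercategories: List[str] = field(default_factory=list)
    subcategories: List[str] = field(default_factory=list)


def link(directory: Dict[str, CategoryEntry], sub: str, sup: str) -> None:
    """Record that `sub` lies immediately below `sup`."""
    directory[sub].supercategories.append(sup)
    directory[sup].subcategories.append(sub)


def build_example() -> Dict[str, CategoryEntry]:
    names = ["netlife.com/finance_hosts",
             "netlife.com/phoenix_hosts",
             "netlife.com/phoenix_finance_hosts"]
    directory = {n: CategoryEntry(n) for n in names}
    link(directory, names[2], names[0])
    link(directory, names[2], names[1])
    return directory
```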

4.3.3.3.3: Access Control


We wish to enforce the following kinds of access control.

  + Producers in a given category can read the consumer list
    of that category and those of its supercategories.
  + Consumers in a given category can read the producer list
    of that category and those of its subcategories.
  + Each component in a given category can update its own
    filter expression and addressing/endpoint information.
  + The administrator of a given category can update its
    consumer and producer information, that is, add or remove
    entries for components.
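
The first two read rules above can be sketched as a reachability check
over the category hierarchy.  The dictionary layout and function names
are hypothetical.  The sketch also assumes that "its supercategories"
and "its subcategories" are meant transitively; if only immediate
neighbours are intended, the traversal would stop after one step.

```python
def reachable(directory, start, relation):
    """Return `start` plus every category reachable through `relation`."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in directory[stack.pop()][relation]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def may_read_list(directory, component, target, wanted_list):
    """May `component` read the `wanted_list` of category `target`?

    Producers may read consumer lists upward in the hierarchy;
    consumers may read producer lists downward.
    """
    if wanted_list == "consumers":
        role, relation = "producers", "super"
    elif wanted_list == "producers":
        role, relation = "consumers", "sub"
    else:
        raise ValueError(wanted_list)
    return any(
        component in entry[role]
        and target in reachable(directory, name, relation)
        for name, entry in directory.items()
    )
```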

Note that the access control described here applies only to the
information actually stored in the directory service.  Access control
for the gidos accessible via a given category is enforced by the
producers and consumers for that category.  That is, a producer will
check whether a consumer is authorized to consume from the category, and
a consumer will check whether a producer is authorized to produce into
the category.  The authorization mechanism used to achieve this is not
yet specified.  It will reside in the matchmaker, but is outside the
scope of the current subsection.

4.3.3.4: Directory Operations

The key operations involving the directory are:

  + modifying the category hierarchy
  + adding producers to a category
  + adding consumers to a category
  + category and component lookup
  + removing producers or consumers from a category

We distinguish adding producers from adding consumers because we view
the two operations as intrinsically different in frequency and
sensitivity.  Adding producers is likely to be relatively rare, and as
generators of the data on which an IDS operates, producers are clearly
the more critical of the two component types.

4.3.3.4.1: Modifying the Category Hierarchy

The category hierarchy may be changed only by some agent outside the
matchmaker with administrative responsibility for (and access to) the
relevant portions of the hierarchy.

4.3.3.4.2: Registering Producers

We view adding a producer to a category as an operation that is
explicitly initiated by some agent outside the matchmaker with
administrative responsibility for (and access to) the target category.
The most obvious example would be a human using a graphically-based
directory client tool, but the agent could also be part of an automated
process.


In any event, the agent can impose any restrictions desired on what
producers it will add.  This might include the inspection of public-key
certificates attesting to the component's identity or capability, for
example.  From the directory's standpoint, however, the only
authorization check performed will be that of the client requesting the
modify operation.

If this operation succeeds, the named component is added to the list of
those in the current category.

The producer list in the directory is considered to be an accurate
representation of the actual set of producers extant.  It is assumed
that the external agent adding the producer to the category will also
ensure that all copies of the relevant portion of the directory tree are
updated in a sufficiently timely fashion after the addition occurs.
However, the presence of a producer in the list does not guarantee that
the producer is actually available at any given time, of course.  A
consumer must attempt contact with the listed producer to determine
that.

4.3.3.4.3: Registering Consumers

We assume that adding a consumer to a category is both much more
frequent and somewhat less significant (from a functional as well as a
security standpoint) than adding a producer.  In particular, we do not
want to assume that adding consumers is under the control of an external
entity.  Rather, the ideal is to enable a consumer to be added to a
category automatically as a result of having expressed interest in
receiving the output of the category's producers.

The list of consumers in a given category is considered to be only a
hint of what consumers are actually extant.  No assumptions are made
about how or when changes to the list of consumers are propagated to any
copies of the directory tree other than the producer's own.

4.3.3.4.4: Category Lookup

Given a fully-qualified category name, the matchmaker will provide
interfaces to return, iteratively, the fully-qualified name of each of
its super- or subcategories.

4.3.3.4.5: Matchmaking

4.3.3.4.5.1: Data Access

The matchmaker will define interfaces to return, iteratively, each of
the attributes defined in a given directory entry, and to look up the
value of a specific attribute, given its name.

4.3.3.4.5.2: Producer Lookup


The major aim of lookup in the directory is to enable gido consumers to
contact gido producers.  To do so, a consumer presents a gido class
describing the gidos it wishes to receive.  This class will incorporate
a fully-qualified category name.  The matchmaker will apply the class
against the producer list of the category and return information on the
matching entries.

The information returned will be suitable for use as the target of the
data access interfaces mentioned above.

4.3.3.4.5.3: Consumer Lookup

A gido producer may also look up the set of consumers currently
registered as being interested in the output of a category to which the
producer belongs.  To do so, the producer presents a gido class
describing the gidos for which it wants to find consumers.  This class
will incorporate a fully-qualified category name.  The matchmaker will
apply the class against the consumer list of the category and return
information on the matching entries.

The information returned will be suitable for use as the target of the
data access interfaces mentioned above.
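
Producer and consumer lookup are symmetric, so both can be sketched
with one function.  How a gido class is "applied against" a list is not
defined here; the sketch assumes simple equality on each additional
attribute.  The dictionary layout, the attribute names, and the
function name are hypothetical.

```python
def lookup(directory, gido_class, role):
    """Return the entries of one list of a category that match a class.

    `gido_class` holds a fully-qualified category name plus optional
    further attributes; `role` is "producers" or "consumers".
    """
    if role not in ("producers", "consumers"):
        raise ValueError(role)
    category = directory.get(gido_class["category"])
    if category is None:
        return []
    wanted = gido_class.get("attributes", {})
    return [
        entry for entry in category[role]
        if all(entry.get(k) == v for k, v in wanted.items())
    ]
```

A consumer would call this with role "producers", and a producer with
role "consumers".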

4.3.3.4.6: Deregistration

It is assumed that producers will be removed from their categories, when
desired, via some operation explicitly initiated by an agent external to
the matchmaker.

However, no such assumption is made for consumers, which are free to
stop using a category's output without notifying the matchmaker.  The
matchmaker should make it possible for a producer to query the current
nominal set of consumers to find out if they are active, removing them
from the set of recorded consumers if they are not.
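
The pruning step can be sketched as follows.  The liveness probe is
passed in as a callable because the way a producer checks that a
consumer is active is not specified here; the names are hypothetical.

```python
def prune_consumers(consumers, is_active):
    """Drop inactive consumers from the list in place.

    `is_active` is called once per recorded consumer.  Returns the
    consumers that were removed.
    """
    active, removed = [], []
    for consumer in consumers:
        (active if is_active(consumer) else removed).append(consumer)
    consumers[:] = active
    return removed
```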


========================================================================
========================================================================
=
=                         5: CIDF APIs
=
========================================================================
========================================================================
=
========================================================================
= 5.1: Introduction
========================================================================

Application programmers require a clean and uniform way to call upon
functions that are either local or remote and not bother with the
details of the call.  APIs hide information and simplify a programmer's
task.  Standard APIs allow sharing of functions and software components
between groups of people with common goals.

As a preliminary step to enumerating the individual interfaces of our
intrusion-detection framework, we categorize the various operations and
responsibilities that we commonly find within intrusion-detection
architectural boundaries.  The objective is to decompose the interface
specification into basic sets of interoperation among internally
cooperating modules.

We present a component-oriented architectural model that describes the
functionality of intrusion-detection systems in terms of modules with
well-defined roles and responsibilities.  Once our generic intrusion-
detection modules are defined, we enumerate a core set of interactions
among the modules.  Section 8 will then further refine these core
interactions among the generic intrusion-detection modules into a common
interface specification.

Under our architectural model, we recognize at least four basic
component types: the event generator unit, event analysis unit,
decision-support unit, and data management unit.  Each component
contributes a subtask to the overall intrusion-detection effort:

     An event generator (E-box) is responsible for retrieving individual
     representative of the analysis target's activity, filtering and for
     information, and passing these records onto client modules for furt
     An analysis unit (A-box) is the analytical engine of the intrusion-
     as an event generator client, receiving and analyzing events for po
     intrusion signatures. An A-box produces intrusion or anomaly report
     export to its client modules.
     The decision-support or countermeasure unit (C-box), represents the
     enforcement element within the intrusion-detection system. A C-box
     to the other components, receiving analytical results (and/or event
     input deploys countermeasures, ranging from administrative alerts t
     specified by the system's response policy.
     The data-management unit (D-box) provides the storage and retrieval
     intrusion-detection system. The primary role of the D-Box is to pro
     management between the IDS and persistent storage.


From these four core intrusion-detection modules, we decompose our
intrusion-detection interfaces such that they provide distinct
functional services.  The intent of this section is not to impose
architectural constraints or assumptions of modularity on IDS
developers.  Developers may use, ignore, or even extend the component-
oriented architectural model described here.  For example, an IDS
developer may decide to build a CIDF-compliant IDS module with its
analysis functions and response logic combined (i.e., essentially
merging the A- and C-box into a single component). The objective of this
section is not to impose requirements on components, but rather to
decompose interfaces into core sets of services.

----------------------------------------------------------------------
                    Generic Interface Decomposition
                    of Intrusion Detection Systems

                      Event Transmission Interface

                       +------------+                 +-----------+
                       |            |    E-Server     |           |
     Target            |   Event    |    interface    |   Client  |
     Event  ---------->| Generator  |---------------->|   Module  |
     Stream            |  (E-Box)   |                 |           |
  (audit, network      |            |<----------------|   (e.g.,  |
  datagrams, SNMP,     |            |    E-Client     |   A-Box)  |
  application logs,    +------------+    interface    +-----------+
  A-box results, etc.)

                       Analytical Results Interface

                       +----------+               +----------+
            E-Server   |          |  A-Server     |          |
            interface  | Analysis |  interface    |  Client  |
         ------------->|   Unit   |-------------->|  Module  |
                       | (A-Box)  |               |          |
         <-------------|          |<--------------|  (e.g.,  |
           E-Client    |          |  A-Client     |  C-box)  |
           interface   +----------+  interface    +----------+

-----------------------------------------------------------------------


========================================================================
= 5.2 Database (D box) APIs
========================================================================

In the CIDF reference model, the database (D) box interfaces with every
other box.  For example, depending on the mode of operation, the E box
may keep its audit records in the D box, and the A box could fetch these
records, process them, and store the analysis results back in the D
box.  This scenario is better suited to a non-real-time system in which
the analysis is done off-line.  However, even in a real-time system, the
E box may be required to send messages both to the A box for analysis
and to the D box as a backup for future reference.  The R box would act
according to the analysis results from the A box (or results fetched
from the D box) and log the action report in the D box for the record.

Since many intrusion detection systems do not use a database
management system, a simple log file facility is assumed here as a least
common denominator.  The major functions that a D box should support
include: initializing/terminating a session, opening/closing the log
file, writing/reading audit records, searching records according to a
given key, querying the number of records under a certain category, etc.
Apart from these basic functions, more advanced functions supported by
modern database management systems can be offered as options during a
query and negotiation stage (discovery of supported capabilities)
between a client and a server.

After reviewing the XDAS specification, we consider it a good reference
for developing basic database APIs to be used in the CIDF context.  The
XDAS specification defines a comprehensive list of event types, some of
which are very relevant to database operation.  The following
discussion attempts to develop a set of CIDF database APIs using the
XDAS specification as a reference.

5.2.1 Concept of Operations

According to the CIDF reference model, each of the three components (E,
A, and R boxes) may interact with the database module during its
operation.  In the following discussion, a client-server model is
assumed in which the clients (in one of the three boxes) make requests
to the server in a D-box.  Before a client can submit a request, it has
to initialize a session with the server.  The initialization mainly
involves authentication and authorization.  Presumably, a similar
initialization procedure should occur before any two CIDF components
exchange data with each other.  The details of this procedure will be
part of the security mechanisms that will be dealt with separately for
all of the CIDF components.  Therefore, we will not discuss the
security-related issues further here.  In response to the
initialization function call, the server will return a handle for the
client to proceed with its operation.


After a session is initialized, the client uses the handle received from
the initialization function to open a log-file for reading or writing
records.  The "open" API returns a handle to the log-file.  A client
may obtain more than one handle for its database operation, each of
which is independent of the others (e.g., multiple log-files, with one
handle for each log-file).

Upon receipt of a handle to the log-file, the client is ready to write,
read, or search audit records in this log-file.  To increase
efficiency, the "read" API can return more than one record at a time,
much like the get_bulk function in SNMP.  The D-box should also support
simple queries that report the number of records in the log-file that
match a given criterion.  At the end of the operation, the client
should close the log-file and terminate the session.
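
The call order described above can be pictured as a small state machine.
The following is only a sketch under stated assumptions: the names
client_state_t, client_op_t, and next_state are hypothetical and are not
part of this specification, the sketch tracks a single log-file even
though a client may hold several handles, and it assumes a session may
only be terminated after the log-file has been closed.

```c
#include <stdbool.h>

/* Hypothetical client-side states; not defined by this specification. */
typedef enum { ST_NO_SESSION, ST_SESSION, ST_STREAM_OPEN } client_state_t;

/* Hypothetical labels for the operations described in the text. */
typedef enum {
    OP_INIT, OP_OPEN, OP_READ, OP_WRITE, OP_SEARCH, OP_QUERY,
    OP_CLOSE, OP_TERMINATE
} client_op_t;

/* Apply op to *state.  Returns false and leaves *state unchanged when
 * the operation is out of order for the current state. */
static bool next_state(client_state_t *state, client_op_t op)
{
    switch (*state) {
    case ST_NO_SESSION:
        if (op == OP_INIT) { *state = ST_SESSION; return true; }
        return false;
    case ST_SESSION:
        if (op == OP_OPEN) { *state = ST_STREAM_OPEN; return true; }
        if (op == OP_TERMINATE) { *state = ST_NO_SESSION; return true; }
        return false;
    case ST_STREAM_OPEN:
        if (op == OP_CLOSE) { *state = ST_SESSION; return true; }
        /* Record operations keep the log-file open. */
        return op == OP_READ || op == OP_WRITE ||
               op == OP_SEARCH || op == OP_QUERY;
    }
    return false;
}
```

A client wrapper could consult such a check before issuing each request,
so that ordering mistakes are caught locally rather than by the D-box.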

Note that, depending on the needs of its applications, a D-box
implementer may choose to support only part of the APIs on the list.
For instance, under the producer and consumer paradigm, a D-box serving
as an archive center may support only the get (read) API and not the
write API.  Conversely, a D-box acting as a producer may implement only
the write API without supporting the get (read) API.
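
One way to picture partial API support is a capability mask that a
D-box advertises and a client checks before calling an API.  This is a
sketch only: the specification defines no such mechanism, and every name
below (the DBOX_CAP_* bits, the two profiles, dbox_supports) is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, one per optional API. */
enum {
    DBOX_CAP_WRITE  = 1 << 0,  /* db_commit_record() */
    DBOX_CAP_READ   = 1 << 1,  /* db_get_next() */
    DBOX_CAP_SEARCH = 1 << 2,  /* db_search_record() */
    DBOX_CAP_QUERY  = 1 << 3   /* db_query() */
};

/* Example profiles for the two cases mentioned in the text. */
#define DBOX_PROFILE_ARCHIVE  ((uint32_t)DBOX_CAP_READ)
#define DBOX_PROFILE_PRODUCER ((uint32_t)DBOX_CAP_WRITE)

/* Returns true if every capability bit in needed is present in caps. */
static bool dbox_supports(uint32_t caps, uint32_t needed)
{
    return (caps & needed) == needed;
}
```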

5.2.2 Database Object Format

The D-box is the only "passive" component in the CIDF reference model.
It is considered passive because it does not generate audit records
(event records, analysis results, or response actions) by itself;
rather, it plays a supporting role in the reference model.  Therefore,
the bulk of the database object should retain the format(s) used by the
other boxes (the format issue is yet to be finalized).  Without loss of
generality, we assume a single audit record format across E, A, and R
boxes.  To construct a database object, a D-box should prepend the
following fields to each audit record:

--------------------------------------------------
Originator ID  (IP address, port #)
--------------------------------------------------
Object type  (Event, Analysis, or Response)
--------------------------------------------------
Time stamp  (the time the record was written into the log-file)
--------------------------------------------------
Object length
--------------------------------------------------
Audit record (from E, A, or R box)
....
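
The field list above might map onto a C structure along the following
lines.  This is a minimal sketch, not a defined wire format: the type and
function names are hypothetical, the originator is assumed to be an IPv4
address with a 16-bit port, and "Object length" is assumed to mean the
length in bytes of the enclosed audit record.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical encoding of the "Object type" field. */
typedef enum {
    DB_OBJ_EVENT, DB_OBJ_ANALYSIS, DB_OBJ_RESPONSE
} db_object_type_t;

/* Fields the D-box adds in front of each audit record. */
typedef struct {
    uint32_t         originator_addr;  /* IPv4 address (assumed width) */
    uint16_t         originator_port;
    db_object_type_t object_type;
    time_t           timestamp;        /* time written to the log-file */
    uint32_t         object_length;    /* bytes in the audit record */
} db_object_header_t;

typedef struct {
    db_object_header_t header;
    const uint8_t     *audit_record;   /* record from E, A, or R box,
                                          kept in its original format */
} db_object_t;

/* Wrap an audit record without modifying or copying it. */
static db_object_t db_wrap_record(uint32_t addr, uint16_t port,
                                  db_object_type_t type,
                                  const uint8_t *record, uint32_t length,
                                  time_t now)
{
    db_object_t obj;
    obj.header.originator_addr = addr;
    obj.header.originator_port = port;
    obj.header.object_type = type;
    obj.header.timestamp = now;
    obj.header.object_length = length;
    obj.audit_record = record;
    return obj;
}
```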

5.2.3 CIDF Database APIs

In this section, we present a list of database APIs offering database
services to the clients in the other components of the CIDF reference
model.

5.2.3.1 db_initialize_session()


Name
 db_initialize_session -- initialize a session with the database
 server

Synopsis
 db_uint32 db_initialize_session (
  db_uint32       *minor_status,
  db_sec_con_t    *security_context,
  db_audit_ref_t  *db_ref
 );

Description
 The db_initialize_session function initiates a session between the
 CIDF client and the database server.  It validates the security_context
 provided to ensure that the client has been authenticated and is
 authorized to use the database services.

 If successful, the function returns db_ref, a handle to the database
 server, and a status code [DB_S_COMPLETE].  The arguments for
 db_initialize_session() are:

 minor_status (out)
    An implementation specific return status that provides additional
    information when [DB_S_FAILURE] is returned by the function.

 security_context (in)
    A structure defining the security context of the client requesting
    use of the database services.  This is used to authenticate the
    client to the database server and establish the client's
    authorizations.

 db_ref (out)
    The handle to the database server is returned in db_ref.

Return Value
 One of the following status codes shall be returned:

 [DB_S_COMPLETE]: Successful completion.

 [DB_S_INVALID_SECURITY_CONTEXT]: The security context supplied is not
  valid.

 [DB_S_FAILURE]: An implementation specific error or failure has
  occurred.
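
The behaviour described for db_initialize_session() can be sketched with
a small in-memory stand-in.  Everything beyond the function signature is
an assumption made for illustration: db_uint32 is taken to be an unsigned
32-bit integer, the contents of db_sec_con_t and db_audit_ref_t are
hypothetical, and the numeric values of the status codes are invented.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t db_uint32;              /* assumed width */

/* Status code values are hypothetical; the text only names them. */
enum {
    DB_S_COMPLETE = 0,
    DB_S_FAILURE = 1,
    DB_S_INVALID_SECURITY_CONTEXT = 2
};

/* Hypothetical contents; real definitions belong to the security layer. */
typedef struct { int authenticated; int authorized; } db_sec_con_t;
typedef struct { db_uint32 session_id; } db_audit_ref_t;

db_uint32 db_initialize_session(db_uint32 *minor_status,
                                db_sec_con_t *security_context,
                                db_audit_ref_t *db_ref)
{
    static db_uint32 next_session_id = 1;

    if (minor_status != NULL)
        *minor_status = 0;
    if (db_ref == NULL) {
        if (minor_status != NULL)
            *minor_status = 1;           /* hypothetical detail code */
        return DB_S_FAILURE;
    }
    /* The client must be both authenticated and authorized. */
    if (security_context == NULL || !security_context->authenticated ||
        !security_context->authorized)
        return DB_S_INVALID_SECURITY_CONTEXT;

    db_ref->session_id = next_session_id++;  /* handle for later calls */
    return DB_S_COMPLETE;
}
```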

5.2.3.2 db_open_audit_stream()

Name
 db_open_audit_stream -- open the audit stream (log-file)

Synopsis
 db_uint32 db_open_audit_stream (
  db_uint32          *minor_status,
  db_audit_ref_t     *db_ref,
  db_audit_stream_t  *audit_stream_ref
 );

Description
 The db_open_audit_stream function opens the audit stream (log-file)
 for reading or writing and returns a handle to it in audit_stream_ref.
 A caller may obtain more than one handle to the audit stream; each
 handle is independent of the others.

 If successful, the function returns [DB_S_COMPLETE].  The arguments
 for db_open_audit_stream() are:

 minor_status (out)
    An implementation specific return status that provides additional
    information when [DB_S_FAILURE] is returned by the function.

 db_ref (in)
    Handle to the database service obtained from a previous call to
    db_initialize_session.

 audit_stream_ref (out)
    Handle to the audit stream returned by the function.

Return Value
 One of the following status codes shall be returned:

 [DB_S_COMPLETE]: Successful completion.

 [DB_S_FAILURE]: An implementation specific error or failure has
  occurred.

 [DB_S_INVALID_DB_REF]: The handle to the database service is not valid.

 [DB_S_AUTHORIZATION_FAILURE]: The client does not possess the required
  authority.
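
The independence of stream handles can be illustrated by giving each
handle its own cursor.  This is a sketch under assumptions: the struct
contents and status code values are hypothetical, db_uint32 is taken to
be an unsigned 32-bit integer, and a session_id of zero is used here to
mark a handle that was never initialized.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t db_uint32;              /* assumed width */

/* Status code values are hypothetical. */
enum { DB_S_COMPLETE = 0, DB_S_FAILURE = 1, DB_S_INVALID_DB_REF = 2 };

/* Hypothetical handle contents. */
typedef struct { db_uint32 session_id; } db_audit_ref_t;
typedef struct {
    db_uint32 session_id;                /* owning session */
    db_uint32 cursor;                    /* private read position */
} db_audit_stream_t;

db_uint32 db_open_audit_stream(db_uint32 *minor_status,
                               db_audit_ref_t *db_ref,
                               db_audit_stream_t *audit_stream_ref)
{
    if (minor_status != NULL)
        *minor_status = 0;
    if (db_ref == NULL || db_ref->session_id == 0)
        return DB_S_INVALID_DB_REF;
    if (audit_stream_ref == NULL) {
        if (minor_status != NULL)
            *minor_status = 1;           /* hypothetical detail code */
        return DB_S_FAILURE;
    }
    /* Each handle starts at the beginning with its own cursor. */
    audit_stream_ref->session_id = db_ref->session_id;
    audit_stream_ref->cursor = 0;
    return DB_S_COMPLETE;
}
```

Because the cursor lives in the handle rather than in the session,
advancing one handle leaves any other handle untouched.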

5.2.3.3 db_commit_record()

Name
 db_commit_record -- write an audit record to the audit stream (log-file)

Synopsis
 db_uint32 db_commit_record (
  db_uint32            *minor_status,
  db_audit_ref_t       *db_ref,
  db_audit_stream_t    *audit_stream_ref,
  db_audit_rec_desc_t  *audit_record_descriptor
 );

Description
 The CIDF client writes the audit record identified by
 audit_record_descriptor to the current audit stream controlled by
 the database service and accessed by audit_stream_ref.

 If successful, the function returns [DB_S_COMPLETE].  The arguments
 for db_commit_record() are:


 minor_status (out)
    An implementation specific return status that provides additional
    information when [DB_S_FAILURE] is returned by the function.

 db_ref (in)
    The handle to the database server, obtained from a previous call to
    db_initialize_session().

 audit_stream_ref (in)
    The handle to the database audit stream obtained from a previous
    call to db_open_audit_stream().

 audit_record_descriptor (in)
    A descriptor referencing an audit record to be written to the audit
    stream.  On successful completion the audit_record_descriptor is no
    longer a valid reference to an audit record.

Return Value
 One of the following status codes shall be returned:

 [DB_S_COMPLETE]: Successful completion.

 [DB_S_INVALID_DB_REF]: The database server daemon handle supplied does
  not point to the server daemon.

 [DB_S_INVALID_STREAM_REF]: The handle to the audit stream is not
  valid.

 [DB_S_INVALID_RECORD_DESCRIPTOR]: The specified audit record descriptor
  is not valid.

 [DB_S_STORAGE_FAILURE]: The audit record cannot be written to stable
  storage.

 [DB_S_SERVICE_FAILURE]: There has been a database service failure.

 [DB_S_FAILURE]: An implementation specific error or failure has
  occurred.

 [DB_S_AUTHORIZATION_FAILURE]: The client does not possess the required
  authority.
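
The rule that a descriptor is no longer valid after a successful commit
can be sketched with an in-memory stream.  The following is illustrative
only: the struct layouts, the capacity limits, and the status code values
are assumptions, and db_uint32 is taken to be an unsigned 32-bit integer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t db_uint32;              /* assumed width */

/* Status code values are hypothetical. */
enum {
    DB_S_COMPLETE = 0, DB_S_FAILURE, DB_S_INVALID_STREAM_REF,
    DB_S_INVALID_RECORD_DESCRIPTOR, DB_S_STORAGE_FAILURE
};

#define MAX_RECORDS    4                 /* hypothetical capacity */
#define MAX_RECORD_LEN 64

/* Hypothetical handle and descriptor contents. */
typedef struct { db_uint32 session_id; } db_audit_ref_t;
typedef struct {
    unsigned char data[MAX_RECORDS][MAX_RECORD_LEN];
    db_uint32     length[MAX_RECORDS];
    db_uint32     count;
} db_audit_stream_t;
typedef struct {
    const void *record;
    db_uint32   length;
    int         valid;
} db_audit_rec_desc_t;

db_uint32 db_commit_record(db_uint32 *minor_status,
                           db_audit_ref_t *db_ref,
                           db_audit_stream_t *audit_stream_ref,
                           db_audit_rec_desc_t *audit_record_descriptor)
{
    db_audit_stream_t *s = audit_stream_ref;
    db_audit_rec_desc_t *d = audit_record_descriptor;

    (void)db_ref;                        /* session checks omitted here */
    if (minor_status != NULL)
        *minor_status = 0;
    if (s == NULL)
        return DB_S_INVALID_STREAM_REF;
    if (d == NULL || !d->valid || d->record == NULL ||
        d->length > MAX_RECORD_LEN)
        return DB_S_INVALID_RECORD_DESCRIPTOR;
    if (s->count >= MAX_RECORDS)
        return DB_S_STORAGE_FAILURE;

    memcpy(s->data[s->count], d->record, d->length);
    s->length[s->count] = d->length;
    s->count++;
    d->valid = 0;                        /* descriptor is consumed */
    return DB_S_COMPLETE;
}
```

Reusing the same descriptor for a second commit then fails with
[DB_S_INVALID_RECORD_DESCRIPTOR] in this sketch.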


5.2.3.4 db_get_next()

Name
 db_get_next -- read next set of records from a previously opened audit
         stream


Synopsis
 db_uint32 db_get_next (
  db_uint32          *minor_status,
  db_audit_ref_t     *db_ref,
  db_audit_stream_t  *audit_stream_ref,
  db_uint32           max_records,
  db_buffer_t        *audit_record_buffer,
  db_uint32          *no_of_records
 );

Description
 The db_get_next() function copies up to max_records records from the
 audit stream accessed by audit_stream_ref into the buffer
 audit_record_buffer.  The actual number of records copied is returned
 in no_of_records.

 If the function successfully reads one or more records from the audit
 stream, the cursor associated with the audit stream referenced by
 audit_stream_ref will be advanced to the next record in the audit
 stream.

 If the call is unsuccessful, the position of the cursor is not changed.

 If successful, the function returns [DB_S_COMPLETE].  The arguments
 for db_get_next() are:

 minor_status (out)
    An implementation specific return status that provides additional
    information when [DB_S_FAILURE] is returned by the function.

 db_ref (in)
    The handle to the database server, obtained from a previous call to
    db_initialize_session().

 audit_stream_ref (in)
    The handle to the database audit stream obtained from a previous
    call to db_open_audit_stream().

 max_records (in)
    The maximum number of records to be returned by the function in any
    one call.

 audit_record_buffer (in)
    Pointer to the buffer to which the audit records are to be copied.

 no_of_records (out)
    The number of records actually copied into audit_record_buffer.

Return Value
 One of the following status codes shall be returned:

 [DB_S_COMPLETE]: Successful completion.

 [DB_S_INVALID_DB_REF]: The database server daemon handle supplied does
  not point to the server daemon.

 [DB_S_INVALID_STREAM_REF]: The handle to the audit stream is not
  valid.

 [DB_S_END]: The end of the audit stream has been reached.

 [DB_S_FAILURE]: An implementation specific error or failure has
  occurred.

 [DB_S_AUTHORIZATION_FAILURE]: The client does not possess the required
  authority.
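
The bulk read and cursor behaviour of db_get_next() can be sketched as
follows.  This is a simplified stand-in, not a reference implementation:
records are modelled as single db_uint32 values, the contents of
db_audit_stream_t and db_buffer_t are hypothetical, the status code
values are invented, and [DB_S_END] is assumed to be returned only when
no unread records remain.  The helper drain_stream is likewise
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t db_uint32;              /* assumed width */

/* Status code values are hypothetical. */
enum { DB_S_COMPLETE = 0, DB_S_FAILURE, DB_S_INVALID_STREAM_REF, DB_S_END };

/* Hypothetical contents: each record is one db_uint32. */
typedef struct { db_uint32 session_id; } db_audit_ref_t;
typedef struct {
    const db_uint32 *records;
    db_uint32        count;
    db_uint32        cursor;
} db_audit_stream_t;
typedef struct { db_uint32 *records; db_uint32 capacity; } db_buffer_t;

db_uint32 db_get_next(db_uint32 *minor_status, db_audit_ref_t *db_ref,
                      db_audit_stream_t *audit_stream_ref,
                      db_uint32 max_records,
                      db_buffer_t *audit_record_buffer,
                      db_uint32 *no_of_records)
{
    db_audit_stream_t *s = audit_stream_ref;
    db_buffer_t *buf = audit_record_buffer;
    db_uint32 n = 0;

    (void)db_ref;                        /* session checks omitted here */
    if (minor_status != NULL)
        *minor_status = 0;
    if (s == NULL)
        return DB_S_INVALID_STREAM_REF;
    if (buf == NULL || no_of_records == NULL)
        return DB_S_FAILURE;
    *no_of_records = 0;
    if (s->cursor >= s->count)
        return DB_S_END;                 /* cursor is left unchanged */

    while (n < max_records && n < buf->capacity &&
           s->cursor + n < s->count) {
        buf->records[n] = s->records[s->cursor + n];
        n++;
    }
    s->cursor += n;                      /* advance past what was read */
    *no_of_records = n;
    return DB_S_COMPLETE;
}

/* Read the rest of the stream in batches; returns the records read. */
static db_uint32 drain_stream(db_audit_ref_t *ref, db_audit_stream_t *s,
                              db_uint32 batch)
{
    db_uint32 storage[8];
    db_buffer_t buf = { storage, 8 };
    db_uint32 minor, got, total = 0;

    if (batch == 0)
        return 0;                        /* would never make progress */
    while (db_get_next(&minor, ref, s, batch, &buf, &got) == DB_S_COMPLETE)
        total += got;
    return total;
}
```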

5.2.3.5 db_search_record()

Name
 db_search_record -- search audit records from the audit stream by
       matching key fields in the record

Synopsis
 db_uint32 db_search_record (
  db_uint32            *minor_status,
  db_audit_ref_t       *db_ref,
  db_audit_stream_t    *audit_stream_ref,
  db_audit_rec_desc_t  *audit_record_descriptor,
  db_uint32             bit_fields,
  db_uint32             max_records,
  db_buffer_t          *matched_record_buffer,
  db_uint32            *no_of_records
 );

Description
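 The CIDF client searches the audit stream accessed by audit_stream_ref
 for records whose key fields, as selected by bit_fields, match the
 corresponding fields of the audit record identified by
 audit_record_descriptor.  Up to max_records matching records are
 copied into matched_record_buffer, and the number of records copied is
 returned in no_of_records.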

 If successful, the function returns [DB_S_COMPLETE].  The arguments
 for db_search_record() are:

 minor_status (out)
    An implementation specific return status that provides additional
    information when [DB_S_FAILURE] is returned by the function.

 db_ref (in)
    The handle to the database server, obtained from a previous call to
    db_initialize_session().

 audit_stream_ref (in)
    The handle to the database audit stream obtained from a previous
    call to db_open_audit_stream().

 audit_record_descriptor (in)
    A descriptor referencing an audit record to be used for searching
    the audit stream.


 bit_fields (in)
    A bit mask that selects the search keys, assuming there are 32 or
    fewer fields in a record.  The bit corresponding to a field is set
    when that field is used as a key for the search.

 max_records (in)
    The maximum number of records desired to be returned by the search
    function in any one call.

 matched_record_buffer (in)
    Pointer to the buffer to which the matched records are to be copied.

 no_of_records (out)
    The number of records actually copied into matched_record_buffer.

Return Value
 One of the following status codes shall be returned:

 [DB_S_COMPLETE]: Successful completion.

 [DB_S_INVALID_DB_REF]: The database server daemon handle supplied does
  not point to the server daemon.

 [DB_S_INVALID_STREAM_REF]: The handle to the audit stream is not
  valid.

 [DB_S_INVALID_RECORD_DESCRIPTOR]: The specified audit record descriptor
  is not valid.

 [DB_S_FAILURE]: An implementation specific error or failure has
  occurred.

 [DB_S_AUTHORIZATION_FAILURE]: The client does not possess the required
  authority.
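
The role of bit_fields can be illustrated with a small matching routine.
This sketch makes several assumptions that the text does not state: a
record is modelled as four numeric fields, bit i of the mask corresponds
to field i, and a record matches only if every selected field equals the
corresponding field of the key record.  The names audit_record_t,
record_matches, and search_records are hypothetical.

```c
#include <stdint.h>

typedef uint32_t db_uint32;              /* assumed width */

#define NUM_FIELDS 4                     /* hypothetical record layout */
typedef struct { db_uint32 field[NUM_FIELDS]; } audit_record_t;

/* Returns 1 if candidate equals key on every field selected by
 * bit_fields (bit i selects field i), otherwise 0. */
static int record_matches(const audit_record_t *candidate,
                          const audit_record_t *key,
                          db_uint32 bit_fields)
{
    unsigned i;

    for (i = 0; i < NUM_FIELDS; i++) {
        db_uint32 bit = (db_uint32)1 << i;
        if ((bit_fields & bit) && candidate->field[i] != key->field[i])
            return 0;
    }
    return 1;
}

/* Copies up to max_records matching records into matched and returns
 * the number copied. */
static db_uint32 search_records(const audit_record_t *stream,
                                db_uint32 count,
                                const audit_record_t *key,
                                db_uint32 bit_fields,
                                db_uint32 max_records,
                                audit_record_t *matched)
{
    db_uint32 i, n = 0;

    for (i = 0; i < count && n < max_records; i++) {
        if (record_matches(&stream[i], key, bit_fields))
            matched[n++] = stream[i];
    }
    return n;
}
```

With an empty mask every record matches in this sketch, since no field
is used as a key.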

5.2.3.6 db_query()

Name
 db_query -- query the status of the audit stream without fetching the
 records, unlike db_search_record()

Synopsis
 db_uint32 db_query (
  db_uint32            *minor_status,
  db_audit_ref_t       *db_ref,
  db_audit_stream_t    *audit_stream_ref,
  db_audit_rec_desc_t  *audit_record_descriptor,
  db_uint32             bit_field,
  db_uint32             range_low,
  db_uint32             range_high,
  db_uint32            *no_of_records
 );

Description
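 The CIDF client queries the status of the audit stream accessed by
 audit_stream_ref.  The number of records that match the query
 criterion is returned in no_of_records; the records themselves are
 not fetched.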

 If successful, the function returns [DB_S_COMPLETE].  The arguments
 for db_query() are:


 minor_status (out)
    An implementation specific return status that provides additional
    information when [DB_S_FAILURE] is returned by the function.

 db_ref (in)
    The handle to the database server, obtained from a previous call to
    db_initialize_session().

 audit_stream_ref (in)
    The handle to the database audit stream obtained from a previous
    call to db_open_audit_stream().

 audit_record_descriptor (in)
    A descriptor referencing an audit record to be used for searching
    the audit stream.

 bit_field (in)
    A bit mask that selects the reference field, assuming there are 32
    or fewer fields in a record.  The bit corresponding to a field is
    set when that field is used as the reference for the query.  When
    bit_field is zero, the query returns the total number of records in
    the audit stream.

 range_low (in)
    When the reference field is numerical, db_query() allows a range to
    be set.  The variable range_low sets the low end of the range.  If
    the field is non-numerical, range_low is set to -1.

 range_high (in)
    When the reference field is numerical, db_query() allows a range to
    be set.  The variable range_high sets the high end of the range.  If
    the field is non-numerical, range_high is set to -1.

 no_of_records (out)
    The number of records in the audit stream that match the query
    criterion.

Return Value
 One of the following status codes shall be returned:

 [DB_S_COMPLETE]: Successful completion.

 [DB_S_INVALID_DB_REF]: The database server daemon handle supplied does
  not point to the server daemon.

 [DB_S_INVALID_STREAM_REF]: The handle to the audit stream is not
  valid.

 [DB_S_INVALID_RECORD_DESCRIPTOR]: The specified audit record descriptor
  is not valid.

 [DB_S_FAILURE]: An implementation specific error or failure has
  occurred.

 [DB_S_AUTHORIZATION_FAILURE]: The client does not possess the required
  authority.
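
The counting behaviour of db_query() for a numerical reference field
might look like the following.  This is a sketch under assumptions: a
record is modelled as two numeric fields, bit i of bit_field selects
field i, the range is treated as inclusive at both ends, only the lowest
set bit is honoured, and non-numerical fields are not modelled.  The
names audit_record_t and count_in_range are hypothetical.

```c
#include <stdint.h>

typedef uint32_t db_uint32;              /* assumed width */

#define NUM_FIELDS 2                     /* hypothetical record layout */
typedef struct { db_uint32 field[NUM_FIELDS]; } audit_record_t;

/* Counts records whose reference field lies in [range_low, range_high].
 * A bit_field of zero returns the total number of records. */
static db_uint32 count_in_range(const audit_record_t *stream,
                                db_uint32 count, db_uint32 bit_field,
                                db_uint32 range_low, db_uint32 range_high)
{
    db_uint32 i, n = 0;
    unsigned idx = 0;

    if (bit_field == 0)
        return count;

    /* Find the lowest selected field. */
    while (idx < NUM_FIELDS && !(bit_field & ((db_uint32)1 << idx)))
        idx++;
    if (idx == NUM_FIELDS)
        return 0;                        /* field not modelled here */

    for (i = 0; i < count; i++) {
        db_uint32 v = stream[i].field[idx];
        if (v >= range_low && v <= range_high)
            n++;
    }
    return n;
}
```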

5.2.3.7 db_close_audit_stream()


Name
 db_close_audit_stream -- close the audit stream (log-file)

Synopsis
 db_uint32 db_close_audit_stream (
  db_uint32          *minor_status,
  db_audit_ref_t     *db_ref,
  db_audit_stream_t  *audit_stream_ref
 );

Description
 The db_close_audit_stream function closes the audit stream specified
 by the audit_stream_ref handle, which was previously opened with
 db_open_audit_stream().

 If successful, the function returns [DB_S_COMPLETE].  The arguments
 for db_close_audit_stream() are:

 minor_status (out)
    An implementation specific return status that provides additional
    information when [DB_S_FAILURE] is returned by the function.

 db_ref (in)
    Handle to the database service obtained from a previous call to
    db_initialize_session.

 audit_stream_ref (in)
    Handle to the audit stream which is to be closed.

Return Value
 One of the following status codes shall be returned: