STEP: Extensible Program Trace Encoding
updated: May 13, 2003
Program tracing is a common technique employed by software and hardware developers who are interested in characterizing the dynamic behavior of complex software systems. However, despite the popularity of trace-driven analyses, there are surprisingly few options for encoding trace data in a standard format. In the past, many developers have resorted to creating their own ad-hoc trace encoding solutions, tailored specifically to the data they are considering. Such efforts are usually redundant, and in many cases lead to an obscure and poorly documented trace format which ultimately limits the reuse and sharing of potentially valuable information.
The STEP system was created to address this problem by providing a standard method for encoding general program trace data in a flexible and compact format. The system consists of a trace data definition language along with a compiler for the language and an encoding architecture that implements a number of common trace reduction techniques. The system simplifies the development and interoperability of trace clients by encapsulating the encoding process and presenting the data as an abstract object stream.
This is the common pictorial overview of the STEP framework. Trace data from a variety of sources is collected and converted to a common object format. The interface is generated by defining the record types with STEP-DL and compiling the definitions to get Java classes. The definitions are also used to create encapsulated strategies for encoding the various records. The trace records are written to a .step file, which is often already quite compact but is also designed to work well with gzip and bzip2. The .step file can then be unpacked for a variety of uses.
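To make the idea of an encapsulated encoding strategy more concrete, here is a minimal sketch in Java. It is not the STEP API and not the STEP file format: the class name `DeltaTraceSketch` and its methods are hypothetical, and delta encoding is used only as a familiar example of a trace reduction technique (this page does not say which techniques STEP implements). The sketch stores each address as the difference from the previous one and passes the result through gzip, so a client only ever sees the decoded values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical illustration only; none of these names come from STEP. */
class DeltaTraceSketch {

    /** Delta-encodes a sequence of addresses and gzips the result. */
    static byte[] encode(long[] addresses) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out =
                 new DataOutputStream(new GZIPOutputStream(bytes))) {
            out.writeInt(addresses.length);
            long previous = 0;
            for (long address : addresses) {
                // Store only the difference from the previous address.
                out.writeLong(address - previous);
                previous = address;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reverses encode(): gunzips and sums the deltas back into addresses. */
    static long[] decode(byte[] encoded) {
        try (DataInputStream in = new DataInputStream(
                 new GZIPInputStream(new ByteArrayInputStream(encoded)))) {
            long[] addresses = new long[in.readInt()];
            long previous = 0;
            for (int i = 0; i < addresses.length; i++) {
                previous += in.readLong();
                addresses[i] = previous;
            }
            return addresses;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A made-up trace of sequential 8-byte accesses.
        long[] trace = new long[1000];
        for (int i = 0; i < trace.length; i++) {
            trace[i] = 0x10000000L + 8L * i;
        }
        byte[] encoded = encode(trace);
        System.out.println("round trip ok: " + Arrays.equals(trace, decode(encoded)));
        System.out.println("encoded size: " + encoded.length + " bytes");
    }
}
```

Regular traces such as the one in `main` turn into long runs of identical deltas, which is the kind of input a general-purpose compressor like gzip handles well. How much is actually saved depends on the trace.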
You can browse the current javadoc API documentation for STEP on-line. The API for specific versions can be obtained from the downloads section.
A detailed description of STEP is available in the form of an M.Sc. thesis draft.
Abstract:
Program tracing is a common technique employed by software and hardware developers who are interested in characterizing the dynamic behavior of complex software systems. However, despite the popularity of trace-driven analyses, there are surprisingly few options for encoding trace data in a standard format. In the past, many developers have resorted to creating their own ad-hoc trace encoding solutions, tailored specifically to the data they are considering. Such efforts are usually redundant, and in many cases lead to an obscure and poorly documented trace format which ultimately limits the reuse and sharing of potentially valuable information.
The STEP system was created to address this problem by providing a standard method for encoding general program trace data in a flexible and compact format. The system consists of a trace data definition language along with a compiler for the language and an encoding architecture that implements a number of common trace compaction techniques. The system simplifies the development and interoperability of trace clients by encapsulating the encoding process and presenting the data as an abstract object stream.
This thesis presents a detailed description of the STEP system and evaluates its utility by applying it to a variety of trace data from Java programs. Initial results indicate that compressed STEP encodings are often substantially more compact than similarly compressed naive formats.
> full text: [ PDF ] [ PostScript ]
> double-sided text: [ PDF ] [ PostScript ]
> BIBTEX reference
A more concise document that describes STEP is a paper presented as part of the 2002 ACM Workshop on Program Analysis for Software Tools and Engineering (PASTE).
> full text: [ PDF ] [ PostScript ]
> presentation slides: [ PDF ]
> BIBTEX reference
The original PASTE submission is also available as Sable Technical Report 2002-7.
> full text: [ PDF ] [ PostScript ]
STEP was originally introduced as part of the STOOP framework in a poster presented at OOPSLA '01.
> poster: [ Encapsulated PostScript ]
> abstract: [ PDF ] [ PostScript ]
This was followed by a technical report on STOOP.
> full text: [ PDF ] [ PostScript ]
STOOP then fragmented into a number of separate projects that aim to use STEP as a common trace format. See the related work section for details.
The latest release of STEP is version 0.9.2.
STEP is intended as a research tool to promote the sharing and reuse of trace data. You are welcome to download and use STEP free of charge. You may also modify the included source code in accordance with the GNU General Public License, under which STEP is released.
0.9.2
The release bundles the source plus pre-compiled binaries, along with several examples. Please consult the README and license files included in the release.
> step-0.9.2.tar.gz (484 KB)
> step-0.9.2.jar (340 KB)
You can browse the API documentation on-line, or download it in .tar.gz or .jar format.
> step-0.9.2_docs.tar.gz (427 KB)
> step-0.9.2_docs.jar (136 KB)
A number of other projects are related to STEP, either as trace producers or consumers (or both).
EVolve is a system for visualizing trace data in a variety of ways.
More information is available at: http://www.sable.mcgill.ca/evolve/
JIL is an XML-based system for representing annotated versions of Java intermediate representations. Using the JIMPLEX browser, JIL documents can display a combination of static and dynamic program attributes.
More information is available at: http://www.sable.mcgill.ca/jil/
Soot is a Java bytecode transformation framework. It provides a basis for creating customized instrumentation systems and/or profile-driven compiler transformations.
More information is available at: http://www.sable.mcgill.ca/soot/
In order to perform meaningful experiments in optimizing compilation and run-time system design it is beneficial to quantify the behavior of programs with a concise and precisely defined set of dynamic software metrics.
More information is available at: http://www.sable.mcgill.ca/metrics/