
A proposal to change Soot [Long one]




Hi,


I use Jimple in Soot 2.0.1 as the intermediate representation in my project.  It has provided all features that were required
for my work up until now. Recently, I was working on a program transformation that alters method bodies.  I have seen a few
postings on this mailing list concerning such transformations.

Given the "destructive" nature of these transformations, "how can the relation between the untransformed and transformed code
of the system be maintained?" becomes an important question.  The relation can be captured during the transformation, but
this information is useful to the user only if the artifacts appearing in the relation still exist after the
transformation.  The simplest solution to this problem would be to construct the transformed system from scratch as the
transformation proceeds.  This is not "the" solution, but a favourable one in many situations.  Here is where the current
architecture of Soot fails to deliver.  In other words, things get messy when trying to use Soot in such situations.

The reason Soot fails in such situations is that it uses a static wiring between its various algorithmic components.  In
terms of code, there is one instance of class G, available via G.v(), and this method is called almost everywhere in
Soot to retrieve any algorithmic component on which another component depends.  In the light of the above problem, when a new
class is loaded via Scene.loadClassAndSupport(), a call chain containing SootResolver().resolveClassAndSupportClasses() ends
up calling Scene.v() to retrieve the scene into which the new class should be loaded.  This is where things go awry.  Although
the application could have created two different Scene objects via "new Scene(null)", the loadClassAndSupport() calls on both
of these instances end up loading the classes into the Scene object available via Scene.v()->G.v().Scene(), which is a singleton
contained in the singleton instance of G.  In short, one cannot maintain two different SootClasses of a class that differ in
their internal content.
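
To make the wiring problem concrete, here is a minimal, self-contained toy model of it.  The nested classes G, Scene and
Resolver below are stand-ins written for this illustration and are not Soot's classes; they only mimic the call pattern
described above, where the resolver looks up its target scene statically instead of using the scene it was called on.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-ins for illustration only; none of these are Soot's classes.
final class StaticWiringDemo {

    static final class G {
        private static final G instance = new G();   // single global instance
        private final Scene scene = new Scene();

        static G v() { return instance; }
        Scene scene() { return scene; }
    }

    static final class Scene {
        private final Map<String, Object> classes = new HashMap<>();

        static Scene v() { return G.v().scene(); }

        // Delegates to the resolver, which never learns which Scene called it.
        Object loadClassAndSupport(String name) {
            return Resolver.resolve(name);
        }

        boolean contains(String name) { return classes.containsKey(name); }
    }

    static final class Resolver {
        static Object resolve(String name) {
            Scene target = Scene.v();   // static lookup: always the global Scene
            return target.classes.computeIfAbsent(name, n -> new Object());
        }
    }

    // Returns {a has the class, b has the class, global scene has the class}.
    static boolean[] run() {
        Scene a = new Scene();
        Scene b = new Scene();
        a.loadClassAndSupport("java.lang.Object");
        b.loadClassAndSupport("java.lang.Object");
        return new boolean[] {
            a.contains("java.lang.Object"),
            b.contains("java.lang.Object"),
            Scene.v().contains("java.lang.Object")
        };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("a: " + r[0] + ", b: " + r[1] + ", global: " + r[2]);
        // prints "a: false, b: false, global: true"
    }
}
```

Both separately created scenes stay empty, and the class lands in the global one, which is the behaviour reported above.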

One solution to the above problem is for the application to add a feature that enables it to change the singleton
instance of G returned by G.v().  However, G.instance, the reference to the singleton of G in G, is declared private,
which means that the application will have to alter G.  This implies that each application will have to maintain a
customized version of Soot, making way for a future maintenance nightmare.

I propose the following simple solution to avoid customized distributions of Soot.  The solution would be to add a new set
method (the code follows) that enables the application to switch the singleton instance.

	// Installs soot as the singleton returned by G.v() and returns the previous one.
	public static G set(G soot) {
		G result = instance;
		instance = soot;
		return result;
	}

Given the above method, one would do the following to manage 2 different sets of SootClasses.

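	Scene i = Scene.v();
	SootClass iC = i.loadClassAndSupport("java.lang.Object");
	G iG = G.set(new G());    // iG is the G that was active until now
	Scene j = Scene.v();      // the Scene of the newly installed G
	SootClass jC = j.loadClassAndSupport("java.lang.Object");
	G jG = G.v();             // the newly installed G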

The Scenes referred to by i and j are different.  So, one would switch the G instances before doing any operation that will
change the underlying data structure in a significant way.  Such operations would be loading new classes, creating new Type
instances and such.

The above solution should suffice in cases where the operations on each version are lumped together, say, when the
loading/creation of all classes pertaining to the untransformed system and to the transformed system happen at different
times with no interleaving or overlap.  However, things get messy if the transformation uses the untransformed system in
bits and pieces to create the transformed system.  This will lead to lots of singleton switches, depending on the
operations performed on the components of Soot.
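
If a set method like the one proposed above existed, the repeated switches could at least be confined to one place.  The
following is only a sketch on a toy G, not Soot's class: the helper withG and the load method are hypothetical names
introduced here, and the toy G keeps a plain set of names where Soot would keep a Scene.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Toy stand-ins for illustration only; none of these are Soot's classes.
final class ScopedSwitchDemo {

    static final class G {
        private static G instance = new G();
        final Set<String> loaded = new HashSet<>();   // stands in for a Scene

        static G v() { return instance; }

        // The setter proposed in this message.
        static G set(G g) {
            G result = instance;
            instance = g;
            return result;
        }
    }

    // Hypothetical helper: run the action with g installed, then restore
    // whatever instance was active before, even if the action throws.
    static <T> T withG(G g, Supplier<T> action) {
        G previous = G.set(g);
        try {
            return action.get();
        } finally {
            G.set(previous);
        }
    }

    // Stand-in for any operation that reaches its state through G.v().
    static boolean load(String name) {
        return G.v().loaded.add(name);
    }

    public static void main(String[] args) {
        G untransformed = G.v();
        G transformed = new G();

        // Interleaved work on both versions, one switch per operation.
        withG(untransformed, () -> load("A"));
        withG(transformed, () -> load("A"));   // an independent copy of A
        withG(transformed, () -> load("B"));

        System.out.println(untransformed.loaded.contains("B"));   // prints false
        System.out.println(transformed.loaded.contains("B"));     // prints true
        System.out.println(G.v() == untransformed);               // prints true
    }
}
```

This keeps the switching correct, but every operation still has to be wrapped, so it does not remove the mess; it only
contains it.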

A more elegant solution to this would be to make the entities "Environment" aware.  So, instead of the infamous G.v() or
Scene.v() calls, the entities would do a v() call on their environment.  Hence, the environment would provide the necessary
system instance information, such as the Scene instance in which a class was created.  This is a big change.  For example,
to create a RefType one would need to provide the Environment in which the RefType should be created.  However, upon casual
browsing of the Soot source, it seems that such a change should be easy to implement (don't frown, it's just speculation).
From the point of view of the individual applications that are currently using Soot, the change can be made as an extension:
support for an environment-based approach would be available to be used if needed.
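
As a rough sketch of what "Environment" awareness could look like, the toy code below creates RefType objects through an
Environment and lets each one find its environment via v().  Everything here is hypothetical: the Environment class, the
refType factory method and this RefType are invented for the illustration and do not reflect Soot's actual signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; Environment and these signatures are not part of Soot.
final class EnvironmentDemo {

    // Owns the per-system state that a global singleton would otherwise hold.
    static final class Environment {
        private final Map<String, RefType> refTypes = new HashMap<>();

        // One RefType per class name per environment.
        RefType refType(String className) {
            return refTypes.computeIfAbsent(className, n -> new RefType(this, n));
        }
    }

    static final class RefType {
        private final Environment env;
        private final String className;

        private RefType(Environment env, String className) {
            this.env = env;
            this.className = className;
        }

        // Replaces a static G.v()-style lookup with the owning environment.
        Environment v() { return env; }

        String getClassName() { return className; }
    }

    public static void main(String[] args) {
        Environment untransformed = new Environment();
        Environment transformed = new Environment();

        RefType a = untransformed.refType("java.lang.Object");
        RefType b = transformed.refType("java.lang.Object");

        System.out.println(a != b);                       // prints true
        System.out.println(a.v() == untransformed);       // prints true
        System.out.println(b.v() == transformed);         // prints true
    }
}
```

With this shape, the two versions of the system can be used in an interleaved fashion without any singleton switching,
because each entity carries a reference to the environment it belongs to.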

This can have a big effect on the use of Soot in Eclipse: users of Eclipse who are interested in the program
transformations available in Eclipse would also be interested in ways to visualize a transformation and its result.

Well, I have put forth the proposal for review by the Soot community.  If there are a bunch of people wanting such
support and the Soot team welcomes these changes, I volunteer to take a first stab at injecting this support into Soot 2.0.1
(as available via the downloads page, or probably the CVS version provided I get access to it), and others can contribute as
things shape up.  Even before getting any closer to the code or to how we can accomplish this change, it would be wise to
discuss it in the Soot community.

waiting for reply,

--

Venkatesh Prasad Ranganath,
Dept. Computing and Information Science,
Kansas State University, US.
web: http://www.cis.ksu.edu/~rvprasad