
Re: Java line code of NewExpression



On Thu, Feb 12, 2004 at 02:02:30AM +0000, wassim masri wrote:
> Dear Ondrej,
> 
> I am hoping that you could tell me whether what I am trying to achieve is 
> doable using Soot/Spark.
> 
> Consider the code below:
> 
> <snip>
> 
> The additional things that I am looking for are:
> 1) The mapping between r3 and o1 (or index 2). I need to know what r3 really
> means.
> 2) The mapping between r4 and o2 (or index 3). I need to know what r4 really
> means.

The Jimple intermediate language was designed with the primary goal of
being easy to analyze and transform (for details, please see the CC
2000 paper, available at www.sable.mcgill.ca/publications). This design
has the following consequences. Jimple has no stack; everything is in
explicit variables. This means that not all Jimple variables correspond
to bytecode variables (some correspond to stack locations). It also
means that variables are split along def/use relationships, so that
values stored in stack locations (and variables) that are later reused 
for other values do not get confounded with these other values. Because
of this, a bytecode variable may correspond to multiple Jimple
variables. So, in general, there is no simple mapping between Jimple and
bytecode variables.
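
To make the last point concrete, here is a toy sketch in Java of why one
bytecode slot can turn into several Jimple-style variables. It is not Soot's
local splitter: it handles straight-line code only, and the `Access` type and
the `slotN_vK` naming are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitSketch {
    /** One access to a bytecode local slot: a store (definition) or a load (use). */
    static final class Access {
        final boolean isStore;
        final int slot;

        Access(boolean isStore, int slot) {
            this.isStore = isStore;
            this.slot = slot;
        }
    }

    /**
     * Renames each access in straight-line code so that every store starts a
     * fresh variable and every load refers to the most recent store of its slot.
     * A load with no preceding store gets version 0.
     */
    static List<String> split(List<Access> code) {
        Map<Integer, Integer> version = new HashMap<>(); // slot -> current version
        List<String> names = new ArrayList<>();
        for (Access a : code) {
            if (a.isStore) {
                version.merge(a.slot, 1, Integer::sum); // new definition, new variable
            }
            int v = version.getOrDefault(a.slot, 0);
            names.add("slot" + a.slot + "_v" + v);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Access> code = Arrays.asList(
            new Access(true, 1),   // store a first value into slot 1
            new Access(false, 1),  // use it
            new Access(true, 1),   // slot 1 is reused for an unrelated value
            new Access(false, 1)); // use the new value
        System.out.println(split(code));
    }
}
```

Both `slot1_v1` and `slot1_v2` come from bytecode slot 1, so the mapping from
slot to variable is one-to-many. With branches and merges, a real splitter has
to work on def/use webs rather than a simple counter.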

If you absolutely need to map Jimple to bytecode variables, the first
step would be to modify the Jimple representation so that there is more
of a correspondence between Jimple and bytecode. Fortunately, Soot first
produces naive Jimple, which is a direct translation of the bytecode,
and later produces the final Jimple in several passes. Therefore,
getting this modified bytecode-like Jimple should be just a matter of
disabling some of these passes and giving the appropriate options to the
remaining ones. The passes in question are part of the jb phase, and
they are documented in the phase options tutorial on the Soot tutorials
web page. Let us know if you find any of this documentation unclear.
I would recommend making the local splitting work on stack variables
only, and disabling local packing.

Of course, disabling these passes may reduce the precision of the
analyses that Soot performs on the code, in particular the declared type
analysis and the points-to analysis. However, at least in theory, these
analyses should still work and produce sound, if imprecise, results.

Once you have a situation where a simple mapping between Jimple and
bytecode variables exists, actually finding the mapping will become much
simpler. I don't know the part of the code that labels the Jimple
variables, but I imagine they would be numbered in the same order as
they are in the bytecode. You will have to look in the implementation,
however; the numbering scheme is not specified.

I feel compelled to repeat the earlier suggestion of implementing your
transformation on Jimple rather than bytecode, despite your statement
below that this is not an option. Not only will you not need to worry
about mapping variables between different representations, but you will
benefit from Jimple's key design goal of being easier to transform than
bytecode.

> 3) The line number in Java or byte code where 'new Object1' and 'new 
> Object2' occurred, in case there were multiple occurrences of them in the 
> same method.

Soot uses the Tag interface to attach information to parts of the
intermediate representation (such parts implement the interface Host).
In particular, source line number and bytecode offset Tags are attached
to Jimple Stmts. In your Java code, you can query these Stmts for their
Tags using the Host interface. The tagging mechanism is documented in
the CC 2001 paper, several of the tutorials, and the last part of the
PLDI tutorial. I don't think any of these mention the line number and
bytecode offset tags explicitly, but they should give you an idea of
how things work.

You'll need to write some code to generate a map from the NewExprs to
the Stmts in which they appear (by iterating over all Stmts). You can 
then look up the Stmt for a NewExpr in this map, and query it for the
Tag.
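
The map-building step might look roughly like the sketch below. To keep it
self-contained, `NewExpr`, `Stmt`, `tag`, and `getTag` here are simplified
stand-ins written for this example, not Soot's own classes or signatures, and
the `"line"` tag name is invented. In real code you would iterate over the
Stmts of a method body and query each one through Soot's Host interface.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class NewExprLines {
    // Hypothetical stand-in for an allocation expression (NOT soot.jimple.NewExpr).
    static final class NewExpr {
        final String type;

        NewExpr(String type) {
            this.type = type;
        }
    }

    // Hypothetical stand-in for a statement that can carry named tags.
    static final class Stmt {
        final NewExpr alloc; // null when the statement allocates nothing
        private final Map<String, Integer> tags = new HashMap<>();

        Stmt(NewExpr alloc) {
            this.alloc = alloc;
        }

        Stmt tag(String name, int value) {
            tags.put(name, value);
            return this;
        }

        Integer getTag(String name) {
            return tags.get(name);
        }
    }

    /** Walks all statements once and records which statement contains each allocation. */
    static Map<NewExpr, Stmt> indexAllocations(List<Stmt> body) {
        Map<NewExpr, Stmt> index = new IdentityHashMap<>();
        for (Stmt s : body) {
            if (s.alloc != null) {
                index.put(s.alloc, s);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        NewExpr first = new NewExpr("Object1");
        NewExpr second = new NewExpr("Object2");
        NewExpr third = new NewExpr("Object1"); // same type, different site
        List<Stmt> body = Arrays.asList(
            new Stmt(first).tag("line", 10),
            new Stmt(null).tag("line", 11),
            new Stmt(second).tag("line", 12),
            new Stmt(third).tag("line", 14));

        Map<NewExpr, Stmt> index = indexAllocations(body);
        for (NewExpr e : Arrays.asList(first, second, third)) {
            // Look up the containing statement, then ask it for its tag.
            System.out.println("new " + e.type + " -> line " + index.get(e).getTag("line"));
        }
    }
}
```

An identity-based map is used so that two `new Object1` expressions in the
same method stay distinct entries, which is the multiple-occurrence case asked
about above.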

> I really hate to reinvent the wheel and build my own tool using BCEL if the 
> above could be achieved
> using your tool. (If you are interested I can brief you on why I need those 
> features)
> 
> 
> Notes:
> 1) I am not using Eclipse

None of the above is specific to Eclipse. Eclipse reads the Tags
produced by Soot to display analysis results; however, it is only one
user of the tags and the tags are in no way specific to Eclipse.

> 2) I had the keep-line-number and keep-bytecode-offset options turned on but 
> I was not sure
> where I could see their effect (is it in Eclipse?)

There is no user-visible effect, except in code interacting with Soot,
which can query Hosts for the Tags.

> 3) Rewriting my own analysis/instrumentation work using Soot is not an 
> option since I invested over a year's work in BCEL (and I plan to finish my 
> dissertation by this summer!)

Assuming your work is reasonably modular, perhaps it could be easily
ported to Soot. If the complexity is in the analysis/transformation
itself, it is presumably separated from the BCEL interaction code.
If instead the complexity is in dealing with BCEL, you may find dealing
with Soot instead removes much of the complexity.

Ondrej

> 
> 4) I am definitely willing to invest time to add features to Soot if it doesn't 
> currently provide the functionality I am looking for.
> 
> Your help is really appreciated
> Thank you,
> Wes
> 
> 
> 
> 
> >From: Ondrej LHOTAK <olhotak@sable.mcgill.ca>
> >To: wassim masri <qds1@hotmail.com>
> >CC: soot-list@sable.mcgill.ca
> >Subject: Re: Java line code of NewExpression
> >Date: Tue, 10 Feb 2004 11:39:47 -0500
> >
> >Using the command-line switches in the Input Attribute Options section (see
> >http://www.sable.mcgill.ca/soot/tutorial/usage/index.html), you can
> >get Soot to attach tags to each statement with the information you are
> >looking for. As for mapping the NewExpr back to the statement in which
> >it occurs, Soot does not do this for you; you need to keep track of it
> >yourself.
> >
> >Ondrej
> >
> >On Tue, Feb 10, 2004 at 12:49:20PM +0000, wassim masri wrote:
> > > When printing the return Set of a reachingObjects() call
> > >
> > > I get output that looks as follows:
> > >
> > > "
> > > AllocNode 17 new Object1 in method <Test1: Object1 getObject(boolean)>,
> > > AllocNode 16 new Object2 in method <Test1: Object1 getObject(boolean)>,
> > > ....
> > > "
> > >
> > > This output gives the information about the NewExpression and in which
> > > method it occurred.
> > >
> > > Is there a way to get more info like: which line of Java code (or 
> >bytecode)
> > > it occurred?
> > >
> > > Thanks,
> > > Wes