Re: Cheap low-level info?
Hi Steve,
This is an interesting question. Soot takes 11 seconds (on an old
decrepit PII-400, I haven't tried it on newer hardware) to load and
instantiate all of the class information (SootClass, SootMethod,
SootField) through Coffi. This happens from SootResolver. I touched this
code once, but I don't remember it too well.
From looking at the code, it looks like parsing Jimple creates the AST
in all cases. I can't remember how long it takes to resolve classes from
Jimple. As for Coffi, which is the default input method, those 11 seconds
do not include the method bodies. It doesn't read them in or instantiate
any of that stuff.
We haven't done any inexpensive whole-program analyses with Soot, as far
as I know, so that's not a direction we've investigated so far. We've
just taken the hit and constructed Jimple before doing any analysis.
On Tue, 30 Oct 2001, Stephen Andrew Neuendorffer wrote:
> So, the question is: are there ways of doing meaningful whole program
> analysis without creating a syntax tree? Yes! The question is what
> information can I get cheaply directly from the bytecode. Context
> classes provide some information (like the methods and interfaces)
> without syntax trees for the bodies of any methods.
> I think there is other information that can be extracted as well:
> 1) Method call graphs.
> 2) Fields that are lvalues. (This is important when doing dataflow through
> fields in the context of method calls.)
> 3) Whether or not reflection is used. (Important if you want to ensure the
> validity of optimizations based on class hierarchy)
> 4) Imported classes. (Think: An abstraction of a method call graph that
> treats all the methods in a class as a single node)
>
> These are all useful abstractions of a method body that do not require the
> memory overhead of a syntax tree.
>
> Any ideas on how to build this into soot? From what I've seen, the
> bytecode is gone after loading the class from coffi, correct?
In the resolving phase, the bytecode is not actually loaded. When you
create Jimple, the bytecode is loaded and discarded.
I think that if you want bytecode handling in Soot, you will have to
implement the class -> Baf pathway. It's unclear to me how long it might
take to get Baf from a classfile. Coffi is fast, though, so there is some
hope that creating Baf might not be too slow. It will definitely take
some engineering effort to convert classfiles to Baf, and you won't
know how fast that is until you do it. There are hacks you can then
do to speed it up; I don't imagine that you'd have to allocate memory
for the Baf objects at all if you don't need to keep the Baf in memory.
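
To give a rough idea of how cheap some of this can be, here is a minimal
sketch that pulls the "imported classes" (item 4 in Steve's list) straight
out of a classfile's constant pool, without touching method bodies or
allocating anything per instruction. It uses only the JDK and the
standard classfile layout. It is not Soot or Coffi code, and the names
ConstantPoolScan and referencedClasses are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Soot: lists every class named by a
// CONSTANT_Class entry in a classfile's constant pool.
public class ConstantPoolScan {

    public static Set<String> referencedClasses(byte[] classFile) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a classfile");
            }
            in.readUnsignedShort();              // minor version
            in.readUnsignedShort();              // major version
            int count = in.readUnsignedShort();  // constant_pool_count

            String[] utf8 = new String[count];   // CONSTANT_Utf8 values by index
            int[] nameIndex = new int[count];    // CONSTANT_Class -> Utf8 index

            // Valid constant pool indices run from 1 to count - 1.
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:  utf8[i] = in.readUTF(); break;      // Utf8
                    case 3: case 4: in.readInt(); break;         // Integer, Float
                    case 5: case 6: in.readLong(); i++; break;   // Long, Double use two slots
                    case 7:  nameIndex[i] = in.readUnsignedShort(); break; // Class
                    case 8:  in.readUnsignedShort(); break;      // String
                    case 9: case 10: case 11: case 12:           // Fieldref, Methodref,
                        in.readInt(); break;                     // InterfaceMethodref, NameAndType
                    case 15: in.readUnsignedByte();              // MethodHandle
                             in.readUnsignedShort(); break;
                    case 16: in.readUnsignedShort(); break;      // MethodType
                    case 17: case 18: in.readInt(); break;       // Dynamic, InvokeDynamic
                    case 19: case 20: in.readUnsignedShort(); break; // Module, Package
                    default:
                        throw new IllegalArgumentException("unknown tag " + tag);
                }
            }

            // Resolve each Class entry to its internal name, e.g. java/lang/Object.
            Set<String> names = new TreeSet<>();
            for (int i = 1; i < count; i++) {
                if (nameIndex[i] != 0) {
                    names.add(utf8[nameIndex[i]]);
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + ": "
                + referencedClasses(Files.readAllBytes(Paths.get(path))));
        }
    }
}
```

The result includes the class itself, its superclass, and array types
(which show up as descriptors such as [Ljava/lang/String;). Call graphs
and field writes would need the Code attributes as well, so they cost
more than this, but the same "scan, don't build objects" idea applies.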
Ole Ostergarrd <olef@daimi.au.dk> was also indicating some interest in
having the bytecodes be directly accessible. Perhaps you can talk to him
to get a system implemented.
Finally, if you came here to Cambridge, MA to plan out how Soot should be
extended to support direct input of Baf, I think we could get a lot of the
design done in a few days.
pat