[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Limiting whole-program analysis



On Wed, Jun 02, 2004 at 10:23:07PM -0500, Venkatesh Prasad Ranganath wrote:
> My understanding based on CC03 paper (on SPARK) is that SPARK is used to 
> calculate Points-to information and construct a precise call graph, if 
> configured.  So, when the below numbers indicate SPARK takes 140+ 
> seconds, I am wondering if it's just to calculate points-to information 
> (as in variable X points to M, N, and O abstract allocation objects) or 
> is something more being calculated?  If it's just points-to information 
> then is there way of viewing points-to information to compare it with 
> information generated from other points-to analysis implementation?  
> Also, is there a way to view call graph information for purpose of 
> comparison?

There are three main things being done while Spark runs:
1) Propagating points-to sets from allocation sites to all pointers.
2) Resolving virtual method calls based on points-to sets of their
receivers.
3) Loading methods that become reachable and converting them to the
Jimple IR.

Traditionally, part 3) has been taking the bulk of the time. Lately,
I've been playing with more interesting analyses (adding context
sensitivity), and not always worrying about raw speed. As a result, it
appears part 2) is now quite significant, and seems to be slowing things
down a lot. Part 1) still takes only a minor part of the time.

With some performance tuning, part 2) could probably be sped up so that
the overall time would be cut in half. From the sound of it, though,
that would not be enough of an improvement for you to be worth the
effort.

As for comparing the points-to information with other implementations,
this is difficult because different analyses have different abstractions
of pointers and objects being pointed to, because they use different
intermediate representations, and because the information is so big.
Spark has several methods of dumping out the information, but I've found
that for comparison, it's usually best to come up with your own method
depending on the representation that you want to compare it with.

Ondrej