[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

static analysis, worklist algorithm question



Hello,
I've implemented a customized version of flow-sensitive alias and
side-effects analyses using a FlowAnalysis framework in soot-1.2.2

(/soot-1.2.2/soot/src/soot/toolkits/scalar package)

Everything works really great for me, except for processing of while
loops. I haven't looked into the implementation of the worklist
algorithm you guys are using but by running it, it
seems to me that when a body of a while-loop is processed, it is
initially processed without any prior knowledge of data, for example,
aliases in alias analysis. It is safe but unfortunately produces imprecise
data. I'll give a small example,

    public void smallWhileTest(UnitFields p1)
    {
	UnitFields temp = p1;
	while(temp!=null)
	    temp = temp.next;

	temp.unitField = null;
    }

Using my side-effects analysis (that keeps track of certain locations, for
example, parameters), I want to produce a result saying that
parameter p1 is side-effected through the last statement.

Here is jimple:
    public void smallWhileTest(UnitFields)
    {
        Env r0;
        UnitFields r1, r2;

        r0 := @this: Env;
        r1 := @parameter0: UnitFields;
        r2 = r1;
        goto label1;

     label0:
        r2 = r2.<UnitFields: UnitFields next>;

     label1:
        if r2 != null goto label0;

        r2.<UnitFields: UnitFields unitField> = null;
        return;
    }

After processing of statement:  r2 = r1 the analysis has data

Unit:     r2 = r1
PointsTo flowThrough result set:
{r2=[@parameter0: UnitFields]
, r1=[@parameter0: UnitFields]
}

Then, the body of a while loop is processed:

Unit:     goto [?= if r2 != null[?= r2 = r2.<UnitFields: UnitFields next>]]
InSet:
{r2=[@parameter0: UnitFields]
, r1=[@parameter0: UnitFields]}

The InSet contains the aliases for r2, however, the first pass over
statement  r2 = r2.<UnitFields: UnitFields next> is done without such
knowledge:

PointsTo Analysis assignment unit: r2 = r2.<UnitFields: UnitFields next>
unknown base r2
genSet: {r2=[unknown(UnitFields)]}

Summary for this statement:

**********Unit:     r2 = r2.<UnitFields: UnitFields next>
**********InSet:    {}
**********KillSet:  {}
**********GenSet:   {r2=[unknown(UnitFields)]

What happened to InSet? It's seems empty.

The next several passes produce the desired data

{r2 = @parameter0,  next}
{r2 = @paramter0, next, next}
...
and so on up to the K-limiting value

Of course, if I simply don't record {r2=[unknown(UnitFields)]} into my
DataSet, I get the results I want. Unfortunately, I need to record unknown
locations, as I may have a program with a side-effects statement

       r.<Type: Type field> = val;

and on some paths r points to a parameter location (that I want to trace)
and on some other paths r points to something else (that I don't want to
track but can safely approximate by calling it "unknown"). In the example
I showed, r2 ALWAYS points to a parameter location and it would be nice if
the analysis could produce the precise data in this case.

Finally, questions: Making a pass over the loop statements without any
prior knowledge of data: is it a feature or a flaw of the analysis
framework? Or is it possible that it's a bug in my code that simply writes
transfer functions and ships them to the soot framework to produce a
solution?

Is it possible to have a framework where the initial pass over the loop
statements is done considering the previously obtained data (or the initial
"dummy" pass is overwritten by the following passes)?

I know you guys have Spark tool now, but I haven't looked
at it yet. How does Spark handle these cases?

Thank you very much,
sorry for the long mail,
Oksana.