[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Tightening Exception-based Control flow graph



John Plond JORGENSEN wrote:

<snip ..snip>

Following is a snippet from section 2.4 of your report.

<quote>
In the absence of exceptions, the meaning of nodes and edges in a method's control flow graph is clearly established:
 - there is a node for each instruction in the intermediate representation of the method;
 - a directed edge e from node m to node n means that after the instruction corresponding to m is executed, the instruction
   corresponding to n may be the next to execute (will be the next to execute if there is only one edge out of m);
 - if more than one edge leaves a node, the node must be some sort of conditional branch (an if or switch), so the value of
   the instruction's condition will determine which edge is taken.
 - if edge e is taken, all the work of m's instruction is completed before n's instruction begins execution.

The last two points cease to be true in the presence of exceptions.
</quote>
I have a comment regarding the last 2 bullets. In general, a control statement determines which of it's successors gets control after it completes. Given this notion an exception raising statement is a control statement. The definition was confined to control constructs such as if, while, switch, ... in the days earlier to languages supporting exceptions. So, bullet 3 is untrue in the presence of exception only if the way the successor is determined by a control statement is defined as dependent on a boolean expression. Not otherwise. As for the last bullet, asynchronous exceptions can ensure that this bullet is true. So in my opinion, exceptional edges are different from normal control flow edges but they are still control flow edges, and any statement that can raise an exception is a valid control statement.



vranganath> About the type of the exception, I do not see how vranganath> the catching of the most generic type of vranganath> exception will influence the proposed changes. vranganath> In a nested try block catching Throwable, any vranganath> exception thrown in it will be trapped by it's vranganath> handler before propogating it to the enclosing vranganath> handler (if any and if at all it does propogate). vranganath> Just based on Type equality and try-catch vranganath> semantics it should be safe to remove such vranganath> dubious nesting.

The type of exception matters because no matter what the first
statement is in the inner handler, it might throw an asynchronous
exception, which would be caught by the outer handler. That means
that the CFG has to include an edge from each of the throwing
statement's predecessors to both handlers, not just the inner
handler, since the next statement to complete execution after the
exception is thrown might be in the outer handler rather than the
inner one.

Well, independent of the type or sort of exception that may occur at the first statement of the inner handler, if the inner handler was covered by the outer catch block, this exception will be caught by the outer handler. The structured try-catch constructs in Java ensures this.


>> As for code generated for other try blocks, we want to ensure
>> that we do not lose the ability to recognize that in the
>> following code it is possible for "b" to get set to 1 even though
>> "a" remains 0:
>> >> static int a = 0;
>> static int b = 0;
>> >> void tryUpdate(int[] values, int index, >> LockObject inner, LockObject outer) {
>> try {
>> try {
>> inner.value = index; //A
>> int i = values[index]; //x
>> outer.value = i; //y
>> } catch (Throwable t) {
>> a = 1; //B
>> }
>> } catch (Throwable t) {
>> b = 1; //C
>> }


vranganath> I do not quite understand this too. Could you
vranganath> please elaborate how the proposed changes will
vranganath> prevent this situation? Cos, the following would
vranganath> be the jimple after the proposed changes and it
vranganath> does not change the order or the way exceptions
vranganath> will be handled except for the pruning the try
vranganath> blocks.


The example was supposed to provide an intuitive illustration of
why there needs to be a CFG edge from the statement I've now
labeled A to statement C as well as to B.  The edge from A to C
allows a dataflow analysis to see that it is possible for "index"
to be assigned to "inner.value" and 1 to be assigned to "b"
without any new values being assigned to "i", "outer.value", or
"a" (which is what happens if "values[index]" throws an
ArrayIndexOutOfBoundsException and then some asynchronous
exception occurs just before the assignment to a).  Of course the
CFG is actually built out of Jimple statements, where other
nuances exist, but the idea is most easily illustrated with Java
source.

Consider the following CFG: A -> B, x -> B, y -> B, B -> C

Let's consider your scenario, an exception at x will transfer control to B, but before the B begins an asynch exception occurs and transfers control to C and C completes. As the exception occurs after A, index was assigned to inner.value at A and 1 was assigned to B. So, the given graph gives enough information to capture your scenario. On the other hand, with the original graph,

	A -> B,	A -> C,
	x -> B, x -> C,
	y -> B, y -> C
	B -> C

The data flow analysis will indicate that the assignment to index.value at A is equally eligible to be visible visible at B and C which is incorrect by the exceution semantics of JVM.

vranganath> Can you please explain the above 2 scenarios,
vranganath> with a Jimple example, as to why the proposed
vranganath> changes is unsound?


For my purposes, since I am trying to build CFGs for arbirtrary
methods, there really isn't any difference between the 2
scenarios (i.e. exception handlers generated to implement
synchronized blocks and exception handlers generated from
explicit try/catch blocks in Java source). I focused on the
try/catch example because there it is plain why I want the edges
from inner block statements to both handlers. In the synchronized
block example, where the handlers just exit monitors, it is less
obvious (though the existence of the edges from the inner try block
to the outer handler for code generated by pre-1.3 javac does
illustrate what was wrong with that code: the inner monitor might
not be exited if an exception occurs at the wrong instant).

As a Jimple example, I will use the Jimple you present as your
translation for tryUpdate() (where the effect of your
transformation is to change the range of the second "catch" from
"from label0 to label3" to "from label1 to label3").  Again, I
have added labels as comments:

    vranganath>       label0:
    vranganath>          r2.<t: int value> = i0; //A
    vranganath>          i1 = r1[i0]; //B
    vranganath>          r3.<t: int value> = i1; //E

    vranganath>       label1:
    vranganath>          goto label3;

    vranganath>       label2:
    vranganath>          $r4 := @caughtexception; //C
    vranganath>          r5 = $r4;
    vranganath>          z2 = 1;

    vranganath>       label3:
    vranganath>          goto label5;

    vranganath>       label4:
    vranganath>          $r6 := @caughtexception; //D
    vranganath>          r7 = $r6;
    vranganath>          z3 = 1;

    vranganath>       label5:
    vranganath>          return;

    vranganath>          catch java.lang.Throwable from label0 to label1 with label2;
    vranganath>          catch java.lang.Throwable from label1 to label3 with label4;

There should be a CFG edge from statement A to statement D, to
reflect the fact that if statement B throws an exception that is
caught at label2, but an asynchronous exception immediately
arrives and is caught by label4, then it is possible for new
values to be assigned to r2.value and $r6 without any new values
being assigned to i1, or $r4.  Similarly there needs to be an
edge from B to D to reflect what could happen if exceptions occur
during the execution of statements E and C.

Please refer to the previous CFG explanation.


(I'm fudging something a bit here.  The "$r4 := @caughtException"
statement is really some Jimple bookkeeping which does not
correspond directly to anything in the input bytecode. So there
wouldn't be any harm in saying that our imagined asynchronous
exception cannot occur until after C completes, in which case you
wouldn't need the edge directly from A and B to D. But the CFG
building code must deal with any of Soot's intermediate
representations, not just Jimple, and in at least one of those
representations, Baf, the first statement in an exception handler
does correspond to real code.)

I think the same suggestion would be applicable to any intermediate representation as it based on language and JVM semantics and not based on the IR semantics.



It's true that the Soot you're working with will not produce the
edge from A to D. That's because when its CompleteUnitGraph
connects the predecessors of an excepting statement to the
catching handler, it fails to notice cases where the first
statement in the handler might itself throw an exception that
gets caught within the method. I fixed that by computing the
transitive closure of cascading exceptions (or at least I tried
to).


I fed your Jimple to my graph constructor, and, sure enough,
there are edges from A to D as well as C.  And this despite the
fact that for this example the "TrapTightener" pass which
precedes my CFG builder (a kludge to prevent verification errors)
actually makes exactly the same change to the trap ranges as you
do (see section 3.4.4 of my report for information about the
TrapTightener).  I had actually forgotten about the TrapTightener
until I ran the example, so I haven't thought through whether it
would necessarily make identical changes to Trap ranges as you
do.

So, to summarize this rambling reply, I believe it is true that
your proposal does not, in itself, change the behaviour of output
code. With the existing CFG implementations, though, it does
cause dataflow analyses to be less accurate than they would
otherwise be, because of the missing edges from the inner try
block to the outer try block. With my CFG implementation, the
edge would not be missing, so your modification would trim the
output exception tables, but not simplify the CFGs as you want it
to.

Don't get me defensive, I believe the above CFG example does illustrate the edges are trimmed yet the semantics of the program are preserved.


Hope this clarifies,

--

Venkatesh Prasad Ranganath,
Dept. Computing and Information Science,
Kansas State University, US.
web: http://www.cis.ksu.edu/~rvprasad