For input problem: <div class="snippet-clipboard-content notranslate position-rela

I think this is a serious issue, so <a class="user-mention notranslate" data-hovercard

With passing both -no-optimize and <code class="notra

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Erroneous CFA generation vol 2 about gazer HOT 13 CLOSED

ftsrg commented on July 18, 2024

Erroneous CFA generation vol 2

from gazer.

Comments (13)

as3810t commented on July 18, 2024

I think this is a serious issue, so @sallaigy @radl97 can one of you take a look?

from gazer.

sallaigy commented on July 18, 2024

What is the output without optimizations (-no-optimize) and without variable elimination (-elim-vars=off)? What is the output if the source file only contains the code snippet which is translated erroneously?

from gazer.

as3810t commented on July 18, 2024

With passing both -no-optimize and -elim-vars=off, the result is the same:

    loc22 -> loc23 {
        main_main_bb20_tmp24_inlined0 := (main_main_bb20__1_i_inlined0 bv_sign_extend bv[32])
        main_main_bb20_tmp25_inlined0 := (main_main_bb20__12_i_inlined0 bv_sign_extend bv[32])
        main_main_bb20_tmp26_inlined0 := (main_main_bb20_tmp24_inlined0 bvsrem main_main_bb20_tmp25_inlined0)
        main_main_bb20_tmp27_inlined0 := (main_main_bb20_tmp26_inlined0)[8:0]
    }

    loc23 -> loc24 {
    }

    loc24 -> loc18 {
        main_main_bb20__12_i_inlined0 := main_main_bb20_tmp27_inlined0
        main_main_bb20__1_i_inlined0 := main_main_bb20__12_i_inlined0
    }

from gazer.

as3810t commented on July 18, 2024

For a smaller example, the generated code is correct (also when using if instead of while). I tried it with -no-optimize and -elim-vars=off and without them as well.

signed char x = __VERIFIER_nondet_char();
signed char y = __VERIFIER_nondet_char();
signed char g;

while(x != y) {
      g = y;
      y = x % y;
      x = g;

      __VERIFIER_assert(x != y);
}

The generated excerpt:

loc18 -> loc10 {
        main_main_bb3__01_inlined0 := main_main_bb3__0_inlined0
        main_main_bb3__0_inlined0 := main_main_bb3_tmp11_inlined0
}

from gazer.

sallaigy commented on July 18, 2024

This is a PHI assignment. Are you confident that the assignment on the other incoming branch does not make this code correct? Does the verification produce an incorrect result (either with theta or the BMC algorithm)?

from gazer.

as3810t commented on July 18, 2024

The verification produces an incorrect result for the problem in the first comment with gazer-theta, but a correct result with gazer-bmc.

In the example the error location should not be reachable, however, gazer-theta produces a trace, which claims that the return value of the gcd function is 0, which can be traced back to that incorrectly handled swap:

Error trace:
------------
#0 in function main():
  call __VERIFIER_nondet_char() returned #5Dbv8          at 29:21
  x := 93       (0b01011101)     at 29:21
  call __VERIFIER_nondet_char() returned #3bv8           at 30:21
  y := 3        (0b00000011)     at 30:21
#1 in function gcd_test(#5Dbv8,#3bv8):
  a := 93       (0b01011101)     at 29:21
  b := 3        (0b00000011)     at 30:21
  a := 93       (0b01011101)     at 15:9
  b := 3        (0b00000011)     at 16:9
  a := 93       (0b01011101)
  b := 3        (0b00000011)
  t := 3        (0b00000011)
  b := 0        (0b00000000)     at 20:13
  a := 3        (0b00000011)
  a := 0        (0b00000000)
  b := 0        (0b00000000)
  a := 0        (0b00000000)
  a := 0        (0b00000000)
  return #0bv8
#2 in function main():
  g := 0        (0b00000000)
#3 in function __VERIFIER_assert(#0bv32):
  cond := 0     (0b00000000000000000000000000000000)     at 36:29

from gazer.

radl97 commented on July 18, 2024

Can you run theta-cfa -pipeline test.c?
This should create a digraph for all the different loops and functions. I think only gcd and it's loop is important here. (it uses the CFA representation by Gazer, not the Theta one)

from gazer.

as3810t commented on July 18, 2024

@radl97 The digraph can be found here

from gazer.

sallaigy commented on July 18, 2024

I looked at the generated automata and the original LLVM module. The generated code seems to be a faithful representation of the code coming from LLVM, which also looks correct (and BMC provides a correct result on it).

%.12.i contains the srem result from the previous iteration and the eventual return value %.1.lcssa is set to contain this value.

bb13:                                             ; preds = %bb15, %bb8
  %.12.i = phi i8 [ %.01.i, %bb8 ], [ %tmp19, %bb15 ], !dbg !28
  %.1.i = phi i8 [ %spec.select.i, %bb8 ], [ %.12.i, %bb15 ], !dbg !28
  call void @llvm.dbg.value(metadata i8 %.1.i, metadata !33, metadata !DIExpression()) #4, !dbg !28
  call void @llvm.dbg.value(metadata i8 %.12.i, metadata !34, metadata !DIExpression()) #4, !dbg !28
  %tmp14 = icmp eq i8 %.12.i, 0, !dbg !46                                                    ; b != 0
  br i1 %tmp14, label %gcd_test.exit, label %bb15, !dbg !45

bb15:                                             ; preds = %bb13
  call void @llvm.dbg.value(metadata i8 %.12.i, metadata !35, metadata !DIExpression()) #4, !dbg !28 ; t = b
  %tmp16 = sext i8 %.1.i to i32, !dbg !47
  %tmp17 = sext i8 %.12.i to i32, !dbg !49
  %tmp18 = srem i32 %tmp16, %tmp17, !dbg !50
  %tmp19 = trunc i32 %tmp18 to i8, !dbg !47
  call void @llvm.dbg.value(metadata i8 %tmp19, metadata !34, metadata !DIExpression()) #4, !dbg !28
  call void @llvm.dbg.value(metadata i8 %.12.i, metadata !33, metadata !DIExpression()) #4, !dbg !28
  br label %bb13, !dbg !45, !llvm.loop !51

gcd_test.exit:                                    ; preds = %bb13
  %.1.i.lcssa = phi i8 [ %.1.i, %bb13 ], !dbg !28

@as3810t Unless you are able to pinpoint the semantic difference yourself, I would recommend to look into the backend algorithm executed by theta.

from gazer.

radl97 commented on July 18, 2024

@as3810t I think you sent the wrong file.
Note that there are multiple generated files, one for each loop and each function.

The interesting files should be .gcd.dot and .gcd.loop.dot in this context.

Oh, I think gazer-cfa does not exactly work like that... :/ Sorry

from gazer.

as3810t commented on July 18, 2024

I found the following cases, where I think the outputted CFA is semantically different than the input C program. In each case, I generated the CFA with gazer-theta <file> --model-only -o <outfile>, without any additional parameters. The textual CFA is the exact output of gazer-cfa, while on the image I did some modification to make it more readable:

I text replaced (some of) the variable names to match their C source counterpart
I deleted nodes that had only one skip statement on their only outgoing edge.

Filename	Source	CFA (text)	CFA (img)	Notes
gcd_1.c	.c	.cfa	.png	The swap in lines 19-21 in `gcd_1.c` does not match the operations on transition between `loc24 -> loc21`. To verify let `x = 93`, `y = 3`. In this scenario, the assertion passes in the C file (`g = 3`), but fails in the CFA (`g = 0`).
num_conversion_2.c	.c	.cfa	.png	The cycles (lines 21-27 and `loc15`-`loc18`) do not match, as in the CFA, the order of the assignment on `y` and on `c` is swapped. To verify let `x = 9`. In this scenario, the assertion passes in the C file (`y = 9`), but fails in the CFA (`y = 8`).
sum02-2.c	.c	.cfa	.png	The cycles (lines 21-23 and `loc12`-`loc18`) do not match, as in the CFA, the order of the assignment on `i` and on `sn` is swapped. To verify let `n = 0`. In this scenario, the assertion passes in the C file (`sn = 0`), but fails in the CFA (`sn = 1`).

All cases have in common:

gazer-theta verifies incorrectly (with PRED_CART and NWT_IT_WP) and produces a counterexample that fits the input CFA (manually checked every one of them)
gazer-bmc verifies correctly (or bound reached)

from gazer.

radl97 commented on July 18, 2024

Thanks for the detailed report.

Gazer's CFA assumes some kind of SSA-form, but uses more assumptions right now, I think. (A strict SSA would need Phi nodes. Now the only assumption is (should be) that one loop or one function assigns only one value per path for a given variable)

This is handled by "PhiInputs". At the start of the loop, when a variable is changed in the loop. There could be two assignments where the value could come from: from before the loop or inside the loop in a previous iteration.
The loop is transformed to a function with multiple inputs or outputs.

So gcd_loop(a,b) returns gcd_loop(b,a%b). I think the problem lies in the phi input assignments affecting each other (when building up these in Theta).

Input assignments are part of the inlining process when instead of "calling", one assigns the input variables with the given value.

This could be in some inlining mechanism or in RecursiveToCyclicCfa.cpp lines 192- and 257-

from gazer.

radl97 commented on July 18, 2024

The solution could be a bit hard. Assigning in two stages (a' := b; b' := a%b; a:=a'; b:=b';) could be a solution.

from gazer.

Erroneous CFA generation vol 2 about gazer HOT 13 CLOSED

Comments (13)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent