Comments (13)
I think this is a serious issue, so @sallaigy @radl97 can one of you take a look?
from gazer.
What is the output without optimizations (-no-optimize
) and without variable elimination (-elim-vars=off
)? What is the output if the source file only contains the code snippet which is translated erroneously?
from gazer.
With passing both -no-optimize
and -elim-vars=off
, the result is the same:
loc22 -> loc23 {
main_main_bb20_tmp24_inlined0 := (main_main_bb20__1_i_inlined0 bv_sign_extend bv[32])
main_main_bb20_tmp25_inlined0 := (main_main_bb20__12_i_inlined0 bv_sign_extend bv[32])
main_main_bb20_tmp26_inlined0 := (main_main_bb20_tmp24_inlined0 bvsrem main_main_bb20_tmp25_inlined0)
main_main_bb20_tmp27_inlined0 := (main_main_bb20_tmp26_inlined0)[8:0]
}
loc23 -> loc24 {
}
loc24 -> loc18 {
main_main_bb20__12_i_inlined0 := main_main_bb20_tmp27_inlined0
main_main_bb20__1_i_inlined0 := main_main_bb20__12_i_inlined0
}
from gazer.
For a smaller example, the generated code is correct (also when using if instead of while). I tried it with -no-optimize
and -elim-vars=off
and without them as well.
signed char x = __VERIFIER_nondet_char();
signed char y = __VERIFIER_nondet_char();
signed char g;
while(x != y) {
g = y;
y = x % y;
x = g;
__VERIFIER_assert(x != y);
}
The generated excerpt:
loc18 -> loc10 {
main_main_bb3__01_inlined0 := main_main_bb3__0_inlined0
main_main_bb3__0_inlined0 := main_main_bb3_tmp11_inlined0
}
from gazer.
This is a PHI assignment. Are you confident that the assignment on the other incoming branch does not make this code correct? Does the verification produce an incorrect result (either with theta or the BMC algorithm)?
from gazer.
The verification produces an incorrect result for the problem in the first comment with gazer-theta
, but a correct result with gazer-bmc
.
In the example the error location should not be reachable, however, gazer-theta
produces a trace, which claims that the return value of the gcd function is 0
, which can be traced back to that incorrectly handled swap:
Error trace:
------------
#0 in function main():
call __VERIFIER_nondet_char() returned #5Dbv8 at 29:21
x := 93 (0b01011101) at 29:21
call __VERIFIER_nondet_char() returned #3bv8 at 30:21
y := 3 (0b00000011) at 30:21
#1 in function gcd_test(#5Dbv8,#3bv8):
a := 93 (0b01011101) at 29:21
b := 3 (0b00000011) at 30:21
a := 93 (0b01011101) at 15:9
b := 3 (0b00000011) at 16:9
a := 93 (0b01011101)
b := 3 (0b00000011)
t := 3 (0b00000011)
b := 0 (0b00000000) at 20:13
a := 3 (0b00000011)
a := 0 (0b00000000)
b := 0 (0b00000000)
a := 0 (0b00000000)
a := 0 (0b00000000)
return #0bv8
#2 in function main():
g := 0 (0b00000000)
#3 in function __VERIFIER_assert(#0bv32):
cond := 0 (0b00000000000000000000000000000000) at 36:29
from gazer.
Can you run theta-cfa -pipeline test.c
?
This should create a digraph for all the different loops and functions. I think only gcd and it's loop is important here. (it uses the CFA representation by Gazer, not the Theta one)
from gazer.
@radl97 The digraph can be found here
from gazer.
I looked at the generated automata and the original LLVM module. The generated code seems to be a faithful representation of the code coming from LLVM, which also looks correct (and BMC provides a correct result on it).
%.12.i
contains the srem result from the previous iteration and the eventual return value %.1.lcssa
is set to contain this value.
bb13: ; preds = %bb15, %bb8
%.12.i = phi i8 [ %.01.i, %bb8 ], [ %tmp19, %bb15 ], !dbg !28
%.1.i = phi i8 [ %spec.select.i, %bb8 ], [ %.12.i, %bb15 ], !dbg !28
call void @llvm.dbg.value(metadata i8 %.1.i, metadata !33, metadata !DIExpression()) #4, !dbg !28
call void @llvm.dbg.value(metadata i8 %.12.i, metadata !34, metadata !DIExpression()) #4, !dbg !28
%tmp14 = icmp eq i8 %.12.i, 0, !dbg !46 ; b != 0
br i1 %tmp14, label %gcd_test.exit, label %bb15, !dbg !45
bb15: ; preds = %bb13
call void @llvm.dbg.value(metadata i8 %.12.i, metadata !35, metadata !DIExpression()) #4, !dbg !28 ; t = b
%tmp16 = sext i8 %.1.i to i32, !dbg !47
%tmp17 = sext i8 %.12.i to i32, !dbg !49
%tmp18 = srem i32 %tmp16, %tmp17, !dbg !50
%tmp19 = trunc i32 %tmp18 to i8, !dbg !47
call void @llvm.dbg.value(metadata i8 %tmp19, metadata !34, metadata !DIExpression()) #4, !dbg !28
call void @llvm.dbg.value(metadata i8 %.12.i, metadata !33, metadata !DIExpression()) #4, !dbg !28
br label %bb13, !dbg !45, !llvm.loop !51
gcd_test.exit: ; preds = %bb13
%.1.i.lcssa = phi i8 [ %.1.i, %bb13 ], !dbg !28
@as3810t Unless you are able to pinpoint the semantic difference yourself, I would recommend to look into the backend algorithm executed by theta.
from gazer.
@as3810t I think you sent the wrong file.
Note that there are multiple generated files, one for each loop and each function.
The interesting files should be .gcd.dot and .gcd.loop.dot in this context.
Oh, I think gazer-cfa does not exactly work like that... :/ Sorry
from gazer.
I found the following cases, where I think the outputted CFA is semantically different than the input C program. In each case, I generated the CFA with gazer-theta <file> --model-only -o <outfile>
, without any additional parameters. The textual CFA is the exact output of gazer-cfa
, while on the image I did some modification to make it more readable:
- I text replaced (some of) the variable names to match their C source counterpart
- I deleted nodes that had only one skip statement on their only outgoing edge.
Filename | Source | CFA (text) | CFA (img) | Notes |
---|---|---|---|---|
gcd_1.c | .c | .cfa | .png | The swap in lines 19-21 in gcd_1.c does not match the operations on transition between loc24 -> loc21 . To verify let x = 93 , y = 3 . In this scenario, the assertion passes in the C file (g = 3 ), but fails in the CFA (g = 0 ). |
num_conversion_2.c | .c | .cfa | .png | The cycles (lines 21-27 and loc15 -loc18 ) do not match, as in the CFA, the order of the assignment on y and on c is swapped. To verify let x = 9 . In this scenario, the assertion passes in the C file (y = 9 ), but fails in the CFA (y = 8 ). |
sum02-2.c | .c | .cfa | .png | The cycles (lines 21-23 and loc12 -loc18 ) do not match, as in the CFA, the order of the assignment on i and on sn is swapped. To verify let n = 0 . In this scenario, the assertion passes in the C file (sn = 0 ), but fails in the CFA (sn = 1 ). |
All cases have in common:
gazer-theta
verifies incorrectly (with PRED_CART and NWT_IT_WP) and produces a counterexample that fits the input CFA (manually checked every one of them)gazer-bmc
verifies correctly (or bound reached)
from gazer.
Thanks for the detailed report.
Gazer's CFA assumes some kind of SSA-form, but uses more assumptions right now, I think. (A strict SSA would need Phi nodes. Now the only assumption is (should be) that one loop or one function assigns only one value per path for a given variable)
This is handled by "PhiInputs". At the start of the loop, when a variable is changed in the loop. There could be two assignments where the value could come from: from before the loop or inside the loop in a previous iteration.
The loop is transformed to a function with multiple inputs or outputs.
So gcd_loop(a,b) returns gcd_loop(b,a%b). I think the problem lies in the phi input assignments affecting each other (when building up these in Theta).
Input assignments are part of the inlining process when instead of "calling", one assigns the input variables with the given value.
This could be in some inlining mechanism or in RecursiveToCyclicCfa.cpp lines 192- and 257-
from gazer.
The solution could be a bit hard. Assigning in two stages (a' := b; b' := a%b; a:=a'; b:=b';
) could be a solution.
from gazer.
Related Issues (20)
- Jumping out of a loop from two different blocks to a same exit block breaks the translation of PHI values
- Basic format of the configuration description HOT 2
- Portfolio script as entry point of the docker file HOT 1
- Move CI to GH Actions
- Segmentation fault in a program with return in a nested loop
- Test harnesses should mock unused functions
- Support SV-COMP ReachSafety-Arrays with BMC HOT 1
- Support SV-COMP ReachSafety-Arrays with Theta
- Support SV-COMP ReachSafety-BitVectors with Theta
- Support SV-COMP ReachSafety-Floats with Theta
- Support SV-COMP ReachSafety-Heap with BMC HOT 2
- Support SV-COMP ReachSafety-Heap with Theta HOT 1
- Support SV-COMP ReachSafety-Sequentialized with BMC HOT 1
- Support SV-COMP ReachSafety-Sequentialized with Theta HOT 1
- Support SV-COMP ReachSafety-Combinations with Theta
- Support SV-COMP ReachSafety-Combined with BMC HOT 1
- Support 'malloc' in flat memory model
- Gitignore for portfolio HOT 3
- Test Docker image with GH Actions
- Discussion: Upgrading LLVM framework? HOT 5
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gazer.