Coder Social home page Coder Social logo

Comments (9)

msoos avatar msoos commented on June 6, 2024

Wow, great catch, thank you so much! I can 100% reproduce it, it's definitely a bug :S I am expecting it to be a CNF reading bug, but it could also be an ANF transformation bug that makes SAT problems into UNSAT problems, which we currently don't check for (sorry). We need to:

  1. Add CNF generation to the fuzzer
  2. Create a better fuzz harness that can detect SAT->UNSAT

In the meanwhile, I'll try to get to the bottom of this. Thanks again, and I'm sorry you hit this issue :S

Mate

from bosphorus.

aljungberg avatar aljungberg commented on June 6, 2024

Happy that I'm able to contribute something!

My intuition was that maybe it was some simple CNF->ANF conversion bug so I ran this test as well:

docker run --rm -v `pwd`/:/dat/ 9dc02c0e39e1 --cnfread /dat/d17-55.cnf --anfwrite /dat/tmp.anf --simplify=0
docker run --rm -v `pwd`/:/dat/ 9dc02c0e39e1 --anfread /dat/tmp.anf --cnfwrite /dat/tmp.cnf --simplify 0

(9dc02c0e39e1 is the image id of msoos/bosphorus:latest.)

So basically I used Bosphorus to convert the problem to ANF and then back to CNF, see if that corrupts the problem set. However CryptoMiniSat can solve the CNF even after the Bosphorus roundtrip:

cryptominisat5 tmp.cnf
...
s SATISFIABLE
v 1 2 -3 4 -5 6 -7 -8 -9 10 11 12 -13 -14 -15 16 -17 -18 19 20 21 22 -23 24 -25

So whatever the bug is it's not quite that simple to trigger.

I also tried going back to the original problem and rewrote that software to output the problem as native ANF directly. Bosphorus couldn't solve that either. I'm not sure that proves anything though for two reasons:

  1. I didn't debug this ANF generation code much so I don't fully trust it. If I generated incorrect ANF, of course Bosphorus can't solve it. If I have the time after work tomorrow I'll double check this.
  2. Even if my native ANF is correct, Bosphorus internally converts to CNF and back again for the SAT solver phase so if the bug is in the CNF translator, it could still be triggered! The only way to avoid that would be to disable the SAT phase but Bosphorus needs that part to generate the solution.

from bosphorus.

msoos avatar msoos commented on June 6, 2024

Thanks for the help! I have checked, and

./bosphorus --cnfread d17-55.cnf --cnfwrite x
./cryptominisat5 x

Also triggers the bug. So given what you wrote above and this test, it's likely an ANF->ANF transformation that triggers the bug. To be honest, that was the most likely scenario, since it's the most technically complicated part of the system that lacks a fuzzer for SAT->UNSAT. So I really need an ANF solver that can find UNSAT, otherwise I will never be able to fuzz-generate this or similar issues, which means it'll be hard to generate a minimal examples and hard to keep the system safe.

I'll come back to this in a week if that's OK -- there is a paper deadline for AAAI-2021 and I wanna get that done before I debug this. Would that be OK? Thanks,

Mate

from bosphorus.

aljungberg avatar aljungberg commented on June 6, 2024

Of course, good luck with your paper! Didn't know you were doing AI research too. (Sometimes when I look at a SAT solver it does feel like some kind of intelligence, being able to find solutions to these tremendously complex non-linear problems.)

Regarding fuzzing ANF problems: as far as I know no-one wrote a direct ANF solver and indeed Bosphorus is the closest thing to such a program. So finding the base truth is tricky (which I ran into when trying to figure out this bug! It would have been convenient to quickly check the ANF with another tool to see if it agreed it was UNSAT.)

I think as an alternative, we could possibly take this problem and just gradually simplify it by relaxing constraints while preserving our bug invariant (Bosphorus fails and CMS succeeds). When we can no longer find a constraint that can be deleted or relaxed while preserving the invariant, we have a minimal test case.

Just doing some back of the napkin calculations, CryptoMiniSat takes about 4 seconds to solve this problem on my machine, and Bosphorus takes about 31 seconds. There are 233217 clauses and if we assume we have to delete most of them to find a tractable minimal example this process, that's over 2000 CPU hours to make one deletion attempt per clause. Hmm that's a little long. The time per test would go down as the problem successfully shrinks, but still.

We could do some kind of binary search/bisection to work on many clauses at once. Or perhaps rather than bisecting we can do each step in larger chunks. Maybe we start by trying to delete some subselection of 1<<16 clauses while preserving the invariant, then when we can't find a chunk like that anymore we go to 1<<15 etc.

Just thinking out loud, obviously you know much better than I how to troubleshoot this. I'm just a software engineer with a hobby.

from bosphorus.

msoos avatar msoos commented on June 6, 2024

Just a minor update that I finally managed to git my AAAI deadline, so I will now have time to fix this up. Sorry for taking so long.

from bosphorus.

aljungberg avatar aljungberg commented on June 6, 2024

This may now be fixed actually! I fuzzed Bosphorus until I found a simpler failure which is included in my PR #30. Can't guarantee the bugs are the same but when I run the previous Bosphorus for 1 iteration on this problem it reports UNSAT -- when I run the fixed version it finds a solution.

from bosphorus.

msoos avatar msoos commented on June 6, 2024

Oh wow, wait, what fuzzer did you use? I couldn't previously detect SAT->UNSAT because I didn't bother (sorry...) to find a tool that can read in an ANF and find a solution. What did you use? Can you please add that too? I'm sorry I failed here.

from bosphorus.

aljungberg avatar aljungberg commented on June 6, 2024

I did write a very crude exhaustive search ANF solver but in the end I think it was wasted effort! (Too crude really to publish I think.)

The existing fuzzer could have been modified very slightly to find this error I think:

  • after ./anf_gen.py $r > $dir/problem.anf
  • convert the ANF to CNF with no simplification
  • count the solutions using traditional methods for CNFs
  • run bosphorus on the original problem.anf this time with simplifications on
  • the new CNF should have the same number of solutions

In my case the minimal reproduction I found and used in #30 wasn't even SAT->UNSAT in the end, just SAT->(different) SAT.

from bosphorus.

msoos avatar msoos commented on June 6, 2024

Ah, of course! convert the ANF to CNF with no simplification --> Yep, that would have found it! Sorry, I actually found a bunch of issues with parsing, too, so I thought, I need the "big guns", a full-on ANF solver, but you are right. That would have done the trick. A lesson learnt! Let me do that now :)

from bosphorus.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.