Coder Social home page Coder Social logo

Comments (10)

pfalcon avatar pfalcon commented on July 29, 2024

Ok, I see ;-) :

  def remove_ssa_form(self):
    """ transform the flow out of ssa form. """

    if self.has_theta_expressions():
      return #raise RuntimeError('not yet implemented')

from decompiler.

EiNSTeiN- avatar EiNSTeiN- commented on July 29, 2024

Yup, the "out of ssa" code is not written yet :)

from decompiler.

EiNSTeiN- avatar EiNSTeiN- commented on July 29, 2024

I implemented a simple ssa back transform with copy assignments in bc990c0. It does not do variable coalescing yet but I will attempt to implement that at some point.

from decompiler.

pfalcon avatar pfalcon commented on July 29, 2024

Nice, looking forward to find bugs in it ;-).

Btw, the title of thsi ticket was "loop decompilation", I'm not sure if it's helpful reminded or not that it's still not implemented yet. Btw, I experimented with loop recognition and trivial case are, well, trivial. Slightly less trivial cases are very well possible. (And real real-world cases are expectedly hard.)

from decompiler.

EiNSTeiN- avatar EiNSTeiN- commented on July 29, 2024

Ah, right. Arranging loops into higher level while() and for() in the decompiled output is not implemented yet. Its one of the next things in my todo list. I guess I'll reopen this.

from decompiler.

EiNSTeiN- avatar EiNSTeiN- commented on July 29, 2024

I rewrote the entire control flow "reconstruction" code in the last week. The old way of combining blocks was just not working, this new way will evolve a lot better I think.

Btw, I experimented with loop recognition and trivial case are, well, trivial. Slightly less trivial cases are very well possible. (And real real-world cases are expectedly hard.)

I'd be interested in your more complex use cases. There are very few complex cases in the tests currently, I would love examples where my code just fails, or the decompilation is incorrect.

from decompiler.

pfalcon avatar pfalcon commented on July 29, 2024

Thanks for the updates!

I'd be interested in your more complex use cases.

So, that all bumps into #9. How to pass them to you, in what form they should be? Actually, I had to scratch my head how to run even the example in the description of this ticket - you codebase doesn't have a utility which can be fed with an IR file. In the end I dug out my older test with inline IR, then scratched my had what PYTHONPATH to run it with (made a runner shell script this type).

And in the end what I get is:

$ ./run.sh ir_test.py 
more than one group
Traceback (most recent call last):
  File "ir_test.py", line 21, in <module>
    dec.step_until(step_decompiled)
  File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/decompiler.py", line 413, in step_until
    for step in self.step_generator:
  File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/decompiler.py", line 509, in steps
    self.ssa_tagger.remove_ssa_form()
  File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/ssa.py", line 372, in remove_ssa_form
    t.transform()
  File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/ssa.py", line 565, in transform
    self.rename_groups(phi, groups)
  File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/ssa.py", line 557, in rename_groups
    raise 'not implemented'
TypeError: exceptions must be old-style classes or derived from BaseException, not str

I.e. it doesn't even use modern Python raise ;-).

from decompiler.

EiNSTeiN- avatar EiNSTeiN- commented on July 29, 2024

How to pass them to you, in what form they should be?

What about assembly?

And in the end what I get is:

Ah, that example is hitting one of the cases for which I didn't have a test yet. Thanks for reminding me about it, I will work on it.

I was asking the question because I want to know if you actually have a set of use-cases of complex loops that you test other decompilers with?

from decompiler.

pfalcon avatar pfalcon commented on July 29, 2024

Anyway, answering your question, you can hopefully find a lot of inspiration in decomp/decomp#172 . As you can see, I was so excited by Van Emmerik's claims that "structuring problem is largely solved" and quoting Cifuentes, that I didn't get lazy to dig out her code, 16-bit compiler, and tested it. Then I tested Van Emmerik's own stuff. What I saw is terrible things ;-).

That ticket also quotes https://github.com/pfalcon/ScratchABlock , which is thing I started as a "response" to your comments on #9. The human-readable/writable, externalized IR is the core of the project (note that for structuring, only control transfer instructions are required, so the rest of syntax is just literal so far ;-) ), a utility to read that format, apply a transformation, and render result as a basic block list and a .dot file is the central user interface, and tests are written not with self.assertsWhichWerePutInPythonAsAJavaDiversion(), but by running aforementioned utility and then diffing results - i.e., everything is highly inviting for human being to come by and hack. I actually hacked that up in 2-3 days after our discussion in #9, now leisurely refactor/clean up and spool to github.

Finally, the ticket above also contains an example of non-trivial, but still human-graspable CFG. My code can structure it a bit, and I already have understanding what the other parts are, and how to structure them, though I'm not exactly sure the result would be close to the original code. The whole title of that ticket says that just shape of (sub)CFG is not enough to properly recover structure, at the very least, also analyzing and collating graphs based on branch condition codes is required, possible in non-trivial way. Good example of that is while loop. All folks represent it based on assumption that it starts with unconditional jump into loop header. But everyone who learned assembler first, high-level language then, knows that while loop is not that. While is if followed by do while, i.e.

if (c) {
  do {
    ...
  } while (c);
}

|
v

while (c) {
  ...
}

To match the above, one needs to match equivalence of conditions on different edges.

No paper I read so far (and I've read quite a lot by now) discussed the above trivial example. So, structuring problem is largely solved.

from decompiler.

pfalcon avatar pfalcon commented on July 29, 2024

What about assembly?

I have them in PseudoC assembler (which is what any decompiler should use IMHO). But original is in Xtensa assembler if you care.

from decompiler.

Related Issues (11)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.