Comments (10)
Ok, I see ;-) :
def remove_ssa_form(self):
""" transform the flow out of ssa form. """
if self.has_theta_expressions():
return #raise RuntimeError('not yet implemented')
from decompiler.
Yup, the "out of ssa" code is not written yet :)
from decompiler.
I implemented a simple ssa back transform with copy assignments in bc990c0. It does not do variable coalescing yet but I will attempt to implement that at some point.
from decompiler.
Nice, looking forward to find bugs in it ;-).
Btw, the title of thsi ticket was "loop decompilation", I'm not sure if it's helpful reminded or not that it's still not implemented yet. Btw, I experimented with loop recognition and trivial case are, well, trivial. Slightly less trivial cases are very well possible. (And real real-world cases are expectedly hard.)
from decompiler.
Ah, right. Arranging loops into higher level while() and for() in the decompiled output is not implemented yet. Its one of the next things in my todo list. I guess I'll reopen this.
from decompiler.
I rewrote the entire control flow "reconstruction" code in the last week. The old way of combining blocks was just not working, this new way will evolve a lot better I think.
Btw, I experimented with loop recognition and trivial case are, well, trivial. Slightly less trivial cases are very well possible. (And real real-world cases are expectedly hard.)
I'd be interested in your more complex use cases. There are very few complex cases in the tests currently, I would love examples where my code just fails, or the decompilation is incorrect.
from decompiler.
Thanks for the updates!
I'd be interested in your more complex use cases.
So, that all bumps into #9. How to pass them to you, in what form they should be? Actually, I had to scratch my head how to run even the example in the description of this ticket - you codebase doesn't have a utility which can be fed with an IR file. In the end I dug out my older test with inline IR, then scratched my had what PYTHONPATH to run it with (made a runner shell script this type).
And in the end what I get is:
$ ./run.sh ir_test.py
more than one group
Traceback (most recent call last):
File "ir_test.py", line 21, in <module>
dec.step_until(step_decompiled)
File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/decompiler.py", line 413, in step_until
for step in self.step_generator:
File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/decompiler.py", line 509, in steps
self.ssa_tagger.remove_ssa_form()
File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/ssa.py", line 372, in remove_ssa_form
t.transform()
File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/ssa.py", line 565, in transform
self.rename_groups(phi, groups)
File "/mnt/hdd/projects-3rdparty/RevEng/Decompilers/PYTHON/decompiler/src/ssa.py", line 557, in rename_groups
raise 'not implemented'
TypeError: exceptions must be old-style classes or derived from BaseException, not str
I.e. it doesn't even use modern Python raise ;-).
from decompiler.
How to pass them to you, in what form they should be?
What about assembly?
And in the end what I get is:
Ah, that example is hitting one of the cases for which I didn't have a test yet. Thanks for reminding me about it, I will work on it.
I was asking the question because I want to know if you actually have a set of use-cases of complex loops that you test other decompilers with?
from decompiler.
Anyway, answering your question, you can hopefully find a lot of inspiration in decomp/decomp#172 . As you can see, I was so excited by Van Emmerik's claims that "structuring problem is largely solved" and quoting Cifuentes, that I didn't get lazy to dig out her code, 16-bit compiler, and tested it. Then I tested Van Emmerik's own stuff. What I saw is terrible things ;-).
That ticket also quotes https://github.com/pfalcon/ScratchABlock , which is thing I started as a "response" to your comments on #9. The human-readable/writable, externalized IR is the core of the project (note that for structuring, only control transfer instructions are required, so the rest of syntax is just literal so far ;-) ), a utility to read that format, apply a transformation, and render result as a basic block list and a .dot file is the central user interface, and tests are written not with self.assertsWhichWerePutInPythonAsAJavaDiversion(), but by running aforementioned utility and then diffing results - i.e., everything is highly inviting for human being to come by and hack. I actually hacked that up in 2-3 days after our discussion in #9, now leisurely refactor/clean up and spool to github.
Finally, the ticket above also contains an example of non-trivial, but still human-graspable CFG. My code can structure it a bit, and I already have understanding what the other parts are, and how to structure them, though I'm not exactly sure the result would be close to the original code. The whole title of that ticket says that just shape of (sub)CFG is not enough to properly recover structure, at the very least, also analyzing and collating graphs based on branch condition codes is required, possible in non-trivial way. Good example of that is while loop. All folks represent it based on assumption that it starts with unconditional jump into loop header. But everyone who learned assembler first, high-level language then, knows that while loop is not that. While is if followed by do while, i.e.
if (c) {
do {
...
} while (c);
}
|
v
while (c) {
...
}
To match the above, one needs to match equivalence of conditions on different edges.
No paper I read so far (and I've read quite a lot by now) discussed the above trivial example. So, structuring problem is largely solved.
from decompiler.
What about assembly?
I have them in PseudoC assembler (which is what any decompiler should use IMHO). But original is in Xtensa assembler if you care.
from decompiler.
Related Issues (11)
- What's THETA function? HOT 5
- Unit tests not working HOT 2
- decompiler import not found
- Proposing a PR to fix a few small typos
- How to use assembly as input?
- Call arguments not detected
- Overlapping registers not handled correctly
- IDA UI is broken HOT 1
- "ir-parser" disasm error HOT 3
- Please make "ir-parser" "disassembler" a first-class component HOT 7
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from decompiler.