bdcht / amoco

yet another tool for analysing binaries

License: GNU General Public License v2.0

Python 99.85% Assembly 0.06% C 0.09% QML 0.01%
python reverse-engineering assembly-language symbolic-execution graphs

amoco's People

Contributors

5df, agustingianni, bdcht, deepio, fabaff, lgtm-migrator, lrgh, qsantos, yrp604, zachriggle

amoco's Issues

x64: bug in register sizes when decoding some SSE instructions

Example:

>>> from amoco.arch.x64 import cpu_x64
>>> str(cpu_x64.disassemble('\xf3\x0f\x2a\xc0'))
'cvtsi2ss    xmm0, rax'

should return cvtsi2ss xmm0, eax

Same for cvtsi2sd.

For these examples, it seems to be sufficient to replace set_opdsz_64 by set_opdsz_32 in the definition in spec_sse.py (and to create set_opdsz_32 in utils.py).

x86 and x64: bug in push argument size

>>> cpu_x86.disassemble('\x66\x6a\x08').operands[0].size
8
>>> cpu_x64.disassemble('\x66\x6a\x08').operands[0].size
8

Should be 16.

The Intel manual recommends that 0x66 0x68 N N is used for 16-bit push, but GNU as (at least version 2.25) generates 0x66 0x6A 0x08 when asked to assemble pushw $8.
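
As a rough illustration of the expected size, here is a minimal sketch (not amoco code; the helper names push_operand_size, sign_extend and decode_push_imm8 are made up for this example). It assumes that the operand size of push imm8 is the default of the mode unless a 0x66 prefix selects 16 bits, and that the 8-bit immediate is sign-extended to that size.

```python
def push_operand_size(mode_bits, has_66_prefix):
    """Operand size in bits of a PUSH, for mode_bits in (32, 64).

    Assumes the default operand size of the mode (32 in protected mode,
    64 in long mode) and that the 0x66 prefix selects 16 bits in both.
    """
    if mode_bits not in (32, 64):
        raise ValueError("mode_bits must be 32 or 64")
    return 16 if has_66_prefix else mode_bits


def sign_extend(value, from_bits, to_bits):
    """Sign-extend an unsigned from_bits-wide value to to_bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def decode_push_imm8(code, mode_bits):
    """Decode '6a ib' with an optional leading 0x66; return (size, value)."""
    has_66 = code[0] == 0x66
    opcode_at = 1 if has_66 else 0
    if code[opcode_at] != 0x6A:
        raise ValueError("not a push imm8")
    size = push_operand_size(mode_bits, has_66)
    return size, sign_extend(code[opcode_at + 1], 8, size)
```

With these assumptions, the byte sequence 66 6a 08 from the report yields a 16-bit operand in both modes.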

64-bit imul instruction is broken

It looks like amoco lacks support for 64-bit imul.

movabs rax,0xbdad7412bb99a0c1
movabs r10,0x9a899ae230f27801
imul   r10

import amoco
import amoco.arch.x64.cpu_x64 as cpu

shellcode = 'H\xb8\xc1\xa0\x99\xbb\x12t\xad\xbdI\xba\x01x\xf20\xe2\x9a\x89\x9aI\xf7\xea'

loc = cpu.cst(0x8048380,64)
instr = [ ]
while len(shellcode)>0:
    i = cpu.disassemble(shellcode,address=loc)
    l = i.length
    instr.append(i)
    shellcode = shellcode[l:]
    loc += l
b = amoco.code.block(instr)
print b
print b.map

/home/user/.pyenv/versions/2.7.9/lib/python2.7/site-packages/amoco/arch/x64/asm.pyc in i_IMUL(i, fmap)
    915   if len(i.operands)==1:
    916     src = i.operands[0]
--> 917     m,d = {8:(al,ah), 16:(ax,dx), 32:(eax,edx)}[src.size]
    918     r = fmap(m**src)
    919   elif len(i.operands)==2:

KeyError: 64
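
The traceback points at a size table with no 64-bit entry. The following is a standalone sketch of the one-operand IMUL semantics on plain integers, with that entry added. It is not amoco's i_IMUL; the names IMUL_DEST, to_signed and imul_one_operand are hypothetical and only illustrate how the double-width signed product is split across the two implicit registers.

```python
# Implicit low/high destination registers of one-operand IMUL, by operand size.
IMUL_DEST = {
    8: ("al", "ah"),
    16: ("ax", "dx"),
    32: ("eax", "edx"),
    64: ("rax", "rdx"),  # the kind of entry whose absence gives KeyError: 64
}


def to_signed(value, bits):
    """Interpret an unsigned bits-wide value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def imul_one_operand(acc, src, bits):
    """Return {register: value} for the double-width signed product."""
    low_reg, high_reg = IMUL_DEST[bits]
    product = to_signed(acc, bits) * to_signed(src, bits)
    product &= (1 << (2 * bits)) - 1  # wrap to 2*bits, two's complement
    mask = (1 << bits) - 1
    return {low_reg: product & mask, high_reg: product >> bits}


# Operands taken from the movabs instructions in the report.
result = imul_one_operand(0xBDAD7412BB99A0C1, 0x9A899AE230F27801, 64)
```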

x64: IMUL cannot have imm64

Proposed patch

--- a/amoco/arch/x64/spec_ia32e.py
+++ b/amoco/arch/x64/spec_ia32e.py
@@ -732,9 +732,11 @@ def ia32_reg_rm_8(obj,Mod,RM,REG,data):
 def ia32_reg_rm_wd(obj,Mod,RM,REG,data):
     op2,data = getModRM(obj,Mod,RM,data)
     op1 = getregR(obj,REG,op2.size)
-    if data.size<op2.size: raise InstructionError(obj)
-    imm = data[0:op2.size]
-    x = env.cst(imm.int(),op2.size)
+    sz = op2.size
+    if sz == 64: sz = 32
+    if data.size<sz: raise InstructionError(obj)
+    imm = data[0:sz]
+    x = env.cst(imm.int(),sz)
     obj.operands = [op1, op2, x]
     obj.bytes += pack(imm)
     obj.type = type_data_processing
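
The rule behind the patch can be sketched independently of amoco. In this minimal example (helper names immediate_bits and read_immediate are made up), a 16- or 32-bit operand carries a full-width immediate, while a 64-bit operand carries a 32-bit immediate that is sign-extended. Forms with a true 64-bit immediate, such as movabs, are outside the scope of this sketch.

```python
def immediate_bits(operand_bits):
    """Width of the immediate for a given operand size.

    16- and 32-bit operands carry a full-width immediate; a 64-bit
    operand still carries only 32 bits, which are then sign-extended.
    """
    if operand_bits not in (16, 32, 64):
        raise ValueError("unsupported operand size")
    return min(operand_bits, 32)


def read_immediate(data, operand_bits):
    """Read a little-endian immediate from bytes, sign-extended.

    Returns (value, bytes_consumed), with value as an unsigned
    operand_bits-wide integer.
    """
    nbytes = immediate_bits(operand_bits) // 8
    if len(data) < nbytes:
        raise ValueError("truncated immediate")
    value = int.from_bytes(data[:nbytes], "little", signed=True)
    return value & ((1 << operand_bits) - 1), nbytes
```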

x86: fails to disassemble shrd or shld

>>> print(cpu_x86.disassemble('\x0f\xa5\xef'))
None
>>> print(cpu_x86.disassemble('\x0f\xac\xef\x01'))
None
>>> print(cpu_x86.disassemble('\x0f\xa4\x6d\xf8\x01'))
None

I would like to use something like

@ispec_ia32("*>[ {0f}{a4} /r ib(8) ]", mnemonic = "SHLD")

but it is not possible.
As mentioned in the documentation, the "variable length directive" should be "at the end of the FORMAT".
How are we supposed to parse these instructions without using ugly hacks?
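
To make the difficulty concrete: the imm8 comes after a ModRM operand whose length varies, so its position is only known once ModRM, SIB and displacement have been measured. Below is a hand-written sketch of that measurement for 32-bit addressing. It does not use amoco's ispec mechanism, and the function names are hypothetical.

```python
def modrm_operand_length(code, offset):
    """Bytes used by ModRM, optional SIB and displacement.

    Assumes 32-bit addressing; code is a bytes object and offset
    is the index of the ModRM byte.
    """
    modrm = code[offset]
    mod, rm = modrm >> 6, modrm & 7
    if mod == 3:  # register operand: ModRM only
        return 1
    length = 1
    base = rm
    if rm == 4:  # a SIB byte follows
        length += 1
        base = code[offset + 1] & 7
    if mod == 1:
        length += 1  # disp8
    elif mod == 2 or base == 5:
        length += 4  # disp32 (mod == 0 with base 5 means no base register)
    return length


def shxd_imm8(code):
    """Return the imm8 of '0f a4 /r ib' (SHLD) or '0f ac /r ib' (SHRD)."""
    if code[0] != 0x0F or code[1] not in (0xA4, 0xAC):
        raise ValueError("not SHLD/SHRD with imm8")
    return code[2 + modrm_operand_length(code, 2)]
```

For the two imm8 examples above, the ModRM operand takes 1 byte (register form) and 2 bytes (disp8 form), so the immediate sits at offsets 3 and 4 respectively.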

"pop esp" incorrect

The instruction pop esp must not add 4 to esp after the popped value is written: the final value of esp is the value read from the old top of the stack. Currently, amoco effectively treats it as mov esp, [esp]; add esp, 0x4, which is incorrect.

Thanks to @lieanu for pointing it out!
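
A minimal model of the expected ordering, using plain dicts for registers and memory (this is an illustration only, not amoco's mapper; the function pop is hypothetical): the stack pointer is incremented before the destination is written, so a pop into esp ends with the loaded value.

```python
def pop(regs, memory, dest):
    """Model 'pop dest' in 32-bit mode on dict-based registers and memory.

    memory maps an address to the 32-bit value stored there.
    """
    value = memory[regs["esp"]]                   # read the old top of stack
    regs["esp"] = (regs["esp"] + 4) & 0xFFFFFFFF  # increment first...
    regs[dest] = value                            # ...then write the destination
    return regs


regs = pop({"esp": 0x1000, "eax": 0}, {0x1000: 0xDEADBEEF}, "esp")
```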

Using more than one CPU arch at the same time

Hi,

I got confused by the following code; maybe this is an amoco bug.

The code is as follows:

https://gist.github.com/lieanu/f65788ff947a04d50aa0

import amoco
import amoco.system.raw
import amoco.system.core

def sym_exec_gadget_and_get_mapper(code, cpu):
    '''Taken from https://github.com/0vercl0k/stuffz/blob/master/look_for_gadgets_with_equations.py'''

    p = amoco.system.raw.RawExec(
        amoco.system.core.DataIO(code), cpu
    )

    try:
        blocks = list(amoco.lsweep(p).iterblocks())
    except:
        return None

    if len(blocks) == 0:
        return None

    mp = amoco.cas.mapper.mapper()
    for block in blocks:
        if block.instr[-1].mnemonic.lower() == 'call':
            p.cpu.i_RET(None, block.map)

        try:
            mp >>= block.map
        except Exception as e:
            pass

    return mp

if __name__ == "__main__":

    # pop rdi; ret --> "\x5f\xc3"
    print "-"*20, "AMD64", "-"*20
    print "Instr: ", "pop rdi; ret"
    import amoco.arch.x64.cpu_x64 as amd64_cpu
    cpu = amd64_cpu
    print sym_exec_gadget_and_get_mapper("\x5f\xc3", cpu)

    # pop eax; ret --> "\x58\xc3"
    print "-"*20, "I386", "-"*20
    print "Instr: ", "pop eax; ret"
    import amoco.arch.x86.cpu_x86 as i386_cpu
    cpu = i386_cpu
    print sym_exec_gadget_and_get_mapper("\x58\xc3", cpu)

    # pop rdi; ret --> "\x5f\xc3"
    print "-"*20, "AMD64 again", "-"*20
    print "Instr: ", "pop rdi; ret"
    import amoco.arch.x64.cpu_x64 as amd64_cpu
    cpu = amd64_cpu
    print sym_exec_gadget_and_get_mapper("\x5f\xc3", cpu)

The outputs:

-------------------- AMD64 --------------------
Instr:  pop rdi; ret
rdi <- { | [0:64]->M64(rsp) | }
rsp <- { | [0:64]->(rsp+0x10) | }
rip <- { | [0:64]->M64(rsp+8) | }
-------------------- I386 --------------------
Instr:  pop eax; ret
eax <- { | [0:32]->M32(esp) | }
esp <- { | [0:32]->(esp+0x8) | }
eip <- { | [0:32]->M32(esp+4) | }
-------------------- AMD64 again --------------------
Instr:  pop rdi; ret
rdi <- { | [0:64]->M64(esp) | }
esp <- { | [0:32]->(esp+0xc) | }
eip <- { | [0:32]->M32(esp+8) | }

You can see that the mapper for pop rdi; ret is incorrect after switching back to the amd64 cpu: it refers to esp and eip instead of rsp and rip.

Regards,
lieanu

x86: relative jmp should be a signed value

>>> print(str(cpu_x86.disassemble('\x74\xc0')))
jz          .-64     # OK
>>> print(str(cpu_x86.disassemble('\xeb\xc0')))
jmp         .+192    # should be jmp .-64

Proposed patch:

--- a/amoco/arch/x86/spec_ia32.py
+++ b/amoco/arch/x86/spec_ia32.py
@@ -146,7 +146,7 @@ def ia32_imm8(obj,ib):
 @ispec_ia32("16>[ {e1} ib(8) ]", mnemonic = "LOOPE",  type=type_control_flow)
 @ispec_ia32("16>[ {e0} ib(8) ]", mnemonic = "LOOPNE", type=type_control_flow)
 def ia32_imm_rel(obj,ib):
-    obj.operands = [env.cst(ib,8)]
+    obj.operands = [env.cst(ib,8).signextend(32)]

 @ispec_ia32("16>[ {e3} cb(8) ]", mnemonic = "JECXZ", type=type_control_flow)
 def ia32_cb8(obj,cb):
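
For reference, the signed interpretation of rel8 can be checked with a few lines of standalone Python. The helpers branch_target and format_relative are made up for this example and assume a 2-byte short branch in 32-bit mode.

```python
def branch_target(address, code):
    """Target of a short branch 'opcode rel8' located at address.

    Assumes a 2-byte instruction in 32-bit mode; the displacement is
    relative to the address of the next instruction.
    """
    rel8 = int.from_bytes(code[1:2], "little", signed=True)
    return (address + len(code) + rel8) & 0xFFFFFFFF


def format_relative(code):
    """Render the displacement in the '.+N' / '.-N' style used above."""
    rel8 = int.from_bytes(code[1:2], "little", signed=True)
    return ".%+d" % rel8
```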

cas: simplification of ptr(cst, disp)

>>> ad=env.mem(env.cst(12,32),disp=4)
>>> str(ad.simplify())
'M32(0xc+4)'

I would have expected 'M32(0x10)'

Proposed patch:

--- a/amoco/cas/expressions.py
+++ b/amoco/cas/expressions.py
@@ -960,6 +960,9 @@ class ptr(exp):
         if isinstance(self.seg,exp):
             self.seg = self.seg.simplify()
         if not self.base._is_def: self.disp=0
+        if self.disp and self.base._is_cst:
+            self.base += self.disp
+            self.disp = 0
         return self

     # default segment handler does not care about seg value:
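
The intended folding can be illustrated with a toy class. This is a sketch only, not amoco's ptr: the class Ptr and its string format are assumptions that merely mimic the output shown above.

```python
class Ptr:
    """Toy pointer: a base (int constant or symbolic name) plus a displacement."""

    def __init__(self, base, disp=0, size=32):
        self.base, self.disp, self.size = base, disp, size

    def simplify(self):
        # Fold the displacement into the base only when the base is a constant.
        if self.disp and isinstance(self.base, int):
            self.base = (self.base + self.disp) & ((1 << self.size) - 1)
            self.disp = 0
        return self

    def __str__(self):
        base = hex(self.base) if isinstance(self.base, int) else self.base
        disp = "%+d" % self.disp if self.disp else ""
        return "M%d(%s%s)" % (self.size, base, disp)
```

A symbolic base such as 'ebp' is left untouched, so only the constant case changes.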

x86: emulation failure for setne

>>> m=mapper()
>>> i=cpu_x86.disassemble('\x0F\x95\xC2')
>>> m[env.edx]=env.mem(env.ebp)
>>> print(str(m))
edx <- { | [0:32]->M32(ebp) | }
>>> print(str(i))
setnz       dl
>>> i(m)
>>> print(str(m))
edx <- { | [0:32]->M32(ebp) | }
eip <- { | [0:32]->(eip+0x3) | }
(ebp) <- ((zf==0x0) ? 0x1 : 0x0)

edx should be modified, not (ebp)

Proposed patch

--- a/amoco/arch/x86/asm.py
+++ b/amoco/arch/x86/asm.py
@@ -569,7 +572,7 @@ def i_NOT(i,fmap):
   fmap[op1] = ~fmap(op1)

 def i_SETcc(i,fmap):
-  op1 = fmap(i.operands[0])
+  op1 = i.operands[0]
   fmap[eip] = fmap[eip]+i.length
   fmap[op1] = tst(fmap(i.cond[1]),cst(1,op1.size),cst(0,op1.size))

NameError: global name 'c' is not defined

code:

import sys
import amoco
import amoco.system.raw
import amoco.system.core

def sym_exec_gadget_and_get_mapper(code, cpu):
    '''Taken from https://github.com/0vercl0k/stuffz/blob/master/look_for_gadgets_with_equations.py'''

    p = amoco.system.raw.RawExec(
        amoco.system.core.DataIO(code), cpu
    )

    try:
        blocks = list(amoco.lsweep(p).iterblocks())
    except:
        return None

    if len(blocks) == 0:
        return None

    mp = amoco.cas.mapper.mapper()
    for block in blocks:
        if block.instr[-1].mnemonic.lower() == 'call':
            p.cpu.i_RET(None, block.map)

        try:
            mp >>= block.map
        except Exception as e:
            pass

    return mp

if __name__ == "__main__":

    # adc bh, bh ; call qword ptr [rsi]
    import amoco.arch.x64.cpu_x64 as cpu
    bytes = "10ffff16".decode("hex")
    print sym_exec_gadget_and_get_mapper(bytes, cpu)

output:

Traceback (most recent call last):
  File "amoco_test2.py", line 38, in <module>
    print sym_exec_gadget_and_get_mapper(bytes, cpu)
  File "amoco_test2.py", line 24, in sym_exec_gadget_and_get_mapper
    p.cpu.i_RET(None, block.map)
  File "/usr/lib/python2.7/site-packages/amoco/code.py", line 32, in map
    self._map = mapper(self.instr)
  File "/usr/lib/python2.7/site-packages/amoco/cas/mapper.py", line 44, in __init__
    if not instr.misc['delayed']: instr(self)
  File "/usr/lib/python2.7/site-packages/amoco/arch/core.py", line 65, in __call__
    i_xxx(self,map)
  File "/usr/lib/python2.7/site-packages/amoco/arch/x64/asm.py", line 560, in i_ADC
    fmap[af]  = halfcarry(a,op2,c)
NameError: global name 'c' is not defined

Error in parsing IA32 pextrw instruction

Here is the current output:

>>> from amoco.arch.x86 import cpu_x86
>>> i = cpu_x86.disassemble('\x66\x0f\xc5\xc1\x00')
>>> str(i)
'pextrw      xmm0, ecx, 0x0'

While it should be pextrw eax, xmm1, 0

x86: AT&T-syntax output fails

Solved by the following patch.
Note that this patch is not sufficient to generate AT&T syntax that can be used as an input to GNU as.

--- a/amoco/arch/x86/formats.py
+++ b/amoco/arch/x86/formats.py
@@ -104,7 +161,7 @@ def opers_att(i):
     s = []
     for op in reversed(i.operands):
         if op._is_mem:
-            s.extend(deref(op))
+            s.extend(deref_att(op))
         elif op._is_cst:
             if i.misc['imm_ref'] is not None:
                 s.append((Token.Address,str(i.misc['imm_ref'])))

No examples

This seems like a really great project with a lot of potential uses. For one, I'd like to use it in my pwndbg debugger scripts for GDB in order to simulate execution of a single basic block (to determine the expected path of a conditional branch).

Unfortunately, there do not appear to be any examples which I can base this work off of. Since I'm running in a debugger, I can provide complete state information, and even memory access.

How would I go about simulating a short sequence of instructions, given a known starting register context?

x86: emulation mixes the two cmpsd instructions

Same problem as in #34

>>> i = cpu_x86.disassemble('\xF2\x0F\xC2\xCA\x00')
>>> str(i)
'cmpsd       xmm1, xmm2, 0x0'
>>> m=mapper()
>>> i(m)
>>> print(str(m))
eflags <- { | [0:1]->(((~M32(edi)[31:32])&M32(esi)[31:32])|((M32(edi)-M32(esi))[31:32]&((~M32(edi)[31:32])|M32(esi)[31:32]))) | [1:2]->eflags[1:2] | [2:3]->T1 | [3:4]->eflags[3:4] | [4:5]->T1 | [5:6]->eflags[5:6] | [6:7]->((M32(edi)-M32(esi))==0x0) | [7:8]->((M32(edi)-M32(esi))<0x0) | [8:11]->eflags[8:11] | [11:12]->((M32(edi)[31:32]^(M32(edi)-M32(esi))[31:32])&(M32(edi)[31:32]^M32(esi)[31:32])) | [12:32]->eflags[12:32] | }
eip <- { | [0:32]->(eip+0x5) | }
edi <- { | [0:32]->(df ? (edi-0x4) : (edi+0x4)) | }
esi <- { | [0:32]->(df ? (esi-0x4) : (esi+0x4)) | }

which is the semantics of cpu_x86.disassemble('\xA7')

Proposed patch, until there is an implementation of this SSE2 instruction:

--- a/amoco/arch/x86/asm.py
+++ b/amoco/arch/x86/asm.py
@@ -296,7 +296,10 @@ def i_CMPSB(i,fmap):
 def i_CMPSW(i,fmap):
   _cmps_(i,fmap,2)
 def i_CMPSD(i,fmap):
-  _cmps_(i,fmap,4)
+  if i.misc['opdsz']==128:
+    return
+  else:
+   _cmps_(i,fmap,4)

x64: error in decoding 16-bit MOV

>>> str(cpu_x64.disassemble('\x66\x41\xc7\x84\x55\x11\x11\x11\x11\x22\x22'))
'None'
>>> str(cpu_x64.disassemble('\x66\x41\xc7\x84\x55\x11\x11\x11\x11\x22\x22\x22\x22'))
'mov         word ptr [(r13+(rdx*0x2))+0x11111111], 0x22222222'

while the first one is valid and should decode as mov word ptr [(r13+(rdx*0x2))+0x11111111], 0x2222

Proposed patch

--- a/amoco/arch/x64/spec_ia32e.py
+++ b/amoco/arch/x64/spec_ia32e.py
@@ -531,7 +532,7 @@ def ia32_ptr_iwd(obj,Mod,RM,data):
     op1,data = getModRM(obj,Mod,RM,data)
     REX = obj.misc['REX']
     size = op1.size
-    if REX: size=32
+    if REX and size==64: size = 32
     if data.size<size: raise InstructionError(obj)
     imm = data[0:size]
     x = env.cst(imm.int(),size).signextend(op1.size)

x86: typo in PSHUFD semantics

--- a/amoco/arch/x86/asm.py
+++ b/amoco/arch/x86/asm.py
@@ -1411,7 +1413,7 @@ def i_PSLLQ(i,fmap):
   fmap[op1] = composer(res)

 def i_PSHUFD(i,fmap):
-  fmap[rip] = fmap[rip]+i.length
+  fmap[eip] = fmap[eip]+i.length
   op1 = i.operands[0]
   op2 = i.operands[1]
   op3 = i.operands[2]

x64: bug in getModRM()

>>> from amoco.arch.x64 import cpu_x64
>>> i=cpu_x64.disassemble('\x48\x8b\x04\xc5\0\0\0\0')
>>> str(i.operands[1])
'M64cs(((rax*0x8)+rip))'

but it should be M64cs((rax*0x8))

There is also

>>> from amoco.arch.x64 import cpu_x64
>>> i=cpu_x64.disassemble('\x64\x48\x8b\x04\x25\x28\0\0\0')
>>> str(i.operands[1])
'M64fs(rip+40)'

which should be M64fs(40).

The proposed patch below addresses these problems.

--- a/amoco/arch/x64/utils.py
+++ b/amoco/arch/x64/utils.py
@@ -96,12 +96,16 @@ def getModRM(obj,Mod,RM,data):

     # check [disp16/32] case:
     if (b is env.rbp or b is env.r13) and Mod==0:
-        b=env.rip
-        if seg is '': seg = env.cs
-        Mod = 0b10
-    if (b is env.bp) and Mod==0:
-        b=env.cst(0,adrsz)
+        if s is 0 and seg is '':
+            seg = env.cs
+            bs = env.rip
+        else:
+            bs = s + env.cst(0,adrsz)
         Mod = 0b10
+    elif s is 0:
+        bs = b
+    else:
+        bs = b + s
     # now read displacement bytes:
     if Mod==0b00:
         d = 0
@@ -119,7 +123,6 @@ def getModRM(obj,Mod,RM,data):
         obj.bytes += pack(d)
         data = data[immsz:data.size]
         d = d.int(-1)
-    bs = b+s
     if bs._is_cst and bs.v==0x0:
         bs.v = d
         bs.size = adrsz

Note that I wrote this patch such that bs = b + s appears only when both are non-zero.
It lets me replace this addition with the following piece of code, which I need because I want amoco to remember whether the binary contained 8b 04 18 (aka. mov (%rax,%rbx), %eax) or 8b 04 03 (aka. mov (%rbx,%rax), %eax).

        # Instead of doing bs = b+s, which will reorder arguments, we do
        # the addition manually, and change 'prop' so that the many future
        # calls to 'simplify' do not reorder the arguments
        from amoco.cas import expressions
        bs = expressions.op('+', b, s)
        bs.prop |= 16
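
The two addressing cases in this report can be told apart with a small standalone decoder. This is a sketch under simplifying assumptions (64-bit mode, no REX.X/REX.B extension, hypothetical names REG64 and decode_mem_operand), not a replacement for getModRM.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mem_operand(code):
    """Decode ModRM (+SIB, +disp) for a memory operand in 64-bit mode.

    code starts at the ModRM byte. REX.X/REX.B extensions are ignored in
    this sketch. Returns (base, index, scale, disp).
    """
    modrm = code[0]
    mod, rm = modrm >> 6, modrm & 7
    if mod == 3:
        raise ValueError("register operand, not memory")
    pos, base, index, scale = 1, REG64[rm], None, 1
    if rm == 4:  # SIB byte present
        sib = code[pos]
        pos += 1
        if (sib >> 3) & 7 != 4:  # index 100b means "no index"
            index = REG64[(sib >> 3) & 7]
            scale = 1 << (sib >> 6)
        base = REG64[sib & 7]
        if mod == 0 and sib & 7 == 5:
            base = None  # SIB base 101b with mod 00: disp32 only, no rip
    elif mod == 0 and rm == 5:
        base = "rip"  # ModRM rm 101b with mod 00 and no SIB: rip-relative
    disp_len = 1 if mod == 1 else 4 if (mod == 2 or base in (None, "rip")) else 0
    disp = int.from_bytes(code[pos:pos + disp_len], "little", signed=True)
    return base, index, scale, disp
```

Applied to the bytes after the opcode in the examples above, 04 c5 00 00 00 00 gives an index of rax scaled by 8 with no base, and 04 25 28 00 00 00 gives a bare displacement of 40; only 05 followed by a disp32 is rip-relative.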

x64: PUSH and POP should have 64-bit arguments

>>> str(cpu_x64.disassemble('\xff\x35\0\0\0\0'))
'push        dword ptr cs:[rip]'
>>> str(cpu_x64.disassemble('\x8f\x05\0\0\0\0'))
'pop         dword ptr cs:[rip]'

Should not be dword ptr.
The prefix 0x48 is not mandatory and is not set by compilers.

Proposed patch:

--- a/amoco/arch/x64/spec_ia32e.py
+++ b/amoco/arch/x64/spec_ia32e.py
@@ -254,6 +254,7 @@ def ia32_rm8(obj,Mod,RM,data):
 @ispec_ia32("*>[ {f7} /7 ]", mnemonic = "IDIV" )
 def ia32_rm32(obj,Mod,RM,data):
     op1,data = getModRM(obj,Mod,RM,data)
+    if obj.mnemonic in ["PUSH", "POP"] and op1.size == 32: op1.size = 64
     obj.operands = [op1]
     obj.type = type_data_processing

x64: bug in decoding instructions on 8-bit registers

>>> from amoco.arch.x64 import cpu_x64
>>> str(cpu_x64.disassemble('\x80\xcc\x0c'))
'or          spl, 0xc'
>>> str(cpu_x64.disassemble('\x40\x80\xcc\x0c'))
'or          spl, 0xc'

The first should be or ah, 0xc.

I am looking for how to correct this bug.
The function getreg in env.py never returns ah, so one solution would be to add a third argument to this function (the value of REX); another possibility is to change getModRM in utils.py. The choice depends on whether getregR, getregRW and getregB also need to change.

My proposal for a patch is

--- a/amoco/arch/x64/utils.py
+++ b/amoco/arch/x64/utils.py
@@ -67,6 +67,16 @@ def getModRM(obj,Mod,RM,data):
     # r/16/32 case:
     if Mod==0b11:
         op1 = env.getreg((B<<3)+RM,opdsz)
+        if REX is None and opdsz is 8: op1 = {
+            env.al:  env.al,
+            env.cl:  env.cl,
+            env.dl:  env.dl,
+            env.bl:  env.bl,
+            env.spl: env.ah,
+            env.bpl: env.ch,
+            env.sil: env.dh,
+            env.dil: env.bh,
+            }[op1]
         return op1,data
     # m/16/32 case:
     if adrsz!=16 and RM==0b100:
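
The underlying rule is that 8-bit register numbers 4 to 7 name ah, ch, dh, bh without a REX prefix and spl, bpl, sil, dil with one. A lookup-table sketch follows; the tables and the function reg8 are written for this example and use Intel's register names, which may differ from the names in amoco's env.

```python
LEGACY_REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REX_REG8 = ["al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil",
            "r8b", "r9b", "r10b", "r11b", "r12b", "r13b", "r14b", "r15b"]


def reg8(number, rex_present):
    """Name of the 8-bit register with the given 3- or 4-bit number."""
    if rex_present:
        return REX_REG8[number]
    if number > 7:
        raise ValueError("registers 8-15 need a REX prefix")
    return LEGACY_REG8[number]
```

In 80 cc 0c the ModRM byte cc selects register number 4, hence ah without REX and spl with the 0x40 prefix.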

Strange mapper behaviour

>>> print(str(mapper().M(env.mem(env.ptr(env.cst(0,size=32))))))
M32(0x0)
>>> print(str(mapper().M(env.mem(env.ptr(env.ext('toto',size=32))))))
@toto

where I would expect M32(@toto).

I propose the following patch

--- a/amoco/cas/mapper.py
+++ b/amoco/cas/mapper.py
@@ -130,7 +130,7 @@ class mapper(object):
     # get a memory location value (fetch) :
     # k must be mem expressions
     def M(self,k):
-        if k.a.base._is_ext: return k.a.base
+        if k.a.base._is_ext: return k
         n = self.aliasing(k)
         if n>0:
             f = lambda e:e[0]._is_ptr

x86: issues with movq

>>> i = cpu_x86.disassemble('\x66\x0F\xD6\x4D\xF0')
>>> str(i)
'movq        qword ptr [ebp-16], xmm1'
>>> i(mapper())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "amoco/arch/core.py", line 67, in __call__
    i_xxx(self,map)
  File "amoco/arch/x86/asm.py", line 1246, in i_MOVQ
    fmap[op1] = op2.zeroextend(op1.size)
  File "amoco/cas/mapper.py", line 204, in __setitem__
    raise ValueError('size mismatch')
ValueError: size mismatch

and also

>>> str(cpu_x86.disassemble('\xF3\x0F\x7E\x55\xF0'))
'movq        mm2, qword ptr [ebp-16]'

where it should be 'movq xmm2, qword ptr [ebp-16]'

and finally

>>> i = cpu_x86.disassemble('\xF3\x0F\x7E\xC1')
>>> str(i)
'movq        xmm0, xmm1'
>>> m=mapper()
>>> i(m)
>>> print(str(m))
eip <- { | [0:32]->(eip+0x4) | }
xmm0 <- { | [0:128]->xmm1 | }

which should only copy the low quadword of %xmm1 and clear the high quadword of %xmm0
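
On plain integers, the expected behaviour of the register and store forms can be written as follows. This is a sketch with hypothetical helper names; registers are modelled as unsigned 128-bit integers.

```python
MASK64 = (1 << 64) - 1


def movq_xmm(src128):
    """New 128-bit destination value for 'movq xmm_dst, xmm_src'.

    The low quadword of the source is copied and the high quadword of
    the destination is cleared; the old destination value plays no part.
    """
    return src128 & MASK64


def movq_store(src128):
    """Bytes written by 'movq m64, xmm_src': the low quadword only."""
    return (src128 & MASK64).to_bytes(8, "little")
```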

x64: bug in decoding MOVSX

>>> str(cpu_x64.disassemble('\x48\x0f\xbe\xc0'))
'movsx       rax, rax'

should return movsx rax, al

Proposed patch:

--- a/amoco/arch/x64/spec_ia32e.py
+++ b/amoco/arch/x64/spec_ia32e.py
@@ -860,6 +860,7 @@ def ia32_movx(obj,Mod,RM,REG,data,_flg8):
         if R==1: REG = (R<<3)+REG
     op1 = env.getreg(REG,size)
     obj.misc['opdsz']=8 if _flg8 else 16
+    if REX is not None: obj.misc['REX']=(0,R,X,B) # op2 not 64-bit
     op2,data = getModRM(obj,Mod,RM,data)
     obj.operands = [op1, op2]
     obj.type = type_data_processing

Distribution request

I am a developer on the ArchAssault project, and we would like to distribute this tool. Would it be possible to add a tag and a license to it, so that we can release it?

x86: error in parsing 6A FF (push -1)

Proposed patch:

--- a/amoco/arch/x86/spec_ia32.py
+++ b/amoco/arch/x86/spec_ia32.py
@@ -136,11 +136,14 @@ def ia32_strings(obj):
 # imm8:
 @ispec_ia32("16>[ {d5} ib(8) ]", mnemonic = "AAD",    type=type_data_processing)
 @ispec_ia32("16>[ {d4} ib(8) ]", mnemonic = "AAM",    type=type_data_processing)
-@ispec_ia32("16>[ {6a} ib(8) ]", mnemonic = "PUSH",   type=type_data_processing)
 @ispec_ia32("16>[ {cd} ib(8) ]", mnemonic = "INT",    type=type_control_flow)
 def ia32_imm8(obj,ib):
     obj.operands = [env.cst(ib,8)]

+@ispec_ia32("16>[ {6a} ib(8) ]", mnemonic = "PUSH",   type=type_data_processing)
+def ia32_imm8_signed(obj,ib):
+    obj.operands = [env.cst(ib,8).signextend(8)]
+
 @ispec_ia32("16>[ {eb} ib(8) ]", mnemonic = "JMP",    type=type_control_flow)
 @ispec_ia32("16>[ {e2} ib(8) ]", mnemonic = "LOOP",   type=type_control_flow)
 @ispec_ia32("16>[ {e1} ib(8) ]", mnemonic = "LOOPE",  type=type_control_flow)

patch request for symbolic MemoryZone management

When parsing relocatable files, some offsets for memory zones are determined by the relocation table, and therefore are not integer values, but symbols.
Amoco handles this relatively well (we can put an amoco expression in .disp) but there are a few minor issues.
One is for displaying; proposed patch is

--- a/amoco/cas/expressions.py
+++ b/amoco/cas/expressions.py
@@ -953,6 +953,7 @@ class ptr(exp):
         return '%s(%s%s)'%(self.seg,self.base,d)

     def disp_tostring(self,base10=True):
+        if hasattr(self.disp, '_is_cst'): return '+%s'%self.disp
         if self.disp==0: return ''
         if base10: return '%+d'%self.disp
         c = cst(self.disp,self.size)

Another one, more tricky, is in memory zone management; proposed patch is

--- a/amoco/system/core.py
+++ b/amoco/system/core.py
@@ -197,7 +197,9 @@ class MemoryZone(object):
         if i is None:
             if len(self._map)==0: return [void(l*8)]
             v0 = self._map[0].vaddr
-            if (vaddr+l)<=v0: return [void(l*8)]
+            # Don't test if (vaddr+l)<=v0 because we need the test to be
+            # true if vaddr or v0 contain label/symbols
+            if not (v0<(vaddr+l)): return [void(l*8)]
             res.append(void((v0-vaddr)*8))
             l = (vaddr+l)-v0
             vaddr = v0

Expressions error

Hi,

I followed the example and got:

>>> from amoco.arch.x86.env import *
>>> from amoco.cas import smt
>>> z = (eax^cst(0xcafebabe,32))+(ebx+(eax>>2))
>>> print z
T32
>>> print z.to_smtlib()
_top
>>> type(z)
<class 'amoco.cas.expressions.top'>
>>> z.to_smtlib().sexpr()
'_top'

The type should be op. After removing "+(eax" from the expression, I get the expected result:

>>> z = (eax^cst(0xcafebabe,32))+(ebx>>2)
>>> type(z)
<class 'amoco.cas.expressions.op'>
>>> z.to_smtlib()
(eax ^ 3405691582) + LShR(ebx, 2)
>>> z.to_smtlib().sexpr()
'(bvadd (bvxor eax #xcafebabe) (bvlshr ebx #x00000002))'

x64: MOVSQ (and others) are missing

>>> str(cpu_x64.disassemble('\xa5'))
'movsd       '
>>> str(cpu_x64.disassemble('\x48\xa5'))
'movsd       '

Proposed patch:

--- a/amoco/arch/x64/spec_ia32e.py
+++ b/amoco/arch/x64/spec_ia32e.py
@@ -134,6 +134,8 @@ def ia32_nooperand(obj):
 def ia32_strings(obj):
     if obj.mnemonic[-1]=='D' and obj.misc['opdsz']:
         obj.mnemonic = obj.mnemonic[:-1]+'W'
+    if obj.mnemonic[-1]=='D' and obj.misc['REX']:
+        obj.mnemonic = obj.mnemonic[:-1]+'Q'

 #1 operand
 #----------
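
The suffix selection the patch aims for can be summarised in a few lines. This is a sketch with a made-up function name; it keys on the REX.W bit and the 0x66 prefix rather than on amoco's obj.misc fields, and covers only opcodes whose default form is the 'D' variant.

```python
def string_mnemonic(stem, rex_w=False, has_66_prefix=False):
    """Pick the size suffix of a string instruction such as MOVS or CMPS.

    Covers only the opcodes whose default form is the 'D' variant
    (e.g. 0xa5); REX.W takes precedence over the 0x66 prefix.
    """
    if rex_w:
        return stem + "Q"
    if has_66_prefix:
        return stem + "W"
    return stem + "D"
```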

x86: Intel-syntax output is not the one that GNU as can use

Proposed patch.

--- a/amoco/arch/x86/formats.py
+++ b/amoco/arch/x86/formats.py
@@ -11,15 +11,60 @@ def pfx(i):
 def mnemo(i):
     mnemo = i.mnemonic.replace('cc','')
     if hasattr(i,'cond'): mnemo += i.cond[0].split('/')[0]
-    return [(Token.Mnemonic,'{: <12}'.format(mnemo.lower()))]
+    if mnemo == 'RETN': mnemo = 'RET'
+    s = [(Token.Mnemonic,'{: <12}'.format(mnemo.lower()))]
+    # clang assembler does not understand 'sal'
+    if mnemo == 'SAL': s = [(Token.Mnemonic,'{: <12}'.format('shl'))]
+    # Special case: when gcc produces 'rep ret'
+    # http://mikedimmick.blogspot.fr/2008/03/what-heck-does-ret-mean.html
+    # it usually puts it on two separate lines, and old versions of
+    # GNU as don't like a true 'rep ret'
+    if mnemo == 'RET' and i.misc.get('pfx') is not None:
+        if i.misc['pfx'][0] == 'rep':
+            s = [(Token.Prefix,'rep; ')] + s
+    return s
+
+def str_intel(op):
+    if op is None:
+        return "<None>" # NEVER
+    elif op._is_ext:
+        return str(op.ref)
+    elif op._is_eqn:
+        # Special cases
+        if op.op.symbol == '+' and op.l._is_eqn and op.l.op.symbol == '+' \
+            and op.l.l._is_eqn and op.l.l.l is None and op.l.l.op.symbol == '-':
+            # clang ptr diff
+            return '(%s+%s-%s)' % (
+                str_intel(op.r),
+                str_intel(op.l.r),
+                str_intel(op.l.l.r))
+        if op.op.symbol == '+' and op.r._is_reg and str(op.r) == 'esp':
+            # ending the formula with 'esp' is not allowed by assemblers
+            return '(%s+%s)' % (str_intel(op.r), str_intel(op.l))
+        # Generic case
+        if op.l is None: return '(%s%s)' % (op.op.symbol, str_intel(op.r))
+        return '(%s%s%s)' % (str_intel(op.l), op.op.symbol, str_intel(op.r))
+    else: # _is_cst or _is_reg
+        return str(op)

 def deref(op):
     assert op._is_mem
-    d = '%+d'%op.a.disp if op.a.disp else ''
-    s = {8:'byte ptr ',16:'word ptr ', 64:'qword ptr ', 128:'xmmword ptr '}.get(op.size,'')
-    s += '%s:'%op.a.seg  if (op.a.seg is not '')  else ''
-    s += '[%s%s]'%(op.a.base,d)
-    return s
+    address = str_intel(op.a.base)
+    if op.a.disp:
+        address += '%+d' % op.a.disp
+    prefix = {
+        8:  'byte ptr ',
+        16: 'word ptr ',
+        32: 'dword ptr ',
+        64: 'qword ptr ',
+        80: 'tbyte ptr ',
+        128:'xmmword ptr ',
+        }.get(op.size,'')
+    if op.a.seg is not '':
+        prefix += '%s:' % op.a.seg
+    if getattr(op.a.disp, '_is_ext', False):
+        prefix += '%s' % op.a.disp.ref
+    return '%s[%s]' % (prefix, address)

 def opers(i):
     s = []
@@ -33,8 +78,19 @@ def opers(i):
                 s.append((Token.Constant,'%+d'%op.value))
             else:
                 s.append((Token.Constant,str(op)))
+        elif op._is_ext:
+            s.append((Token.Address,'OFFSET FLAT:%s'%op.ref))
         elif op._is_reg:
-            s.append((Token.Register,str(op)))
+            op = str(op)
+            if op.startswith('st'):
+                op = 'st(%s)'%op[2]
+            s.append((Token.Register,op))
+        elif op._is_eqn and op.l._is_ext:
+            s.append((Token.Address,'OFFSET FLAT:%s%s%s'%
+                (op.l.ref,op.op.symbol,int(op.r))))
+        else:
+            import sys
+            sys.stderr.write("TODO %s %s\n"%(op.__class__,op))
         s.append((Token.Literal,', '))
     if len(s)>0: s.pop()
     return s
@@ -42,7 +98,9 @@ def opers(i):
 def oprel(i):
     to = i.misc['to']
     if to is not None:
-        return [(Token.Address,'*'+str(to))]
+        return [(Token.Address,str(to))]
+    if i.operands[0]._is_ext:
+        return [(Token.Address,str(i.operands[0].ref))]
     if (i.address is not None) and i.operands[0]._is_cst:
         v = i.address + i.operands[0].signextend(32) + i.length
         i.misc['to'] = v

mapper can become very slow

m=mapper()
for i in [
  '\x8B\xBB\0\0\0\0',
  '\x83\xEF\x0C',
  '\x89\xBB\0\0\0\0',
  '\x8B\x93\0\0\0\0',
  '\xFF\x74\xBA\x2C',
  '\xFF\x74\xBA\x28',
  '\xFF\x74\xBA\x24',
  '\xFF\x74\xBA\x20',
  '\xFF\x74\xBA\x1C',
  '\xFF\x74\xBA\x18',
  '\xFF\x74\xBA\x14',
  '\xFF\x74\xBA\x10',
  '\xFF\x74\xBA\x0C',
  '\xFF\x74\xBA\x08',
  '\xFF\x74\xBA\x04',
  '\xFF\x34\xBA',
  '\x8B\x84\x24\xF0\0\0\0',
  '\xFF\x50\x08',
  ]:
  i=cpu_x86.disassemble(i)
  print(str(i))
  i(m)

outputs

mov         edi, [ebx]
sub         edi, 0xc
mov         [ebx], edi
mov         edx, [ebx]
push        [((edi*0x4)+edx)+0x2c]
push        [((edi*0x4)+edx)+0x28]
push        [((edi*0x4)+edx)+0x24]
push        [((edi*0x4)+edx)+0x20]
push        [((edi*0x4)+edx)+0x1c]
push        [((edi*0x4)+edx)+0x18]
push        [((edi*0x4)+edx)+0x14]
push        [((edi*0x4)+edx)+0x10]
push        [((edi*0x4)+edx)+0xc]
push        [((edi*0x4)+edx)+0x8]
push        [((edi*0x4)+edx)+0x4]
push        [((edi*0x4)+edx)]
mov         eax, [esp+240]
call        [eax+0x8]

(which is OK) but takes 16 seconds on my computer (which is not OK).

x86: A few errors in SSE2 parsing

Some instructions are missing, e.g. F3 0F 2C 45 F0 aka. cvttss2si -16(%ebp), %eax.
Some are incorrectly parsed, e.g. F3 0F 5A 45 F0 aka. cvtss2sd -16(%ebp), %xmm0, for which amoco outputs cvtss2sd eax, [ebp-16].

Inconsistencies when using simultaneously multiple architectures

>>> from amoco.arch.x64 import cpu_x64
>>> from amoco.cas.mapper import mapper
>>> cpu_x64.disassemble('\x77\x08')(mapper())
>>> from amoco.arch.x86 import cpu_x86
>>> cpu_x64.disassemble('\x77\x08')(mapper())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "amoco/arch/core.py", line 67, in __call__
    i_xxx(self,map)
  File "amoco/arch/x86/asm.py", line 503, in i_Jcc
    fmap[eip] = tst(fmap(cond),fmap[eip]+op1,fmap[eip])
  File "amoco/cas/expressions.py", line 39, in checkarg_numeric
    return f(self,n)
  File "amoco/cas/expressions.py", line 186, in __add__
    def __add__(self,n): return oper('+',self,n)
  File "amoco/cas/expressions.py", line 1201, in oper
    return op(opsym,l,r).simplify()
  File "amoco/cas/expressions.py", line 1216, in __init__
    raise ValueError("Size mismatch %d <> %d"%(l.size,r.size))
ValueError: Size mismatch 32 <> 64

We can see that after importing x86, decoding 64-bit instructions gets the semantics of 32-bit instructions.

Some errors in cfg recovery

Hello,

First, amoco seems really cool. Thanks!

I have an issue, and a question

Issue:
CFG recovery seems to be a bit broken currently. I have a simple 'puts("hello world")' ELF which I am using for testing. The lsweep method recovers most of the basic blocks, as expected; however, the other methods don't seem to get past the first basic block:

>>> p = amoco.system.loader.load_program('hi32')
>>> z = amoco.lforward(p)
>>> G=z.getcfg()
>>> print G.C
[<grandalf.graphs.graph_core object at 0x7fa70251bf10>, <grandalf.graphs.graph_core object at 0x7fa7024660d0>]
>>> print G.C[0].sV
0.| <node [0x8048380] at 0x7fa70251bb90>
>>> print G.C[1].sV
0.| <node [#PLT@__libc_start_main] at 0x7fa7079073d0>
1.| <node [@__libc_start_main] at 0x7fa7024def50>

Furthermore, some methods error:

>>> z = amoco.fbackward(p)
>>> z.getcfg()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/amoco/main.py", line 267, in getcfg
    for x in self.itercfg(loc): pass
  File "/usr/local/lib/python2.7/dist-packages/amoco/main.py", line 289, in itercfg
    if self.check_ext_target(t):
  File "/usr/local/lib/python2.7/dist-packages/amoco/main.py", line 261, in check_ext_target
    self.update_spool(e.v[1],t.parent)
  File "/usr/local/lib/python2.7/dist-packages/amoco/main.py", line 206, in update_spool
    T = self.get_targets(vtx,parent)
  File "/usr/local/lib/python2.7/dist-packages/amoco/main.py", line 373, in get_targets
    func.map[pc] = mpc
  File "/usr/local/lib/python2.7/dist-packages/amoco/code.py", line 33, in map
    self.helper(self._map)
  File "/usr/local/lib/python2.7/dist-packages/amoco/code.py", line 42, in helper
    if self._helper: self._helper(self,m)
AttributeError: _helper

Interestingly, running it again does not raise an error, but it still doesn't recover the CFG:

>>> z.getcfg()
<amoco.cfg.graph object at 0x7fa702511a90>

(I've also noticed running other cfg recoveries twice gives different results the second time)

I think these errors stem from certain function calls not getting resolved properly, but I'm not sure:

>>> print n.data
# --- block 0x8048380 ---
0x8048380  '31ed'           xor         ebp, ebp
0x8048382  '5e'             pop         esi
0x8048383  '89e1'           mov         ecx, esp
0x8048385  '83e4f0'         and         esp, 0xfffffff0
0x8048388  '50'             push        eax
0x8048389  '54'             push        esp
0x804838a  '52'             push        edx
0x804838b  '6820850408'     push        #__libc_csu_fini
0x8048390  '68b0840408'     push        #__libc_csu_init
0x8048395  '51'             push        ecx
0x8048396  '56'             push        esi
0x8048397  '687b840408'     push        #main
0x804839c  'e8afffffff'     call        *0x8048350
>>> i = n.data.instr[-1]
>>> print i.misc['to']
0x8048350
>>> p.mmap.read(0x8048350, 4)
['\xff%(\x97']
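
For reference, the target printed above can be checked by hand from the encoding of the last instruction in the block. The sketch below is plain Python and does not use amoco; `call_rel32_target` is a hypothetical helper name. As a side note, the bytes read at the target start with ff 25, which is the usual encoding of an indirect jmp through an absolute 32-bit address, the typical shape of a PLT stub.

```python
import struct

def call_rel32_target(address, encoding):
    """Return the destination of a 5-byte 'call rel32' (opcode 0xE8)."""
    if len(encoding) != 5 or encoding[0:1] != b"\xe8":
        raise ValueError("not a call rel32 encoding")
    # rel32 is a signed little-endian displacement, relative to the
    # address of the instruction that follows the call.
    (rel32,) = struct.unpack("<i", encoding[1:5])
    return (address + len(encoding) + rel32) & 0xFFFFFFFF

# Last instruction of block 0x8048380: 0x804839c 'e8afffffff'
target = call_rel32_target(0x804839C, b"\xe8\xaf\xff\xff\xff")
print(hex(target))  # 0x8048350
```

So the call destination itself is decoded correctly; the open question is what happens once the analysis reaches the stub at that address.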

I'm happy to provide the binaries and/or any other info that would be helpful. I spent some time looking around trying to solve it, but I am still wrapping my head around how everything is set up.

Question:
I'm interested in tagging the memory sections with their flags (i.e. read, write, execute). I originally hacked it onto 'mo', before I noticed that MemoryMap has an unused 'perm' attribute. Is this the correct spot to store that info?

Alternatively, the info is stored in p.bin.Phdr. Would it be cleaner to just query that?
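
Wherever the information ends up being stored, the flags themselves come from the p_flags field of each program header. A minimal sketch of the decoding, assuming the standard ELF bit values; `perm_string` is a hypothetical helper and not part of amoco:

```python
# Standard ELF program header permission bits (p_flags).
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

def perm_string(p_flags):
    """Render p_flags as an 'rwx'-style string, '-' for a missing bit."""
    pairs = (("r", PF_R), ("w", PF_W), ("x", PF_X))
    return "".join(ch if p_flags & bit else "-" for ch, bit in pairs)

print(perm_string(PF_R | PF_X))  # r-x, typical for a code segment
print(perm_string(PF_R | PF_W))  # rw-, typical for a data segment
```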

Thanks!

x64: another bug in decoding MOVSX

>>> from amoco.arch.x64 import cpu_x64
>>> str(cpu_x64.disassemble('\x48\x0f\xbf\xf2'))
'movsx       rsi, rdx'

should be movsx rsi, dx

Proposed patch:

--- a/amoco/arch/x64/spec_ia32e.py
+++ b/amoco/arch/x64/spec_ia32e.py
@@ -849,7 +852,7 @@ def ia32_movx(obj,Mod,RM,REG,data,_flg8):
     if R==1: REG = (R<<3)+REG
     op1 = env.getreg(REG,size)
     obj.misc['opdsz']=8 if _flg8 else 16
-    op2,data = getModRM(obj,Mod,RM,data)
+    op2,data = getModRM(obj,Mod,RM,data,REX=(0,R,X,B))
     obj.operands = [op1, op2]
     obj.type = type_data_processing
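
To illustrate why the source operand size matters here: movsx rsi, dx sign-extends a 16-bit value to 64 bits, so decoding the source as rdx changes the meaning of the instruction. A minimal sketch on concrete integers (`sign_extend` is a hypothetical helper, not amoco code):

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value to a to_bits-wide integer."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    # XOR/subtract trick: yields a negative Python int when the sign bit is set.
    extended = (value ^ sign) - sign
    return extended & ((1 << to_bits) - 1)

print(hex(sign_extend(0x8000, 16, 64)))  # 0xffffffffffff8000
print(hex(sign_extend(0x7FFF, 16, 64)))  # 0x7fff
```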

Error in PE loader

The PE loader stopped reading after 179 bytes in some samples I was analysing. I tracked it down to the following line in system\pe.py:567:

f = open(filename,'r')

The open mode should be 'rb'; otherwise the stream terminates at invalid string characters.
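
A minimal sketch of the difference, independent of amoco: the payload below is made up for illustration and contains bytes that text mode may mangle (on Windows, text mode can translate line endings and treat byte 0x1a as end of file). Reading in binary mode returns the bytes unchanged.

```python
import os
import tempfile

# Made-up payload with bytes that are hostile to text-mode reads.
payload = b"MZ\x90\x00\x1a\r\n\x00rest-of-file"

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    # 'rb' disables newline translation and any text decoding.
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)

print(data == payload)  # True
```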

Sorry for not doing the whole "fork and merge" thing, I'm lazy :)

x64: errors in decoding MOVD

For example

>>> str(cpu_x64.disassemble('\x66\x0f\x7e\xc7'))
'movd        rdi, xmm0'

should be movd edi, xmm0

Proposed patch:

--- a/amoco/arch/x64/spec_sse.py
+++ b/amoco/arch/x64/spec_sse.py
@@ -122,7 +122,7 @@ def sse_ps(obj,Mod,REG,RM,data):
 @ispec_ia32("*>[ {0f}{6e} /r ]", mnemonic="MOVD", _inv=False)
 @ispec_ia32("*>[ {0f}{7e} /r ]", mnemonic="MOVD", _inv=True)
 def sse_pd(obj,Mod,REG,RM,data, _inv):
-    if not check_nopfx(obj,set_opdsz_64): raise InstructionError(obj)
+    if not check_nopfx(obj,set_opdsz_32): raise InstructionError(obj)
     REX = obj.misc['REX']
     if REX is not None:
         W,R,X,B = REX
@@ -889,7 +889,7 @@ def sse_pd(obj,Mod,RM,data):
 @ispec_ia32("*>[ {0f}{6e} /r ]", mnemonic="MOVD", _inv=False)
 @ispec_ia32("*>[ {0f}{7e} /r ]", mnemonic="MOVD", _inv=True)
 def sse_pd(obj,Mod,REG,RM,data, _inv):
-    if not check_66(obj,set_opdsz_64): raise InstructionError(obj)
+    if not check_66(obj,set_opdsz_32): raise InstructionError(obj)
     op2,data = getModRM(obj,Mod,RM,data)
     op1 = getregR(obj,REG,128)
     obj.operands = [op1,op2] if not _inv else [op2,op1]

memory leaks when using mapper

The following script

from amoco.cas.mapper import mapper
import gc
gc.enable()
gc.set_debug(gc.DEBUG_LEAK)
m=mapper()
del m
gc.collect()
print(gc.garbage)

outputs:

gc: collectable <list 0x10d8357e8>
[[[...], [...], None]]

This might not seem like an issue, but it makes my software eat hundreds of megabytes of memory...

The culprit might be OrderedDict.
Cf. http://www.gossamer-threads.com/lists/python/bugs/860875
This bug has been known since 2010 and has not been fixed in standard Python 2.7 installations.
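
Note that gc.DEBUG_LEAK includes gc.DEBUG_SAVEALL, so collectable objects also end up in gc.garbage; the output above therefore shows a reference cycle rather than an object that can never be freed. A hedged sketch of another way to measure this, using the return value of gc.collect(); `SelfRef` and `unreachable_after_delete` are hypothetical names used only for illustration:

```python
import gc

class SelfRef(object):
    """Toy object that holds a reference to itself (a reference cycle)."""
    def __init__(self):
        self.me = self

def unreachable_after_delete(factory):
    """Build an object, drop it, and count what only the cycle GC can free."""
    gc.collect()                  # start from a clean state
    was_enabled = gc.isenabled()
    gc.disable()                  # avoid an automatic collection in between
    try:
        obj = factory()
        del obj
        return gc.collect()       # number of unreachable objects found
    finally:
        if was_enabled:
            gc.enable()

print(unreachable_after_delete(SelfRef) > 0)  # True: needs the cycle collector
print(unreachable_after_delete(list))         # 0: freed by reference counting
```

Calling `unreachable_after_delete(mapper)` in the same way would show how many objects each mapper instance leaves for the cycle collector.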

x86: emulation mixes the two movsd instructions

>>> i = cpu_x86.disassemble('\xf2\x0f\x10\x40\x30')
>>> str(i)
'movsd       xmm0, qword ptr [eax+48]'
>>> m=mapper()
>>> i(m)
>>> print(str(m))
(edi) <- M32(esi)
eip <- { | [0:32]->(eip+0x5) | }
esi <- { | [0:32]->(df ? (esi-0x4) : (esi+0x4)) | }
edi <- { | [0:32]->(df ? (edi-0x4) : (edi+0x4)) | }

which is the semantics of the movsd instruction obtained from i = cpu_x86.disassemble('\xa5')

Proposed patch:

--- a/amoco/arch/x86/asm.py
+++ b/amoco/arch/x86/asm.py
@@ -374,6 +374,10 @@ def i_STOSD(i,fmap):

 #------------------------------------------------------------------------------
 def _movs_(i,fmap,l):
+  if len(i.operands):
+      # SSE2 movsd instruction
+      # no semantics available
+      return
   counter = cx if i.misc['adrsz'] else ecx
   loc = mem(edi,l*8)
   src = fmap(mem(esi,l*8))

Feature Request: MIPS Support

One of the things that I'd really like to see out of Amoco is to support the basic 32-bit MIPS ISA. As it's a RISC architecture and not entirely dissimilar to ARM/SPARC, it shouldn't be too difficult.

Eventually I'll have time to do this, but I don't think it'll be until after DEFCON in August.

x86: emulation of PSLLQ and PSRLQ

The second argument can be an imm8.

--- a/amoco/arch/x86/asm.py
+++ b/amoco/arch/x86/asm.py
@@ -1363,7 +1365,6 @@ def i_PSRLQ(i,fmap):
   fmap[eip] = fmap[eip]+i.length
   op1 = i.operands[0]
   op2 = i.operands[1]
-  assert op1.size==op2.size
   src1 = fmap(op1)
   src2 = fmap(op2)
   val1 = (src1[i:i+64] for i in range(0,op1.size,64))
@@ -1374,7 +1376,6 @@ def i_PSLLQ(i,fmap):
   fmap[eip] = fmap[eip]+i.length
   op1 = i.operands[0]
   op2 = i.operands[1]
-  assert op1.size==op2.size
   src1 = fmap(op1)
   src2 = fmap(op2)
   val1 = (src1[i:i+64] for i in range(0,op1.size,64))

x86 and x64: both arguments of PSLLQ (and others) can be registers

>>> from amoco.arch.x64 import cpu_x64
>>> from amoco.cas.mapper import mapper
>>> i=cpu_x64.disassemble('\x66\x0f\xf3\xd3')
>>> str(i)
'psllq       xmm2, xmm3'
>>> i(mapper())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "amoco/arch/core.py", line 67, in __call__
    i_xxx(self,map)
  File "amoco/arch/x64/asm.py", line 1461, in i_PSLLQ
    res  = [v1<<src2.value for v1 in val1]
AttributeError: 'reg' object has no attribute 'value'
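
For comparison, here is a sketch of the expected semantics on concrete integers, following the Intel reference as I understand it: the count is either an imm8 or the low 64 bits of the second xmm/m128 operand, and a count above 63 zeroes the destination. The helper names are hypothetical, and amoco itself works on symbolic expressions rather than plain integers.

```python
MASK64 = (1 << 64) - 1

def count_from_xmm(src128):
    """When the count operand is an xmm register, only its low 64 bits count."""
    return src128 & MASK64

def psllq(dst128, count):
    """Shift each 64-bit lane of a 128-bit value left by count."""
    if count > 63:
        return 0
    lanes = [(dst128 >> offset) & MASK64 for offset in (0, 64)]
    # Bits shifted out of a lane are dropped, never carried into the next lane.
    shifted = [(lane << count) & MASK64 for lane in lanes]
    return shifted[0] | (shifted[1] << 64)

value = (1 << 64) | 1
print(hex(psllq(value, count_from_xmm(4))))  # 0x100000000000000010
```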

x64: typo (line missing) in MOVSXD decoding

>>> str(cpu_x64.disassemble('\x48\x63\xd2'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "amoco/arch/core.py", line 197, in __call__
    return self(bytestring[s.mask.size/8:],**kargs)
  File "amoco/arch/core.py", line 190, in __call__
    i = s.decode(bytestring,e,i=self.__i,ival=b.ival)
  File "amoco/arch/core.py", line 437, in decode
    self.hook(obj=i,**kargs)
  File "amoco/arch/x64/spec_ia32e.py", line 629, in ia32_movsxd
    op2,data = getModRM(obj,Mod,RM,data,REX=(0,R,X,B))
NameError: global name 'R' is not defined

Proposed patch:

--- a/amoco/arch/x64/spec_ia32e.py
+++ b/amoco/arch/x64/spec_ia32e.py
@@ -626,6 +626,7 @@ def ia32_arpl(obj,Mod,REG,RM,data,_inv):
 def ia32_movsxd(obj,Mod,REG,RM,data):
     op1 = getregR(obj,REG,64)
     # force REX.W=0 for op2 decoding:
+    W,R,X,B = getREX(obj)
     op2,data = getModRM(obj,Mod,RM,data,REX=(0,R,X,B))
     obj.operands = [op1, op2]
     obj.type = type_data_processing
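
As background for the W,R,X,B tuple used above: a REX prefix is a single byte of the form 0100WRXB. A minimal sketch of the bit extraction, assuming nothing about how amoco's getREX is implemented (`decode_rex` is a hypothetical name):

```python
def decode_rex(byte):
    """Return (W, R, X, B) for a REX prefix byte, or None if it is not one."""
    if byte & 0xF0 != 0x40:
        return None
    return ((byte >> 3) & 1, (byte >> 2) & 1, (byte >> 1) & 1, byte & 1)

print(decode_rex(0x48))  # (1, 0, 0, 0): REX.W only, as in '\x48\x63\xd2'
print(decode_rex(0x63))  # None: 0x63 is the MOVSXD opcode, not a prefix
```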

RuntimeError: maximum recursion depth exceeded in __instancecheck__

Hello Axel,

Here is a repro for a bug I recently encountered:

import amoco
import amoco.system.raw
import amoco.arch.x86.cpu_x86 as cpu

def sym_exec_gadget_and_get_mapper(code):
    p = amoco.system.raw.RawExec(
        amoco.system.core.DataIO(code), cpu
    )
    blocks = list(amoco.lsweep(p).iterblocks())
    assert(len(blocks) > 0)
    mp = amoco.cas.mapper.mapper()
    for block in blocks:
        if block.instr[-1].mnemonic.lower() == 'call':
            p.cpu.i_RET(None, block.map)
        mp >>= block.map
    return mp

code = '\x01\x00\xf4\xfd\xf4\xff\xd6'
x = sym_exec_gadget_and_get_mapper(code)

# D:\Codes\rp2s\amoco\cas\expressions.pyc in checkarg_numeric(self, n)
#      31 def _checkarg_numeric(f):
#      32     def checkarg_numeric(self,n):
# ---> 33         if isinstance(n,(int,long)):
#      34                 n = cst(n,self.size)
#      35         elif isinstance(n,(float)):

# RuntimeError: maximum recursion depth exceeded in __instancecheck__

Cheers,
0vercl0k

x64: error in decoding CVTTSD2SI or CVTSD2SI

>>> str(cpu_x64.disassemble('\xf2\x48\x0f\x2c\xf3'))
'cvttsd2si   rsi, rbx'

should be cvttsd2si rsi, xmm3

Proposed patch:

--- a/amoco/arch/x64/spec_sse.py
+++ b/amoco/arch/x64/spec_sse.py
@@ -389,7 +389,9 @@ def sse_sd(obj,Mod,REG,RM,data):
 @ispec_ia32("*>[ {0f}{2d} /r ]", mnemonic="CVTSD2SI")
 def sse_sd(obj,Mod,REG,RM,data):
     if not check_f2(obj,set_opdsz_128): raise InstructionError(obj)
-    op2,data = getModRM(obj,Mod,RM,data)
+    # force REX.W=0 for op2 decoding:
+    W,R,X,B = getREX(obj)
+    op2,data = getModRM(obj,Mod,RM,data,REX=(0,R,X,B))
     if op2._is_mem: op2.size = 64
     op1 = getregRW(obj,REG,32)
     obj.operands = [op1,op2]
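
To make the example concrete, the ModRM byte 0xf3 can be split by hand. With mod=3 the rm field names a register, and whether that is a general-purpose or an xmm register depends on the instruction. The sketch below is plain Python with a hypothetical helper name:

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 3, (byte >> 3) & 7, byte & 7

mod, reg, rm = split_modrm(0xF3)
print(mod, reg, rm)  # 3 6 3
# reg=6 selects rsi as destination; rm=3 is register number 3, which
# for cvttsd2si should be read as xmm3 rather than rbx.
```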

x86: PUNPCKHQDQ is missing

Proposed patch:

--- a/amoco/arch/x86/spec_sse.py
+++ b/amoco/arch/x86/spec_sse.py
@@ -577,6 +577,7 @@ def sse_sd(obj,Mod,REG,RM,data):
 @ispec_ia32("*>[ {0f}{6a} /r ]", mnemonic="PUNPCKHDQ")
 @ispec_ia32("*>[ {0f}{6b} /r ]", mnemonic="PACKSSDW")
 @ispec_ia32("*>[ {0f}{6c} /r ]", mnemonic="PUNPCKLQDQ")
+@ispec_ia32("*>[ {0f}{6d} /r ]", mnemonic="PUNPCKHQDQ")
 @ispec_ia32("*>[ {0f}{6f} /r ]", mnemonic="MOVDQA")
 @ispec_ia32("*>[ {0f}{74} /r ]", mnemonic="PCMPEQB")
 @ispec_ia32("*>[ {0f}{75} /r ]", mnemonic="PCMPEQW")
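
For whoever adds the matching semantics: PUNPCKHQDQ interleaves the high quadwords of its two operands, so the destination ends up with its own high quadword in the low half and the source's high quadword in the high half. A sketch on concrete 128-bit integers (the function is hypothetical, not amoco code):

```python
MASK64 = (1 << 64) - 1

def punpckhqdq(dst128, src128):
    """Return the new 128-bit destination value of PUNPCKHQDQ dst, src."""
    low = (dst128 >> 64) & MASK64    # old high quadword of the destination
    high = (src128 >> 64) & MASK64   # high quadword of the source
    return low | (high << 64)

dst = (0xAAAA << 64) | 0x1111
src = (0xBBBB << 64) | 0x2222
print(hex(punpckhqdq(dst, src)))  # 0xbbbb000000000000aaaa
```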

Semantics of IA32 SAR

In amoco/arch/x86/asm.py I can read

def i_SAR(i,fmap):
  (...)
  if count._is_cst:
    if count.value==0: return
    if count.value==1:
        fmap[of] = bit0
    else:
        fmap[of] = top(1)

while the Intel instruction set reference says

The OF flag is affected only on 1-bit shifts.

Why do you set it to top?
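
One possible reading, which I have not confirmed with the author: the reference also lists OF as undefined for shift counts other than 1, and top(1) may be amoco's way of modelling an unknown bit. A sketch of SAR on concrete 32-bit values under that assumption, with None standing for "unchanged" (count 0) or "undefined" (count above 1); `sar32` is a hypothetical helper:

```python
def sar32(value, count):
    """Return (result, cf, of) for a 32-bit arithmetic right shift."""
    count &= 0x1F                       # the count is masked to 5 bits
    value &= 0xFFFFFFFF
    if count == 0:
        return value, None, None        # flags are not affected
    signed = value - (1 << 32) if value & 0x80000000 else value
    result = (signed >> count) & 0xFFFFFFFF
    cf = (signed >> (count - 1)) & 1    # last bit shifted out
    of = 0 if count == 1 else None      # cleared on 1-bit shifts only
    return result, cf, of

print(sar32(0x80000000, 1))  # (3221225472, 0, 0)
print(sar32(0xFFFFFFFF, 4))  # (4294967295, 1, None)
```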

TypeError: to_smtlib() takes exactly 1 argument (2 given)

Hello Axel,

Here is a repro for a bug I recently encountered:

import amoco
import amoco.system.raw
import amoco.arch.x86.cpu_x86 as cpu

def sym_exec_gadget_and_get_mapper(code):
    p = amoco.system.raw.RawExec(
        amoco.system.core.DataIO(code), cpu
    )
    blocks = list(amoco.lsweep(p).iterblocks())
    assert(len(blocks) > 0)
    mp = amoco.cas.mapper.mapper()
    for block in blocks:
        if block.instr[-1].mnemonic.lower() == 'call':
            p.cpu.i_RET(None, block.map)
        mp >>= block.map
    return mp

code = '\xec\x27\x57\x33\xec\x67\x11\x04\xc3'
x = sym_exec_gadget_and_get_mapper(code)

# D:\Codes\rp2s\amoco\cas\smt.pyc in cast_z3_bool(x, solver)
#     102
#     103 def cast_z3_bool(x,solver=None):
# --> 104     b = x.to_smtlib(solver)
#     105     if not z3.is_bool(b):
#     106         assert b.size()==1

# TypeError: to_smtlib() takes exactly 1 argument (2 given)

Cheers,
0vercl0k
