kordesii's People

Contributors

dc3-tsd, ddash-ct

kordesii's Issues

Extracting variable names may sometimes crash kordesii

Extracting variable names may sometimes crash kordesii. Observed while analyzing eae876886f19ba384f55778634a35a1d975414e83f22f6111e3e792f706301fe.

  • Example code: var = context.variables[data_ptr]

  • Example Output:

[*] operands        : Calculating operand: [ebp+var_9B+7] -> 0x117F7FC + 0x0*0x1 - 0x94 = 0x117F768
[!] core            : 18347880
Traceback (most recent call last):
  File "c:\github\extmir-kordesii\kordesii\core.py", line 88, in decoder_entry
    main_func()
  File "c:/github/extmir-kordesii/kordesii/decoders/conti.py", line 488, in main
    decrypt_strings_2()
  File "c:/github/extmir-kordesii/kordesii/decoders/conti.py", line 246, in decrypt_strings_2
    var = context.variables[data_ptr]
  File "c:\github\extmir-kordesii\kordesii\utils\function_tracing\variables.py", line 53, in __getitem__
    return self._variables[addr_or_name]
KeyError: 18347880
[+] core            : IDA return code = 0
----Decoded Strings----
  • Workaround was not to rely on variable names but directly use variable addresses instead

  • Example workaround: var_addr = operands[1].addr
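The missing key in the traceback, 18347880, is 0x117F768 in hexadecimal, which is the address calculated for the operand in the log line above it. A decoder can guard the lookup rather than crash. The sketch below assumes only what the traceback shows (a mapping access that raises KeyError); VariableMap and lookup_variable are hypothetical stand-ins, not kordesii classes or functions.

```python
class VariableMap:
    """Hypothetical stand-in for the object behind context.variables."""

    def __init__(self, variables):
        self._variables = dict(variables)

    def __getitem__(self, addr_or_name):
        # Mirrors the line in the traceback: unknown keys raise KeyError.
        return self._variables[addr_or_name]


def lookup_variable(variables, addr):
    """Return the variable defined at addr, or None if there is none."""
    try:
        return variables[addr]
    except KeyError:
        return None
```

A caller can then fall back to the raw address (as in the workaround) whenever lookup_variable returns None.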

Implement an API layer to divide CORE from Disassembler

Currently, kordesii is built on top of IDAPython and depends on IDA to do its work. IDA supplies the information to work on, while Kordesii itself performs the logic.

My suggestion is to use a more flexible and modern design for Kordesii in such a way that an API layer is implemented above CORE functionality. The logic will be implemented without being dependent on a specific disassembler to supply the information.

In this way, the community would be able to implement plugins for other disassemblers such as Cutter, Radare2, Binary Ninja, and GHIDRA.

This method will expand the usability of Kordesii and your great implementation for IDA can be a template or a go-to reference for other plugins.

To sum up, the solution I would like to see is:

Cutter        IDA      Radare2      Binja      GHIDRA
    |__________|_________|___________|_________|
                         |
                      API Layer
                         |
                    Kordesii Logic
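One way such a layer might look is sketched below. Every name here (Disassembler, FakeDisassembler, get_bytes, get_mnemonic, xor_decode) is hypothetical and chosen for illustration; none of them comes from kordesii. The core logic only calls the abstract interface, so any backend that implements it could be plugged in.

```python
from abc import ABC, abstractmethod


class Disassembler(ABC):
    """Hypothetical backend interface; names are illustrative only."""

    @abstractmethod
    def get_bytes(self, address, size):
        """Return size raw bytes starting at address."""

    @abstractmethod
    def get_mnemonic(self, address):
        """Return the mnemonic of the instruction at address."""


class FakeDisassembler(Disassembler):
    """In-memory backend, used here only to exercise the interface."""

    def __init__(self, base, data, mnemonics):
        self._base, self._data, self._mnemonics = base, data, mnemonics

    def get_bytes(self, address, size):
        offset = address - self._base
        return self._data[offset:offset + size]

    def get_mnemonic(self, address):
        return self._mnemonics[address]


def xor_decode(disassembler, address, size, key):
    """Core logic that only talks to the abstract interface."""
    return bytes(b ^ key for b in disassembler.get_bytes(address, size))
```

An IDA, Ghidra, or Radare2 backend would subclass Disassembler in the same way as FakeDisassembler, leaving xor_decode unchanged.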

Force usage of IDA64

The current kordesii version checks the input file to determine whether to run 32-bit or 64-bit IDA for analysis, which assumes the input file is a PE, ELF, or Mach-O. When the input file is shellcode, there is no option to force 64-bit analysis, and the default is 32-bit.

I propose adding an option to force 64-bit analysis. Ideally, this option would be propagated to the DC3-MWCP project for invocation of a kordesii script from a MWCP module. If this should additionally be filed as an issue against DC3-MWCP, please let me know.

Bug: cpu_context.ProcessorContext.get_value WIDE_STRING

Identified a bug in cpu_context.ProcessorContext.get_value(), specifically when specifying the data_type as function_tracing.WIDE_STRING.

This was specifically observed in malware sample hash c78beff838f4c57be9044996c25eca7d, when the FunctionTracer is analyzing the call to lstrcpyW at ea 0x4034d2.

The following is the sample code used to extract the data:

from kordesii.utils import function_tracing

ft = function_tracing.FunctionTracer(0x4034d2)
ctx, args = ft.get_function_args(0x4034d2)
src = args[1]
value = ctx.get_value(src, data_type=function_tracing.WIDE_STRING)

Here value is '/\x00f\x00a\x00v\x00.\x00i\x00c\x00o'. The final character is missing its trailing null byte, so the result has an odd length, whereas it should include the trailing nulls to be a valid UTF-16 encoded string.
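The effect of the missing byte can be checked with the standard library alone. This is a small Python 3 sketch (the issue itself was reported on Python 2.7), assuming the data is little-endian UTF-16 as the byte order in the reported value suggests; decode_wide is a hypothetical helper, not part of kordesii.

```python
# The value reported in the issue, as raw bytes.
reported = b'/\x00f\x00a\x00v\x00.\x00i\x00c\x00o'


def decode_wide(data):
    """Decode little-endian UTF-16 bytes to text."""
    return data.decode('utf-16-le')


# The reported value has an odd number of bytes, so decoding fails.
try:
    decode_wide(reported)
    truncated = False
except UnicodeDecodeError:
    truncated = True

# Restoring the high byte of the last code unit makes it decodable.
repaired = decode_wide(reported + b'\x00')
```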

IDA 7.x support

Is supporting IDA 7.x for kordesii in the plan? Would it be possible to get a branch for 7.x?

Bug: cpu_context.mem_read()

There is a bug in cpu_context.mem_read() where a NoneType object is returned when data of length 4 is requested.

This was specifically observed in malware sample hash 86c314bc2dc37ba84f7364acd5108c2b, when the FunctionTracer is analyzing the code at ea 0x1400013f3.

The stack trace is as follows:

  File "<string>", line 4, in <module>
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\function_tracer.py", line 111, in context_at
    for ctx in self.iter_context_at(ea):
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\function_tracer.py", line 96, in iter_context_at
    yield pb.cpu_context(code_ref)
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\flowchart.py", line 79, in cpu_context
    processor.execute(self._context, ip)
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\cpu_emulator.py", line 264, in execute
    instruction(cpu_context, ip, mnem, operands)
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\opcodes.py", line 879, in _mov_lea
    cpu_context.set_operand_value(ip, opvalue2, idc.print_operand(ip, 0), idc.get_operand_type(ip, 0), width=width)
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\cpu_context.py", line 899, in set_operand_value
    self.reg_write(opnd.upper(), value)
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\cpu_context.py", line 619, in reg_write
    self.registers[reg.upper()] = val
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\cpu_context.py", line 436, in __setitem__
    self.__setattr__(reg_name, value)
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\cpu_context.py", line 433, in __setattr__
    register[reg_name] = value
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\cpu_context.py", line 383, in __setitem__
    self.__setattr__(reg_name, value)
  File "C:\Python27\lib\site-packages\kordesii\utils\function_tracing\cpu_context.py", line 379, in __setattr__
    raise ValueError('Register value must be int or long, got {}'.format(type(value)))
ValueError: Register value must be int or long, got <type 'NoneType'>

Debugging through this, starting in the cpu_emulator._get_value() function for analysis at the EA:

  • The type at opnd 1 of 0x1400013f3 is identified as idc.o_mem, and cpu_context.mem_read() is called
  • The address is identified as not being mapped, so mem_read() calls self.map_segment; execution eventually reaches the code at L692, self._memctrlr.read(address, size)
  • The code in the read function identifies that the provided offset is not in the _memmap, so it enters the loop and returns a string of length 0.

core.py update breaks kordesii in Windows 10

The following update to core.py appears to break the kordesii parse command in a Windows 10 environment:
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=sys.platform != 'nt')

With the shell=sys.platform != 'nt' argument, Windows failed to interpret the command C:\\Program Files\\IDA 7.2\\ida.exe... as a valid command and reported the following error:
'C:\Program' is not recognized as an internal or external command, operable program or batch file.
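One plausible reading of this failure, not confirmed in the report, is that the platform test never matches: sys.platform is 'win32' on Windows, while 'nt' is the value of os.name. The expression sys.platform != 'nt' is therefore true everywhere, so the command is routed through the shell on Windows too. The sketch below only compares the two checks; the function names are hypothetical.

```python
import os
import sys


def use_shell_as_written():
    """The check from the core.py change: sys.platform is never 'nt'."""
    return sys.platform != 'nt'


def use_shell_by_os_name():
    """A check that is False on Windows, where os.name is 'nt'."""
    return os.name != 'nt'
```

Whether disabling the shell on Windows is the right fix for kordesii depends on how command is built, which the report does not show.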

Signed values cause issues with IDIV instruction.

While analyzing eae876886f19ba384f55778634a35a1d975414e83f22f6111e3e792f706301fe, I observed unexpected values in registers after emulation, specifically when an arithmetic operation is performed with negative signed values.

  • Example instructions: IMUL, IDIV, LEA (there might be more)

  • This causes string decryption to produce different results after emulation

  • Using emulate.hook_instruction() is not viable since the instruction is still emulated either before or after the hook

  • Workaround was to trap the faulty instructions and perform the arithmetic manually with support for negative signed values (in effect, manual emulation)
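The manual arithmetic described in the workaround can be sketched as follows. This is an illustration of two's complement handling for a signed divide, not kordesii's implementation; to_signed and signed_divide are hypothetical names, the dividend is assumed to be the already combined double-width value (for example EDX:EAX), and quotient overflow is not checked.

```python
def to_signed(value, width):
    """Interpret an unsigned value of width bytes as two's complement."""
    bits = width * 8
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def signed_divide(dividend, divisor, width):
    """Divide a double-width dividend by a width-byte divisor.

    Returns (quotient, remainder) as unsigned values of width bytes.
    The quotient is truncated toward zero, unlike Python's // operator.
    """
    a = to_signed(dividend, width * 2)
    b = to_signed(divisor, width)
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    remainder = a - quotient * b
    mask = (1 << (width * 8)) - 1
    return quotient & mask, remainder & mask
```

For example, dividing -7 by 2 this way yields a quotient of -3 and a remainder of -1, whereas treating the operands as unsigned gives an unrelated result.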

MOVUPS/MOVUPD support is missing from function_tracing/opcodes

In function_tracing/opcodes.py, there is support for the MOVAPD and MOVAPS instructions, but no support for the MOVUPD and MOVUPS instructions.

After evaluating, the MOVAPD and MOVAPS handlers can be combined into a single function using appropriate function decorators, and that function can also support the MOVUPD and MOVUPS instructions.
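The decorator approach might look like the sketch below. The opcode decorator, the OPCODES registry, and the dictionary used as a context are all hypothetical stand-ins for illustration; the real signatures in function_tracing/opcodes.py may differ.

```python
# Hypothetical registry mapping mnemonics to handler functions.
OPCODES = {}


def opcode(*mnemonics):
    """Register one handler under several mnemonics."""
    def decorator(func):
        for mnemonic in mnemonics:
            OPCODES[mnemonic.lower()] = func
        return func
    return decorator


@opcode("movaps", "movapd", "movups", "movupd")
def mov_packed(context, operands):
    """Copy the source operand to the destination operand.

    For emulation purposes the aligned and unaligned forms behave the
    same here, since alignment faults are not modelled in this sketch.
    """
    dest, src = operands
    context[dest] = context[src]
```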

Python3 support

Are there any plans to upgrade kordesii to Python3 with the release of IDA 7.4?

Enable test case addition / updates when errors are encountered

Add a command line option to kordesii test so that test cases can still be updated or added when errors have been encountered and aggregated in self.reporter.errors, since such errors are not always related to a decoder.

This applies specifically to the update_tests and add_test functions within tester.Tester.

IDB file not saved after decoder run

If you use the CLI tool and provide it with a filename that has no file extension, kordesii does not save the patched .idb file after it runs the decoder; it only prints the CLI output to the console. For example, with strings.exe renamed to strings for testing purposes:

kordesii parse Sample ./kordesii/decoders/tests/strings

If you provide kordesii with a filename with an extension, kordesii will save and provide the patched .idb file after it runs the decoder:

kordesii parse Sample ./kordesii/decoders/tests/strings.exe

Is this the intended behavior?

ROL opcode implementation for x86_64 function tracing behaves incorrectly when get_msb() returns >1 bit

In cases where a value contains more than size bits, get_msb() will return multiple bits. This causes ROL's opvalue1 on line 1480 of utils/function_tracing/x86_64/opcodes.py to contain an incorrect value in some cases. In other places where get_msb() is used (e.g. in SAL/SHL) this is addressed by applying the bitmask retrieved by utils.get_mask(width) to the result of get_msb(), but this is not done for the ROL implementation.

Adding opvalue1 &= utils.get_mask(width) after opvalue1 = (opvalue1 * 2) + tempcf on line 1480 of x86_64/opcodes.py should address this issue for ROL, but unless get_msb() returning multiple bits is assumed in other locations it may make sense to update that function to ensure it always returns a single bit.

Current ROL code (lines 1477-1481), which multiplies opvalue1 by 2 without applying the width bitmask:

if tempcount > 0:
    while tempcount:
        tempcf = get_msb(opvalue1, width)
        opvalue1 = (opvalue1 * 2) + tempcf
        tempcount -= 1

An example of updated ROL code which applies the bitmask:

if tempcount > 0:
    while tempcount:
        tempcf = get_msb(opvalue1, width)
        opvalue1 = (opvalue1 * 2) + tempcf
        opvalue1 &= utils.get_mask(width)
        tempcount -= 1
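The difference between the two loops can be reproduced in isolation. In the sketch below, get_mask and get_msb are simplified stand-ins written for this example (with width taken as a byte count, which is an assumption), not kordesii's own helpers; get_msb deliberately returns more than one bit when the value is wider than the operand, as described above.

```python
def get_mask(width):
    """Bitmask covering width bytes (stand-in for utils.get_mask)."""
    return (1 << (width * 8)) - 1


def get_msb(value, width):
    """Stand-in: returns extra bits if value exceeds width bytes."""
    return value >> (width * 8 - 1)


def rol_unmasked(value, count, width):
    """Rotate left as in the current loop, without masking."""
    while count:
        value = (value * 2) + get_msb(value, width)
        count -= 1
    return value


def rol_masked(value, count, width):
    """Rotate left with the proposed mask applied on every step."""
    while count:
        value = (value * 2) + get_msb(value, width)
        value &= get_mask(width)
        count -= 1
    return value
```

Rotating 0x80000001 left by two in a 4-byte operand gives 0x6 with the masked loop. The unmasked loop grows past 32 bits after the first step, so the second step adds a two-bit carry and the low 32 bits of its result are 0x8.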

Support movsq instruction

The movsq instruction is currently unsupported by the built-in movs x86/64 opcode.

Requesting that support for this instruction be added.
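For reference, the semantics a movsq handler needs are the same as the other movs forms with an 8-byte width: copy from the address in RSI to the address in RDI, then advance both registers by the width (or move them back when the direction flag is set). The sketch below models memory as a bytearray and registers as a dictionary; movs and these data structures are illustrative assumptions, not kordesii's API.

```python
def movs(memory, regs, width, direction_flag=False):
    """Copy width bytes from [rsi] to [rdi] and step both pointers."""
    src, dst = regs["rsi"], regs["rdi"]
    memory[dst:dst + width] = memory[src:src + width]
    # DF clear moves the pointers forward, DF set moves them backward.
    step = -width if direction_flag else width
    regs["rsi"] += step
    regs["rdi"] += step


def movsq(memory, regs, direction_flag=False):
    """movsq is the quadword (8-byte) form."""
    movs(memory, regs, 8, direction_flag)
```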
