aflplusplus / qemu-libafl-bridge

A patched QEMU that exposes an interface for LibAFL-based fuzzers

License: Other

Emacs Lisp 0.01% GDB 0.01% Python 3.98% Dockerfile 0.01% Makefile 0.11% C 80.63% Meson 0.49% C++ 11.68% Haxe 0.38% Objective-C 0.12% Shell 1.54% Assembly 0.58% Pawn 0.03% NSIS 0.01% Perl 0.24% SmPL 0.03% GLSL 0.01% SourcePawn 0.09% NASL 0.01% POV-Ray SDL 0.06%

qemu-libafl-bridge's Introduction

QEMU LibAFL Bridge

This is a patched version of QEMU that exposes an interface for LibAFL-based fuzzers.

This raw interface is used by libafl_qemu, which exposes a more Rusty API.

To use libafl_qemu, refer to the LibAFL repository, especially the qemu example fuzzers such as qemu_launcher.

License

This project extends the QEMU emulator, and our contributions to previously existing files adopt those files' respective licenses; the files that we have added are made available under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version.

qemu-libafl-bridge's People

Contributors

afaerber, agraf, aliguori, aurel32, berrange, blueswirl, bonzini, davidhildenbrand, dgibson, ebblake, ehabkost, elmarco, gkurz, huth, jan-kiszka, jnsnow, juanquintela, kevmw, kraxel, legoater, mcayland, mstsirkin, philmd, pm215, rth7680, stefanharh, stsquad, stweil, vivier, xanclic

qemu-libafl-bridge's Issues

Callback for block compilation is not provided with the size of the block.

If we can provide this information to the callback, then we don't need to determine the block size for ourselves in LibAFL here.
Note the following links are from the qemu repository itself rather than the fork from the bridge.

We can see the code for generating the callback in the bridge here.

/*
 * Isolate the portion of code gen which can setjmp/longjmp.
 * Return the size of the generated code, or negative on error.
 */
static int setjmp_gen_code(CPUArchState *env, TranslationBlock *tb,
                           target_ulong pc, void *host_pc,
                           int *max_insns, int64_t *ti)
{
    ...

    //// --- Begin LibAFL code ---

    struct libafl_block_hook* hook = libafl_block_hooks;
    while (hook) {
        uint64_t cur_id = 0;
        if (hook->gen)
            cur_id = hook->gen(pc, hook->data);
        if (cur_id != (uint64_t)-1 && hook->exec) {
            TCGv_i64 tmp0 = tcg_const_i64(cur_id);
            TCGv_i64 tmp1 = tcg_const_i64(hook->data);
            TCGTemp *tmp2[2] = { tcgv_i64_temp(tmp0), tcgv_i64_temp(tmp1) };
            tcg_gen_callN(hook->exec, NULL, 2, tmp2);
            tcg_temp_free_i64(tmp0);
            tcg_temp_free_i64(tmp1);
        }
        hook = hook->next;
    }
    
    //// --- End LibAFL code ---

    gen_intermediate_code(env_cpu(env), tb, *max_insns, pc, host_pc);
    assert(tb->size != 0);

    ...

We can see that hook->gen is only passed the pc. However, if we move the hook invocation to below the call to gen_intermediate_code, we will be able to include the size of the input block. This can be confirmed by following the call chain to where QEMU logs the input blocks when given the -d in_asm argument, where this value is used to generate the debug output. The excerpts below use the ARM code base, although other architectures should behave the same.
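
As a rough illustration of the proposed ordering, here is a minimal, self-contained sketch. It is not the bridge's code: ToyTB, ToyBlockHook, toy_translate and the three-argument gen signature are all hypothetical stand-ins, and the sketch ignores the TCG side entirely. In the real function, the exec call is emitted with tcg_gen_callN at the point where the loop runs, so moving the loop wholesale would probably also move where that call lands in the generated ops and would need separate handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for TranslationBlock; the real struct has many more fields. */
typedef struct ToyTB {
    uint64_t pc;
    uint64_t size; /* guest bytes covered by the block; 0 until translated */
} ToyTB;

/* Hypothetical hook signature: like hook->gen, plus a size argument. */
typedef uint64_t (*toy_gen_fn)(uint64_t pc, uint64_t size, void *data);

typedef struct ToyBlockHook {
    toy_gen_fn gen;
    void *data;
    struct ToyBlockHook *next;
} ToyBlockHook;

/* Stand-in for gen_intermediate_code(): pretends the block spans guest_len bytes. */
static void toy_translate(ToyTB *tb, uint64_t guest_len)
{
    tb->size = guest_len;
}

/* Translate first, then walk the hook list so every gen callback sees tb->size. */
static void toy_gen_code(ToyTB *tb, uint64_t guest_len, ToyBlockHook *hooks)
{
    toy_translate(tb, guest_len);
    assert(tb->size != 0);
    for (ToyBlockHook *h = hooks; h != NULL; h = h->next) {
        if (h->gen) {
            (void)h->gen(tb->pc, tb->size, h->data);
        }
    }
}

/* Example callback: stores the size it was given into *data. */
static uint64_t toy_record_size(uint64_t pc, uint64_t size, void *data)
{
    (void)pc;
    *(uint64_t *)data = size;
    return 0;
}
```

Registering toy_record_size in a one-element hook list and calling toy_gen_code with a 12-byte block leaves 12 in the recorded variable, which is the value the fuzzer side would otherwise have to recompute.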

We can see gen_intermediate_code here.

/* generate intermediate code for basic block 'tb'.  */
void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int *max_insns,
                           target_ulong pc, void *host_pc)
{

   ...

    translator_loop(cpu, tb, max_insns, pc, host_pc, ops, &dc.base);
}

We can then see the translator_loop here. Note that the translator is setting the tb->size value which is available to us in the function setjmp_gen_code where we call our hook.

void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
                     target_ulong pc, void *host_pc,
                     const TranslatorOps *ops, DisasContextBase *db)
{
   
    ...

    /* The disas_log hook may use these values rather than recompute.  */
    tb->size = db->pc_next - db->pc_first;
    tb->icount = db->num_insns;

#ifdef DEBUG_DISAS
    if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
        && qemu_log_in_addr_range(db->pc_first)) {
        FILE *logfile = qemu_log_trylock();
        if (logfile) {
            fprintf(logfile, "----------------\n");
            ops->disas_log(db, cpu, logfile);
            fprintf(logfile, "\n");
            qemu_log_unlock(logfile);
        }
    }
#endif
}

Lastly, we can see the code for generating the trace output for -d in_asm here. Note the IN: prefix, which tells us this is the input block (i.e. the target code being emulated) rather than the intermediate or host representation of the block. Note also the use of tb->size in the expression dc->base.tb->size.

static void arm_tr_disas_log(const DisasContextBase *dcbase,
                             CPUState *cpu, FILE *logfile)
{
    DisasContext *dc = container_of(dcbase, DisasContext, base);

    fprintf(logfile, "IN: %s\n", lookup_symbol(dc->base.pc_first));
    target_disas(logfile, cpu, dc->base.pc_first, dc->base.tb->size);
}

static const TranslatorOps arm_translator_ops = {
    .init_disas_context = arm_tr_init_disas_context,
    .tb_start           = arm_tr_tb_start,
    .insn_start         = arm_tr_insn_start,
    .translate_insn     = arm_tr_translate_insn,
    .tb_stop            = arm_tr_tb_stop,
    .disas_log          = arm_tr_disas_log,
};

static const TranslatorOps thumb_translator_ops = {
    .init_disas_context = arm_tr_init_disas_context,
    .tb_start           = arm_tr_tb_start,
    .insn_start         = arm_tr_insn_start,
    .translate_insn     = thumb_tr_translate_insn,
    .tb_stop            = arm_tr_tb_stop,
    .disas_log          = arm_tr_disas_log,
};

And the disassembler here. It is using the size parameter as a bound for the loop.

/* Disassemble this for me please... (debugging).  */
void target_disas(FILE *out, CPUState *cpu, target_ulong code,
                  target_ulong size)
{
    ...

    for (pc = code; size > 0; pc += count, size -= count) {
	fprintf(out, "0x" TARGET_FMT_lx ":  ", pc);
	count = s.info.print_insn(pc, &s.info);
	fprintf(out, "\n");
	
        ...
    }
}
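
The size-bounded loop above is also roughly what a consumer has to do when it needs per-instruction information for a block. The sketch below shows the same pattern on a made-up encoding in which the first byte of each instruction holds its length; toy_insn_len and toy_count_insns are hypothetical names, and a real consumer would call an actual decoder instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy encoding (an assumption): the first byte of an instruction is its length. */
static size_t toy_insn_len(const uint8_t *mem, size_t offset)
{
    return mem[offset];
}

/*
 * Walk a block the way target_disas does: advance by each instruction's
 * length and stop once `size` bytes have been consumed.
 * Returns the number of instructions visited.
 */
static size_t toy_count_insns(const uint8_t *mem, size_t code, size_t size)
{
    size_t n = 0;
    size_t pc = code;

    while (size > 0) {
        size_t count = toy_insn_len(mem, pc);
        if (count == 0 || count > size) {
            break; /* malformed or truncated instruction: stop early */
        }
        pc += count;
        size -= count;
        n++;
    }
    return n;
}
```

Without the block size, the caller has no loop bound and must detect the end of the block some other way, which is the extra work this issue is trying to avoid.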

extern on static and unused - inside device-save.c

I think there is something weird in libafl_extras/syx-snapshot/device-save.c. This line asks for an extern on a now static function:

line 16:

extern void save_section_header(QEMUFile *f, SaveStateEntry *se, uint8_t section_type);

The only use of save_section_header() was previously inside device_save_all(), most of which has now moved to device_save_kind(). In the new code, save_section_header() is no longer used.

Breakpoint in multi-thread environment causes hang once reached

I am trying to use LibAFL and qemu-x86_64-static with snapshot support to fuzz a complex application. At startup, I can set two breakpoints, call emu.run() and regain control once the breakpoint(s) trigger.

After the application has been initialized, multiple threads spawn and the actual fuzzing can start by sending a network packet. To catch this event, I set another breakpoint in a different library.

Once the packet is sent, the emulated program just "hangs" and does not signal the breakpoint to the fuzzer, so emu.run() never returns. I'm pretty sure that this is an internal bug, because when I set the breakpoint after the parsing code, the console output indicating the parsing appears and then the application hangs.

The application can run and parse the network input successfully when no breakpoint is set. Furthermore, the breakpoint triggers in gdb when I run the application under qemu-x86_64-static with -g. The CPU usage stays at 100% after the emulated application "hangs", so maybe some other threads lock and prevent the signal from getting back to LibAFL?

I just tried to generate a CPU trace with -d cpu, but cancelled it after the size grew to over 30GB. If you have any other hints how to debug the issue I'm happy to investigate further.

recv( readme.md );

Awesome! But are there any examples of how to interact with that API, please?

Library doesn't build in default configuration

I know this library is primarily designed to be built by this build.rs file but unfortunately we have to patch this library as part of our fuzzer and I'm currently trying to rebase our fork against the latest master.
To establish a baseline I tried running mkdir build && cd build && ../configure && make, but this runs into the following error:

/usr/bin/ld: libblock.fa.p/block_block-backend.c.o: in function `blk_aio_read_entry':
/home/stefan/uni/master/asp-qemu/build/manual/../../block/block-backend.c:1677:(.text+0x2fea): undefined reference to `syx_snapshot_cow_cache_read_entry'
/usr/bin/ld: libblock.fa.p/block_block-backend.c.o: in function `blk_aio_write_entry':
/home/stefan/uni/master/asp-qemu/build/manual/../../block/block-backend.c:1698:(.text+0x3303): undefined reference to `syx_snapshot_cow_cache_write_entry'
collect2: error: ld returned 1 exit status

I think this is because syx_snapshot_cow_cache_write_entry is only defined if CONFIG_SOFTMMU is set, based on the following meson file

specific_ss.add(when: 'CONFIG_SOFTMMU', if_true: [files(
'syx-snapshot/device-save.c',
'syx-snapshot/syx-snapshot.c',
'syx-snapshot/syx-cow-cache.c',
'syx-snapshot/channel-buffer-writeback.c',
)])

but afaict block-backend.c is built unconditionally (relevant meson.build line).

Is there any interest in modifying the configure script to set all required variables or is the canonical way to build through the build.rs script and additional modifications to this repo should be avoided?

Adding #ifdef AS_LIB instead of using LibAFL comment

When adding new QAPI to qemu-libafl-bridge, it is nice to be able to run QEMU standalone and use gdb or the monitor to debug QEMU code. However, gdb breakpoints won't work, since QEMU exits right away once it hits a breakpoint. It would be nice to gate all the LibAFL code behind #ifdef so that it works for both.
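
A minimal sketch of the kind of gating being asked for is shown below. Only the AS_LIB macro name comes from this issue; ToyBpAction and toy_on_breakpoint are hypothetical, and the real breakpoint path in the bridge is more involved than a single return value.

```c
#include <assert.h>

/* Hypothetical outcome of hitting a breakpoint. */
typedef enum {
    TOY_RETURN_TO_FUZZER, /* library build: hand control back to the caller */
    TOY_ENTER_DEBUGGER    /* standalone build: keep the stock gdbstub behaviour */
} ToyBpAction;

/* The LibAFL-specific path is compiled only when AS_LIB is defined. */
static ToyBpAction toy_on_breakpoint(void)
{
#ifdef AS_LIB
    return TOY_RETURN_TO_FUZZER;
#else
    return TOY_ENTER_DEBUGGER;
#endif
}
```

Compiling the same file with and without -DAS_LIB then yields the two behaviours from one source tree, instead of relying on comment markers to find the LibAFL additions.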

Systemmode instruction hook causes a crash

Describe the bug
The instruction hook in QEMU systemmode causes a failed assertion in QEMU.

To Reproduce
Steps to reproduce the behavior (in LibAFL):

  1. Build the qemu_systemmode example.elf.
  2. Determine a valid instruction address inside it, e.g. using
    arm-linux-gnueabi-nm ./example/example.elf | grep main
  3. Add a QemuHelper with an instruction hook on a valid address within the control flow.
    Minimal Example:
use libafl_qemu::QemuHelper;
use libafl::prelude::UsesInput;
use libafl::prelude::HasMetadata;
use libafl_qemu::QemuHelperTuple;
use libafl_qemu::QemuHooks;

#[derive(Debug,Default)]
pub struct QemuInsHelper {}

impl<S> QemuHelper<S> for QemuInsHelper
where
    S: UsesInput + HasMetadata,
{
    fn first_exec<QT>(&self, hooks: &QemuHooks<'_, QT, S>)
    where
        QT: QemuHelperTuple<S>,
    {
        hooks.instruction(0x0000012a, exec_ins_hook::<QT, S>, false)  // some valid address
    }
}

pub fn exec_ins_hook<QT, S>(
    _hooks: &mut QemuHooks<'_, QT, S>,
    _state: Option<&mut S>,
    _pc: u32,
)
where
    S: UsesInput,
    QT: QemuHelperTuple<S>,
{}

Use it inside fuzzer.rs:

        let mut hooks = QemuHooks::new(
            &emu,
            tuple_list!(
                QemuInsHelper::default(),
                ...
  4. Start the fuzzer and observe an error message.

Expected behavior
Hooks should be executed without crashing.

Screen output/Screenshots

$ KERNEL=./example/example.elf target/debug/qemu_systemmode -icount shift=4,align=off,sleep=off -machine mps2-an385 -monitor null -kernel ./example/example.elf -serial null -nographic -snapshot -drive if=none,format=qcow2,file=dummy.qcow2 -S
FUZZ_INPUT @ 0x290
main address = 0x12a
Breakpoint address = 0x78

**
ERROR:../tcg/tcg.c:2207:tcg_gen_callN: code should not be reached
Bail out! ERROR:../tcg/tcg.c:2207:tcg_gen_callN: code should not be reached
[Objective   #1]  (GLOBAL) run time: 0h-0m-0s, clients: 2, corpus: 0, objectives: 1, executions: 0, exec/sec: 0.000
                  (CLIENT) corpus: 0, objectives: 1, executions: 0, exec/sec: 0.000

Additional context
I bisected the issue in the LibAFL repo and found 59bf11 (between 0.9.0 and 0.10.0) to be the last fully working revision, after which a different error from the current one prevented me from further analysis.

Incorrect node type casting in libafl_maps_next leads to out-of-bound dereference

The libafl_maps_next function casts an IntervalTreeNode to a MapInfo using the container_of macro (13685). However, the root of the interval tree is initialized as IntervalTreeRoot in the read_self_maps function (23). This mismatch in types leads to an out-of-bound dereference when accessing e->itree.start in libafl_maps_next (13687).

This is definitely a minor bug, since h2g_valid is likely to always return false, but it still prevents me from debugging QEMU with ASAN.

I'm assigning this to myself, but suggestions on how to fix it are very welcome. Here are my proposals:

  1. Somehow make root a MapInfo instead of an IntervalTreeRoot, but this might require changes to the interval tree implementation in QEMU, so I don't think it's a good idea.
  2. Use a boolean flag in libafl_maps_next to detect the first call (maybe too ugly).
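
To make the second proposal concrete, here is a self-contained sketch on a toy data structure. It uses a singly linked list rather than QEMU's interval tree, and every name in it (ToyNode, ToyRoot, ToyMapInfo, ToyMapsIter, toy_maps_next, toy_container_of) is hypothetical. The point it illustrates is that container_of is only applied to nodes that are really embedded in a ToyMapInfo, never to the root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct ToyNode {
    uint64_t start, last;
    struct ToyNode *next;
} ToyNode;

/* The root only points at the first node; it is NOT embedded in a ToyMapInfo. */
typedef struct ToyRoot {
    ToyNode *first;
} ToyRoot;

typedef struct ToyMapInfo {
    int flags;
    ToyNode itree; /* container_of is valid only for pointers to this member */
} ToyMapInfo;

/* Iterator state carrying the explicit "first call" flag. */
typedef struct ToyMapsIter {
    ToyRoot *root;
    ToyNode *cur;
    bool started;
} ToyMapsIter;

/* Returns the next ToyMapInfo, or NULL when the walk is finished. */
static ToyMapInfo *toy_maps_next(ToyMapsIter *it)
{
    if (!it->started) {
        it->started = true;
        it->cur = it->root->first; /* read the root as a root, never cast it */
    } else if (it->cur) {
        it->cur = it->cur->next;
    }
    return it->cur ? toy_container_of(it->cur, ToyMapInfo, itree) : NULL;
}
```

The flag costs one extra field in the iterator state but keeps the root type untouched, which is the trade-off between the two proposals.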

Unilateral unlinking of edge's TB causes infinite loop

Context: my target allocates RWX pages, which are populated with jit code and executed.

I'm experiencing infinite loops when using the QemuCoverageEdgeHelper. The infinite loop occurs because the edges generated by libafl_gen_edge have the same pc as the last TB executed, and because the edge's TB is unlinked from its successor in the chain when QEMU's signal handler is triggered.

I report below, step-by-step, a possible scenario that would trigger the infinite loop:

  1. We are in cpu_exec_loop and about to execute a TB, named X from now on
  2. During its execution, X encounters a jne to another TB, named Y
  3. Code for Y is generated, and TBs X and Y are linked through an edge:
    X -> edge -> Y
    the important detail here is that the edge's TB has the same pc as X
  4. Execution proceeds through several TBs until another TB, named Z from now on, is executed
  5. Z triggers the signal handler and, by chance, one of the TBs in the page that is going to be invalidated is Y:
    tb_invalidate_phys_page_unwind

    PAGE_FOR_EACH_TB(addr, last, unused, tb, n) {
        if (current_tb == tb &&
            (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
            /*
             * If we are modifying the current TB, we must stop its
             * execution. We could be more precise by checking that
             * the modification is after the current PC, but it would
             * require a specialized function to partially restore
             * the CPU state.
             */
            current_tb_modified = true;
            cpu_restore_state_from_tb(current_cpu, current_tb, pc);
        }
        tb_phys_invalidate__locked(tb); // TB IS Y
    }
  6. From point 3, we know that Y is referenced by edge, so when tb_jmp_unlink is executed, tb_reset_jmp is called on the edge's TB. Now the edge's TB and Y are no longer linked:
    X -> edge -x-> Y
  7. Execution proceeds...
  8. X is executed again, its basic blocks are executed up to the edge's TB (remember, X -> edge), but edge is unlinked now, so after the edge is executed, it exits instead of jumping to Y
  9. We are now back in cpu_exec_loop, edge is now last_tb, and the pc returned by cpu_get_tb_cpu_state is the same as X's, so tb_lookup will return X. X will be executed again, which takes us back to step 8, resulting in an infinite loop between steps 8 and 9

I see two possible problems here:

  1. We are in an RWX context, where a page is accessed for writing, so potentially the JIT code in the page is going to be replaced by other code. In our example above, the page being rewritten contains the TB named Y and, as we have to invalidate the edge X -> Y, we should also unlink X from edge so that the chain is completely invalidated.
  2. An edge shouldn't be initialized with the pc of the previous TB (that is, with X's pc)

I think a possible solution would be:
when the signal handler is triggered, and we are unlinking an edge -> TB2 relationship, we should also unlink the other side of the chain: TB1 -> edge.

I'm open to other suggestions.
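
The scenario and the proposed fix can be reproduced on a small model. The sketch below is not QEMU code: ToyTB and the three toy_ functions are hypothetical, chaining is reduced to a single jmp_dest pointer, and exit_pc models the pc the main loop sees when a TB exits unchained (for the edge, that is X's pc, as described in the steps above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyTB {
    uint64_t pc;            /* guest pc the TB is registered under */
    uint64_t exit_pc;       /* pc seen by the main loop on an unchained exit */
    struct ToyTB *jmp_dest; /* direct chain to the next TB, or NULL */
    bool is_edge;           /* edge TBs are never returned by lookup */
    bool valid;
} ToyTB;

/*
 * Invalidate `victim` and reset every jump into it. With unlink_both_sides,
 * also reset jumps into any edge TB that was chained to the victim.
 */
static void toy_invalidate(ToyTB *tbs, size_t n, ToyTB *victim,
                           bool unlink_both_sides)
{
    victim->valid = false;
    for (size_t i = 0; i < n; i++) {
        if (tbs[i].jmp_dest != victim) {
            continue;
        }
        tbs[i].jmp_dest = NULL;
        if (unlink_both_sides && tbs[i].is_edge) {
            for (size_t j = 0; j < n; j++) {
                if (tbs[j].jmp_dest == &tbs[i]) {
                    tbs[j].jmp_dest = NULL;
                }
            }
        }
    }
}

/* Stand-in for tb_lookup: finds a valid, non-edge TB by pc. */
static ToyTB *toy_lookup(ToyTB *tbs, size_t n, uint64_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (tbs[i].valid && !tbs[i].is_edge && tbs[i].pc == pc) {
            return &tbs[i];
        }
    }
    return NULL;
}

/*
 * Returns true if execution from `start` gets to guest pc `goal`, either by
 * running a TB at that pc or by exiting with that pc so it would be
 * retranslated. An exhausted step budget is treated as an infinite loop.
 */
static bool toy_reaches(ToyTB *tbs, size_t n, ToyTB *start, uint64_t goal,
                        int max_steps)
{
    ToyTB *tb = start;
    for (int step = 0; step < max_steps; step++) {
        if (!tb->is_edge && tb->pc == goal) {
            return true;
        }
        if (tb->jmp_dest) {
            tb = tb->jmp_dest;
            continue;
        }
        ToyTB *next = toy_lookup(tbs, n, tb->exit_pc);
        if (!next) {
            return tb->exit_pc == goal;
        }
        tb = next;
    }
    return false;
}
```

With X, edge and Y set up as in step 3, invalidating Y with unlink_both_sides set to false makes toy_reaches bounce between X and edge until the step budget runs out, while passing true lets X exit with Y's pc so Y can be retranslated.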
