Comments (5)
There are a number of different forms of hooks in the memory and block groups, which do different things. I'll try to give you my explanation of what they do and how you might use them - I may be wrong, so please someone correct me if I've made a mistake.
In general, all the hooks can be bound to a region of the emulated memory, so that they will only fire when that region is reached (or accessed). The hooks are also stored in a list, so the more hooks that you add, the more processing that is needed by the system to dispatch them. This can make your emulation slow down if you have a lot of hooks present.
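Some hooks return a value that tells the emulator whether to carry on. For the invalid-memory hooks described below (unmapped and protection faults), returning true continues execution and returning false aborts it.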
UC_HOOK_BLOCK
What is it?
Let's first deal with the UC_HOOK_BLOCK case. These hooks are called whenever the code execution starts within a 'basic block' in the emulated code. A 'basic block' is a sequence of instructions without any conditional branch or special processing instructions (or other events like the end of mapped memory) - a sequence that can be entirely emulated in (effectively) a linear path. The UC_HOOK_BLOCK is called with the address of the start of the block that's being executed and the size of that block (in bytes).
Because the block hooks are only called on entry to a section of code which must be executed, you can guarantee the execution passes through all the instructions in the block. If some instructions are conditional, the effect of the instruction might be null - eg ADDEQ r0, r0, r1 in ARM is a conditional add that only happens if the Z flag is set. The basic block might contain any number of these conditional instructions as the execution still passes through the instructions.
Why might you use it?
If you want a coarse picture of the code path, knowing where the system executed, you might use a block hook. Your hook might write diagnostics about where the code was at that time and the state of the registers. This would give you a very clear picture of how execution was passing through the system. Loops, for example, might result in the same block hook being fired repeatedly, as execution passes through the same instructions and ends in the conditional jump back to the start of the loop.
If you were disassembling the code, you could perform the disassembly on each block for its entire range, rather than using a code hook.
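A minimal sketch of a block trace, assuming the Unicorn Python bindings, where a block or code callback receives (uc, address, size, user_data). The names on_block and install_block_trace are made up for illustration. The import sits inside the installer only so that the callback can be tried without Unicorn installed.

```python
def on_block(uc, address, size, trace):
    """UC_HOOK_BLOCK callback: record the start address and byte size of the block."""
    trace.append((address, size))


def install_block_trace(uc, begin, end, trace):
    """Register on_block for blocks starting in [begin, end]; returns the hook handle."""
    # Imported lazily so on_block can be exercised on its own.
    from unicorn import UC_HOOK_BLOCK
    return uc.hook_add(UC_HOOK_BLOCK, on_block, trace, begin=begin, end=end)
```

A loop shows up in the trace as the same (address, size) pair repeated once per iteration.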
UC_HOOK_CODE
What is it?
The UC_HOOK_CODE is more fine grained than the UC_HOOK_BLOCK. It occurs on every instruction that is executed, before it is executed. So whilst UC_HOOK_BLOCK is "I'm about to run this section of code", UC_HOOK_CODE is "I'm about to run this instruction". Because it fires for every instruction, your hook will be called a lot. The hook is called (like the block hook) with the address of the code being executed, and its size. The size will only ever cover one instruction, however.
Why might you use it?
If you want to breakpoint the code at a particular place, this hook is a perfect way to do that. Calling uc_emu_stop will cause the emulation to stop at this point.
You might also use it to trace the execution with a disassembly, in a similar way to the block hooks, above. Because you're executing at the instruction level, this means that you can read the registers as they are before the code is executed, which may be useful for your disassembly.
If you want to inject behaviour you might use this hook to modify the registers - including modifying the program counter, to jump to a different place.
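The breakpoint idea might look roughly like this with the Python bindings, where uc_emu_stop is exposed as uc.emu_stop(). The helper names and the hits list are hypothetical, and the Unicorn import is again kept inside the installer so the callback can be tested with a stand-in object.

```python
def make_breakpoint_hook(breakpoints, hits):
    """Build a UC_HOOK_CODE callback that stops emulation at any address in breakpoints."""
    def on_code(uc, address, size, user_data):
        # Called before the instruction at `address` is executed.
        if address in breakpoints:
            hits.append(address)
            uc.emu_stop()
    return on_code


def install_breakpoints(uc, breakpoints, hits):
    """Register the breakpoint callback for every instruction; returns the hook handle."""
    from unicorn import UC_HOOK_CODE  # lazy import, see the note above
    return uc.hook_add(UC_HOOK_CODE, make_breakpoint_hook(set(breakpoints), hits))
```

Restricting the hook to a narrow begin/end range around the breakpoint addresses would avoid paying for a callback on every instruction.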
UC_HOOK_MEM_READ, UC_HOOK_MEM_WRITE, UC_HOOK_MEM_READ_AFTER
What is it?
The UC_HOOK_MEM_READ and UC_HOOK_MEM_WRITE hooks are called whilst the emulator is executing instructions. When the code being emulated tries to read or write memory within the range, the hooks will be called.
For UC_HOOK_MEM_READ, the hook is called with the address that is being read, and the size of the access. A 'value' is passed, but this operation occurs before the value has been read, so its content is indeterminate.
For UC_HOOK_MEM_WRITE, the hook is called with the address that is being written, the size of the access and the value that was written to it.
For UC_HOOK_MEM_READ_AFTER, the hook is called after the read has occurred. It is the same as UC_HOOK_MEM_READ except that the value has been populated.
These hooks are not used if you directly access the memory using the Unicorn mem_* functions.
Why might you use it?
If you were providing watchpoints that track accesses to memory, you might use any of these 3 hooks. You could report all the registers and the program counter at the time of access - even reporting a stack backtrace if you knew the calling standard.
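A watchpoint along these lines could be sketched as follows, assuming the Python bindings, where memory callbacks receive (uc, access, address, size, value, user_data). The helper names are hypothetical, and pc_reg stands for whichever architecture-specific register constant holds the program counter. I have not checked that the program counter read inside a memory hook is exact on every architecture and Unicorn version.

```python
def make_watchpoint(pc_reg, log):
    """Build a memory-access callback that records each access with the current PC."""
    def on_access(uc, access, address, size, value, user_data):
        log.append({
            "pc": uc.reg_read(pc_reg),
            "access": access,    # one of the UC_MEM_* access type values
            "address": address,
            "size": size,
            "value": value,      # only meaningful for writes and read-after hooks
        })
    return on_access


def install_watchpoint(uc, begin, end, pc_reg, log):
    """Watch reads (after the value is known) and writes within [begin, end]."""
    from unicorn import UC_HOOK_MEM_READ_AFTER, UC_HOOK_MEM_WRITE  # lazy import
    return uc.hook_add(UC_HOOK_MEM_READ_AFTER | UC_HOOK_MEM_WRITE,
                       make_watchpoint(pc_reg, log), begin=begin, end=end)
```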
You might use the UC_HOOK_MEM_READ and UC_HOOK_MEM_WRITE hooks to fake memory mapped IO. If you had a memory mapped device that you wanted to expose to the system, you could use a UC_HOOK_MEM_READ hook to write a suitable value into memory for the memory mapped register being accessed. The execution of the instruction would then pick up the new value that you had written.
Similarly, the UC_HOOK_MEM_WRITE hook could update your internal state with the value that had been written to the register's address.
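As a rough sketch of the memory mapped IO idea, assuming the Python bindings: the FakeDevice class is invented for illustration and models a single little-endian 32-bit register. It assumes every access starts at the register's base address and is at most 4 bytes wide, and that the page containing the register is already mapped (otherwise the unmapped hooks would fire instead).

```python
import struct


class FakeDevice:
    """Hypothetical device exposing one 32-bit register at address `base`."""

    def __init__(self, base):
        self.base = base
        self.register = 0

    def on_read(self, uc, access, address, size, value, user_data):
        # Runs before the read: place the register contents in memory so the
        # emulated instruction picks them up.
        uc.mem_write(address, struct.pack("<I", self.register)[:size])

    def on_write(self, uc, access, address, size, value, user_data):
        # The value may arrive as a signed integer, so mask it to 32 bits.
        self.register = value & 0xFFFFFFFF


def install_device(uc, device):
    """Attach the device's callbacks to the four bytes of its register."""
    from unicorn import UC_HOOK_MEM_READ, UC_HOOK_MEM_WRITE  # lazy import
    end = device.base + 3
    uc.hook_add(UC_HOOK_MEM_READ, device.on_read, begin=device.base, end=end)
    uc.hook_add(UC_HOOK_MEM_WRITE, device.on_write, begin=device.base, end=end)
```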
You might implement memory protection in a different manner than the standard Unicorn form. For example, you might check the processor mode and decide whether the memory is actually accessible or not to the code that is performing that access. This isn't usually an operation of the CPU (although some CPUs and MMUs do have this ability), but for diagnosing whether a given section of code should be able to access other memory this could be useful.
UC_HOOK_MEM_FETCH
What is it?
The UC_HOOK_MEM_FETCH hook is not used.
Why might you use it?
You wouldn't. It's deprecated and will never be called.
UC_HOOK_MEM_READ_UNMAPPED, UC_HOOK_MEM_WRITE_UNMAPPED, UC_HOOK_MEM_FETCH_UNMAPPED
What is it?
All 3 of these hooks are called when there is an access to a region for which there is no memory mapping.
The UC_HOOK_MEM_READ_UNMAPPED hook is called when the code tries to read an unmapped memory region.
The UC_HOOK_MEM_WRITE_UNMAPPED hook is called when the code tries to write to an unmapped memory region.
The UC_HOOK_MEM_FETCH_UNMAPPED hook is called when the emulator needs to read an unmapped memory region to fetch code to execute.
In all cases you can either map the page in with a uc_mem_map* function and return true to continue, or return false to abort execution.
Why might you use it?
You might use these for dynamic memory mapping, only mapping in the memory when it is needed - which could be useful for a virtual-memory type system.
You might use it for trapping bad accesses at the time that they happen (although the usual abort that you would get will also give you this information).
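Dynamic mapping might be sketched like this with the Python bindings. As far as I know, the Unicorn headers document the return value of these hooks as true to continue and false to stop, and this sketch relies on that. The helper names, the 4 KiB page size, and the assumption that nothing else maps these pages are all mine.

```python
PAGE_SIZE = 0x1000  # assumed page granularity


def make_lazy_mapper(perms, mapped_pages):
    """Build an unmapped-access callback that maps the touched pages on demand."""
    def on_unmapped(uc, access, address, size, value, user_data):
        first = address & ~(PAGE_SIZE - 1)
        # max() guards against a zero-sized access being reported.
        last = (address + max(size, 1) - 1) & ~(PAGE_SIZE - 1)
        for page in range(first, last + PAGE_SIZE, PAGE_SIZE):
            if page not in mapped_pages:
                uc.mem_map(page, PAGE_SIZE, perms)
                mapped_pages.add(page)
        return True  # the memory is now mapped, so let the access be retried
    return on_unmapped


def install_lazy_mapper(uc, mapped_pages):
    """Map pages with full permissions whenever unmapped memory is touched."""
    from unicorn import (UC_HOOK_MEM_READ_UNMAPPED, UC_HOOK_MEM_WRITE_UNMAPPED,
                         UC_HOOK_MEM_FETCH_UNMAPPED, UC_PROT_ALL)  # lazy import
    kinds = (UC_HOOK_MEM_READ_UNMAPPED | UC_HOOK_MEM_WRITE_UNMAPPED
             | UC_HOOK_MEM_FETCH_UNMAPPED)
    return uc.hook_add(kinds, make_lazy_mapper(UC_PROT_ALL, mapped_pages))
```

A real virtual-memory style system would presumably also load the page contents at this point and refuse (return False) for addresses it does not own.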
UC_HOOK_MEM_READ_PROT, UC_HOOK_MEM_WRITE_PROT, UC_HOOK_MEM_FETCH_PROT
What is it?
All 3 of these hooks are called when there is an access to a region for which there is a memory mapping but the memory was mapped with one of the UC_PROT_* restrictions.
The UC_HOOK_MEM_READ_PROT hook is called when the code tries to read a region that isn't allowed to be read.
The UC_HOOK_MEM_WRITE_PROT hook is called when the code tries to write to a region that isn't allowed to be written.
The UC_HOOK_MEM_FETCH_PROT hook is called when the emulator needs to read an instruction from a region that isn't allowed to execute.
In all cases you can either change the protection of the region with the uc_mem_protect function and return true to continue, or return false to abort execution.
Why might you use it?
You might change the protection level of the region to allow the memory to be accessed and let execution continue, or you might return false to abort execution.
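A sketch of that choice for write faults, assuming the Python bindings and assuming (from the Unicorn headers, as I understand them) that these hooks return true to continue and false to stop. The helper name and the allowed_ranges list are hypothetical, and each range is assumed to be page aligned in both start and length so that mem_protect accepts it.

```python
def make_write_unprotector(allowed_ranges, perms):
    """Build a UC_HOOK_MEM_WRITE_PROT callback.

    allowed_ranges is a list of (start, length) regions that may be opened up
    with permissions `perms`; writes anywhere else abort the emulation.
    """
    def on_write_prot(uc, access, address, size, value, user_data):
        for start, length in allowed_ranges:
            if start <= address and address + size <= start + length:
                uc.mem_protect(start, length, perms)
                return True   # protection relaxed, continue execution
        return False          # not allowed, abort execution
    return on_write_prot


def install_write_unprotector(uc, allowed_ranges):
    """Allow the listed regions to become readable and writable on first write."""
    from unicorn import UC_HOOK_MEM_WRITE_PROT, UC_PROT_READ, UC_PROT_WRITE  # lazy import
    hook = make_write_unprotector(allowed_ranges, UC_PROT_READ | UC_PROT_WRITE)
    return uc.hook_add(UC_HOOK_MEM_WRITE_PROT, hook)
```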
from unicorn.
I've created a wiki page with the content I wrote here - that way it might be findable again, and if I've got something wrong, people can correct it.
https://github.com/unicorn-engine/unicorn/wiki/Unicorn-hooks
I'm sorry for the late reply, but wow, what a detailed and clear answer! Thank you, this has helped me so much and I'm sure many people will appreciate the wiki article! Kudos
Thanks for the excellent explanation from @gerph. I will also add this to docs/.
Link to #1924