justfoxing / ghidra_bridge

Python 3 bridge to Ghidra's Python scripting

License: MIT License

ghidra_bridge's Introduction

Ghidra Bridge

Ghidra is great, and I like scripting as much of my RE as possible. But Ghidra's Python scripting is based on Jython, which isn't in a great state these days. Installing new packages is a hassle, if they can even run in a Jython environment, and it's only going to get worse as Python 2 slowly gets turned off.

So Ghidra Bridge is an effort to sidestep that problem - instead of being stuck in Jython, set up an RPC proxy for Python objects, so we can call into Ghidra/Jython-land to get the data we need, then bring it back to a more up-to-date Python with all the packages you need to do your work.

The aim is to be as transparent as possible, so once you're set up, you shouldn't need to know if an object is local or from the remote Ghidra - the bridge should seamlessly handle getting/setting/calling against it.
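To make the proxy idea concrete, here is a toy, in-process sketch of how a local stand-in object can forward attribute access and calls to a "remote" side that owns the real object. This is not the actual bridge implementation (which lives in jfx-bridge and sends messages over a socket); the FakeRemote, Proxy and Address names are invented for illustration.

```python
class FakeRemote:
    """Stands in for the Ghidra side: owns the real objects, keyed by handle."""

    def __init__(self):
        self._objects = {}
        self._next_handle = 0

    def register(self, obj):
        handle = self._next_handle
        self._next_handle += 1
        self._objects[handle] = obj
        return handle

    def get(self, handle, name):
        return getattr(self._objects[handle], name)

    def call(self, handle, name, args):
        return getattr(self._objects[handle], name)(*args)


class Proxy:
    """Local stand-in that forwards attribute access to the remote side."""

    def __init__(self, remote, handle):
        self._remote = remote
        self._handle = handle

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        value = self._remote.get(self._handle, name)
        if callable(value):
            # Defer the call so it runs on the side that owns the object.
            return lambda *args: self._remote.call(self._handle, name, args)
        return value


class Address:
    """Hypothetical stand-in for a remote object."""

    def __init__(self, offset):
        self.offset = offset

    def getOffset(self):
        return self.offset


remote = FakeRemote()
addr = Proxy(remote, remote.register(Address(0x401000)))
print(hex(addr.getOffset()))  # looks like a local call, runs on the "remote"
```

In a real bridge each get and call is a network message, which is why many small attribute accesses can add up.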

If you like this, you might also be interested in the equivalents for other reverse-engineering tools:

jfx_bridge_ida for IDA Pro
jfx_bridge_jeb for JEB Decompiler

If you really like this, feel free to buy me a coffee: https://ko-fi.com/justfoxing

How to use for Ghidra
Security warning
Remote eval
Long-running commands
Remote imports
Interactive mode
How it works
Design principles
Tested
TODO
Contributors

How to use for Ghidra

Install the Ghidra Bridge package and server scripts

Install the ghidra_bridge package (packaged at https://pypi.org/project/ghidra-bridge/):

pip install ghidra_bridge

Install the server scripts to a directory on Ghidra's script path (e.g., ~/ghidra_scripts, or you can add more directories in the Ghidra Script Manager by clicking the "3 line" button left of the big red "plus" at the top of the Script Manager).

python -m ghidra_bridge.install_server ~/ghidra_scripts

(optional) In the Ghidra Script Manager, select the Bridge folder and click the "In Tool" checkbox at the far left for the ghidra_bridge_server_background.py and ghidra_bridge_server_shutdown.py scripts. This will add these scripts as convenient menu items in Tools->Ghidra Bridge.

Start Server

CodeBrowser Context

For a better interactive shell like IPython or if you need Python 3 libraries in your interactive environment you can start the bridge in the context of an interactive GUI session.

If you've done the optional step 3 in the install instructions above (adding the scripts to the Tools menu), click Tools->Ghidra Bridge->Run in Background.

Otherwise:

Open the Ghidra Script Manager.
Select the Bridge folder.
Run the ghidra_bridge_server_background.py script for a clean, no-popups bridge. You can also use ghidra_bridge_server.py if for some reason you want a big script popup in your face the whole time.

Headless Analysis Context

You can run Ghidra Bridge as a post analysis script for a headless analysis and then run some further analysis from the client. Use the ghidra_bridge_server.py (not _background.py) for this one, so it doesn't exit until you shut the bridge down.

$ghidraRoot/support/analyzeHeadless <path to directory to store project> <name for project> -import <path to file to import>  -scriptPath <install directory for the server scripts> -postscript ghidra_bridge_server.py

See the analyzeHeadlessREADME.html in Ghidra's support/ directory for more information about how to run the analyzeHeadless command, if required.

pythonRun Context

You can start the bridge in an environment without any program loaded, for example, if you want to access an API like the DataTypeManager that doesn't require a program to be loaded for analysis.

$ghidraRoot/support/pythonRun <install directory for the server scripts>/ghidra_bridge_server.py

Setup Client

From the client python environment:

import ghidra_bridge
with ghidra_bridge.GhidraBridge(namespace=globals()):
    print(getState().getCurrentAddress().getOffset())
    ghidra.program.model.data.DataUtilities.isUndefinedData(currentProgram, currentAddress)

or:

import ghidra_bridge
b = ghidra_bridge.GhidraBridge(namespace=globals()) # creates the bridge and loads the flat API into the global namespace
print(getState().getCurrentAddress().getOffset())
# ghidra module implicitly loaded at the same time as the flat API
ghidra.program.model.data.DataUtilities.isUndefinedData(currentProgram, currentAddress)

Shutting Down the Server

Warning: if you're running in non-background mode, avoid clicking the "Cancel" button on the script popup, as this will leave the server socket in a bad state, and you'll have to completely close Ghidra to fix it.

To shut down the server cleanly, if you've done the optional step 3 in the install instructions above, click Tools->Ghidra Bridge->Shutdown. Otherwise, run the ghidra_bridge_server_shutdown.py script from the Bridge folder.

Alternatively, you can call remote_shutdown from any connected client.

import ghidra_bridge
b = ghidra_bridge.GhidraBridge(namespace=globals())
b.remote_shutdown()

Security warning

Be aware that when running, a Ghidra Bridge server effectively provides code execution as a service. If an attacker is able to talk to the port Ghidra Bridge is running on, they can trivially gain execution with the privileges Ghidra is run with.

Also be aware that the protocol used for sending and receiving Ghidra Bridge messages is unencrypted and unverified - a person-in-the-middle attack would allow complete control of the commands and responses, again providing trivial code execution on the server (and with a little more work, on the client).

By default, the Ghidra Bridge server only listens on localhost to slightly reduce the attack surface. Only listen on external network addresses if you're confident you're on a network where it is safe to do so. Additionally, it is still possible for attackers to send messages to localhost (e.g., via malicious javascript in the browser, or by exploiting a different process and attacking Ghidra Bridge to elevate privileges). You can mitigate this risk by running Ghidra Bridge from a Ghidra server with reduced permissions (a non-admin user, or inside a container), by only running it when needed, or by running on non-network connected systems.

Remote eval

Ghidra Bridge is designed to be transparent, to allow easy porting of non-bridged scripts without too many changes. However, if you're happy to make changes, and you run into slowdowns caused by running lots of remote queries (e.g., something like for function in currentProgram.getFunctionManager().getFunctions(): doSomething() can be quite slow with a large number of functions as each function will result in a message across the bridge), you can make use of the remote_eval() function to ask for the result to be evaluated on the bridge server all at once, which will require only a single message roundtrip.

The following example demonstrates getting a list of all the names of all the functions in a binary:

import ghidra_bridge 
b = ghidra_bridge.GhidraBridge(namespace=globals())
name_list = b.remote_eval("[ f.getName() for f in currentProgram.getFunctionManager().getFunctions(True)]")

If your evaluation is going to take some time, you might need to use the timeout_override argument to increase how long the bridge will wait before deciding things have gone wrong.

If you need to supply an argument for the remote evaluation, you can provide arbitrary keyword arguments to the remote_eval function, which will be passed into the evaluation context as local variables. The following example passes in a function:

import ghidra_bridge 
b = ghidra_bridge.GhidraBridge(namespace=globals())
func = currentProgram.getFunctionManager().getFunctions(True).next()
mnemonics = b.remote_eval("[ i.getMnemonicString() for i in currentProgram.getListing().getInstructions(f.getBody(), True)]", f=func)

As a simplification, note also that the evaluation context has the same globals loaded into the __main__ of the script that started the server - in the case of the Ghidra Bridge server, these include the flat API and values such as the currentProgram.
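As a further sketch of batching, the helper below collects each function's name and entry offset in a single remote_eval round trip and turns the result into a local dict. The helper name function_entry_points is invented for illustration, and it assumes that a list of tuples of plain values (strings and integers) comes back across the bridge by value rather than as bridged objects.

```python
def function_entry_points(bridge):
    """Return {function name: entry offset} using one remote_eval round trip."""
    expr = (
        "[(f.getName(), f.getEntryPoint().getOffset()) "
        "for f in currentProgram.getFunctionManager().getFunctions(True)]"
    )
    pairs = bridge.remote_eval(expr)
    # Functions sharing a name collapse to the last one seen.
    return {name: offset for name, offset in pairs}


# Usage, assuming a running bridge server:
#   import ghidra_bridge
#   b = ghidra_bridge.GhidraBridge(namespace=globals())
#   entry_points = function_entry_points(b)
```

Because the dict is built locally from plain values, later lookups do not touch the bridge at all.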

Long-running commands

If you have a particularly slow call in your script, it may hit the response timeout that the bridge uses to make sure the connection hasn't broken. If this happens, you'll see something like Exception: Didn't receive response <UUID> before timeout.

There are two options to increase the timeout. When creating the bridge, you can set a timeout value in seconds with the response_timeout argument (e.g., b = ghidra_bridge.GhidraBridge(namespace=globals(), response_timeout=20)) which will apply to all commands run across the bridge. Alternatively, if you just want to change the timeout for one command, you can use remote_eval as mentioned above, with the timeout_override argument (e.g., b.remote_eval("[ f.getName() for f in currentProgram.getFunctionManager().getFunctions(True)]", timeout_override=20)). If you use the value -1 for either of these arguments, the response timeout will be disabled and the bridge will wait forever for your response to come back - note that this can cause your script to hang if the bridge runs into problems.
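A small wrapper can make the per-call timeout choice explicit, so that "wait forever" is never picked by accident. This is only a sketch: eval_slow and NO_TIMEOUT are invented names, and the only bridge call used is remote_eval with the timeout_override argument described above.

```python
NO_TIMEOUT = -1  # per the text above, -1 disables the response timeout


def eval_slow(bridge, expression, timeout_seconds, **eval_locals):
    """Run `expression` on the server with a per-call response timeout.

    Pass timeout_seconds=None to wait indefinitely (the script can hang if
    the bridge runs into problems), or a positive number of seconds.
    """
    if timeout_seconds is None:
        timeout = NO_TIMEOUT
    elif timeout_seconds > 0:
        timeout = timeout_seconds
    else:
        raise ValueError("timeout_seconds must be positive or None")
    return bridge.remote_eval(expression, timeout_override=timeout, **eval_locals)


# Usage, assuming a connected bridge `b`:
#   names = eval_slow(
#       b,
#       "[f.getName() for f in currentProgram.getFunctionManager().getFunctions(True)]",
#       60,
#   )
```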

Remote imports

If you want to import modules from the ghidra-side (e.g., ghidra, java, docking namespaces), you have two options.

Use remote_import to get a BridgedModule back directly (e.g., remote_module = b.remote_import("java.math.BigInteger")). This has the advantage that you have exact control over getting the remote module (and can get remote modules with the same name as local modules) and when it's released, but it does take a little more work.
Specify hook_import=True when creating the bridge (e.g., b = ghidra_bridge.GhidraBridge(namespace=globals(), hook_import=True)). This will add a hook to the import machinery such that, if nothing else can fill the import, the bridge will try to handle it. This allows you to just use the standard import ghidra.framework.model.ToolListener syntax after you've connected the bridge. This has the advantage that it may be a little easier to use (you still have to make sure the imports happen AFTER the bridge is connected), but it doesn't allow you to import remote modules with the same name as local modules (the local imports take precedence) and it places the remote modules in sys.modules as proper imports, so they and the bridge will likely stay loaded until the process terminates. Additionally, multiple bridges with hook_import=True will attempt to resolve imports in the order they were connected, which may not be the behaviour you want.

Interactive mode

Normally, Ghidra scripts get an instance of the Ghidra state and current* variables (currentProgram, currentAddress, etc) when first started, and it doesn't update while the script runs. However, if you run the Ghidra Python interpreter, that updates its state with every command, so that currentAddress always matches the GUI.

To reflect this, GhidraBridge will automatically attempt to determine if you're running the client in an interactive environment (e.g., the Python interpreter, iPython) or just from a script. If it's an interactive environment, it'll register an event listener with Ghidra and perform some dubious behind-the-scenes shenanigans to make sure that the state is updated with GUI changes to behave like the Ghidra Python interpreter. It'll also replace help() with one that reaches out to use Ghidra's help across the bridge if you give it a bridged object.

You shouldn't have to care about this, but if for some reason the auto-detection doesn't give you the result you need, you can specify the boolean interactive_mode argument when creating your client GhidraBridge to force it on or off as required.
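If you do need to force the setting, one way to decide is to run your own check and pass the result in. The sketch below is a best-effort guess at "am I in an interactive session"; it is not the detection logic GhidraBridge itself uses, and looks_interactive is an invented helper name.

```python
import sys


def looks_interactive():
    """Best-effort guess: True in a REPL or IPython, False for a plain script."""
    if hasattr(sys, "ps1"):
        # sys.ps1 is only defined by the standard interactive interpreter.
        return True
    if sys.flags.interactive:
        # Started with `python -i script.py`.
        return True
    try:
        get_ipython  # noqa: F821 - assumed to be injected by IPython when running
    except NameError:
        return False
    return True


# Usage, assuming a running bridge server:
#   import ghidra_bridge
#   b = ghidra_bridge.GhidraBridge(
#       namespace=globals(), interactive_mode=looks_interactive()
#   )
```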

How it works

The actual bridge RPC code is implemented in jfx-bridge. Check it out there and file non-Ghidra specific issues related to the bridge there.

Design principles

Needs to be run in Ghidra/Jython 2.7 and Python 3
Needs to be easy to install in Ghidra - no pip install, just add a single directory (these two requirements ruled out some of the more mature Python RPC projects I looked into)

Tested

Tested and working on Ghidra 9.1 (Jython 2.7.1) <-> Python 3.7.3 on Windows
Automatically tested on Ghidra 9.0 (Jython 2.7.1) <-> Python 3.5.3 on Linux (bskaggs/ghidra docker image)

TODO

Ghidra plugin for server control (cleaner start/stop, port selection, easy packaging/install)
Examples
- Jupyter notebook

Contributors

Thx @fmagin for better iPython support, and much more useful reprs!
Thanks also to @fmagin for remote_eval, allowing faster remote processing for batch queries!

ghidra_bridge's Issues

Are you able to write to a ghidra project?

With ghidra_bridge, I'm able to pull information out of my ghidra project.
Doing something like this works fine:

addr = toAddr(0x41cc28)
symbol = getSymbolAt(addr)
print(symbol.getName())

Issues happen when I try to write to the project though:

addr = toAddr(0x41cc28)
symbol = getSymbolAt(addr)
symbol.setName('hey', symbol.getSource())

I get this exception from ghidra.

I'm able to change the symbol name using this code and the python tool within ghidra itself.

Am I using ghidra_bridge incorrectly or is this a ghidra problem? All I have at the top of my script is the client code in the README

import ghidra_bridge
b = ghidra_bridge.GhidraBridge(namespace=globals()) # creates the bridge and loads the flat API into the global namespace
print(getState().getCurrentAddress().getOffset())
# ghidra module implicitly loaded at the same time as the flat API
ghidra.program.model.data.DataUtilities.isUndefinedData(currentProgram, currentAddress)

Sorry for the Hassle!

isinstance comparison always BridgedObject

Hi Again :-),

I'm currently trying to run the following code to get the length of a ClangVariableDecl.
The type is a ghidra.app.decompiler.ClangSyntaxToken.

Example Code:

listing = currentProgram.getListing()

decompInterface = ghidra.app.decompiler.DecompInterface()
decompInterface.openProgram(currentProgram)

for func in listing.getFunctions(0):
    decompileResults = decompInterface.decompileFunction(func, 30, monitor)
    if decompileResults.decompileCompleted():
        print("Function {}:".format(func))
        Clang_Function = decompileResults.getCCodeMarkup()
        for a in range(Clang_Function.numChildren()):
            #IPython.embed()
            if isinstance(Clang_Function.Child(a), ghidra.app.decompiler.ClangVariableDecl):
                print(Clang_Function.Child(a).dataType.getLength())

When comparing, the result of Clang_Function.Child(a) should be something like ghidra.app.decompiler.ClangVariableDecl. However, it is always a BridgedObject:

In [5]: Clang_Function.Child(a)
Out[5]: <BridgedObject(ClangSyntaxToken, handle=7466c8fd-7fa3-4afb-91dd-aa6a9439e6c5)>

Is there any way to get the type out of the BridgedObject?
Thanks :-)

How to replace the 'animated dragon' window while bridge server is running?

Is there a way to either remove or replace the ghidra animated dragon with some static image or text?
It's annoying to have the GPU continuously churning at 8-10% of its capacity just because the bridge is running.
Thanks

How to import ghidra in ghidra_bridge?

Whenever I try to import ghidra, it says module not found. Can you only import ghidra if you are actually within the Ghidra app? If so, are there other ways of achieving all the functionality of Ghidra headlessly?
Thanks!

Minor Issues: Hide Script Box, Scripts have to be in the Script directory and not a subdirectory, verbose output as sideeffect in iPython

System:
Arch Linux using the ghidra v9.0.1 package from the AUR ( https://aur.archlinux.org/packages/ghidra/)

I have various minor issues I am collecting here:

First:
I get a script box with the red dragon eating bits to indicate that a script is currently running. This is just mildly annoying because (unlike IDA...) the tool can still be used while a script is running. Maybe there is some other way to access the API, potentially via an Extension instead of script similar to how the Jython Shell accesses the API.

Second:
The doc states "Add the directory with the scripts in it to Ghidra's scripts directories." This did not work for me, the scripts had to be in the toplevel in my ghidra_scripts directory. I have no idea if this is a config issue on my side, a linux issue, or something else.

Third:
Using the plugin in an interactive shell like IPython generates a lot of output and makes it hard to see what the actual result of whatever was done should be. In IPython tab completion leads to this too.

analyze(currentProgram) may not be working

I posted here yesterday about a timeout I was getting with analyze(), and the following was suggested:
b.bridge.remote_eval("analyze(currentProgram)", timeout_override=<value>). However, I am not sure this works. I am experiencing one of two issues (although it is hard to tell which):
either it is not analyzing the program, or it is analyzing the program but never returning once the analysis is done.
When I run this command I can never break out. I know analyze() takes a long time, but when I run it in Ghidra it does finish, whereas through ghidra_bridge it never does. Since this is newly implemented, could this be a bug? Thanks!

Unable to stop auto-analysis for script connected via ghidra_bridge

I'd like to pause auto-analysis while doing some work via ghidra_bridge (which is excellent! Thanks!).
Currently my script does so many things that the Ghidra UI will freeze (UI won't render / be interactive) for more than an hour after my script ran, because it has to catch up on the auto-analysis. Fortunately the ghidra_bridge remains active during this time, so my script continues to work.

Normally the auto-analysis can be disabled using GhidraScript.getScriptAnalysisMode(). However, because ghidra_bridge is the script, and not the script(s) connected to it, that probably makes it impossible to handle it correctly.

Are there any known alternatives to disable auto-analysis?

If there is no alternative, then ghidra_bridge should expose auto-analysis mode selection.
I'm not sure how often that callback function is called, but there could also be a flag when creating the connection, or 2 different runner scripts to set the behavior; although this might also be a problem if multiple scripts connect at the same time.

There is https://ghidra.re/ghidra_docs/api/ghidra/program/flatapi/FlatProgramAPI.html#analyzeChanges(ghidra.program.model.listing.Program) to trigger manual updates, too. So disabling auto-updates is probably better than enabling them.

Question: How to analyze a big file using ghidra_bridge, without a timeout exception?

In ghidra_bridge I tried the following commands:

import ghidra_bridge
b = ghidra_bridge.GhidraBridge(namespace=globals())
analyze(currentProgram)

Although this does start the analysis, I get a timeout before it completes (it's a big file). Is there any way I can run the analysis remotely without a timeout?

Thanks!

Connection breaks down while running Python commands

I wrote a Python 3 script to export disassembly code through this bridge. However, the connection between Python 3 and Ghidra always breaks while commands are still running.

You can see my code at https://github.com/EmpRamses/ghidra_py_Instr

The error output is as follows:

Traceback (most recent call last):
  File "refine.py", line 17, in <module>
    inst = str(line)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ghidra_bridge/bridge.py", line 1146, in __str__
    return self._bridged_get_type()._bridged_get("__str__")(self)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ghidra_bridge/bridge.py", line 1091, in _bridged_get
    return self._bridge_conn.remote_get(self._bridge_handle, name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ghidra_bridge/bridge.py", line 616, in remote_get
    return self.deserialize_from_dict(self.send_cmd(command_dict))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ghidra_bridge/bridge.py", line 604, in send_cmd
    cmd_id, timeout=timeout_override if timeout_override else self.response_timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ghidra_bridge/bridge.py", line 417, in get_response
    "Didn't receive response {} before timeout".format(response_id))
Exception: Didn't receive response b541f524-11f9-4afa-bd8c-491fde0d5530 before timeout

How to remote debug with VSCode?

Is there a host/port I can use?

Windows: Binding on low port blocked by firewall

The default behavior is to bind the server to a random port, which might be lower than 49152.
On Windows, this results in the unhelpful error:

_socket.error: [Errno 100] Address already in use

To address this, the random port must be higher than 49152.

The issue is caused by the Windows firewall reserving all the lower ports.

More information: https://learn.microsoft.com/en-us/troubleshoot/windows-server/networking/service-overview-and-network-port-requirements
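A minimal sketch of the suggested fix, using only the standard library: pick a random port from the IANA dynamic range (49152 to 65535) and check that it can be bound before using it. The function name pick_high_port and the retry count are invented for illustration; this is not code from the bridge server.

```python
import random
import socket

DYNAMIC_PORT_RANGE = (49152, 65535)  # IANA dynamic/private port range


def pick_high_port(host="127.0.0.1", attempts=20):
    """Return a port in the dynamic range that could be bound at call time."""
    for _ in range(attempts):
        port = random.randint(*DYNAMIC_PORT_RANGE)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # in use or blocked, try another candidate
            return port
    raise RuntimeError("no free port found in the dynamic range")
```

Note that the probe socket is closed before the port is returned, so another process could still grab the port in between; a server would ideally bind once and keep that socket.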

Pipe often times out in tag 0.0.3

Hi,

Thanks for the update :-). Unfortunately, the pipe/socket now often produces timeouts.

This appears to happen more often when some load is going through the connection (maybe connected to the newly added threading).

tag 0.0.2 is still stable for me.

  File "/Users/Traxes/PycharmProjects/zeno/src/avd/wrapper/BackendGhidra.py", line 137, in do_backward_slice
    if tok.getVarnode():
  File "/usr/local/lib/python3.7/site-packages/ghidra_bridge/bridge.py", line 929, in __call__
    return self._bridge_conn.remote_call(self._bridge_handle, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/ghidra_bridge/bridge.py", line 630, in remote_call
    return self.deserialize_from_dict(self.send_cmd(command_dict))
  File "/usr/local/lib/python3.7/site-packages/ghidra_bridge/bridge.py", line 569, in send_cmd
    cmd_id, timeout=self.RESPONSE_TIMEOUT)
  File "/usr/local/lib/python3.7/site-packages/ghidra_bridge/bridge.py", line 387, in get_response
    "Didn't receive response {} before timeout".format(response_id))
Exception: Didn't receive response 0a005460-93f5-46cd-8807-71f08280ebb6 before timeout

Thanks for updating it regularly :-) Helps a lot.

Cheers,
Traxes

b.bridge.remote_shutdown() starting server of different programs

Whenever I call b.bridge.remote_shutdown(), it ends the headless server for the current program, but then instantly launches a headless server for another program (sometimes the exact same program, sometimes a random one). This is really annoying, and the server is usually hard to truly end. Is there any way to fix this problem?

Socket not closed when GhidraBridge object goes out of scope

I have observed that the socket to Ghidra is not closed when the GhidraBridge object goes out of scope. It only gets closed when the whole Python process terminates. We have a long-running Python process and do some Ghidra operations from time to time via a remoteified class:

with ghidra_bridge.GhidraBridge() as bridge:
    RemoteGhidraClass = bridge.remoteify(RemoteGhidra)
    remote_ghidra = RemoteGhidraClass().do_something()

In the log from Ghidra, we can see WARNING:jfx_bridge.bridge:Handling connection from ('127.0.0.1', 55500) each time the above code is called (with a different port each time, of course), but the connection does not get closed.
We also experience some hangs now and then. Could that be related? We have also experienced hangs when keeping the GhidraBridge object alive across the different calls. What would be the correct way here?

At the same time, we are seeing the following error log:

[ERROR bridge.py:522 run()] 'utf-8' codec can't decode byte 0xaa in position 327679: invalid start byte
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/jfx_bridge/bridge.py", line 504, in run
    msg_dict = json.loads(data.decode("utf-8"))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xaa in position 327679: invalid start byte

Wrong port in bridge from PyPI package

When installing ghidra_bridge from PyPI, the default server port is listed as 34942, but the server actually starts on 34940. Please update the package.

Expose other IPC channels

Consider Unix Domain Sockets for example: https://stackoverflow.com/a/29436429
(Edit: This issue should probably have been on https://github.com/justfoxing/jfx_bridge)

AttributeError: 'instancemethod' object has no attribute '__len__' (bridge.py, line 974)

I get a whole lot of the following in the Ghidra console when first initiating a connection to ghidra_bridge. I'm on Windows, running Ghidra in interactive mode. It appears to be working (I'm just getting started), but I get enough of this to overflow the Ghidra console when starting to run a script.

Traceback (most recent call last):
  File "C:\Users\eddyw\ghidra_scripts\jfx_bridge\bridge.py", line 974, in local_get
    result = getattr(target, name)
AttributeError: 'instancemethod' object has no attribute '__len__'

Get isinstance to return reasonable results for BridgedObjects for type, callable, etc

I am currently working on this but it is in no shape for a PR yet.

The idea is that BridgedCallables should behave as closely as possible to actual local callables, so that the inspect features, and the IPython features built on top of them, work as well as possible.

The concrete goal is the following:
Assume a function like:

def add(x: int, y:int ) -> int:
    return x + y

IPython help returns the following.

In [34]: add?
Signature: add(x: int, y: int) -> int
Docstring: <no docstring>
File:      ~/Projects/ghidra_bridge/dev.py
Type:      function

This behavior should be replicated with BridgedCallables. What is needed is that, after several layers of IPython code, the inspect module can generate a valid signature for the callable using Signature.from_callable. This currently fails with the following:

from inspect import Signature
f = currentProgram.functionManager.getFunctionAt
Signature.from_callable(f)

ValueError                                Traceback (most recent call last)
<ipython-input-36-6e450cf1e523> in <module>
----> 1 Signature.from_callable(f)

/usr/lib64/python3.7/inspect.py in from_callable(cls, obj, follow_wrapped)
   2831         """Constructs Signature for the given callable object."""
   2832         return _signature_from_callable(obj, sigcls=cls,
-> 2833                                         follow_wrapper_chains=follow_wrapped)
   2834 
   2835     @property

/usr/lib64/python3.7/inspect.py in _signature_from_callable(obj, follow_wrapper_chains, skip_bound_arg, sigcls)
   2286     if _signature_is_builtin(obj):
   2287         return _signature_from_builtin(sigcls, obj,
-> 2288                                        skip_bound_arg=skip_bound_arg)
   2289 
   2290     if isinstance(obj, functools.partial):

/usr/lib64/python3.7/inspect.py in _signature_from_builtin(cls, func, skip_bound_arg)
   2110     s = getattr(func, "__text_signature__", None)
   2111     if not s:
-> 2112         raise ValueError("no signature found for builtin {!r}".format(func))
   2113 
   2114     return _signature_fromstr(cls, func, s, skip_bound_arg)

ValueError: no signature found for builtin <BridgedCallable('<bound method ghidra.program.database.function.FunctionManagerDB.getFunctionAt of ghidra.program.database.function.FunctionManagerDB@5f0ef8c4>', type=instancemethod, handle=99f4707d-9f4a-4205-b820-7dac1b5a811b)>

The first hint is that "no signature found for builtin" is weird, because a BridgedCallable is in no way a builtin, so something is going quite wrong.

The first divergence from the Jython shell is that in the Jython shell isinstance(f, types.MethodType) is True, while in the bridge client it is False. I am unsure how to fix this exactly, as isinstance is a builtin that might be hard to trick.

An alternative that bypasses all of this is to fake __signature__ directly and just build one ourselves. This is slightly annoying and ignores the potential actual problem.

It gets recognized as a builtin because ismethoddescriptor returns True, which in turn happens because ismethod returns False for currentProgram.functionManager.getFunctionAt.

One core issue is: is there some way to make isinstance(obj, type) go over the bridge, and is that even the correct way to do it?
I will look into how other environments like rpyc and Jython handle this and might have more concrete ideas then.

How to pass arguments to the post script

Hello @justfoxing,

I want to spawn multiple python subprocesses to analyze different files at the same time. The command I use is the following :

$ghidraRoot/support/analyzeHeadless.bat <path to directory to store project> <name for project> -import <path to file to import>  -scriptPath <install directory for the server scripts> -postscript ghidra_bridge_server.py

Each analysis requires a different port, so I tried to refactor the ghidra_bridge_server.py file to take the host and port as arguments using sys.argv or argparse, like this:

$ghidraRoot/support/analyzeHeadless.bat <path to directory to store project> <name for project> -import <path to file to import>  -scriptPath <install directory for the server scripts> -postscript ghidra_bridge_server.py port host

PS: I found in the Ghidra docs that you can pass arguments to the post script.
But no luck.
Next I tried running the whole thing in a Docker container for each analysis, after changing ghidra_bridge_server.py to read the host and port from environment variables. This resulted in networking issues, and I couldn't reach the bridge server.
Environment variables without Docker are hard to use because subshells inherit the same environment variables from the main shell.
Any hints or suggestions would be appreciated.
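One possible direction, sketched under assumptions: Ghidra scripts can read the arguments that follow the script name on the analyzeHeadless command line via GhidraScript's getScriptArgs(), rather than sys.argv. The helper below only does the parsing of a "port host" argument list; parse_bridge_args and its default parameters are invented for illustration and are not part of ghidra_bridge_server.py. It avoids Python 3 only syntax so it could also run under Jython 2.7.

```python
def parse_bridge_args(args, default_host, default_port):
    """Parse [port, host] style script arguments; both are optional."""
    args = list(args or [])
    port = int(args[0]) if len(args) >= 1 else default_port
    host = str(args[1]) if len(args) >= 2 else default_host
    if not 0 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return host, port


# Hypothetical use inside a modified server script running in Ghidra:
#   host, port = parse_bridge_args(getScriptArgs(), "127.0.0.1", some_default_port)
```

Each headless process could then be started with its own port as the first script argument.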

Question: how does the RPC bridge handle class callbacks?

Thank you for running such an awesome project! I've recently been looking back into using Python3 in Ghidra in a background process type of way and it seems you've got it handled pretty well with the background processing.

However, I was confused about how you handle something like class callbacks that Ghidra does. For instance, in one of my Java projects I am thinking of porting to Python3, I implement the DomainObjectListener class:
https://github.com/mahaloz/decomp2dbg/blob/69ad8ede239cdb5eab8a776f33426b980277cd2c/decompilers/d2d_ghidra/src/main/java/decomp2dbg/D2DPlugin.java#L49

I do this because it's a special class that will get callbacks from Ghidra whenever a DomainObject changes (like a function name).
It's clear to me that I can do this:

class MyListener(DomainObjectListener):
    pass

But I have no idea how I register this class with Ghidra... i.e., I don't know how to inject this into the Python 2 globals area on Ghidra.

I was curious if you had any advice or tried to do something similar with this project.

Thanks!

Memory.getBytes returns empty array

Hi,

I am trying to read memory using the following:

import array

buffer = array.array("b", b"\0" * length)
currentProgram.getMemory().getBytes(currentProgram.getAddressFactory().getDefaultAddressSpace().getAddress(hex(start)), buffer)

When I try to read say, 16 bytes length from 0x400000 (expecting to see an MZ header), I get:

array('b', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

However, if I jiggle the code about to only read a single byte I get:

buffer = currentProgram.getMemory().getByte(currentProgram.getAddressFactory().getDefaultAddressSpace().getAddress(hex(start)))
print(buffer)
77

Am I doing something stupid, or is there an issue with the array handling or something?!

Regards,

Tom
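A possible workaround, sketched under assumptions: one explanation (not confirmed in this thread) is that the local array is copied to the server, so the server fills its own copy and the client's buffer stays zeroed. Reading the bytes on the server side with remote_eval and returning them by value sidesteps that. The helper name read_bytes is invented; getBytes(Address, int) and toAddr() are flat API calls, and the keyword arguments are passed into the evaluation context as described in the README.

```python
def read_bytes(bridge, offset, length):
    """Read `length` bytes at `offset` in one round trip and return bytes."""
    signed = bridge.remote_eval(
        "[b for b in getBytes(toAddr(offset), length)]",
        offset=offset,
        length=length,
    )
    # Java bytes are signed (-128..127); mask each one into 0..255.
    return bytes(b & 0xFF for b in signed)


# Usage, assuming a connected bridge `b`:
#   header = read_bytes(b, 0x400000, 16)
```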

setComment causes BridgeException

Hi @justfoxing!

I am currently testing ghidra_bridge with

Python 3.8.5
Ghidra 10.0.2

and while trying to set a comment I get the below error

BridgeException: ('Transaction has not been started', <_bridged_db.NoTransactionException('db.NoTransactionException: Transaction has not been started', type=db.NoTransactionException .....)>)

The code I am executing is the following

currentProgram.getListing().getCodeUnitAt(toAddr("<ADD-HERE-ADDR>")).setComment(0, "Just a comment!")
# setComment(0, "Just a comment!")  is the same as 
# setComment(ghidra.program.model.listing.CodeUnit.EOL_COMMENT, "Just a comment!")
# but also with this, it does not work

Can you suggest a workaround to this issue? Thank you!
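The NoTransactionException suggests the change is being made outside a database transaction. Ghidra's Program object exposes startTransaction(description) and endTransaction(id, commit), so one possible workaround is to wrap the edit in a transaction. The sketch below assumes that is the cause, which is not confirmed in this thread. The helper name `transaction` is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def transaction(program, description="bridge edit"):
    """Run the body inside a Ghidra transaction; roll back on error."""
    tx_id = program.startTransaction(description)
    commit = False
    try:
        yield tx_id
        commit = True  # only reached if the body did not raise
    finally:
        program.endTransaction(tx_id, commit)


# Usage against a live bridge (not run here):
# with transaction(currentProgram, "add comment"):
#     unit = currentProgram.getListing().getCodeUnitAt(toAddr("<ADD-HERE-ADDR>"))
#     unit.setComment(0, "Just a comment!")
```

Ending the transaction with commit=False when the body raises keeps a half-applied change from being saved.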

PyPI description has three sequential steps all numbered "1"

At the PyPI website, the description for the Ghidra Bridge project has three sequential installation steps that are all numbered "1".

Perhaps the description there hasn't been updated from the latest readme.md on github?

"A picture is worth a thousand words", so:

DONT_BRIDGE, DONT_BRIDGE_UNLESS_IN_ATTRS and LOCAL_METHODS appear in tab completion

DONT_BRIDGE, DONT_BRIDGE_UNLESS_IN_ATTRS and LOCAL_METHODS appear as suggestions in tab completion with IPython for some objects, but only when not using Jedi

import ghidra_bridge
import IPython.utils.dir2
b = ghidra_bridge.GhidraBridge(namespace=globals())
fm = currentProgram.getFunctionManager()
IPython.utils.dir2.dir2(fm)

The weird thing is that neither dir(fm) nor fm.__dir__() actually contain those attributes, only dir2() returns them. Will look into this.

How to totally kill the running server in ghidra?

Hi,
I ran ghidra_bridge_server.py from the Ghidra Script Manager and stopped it, but could not run it again because a socket error is raised:
[error 98]Address already in use.
So how can I kill the running server completely?

BTW, could I run the server in the background without keeping Ghidra on the screen? >.<

Best

Unhashable type _bridged_ghidra.program.database.function.FunctionDB

I'm moving from a Python ghidra_script to using ghidra_bridge. In the script there's a lookup to see if a function was saved to a cache:

if func not in highfunction_cache:

But I'm getting the following error

    if func not in highfunction_cache:
TypeError: unhashable type: '_bridged_ghidra.program.database.function.FunctionDB'

It looks like func is now type <class 'jfx_bridge.bridge._bridged_ghidra.program.database.function.FunctionDB'>

Is there a way to get the underlying FunctionDB type that can then be looked up in the dict?
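One way to sidestep the unhashable proxy, without needing the underlying FunctionDB, is to key the cache on a plain value derived from the function. The sketch below uses the entry point address rendered as a string. The helper names `function_key` and `lookup_or_store` are hypothetical, and it assumes the string returned by toString() crosses the bridge as an ordinary Python str.

```python
highfunction_cache = {}


def function_key(func):
    """Hashable stand-in for a bridged function: its entry point as text."""
    return func.getEntryPoint().toString()


def lookup_or_store(func, compute):
    """Return the cached value for func, computing it once if missing."""
    key = function_key(func)
    if key not in highfunction_cache:
        highfunction_cache[key] = compute(func)
    return highfunction_cache[key]
```

The original check then becomes `if function_key(func) not in highfunction_cache:`.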

Cannot inherit bridged classes

For example:

DefaultAnnotationHandler = ghidra.program.model.data.DefaultAnnotationHandler

class CPPAnnotationHandler(DefaultAnnotationHandler):
    FILE_EXTENSIONS = ["c", "h", "cpp"]
    def __init__(self):
        DefaultAnnotationHandler.__init__(self)
    def getPrefix(self, enum, member):
        return ""
    def getSuffix(self, enum, member):
        return ""
    def getPrefix(self, comp, dtc):
        return ""
    def getSuffix(self, comp, dtc):
        return ""
    def getDescription(self):
        return "Default C Annotations"
    def getLanguageName(self):
        return "C++"
    def getFileExtensions(self):
        return self.FILE_EXTENSIONS
    def toString(self):
        return self.getLanguageName()

Instantiating CPPAnnotationHandler causes

jfx_bridge.bridge.BridgeException: ('maximum recursion depth exceeded (Java StackOverflowError)', <_bridged_exceptions.RuntimeError('RuntimeError('maximum recursion depth exceeded (Java StackOverflowError)',)', type=exceptions.RuntimeError, handle=c22c43a8-f154-4f60-a008-3a1a9e96cabf)>)

Timeout value is not configurable in `ghidra_bridge_server.py`

Hi,

This is actually more of a follow-up to #48: currently we can configure the timeout value using either the response_timeout parameter for GhidraBridge, or the timeout_override parameter for remote_eval.

However, when I set both values to -1 on the client side (which should disable any timeout threshold), a BridgeTimeoutException was still raised when I tried to run a pretty large function using remote_eval. After some simple debugging, I found that the exception was actually raised from ghidra_bridge_server.py. More specifically, it comes from the timeout value set when constructing the jfx_bridge.BridgeServer object. This timeout value is apparently fixed when using ghidra_bridge_server.py with analyzeHeadless.

I'm wondering if we can just set the timeout value here to -1 and let the client decide which timeout to use (using the existing response_timeout and timeout_override)? Or add a command line flag for ghidra_bridge_server.py to configure the timeout?

using in IDE with proper auto completion

In the example, you use getState(). It works fine, but the IDE reports it as undefined.
Is it possible to add the types and functions from the Ghidra API?

Handle iterators

I got another behavior where the Jython shell returns different values than the Bridge.

I was implementing an Iterator for functions:
(For debugging purposes I added the additional variable.)

def func_iterator(funcs):
        got_next = funcs.hasNext()
        while got_next:
                yield funcs.next()
                got_next = funcs.hasNext()
                print(got_next)

b = ghidra_bridge.GhidraBridge(namespace=globals())
listing = currentProgram.getListing()
for func in func_iterator(listing.getFunctions(0)):
    print(func)

This does work perfectly in the Jython shell with the following output:

FUN_00081388
True
FUN_00081360
True
FUN_00081168
True
FUN_0008103c
False

FUN_0008103c is the last function in the list.
However, running it over the bridge gives:

FUN_00081360
True
FUN_00081168
True
FUN_0008103c
True

Java will then of course throw a NullPointerException (java.lang.NullPointerException) when checking for the next object.

Further :-) it would be a great improvement if your bridge returned a valid iterator when the object contains an Iterator.

Originally posted by @Traxes in #2 (comment)

Timeout Condition

Not sure if this is Ghidra, jfx_bridge, or me using the libraries incorrectly.

I'm getting the FlatProgramAPI for a DLL the main program calls, then trying to get all the functions from that library:

	if func not in highfunction_cache:			
		if func.isExternal():
			origName = func.getExternalLocation().getOriginalImportedName()
			origName = origName.split("_")[1] # Ordinal_XXX, return XXX
			extMgr = currentProgram.getExternalManager()
			fpAPI = FlatProgramAPI(prog)
			funcMgr = prog.getFunctionManager()
			fns = funcMgr.getFunctions(True)
			for x in fns:

It looks like I can get the Function Manager for the external program, but when I try and get the functions, I get either a timeout or invalid arg:

Any thoughts?

Different results in bridge and in Jython Shell

Running hex(getState().getCurrentAddress().getOffset()) in the Jython shell gives a different result than running

import ghidra_bridge
b = ghidra_bridge.GhidraBridge()
b.get_flat_api(namespace=globals())
print(hex(getState().getCurrentAddress().getOffset()))

in the bridge.

I have no idea why. If you can't reproduce this on Windows, I can investigate further.

[Mac] Connection reset by peer

Hi, I've been experimenting with the ghidra_bridge over the last couple of days. For reference, I'm running on a Mac and the bridge is working.

I've noticed that when running tasks that take longer than 30 seconds I often get "Connection reset by peer" errors. A related trace is below.

If there is anything I can do to help track these errors please let me know and I'll report back.

Exception in thread Thread-4:
Traceback (most recent call last):
  File "/Users/canedo/anaconda3/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py", line 442, in run
    self.bridge_conn.get_socket())
  File "/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py", line 297, in read_size_and_data_from_socket
    size_bytes = read_exactly(sock, struct.calcsize(SIZE_FORMAT))
  File "/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py", line 284, in read_exactly
    new_data = sock.recv(num_bytes)
ConnectionResetError: [Errno 54] Connection reset by peer

BridgedIterables are inherently slow

Due to all the round trips BridgedIterables are fairly slow to the point of being unusable for interactive use, at least with large binaries (~6-7k functions).
%time [ f for f in fm.getFunctions(True)] takes nearly 2 minutes for me in this case.

My idea for this would be to extend BridgedIterables with a special method (list() maybe) that converts the iterable to a list on the server side of the bridge (which should be fairly fast), and then transfers it as one message over the bridge. I am not sure if this will work well, given that the result will just be a list of references to objects on the other side of the bridge, and accessing them when iterating over the list will be slow again. Maybe a more general special function that allows the evaluation of arbitrary expressions on the server side, where only the result is transferred, would be an option too. It would lead to uglier code, but the speedup should justify this IMO. E.g. getting the list of all function names ([ f.name for f in fm.getFunctions(True)]) should take roughly a few hundred ms this way vs. over two minutes.
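The "evaluate on the server, transfer only the result" idea can be sketched with remote_eval, which appears elsewhere on this page. The helper below is a hypothetical illustration: it assumes getCurrentProgram() is available in the server's namespace and that a list of plain strings comes back by value in a single message.

```python
def remote_function_names(bridge):
    """Collect all function names server-side and return them in one reply."""
    expr = (
        "[f.getName() for f in "
        "getCurrentProgram().getFunctionManager().getFunctions(True)]"
    )
    # One request and one response, instead of one round trip per function.
    return bridge.remote_eval(expr)


# Usage against a live bridge (not run here):
# import ghidra_bridge
# with ghidra_bridge.GhidraBridge(namespace=globals()) as bridge:
#     names = remote_function_names(bridge)
```

Returning names rather than function objects matters here: a list of bridged references would still cost a round trip per attribute access afterwards.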

Another approach might be to try improving the speed of the bridge itself (e.g. an option for using OS-specific IPC instead of TCP). A quick benchmark on my machine (using the code found at https://stackoverflow.com/questions/14973942/tcp-loopback-connection-vs-unix-domain-socket-performance) shows ~5 times higher throughput and about half the latency when using Unix sockets vs. TCP. This does not account for any Python overhead though.

findBytes return an empty Array

The following code:

findBytes(currentProgram.getMinAddress(), b'\\x80', 0, 1)

Returns an array with multiple results in Ghidra’s python console, but an empty array in ghidra_bridge:

array(ghidra.program.model.address.Address)

I tried to change it to b'\x80' and '\\x80', without success. I also tried:

bridge.remote_eval("list(findBytes(getCurrentProgram().getMinAddress(), b'\\x80', 0, 1))")

The result was an empty list, when it works as expected on Ghidra’s console.

what are the pros and cons of ghidra_bridge?

hi,
I am curious what is the difference between your ghidra_bridge and Ghidra's own server?

best

How to put arguments in remote_eval?

This does not run as expected:

import re
import ghidra_bridge

with ghidra_bridge.GhidraBridge(namespace=globals()) as bridge:
    symbol_type = bridge.remote_import('ghidra.program.model.symbol.SymbolType')

    regex = re.compile(r'^PTR_(?:FUN|DAT|LOOP)_\d+$')
    symbols = currentProgram.getSymbolTable().getAllSymbols(True)

    symbols = bridge.remote_eval("next((sym for sym in symbols if sym.getSymbolType() == SymbolType.LABEL and r.match(sym.getName())), False)", symbols=symbols, SymbolType=symbol_type, r=regex)

Following exception occurs when tried to run it:

jfx_bridge.bridge.BridgeException: ("global name 'SymbolType' is not defined", <_bridged_exceptions.NameError('NameError("global name 'SymbolType' is not defined",)', type=exceptions.NameError, handle=6a1e7a2e-374e-4a81-8303-f6c5deac146d)>)

# if changed SymbolType to qualified ghidra.program.model.symbol.SymbolType in remote_eval()
jfx_bridge.bridge.BridgeException: ("global name 'r' is not defined", <_bridged_exceptions.NameError('NameError("global name 'r' is not defined",)', type=exceptions.NameError, handle=a55f35e3-47a1-4071-92f9-26b04a055a7f)>)

How do I properly send arguments to remote_eval? Also, is there a way to put multiple lines in remote_eval? Thanks!
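The pattern of errors above (symbols resolves, but SymbolType and r do not) matches how Python scopes generator expressions inside eval. Only the outermost iterable is evaluated in the calling scope; the body runs in its own scope and looks up free names as globals. The sketch below reproduces this locally with plain eval, assuming (not confirmed here) that remote_eval hands its keyword arguments to eval as locals. The function `eval_with_locals` is a hypothetical stand-in, not the bridge's implementation.

```python
def eval_with_locals(expr, **names):
    # Stand-in for an eval that receives keyword arguments as *locals*
    # (an assumption about how remote_eval treats them).
    return eval(expr, {}, names)


# 'items' is the outermost iterable, so it resolves; 'target' does not.
naive = "next((x for x in items if x == target), None)"

# Passing the values through a lambda turns them into real local variables
# that the generator expression can close over.
wrapped = (
    "(lambda items, target: "
    "next((x for x in items if x == target), None))(items, target)"
)

try:
    eval_with_locals(naive, items=[1, 2, 3], target=2)
    naive_error = None
except NameError as exc:
    naive_error = str(exc)

print(naive_error)
print(eval_with_locals(wrapped, items=[1, 2, 3], target=2))
```

This addresses only the name lookup. Whether a Python 3 compiled regex passed as an argument is usable on the Jython side is a separate question that this sketch does not cover.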

Warning and potential issues when saving while script is running

When attempting to save (File->Save or Ctrl+s) while the bridge is running, Ghidra issues a warning that the program is currently being modified by a Python script, that any changes made by that script will be lost, and that future errors might occur.

I think this is currently an inherent limitation of the design (as a script instead of maybe an extension). This might not actually break anything if the bridge isn't used to modify anything. A temporary fix would be to fix the issue that the script can't be canceled and restarted without restarting Ghidra (due to the address not being freed), so the script could just be canceled, the program saved, and the script restarted.

[Errno 111]Connection refused

Good morning,

While running the ghidra_bridge_server.py script, I executed the setup client code in step 2 and received a connection refused socket.error. I used an example binary file with a valid entry point address and installed ghidra_bridge using pip. Any idea what a socket error could stem from?
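A "connection refused" error generally means nothing is accepting connections on the host and port the client tried. Before digging into the bridge itself, it can help to check that directly. The sketch below uses only the standard library; the helper name `is_listening` is hypothetical, and the host and port to test are whatever the server script reports when it starts.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


# Usage (substitute the port printed by the server script):
# print(is_listening("127.0.0.1", server_port))
```

The same check helps with "Address already in use": if it returns True after the server script was stopped, something is still holding the port.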

Bad file descriptor

I have something that leverages some of the gui portions of ghidra and any button I click from the pop up box from my script ( including the red x in the top right to close it) results in this error:

_socket.error: [Errno 9] Bad file descriptor
Traceback (most recent call last):
  File "/home/stasia/ghidra_scripts/jfx_bridge/bridge.py", line 2092, in __call__
    return self._bridge_conn.remote_call(self._bridge_handle, *args, **kwargs)
  File "/home/stasia/ghidra_scripts/jfx_bridge/bridge.py", line 194, in wrapper
    return func(self, *args, **kwargs)
  File "/home/stasia/ghidra_scripts/jfx_bridge/bridge.py", line 1053, in remote_call
    return self.deserialize_from_dict(self.send_cmd(command_dict))
  File "/home/stasia/ghidra_scripts/jfx_bridge/bridge.py", line 207, in wrapper
    return_val = func(self, *args, **kwargs)
  File "/home/stasia/ghidra_scripts/jfx_bridge/bridge.py", line 959, in send_cmd
    write_size_and_data_to_socket(sock, data)
  File "/home/stasia/ghidra_scripts/jfx_bridge/bridge.py", line 311, in write_size_and_data_to_socket
    bytes_sent = sock.send(package[sent:])
  File "/home/stasia/ghidra_10.0.3_PUBLIC/Ghidra/Features/Python/data/jython-2.7.2/Lib/_socket.py", line 1387, in _dummy
    raise error(errno.EBADF, 'Bad file descriptor')
_socket.error: [Errno 9] Bad file descriptor

I'm not exactly sure what to make of this. I'm not sure which bad file it's referring to.

Any insight would be appreciated!

Exposing Ghidra's `help` via the bridge

The built-in help() Python function has been altered by the Ghidra Python Interpreter to add support for displaying Ghidra's Javadoc (where available) for a given Ghidra class, method, or variable. For example, to see Ghidra's Javadoc on the state variable, simply do:
 
    >>> help(state)
    #####################################################
    class ghidra.app.script.GhidraState
      extends java.lang.Object

     Represents the current state of a Ghidra tool

    #####################################################

    PluginTool getTool()
       Returns the current tool.

      @return ghidra.framework.plugintool.PluginTool: the current tool

    -----------------------------------------------------
     Project getProject()
       Returns the current project.

       @return ghidra.framework.model.Project: the current project

    -----------------------------------------------------
    ...
    ...
    ...

Source: Python Interpreter Documentation (F1 when pointing to the Python interpreter in Ghidra)

It would be great if this feature could be exposed via the Bridge somehow, at least by allowing help(currentProgram) in the client using the bridge to work or even supporting the IPython magics like currentProgram? or currentProgram??

There probably will be multiple issues with this:
help(currentProgram) doesn't return the documentation as a string but seems to do some weird asynchronous voodoo. So the trick would be getting this as an actual Python string that can then be transferred over the bridge. My guess is that this will involve looking at their Jython plugin to see how they generate this doc and potentially re-implementing it.

How to commit params automatically?

I am trying to use Ghidra to find the params of a function. That part is pretty easy using ghidra_bridge. However, it only works when the params have been committed (you can do this by going to the function in the decompiler and pressing P, or right clicking -> commit params). My end goal is to have this all done automatically by one Python script. Is there any way to commit params automatically, or another way of recovering them? Thanks! Love the app!

How to use headlessAnalyzer with ghidra-bridge

I have written a script in Python 3 and want to run it using the headlessAnalyzer. I have used ghidra-bridge to start a bridge server so that I can use my Python 3 code, but when I run the script using headlessAnalyzer I get this error:
ImportError: No module named ghidra_bridge
The headlessAnalyzer can't find the ghidra_bridge module. Is there a way I can run this script using headlessAnalyzer?

Handle iterable callables [Memory.getByte() strange return]

Hello,

I'm trying ghidra_bridge for the first time.

I would like to read a byte from memory with:

function = getFunctionContaining(address)
body = function.getBody()
addr = body.getMinAddress()
b = ghidra.program.model.mem.Memory.getByte(addr)

The following error is raised:

AssertionError: Found something callable and iterable at the same time

This is strange because it's only a byte ...

thank you for your answer :)
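Memory is an interface in Ghidra and getByte is an instance method, so the call is normally made on the object returned by the program's getMemory() rather than on the class. Whether calling it on the class is what triggers the AssertionError is an assumption. The sketch below only shows the instance-based form; the helper name `read_first_byte` is hypothetical.

```python
def read_first_byte(program, function):
    """Read the byte at the lowest address of a function's body."""
    addr = function.getBody().getMinAddress()
    # Call getByte on the Memory *instance* belonging to this program.
    return program.getMemory().getByte(addr)


# Usage against a live bridge (not run here):
# function = getFunctionContaining(address)
# b = read_first_byte(currentProgram, function)
```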
