This is my folder of Ghidra scripts. They're all written in Java so far, which seemed like a good idea since Ghidra itself is written in Java.
GptOneFunction.java was the first script I wrote and the reason I started making Ghidra scripts. It uses Ghidra's APIs to get a string representation of the current (Ghidra-decompiled) function. Then it passes that string to one of OpenAI's models (currently gpt-4-turbo-preview) via their API with the following prompt:
This is a decompiled function from Ghidra; analyze it. Your reply must be non-nested (i.e. FLAT) JSON.
Give the function a better name. "functionName" is the key for the new function name.
Give the parameters and variables better names. All renames will have original name (without type info)
as key and new name (without type info) as value.
I may not keep this documentation completely up-to-date, so if that's important to you, check the code to make sure that's still right.
Usually, the model sticks to the JSON format specified and a response comes back with the structure described. The function, parameters, and variables are all then renamed using the pattern ${original}_${GPT_suggestion}.
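To make the rename step concrete, here is a minimal standalone sketch (not the script's actual code) of turning a flat JSON reply into ${original}_${GPT_suggestion} names. The class name RenameSketch, the sample reply, and the placeholder name FUN_140001000 are made up for illustration, and the regex-based parsing assumes names contain no escaped quotes; a real script would more likely use a proper JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RenameSketch {
    // Matches one "key": "value" pair. Assumption: no escaped quotes inside names.
    private static final Pattern PAIR =
            Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"([^\"]*)\"");

    /** Parses a flat (non-nested) JSON object of string pairs, keeping reply order. */
    public static Map<String, String> parseFlatJson(String json) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(json);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    /** Builds the combined name following the pattern ${original}_${GPT_suggestion}. */
    public static String combinedName(String original, String suggestion) {
        return original + "_" + suggestion;
    }

    public static void main(String[] args) {
        // Hypothetical model reply in the flat format the prompt asks for.
        String reply = "{\"functionName\": \"parseHeader\", "
                + "\"param_1\": \"buffer\", \"local_18\": \"length\"}";

        Map<String, String> renames = parseFlatJson(reply);

        // "functionName" is the special key for the function itself.
        String functionSuggestion = renames.remove("functionName");
        System.out.println(combinedName("FUN_140001000", functionSuggestion));

        // Every remaining entry maps an original name to a suggested name.
        for (Map.Entry<String, String> e : renames.entrySet()) {
            System.out.println(combinedName(e.getKey(), e.getValue()));
        }
    }
}
```

Keeping the original name as a prefix means the Ghidra-generated identifier is still visible after the rename, so a bad suggestion is easy to spot and undo.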
While testing GptOneFunction I realized that things were going really well. The renamed variables were very helpful in figuring out what the function might be doing. The new function names were descriptive and typically gave short summaries of important logic in each function. This gave me an idea.
I figured I could write a script, GptAllFunctions.java, which could feed ALL functions to GPT. If I could get an acyclic graph of all function calls, I could even start with the leaf nodes (ideally these would be relatively simple functions if they called no other functions) and then work my way up. This way, each time GPT renamed a function, when its caller was later fed to GPT, it would have a summary (built into the renamed functions' names) of what any called function was doing. Amazingly, there's already a Ghidra API which gives you exactly the right structure needed to do this.
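The leaf-first idea can be sketched without Ghidra at all. The example below is only an illustration under stated assumptions: the call graph is a plain map from caller name to callee names (all names here are made up), and the order comes from an ordinary depth-first post-order walk rather than from the Ghidra API mentioned above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LeafFirstOrder {
    /**
     * Returns function names ordered so that, for an acyclic call graph,
     * every callee appears before any of its callers.
     */
    public static List<String> leafFirst(Map<String, List<String>> calls) {
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String fn : calls.keySet()) {
            visit(fn, calls, visited, order);
        }
        return order;
    }

    private static void visit(String fn, Map<String, List<String>> calls,
                              Set<String> visited, List<String> order) {
        if (!visited.add(fn)) {
            return; // already handled (also stops runaway recursion on a cycle)
        }
        for (String callee : calls.getOrDefault(fn, List.of())) {
            visit(callee, calls, visited, order);
        }
        // All callees are in the list by now, so this function can follow them.
        order.add(fn);
    }

    public static void main(String[] args) {
        // Hypothetical call graph: caller -> callees.
        Map<String, List<String>> calls = new LinkedHashMap<>();
        calls.put("main", List.of("parse", "render"));
        calls.put("parse", List.of("readByte"));
        calls.put("render", List.of("readByte"));
        calls.put("readByte", List.of());

        // Prints [readByte, parse, render, main]
        System.out.println(leafFirst(calls));
    }
}
```

Processing functions in that order means each caller is sent to GPT only after everything it calls has already been renamed. For a very large program, the recursion would probably need to be replaced with an explicit stack to avoid deep call chains overflowing the Java stack.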
I don't know how that would've turned out though. The program I was fiddling with had more than 200k functions, so the time and money required to process the whole file would have been more than I was willing to spend. It might be worth revisiting in the future if costs go down, speed goes up, or someone finds a novel way to batch functions without giving a cheaper GPT model a meltdown.
ReadClipboardStack.java is not GPT-related. It checks the clipboard for a stack trace generated by StackWalker and figures out which function each line of the stack trace falls inside. Ghidra even treats addresses and function names printed in the console as hyperlinks, so that's nice.
For example, this:
00000001407E8C72 (00007FF7715F8C72) (program): (filename not available): Scaleform::Render::Matrix2x4<float>::Prepend
00000001407E3AB1 (00007FF7715F3AB1) (program): (filename not available): Scaleform::Render::Matrix2x4<float>::Prepend
00000001407EF76B (00007FF7715FF76B) (program): (filename not available): Scaleform::Render::Matrix2x4<float>::Prepend
0000000140B915EB (00007FF7719A15EB) (program): (filename not available): Scaleform::Render::Matrix2x4<float>::Prepend
0000000140989093 (00007FF771799093) (program): (filename not available): Scaleform::Render::Matrix2x4<float>::Prepend
FFFF82470BECE1E8 (0000023D3CCDE1E8) ((module-name not available)): (filename not available): (function-name not available)
000000014028EE3F (00007FF77109EE3F) (program): (filename not available): (function-name not available)
Would become this:
ReadClipboardStack.java> Address 00000001407E8C72 is inside function: FUN_1407e8ba0
ReadClipboardStack.java> Address 00000001407E3AB1 is inside function: FUN_1407e39f0
ReadClipboardStack.java> Address 00000001407EF76B is inside function: FUN_1407ef720
ReadClipboardStack.java> Address 0000000140B915EB is inside function: FUN_1403914a0
ReadClipboardStack.java> Address 0000000140989093 is inside function: FUN_140988e00
ReadClipboardStack.java> Address FFFF82470BECE1E8 is not inside any known function.
ReadClipboardStack.java> Address 000000014028EE3F is not inside any known function.
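The parsing half of that can be sketched in plain Java. This is an assumption-laden illustration, not the script's real code: the class name StackTraceAddresses is made up, and the regex assumes every stack line starts with a 16-digit hex address followed by a second address in parentheses, as in the sample above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StackTraceAddresses {
    // Assumed line shape: <16 hex digits> (<16 hex digits>) ...rest of line...
    private static final Pattern LINE =
            Pattern.compile("^\\s*([0-9A-Fa-f]{16})\\s+\\([0-9A-Fa-f]{16}\\)");

    /** Returns the leading address of every line that looks like a stack frame. */
    public static List<String> extractAddresses(String clipboardText) {
        List<String> addresses = new ArrayList<>();
        for (String line : clipboardText.split("\\R")) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                addresses.add(m.group(1));
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        String clipboard = String.join("\n",
                "00000001407E8C72 (00007FF7715F8C72) (program): (filename not available): Scaleform::Render::Matrix2x4<float>::Prepend",
                "some unrelated clipboard text",
                "000000014028EE3F (00007FF77109EE3F) (program): (filename not available): (function-name not available)");

        for (String address : extractAddresses(clipboard)) {
            System.out.println("Found address " + address);
        }
    }
}
```

Inside a Ghidra script, each extracted string would then presumably be converted to an address and looked up with something like the flat API's getFunctionContaining, which returns null when the address is not inside any known function.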