A plugin to introduce a generic API for Decompiler support in GEF

Overview

decomp2gef

A plugin to introduce a generic API for Decompiler support in GEF. Like GEF, the plugin is batteries-included and requires no external dependencies other than Python.

decomp2gef Demo viewable here.

Quick Start

First, install the decomp2gef plugin into gef:

cp decomp2gef.py ~/.decomp2gef.py && echo "source ~/.decomp2gef.py" >> ~/.gdbinit

Alternatively, you can load it for one-time-use inside gdb with:

source /path/to/decomp2gef.py

Now import the relevant script for your decompiler:

IDA

  • Open IDA on your binary and press Alt-F7.
  • A "Run Script" popup will appear; load the decomp2gef_ida.py script from this repo.

Now use the decompiler connect command in GDB. Note: you must already be in an active debugging session.

Usage

In gdb, run:

decompiler connect ida

If all is well, you should see:

[+] Connected to decompiler!

Now just use GEF like normal and enjoy decompilation and decompiler symbol mapping! When you change a symbol in IDA, like a function name, it will be automatically reflected in gdb after just 2 steps!

Features

  • Auto-updating decompilation context view
  • Auto-syncing function names
  • Breakable/Inspectable symbols
  • Auto-syncing stack variable names
  • Auto-syncing structs

Abstract

The reverse engineering process often involves a decompiler, which makes decompiler support in a debugger fundamental, since switching context between the two is hard. Decompilers have a lot in common. During the reversing process the analyst produces reverse engineering artifacts (REAs). These REAs are common across all decompilers:

  • stack variables
  • global variables
  • structs
  • enums
  • function headers (name and prototype)
  • comments

Knowledge of REAs can be used to do lots of things, like syncing REAs across decompilers or creating a common interface for a debugger to display decompilation information. GEF is currently one of the best gdb upgrades, making it a perfect place to first implement this idea. In the future, it should be easily transferable to any debugger supporting Python 3.

Adding your decompiler

To add your decompiler, simply make a Python XML-RPC server that implements the 4 server functions found in the decomp2gef Decompiler class. Follow the code to see which types each function must return.
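
As a rough illustration of the server side, the sketch below exposes four placeholder functions over XML-RPC using only the Python standard library. The method names (ping, decompile, function_headers, global_vars) and the return shapes are assumptions made for this example; the actual names and types are whatever the decomp2gef Decompiler class defines. Port 3662 is the default mentioned elsewhere on this page.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class ExampleDecompilerServer:
    """Hypothetical server: method names and return shapes are illustrative only."""

    def ping(self):
        # Lets a client check that the server is reachable.
        return True

    def decompile(self, addr):
        # A real plugin would ask the decompiler for pseudocode covering `addr`.
        return {"code": ["int main() {", "  return 0;", "}"],
                "func_name": "main", "line": 0}

    def function_headers(self):
        # XML-RPC struct keys must be strings, so addresses are sent as hex text.
        return {"0x1149": {"name": "main", "size": 32}}

    def global_vars(self):
        return {"0x4010": {"name": "counter"}}


def start_server(host="127.0.0.1", port=3662):
    """Serve the example API on a background thread and return the server."""
    server = SimpleXMLRPCServer((host, port), allow_none=True, logRequests=False)
    server.register_instance(ExampleDecompilerServer())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def connect(host="127.0.0.1", port=3662):
    """Client-side helper: returns a proxy whose attributes map to server methods."""
    return ServerProxy(f"http://{host}:{port}")
```

With the server started inside the decompiler's Python environment, a client could call connect().ping() to check the link before requesting decompilation.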

Comments
  • Missing or invalid attribute 'comment' on Windows IDA 7.6

    IDA version 7.6 Python 3.10

    The decomp2gef IDA script seems to fail with Z:\...\IDA\IDA Pro 7.6\plugins\decomp2gef_ida.py: Missing or invalid attribute 'comment', and I am unable to see the decomp2gef plugin being loaded successfully. Any ideas?

      bytes   pages size description
    --------- ----- ---- --------------------------------------------
       532480    65 8192 allocating memory for b-tree...
       507904    62 8192 allocating memory for virtual array...
       262144    32 8192 allocating memory for name pointers...
    -----------------------------------------------------------------
      1302528            total memory allocated
    
    Loading processor module Z:\...\IDA\IDA Pro 7.6\procs\pc.dll for metapc...Initializing processor module metapc...OK
    Loading type libraries...
    Autoanalysis subsystem has been initialized.
    Z:\...\IDA\IDA Pro 7.6\plugins\decomp2gef_ida.py: Missing or invalid attribute 'comment'
    Database for file 'redacted' has been loaded.
    Hex-Rays Decompiler plugin has been loaded (v7.6.0.210427)
      License: 57-631C-7A2B-72 IDA PRO 7.6 SP1 (99 users)
      The hotkeys are F5: decompile, Ctrl-F5: decompile all.
    
      Please check the Edit/Plugins menu for more informaton.
    Z:\...\IDA\IDA Pro 7.6\plugins\decomp2gef_ida.py: Missing or invalid attribute 'comment'
    805EF50: restored microcode from idb
    805EF50: restored pseudocode from idb
    -----------------------------------------------------------------------------------------
    Python 3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)] 
    IDAPython v7.4.0 final (serial 0) (c) The IDAPython Team <[email protected]>
    -----------------------------------------------------------------------------------------
    
    bug 
    opened by caprinux 10
  • ELF fails to build with objcopy on some binaries

    Found out about this decomp2gef script and was eager to try it out :)

    When trying it out, I encountered this error (screenshot not preserved).

    I'm currently using GDB 11.1, IDA 7.6 together with the latest gef.py script freshly pulled from the gef github. Any clue how I could fix this?

    bug 
    opened by caprinux 8
  • Improve code logic and fix errors

    This PR aims to address a few issues in decomp2gef

    First issue addressed

    As of now, if you try to connect the decompiler while debugging a binary on a remote server via gdbserver, chances are you will encounter the error min() arg is an empty sequence, which arises from the following lines:

    base_address = min([x.page_start for x in vmmap if x.path == get_filepath()])
    ...
    text_base = min([x.page_start for x in vmmap if x.path == get_filepath()])
    

    This arises due to an inconsistency between the path of the binary on the remote server and the path of the local binary.

    For example, if I debug a remote binary at /bin/program using a local copy at /tmp/program, x.path will return /bin/program and get_filepath() will return /tmp/program. The two do not match, so the list is empty.

    Hence a slight modification to compare the file name rather than the absolute path will solve this issue.

    base_address = min([x.page_start for x in vmmap if x.path.split('/')[-1] == get_filename()])
    ...
    text_base = min([x.page_start for x in vmmap if x.path.split('/')[-1] == get_filename()])
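
The same idea can be tried outside GDB. The sketch below is a minimal stand-in, assuming a hypothetical Section record in place of GEF's memory-map entries; it matches on the file name only and raises a descriptive error instead of letting min() fail on an empty list.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class Section:
    """Hypothetical stand-in for one entry of GEF's memory map."""
    page_start: int
    path: str


def find_base_address(vmmap, local_path):
    """Return the lowest page start among mappings whose file name matches."""
    wanted = posixpath.basename(local_path)
    starts = [s.page_start for s in vmmap if posixpath.basename(s.path) == wanted]
    if not starts:
        # Explicit error rather than "min() arg is an empty sequence".
        raise LookupError(f"no mapping named {wanted!r} in the memory map")
    return min(starts)
```

Note that matching on the file name alone can pick the wrong mapping if two loaded objects share a name in different directories.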
    

    Second issue addressed

    def update_function_data(self, addr):
        ...
        for idx, arg in args.items():
            idx = int(idx, 0)
            expr = f"""(({arg['type']}) {current_arch.function_parameters[idx]}"""
        ...
    

    Within update_function_data(), we use current_arch.function_parameters to get the registers in which function arguments are stored, and use idx to match each argument to the register or memory location that holds it.

    In x86_64, current_arch.function_parameters = ['$rdi', '$rsi', '$rdx', '$rcx', '$r8', '$r9'] and this works for functions with up to 6 arguments, as current_arch.function_parameters[0] will match argument 1 to $rdi and so on.

    The flaw comes when we fail to consider functions with 7 or more arguments, where the extra arguments are found on the stack.

    However, we can set this aside for now; we don't usually encounter 7 or more arguments, right?

    The more urgent flaw appears when we bring in the x86 architecture.

    In x86, current_arch.function_parameters = ['$esp']. This means that beyond the first argument, decomp2gef will break with an IndexError as it tries to access current_arch.function_parameters[1] for the 2nd argument and so on.

    I'm not sure if there's a nice way to do this, but I essentially redefined current_arch.function_parameters for x86 architectures.

    current_arch.function_parameters = [f'$esp+{x}' for x in range(0, 28, 4)]
    

    This allows decomp2gef to work, but I haven't considered implications yet as it may get pretty complicated(?)
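
One possible way to cover both cases with a single lookup is sketched below. The helper argument_expr and its parameters are hypothetical and not part of decomp2gef or GEF: it uses the register list while it lasts and falls back to stack-pointer-relative expressions afterwards. The correct first_stack_offset depends on where execution is stopped (for example, whether the return address has already been pushed), so it is left as a parameter rather than assumed.

```python
def argument_expr(idx, registers, stack_pointer, word_size, first_stack_offset=0):
    """Return a GDB expression string for the location of argument `idx`.

    registers          -- argument registers in order (empty for stack-only ABIs)
    stack_pointer      -- e.g. "$esp" or "$rsp"
    word_size          -- bytes per stack slot
    first_stack_offset -- offset of the first stack-passed argument (assumption)
    """
    if idx < len(registers):
        return registers[idx]
    # Arguments beyond the register list live in consecutive stack slots.
    offset = first_stack_offset + (idx - len(registers)) * word_size
    return f"{stack_pointer}+{offset}"


X86_64_REGS = ["$rdi", "$rsi", "$rdx", "$rcx", "$r8", "$r9"]

# Reproduces the x86 list from the text above: $esp+0, $esp+4, ..., $esp+24.
x86_args = [argument_expr(i, [], "$esp", 4) for i in range(7)]
```

Because the lookup never indexes past the register list, it cannot raise the IndexError described above, whatever the argument count.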

    I welcome any ideas!!

    opened by caprinux 7
  • invalid string offset for section `.strtab'

    When connecting GEF to the decompiler, gdb fails to add-symbol-file and throws an error.

    BFD: /tmp/tmp3pmay68f.c.debug: invalid string offset 16777215 >= 282 for section '.strtab'

    Although this does not break decomp2gef, it causes the debugger to be without a symbol file and hence renders some features unusable.

    Sample binary with this behavior: sample_program.zip

    bug 
    opened by caprinux 6
  • Add proper support for attaching

    Currently, support for using the attach command is shaky with PIE binaries, and requires the strict process of launching a new gdb instance, attaching to the target process id, and then connecting to decomp2dbg. Attempting to attach again after this will cause the attached-to binary to have a new base address, and I assume this prevents decomp2dbg from functioning properly, as I lose symbols after that. On my local system, disconnecting and reconnecting doesn't fix this (whether done before or after the binary is attached to for the second time).

    enhancement 
    opened by frqmod 2
  • add instruction for wsl2

    I suggest adding instructions for those who want to use this tool in WSL.

    1. Run ./install.sh --ida /mnt/c/xxx/IDA/plugins to install.
    2. Listen on 0.0.0.0:3662 in IDA.
    3. Add an inbound rule for port 3662 in Windows Firewall, for the private or domain network.
    4. Run decompiler connect ida 192.168.xxx.xxx 3662 in gdb to connect, using the LAN IP of the Windows host.
    opened by RoderickChan 1
  • add checks when forcing text size

    Previously, we forced text_size to 0xFFFFFF with a nasty hack which only works on 64-bit binaries.

    This means that 32-bit binaries will throw a .strtab offset error or something along those lines most of the time, which is rather ugly.

    Hence we implement a bitness check on the binary and apply the hack only if the binary is 64-bit. We should definitely implement a nicer method that caters to 32-bit binaries when possible, but until then this will suffice to preserve sanity.
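
For illustration, one way to perform such a check without extra dependencies is to read the ELF identification bytes directly: the byte at offset 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit files. The helper names elf_bits and choose_text_size below are hypothetical and only sketch the decision described above, not the code in this PR.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4          # index of the class byte within e_ident
FORCED_TEXT_SIZE = 0xFFFFFF


def elf_bits(header: bytes) -> int:
    """Return 32 or 64 based on the EI_CLASS byte of an ELF header."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class = header[EI_CLASS]
    if ei_class == 1:
        return 32
    if ei_class == 2:
        return 64
    raise ValueError(f"unknown ELF class {ei_class}")


def choose_text_size(header: bytes, real_size: int) -> int:
    """Apply the forced size only to 64-bit binaries; keep the real size otherwise."""
    return FORCED_TEXT_SIZE if elf_bits(header) == 64 else real_size
```

In practice the header bytes would come from reading the first few bytes of the binary opened in "rb" mode.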

    addresses #14 !!

    opened by caprinux 1
  • Fix symbol size offsets/Designate appropriate symbol size to symbols

    Addresses #15, took a while to debug but on comparing queued_sym_sizes to sym_info_list, they seemed to be correct and match accordingly.

    I tested this PR against very basic 32-bit and 64-bit binaries, which both seem to give me the appropriate symbol sizes upon calling readelf.

    Do have a look! Unfortunately, does not fix #14 :(

    opened by caprinux 1
  • Requires sortedcontainers

    decomp2gef actually does have a single dependency which I did not realize was a dependency: sortedcontainers. It's needed to create a fast and memory-friendly mapping for non-native symbols in gef. We should decide if we want to make decomp2gef dependent on some Python packages, or try to replace the functionality of SortedDict.

    discussion 
    opened by mahaloz 1
  • Feat Request: Programmable Ports & IPs

    As brought up by @caprinux (in #3), we don't support specifying the port or IP that GEF connects over. Currently, the port is hardcoded to 3662.

    To allow for this, we will need a fundamental change in the server-side architecture, since we need a way to specify the port and IP.

    enhancement 
    opened by mahaloz 1
  • Register Variable Support

    • we can now support every variable shown in a decompiler (including the ones assigned to a register)
    • functions args are being deprecated in favor of setting them through either register vars or stack vars
    • refactored some janky type setting code

    Closes #38

    opened by mahaloz 0
  • Stack Vars from IDA assigned to incorrect locations

    Here is a simple example I tested on decomp2dbg v3.1.3 and IDA 7.5:

    #include <stdio.h>
    
    int main()
    {
        int a = 1;
        int b;
        scanf("%d",&b);
        int c = a + b;
        printf("%d\n",c);
        return 0;
    }
    

    The decompiled output from IDA is:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      int v4; // [rsp+Ch] [rbp-14h] BYREF
      int v5; // [rsp+10h] [rbp-10h]
      int v6; // [rsp+14h] [rbp-Ch]
      unsigned __int64 v7; // [rsp+18h] [rbp-8h]
    
      v7 = __readfsqword(0x28u);
      v5 = 1;
      __isoc99_scanf(&unk_2004, &v4, envp);
      v6 = v4 + v5;
      printf("%d\n", (unsigned int)(v4 + v5));
      return 0;
    }
    

    When I stepped to __isoc99_scanf(&unk_2004, &v4, envp);, I tried to show the value of v5, but it is different from the value at $rbp - 0x10.

    Is this a bug or did I do something wrong?

    bug 
    opened by LioTree 2
  • binary ninja plugin manager?

    The binary ninja plugin manager supports plugins that exist in subfolders. All it would take would be to tag and cut a release (or using release_helper) and let me know on this issue and I can add it. Then subsequent releases automatically notify us and we update the plugin manager accordingly.

    Note that I haven't looked at the current import hierarchy but because the plugin manager doesn't necessarily install things to the global namespace it might require some tweaks to how imports are done.

    opened by psifertex 1
  • Tab completion broken in Archlinux gdb

    After sourcing decomp2dbg in my gdbinit, the commands are available, but tab completion for help and the like no longer works. This happens with a plain gdb whose only gdbinit entry is source ~/.decomp2dbg.py.

    opened by ysf 6
  • Add Support for shared libraries.

    Hello. I tested this on a binary and it works perfectly, great tool! But when I test on a shared library loaded by another binary, the tool does not work after connecting to the server (no decompilation, and breakpoints have offset errors). I know the intended workflow is to start the server on the binary and use connect while debugging that binary with gdb. But in my situation I can't debug the library without first starting the main binary that links against it. Maybe multiple servers could be synced? decomp2dbg can't export IDA's decompiled code to gdb because the shared library is external, but adding another syncing server on the IDA instance for the shared library, and syncing correctly when execution jumps into the library, might work.

    opened by 0xMirasio 16
  • Add Support for Struct Imports

    For now, we will only support IDA since we have a clear-cut way to both get every struct and also know when they have been updated. This may also be possible in Binja, but it is out of the question for future Ghidra support... that one will have to wait.

    IDA Changes

    In IDA we need to find all ordinal numbers, each of which represents a custom struct in IDA. After that, we can use idc.print_decls("1", 0) for each number to get a nice C representation of the struct. Now that we have a string that holds the C definition of the struct, we need to do things in the core.

    The changes all take place in the server. It's possible this may change the old API.

    Client Changes

    Assuming we now have a series of structs that are represented in C, we actually need to compile them into an object file and then add them with the classic add-symbol-file we use on the backend for other things. The trick though is adding this symbol file before we add the big one with all the global symbols here: https://github.com/mahaloz/decomp2dbg/blob/57983617d9a14f1f2ed7b54ee07bd15f14075c45/decomp2dbg/clients/gdb/symbol_mapper.py#L243

    Since both symbol files will be loaded into the same place, there will be an overlapping main function. We either need to bake structs directly into the first file we create, or we need to make a new way to add native-struct through the symbol mapper.

    enhancement 
    opened by mahaloz 0
Releases(v3.1.3)
  • v3.1.3(Nov 21, 2022)

  • v3.1.2(Nov 21, 2022)

  • v3.1.1(Nov 18, 2022)

  • v3.1.0(Nov 17, 2022)

    What's Changed

    • Add Ghidra Demo & Refactor readme by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/47
    • Reflect config changes in pwndbg by @szymex73 in https://github.com/mahaloz/decomp2dbg/pull/48
    • Typo in README.md by @Ice1187 in https://github.com/mahaloz/decomp2dbg/pull/49

    New Contributors

    • @szymex73 made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/48
    • @Ice1187 made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/49

    Full Changelog: https://github.com/mahaloz/decomp2dbg/compare/v3.0.0...v3.1.0

    d2d-ghidra-plugin.zip(1.41 MB)
  • v3.0.0(Oct 25, 2022)

    • Added Ghidra support
    • Refactored how installing works to full Python-only
    • Added dependence on the BinSync project

    What's Changed

    • Python Installer by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/46

    Full Changelog: https://github.com/mahaloz/decomp2dbg/compare/v2.2.0...v3.0.0

    d2d-ghidra-plugin.zip(1.41 MB)
  • v2.2.0(Oct 25, 2022)

    What's Changed

    • Support native symbol mapping by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/1
    • added REAL native support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/2
    • fail gracefully on bad argv & fix bad elfs reads by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/6
    • Add angrmanagement support for decomp2gef by @Cl4sm in https://github.com/mahaloz/decomp2dbg/pull/8
    • DRAFT: Local Var Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/7
    • Api refactor by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/9
    • Major Refactor: Global Vars, Programmable Ports, Packaging by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/10
    • make sure IDA Plugin has all consts defined by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/12
    • Support remote debugging by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/17
    • Fix rebasing bugs in angr-decompiler plugin by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/18
    • Update GEF API use to latest version by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/22
    • Fix symbol size offsets/Designate appropriate symbol size to symbols by @caprinux in https://github.com/mahaloz/decomp2dbg/pull/20
    • add checks when forcing text size by @caprinux in https://github.com/mahaloz/decomp2dbg/pull/21
    • minor fixes + replace all usage of GEF Elf object with pyelftools by @caprinux in https://github.com/mahaloz/decomp2dbg/pull/25
    • [WIP] Binja Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/19
    • fixed bad sizing on binaries that generate a larger blank symbol section by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/26
    • Fix symbol duplication by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/27
    • fix another duplication bug by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/28
    • [WIP] Support Vanilla GDB by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/30
    • Fix a typo that caused the plugin not to work for Python <3.8. by @adamdoupe in https://github.com/mahaloz/decomp2dbg/pull/32
    • Fix Manual Install Instructions by @adamdoupe in https://github.com/mahaloz/decomp2dbg/pull/31
    • Fix README typos by @mborgerson in https://github.com/mahaloz/decomp2dbg/pull/33
    • Handle stack frame offset for stack variables on x86 architectures by @zolutal in https://github.com/mahaloz/decomp2dbg/pull/35
    • always refresh baseaddr on connect by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/37
    • Register Variable Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/39
    • Support loading symbols at configurable base addresses by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/42
    • Ghidra Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/45

    New Contributors

    • @mahaloz made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/1
    • @Cl4sm made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/8
    • @caprinux made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/20
    • @adamdoupe made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/32
    • @mborgerson made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/33
    • @zolutal made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/35

    Full Changelog: https://github.com/mahaloz/decomp2dbg/commits/v2.2.0

    d2d-ghidra-plugin.zip(1.41 MB)
Owner
Zion
Native Hawaiian | Phd Student @sefcom | Co-captain @shellphish | President of @asu-hacking-club