Binary Analysis and Reverse Engineering

macaugh

About

This skill enables developers to analyze compiled binaries to understand program behavior and identify vulnerabilities without source code. It combines static analysis (disassembly) and dynamic analysis (debugging) for a comprehensive approach. Use it for security assessments, malware analysis, or reverse engineering closed-source software.

Overview

Binary analysis examines compiled executables without access to source code. This skill combines static analysis (disassembly, decompilation) with dynamic analysis (debugging, tracing) to understand program behavior, identify vulnerabilities, and reverse engineer functionality.

Core principle: Combine static and dynamic analysis. Static reveals structure; dynamic reveals behavior.

When to Use

  • Analyzing closed-source software for vulnerabilities
  • Analyzing malware to understand its behavior
  • Reverse engineering proprietary protocols
  • Understanding third-party libraries or dependencies
  • CTF challenges and security research
  • Verifying security claims of binary-only software

Analysis Workflow

Phase 1: Initial Assessment

Goal: Understand what you're analyzing and gather basic information.

# File type and architecture
file binary
# ELF 64-bit LSB executable, x86-64, dynamically linked

# Strings (quick insight into functionality)
strings binary | less

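# Dependencies (linked libraries)
# Note: ldd may execute code from the binary; for untrusted samples
# prefer: readelf -d binary | grep NEEDED
ldd binary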

# Security features
checksec binary
# RELRO, Stack Canary, NX, PIE, FORTIFY

# Basic metadata
readelf -h binary  # ELF headers
objdump -p binary  # Program headers
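
When `file` or `readelf` is not available, or the result needs to feed a script, the same basic facts can be read straight from the ELF header. The following is a minimal sketch, not a replacement for `readelf`: the function names `parse_elf_ident` and `describe` are made up for illustration, and the lookup tables cover only a handful of common `e_type` and `e_machine` values.

```python
import struct

# Small subset of e_machine / e_type values from the ELF specification.
MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}
TYPES = {1: "relocatable", 2: "executable", 3: "shared object / PIE", 4: "core"}


def parse_elf_ident(header: bytes) -> dict:
    """Decode the first 20 bytes of an ELF file into readable fields."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[header[4]]        # EI_CLASS
    endian = {1: "<", 2: ">"}[header[5]]    # EI_DATA: 1 = LSB, 2 = MSB
    # e_type and e_machine are two 16-bit fields right after e_ident (16 bytes).
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": bits,
        "endianness": "little" if endian == "<" else "big",
        "type": TYPES.get(e_type, hex(e_type)),
        "machine": MACHINES.get(e_machine, hex(e_machine)),
    }


def describe(path: str) -> dict:
    """Read just enough of a file on disk to classify it."""
    with open(path, "rb") as f:
        return parse_elf_ident(f.read(20))
```

Calling `describe("binary")` on a typical 64-bit Linux executable should report the same class, byte order, and architecture that `file` prints.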

Phase 2: Static Analysis

Goal: Understand program structure without execution.

Disassembly

# Linear disassembly
objdump -d binary > disassembly.txt

# Intelligent disassembly with Ghidra
# 1. Import binary into Ghidra
# 2. Analyze with default options
# 3. Review function list, strings, cross-references

# IDA Pro (commercial but powerful)
# - Advanced decompilation
# - Cross-references
# - Graph view

Function Analysis

# Using radare2
r2 binary
[0x00001000]> aa          # Analyze all
[0x00001000]> afl         # List functions
[0x00001000]> pdf @ main  # Disassemble main
[0x00001000]> VV @ main   # Visual graph mode

# Common patterns to look for:
# - Entry point and main function
# - Vulnerable functions (strcpy, gets, sprintf)
# - Cryptographic operations
# - Network operations (socket, connect, send)
# - File operations (fopen, read, write)
# - Privilege operations (setuid, system)

Control Flow Analysis

"""
Analyze program flow:

1. Identify entry point
2. Follow execution paths
3. Identify decision points (if/else, switch)
4. Map loops and recursion
5. Identify error handling
6. Find return points

Key questions:
- What are the main code paths?
- Where does user input enter?
- What validation occurs?
- Where are dangerous operations?
"""

Phase 3: Dynamic Analysis

Goal: Observe actual program behavior during execution.

Debugging with GDB

# Start GDB
gdb ./binary

# Set breakpoints
(gdb) break main
(gdb) break *0x401234  # Specific address

# Run with arguments
(gdb) run arg1 arg2

# Examine registers
(gdb) info registers

# Examine memory
(gdb) x/10gx $rsp     # 10 eight-byte hex values at the stack pointer
(gdb) x/s 0x404000    # String at address

# Step through code
(gdb) stepi           # Step one instruction
(gdb) nexti           # Step one instruction, stepping over calls
(gdb) continue        # Continue to next breakpoint

# Display on each step
(gdb) display/i $pc   # Show current instruction
(gdb) display/x $rax  # Show RAX register

Enhanced GDB with PEDA/GEF/pwndbg

# Install PEDA
git clone https://github.com/longld/peda.git ~/peda
echo "source ~/peda/peda.py" >> ~/.gdbinit

# Or GEF (recommended)
bash -c "$(curl -fsSL https://gef.blah.cat/sh)"

# Enhanced features:
# - Color coding
# - Automatic display of registers, stack, code
# - Pattern creation/offset calculation
# - ROP gadget search
# - Heap analysis
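
The pattern creation and offset calculation feature listed above can be approximated in a few lines, which helps when none of these extensions is installed. This is a simplified sketch in the style of the classic `Aa0Aa1Aa2...` pattern; `make_pattern` and `find_offset` are names chosen for illustration and are not the commands these extensions provide.

```python
import string
from itertools import product

MAX_LEN = 26 * 26 * 10 * 3  # every upper/lower/digit triplet used once


def make_pattern(length: int) -> bytes:
    """Build a non-repeating pattern: Aa0Aa1Aa2...Aa9Ab0..."""
    if length > MAX_LEN:
        raise ValueError(f"pattern is limited to {MAX_LEN} bytes")
    out = bytearray()
    for upper, lower, digit in product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    ):
        if len(out) >= length:
            break
        out += (upper + lower + digit).encode()
    return bytes(out[:length])


def find_offset(pattern: bytes, value: int, width: int = 8) -> int:
    """Locate a little-endian register value inside the pattern (-1 if absent)."""
    return pattern.find(value.to_bytes(width, "little"))
```

Feed the pattern to the program as input, read the overwritten register after the crash (for example a saved return address), and pass that value to `find_offset` to learn how many bytes precede it.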

System Call Tracing

# Trace system calls
strace ./binary

# Trace with details
strace -v -s 1024 ./binary

# Trace specific syscalls
strace -e trace=open,openat,read,write ./binary

# Trace library calls
ltrace ./binary

# Follow child processes
strace -f ./binary
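
Raw traces get long quickly, so it can help to summarize them before reading line by line. The sketch below counts syscalls in saved `strace` output; it assumes the default one-call-per-line text format (optionally prefixed with `[pid N]` when `-f` is used) and skips lines it does not recognize, such as signal and exit markers. The name `summarize_trace` is illustrative.

```python
import re
from collections import Counter

# Matches e.g.  openat(AT_FDCWD, ...) = 3   or   [pid  4242] read(3, ...) = 0
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(")


def summarize_trace(trace: str) -> Counter:
    """Count how often each syscall name appears in strace text output."""
    counts = Counter()
    for line in trace.splitlines():
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

`summarize_trace(text).most_common(10)` gives a quick picture of whether a program is mostly doing file, network, or process work.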

Dynamic Instrumentation

# Using Frida for runtime instrumentation
import frida
import sys

def on_message(message, data):
    print(f"[*] {message}")

# Attach to process
session = frida.attach("target_process")

# JavaScript to inject
script_code = """
Interceptor.attach(Module.findExportByName(null, 'strcmp'), {
    onEnter: function(args) {
        console.log('[*] strcmp called');
        console.log('    arg1: ' + args[0].readUtf8String());
        console.log('    arg2: ' + args[1].readUtf8String());
    },
    onLeave: function(retval) {
        console.log('    return: ' + retval);
    }
});
"""

script = session.create_script(script_code)
script.on('message', on_message)
script.load()
sys.stdin.read()

Phase 4: Vulnerability Identification

Common Vulnerability Patterns:

Buffer Overflow

; Look for unsafe string operations
call strcpy     ; No bounds checking
call gets       ; Always unsafe
call sprintf    ; No bounds checking

; Check for:
; - Fixed-size buffers
; - User-controlled input
; - No size validation

Format String

; User input directly to printf
mov rdi, [user_input]
call printf     ; Dangerous if user_input has format specifiers

Use After Free

; Pattern:
call free       ; Free memory
; ... later ...
mov rax, [ptr]  ; Use freed pointer

Integer Overflow

; Look for:
; - Size calculations
; - Loop counters
; - Memory allocation sizes

imul eax, [count], 8  ; Can overflow
call malloc           ; Allocates wrong size
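
To see why the multiply above is dangerous, it helps to reproduce the 32-bit arithmetic. This sketch models the truncation to 32 bits that `imul eax, ...` performs; the function names and the element size of 8 are taken from the example for illustration only.

```python
MASK32 = 0xFFFFFFFF


def alloc_size_32(count: int, elem_size: int = 8) -> int:
    """Size a 32-bit multiply would produce: only the low 32 bits survive."""
    return (count * elem_size) & MASK32


def overflows_32(count: int, elem_size: int = 8) -> bool:
    """True when the mathematically correct size does not fit in 32 bits."""
    return count * elem_size > MASK32


# A count of 0x20000001 elements needs 0x100000008 bytes,
# but the truncated result asks malloc for only 8.
print(hex(alloc_size_32(0x20000001)))
```

The program then writes `count` elements into an 8-byte allocation, which turns the arithmetic bug into a heap overflow.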

Phase 5: Exploit Development

See skills/exploitation/exploit-dev-workflow for detailed exploitation process.

Tool Ecosystem

Disassemblers/Decompilers

Ghidra (Free)

# NSA's reverse engineering suite
# Features:
# - Decompiler (C-like output)
# - Cross-references
# - Scripting (Python/Java)
# - Collaborative analysis

# Download from: https://ghidra-sre.org/

IDA Pro (Commercial)

  • Industry standard
  • Best-in-class decompiler (Hex-Rays)
  • Extensive plugin ecosystem
  • IDA Free available with limitations

Binary Ninja (Commercial)

  • Modern UI
  • Medium-level IL
  • Good API for automation
  • Active development

radare2 (Free)

# Command-line focused
# Steep learning curve but powerful

# Basic workflow
r2 binary
[0x00001000]> aa        # Analyze
[0x00001000]> afl       # Functions
[0x00001000]> s main    # Seek to main
[0x00001000]> pdf       # Disassemble
[0x00001000]> VV        # Visual graph

Debuggers

GDB with Extensions

  • PEDA - Python Exploit Development Assistance
  • GEF - GDB Enhanced Features
  • pwndbg - Exploit development focused

WinDbg (Windows)

  • Microsoft's debugger
  • Kernel and user-mode debugging
  • Essential for Windows analysis

x64dbg (Windows)

  • Modern UI
  • Plugin support
  • Good for malware analysis

Dynamic Analysis

Frida

  • Dynamic instrumentation
  • JavaScript API
  • Cross-platform
  • Runtime modification

PIN/DynamoRIO

  • Dynamic binary instrumentation frameworks
  • Research-grade tools
  • Performance analysis

Valgrind

# Memory debugging
valgrind --leak-check=full ./binary

# Memory profiling
valgrind --tool=massif ./binary

Common Analysis Scenarios

Scenario 1: Finding Hardcoded Credentials

# Search strings
strings binary | grep -i "password\|secret\|key"

# In Ghidra:
# 1. Window -> Defined Strings
# 2. Search for interesting patterns
# 3. Check cross-references to see usage
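
The same search can be scripted when `strings` is unavailable or the hits need post-processing. This is a rough sketch of the idea: it extracts runs of printable ASCII (a simplification of what `strings` does) and keeps those matching the keywords from the grep above. The function names are illustrative.

```python
import re
from typing import List

KEYWORDS = re.compile(r"password|secret|key", re.IGNORECASE)


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def find_credential_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Filter extracted strings down to those that mention a keyword."""
    return [s for s in extract_strings(data, min_len) if KEYWORDS.search(s)]
```

For a file on disk, `find_credential_strings(open("binary", "rb").read())` returns the candidate strings to chase through cross-references.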

Scenario 2: Understanding Network Protocol

# 1. Trace network calls
strace -e trace=network ./binary

# 2. Analyze send/recv calls in disassembly
# 3. Set breakpoints on socket operations
gdb ./binary
(gdb) break send
(gdb) break recv

# 4. Capture actual packets
tcpdump -i lo -w capture.pcap

# 5. Analyze protocol structure
wireshark capture.pcap

Scenario 3: Bypassing License Check

# 1. Search for error messages
strings binary | grep -i "license\|trial\|expired"

# 2. Find string references in Ghidra
# 3. Understand check logic
# 4. Identify bypass point

# Options:
# - Patch binary (change jump condition)
# - Hook function at runtime (Frida)
# - Modify return value in debugger

Scenario 4: Extracting Encryption Keys

# Dynamic approach - hook crypto functions
"""
Interceptor.attach(Module.findExportByName('libcrypto.so', 'AES_set_encrypt_key'), {
    onEnter: function(args) {
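        // args[0] = key bytes, args[1] = key length in bits
        console.log('[*] AES key (' + args[1].toInt32() + ' bits):');
        console.log(hexdump(args[0], { length: args[1].toInt32() / 8 }));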
    }
});
"""

# Static approach - look for key initialization
# - Search for crypto constants (S-boxes, magic numbers)
# - Analyze key derivation functions
# - Check for embedded keys
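
Searching for crypto constants is easy to script. As an example, the sketch below looks for the first 16 bytes of the standard AES S-box in a binary blob; a hit suggests a table-based AES implementation is compiled in. The helper name `find_all` is illustrative, and implementations that use hardware AES instructions or compute the S-box at runtime will not match.

```python
from typing import List

# First row of the AES forward S-box (FIPS 197).
AES_SBOX_PREFIX = bytes.fromhex("637c777bf26b6fc53001672bfed7ab76")


def find_all(data: bytes, needle: bytes) -> List[int]:
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets
```

Each returned file offset can be converted to a virtual address in the disassembler, and the cross-references to that table lead to the encryption routines and, from there, to where the key is loaded.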

Practical Tips

Naming Conventions

# In disassemblers, rename variables/functions for clarity
# Bad:  FUN_00401234(local_10, DAT_00404000)
# Good: validate_input(user_buffer, key_string)

# Document as you analyze
# Add comments explaining complex logic
# Create structure definitions for data

Cross-Reference Analysis

# Follow data flow:
# 1. Find interesting data (strings, constants)
# 2. Find references (where it's used)
# 3. Understand context of use
# 4. Trace back to source

# Example: Password validation
# "Invalid password" string
#   -> Used in check_password()
#   -> Called from login()
#   -> Gets input from get_user_input()

Identifying Compiler Artifacts

; Stack canary check (GCC)
mov rax, fs:0x28
mov [rbp-0x8], rax
; ... function body ...
mov rdx, [rbp-0x8]
xor rdx, fs:0x28
je .no_corruption
call __stack_chk_fail

; C++ name mangling
_ZN6Class114memberFunctionEi  ; Class1::memberFunction(int)
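
Mangled names of this kind are a sequence of length-prefixed identifiers between `_ZN` and `E`, which makes the scope readable by hand. The sketch below decodes only that nested-name part of an Itanium C++ ABI symbol; it ignores parameter types, templates, and qualifiers, and `parse_nested_name` is an illustrative helper rather than a real demangler. For real work, `c++filt` or the disassembler's built-in demangler is the better choice.

```python
def parse_nested_name(mangled: str) -> str:
    """Decode the `_ZN<len><id>...E` scope of an Itanium-ABI mangled name."""
    if not mangled.startswith("_ZN"):
        raise ValueError("only _ZN nested names are handled in this sketch")
    pos, parts = 3, []
    while pos < len(mangled) and mangled[pos] != "E":
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        if end == pos:
            raise ValueError(f"expected a length at offset {pos}")
        length = int(mangled[pos:end])            # decimal length prefix
        parts.append(mangled[end:end + length])   # identifier of that length
        pos = end + length
    return "::".join(parts)


print(parse_nested_name("_ZN3Foo3barEi"))  # Foo::bar
```

The characters after the closing `E` encode the parameter types (`i` for `int`, `v` for no parameters), which this sketch leaves undecoded.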

Anti-Analysis Techniques

Detection:

  • Debugger detection (ptrace, IsDebuggerPresent)
  • Timing checks
  • Code obfuscation
  • Anti-disassembly tricks

Countermeasures:

# Patch debugger checks
# In GDB:
(gdb) break ptrace
(gdb) run
(gdb) return (long)0   # Make ptrace() appear to succeed
(gdb) continue

# Use stealthy debuggers
# - ScyllaHide (plugin)
# - Custom tools

# Deobfuscation
# - Symbolic execution (angr)
# - Dynamic unpacking
# - Pattern matching

Common Pitfalls

Mistake | Impact | Solution
Only static analysis | Miss runtime behavior | Combine static + dynamic
Not documenting findings | Lose context | Take detailed notes
Analyzing without goal | Waste time | Define specific objectives
Ignoring cross-references | Miss important connections | Follow all references
Not checking compiler version | Misinterpret artifacts | Identify compiler/flags used

Integration with Other Skills

  • skills/analysis/zero-day-hunting - Finding vulnerabilities in binaries
  • skills/exploitation/exploit-dev-workflow - Exploiting discovered flaws
  • skills/analysis/static-vuln-analysis - Source code analysis if available

Legal and Ethical Considerations

Authorization Required:

  • Only analyze authorized software
  • Respect license agreements
  • Don't distribute cracked software
  • Follow responsible disclosure

Legitimate Use Cases:

  • Security research with permission
  • Malware analysis for defense
  • Own software assessment
  • Educational purposes with legal samples

Success Metrics

  • Understanding program functionality
  • Identifying security vulnerabilities
  • Extracting useful intelligence
  • Creating working exploits (if authorized)
  • Comprehensive documentation

References and Further Reading

  • "Practical Reverse Engineering" by Dang et al.
  • "The IDA Pro Book" by Chris Eagle
  • "Practical Binary Analysis" by Dennis Andriesse
  • Ghidra documentation and training materials
  • Malware analysis books (for dynamic analysis techniques)
  • Assembly language references (Intel manuals, x86-64 ABI)
  • CTF write-ups for practical examples

Quick Install

/plugin add https://github.com/macaugh/super-rouge-hunter-skills/tree/main/binary-analysis

Copy and paste this command in Claude Code to install this skill

GitHub Repository

macaugh/super-rouge-hunter-skills
Path: skills/analysis/binary-analysis
