·5 min read·Mohamed Abdullahi

debin - A declarative binary parser for Python

debin is a fast, declarative binary parser for Python. Define binary formats using dataclasses and get type-safe, Rust-powered parsing with zero boilerplate.


If you work with binary data, you know how painful parsing can be. Reverse engineering file formats, decoding game assets, or inspecting packets usually means manual offsets, brittle slicing, and unreadable code.

I wanted something better. So I built debin.

What Is Binary Parsing?

Binary parsing converts raw bytes into structured data. Instead of a wall of hex, you define fields and structure such as headers, payloads, and conditional sections.
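To make that concrete, here is a minimal sketch using only the standard library. The record layout (a `DEMO` magic, a version, a flags field, and an optional name) is made up for illustration, and `parse_record` is a hypothetical helper, not part of any library.

```python
import struct

# Hypothetical record layout, invented for this example:
#   magic     4 bytes   b"DEMO"
#   version   uint16    little-endian
#   flags     uint16    little-endian; bit 0 set means "a name follows"
#   name_len  uint8     present only when bit 0 of flags is set
#   name      name_len bytes of ASCII
HEADER = struct.Struct("<4sHH")


def parse_record(raw: bytes) -> dict:
    """Turn raw bytes into a dict with a header and an optional name."""
    magic, version, flags = HEADER.unpack_from(raw, 0)
    if magic != b"DEMO":
        raise ValueError("not a DEMO record")

    record = {"version": version, "flags": flags, "name": None}
    offset = HEADER.size

    if flags & 1:  # conditional section: only read the name if the flag says so
        (name_len,) = struct.unpack_from("<B", raw, offset)
        offset += 1
        record["name"] = raw[offset:offset + name_len].decode("ascii")
    return record


raw = b"DEMO" + struct.pack("<HH", 2, 1) + bytes([5]) + b"hello"
record = parse_record(raw)
print(record)
```

Even for a format this small, the offsets and the conditional read have to be tracked by hand.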

Binary data is everywhere, but parsing it in Python is often unpleasant.

So how can we make it simple in a language as approachable as Python?

A Tour of Existing Tools (This isn't new)

Before creating debin, I tried a few existing libraries:

  • struct - Python’s built-in option. Fast and native, but primitive. You're on your own for everything beyond the basics.
  • bitstring - Useful for parsing at the bit level, but too low-level and verbose for structured formats. Much like struct.
  • construct - A powerful declarative and symmetrical parser and builder for binary data.
  • caterpillar - Loved the creativity, but it leans into a DSL-like syntax. It's clever, but doesn’t feel like Python anymore.

My Design Choice

debin keeps everything native to Python. You use dataclasses, type hints, and decorators. No DSL, no format strings, and no manual offsets.

It was heavily inspired by binrw, a Rust declarative binary parser. binrw is fast and elegant, but Rust has a steep learning curve. debin brings that model to Python while keeping performance high using Rust internally.

Introducing debin 🧙

debin lets you define binary formats using simple dataclasses. It generates fast, type-safe parsers automatically.

Typical parsing with struct:

import struct
 
# Read the whole file; the context manager closes it afterwards
with open("object.bin", "rb") as file:
    data = file.read()
 
x, y, z, object_id = struct.unpack("<fffI", data)
 
obj = {
    "position": {"x": x, "y": y, "z": z},
    "id": object_id
}

The problems:

  • Format strings are fragile

  • No structure or type safety

  • Hard to scale or reuse
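As a small illustration of the first point, the sketch below builds the same 16-byte record as the example above in memory and then unpacks it with two slightly wrong format strings. The values are made up for the demonstration.

```python
import struct

# The 16-byte record from the example above: three float32 values and a uint32 id.
data = struct.pack("<fffI", 1.0, 2.0, 3.0, 7)

# One missing character changes the expected size, so unpack raises struct.error.
try:
    struct.unpack("<ffI", data)
    size_error = None
except struct.error as exc:
    size_error = str(exc)

# Wrong byte order but the same size: no error, just a silently wrong value.
*_, wrong_id = struct.unpack(">fffI", data)

print(size_error)
print(wrong_id)
```

The first mistake at least fails loudly. The second one parses without complaint and hands back an id that was never in the file.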

The same example in debin:

from debin import *
 
 
@debin
class Vec3:
    x: float32
    y: float32
    z: float32
 
@debin
class Object:
    position: Vec3
    id: uint32
 
with open("object.bin", "rb") as file:
    buffer = file.read()

# Parse as little-endian, matching the "<" in the struct version
obj = Object.read_le(buffer)

No offsets or unpacking. Just describe the format.

It Gets Even Better

debin also supports:

  • Conditional fields

  • Variable length lists

  • Dynamic parsing logic

  • Nested structures

Example PNG parser:

 
from dataclasses import field
from typing import List

from debin import *
from debin.helpers import until_eof
 
@debin(magic=b"\x89PNG\r\n\x1a\n")
class PNGSignature:
    pass  # Magic-only section
 
@debin
class PNGChunk:
    length: uint32
    type: bytes4
    data: List[uint8] = field(metadata={"count": 'length'})
    crc: uint32
 
@debin
class PNGFile:
    signature: PNGSignature
    chunks: List[PNGChunk] = field(metadata={"parse_with": until_eof})

Define the structure and debin handles the parsing.
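For comparison, here is roughly what that declaration replaces when written by hand with only the standard library. This is a sketch, not debin's implementation; `make_chunk` and `iter_chunks` are hypothetical helpers named for this example, and the stream is built in memory rather than read from a real image.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one chunk: length, type, data, then a CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(buffer: bytes):
    """Yield (type, data, crc) for each chunk until the buffer is exhausted."""
    if buffer[:8] != PNG_MAGIC:
        raise ValueError("missing PNG signature")
    offset = 8
    while offset < len(buffer):
        # length and type are fixed-size; data length depends on the length field
        length, chunk_type = struct.unpack_from(">I4s", buffer, offset)
        offset += 8
        data = buffer[offset:offset + length]
        offset += length
        (crc,) = struct.unpack_from(">I", buffer, offset)
        offset += 4
        yield chunk_type, data, crc


# A tiny stream with a header chunk and an end chunk (not a displayable image).
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
png = PNG_MAGIC + make_chunk(b"IHDR", ihdr) + make_chunk(b"IEND", b"")
chunks = list(iter_chunks(png))
print([chunk_type for chunk_type, _, _ in chunks])
```

PNG stores its integers big-endian, hence the ">" in every format string. Each offset update here is a place where a hand-written parser can drift out of sync with the file.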

Want to learn more or try it for yourself? 🧙

pip install debin

💻 GitHub: https://github.com/maxcabd/debin

📦 PyPI: https://pypi.org/project/debin/

Issues, feedback, and PRs are always welcome.