Introduction

libmagic-rs is a pure-Rust reimplementation of the libmagic library — the file-type detection engine behind the GNU file command. The library has no unsafe code in its own modules and uses memory-mapped I/O for file reads. The magic-file syntax matches GNU file closely; rules written for file should mostly work without changes.

This guide is the developer reference: the library, the rmagic CLI, and the magic-file format.

What’s in the box

The parser handles text-format magic files (Magdir/ directories and individual files), with hierarchical rules, comments, line continuations, and a parse_text_magic_file() API. Format detection picks the right loader for a given path automatically.
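To make the hierarchical rule structure concrete, here is a minimal sketch of how one line of a text magic file could be split into its fields. It is not the library's parser: `RuleLine` and `parse_rule_line` are hypothetical names, and the sketch assumes only the standard magic(5) layout (leading `>` characters for nesting depth, then offset, type, test, and message separated by whitespace). It ignores line continuations and escaped whitespace inside test values.

```rust
/// One parsed rule line (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
struct RuleLine {
    /// Nesting depth: the number of leading '>' characters.
    level: usize,
    offset: String,
    kind: String,
    test: String,
    /// Description text; may be empty.
    message: String,
}

/// Splits one magic-file line into fields.
/// Returns None for blank lines, '#' comments, "!:" directives,
/// and lines with fewer than three fields.
fn parse_rule_line(line: &str) -> Option<RuleLine> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') || line.starts_with("!:") {
        return None;
    }
    // '>' is a single ASCII byte, so the count is also a valid byte index.
    let level = line.chars().take_while(|&c| c == '>').count();
    let mut fields = line[level..].split_whitespace();
    let offset = fields.next()?.to_string();
    let kind = fields.next()?.to_string();
    let test = fields.next()?.to_string();
    // Everything after the test is the message; runs of whitespace collapse.
    let message = fields.collect::<Vec<_>>().join(" ");
    Some(RuleLine { level, offset, kind, test, message })
}
```

With this sketch, `parse_rule_line(">4 byte 1 32-bit")` yields a level-1 rule with offset `4`, type `byte`, test `1`, and message `32-bit`. A real parser would then attach each rule to the nearest preceding rule one level up.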

The evaluator handles offset resolution (absolute, from-end, relative, indirect), the libmagic type primitives, the operator set (equality, inequality, comparisons, bitwise, any-value), cross-type integer coercion, and per-rule error recovery so a single bad rule doesn’t abort the whole match. A !:strength directive lets rules adjust their priority.
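The four offset kinds can be illustrated with a small resolver. This is a sketch under stated assumptions, not the evaluator's actual code: the `Offset` enum and `resolve_offset` are hypothetical, relative offsets are taken to be measured from the end of the previous match, and the indirect case is simplified to a 4-byte little-endian pointer with no adjustment.

```rust
/// Hypothetical offset specification (names are assumptions, not the library's API).
#[derive(Debug, Clone, Copy)]
enum Offset {
    /// Byte position from the start of the buffer.
    Absolute(usize),
    /// Number of bytes back from the end of the buffer.
    FromEnd(usize),
    /// Signed distance from the end of the previous match.
    Relative(isize),
    /// Position of a 4-byte little-endian pointer to follow.
    Indirect(usize),
}

/// Resolves `spec` to a position inside `buf`.
/// Returns None when the arithmetic overflows or the result is out of bounds.
fn resolve_offset(spec: Offset, buf: &[u8], last_match_end: usize) -> Option<usize> {
    let pos = match spec {
        Offset::Absolute(n) => n,
        Offset::FromEnd(n) => buf.len().checked_sub(n)?,
        Offset::Relative(delta) => last_match_end.checked_add_signed(delta)?,
        Offset::Indirect(at) => {
            // Read the pointer with bounds checking; a short read yields None.
            let b = buf.get(at..at.checked_add(4)?)?;
            u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize
        }
    };
    // Only positions that point at an existing byte are usable.
    if pos < buf.len() { Some(pos) } else { None }
}
```

Returning `Option` rather than panicking fits the per-rule error recovery described above: a rule whose offset cannot be resolved is skipped, and evaluation continues with the next rule.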

The rmagic CLI takes one or more files (or stdin), produces text or JSON output, honors a per-file timeout, and supports --use-builtin for a no-external-files mode. Built-in rules ship for ELF, PE/DOS, ZIP, TAR, GZIP, JPEG, PNG, GIF, BMP, and PDF.

Smaller pieces worth knowing about: opt-in MIME type mapping, semantic tag extraction from match descriptions, confidence scoring based on hierarchy depth, three configuration presets (default(), performance(), comprehensive()) with security-bound validation, and structured error types covering parse, evaluation, config, file, and timeout failures.

The project is under active development. CI runs ~1,200 tests on every change, clippy runs in pedantic mode with warnings treated as errors, and no unsafe code is allowed in library modules.

What’s next

Indirect offsets with complex pointer dereferencing patterns, parallel evaluation across multiple files, and broader format coverage are on the roadmap. Binary .mgc support is intentionally out of scope — the project is text-magic only, OpenBSD-style.

Architecture at a glance

flowchart LR
    MF[Magic File] --> P[Parser]
    P --> AST[AST]
    AST --> E[Evaluator]
    TF[Target File] --> FB[File Buffer]
    FB --> E
    E --> R[Results]
    R --> F[Formatter]

    style MF fill:#e3f2fd
    style TF fill:#e3f2fd
    style F fill:#c8e6c9

Magic files become an AST at load time. The evaluator runs that AST against a file buffer at evaluation time. Output formatters turn results into text or JSON.
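The load-time and evaluation-time split can be sketched in a few lines. Everything here is hypothetical and heavily simplified: the `Rule` node only supports matching literal bytes at an absolute offset, and `evaluate` and `format_text` are illustrative names rather than the library's API.

```rust
/// Hypothetical AST node: match `magic` at `offset`, then try the children.
struct Rule {
    offset: usize,
    magic: Vec<u8>,
    message: String,
    children: Vec<Rule>,
}

/// Walks the rule tree against a file buffer, collecting the message of
/// every rule that matches. Children are only tried when the parent matched.
fn evaluate(rules: &[Rule], buf: &[u8], out: &mut Vec<String>) {
    for rule in rules {
        let end = match rule.offset.checked_add(rule.magic.len()) {
            Some(end) => end,
            None => continue, // offset overflow: skip this rule, keep going
        };
        // `get` returns None for out-of-range reads, so short files never panic.
        if buf.get(rule.offset..end) == Some(rule.magic.as_slice()) {
            out.push(rule.message.clone());
            evaluate(&rule.children, buf, out);
        }
    }
}

/// Minimal text formatter: joins matched messages with single spaces.
fn format_text(parts: &[String]) -> String {
    parts.join(" ")
}
```

Given a top-level rule for the bytes `\x7fELF` at offset 0 with child rules on byte 4 (`1` for 32-bit, `2` for 64-bit), a buffer starting with `\x7fELF\x02` produces the text `ELF 64-bit`.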

How this guide is organized

The chapters move from end-user concerns toward internals: getting started and CLI usage first, then library integration and configuration, then architecture and per-module reference, then advanced topics (magic format, testing, performance), then contribution and release process. The appendices cover the API reference, full CLI reference, and example magic rules.

Getting help

For bugs and feature requests, use GitHub Issues. For open-ended questions, use GitHub Discussions. Generated API documentation is at docs.rs/libmagic-rs.

Contributing

Contributions are welcome. See CONTRIBUTING.md and the Development Setup guide.

License

Apache 2.0 — see the LICENSE file.

Acknowledgments

This is a clean-room reimplementation. The original libmagic is the work of Ian F. Darwin and Christos Zoulas, with many other contributors over the decades. We have benefited enormously from the magic-file format they shaped, and from the corpus of rules that ships with file.