Rust in lib-common, Part 1: Integrating Rust in a Waf-based C Build System

This is Part 1 of a multi-part series on integrating Rust into a large C codebase.

  • Part 1: Integrating Rust in a Waf-based C Build System
  • Part 2: First Rust Component: Rewriting farchc (WIP)

Introduction

lib-common is a C library developed by Intersec. It provides core utilities (strings, containers, memory management), networking (HTTP, RPC, event loop), serialization through IOP (Intersec Object Packer, similar to Protocol Buffers), and more. It targets Linux environments and has been in active development for many years.

Like many mature C codebases, lib-common faces a familiar issue: the code works, it is battle-tested, but certain classes of bugs (use-after-free, buffer overflows, data races) remain a perpetual risk. Rust addresses these problems at compile time.

No one can rewrite a library of this size overnight. The practical question is: how do we progressively introduce Rust into a project that already has a working build system, hundreds of C source files, and deep inter-module dependencies?

This article walks through the build-system integration layer that makes it possible.


What is Waf?

Waf is a Python-based build system built around task-based scheduling. Build logic is written in Python, and tasks are scheduled based on a dependency graph computed at build time.

lib-common uses Waf with a set of build profiles controlled by the P environment variable:

P=debug   waf configure && waf   # Debug build
P=release waf configure && waf   # Release build
P=asan    waf configure && waf   # AddressSanitizer build
P=tsan    waf configure && waf   # ThreadSanitizer build

The default workflow is simply:

waf configure
waf

Profiles affect compiler flags, optimization levels, whether sanitizers are enabled, etc.


What is Cargo?

Cargo is Rust’s build system and package manager. It handles dependency resolution, feature flags, conditional compilation, and proc-macro crate compilation. Building Rust code without it is impractical: Cargo is the standard in Rust, and working around it costs far more than using it.

lib-common’s Rust workspace is configured in Cargo.toml at the repository root. (Cargo.toml:21-37)

Each Rust component is a separate crate within the workspace. Cargo profiles mirror the Waf profiles. (Cargo.toml:43-74)

Because the Rust code links into C binaries, a panic that unwinds across the FFI boundary would be undefined behavior. Setting panic = "abort", so that a panic terminates the process immediately, is the simplest choice.

The PIC (Position Independent Code) variants are handled via suffixed profiles. (Cargo.toml:78-106)

This allows building the same crate twice under a single Cargo workspace: once for static linking into an executable, once for shared libraries, with different code generation options.


The Core Challenge

We now have two build systems in one repository:

  • Waf orchestrates C compilation. It knows about C header dependencies, include paths, library link order, and build profiles.
  • Cargo handles Rust crate resolution, feature unification, proc-macro compilation, and Rust-specific toolchain concerns.

Neither can replace the other. Waf cannot resolve Cargo dependencies or run proc macros. Cargo cannot understand Waf’s task dependency graph or compile C code the way lib-common needs.

The solution is a bridge: Waf remains the top-level orchestrator, and it invokes Cargo as a sub-build. The tricky part is passing the right C environment (include paths, defines, link flags) into Cargo’s build scripts, and ensuring correct task ordering so that C libraries are built before the Rust code that depends on them, and vice versa.


The Solution: build/waftools/rust.py

The integration lives in a single Waf tool file: build/waftools/rust.py. It defines Waf task classes that wrap Cargo invocations, manages the data flow between the two build systems, and maps Cargo’s build outputs to the locations where Waf expects them.

Waf as Orchestrator: the CargoBuildBase Task

The CargoBuildBase task class is responsible for invoking Cargo from Waf. (build/waftools/rust.py:187-207)

The run() method captures the entire build protocol in three steps:

  1. make_waf_build_env(): Serialize the C build environment into a JSON file that Cargo build scripts can read.
  2. run_cargo(): Invoke cargo build with the right profile and package.
  3. make_hardlinks(): Link Cargo’s output binaries into the source directory, so that Rust binaries end up in the same locations as C binaries.

always_run = True is important. Waf does file fingerprinting to decide whether a task needs to re-run. But Cargo has its own caching and already knows when to skip recompilation. Letting Waf also cache would lead to stale builds, and duplicating Cargo’s complex dependency resolution logic in Python would be impractical. So we always invoke Cargo and let it decide.
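The three-step protocol can be sketched as a plain Python class. This is only an illustration of the control flow, not the actual rust.py code: it does not import waflib, the class name CargoBuildSketch is hypothetical, and the step bodies are placeholders that merely record the order in which they are called.

```python
class CargoBuildSketch:
    """Hypothetical stand-in for a Waf task class that wraps a Cargo build."""

    # With this attribute set on a Waf task class, the task is executed on
    # every build instead of being skipped by signature comparison.
    always_run = True

    def __init__(self):
        self.steps = []  # records the call order, for illustration only

    def make_waf_build_env(self):
        self.steps.append("make_waf_build_env")

    def run_cargo(self):
        self.steps.append("run_cargo")
        return 0  # pretend that Cargo succeeded

    def make_hardlinks(self):
        self.steps.append("make_hardlinks")

    def run(self):
        # 1. Export the C environment, 2. build, 3. expose the outputs.
        self.make_waf_build_env()
        ret = self.run_cargo()
        if ret != 0:
            return ret  # a non-zero return value marks the task as failed
        self.make_hardlinks()
        return 0


task = CargoBuildSketch()
status = task.run()
```

The early return keeps stale outputs from being linked into place when Cargo fails.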

The JSON Bridge: make_waf_build_env()

Cargo build scripts (build.rs) run in an isolated environment. They do not have access to Waf’s internal state. The solution is a JSON file that carries the C build environment across the boundary. (build/waftools/rust.py:209-259)

This is the data contract between the two build systems. Every piece of information that a Cargo build script might need (include directories, C compiler defines, linker flags, paths to dependent static libraries) is serialized into a single JSON file at .waf-cargo-build/waf_build_env.json (or .waf-cargo-build-pic/ for PIC builds).

If the JSON content has not changed since the last build, the file is not rewritten. Since Cargo build scripts use cargo::rerun-if-changed= to watch this file, an unchanged file means Cargo will skip the build script entirely. This avoids cascading rebuilds when only unrelated C files have changed. (build/waftools/rust.py:261-277)
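The write-only-if-changed behaviour can be sketched as follows. This is a minimal illustration under assumptions, not the code from rust.py: the function name write_json_if_changed and the JSON keys "includes" and "defines" are hypothetical. The point is that leaving the file untouched preserves its modification time, which is what Cargo compares for rerun-if-changed.

```python
import json
import os
import tempfile


def write_json_if_changed(path, data):
    """Write `data` as JSON to `path` only if the content differs.

    Returns True when the file was (re)written, False when left untouched.
    """
    # sort_keys gives a stable serialization, so equal data means equal text.
    new_content = json.dumps(data, indent=2, sort_keys=True)
    try:
        with open(path, encoding="utf-8") as f:
            if f.read() == new_content:
                return False  # unchanged: keep the old modification time
    except FileNotFoundError:
        pass  # first build: fall through and create the file
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(new_content)
    return True


# Usage with made-up environment values.
demo_dir = tempfile.mkdtemp()
env_path = os.path.join(demo_dir, ".waf-cargo-build", "waf_build_env.json")
env = {"includes": ["/src/include"], "defines": ["NDEBUG"]}  # hypothetical keys

first = write_json_if_changed(env_path, env)    # file created
second = write_json_if_changed(env_path, env)   # same content, not rewritten
third = write_json_if_changed(env_path, {**env, "defines": []})  # rewritten
```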

Running Cargo

The run_cargo() method constructs the cargo build command. (build/waftools/rust.py:279-331)

Running ASAN or TSAN on Rust code requires rebuilding the Rust standard library with sanitizer instrumentation. The -Zbuild-std=panic_abort,std flag achieves this, but it is an unstable feature. Rather than requiring a nightly toolchain, the code sets RUSTC_BOOTSTRAP=1 to unlock unstable flags on the stable compiler. This is a pragmatic trade-off: nightly toolchains introduce instability in CI, whereas RUSTC_BOOTSTRAP, although officially discouraged, is a workaround used by projects that need specific unstable features on a stable compiler.
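A rough sketch of how such a command line could be assembled is shown below. It only builds the argument list and environment; it does not run Cargo. The function name, the profile names, the target triple, and the use of RUSTFLAGS with -Zsanitizer are assumptions made for illustration; the real run_cargo() may pass these settings differently.

```python
import os


def cargo_build_command(package, profile, sanitizer=None, target=None,
                        base_env=None):
    """Return (argv, env) for a `cargo build` call; nothing is executed."""
    env = dict(os.environ if base_env is None else base_env)
    argv = ["cargo", "build", "--profile", profile, "--package", package]
    if sanitizer is not None:
        if target is None:
            # Cargo requires --target whenever -Zbuild-std is used.
            raise ValueError("-Zbuild-std requires an explicit --target")
        # Rebuild the standard library so that it is instrumented too.
        argv += ["-Zbuild-std=panic_abort,std", "--target", target]
        # Unlock -Z flags on a stable toolchain.
        env["RUSTC_BOOTSTRAP"] = "1"
        # Assumption: the sanitizer is selected through RUSTFLAGS.
        env["RUSTFLAGS"] = f"-Zsanitizer={sanitizer}"
    return argv, env


# Regular build: no unstable flags involved.
release_argv, release_env = cargo_build_command(
    "libcommon-core", "release", base_env={})

# AddressSanitizer build (hypothetical "asan" Cargo profile).
asan_argv, asan_env = cargo_build_command(
    "libcommon-core", "asan", sanitizer="address",
    target="x86_64-unknown-linux-gnu", base_env={})
```

Keeping RUSTC_BOOTSTRAP out of the regular profiles limits the exposure to unstable behaviour to the sanitizer builds.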

The Hard-linking Strategy

Cargo writes its output to .cargo/target/<profile>/, but Rust binaries need to sit in the source build directory alongside C binaries. Rather than copying files, which would waste time and disk space, the integration uses hard links. (build/waftools/rust.py:341-350)

Hard links are preferable to symbolic links here because Waf’s dependency tracking examines file metadata (timestamps, sizes). A hard link shares the same inode as the original, so the metadata is always consistent. A symlink would add an indirection that could confuse timestamp-based checks.
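The idea can be illustrated with a few lines of Python. This is a sketch, not the make_hardlinks() implementation: the helper name hardlink_output and the file names are made up, and a temporary directory stands in for the repository.

```python
import os
import tempfile


def hardlink_output(cargo_output, destination):
    """Make `destination` a hard link to `cargo_output`, replacing any old file."""
    os.makedirs(os.path.dirname(destination), exist_ok=True)
    try:
        os.unlink(destination)  # os.link() fails if the destination exists
    except FileNotFoundError:
        pass
    os.link(cargo_output, destination)


# Fake Cargo output inside a temporary tree.
root = tempfile.mkdtemp()
built = os.path.join(root, ".cargo", "target", "release", "libdemo.a")
os.makedirs(os.path.dirname(built))
with open(built, "wb") as f:
    f.write(b"fake archive")

linked = os.path.join(root, "rust", "demo", "libdemo.a")
hardlink_output(built, linked)
hardlink_output(built, linked)  # running it again is harmless

# Both paths now refer to the same inode, hence the same size and timestamps.
same_inode = os.stat(built).st_ino == os.stat(linked).st_ino
```

Hard links only work within a single filesystem, which holds here because the Cargo target directory lives inside the repository.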

Task Ordering: rust_add_dep_task

When a Rust crate depends on a C static library (or another Rust crate that produces one), Waf needs to ensure the dependency is built first. This is handled by the rust_add_dep_task method, which runs after Waf’s standard process_use step. (build/waftools/rust.py:493-539)

When a Rust crate depends on another Rust crate, Cargo handles the linking internally, not Waf. So the dependent Rust library is removed from Waf’s STLIB and LIB lists. If it were left in, Waf would try to pass it to the linker a second time, causing duplicate symbol errors.
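The filtering step can be pictured like this. It is a simplified sketch: a plain dictionary stands in for the Waf environment, and the function and library names are hypothetical.

```python
def drop_rust_libs(env, rust_libs):
    """Remove libraries that Cargo links itself from the Waf link lists."""
    for key in ("STLIB", "LIB"):
        env[key] = [lib for lib in env.get(key, []) if lib not in rust_libs]
    return env


# Made-up link lists as they might look after process_use.
link_env = {
    "STLIB": ["common-minimal", "common-core-rs"],
    "LIB": ["pthread", "common-core-rs"],
}
drop_rust_libs(link_env, {"common-core-rs"})
```

After the call, only the C libraries remain in the lists that Waf hands to the linker.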


The waf-cargo-build Crate

On the Rust side, the waf-cargo-build crate is the counterpart to rust.py. It is used in build.rs scripts to read the JSON environment file produced by make_waf_build_env() and emit the correct Cargo build directives.

WafBuildEnvJson

The JSON file is deserialized into a Rust struct with Serde that mirrors exactly what rust.py writes. (rust/waf-cargo-build/lib.rs:214-225)

WafBuild::read_build_env()

The main entry point reads the JSON and determines the different profiles and configurations. (rust/waf-cargo-build/lib.rs:258-298)

Integrating Bindgen

The generate_bindings() method wraps bindgen to generate Rust FFI (Foreign Function Interface) bindings from C headers. It solves a specific problem: when multiple Rust crates in the workspace bind to overlapping sets of C headers, each of them would otherwise generate its own copy of the same type definitions. (rust/waf-cargo-build/lib.rs:387-414)

Each crate writes a bindings_items.json listing every C symbol it binds. When a downstream crate generates its own bindings, it reads the JSON files from all upstream crates and blocks those symbols, preventing duplicates. The downstream crate then re-exports its dependencies’ bindings via pub use.

ⓘ Note

We use a custom fork of Bindgen that adds a ParseCallbacks::allow_or_block_item() method. Upstream Bindgen only supports regex-based allowlists and blocklists via RegexSet, which has two limitations: regexes lack the flexibility needed for dynamic allow/block decisions, and RegexSet scales linearly with the number of entries, which becomes a problem when blocking thousands of symbols from upstream crates. The custom callback lets each crate use its own data structure (e.g., a HashSet) to efficiently decide which items to generate, which is exactly what the duplicate-symbol blocking mechanism described above relies on.
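The bookkeeping behind this mechanism is independent of the language. The real implementation lives in Rust inside the bindgen callback; the Python sketch below only illustrates the set logic. It assumes that each bindings_items.json is a flat JSON list of symbol names, which may not match the actual file layout, and the helper names and symbol names are illustrative.

```python
import json
import os
import tempfile


def load_blocked_symbols(upstream_dirs):
    """Collect every symbol already bound by the upstream crates."""
    blocked = set()
    for directory in upstream_dirs:
        path = os.path.join(directory, "bindings_items.json")
        with open(path, encoding="utf-8") as f:
            blocked.update(json.load(f))
    return blocked


def should_generate(symbol, blocked):
    """Per-item decision: a hash-set lookup instead of a regex scan."""
    return symbol not in blocked


# An upstream crate that has already bound two C types (example names).
upstream = tempfile.mkdtemp()
with open(os.path.join(upstream, "bindings_items.json"), "w",
          encoding="utf-8") as f:
    json.dump(["lstr_t", "sb_t"], f)

blocked = load_blocked_symbols([upstream])
candidates = ["lstr_t", "sb_t", "iop_struct_t"]
generated = [name for name in candidates if should_generate(name, blocked)]
```

Only the symbols that no upstream crate has bound end up in the downstream crate’s own bindings.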

The C environment from the JSON bridge is forwarded to bindgen as clang arguments. (rust/waf-cargo-build/lib.rs:429-434)

This ensures that bindgen sees the same preprocessor state as the C compiler, which is essential for correct type generation.
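As an illustration of the translation, the sketch below turns a build environment into clang arguments. The actual code is Rust and feeds bindgen directly; here the logic is shown in Python, and the keys "includes" and "defines" are assumed names for the JSON fields.

```python
def clang_args_from_env(build_env):
    """Translate include paths and defines into clang command-line flags."""
    args = []
    for include_dir in build_env.get("includes", []):
        args.append(f"-I{include_dir}")
    for define in build_env.get("defines", []):
        args.append(f"-D{define}")
    return args


# Made-up environment as it could be read from waf_build_env.json.
clang_args = clang_args_from_env({
    "includes": ["/src", "/src/include"],
    "defines": ["NDEBUG", "FOO=1"],
})
```

Passing the same -I and -D flags that the C compiler receives keeps #ifdef branches and struct layouts identical on both sides.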

To reduce boilerplate in every Rust crate that uses C bindings, two proc macros are provided:

#[bindings_mod]

Bindgen output triggers hundreds of Clippy and compiler warnings (non-snake-case names, missing docs, etc.). Rather than suppressing them manually in every crate, the #[bindings_mod] attribute macro wraps the module with a comprehensive #[allow(...)] block. (rust/waf-cargo-build-macros/lib.rs:47-72)

include_bindings!()

The include_bindings!() macro expands to an include!() call pointing to the generated bindings file. (rust/waf-cargo-build-macros/lib.rs:98-103)

A typical usage in a Rust crate looks like:

#[waf_cargo_build::bindings_mod]
pub mod bindings {
    pub use libcommon::bindings::*;
    waf_cargo_build::include_bindings!();
}

This re-exports all bindings from libcommon (so downstream code has a single import path) and adds the current crate’s own generated bindings, all wrapped in the warning suppression attribute.


The deps-workspace-hack Pattern

When a Cargo workspace contains multiple crates, the same dependency can be compiled multiple times with different feature flags. For example, if crate A depends on serde with the derive feature and crate B depends on serde without it, building A and B in separate Cargo invocations yields two different feature sets, so Cargo may compile serde twice. This wastes time and can cause subtle issues.

The deps-workspace-hack pattern prevents this. It is a crate that depends on every workspace dependency, forcing feature unification across the entire workspace. The Cargo.toml for this crate is auto-generated from a template. (rust/deps-workspace-hack/Cargo.toml.tpl)

During waf configure (and before each build), the generation function reads the root Cargo.toml, extracts all workspace dependencies, and substitutes them into the template. (build/waftools/rust.py:90-141)

The result is a Cargo.toml that pulls in every workspace dependency in both [dependencies] and [build-dependencies]. The crate itself is empty; its only purpose is to force feature unification.

Every cargo build invocation includes --package deps-workspace-hack alongside the actual target package. This ensures that even when building a single crate, Cargo sees the full set of features and compiles each dependency only once.


Putting It All Together: Adding a Rust Crate

With all the plumbing in place, here is what it looks like to add a new Rust crate that binds to C code. We will use libcommon-core as a concrete example: it is the lowest-level Rust crate in the workspace, providing bindings to lib-common’s core C library.

1. Register the Crate in the Cargo Workspace

Add the new crate to the workspace members in the root Cargo.toml: (Cargo.toml:26)

[workspace]
members = [
    "rust/deps-workspace-hack",
    "rust/libcommon-core",
    # ...
]

2. Write the Crate’s Cargo.toml

The crate needs waf-cargo-build as both a build-dependency (for the build.rs script) and a regular dependency (for the proc macros used in source code): (rust/libcommon-core/Cargo.toml)

[package]
name = "libcommon-core"
edition.workspace = true
publish.workspace = true

[lints]
workspace = true

[lib]
path = "lib.rs"
crate-type = ["lib", "staticlib"]

[build-dependencies]
waf-cargo-build = { path = "../waf-cargo-build" }

[dependencies]
waf-cargo-build = { path = "../waf-cargo-build" }

crate-type = ["lib", "staticlib"] tells Cargo to produce both a Rust library (for other Rust crates to depend on) and a C-compatible static library (.a file) for linking into C binaries.

3. Write the build.rs

The build script reads the Waf environment and generates C bindings: (rust/libcommon-core/build.rs)

use std::error;
use waf_cargo_build::WafBuild;

fn main() -> Result<(), Box<dyn error::Error>> {
    let waf_build = WafBuild::read_build_env()?;

    waf_build.print_cargo_instructions();
    waf_build.generate_bindings(|builder| {
        let mut builder = builder;

        builder = builder.header("wrapper.h");

        Ok(builder)
    })?;

    Ok(())
}

read_build_env() loads the JSON file written by Waf, print_cargo_instructions() emits the cargo:rustc directives, and generate_bindings() runs bindgen with the C include paths and defines from the Waf environment. The wrapper.h header is the entry point for all C declarations that should be exposed to Rust.

4. Include the Bindings in Source Code

In lib.rs, the generated bindings are included with the proc macros: (rust/libcommon-core/lib.rs)

#[waf_cargo_build::bindings_mod]
pub mod bindings {
    waf_cargo_build::include_bindings!();
}

5. Declare the Waf Target

Finally, a wscript_build file tells Waf about the Rust crate and its C dependencies: (rust/libcommon-core/wscript_build)

ctx(target='libcommon-core-rs', features='rust',
    cargo_package='libcommon-core',
    use=['libcommon-minimal'],
)

  • target='libcommon-core-rs' declares the name of the target in Waf.
  • features='rust' activates the Rust integration from rust.py.
  • cargo_package='libcommon-core' maps the Waf target name to the Cargo package name.
  • use=['libcommon-minimal'] lists the Waf targets this crate depends on (here, the C library libcommon-minimal). Waf will ensure it is built first and that its include paths, defines, and link flags are passed through the JSON bridge into build.rs.

That’s it. Five files, and the new Rust crate is fully integrated and ready to be used in both build systems.


What’s Next

This article covered the plumbing: how Waf and Cargo coexist, how the C build environment flows into Rust build scripts, and how task ordering and binary placement work.

In Part 2, we will look at the first real Rust component built on this foundation: farchc, a rewrite of lib-common’s file archive compiler. It is a concrete example of replacing a C tool with Rust while maintaining full compatibility with the existing C codebase.


Disclaimer: Designed by hand, written jointly with an LLM, reviewed by humans.

Nicolas Pauss