Been wrestling with a pathfinding bottleneck in my Godot 4 project. Started in GDScript (obviously a mistake at scale), then ported to C#, which helped but wasn't enough. I'm running A* over a dense nav graph with ~600 nodes and dynamic edge weights that recalculate every few frames during combat. Under load, C# was still eating 2–3ms per query, which was blowing my frame budget on lower-end targets.
So I went down the Rust + GDExtension rabbit hole. The pitch: compile a Rust library with pure data logic, expose it via a C ABI, bind it through GDExtension. No GC pressure, no runtime overhead, deterministic performance. In theory.
The binding surface I ended up with:
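
    // `GraphSnapshot`, `astar`, and `MAX_PATH_LEN` are defined elsewhere in the crate.
    // The caller must pass a valid snapshot pointer and an `out_path` buffer
    // with room for at least `MAX_PATH_LEN` entries.
    // Returns 0 on success and -1 when no path exists.
    #[no_mangle]
    pub extern "C" fn find_path(
        graph: *const GraphSnapshot,
        start: u32,
        goal: u32,
        out_path: *mut u32,
        out_len: *mut usize,
    ) -> i32 {
        let graph = unsafe { &*graph };
        match astar(graph, start, goal) {
            Some(path) => {
                // Paths longer than MAX_PATH_LEN are truncated to fit the buffer.
                let len = path.len().min(MAX_PATH_LEN);
                unsafe {
                    std::ptr::copy_nonoverlapping(path.as_ptr(), out_path, len);
                    *out_len = len;
                }
                0
            }
            None => -1,
        }
    }

The GDExtension wrapper serializes the graph into a flat struct before each query. I was worried that step would eat the gains, but since I only rebuild the snapshot when the graph actually changes, not every frame, the overhead ends up negligible in practice.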
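To make the "flat struct, rebuilt only on change" idea concrete, here's a minimal sketch. It is not the actual wrapper code: `FlatGraph`, `SnapshotCache`, and the compressed-sparse-row layout are all assumptions made for illustration, since the real `GraphSnapshot` layout isn't shown above.

```rust
/// Hypothetical flat graph layout (compressed sparse row).
/// Every name here is an assumption, not the real `GraphSnapshot`.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct FlatGraph {
    /// offsets[n]..offsets[n + 1] indexes node n's outgoing edges.
    pub offsets: Vec<u32>,
    /// Target node of each edge.
    pub targets: Vec<u32>,
    /// Cost of each edge, parallel to `targets`.
    pub weights: Vec<f32>,
}

impl FlatGraph {
    /// Flattens per-node adjacency lists of (target, weight) pairs.
    pub fn from_adjacency(adj: &[Vec<(u32, f32)>]) -> Self {
        let mut g = FlatGraph::default();
        g.offsets.push(0);
        for edges in adj {
            for &(target, weight) in edges {
                g.targets.push(target);
                g.weights.push(weight);
            }
            g.offsets.push(g.targets.len() as u32);
        }
        g
    }

    /// Iterates over the (target, weight) pairs leaving `node`.
    pub fn neighbors(&self, node: u32) -> impl Iterator<Item = (u32, f32)> + '_ {
        let lo = self.offsets[node as usize] as usize;
        let hi = self.offsets[node as usize + 1] as usize;
        self.targets[lo..hi]
            .iter()
            .copied()
            .zip(self.weights[lo..hi].iter().copied())
    }
}

/// Owns the editable graph and rebuilds the flat copy only when it is dirty.
pub struct SnapshotCache {
    adjacency: Vec<Vec<(u32, f32)>>,
    flat: FlatGraph,
    dirty: bool,
    /// Counter kept only to make the rebuild behavior observable.
    pub rebuilds: usize,
}

impl SnapshotCache {
    pub fn new(adjacency: Vec<Vec<(u32, f32)>>) -> Self {
        Self { adjacency, flat: FlatGraph::default(), dirty: true, rebuilds: 0 }
    }

    /// Updates one edge weight and marks the snapshot stale.
    pub fn set_weight(&mut self, from: u32, edge_index: usize, weight: f32) {
        self.adjacency[from as usize][edge_index].1 = weight;
        self.dirty = true;
    }

    /// Returns the flat graph, rebuilding it only if something changed.
    pub fn snapshot(&mut self) -> &FlatGraph {
        if self.dirty {
            self.flat = FlatGraph::from_adjacency(&self.adjacency);
            self.dirty = false;
            self.rebuilds += 1;
        }
        &self.flat
    }
}
```

Calling `snapshot()` twice in a row without a `set_weight` in between reuses the cached copy, so the flattening cost is only paid on frames where the weights actually changed.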
Performance on my test scene: C# A* averaged ~2.4ms per query. Rust via GDExtension: ~0.3ms. That's a real win. But I'll be honest: cross-compilation for export targets requires careful setup, and debugging across the FFI boundary is genuinely unpleasant when something goes sideways. No nice stack traces, just a crash or a silent wrong answer.
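One thing that may soften the "just a crash" failure mode is to validate pointers up front and catch panics before they reach the C boundary, turning them into distinct error codes. A minimal sketch of that pattern follows. The error codes other than 0 and -1, the `guard_ffi` helper, and the `node_cost_checked` function are all hypothetical and only stand in for a real query; a real export would also carry the `#[no_mangle]` attribute.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical status codes; the binding in this post only uses 0 and -1.
pub const OK: i32 = 0;
pub const ERR_NO_PATH: i32 = -1;
pub const ERR_NULL_ARG: i32 = -2;
pub const ERR_PANIC: i32 = -3;

/// Runs `body` and converts any panic into ERR_PANIC so that it never
/// unwinds across the `extern "C"` boundary.
pub fn guard_ffi<F: FnOnce() -> i32>(body: F) -> i32 {
    match panic::catch_unwind(AssertUnwindSafe(body)) {
        Ok(code) => code,
        Err(_) => ERR_PANIC,
    }
}

/// Illustrative entry point (not from the post): looks up one node's cost
/// and writes it through `out_cost`.
pub extern "C" fn node_cost_checked(
    costs: *const f32,
    count: usize,
    node: u32,
    out_cost: *mut f32,
) -> i32 {
    // Reject null pointers before touching them.
    if costs.is_null() || out_cost.is_null() {
        return ERR_NULL_ARG;
    }
    guard_ffi(|| {
        // SAFETY: `costs` is non-null and the caller promises it points to
        // `count` valid f32 values.
        let costs = unsafe { std::slice::from_raw_parts(costs, count) };
        // Panics on an out-of-range node; guard_ffi turns that into ERR_PANIC.
        let cost = costs[node as usize];
        // SAFETY: `out_cost` was checked non-null above.
        unsafe { *out_cost = cost };
        OK
    })
}
```

The caller on the Godot side can then log which code came back instead of guessing. Note that this only helps when the library is built with unwinding panics; with `panic = "abort"` in the Cargo profile there is nothing to catch.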
Curious if anyone else has gone this route for something other than pathfinding: physics substeps, procedural gen, anything CPU-bound. Was the setup friction worth it? And has anyone found a cleaner workflow for cross-compilation when targeting multiple export platforms?