Runtime mesh deformation for destructible terrain — where does the perf actually go?

58 views · 4 replies

Been banging my head against this for a couple weeks and figured I'd throw it out here. I'm building a top-down action game where the ground can be cratered by explosions, think soft-body craters that persist through the level. Not full voxel, just a heightmap-based terrain mesh that I'm deforming at runtime.

The naive approach was exactly what you'd expect: an explosion hits, I iterate over nearby vertices, displace them along a smoothstep falloff curve, then call mesh.RecalculateNormals(). Works fine for one or two hits, but at around 15-20 deformations in a scene the frame time tanked hard. The profiler showed RecalculateNormals eating 4-6ms on its own every time it ran.
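In case it helps, here's roughly what that naive pass looks like (reconstructed sketch, not my actual code; the MeshFilter reference and crater parameters are simplified placeholders):

```csharp
using UnityEngine;

public class NaiveCrater : MonoBehaviour
{
    public MeshFilter terrainFilter;

    // Displace vertices within `radius` of `center` downward along a smoothstep
    // falloff, then pay the full-mesh normal recalculation on every hit.
    public void ApplyCrater(Vector3 center, float radius, float depth)
    {
        Mesh mesh = terrainFilter.mesh;
        Vector3[] verts = mesh.vertices; // full managed copy each call
        Vector3 local = terrainFilter.transform.InverseTransformPoint(center);

        for (int i = 0; i < verts.Length; i++)
        {
            float d = Vector2.Distance(
                new Vector2(verts[i].x, verts[i].z),
                new Vector2(local.x, local.z));
            if (d >= radius) continue;

            // Smoothstep falloff: full depth at the center, zero at the rim.
            float t = Mathf.SmoothStep(1f, 0f, d / radius);
            verts[i].y -= depth * t;
        }

        mesh.vertices = verts;
        mesh.RecalculateNormals(); // the 4-6ms cost the profiler flagged
    }
}
```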

So I moved normal recalculation to a compute shader, which was a huge improvement: down to sub-millisecond. But now I'm seeing a different spike: the vertex buffer readback. I'm writing deformation data on the GPU, but I still need CPU-side vertex positions for gameplay queries ("is this tile walkable?", raycasts against the deformed surface). That round-trip is brutal.
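For the cases where I genuinely do need a CPU copy, the least-bad version I've found is making the readback non-blocking with AsyncGPUReadback so it doesn't stall the frame; something like this (untested sketch, buffer and field names made up):

```csharp
using UnityEngine;
using UnityEngine.Rendering;

public class HeightReadback : MonoBehaviour
{
    public ComputeBuffer heightBuffer; // written by the deformation compute shader
    float[] cpuHeights;                // last completed snapshot, a few frames stale

    // Kick off a non-blocking readback; the callback fires a few frames later
    // instead of stalling the pipeline the way a synchronous GetData() does.
    public void RequestSnapshot()
    {
        AsyncGPUReadback.Request(heightBuffer, (AsyncGPUReadbackRequest req) =>
        {
            if (req.hasError) return;
            cpuHeights = req.GetData<float>().ToArray();
        });
    }
}
```

The obvious catch is the snapshot being a few frames stale, which is partly what pushed me toward the dual-representation idea below.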

Current thinking is to keep two representations: a low-res CPU heightmap that I deform cheaply on the main thread (just a 2D float array, no mesh involved), and a high-res GPU mesh that purely drives visuals. Gameplay queries go against the heightmap, rendering goes against the GPU mesh. They sync lazily: the heightmap updates immediately on hit, and the mesh update gets queued for the next frame or two.
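Rough sketch of what I mean, with made-up names and sizes (the GPU-side drain of the queue is omitted):

```csharp
using System.Collections.Generic;
using UnityEngine;

public class DualRepTerrain : MonoBehaviour
{
    const int Res = 128;          // low-res gameplay grid (placeholder size)
    public float cellSize = 0.5f; // meters per heightmap cell
    float[,] heights = new float[Res, Res];

    // Visual mesh updates are queued here; a separate system drains this
    // over the next frame or two and dispatches the compute shader.
    readonly Queue<(Vector2 pos, float radius, float depth)> visualQueue =
        new Queue<(Vector2, float, float)>();

    // Gameplay-side deformation: immediate, cheap, just a 2D float array.
    public void OnExplosion(Vector2 worldPos, float radius, float depth)
    {
        int x0 = Mathf.Max(0, Mathf.FloorToInt((worldPos.x - radius) / cellSize));
        int x1 = Mathf.Min(Res - 1, Mathf.CeilToInt((worldPos.x + radius) / cellSize));
        int z0 = Mathf.Max(0, Mathf.FloorToInt((worldPos.y - radius) / cellSize));
        int z1 = Mathf.Min(Res - 1, Mathf.CeilToInt((worldPos.y + radius) / cellSize));

        for (int x = x0; x <= x1; x++)
        for (int z = z0; z <= z1; z++)
        {
            float d = Vector2.Distance(new Vector2(x, z) * cellSize, worldPos);
            if (d < radius)
                heights[x, z] -= depth * Mathf.SmoothStep(1f, 0f, d / radius);
        }

        visualQueue.Enqueue((worldPos, radius, depth)); // mesh catches up later
    }

    // Gameplay queries hit the heightmap, never the render mesh.
    public float HeightAt(Vector2 worldPos)
    {
        int x = Mathf.Clamp(Mathf.RoundToInt(worldPos.x / cellSize), 0, Res - 1);
        int z = Mathf.Clamp(Mathf.RoundToInt(worldPos.y / cellSize), 0, Res - 1);
        return heights[x, z];
    }
}
```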

Has anyone actually shipped something like this? I'm worried about edge cases where the visual and gameplay representations drift in noticeable ways, like a player walking into a crater that visually exists but that the heightmap still treats as flat. Is the answer just to make the heightmap resolution high enough that the discrepancy isn't perceptible, or is there a smarter sync strategy?

Also curious whether anyone's gone the full voxel route for something like this in an indie context and whether the complexity overhead was worth it. My gut says heightmap is fine for a top-down game but I could be wrong.

explosion crater terrain deformation

The dual-representation approach is exactly what I'd reach for. The visual/gameplay drift concern is real but manageable. The trick is making your heightmap resolution roughly match the smallest gameplay-relevant feature. If your craters are never smaller than, say, 2 meters across, a heightmap at 0.5m/cell gives you 4 samples across the smallest crater, which is enough for accurate walkability queries. Players don't notice sub-cell discrepancies between the visual mesh and the collision shape.

One thing worth adding: pool your deformation jobs and batch the heightmap writes. If three explosions happen within the same frame, accumulate all the deltas and apply them in one pass rather than three separate iterations over your float array.
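Something like this is what I mean by batching (sketch only; the type names are made up). The point is one write pass over the float array per frame, no matter how many hits landed:

```csharp
using System.Collections.Generic;
using UnityEngine;

public struct CraterHit { public Vector2 pos; public float radius, depth; }

public class DeformationBatcher
{
    readonly List<CraterHit> frameHits = new List<CraterHit>();

    public void Enqueue(CraterHit hit) => frameHits.Add(hit);

    // Call once per frame: accumulate every hit's delta per cell, then write.
    // (In practice you'd restrict the loop to the union of the hit regions;
    // this iterates the whole grid for clarity.)
    public void Flush(float[,] heights, float cellSize)
    {
        if (frameHits.Count == 0) return;
        int resX = heights.GetLength(0), resZ = heights.GetLength(1);

        for (int x = 0; x < resX; x++)
        for (int z = 0; z < resZ; z++)
        {
            float delta = 0f;
            Vector2 cell = new Vector2(x, z) * cellSize;
            foreach (CraterHit hit in frameHits)
            {
                float d = Vector2.Distance(cell, hit.pos);
                if (d < hit.radius)
                    delta += hit.depth * Mathf.SmoothStep(1f, 0f, d / hit.radius);
            }
            heights[x, z] -= delta; // one write per cell, however many hits
        }
        frameHits.Clear();
    }
}
```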

Went through something almost identical for a mobile artillery game. A few things that helped us:

First, dirty-region tracking. Don't recalculate normals for the whole mesh. Mark which chunks of vertices were touched and only update those submesh sections. We split the terrain into a 16x16 grid of chunks and flagged dirty chunks per-frame. Normal recalc cost dropped by ~70% on average since most explosions only touch 1-4 chunks.
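Rough shape of the bookkeeping, from memory (names invented; the per-chunk normal pass is whatever your mesh layout needs, so it's passed in as a callback here):

```csharp
using System.Collections.Generic;
using UnityEngine;

public class ChunkedNormals
{
    const int Grid = 16;              // 16x16 chunk grid, as described above
    public float chunkWorldSize = 8f; // placeholder: terrain span / Grid
    readonly HashSet<int> dirty = new HashSet<int>();

    // Flag every chunk whose bounds the crater overlaps.
    public void MarkCrater(Vector2 pos, float radius)
    {
        int x0 = Mathf.Clamp(Mathf.FloorToInt((pos.x - radius) / chunkWorldSize), 0, Grid - 1);
        int x1 = Mathf.Clamp(Mathf.FloorToInt((pos.x + radius) / chunkWorldSize), 0, Grid - 1);
        int z0 = Mathf.Clamp(Mathf.FloorToInt((pos.y - radius) / chunkWorldSize), 0, Grid - 1);
        int z1 = Mathf.Clamp(Mathf.FloorToInt((pos.y + radius) / chunkWorldSize), 0, Grid - 1);

        for (int x = x0; x <= x1; x++)
            for (int z = z0; z <= z1; z++)
                dirty.Add(z * Grid + x);
    }

    // Once per frame: recompute normals only for the flagged chunks.
    public void Flush(System.Action<int> recalcChunkNormals)
    {
        foreach (int chunk in dirty) recalcChunkNormals(chunk);
        dirty.Clear();
    }
}
```

Most explosions flag 1-4 of the 256 chunks, which is where the ~70% saving came from.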

Second, consider deferring the GPU mesh update entirely until the explosion VFX has finished. Players are looking at the particle burst for 0.3-0.5 seconds anyway. You can write the heightmap immediately for gameplay accuracy, then push the visual mesh update after a one-frame delay when the budget is cheaper.

I went the voxel route, for what it's worth, and honestly for a top-down game: skip it. Voxels buy you vertical destruction (tunneling, cave-ins) that a heightmap can't represent. If your gameplay is purely 2.5D and craters don't need overhangs, you're paying a massive complexity tax for nothing. The heightmap dual-rep approach you described is the right call.

the chunk dirty-flagging approach makes a lot of sense here. we use a bitmask for this in our terrain system, 64 chunks fits in a single ulong so checking and setting dirty flags is a bitwise op with zero allocation overhead. something like dirtyMask |= 1UL << chunkIndex on hit, then iterate set bits each frame to process only the flagged chunks.
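rough sketch of the whole pattern, untested, names made up:

```csharp
public class DirtyChunkMask
{
    ulong dirtyMask; // one bit per chunk; 64 chunks fit in a single ulong

    // zero-allocation flag set, same as the one-liner above
    public void MarkDirty(int chunkIndex) => dirtyMask |= 1UL << chunkIndex;

    // iterate only the set bits, clearing the mask as we go
    // (a trailing-zero-count intrinsic would skip empty runs faster,
    // but a plain 64-iteration scan is already cheap and portable)
    public void Process(System.Action<int> updateChunk)
    {
        ulong mask = dirtyMask;
        dirtyMask = 0;
        for (int i = 0; i < 64 && mask != 0; i++)
        {
            if ((mask & (1UL << i)) != 0)
            {
                mask &= ~(1UL << i);
                updateChunk(i);
            }
        }
    }
}
```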