wrote a lightweight task scheduler for Godot 4: coroutines were tanking my frame time

397 views 0 replies

Started noticing frame spikes whenever multiple systems tried to do expensive-ish work in the same frame. The culprit wasn't any single operation. It was ten different systems each awaiting their own coroutine, all resuming at roughly the same time and collectively blowing past my frame budget.

The naive fix is staggering awaits with await get_tree().process_frame calls, but that just shuffles the problem around. What I actually wanted was something that drains a work queue across frames while respecting a configurable time budget per frame.

Ended up writing a small TaskScheduler autoload:

# task_scheduler.gd — registered as an autoload named "TaskScheduler".
# Note: no class_name here; Godot 4 won't let an autoload share its
# name with a global script class, so declaring both breaks the project.
extends Node

var _queue: Array[Callable] = []
var frame_budget_ms: float = 2.0

func enqueue(task: Callable) -> void:
    _queue.append(task)

func enqueue_front(task: Callable) -> void:
    _queue.push_front(task)

func _process(_delta: float) -> void:
    # Measure in microseconds: get_ticks_msec() only has 1 ms resolution,
    # which is far too coarse against a 2.0 ms budget.
    var deadline := Time.get_ticks_usec() + int(frame_budget_ms * 1000.0)
    while not _queue.is_empty() and Time.get_ticks_usec() < deadline:
        # pop_front() is O(n) on Array; fine for modest queue sizes.
        _queue.pop_front().call()

Tasks are just Callables: lambdas, bound methods, whatever. High-priority work gets pushed to the front. The scheduler processes as much as it can within budget and defers the rest to next frame. No threads, no signals to wire, no await chains.
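To make that concrete, a couple of hypothetical call sites (sketch — bake_nav_region is a made-up method name, assuming the autoload above is registered as "TaskScheduler"):

func _on_level_loaded() -> void:
    # Lambda task: cheap fire-and-forget work.
    TaskScheduler.enqueue(func(): print("warming caches"))
    # Bound-method task that jumps the queue via enqueue_front().
    TaskScheduler.enqueue_front(bake_nav_region.bind("sector_7"))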

For tasks that need to spread across multiple frames, I pass a closure that re-enqueues itself:

# Processes one item per queued Callable, then re-enqueues itself
# with the next index until the array is exhausted. The scheduler may
# still run many of these per frame if the budget allows.
func process_chunk(items: Array, index: int) -> void:
    if index >= items.size():
        return
    _do_work(items[index])
    # bind() appends its arguments, so the queued Callable invokes
    # process_chunk(items, index + 1) when called with no args.
    TaskScheduler.enqueue(process_chunk.bind(items, index + 1))

# Kick off the chunked job:
TaskScheduler.enqueue(process_chunk.bind(my_big_array, 0))

Works well for pathfinding pre-warm, navmesh queries, serialization, anything that can be chunked. Main downside: you lose the sequential readability of coroutine code, and debugging where a specific callable came from gets annoying without careful naming discipline.

Curious if anyone's approached this differently. Also not fully sure I'm not just reimplementing WorkerThreadPool badly. I looked at it but the thread overhead felt like overkill for tasks that are each individually fast. Would love to know if I'm missing something obvious there.
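For reference, my understanding of what the WorkerThreadPool route would look like (sketch, not something I've benchmarked — and _do_work would have to be thread-safe, since it runs off the main thread):

var task_ids: Array[int] = []
for item in my_big_array:
    # Each task runs on a pool thread; don't touch the scene tree in it.
    task_ids.append(WorkerThreadPool.add_task(_do_work.bind(item)))
# Later, before the results are needed, block until everything is done:
for id in task_ids:
    WorkerThreadPool.wait_for_task_completion(id)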
