Every studio hits the same wall eventually. You're deep into production, the engine you licensed does 80% of what you need, but that last 20% is where your game actually lives. At Moonjump, that wall came early. The project needed PlayStation export without a six-figure middleware license. It needed a renderer pipeline inside Maya that let artists iterate without waiting for full builds. And it needed character locomotion that could handle thousands of animation states without eating the memory budget alive.
The answer was taking Raylib, stripping it to the studs, and rebuilding it into something called Starlight.
Why Raylib as the Foundation
Raylib is a C library designed for simplicity. No class hierarchies. No entity-component frameworks. No opinion about how you structure your game. It gives you a window, a renderer, input handling, and audio. That's it. For a project that wants to own its entire pipeline, this is exactly the right starting point.
The usual suspects all fell short: Unreal's licensing structure didn't fit the model. Unity's runtime had overhead that couldn't be justified for the target hardware budgets. Godot was promising, but the C++ integration story wasn't there yet. Raylib offered something none of them could — a small, auditable codebase that could be understood completely. Every line of rendering code, every memory allocation, every platform abstraction was readable, modifiable, and extensible without reverse-engineering framework internals.
The entire Raylib core compiles in under 4 seconds. When you're making deep changes to a renderer, that iteration speed matters more than any feature list.
PlayStation Export: The First Big Extension
Console export was the first major surgery. Raylib targets desktop, web, and mobile through OpenGL and OpenGL ES (WebGL in the browser). PlayStation requires GNM/GNMX (PS4) and AGC (PS5), Sony's proprietary graphics APIs. You can't just swap in a new backend — the entire resource management model is different.
The solution was a hardware abstraction layer (HAL) that sits between Starlight's renderer and the platform GPU APIs. The HAL exposes a small surface area — init, shutdown, buffer/image/shader/pipeline creation, pass management, draw calls, and commit. Every platform implements the same interface. The OpenGL backend fills it with GL calls. The Vulkan backend fills it with Vulkan calls. The PlayStation backends fill it with GNM/AGC calls. The renderer above the HAL never knows which GPU API is running underneath. The same game code compiles against any backend by swapping a single compile-time flag.
The tricky part wasn't the API translation — it was memory. PlayStation consoles give you explicit control over GPU memory allocation through a unified memory model. You allocate from a garlic (GPU-preferred) or onion (CPU-preferred) heap, and you manage the lifetime yourself. Raylib's original design assumed OpenGL would handle resource management behind the scenes.
The fix was a pool allocator that pre-allocates GPU memory in large blocks and sub-allocates from them. Every texture, vertex buffer, and shader constant buffer pulls from these pools. When a level unloads, the entire pool frees in one call instead of deallocating thousands of individual resources. This gives deterministic memory usage with zero fragmentation over long play sessions.
The Maya Renderer: Killing the Export-Load-Check Loop
Artists lose hours per day to the classic iteration loop: make a change in Maya, export the asset, wait for the engine to reimport, load the level, navigate to where the asset appears, check if it looks right. If it doesn't, go back to step one.
Starlight includes a custom viewport renderer for Maya that uses the same shading pipeline as the engine. It's implemented as a Maya VP2 override — a C++ plugin that intercepts Maya's viewport rendering and replaces it with Starlight's draw calls. Artists see exactly how their assets will look in-engine without ever leaving Maya.
The renderer syncs material parameters bidirectionally. Change a roughness value in the custom material editor panel, and the new value propagates to the Maya viewport and is written through to the asset format. When the artist saves, the engine hot-reloads the material definition without a restart.
The lighting model also pipes into the Maya renderer. The game uses a custom atmosphere scattering model for outdoor scenes, and that same scattering math runs in Maya's viewport. An artist placing a rock formation on a hillside sees the exact same colour grading, fog density, and light scatter they'll see in the final game. No more "it looked different in Maya" conversations.
The viewport renderer also evaluates the custom LOD system. Artists can scrub a distance slider and watch their model transition between LOD levels in realtime, seeing exactly where the pops happen and adjusting geometry until the transitions are invisible.
Neural Network Locomotion: Big Data, Small Memory
This is the most interesting piece of Starlight. Traditional locomotion systems work by blending between hand-authored animation clips. You have a walk cycle, a run cycle, a turn-left, a turn-right, and a state machine that cross-fades between them. This works for simple movement, but it breaks down when you need characters to handle varied terrain, dynamic obstacles, and natural-looking transitions across hundreds of movement states.
Motion matching solved some of this by searching a database of motion capture data for the best matching pose at each frame. But motion matching has a cost: the database needs to stay in memory. A 30-minute mocap session at 30fps produces 54,000 frames. Each frame stores joint positions, velocities, and trajectory data, roughly 120 bytes per joint. For a character with 65 joints that's about 7.8KB per frame, so a single 30-minute clip costs over 400MB uncompressed. Compression helps, but you still need enough in memory to search efficiently.
Starlight's approach trains a compact neural network offline on the motion capture data, then replaces the runtime database entirely. The network takes the character's current pose, the desired velocity vector, and the terrain geometry under the character's feet, and outputs the next frame's joint rotations directly.
The architecture is deliberately small — an MLP (multi-layer perceptron) with three hidden layers of 512, 256, and 256 units. The input vector encodes:
- Current joint rotations (65 joints × 4 quaternion components = 260 floats)
- Current joint velocities (65 × 3 = 195 floats)
- Desired movement trajectory (future positions at 0.2s, 0.4s, 0.6s, 1.0s = 12 floats)
- Terrain height samples in a 3×3 grid under the character (9 floats)
- Character facing direction and speed (4 floats)
Total input: 480 floats. The output is 260 floats (next frame's joint rotations as quaternions). The entire network — weights and biases included — fits in 2.1MB of memory. Compare that to the hundreds of megabytes a motion matching database would require for equivalent movement quality.
Inference takes approximately 0.08ms per character per frame on a single CPU core. 60 characters can run through the network in under 5ms total, leaving plenty of budget for physics, AI, and rendering. On PlayStation hardware, NEON SIMD intrinsics vectorize the matrix multiplications, bringing per-character cost down to around 0.04ms.
The training pipeline runs offline on cloud GPUs. Several hours of motion capture data covering walks, runs, sprints, turns, slopes, stairs, starts, stops, and idles across different terrain types go in. The network learns to generalize — give it a terrain slope it never saw during training, and it produces reasonable foot placement because it learned the underlying relationship between terrain angle and ankle rotation, not just specific poses for specific slopes.
The result is characters that move with the fluidity of motion-captured animation, handle arbitrary terrain without authored transition animations, and cost less than 3MB of memory per character archetype. For a game targeting 16GB consoles where every megabyte counts, this is the difference between having rich locomotion and having to cut corners.
Where Starlight Goes Next
Current work includes extending the HAL to support mesh shaders for next-gen geometry processing, building a GPU-driven culling pipeline that handles open-world draw calls without CPU bottlenecking, and training locomotion networks for quadruped and flying character archetypes. The Maya renderer is getting cloth and hair simulation preview so artists can author those assets with full visual fidelity.
Starlight isn't a product shipping externally — it's a tool built because the problems being solved didn't have off-the-shelf answers. But the techniques behind it — the HAL pattern for cross-platform rendering, compact neural nets for animation, and in-DCC preview renderers — are patterns any studio can adopt. The details are worth sharing because more studios should consider owning their core technology.