wrote a Blender script to auto-trim dead frames from mocap takes — tired of finding where the action starts by hand


Every mocap session I work with comes in with 20–60 frames of the actor standing static before they start moving, and another chunk at the end after they've stopped. Not a huge deal per take, but when you're processing 30 takes in one batch export run, trimming each one by hand adds up fast.

The script samples a set of anchor bones (hips, spine, wrists) and walks the frame range looking for the first and last frames where any of them exceed a velocity threshold. Then it snaps action.frame_range to that window with a small padding value.

import bpy
import math

def get_bone_velocity(action, bone_name, frame):
    # Per-frame delta magnitude across all of one bone's F-Curves.
    # Match on the full quoted data path so 'Spine' doesn't also
    # pick up 'Spine1' and 'Spine2'.
    bone_path = f'pose.bones["{bone_name}"]'
    total_delta = 0.0
    for fcurve in action.fcurves:
        if bone_path not in fcurve.data_path:
            continue
        val_curr = fcurve.evaluate(frame)
        val_prev = fcurve.evaluate(frame - 1)
        total_delta += (val_curr - val_prev) ** 2
    return math.sqrt(total_delta)

def find_active_range(action, sample_bones, threshold=0.001, padding=2):
    frame_start = int(action.frame_range[0])
    frame_end = int(action.frame_range[1])
    first_active = frame_end
    last_active = frame_start

    for frame in range(frame_start + 1, frame_end + 1):
        for bone in sample_bones:
            if get_bone_velocity(action, bone, frame) > threshold:
                first_active = min(first_active, frame)
                last_active = max(last_active, frame)
                break

    if first_active > last_active:
        # nothing cleared the threshold; leave the range untouched
        return (frame_start, frame_end)

    return (
        max(frame_start, first_active - padding),
        min(frame_end, last_active + padding)
    )

# Run with your armature selected
obj = bpy.context.active_object
if obj and obj.type == 'ARMATURE':
    sample_bones = [
        'mixamorig:Hips',
        'mixamorig:Spine',
        'mixamorig:LeftHand',
        'mixamorig:RightHand'
    ]

    for action in bpy.data.actions:
        if action.users == 0:
            continue
        start, end = find_active_range(action, sample_bones, threshold=0.002, padding=3)
        # Action.frame_range itself is read-only in the API; set the
        # manual frame range (Blender 3.1+) instead.
        action.use_frame_range = True
        action.frame_start = start
        action.frame_end = end
        print(f"{action.name}: trimmed to [{start}, {end}]")

A few things I ran into:

  • Threshold needs tuning per capture setup. Rokoko data tends to be noisier at rest than optical, so I push it to around 0.005 for those sessions.
  • Bone names are Mixamo-specific here, so if your naming convention differs, match by partial string or just pass in a custom list.
  • The action.users == 0 skip is there because Blender loves accumulating orphaned actions that you definitely don't want to process.
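On the bone-naming point, a partial-string matcher is easy to sketch. `resolve_anchor_bones` here is a hypothetical helper, not part of the script above; in Blender you'd feed it `obj.data.bones` names and use the result as `sample_bones`.

```python
def resolve_anchor_bones(bone_names, fragments=("hips", "spine", "hand")):
    # Case-insensitive substring match: return every bone whose name
    # contains one of the fragments, preserving rig order.
    matches = []
    for name in bone_names:
        lowered = name.lower()
        if any(frag in lowered for frag in fragments):
            matches.append(name)
    return matches

# Illustrative non-Mixamo rig:
rig = ["root", "pelvis_Hips", "spine_01", "hand_l", "hand_r", "head"]
print(resolve_anchor_bones(rig))  # ['pelvis_Hips', 'spine_01', 'hand_l', 'hand_r']
```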

One thing I haven't solved yet: takes where the actor holds a static pose intentionally in the middle. The velocity trimmer doesn't touch those since it only trims the ends, but if someone asks for internal pause trimming that's a different problem entirely. Probably needs segment-based gap detection with a minimum gap length so you don't accidentally eat real holds.

Anyone else doing pre-processing passes like this in Blender before batch export? Also genuinely curious whether people handle this in MotionBuilder instead. My pipeline touches both and I'm still not sure which layer is the right place for this kind of cleanup.

Replying to QuantumHaze: the per-actor warmup pattern thing is genuinely useful data. ran the script on a...

The per-actor warmup data pays off way later than when you collect it, which makes it easy to skip at the time. We ended up keeping a small JSON file per actor with their typical warmup range from the first session. By session three or four the auto-trim was basically self-tuned to each person without us having to touch it. Just a lookup at trim time and a slight offset to the motion threshold start frame. Makes retakes on later sessions go noticeably faster. You're not babysitting the trim window on every single import.
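A minimal sketch of that lookup, assuming a flat JSON of typical warmup lengths in frames; the file layout, path, and function names are mine, not necessarily what they used.

```python
import json
import os

WARMUP_DB = "actor_warmup.json"  # assumed layout: {"alice": 48, "bob": 22}

def load_warmup_offsets(path=WARMUP_DB):
    # Missing file just means no history yet; fall back to no offsets.
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)

def trim_start_hint(actor, detected_start, offsets, safety=5):
    # Don't begin threshold scanning before the actor's usual warmup is
    # over; the safety margin avoids eating a genuinely early start.
    warmup = offsets.get(actor, 0)
    return max(detected_start, warmup - safety)

offsets = {"alice": 48, "bob": 22}
print(trim_start_hint("alice", 10, offsets))  # 43: skip alice's usual settling
```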

Replying to CyberHaze: one edge case that got me: actors who fidget while waiting for the director to c...

yeah this broke my script on literally the first real batch i ran it on lmao. one performer could not stop settling, small hip shifts, weight transfers, nothing that looked like action but enough to trip a naive velocity threshold constantly.

fixed it by requiring N consecutive frames above threshold rather than any single frame crossing it:

def find_motion_start(fcurves, threshold=0.02, min_run=8):
    start = int(min(fc.range()[0] for fc in fcurves))
    end = int(max(fc.range()[1] for fc in fcurves))
    # slide a window of min_run frames; every frame in it must clear
    # the threshold, so a single quiet frame rejects the window
    for f in range(start, end - min_run + 1):
        consecutive = 0
        for check_f in range(f, f + min_run):
            vels = [abs(fc.evaluate(check_f + 1) - fc.evaluate(check_f)) for fc in fcurves]
            if max(vels) <= threshold:
                break
            consecutive += 1
        if consecutive >= min_run:
            return f
    return start

min_run=8 basically eliminates false positives from fidgeting in my sessions. might need tuning depending on capture rate.
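one way to make that tuning rate-independent is to express min_run as a duration and derive the frame count from fps (helper name and numbers are mine, just a sketch):

```python
def min_run_frames(duration_s=0.25, fps=30):
    # floor of 2 frames so a single noisy frame can never qualify
    return max(2, round(duration_s * fps))

print(min_run_frames(0.25, 30))   # 8  -- a quarter second at 30fps
print(min_run_frames(0.25, 120))  # 30 -- the same quarter second at 120fps
```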

statue slowly turning head when no one is looking

Replying to RiftFox: End-of-take detection is consistently harder than start detection and doesn't ge...

The asymmetry makes sense if you think about it in terms of energy state transitions. "Started moving" is a transition into a high-energy state: fast, well-defined, easy to detect. "Stopped moving" is a transition out of one, and actors almost never do that cleanly. They relax, breathe, shift weight, glance at the director. None of that is the captured action, but it's not stillness either.

What worked better for me than a raw threshold crossing: require a sustained window of consecutive frames below threshold before committing to an end point, around 12–15 frames at 30fps, roughly half a second. Single-frame dips get ignored entirely. Only once you've had a genuine lull does the script mark a candidate end, and then you walk back through the window to find the last frame with real motion above it. Two-pass, but it catches most of the "actor relaxed but didn't freeze" situations cleanly.

The failure mode that still beats it: an actor holding a final pose actively, like sustaining a combat stance at the end of a fight take. They're never truly below threshold so the window never fires. For those I gave up on auto-detection and just flag the take for manual review. Better than a wrong trim point silently corrupting a batch.
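A pure-Python sketch of that two-pass approach, operating on a precomputed per-frame velocity list so it runs without bpy; names and defaults are mine.

```python
def find_end_frame(velocities, threshold=0.02, quiet_run=12):
    # Pass 1: find the first run of quiet_run consecutive frames below
    # threshold -- a genuine lull, not a single-frame dip.
    lull_start = None
    run = 0
    for i, v in enumerate(velocities):
        run = run + 1 if v < threshold else 0
        if run >= quiet_run:
            lull_start = i - quiet_run + 1
            break
    if lull_start is None:
        return None  # actor never settled; flag the take for manual review

    # Pass 2: walk back from the lull to the last frame with real motion.
    for i in range(lull_start - 1, -1, -1):
        if velocities[i] >= threshold:
            return i
    return 0

# 5 frames of motion, a 1-frame dip, 3 more motion frames, then stillness:
vels = [0.1] * 5 + [0.0] + [0.1] * 3 + [0.001] * 20
print(find_end_frame(vels))  # 8: the dip at frame 5 doesn't end the take
```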

one edge case that got me: actors who fidget while waiting for the director to call action. small weight shifts, subtle head turns — enough to cross a naive velocity threshold but not the actual start of the performance. my fix was adding a minimum consecutive active frames requirement. the motion has to sustain above the threshold for at least N frames before it registers as the real start. a single frame of fidget noise doesn't qualify, an actual performance does. N=8 works for most material, bumped it to 12 for slower deliberate takes where the threshold itself is lower.

Replying to LunaLynx: this solves a real problem. one thing worth adding on top: log the detected trim...

yeah logging the range is essential for batch work. i also added a per-take confidence score to the output, just the ratio of frames above the motion threshold to total usable frames. low-confidence takes get flagged so i know to double-check them:

confidence = active_frames / usable_frames
row = [take_name, start, end, usable_frames, f"{confidence:.2f}"]

useful for catching takes where the actor half-moved at the start and tripped the threshold before the actual action began. without that flag those were silently getting clean-looking timecodes when they really needed a second look.
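rough sketch of what that log row looks like in full; the column layout, file name, and 0.5 flag cutoff are my own choices, not a standard:

```python
import csv

def log_trim(rows, path="trim_log.csv", flag_below=0.5):
    # rows: (take_name, start, end, active_frames) per take
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["take", "start", "end", "usable_frames", "confidence", "review"])
        for take, start, end, active in rows:
            usable = end - start + 1
            confidence = active / usable if usable else 0.0
            # low-confidence takes get an explicit marker for a second look
            w.writerow([take, start, end, usable, f"{confidence:.2f}",
                        "CHECK" if confidence < flag_below else ""])

log_trim([("take_01", 12, 211, 180), ("take_17", 40, 59, 4)])
```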

Replying to CyberHaze: one edge case that got me: actors who fidget while waiting for the director to c...

same problem, possibly the same actor lol. fix that worked for me: instead of triggering on any single frame above threshold, require some minimum number of consecutive frames, 8 to 10 at 30fps. a quick weight shift or nervous head turn almost never sustains that long. actual intentional motion almost always does.

doesn't kill all false positives, but it cuts them down a lot. close enough, good enough. shrug

End-of-take detection is consistently harder than start detection and doesn't get talked about enough. At the start you're looking for the first sustained motion above threshold, which is well-defined and usually works. At the end you're looking for when motion drops below threshold and stays there, but actors almost always have a natural deceleration phase before they fully stop. A naive trailing threshold can trim right into that deceleration and cut off follow-through.

For some workflows that's fine, you'd retime and extend the hold anyway. But if you're piping directly into retargeting or delivery without a cleanup pass after trimming, clipping the follow-through makes the clip feel like the recording was cut early rather than like the character came to a natural rest. I handle it with a fixed 12-frame buffer after the detected end point regardless of whether those frames are above or below threshold. Crude but reliable.

this solves a real problem. one thing worth adding on top: log the detected trim range to a text file when running in batch. start frame, end frame, usable frame count. when you're processing 40+ takes, knowing that take_17 had only 4 usable frames after trim tells you immediately which takes need to go back to the studio, without having to open each one.

also worth thinking about whether your "not moving" threshold needs to be per-bone or global. on hand and finger takes, the fingers can already be animating while the root is still in rest pose. a global velocity threshold will clip actual content in those cases.
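per-bone thresholds can be as simple as a lookup table with a default; bone names and values here are illustrative, not tuned recommendations:

```python
PER_BONE_THRESHOLD = {
    "Hips": 0.002,
    "Spine": 0.002,
    "LeftHand": 0.0005,   # hands/fingers move subtly; be more sensitive
    "RightHand": 0.0005,
}
DEFAULT_THRESHOLD = 0.002

def is_active(bone, velocity, table=PER_BONE_THRESHOLD):
    # unknown bones fall back to the global body threshold
    return velocity > table.get(bone, DEFAULT_THRESHOLD)

print(is_active("LeftHand", 0.001))  # True: hands use the tighter threshold
print(is_active("Hips", 0.001))      # False: below the body threshold
```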

Replying to LunaLynx: this solves a real problem. one thing worth adding on top: log the detected trim...

The CSV format is worth doing even for small batches. Once you have start frame, end frame, and usable duration as structured data, patterns start appearing. One actor always needs 30+ frames to settle, a specific session consistently has long wind-downs, certain takes have unusually small usable windows. That kind of data starts feeding back into how you give direction during capture, not just how you clean up afterward.

Worth logging the session name and take filename alongside the trim data if your naming convention is consistent. Turns the whole thing into an actual production record rather than a one-time cleanup log you'll never look at again.

Replying to LunaLynx: this solves a real problem. one thing worth adding on top: log the detected trim...

also worth logging which threshold value was active when the trim ran, especially if you're using per-session or per-actor overrides. two months later when you're re-processing takes and the trim points don't match your old logs, you'll want to know what parameters generated them. found this out when a session got re-exported from the suit software with slightly different curve smoothing and all my old trim frames shifted by 5–10, making the original logs look wrong when they weren't.

Replying to FrostFern: The CSV format is worth doing even for small batches. Once you have start frame,...

the per-actor warmup pattern thing is genuinely useful data. ran the script on a full session once and noticed one actor was consistently sitting at 50+ frames before any meaningful motion. turned out they had a habit of slowly settling into their start pose rather than hitting it and holding still. once we knew that we briefed them before takes and their warmup dropped by like half. nice to have something concrete to point to instead of just vibes about why setup takes forever.
