Diffusion Policy has revolutionized robotic control with its ability to model complex multimodal distributions, yet its slow inference speed remains a critical bottleneck for real-time deployment.
We introduce Block-wise Adaptive Caching (BAC), a training-free acceleration framework designed specifically for Transformer-based Diffusion Policies. Unlike generic acceleration methods, BAC exploits the temporal redundancy of action features across denoising steps in robotic tasks. By adaptively caching and reusing features at the block level, BAC achieves lossless acceleration, enabling your robot to react faster and more smoothly, for free!
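The idea of block-level caching can be illustrated with a minimal sketch (not the authors' implementation; `CachedBlock`, its toy `tanh` layer, and the fixed update steps are all hypothetical): each Transformer block recomputes its output only at designated update timesteps and otherwise returns its cached features from the previous denoising step.

```python
import numpy as np

# Hypothetical sketch of block-wise feature caching in a Transformer
# denoiser. At non-update timesteps a block reuses its cached output
# instead of recomputing, exploiting temporal redundancy across steps.
class CachedBlock:
    def __init__(self, weight, update_steps):
        self.weight = weight                    # stand-in for block parameters
        self.update_steps = set(update_steps)   # timesteps at which to recompute
        self.cache = None                       # last computed output
        self.n_computes = 0                     # counts actual recomputations

    def forward(self, x, t):
        if t in self.update_steps or self.cache is None:
            self.n_computes += 1
            self.cache = np.tanh(x @ self.weight)  # recompute and cache
        return self.cache                          # otherwise reuse cache

rng = np.random.default_rng(0)
blocks = [CachedBlock(rng.standard_normal((4, 4)), update_steps=[0, 5])
          for _ in range(3)]

def denoise_step(x, t):
    for blk in blocks:
        x = blk.forward(x, t)
    return x

x = rng.standard_normal((1, 4))
for t in range(10):              # 10 denoising timesteps
    x = denoise_step(x, t)       # blocks recompute only at t = 0 and t = 5

print([blk.n_computes for blk in blocks])  # each block computed only twice
```

With 10 denoising steps but only two update timesteps per block, 80% of the block evaluations are replaced by cache reads, which is the source of the speedup.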
BAC obtains a finer-grained cache schedule in two stages: the Adaptive Caching Scheduler (ACS) first computes optimal update timesteps for each block, and the Bubbling Union Algorithm (BUA) then truncates inter-block error propagation.
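A hedged sketch of this two-stage scheduling, under simplifying assumptions that are ours rather than the paper's: ACS is approximated by picking, for each block, the timesteps where its features change most; BUA is approximated by "bubbling" every block's update steps into all downstream blocks, so that a freshly updated block never feeds stale caches.

```python
# Assumption: feature_deltas[b][t] measures how much block b's features
# change at timestep t (e.g. a norm of successive feature differences).
def acs_schedule(feature_deltas, k):
    """Sketch of ACS: per block, pick the k timesteps with the
    largest feature change as that block's update timesteps."""
    return [set(sorted(range(len(row)), key=lambda t: -row[t])[:k])
            for row in feature_deltas]

def bubbling_union(schedules):
    """Sketch of BUA: union each block's update steps into every
    downstream block, so an updated block's new output is never
    combined with stale downstream caches (truncating error propagation)."""
    merged, acc = [], set()
    for s in schedules:       # blocks in network order, shallow to deep
        acc |= s
        merged.append(set(acc))
    return merged

# Toy feature-change magnitudes for 3 blocks over 5 timesteps.
deltas = [[0.9, 0.1, 0.1, 0.8, 0.1],   # block 0 changes most at t = 0, 3
          [0.1, 0.7, 0.1, 0.1, 0.6],   # block 1 at t = 1, 4
          [0.1, 0.1, 0.9, 0.1, 0.1]]   # block 2 at t = 2
per_block = acs_schedule(deltas, k=2)
final = bubbling_union(per_block)
print(final)   # deeper blocks inherit the update steps of shallower ones
```

Note the trade-off this sketch makes explicit: bubbling enlarges deeper blocks' update sets (costing some speed) in exchange for stopping cached errors from compounding through the network.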
Bubbling Union Algorithm
We evaluate our method on the Pick-and-Release task in a real-world setting. In this task, the robot must grasp a soft bag whose diameter is approximately 80% of the gripper's jaw width. This scenario challenges the robot's real-time manipulation capabilities: it must maintain a stable operating posture and precisely coordinate the timing of the gripper's opening and closing at high speed, as the bag is prone to toppling during execution.
Comparisons of different acceleration methods on the Pick-and-Release task. BAC achieves high inference frequency with low end-to-end latency.