SpikePingpong: High-Frequency Spike Vision-based Robot Learning for Precise Striking in Table Tennis Game

1 State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University; 2 Beijing Academy of Artificial Intelligence (BAAI)
Equal Contribution   ✉ Corresponding Author  

Summary Video

Highlights

  • We propose SpikePingpong, a novel robot learning framework that decomposes the complex table tennis task into specialized interception and hitting phases, addressing the limitations of both traditional control-based and learning-based approaches.

  • We introduce a comprehensive neural control system combining SONIC, a calibration system for precise ball-racket interaction prediction, and IMPACT, an imitation learning framework that captures expert striking strategies through demonstrations to enable precise control over ball landing positions.

  • Experimental results demonstrate that SpikePingpong achieves a 91% success rate in the 30 cm accuracy zone and 71% in the high-precision 20 cm zone across four distinct target regions, surpassing previous state-of-the-art approaches by 38% and 37%, respectively.

Abstract

Learning to control high-speed objects in the real world remains a challenging frontier in robotics. Table tennis serves as an ideal testbed for this problem, demanding both rapid interception of fast-moving balls and precise adjustment of their trajectories. The task presents two fundamental challenges: it requires a high-precision vision system capable of accurately predicting ball trajectories, and it requires strategic planning to place the ball precisely in target regions. The dynamic nature of table tennis, coupled with its real-time response requirements, makes it well suited for advancing robotic control in fast-paced, precision-critical domains. In this paper, we present SpikePingpong, a novel system that integrates spike-based vision with imitation learning for high-precision robotic table tennis. Our approach introduces two key components that directly address these challenges: SONIC, a spike camera-based module that achieves millimeter-level precision in ball-racket contact prediction by compensating for real-world uncertainties such as air resistance and friction; and IMPACT, a strategic planning module that enables accurate ball placement in targeted table regions. The system uses a 20 kHz spike camera for high-temporal-resolution ball tracking, combined with efficient neural network models for real-time trajectory correction and stroke planning. Experimental results demonstrate that SpikePingpong achieves a 91% success rate for the 30 cm accuracy target area and 71% in the more challenging 20 cm accuracy task, surpassing previous state-of-the-art approaches by 38% and 37%, respectively. These improvements enable robust implementation of sophisticated tactical gameplay strategies, providing a new research perspective for robotic control in high-speed dynamic tasks.

Method Overview

Framework of SpikePingpong. Our system integrates three key components: (1) initial trajectory prediction using RGB-D camera data, (2) SONIC module for refined hittable position estimation through neuromorphic vision, and (3) IMPACT module for strategic motion planning and control. This comprehensive pipeline enables precise ball interception and tactical return placement.
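As a rough illustration of how the first two stages could fit together, the sketch below computes a drag-free ballistic estimate of where the ball crosses a hitting plane and then applies a correction offset. It is a minimal sketch under stated assumptions, not the system's implementation: the names `BallState`, `predict_hit_point`, and `refine_hit_point` are hypothetical, the table bounce is ignored, and the correction is simply passed in as numbers, whereas in SpikePingpong it would come from the learned SONIC module.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class BallState:
    """Ball position (m) and velocity (m/s); x points along the table (assumed frame)."""
    x: float
    y: float
    z: float
    vx: float
    vy: float
    vz: float


def predict_hit_point(state: BallState, x_hit: float) -> tuple[float, float, float]:
    """Return (time, y, z) at which a drag-free ballistic ball crosses the plane x = x_hit."""
    if state.vx == 0.0:
        raise ValueError("ball has no velocity toward the hitting plane")
    t = (x_hit - state.x) / state.vx
    if t < 0.0:
        raise ValueError("hitting plane is behind the ball")
    y = state.y + state.vy * t
    z = state.z + state.vz * t - 0.5 * G * t * t
    return t, y, z


def refine_hit_point(hit: tuple[float, float, float], dy: float, dz: float) -> tuple[float, float, float]:
    """Shift the predicted contact point by a correction (dy, dz).

    The offset stands in for a learned deviation estimate (assumption for illustration).
    """
    t, y, z = hit
    return t, y + dy, z + dz


state = BallState(x=0.0, y=0.0, z=0.3, vx=2.0, vy=0.2, vz=2.0)
coarse = predict_hit_point(state, x_hit=1.0)
refined = refine_hit_point(coarse, dy=0.002, dz=-0.003)
```

Separating the coarse physical estimate from the correction keeps the physics model simple and lets the correction absorb effects the model leaves out, such as air resistance.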

Ball-Racket Contact Precision Evaluation

Our SONIC module predicts the ball-racket deviation with a mean absolute error of 2.47 mm and a root mean square error of 3.13 mm. Spike camera captures confirm this visually: with SONIC activated, the separation between ball center and racket center at contact is substantially smaller. Millimeter-level prediction accuracy thus translates into better physical interception.
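For reference, the two error metrics reported above can be computed as in the sketch below. The function names and the sample error values are illustrative assumptions, not data from the paper.

```python
import math


def mean_absolute_error(errors_mm: list[float]) -> float:
    """Average magnitude of the prediction errors."""
    return sum(abs(e) for e in errors_mm) / len(errors_mm)


def root_mean_square_error(errors_mm: list[float]) -> float:
    """Square root of the mean squared error; penalizes large errors more than MAE."""
    return math.sqrt(sum(e * e for e in errors_mm) / len(errors_mm))


# Hypothetical signed prediction errors in millimeters (predicted minus measured).
sample_errors = [1.0, -2.0, 3.0, -4.0]
mae = mean_absolute_error(sample_errors)
rmse = root_mean_square_error(sample_errors)
```

RMSE is never smaller than MAE on the same data, which is consistent with the reported 3.13 mm versus 2.47 mm. The gap between the two grows when a few predictions are much worse than the rest.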

Empirical Assessment of Spatial Control Capabilities

We conducted comparative evaluations against baseline methods across four target regions (A, B, C, and D). SpikePingpong achieves an average success rate of 91% within the 30 cm accuracy zone and 71% within the 20 cm high-precision zone, demonstrating superior precision in tactical ball placement.
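A success rate of this kind can be tallied as in the sketch below. The geometry is an assumption made for illustration: a return counts as a success when its landing point lies within a given Euclidean distance of the target center. The page does not specify how the 30 cm and 20 cm zones are defined, and the function name and sample landing points are hypothetical.

```python
import math


def success_rate(
    landings: list[tuple[float, float]],
    target: tuple[float, float],
    tolerance_m: float,
) -> float:
    """Fraction of landing points within tolerance_m of the target center (assumed criterion)."""
    hits = sum(
        1
        for x, y in landings
        if math.hypot(x - target[0], y - target[1]) <= tolerance_m
    )
    return hits / len(landings)


# Hypothetical landing points in meters, relative to the target center at (0, 0).
landings = [(0.0, 0.0), (0.1, 0.1), (0.25, 0.0), (0.4, 0.0)]
rate_30 = success_rate(landings, (0.0, 0.0), 0.30)
rate_20 = success_rate(landings, (0.0, 0.0), 0.20)
```

Evaluating the same set of returns at both tolerances shows why the tighter zone always yields an equal or lower rate: every return that succeeds at 20 cm also succeeds at 30 cm.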

Sequential Target Execution Capabilities

To evaluate tactical sequence execution, we designed experiments involving 20 consecutive returns with random target sequences. SpikePingpong achieves an overall sequence success rate of 95%, demonstrating that our system maintains precision during extended tactical sequences and represents a significant advancement toward sustained strategic gameplay rather than merely returning balls.

Real-world Demos

We present real-world demonstration videos showing SpikePingpong executing precise shots to target regions A, B, C, and D, validating the system's tactical placement capabilities in actual gameplay scenarios. Additionally, we demonstrate our system's capabilities through a video of SpikePingpong engaging in a table tennis rally against a human player. The demonstration showcases the robot's ability to maintain sustained exchanges, accurately track and return incoming balls, and execute precise shot placements in a dynamic gameplay environment, validating our approach's effectiveness in practical human-robot interaction scenarios.


Target A


Target B


Target C


Target D


Human vs Robot



BibTeX

@article{wang2025spikepingpong,
    title={SpikePingpong: High-Frequency Spike Vision-based Robot Learning for Precise Striking in Table Tennis Game},
    author={Wang, Hao and Hou, Chengkai and Li, Xianglong and Fu, Yankai and Li, Chenxuan and Chen, Ning and Dai, Gaole and Liu, Jiaming and Huang, Tiejun and Zhang, Shanghang},
    journal={arXiv preprint arXiv:2506.06690},
    year={2025}
}