Yuhui Wang
I am currently a Postdoctoral Research Fellow at the Generative AI Center at King Abdullah University of Science and Technology (KAUST), working with Prof. Jürgen Schmidhuber. I received my Ph.D. (2022) and M.S. (2017) degrees from the College of Computer Science and Technology at Nanjing University of Aeronautics and Astronautics, supervised by Prof. Xiaoyang Tan.
My research interests focus on reinforcement learning, with an emphasis on long-horizon decision making, policy optimization, and efficient exploration. I study both the theoretical and algorithmic aspects of RL, including credit assignment under sparse rewards, stability of policy learning, and exploration–exploitation trade-offs. My work aims to develop scalable RL methods for complex sequential decision problems, with applications to intelligent agents and control systems.
Selected Publications
- Y. Dai*, Y. Wang*, D. R. Ashley, J. Schmidhuber. Efficient Morphology–Control Co-Design via Stackelberg PPO under Non-Differentiable Leader–Follower Interfaces. ICLR 2026. Link
- Y. Wang*, Q. Wu*, D. R. Ashley, F. Faccio, W. Li, C. Huang, J. Schmidhuber. Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning. ICML 2025. Link
- Q. Wu*, Y. Wang*, S. S. Zhan, Y. Wang, C. W. Lin, C. Lv, Q. Zhu, J. Schmidhuber, C. Huang. Directly Forecasting Belief for Reinforcement Learning with Delays. ICML 2025. Link
- Y. Wang*, W. Li*, F. Faccio, Q. Wu, J. Schmidhuber. Highway Value Iteration Networks. ICML 2024. Link
- Y. Wang, M. Strupl, F. Faccio, Q. Wu, H. Liu, M. Grudzień, X. Tan, J. Schmidhuber. Highway Reinforcement Learning. Preprint, 2023. Link
- Y. Wang, X. Tan. Deep Recurrent Belief Propagation Network for POMDPs. AAAI 2021. Link
- Y. Wang, H. He, X. Tan, Y. Gan. Trust Region-Guided Proximal Policy Optimization. NeurIPS 2019. Link
- Y. Wang*, H. He*, X. Tan. Truly Proximal Policy Optimization. UAI 2019 (Oral). Link