Personal Profile

I am currently a Postdoctoral Research Fellow in the AI Initiative at King Abdullah University of Science and Technology (KAUST), working with Professor Jürgen Schmidhuber. I received my Ph.D. (2021) and M.S. (2017) degrees from the College of Computer Science and Technology at Nanjing University of Aeronautics and Astronautics, where I was supervised by Professor Xiaoyang Tan.

My research focuses on reinforcement learning, especially multi-step off-policy learning and problems involving uncertainty or hidden information.

I serve as a reviewer for NeurIPS (2020, 2021), ICLR (2021, 2022), and ICML (2021).


  1. Y. Wang, H. He, X. Tan, Y. Gan. Trust Region-Guided Proximal Policy Optimization. 2019. NeurIPS.
  2. Y. Wang, X. Tan. Deep Recurrent Belief Propagation Network for POMDPs. 2021. AAAI.
  3. Y. Wang, H. He, X. Tan. Truly Proximal Policy Optimization. 2019. UAI.
  4. Y. Wang, X. Tan. Greedy Multi-step Off-Policy Reinforcement Learning. 2020. NeurIPS Workshop on Deep RL. Preprint.
