Xudong Yu

I am a fourth-year PhD student at the School of Astronautics, Harbin Institute of Technology (HIT).

I obtained both my Bachelor’s and Master’s degrees from HIT, in 2018 and 2020, respectively. My research has primarily focused on offline reinforcement learning, online adaptation, and task transfer. I aim to develop agents that can effectively leverage extensive offline data to adapt seamlessly to new environments while retaining prior knowledge.

In addition, I am interested in Preference-based Reinforcement Learning (PbRL), which eliminates the need for meticulously designing reward functions.

Recently, my research has shifted towards Large Language Model (LLM) alignment and Reinforcement Learning from Human Feedback (RLHF). I aspire to enhance the reasoning and continual learning capabilities of these models.

Email  /  CV  /  Google Scholar  /  Zhihu  /  GitHub

Research

I'm interested in topics including LLM alignment, diffusion models, offline RL, offline-to-online RL, transfer learning, and meta-learning.

Regularized Conditional Diffusion Model for Multi-Task Preference Alignment.
Xudong Yu, Chenjia Bai, Haoran He, Changhong Wang, Xuelong Li
Under Review
Arxiv

Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning.
Xiaoyu Wen, Chenjia Bai, Kang Xu, Xudong Yu, Yang Zhang, Xuelong Li, Zhen Wang
ICML 2024
Arxiv

Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness.
Xiaoyu Wen, Xudong Yu*, Rui Yang, Chenjia Bai, Zhen Wang
Under Review
Arxiv

Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning.
Changhong Wang, Xudong Yu*, Chenjia Bai, Zhen Wang
Science China Information Sciences
paper

Lightweight Uncertainty for Offline Reinforcement Learning via Bayesian Posterior.
Xudong Yu, Chenjia Bai, Hongyi Guo, Changhong Wang, Zhen Wang, Xuelong Li
Under Review
paper

Self-Supervised Imitation for Offline Reinforcement Learning With Hindsight Relabeling.
Xudong Yu, Chenjia Bai, Changhong Wang, Dengxiu Yu, C.L. Philip Chen, Zhen Wang
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023
paper

Curriculum Goal-conditioned Self-imitation for Offline Reinforcement Learning.
Xiaoyun Feng, Li Jiang, Xudong Yu, Haoran Xu, Xiaoyan Sun, Jie Wang, Xianyuan Zhan, Wai Kin (Victor) Chan
IEEE Transactions on Games, 2022
paper

Multi-Sensor Fusion Localization with Factor Graphs for UGVs Under Adverse Conditions.
Xudong Yu, Changhong Wang
IEEE International Conference on Unmanned Systems (ICUS), 2021
paper