Hello! I am Haoran Xu, a 2nd-year Ph.D. student at UT Austin, advised by Prof. Amy Zhang. I was previously at AIR, Tsinghua University, JD.com, and MSRA, working with Xianyuan Zhan and Zheng Yu.
My research aims at super-human AGI through reinforcement learning. I am currently interested in:
- (Dual-RL) Rethinking the dual formulation of reinforcement learning: is this in-sample learning paradigm all we need?
- (GenRL) How can we leverage the power of generative models for decision making?
- (LLM+RL) How to enable better training and finetuning of LLMs using RL?
Some links: GitHub / Twitter / Google Scholar / haoran.xu@utexas.edu
News
- One paper (Diffusion-DICE) on Dual-RL and GenRL is accepted to NeurIPS 2024.
- I will start my summer internship at MSR (New York), working with Alex Lamb and John Langford.
- 🇦🇹 I am attending ICLR 2024 in person in Vienna.
- One paper (ODICE) on Dual-RL is accepted to ICLR 2024 as a spotlight.
- 🇺🇸 I am attending NeurIPS 2023 in person in New Orleans.
- One paper (OMIGA) on offline multi-agent RL is accepted to NeurIPS 2023.
- Starting my PhD at UT Austin.
- 🇷🇼 I am attending ICLR 2023 in person in Kigali.
- One paper (IVR) on offline RL is accepted to ICLR 2023 as an oral.
- Honored to be selected as a Top Reviewer at NeurIPS 2022.
- One paper (POR) on offline RL is accepted to NeurIPS 2022 as an oral.
- One paper (DWBC) on offline IL is accepted to ICML 2022.
Publications (* marks equal contribution)
- Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning NeurIPS 2024 2024 Paper | Code | Website
- PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning Preprint 2024 Paper | Code
- ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update ICLR 2024 (Spotlight, Top 5%) 2024 Paper | Code | Thread
- Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization NeurIPS 2023 2023 Paper | Code
- SaFormer: A Conditional Sequence Modeling Approach to Offline Safe Reinforcement Learning ICLR 2023 SR4AD Workshop 2023 Paper
- Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization ICLR 2023 (Notable Top 5%) 2023 Paper | Code
- Mind the Gap: Offline Policy Optimization for Imperfect Rewards ICLR 2023 2023 Paper | Code
- When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning ICLR 2023 2023 Paper | Code
- A Policy-Guided Imitation Approach for Offline Reinforcement Learning NeurIPS 2022 (Oral, Top 2%) 2022 Paper | Code | Slides | Media
- Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations ICML 2022 2022 Paper | Code | Slides
- Constraints Penalized Q-Learning for Safe Offline Reinforcement Learning AAAI 2022 (Spotlight @ ICML 2021 RL4RealLife workshop) 2022 Paper | Code | Slides
- DeepThermal: Combustion Optimization for Thermal Power Generating Units Using Offline Reinforcement Learning AAAI 2022 (Spotlight @ ICML 2021 RL4RealLife workshop) 2022 Paper | Code
- Discriminator-Guided Model-Based Offline Imitation Learning CoRL 2022 2022 Paper
- Model-Based Offline Planning with Trajectory Pruning IJCAI 2022 2022 Paper | Code
- ECoalVis: Visual Analysis of Control Strategies in Coal-fired Power Plants IEEE VIS 2022 2022 Paper | Code
- Multi-Memory enhanced Separation Network for Indoor Temperature Prediction DASFAA 2022 2022 Paper
- Offline Reinforcement Learning with Soft Behavioral Regularization NeurIPS 2021 Offline RL Workshop 2021 Paper | Code
- Robust Spatio-Temporal Purchase Prediction via Deep Meta Learning AAAI 2021 2021 Paper
Professional Services
Reviewer for ICLR, ICML, NeurIPS