Yuxiong Wang's Homepage

About Me

Make things as simple as possible, but not simpler

– Albert Einstein

Welcome to my homepage. I am an Assistant Professor of Computer Science at the University of Illinois Urbana-Champaign (UIUC).

My research lies in computer vision, machine learning, and robotics, with a specific focus on open-world perception, meta-learning, multi-modal learning, generative modeling, and agent learning.

Before joining Illinois CS, I was a postdoctoral fellow in the Robotics Institute at Carnegie Mellon University, advised by Prof. Martial Hebert. I was a visitor to the Center for Data Science at New York University, working with Prof. Jean Ponce. I obtained my Ph.D. under the supervision of Prof. Martial Hebert in the Robotics Institute. I have also worked closely with Prof. Deva Ramanan and Prof. Ruslan Salakhutdinov. I received the Best Paper Honorable Mention Award for streaming perception at ECCV 2020 and was a Best Paper Award Finalist at CVPR 2019 and 2022. I have been recognized as a Notable Area Chair at ICLR 2023 and as an Expert Reviewer for TMLR. I was selected to participate in the National Academy of Engineering’s (NAE) Frontiers of Engineering symposium.

Please feel free to contact me if you are interested!

  • I am always seeking self-motivated Ph.D., M.S., and undergraduate students. 
  • Our group also has openings for visiting students.

Current Research Topics

  • Meta-learning and learning to learn
  • Open-world, multi-modal, few-shot, and self-supervised learning
  • 3D vision
  • Generative modeling, predictive learning
  • Human motion and human-object interaction modeling
  • Reinforcement learning, robot/agent learning
  • In-the-wild applications in robotics, autonomous driving, agriculture, materials science, chemistry, healthcare, etc.

Dissertation


Talks

One of our primary research efforts focuses on bridging generative and discriminative learning, enabling autonomous agents to perceive, interact, and act in the open world. For representative work, please refer to my two recent talks: the first, presented at the C3.ai Generative AI Workshop, elaborates on how we ground generative modeling in 3D and 4D; the second, given at the London Machine Learning Meetup, explores techniques that leverage LLMs and generative visual models for perception and decision-making.


Students

PhD Students


Recent Publications

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Reasoning

Yunze Man, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Liang-Yan Gui, Yu-Xiong Wang

arXiv, 2024.

[Website] [PDF]

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang, Liang-Yan Gui, Yu-Xiong Wang

arXiv, 2024.

[Website] [PDF]

InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction

Sirui Xu, Ziyin Wang, Yu-Xiong Wang, Liang-Yan Gui

arXiv, 2024.

[Website] [PDF]

AlignDiff: Aligning Diffusion Models for General Few-Shot Segmentation

Ri-Zhao Qiu, Yu-Xiong Wang*, Kris Hauser*

ECCV, 2024 (Oral).

[PDF] [Code]

Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models

Andy Zhou, Kai Yan, Michal Shlapentokh-Rothman, Haohan Wang, Yu-Xiong Wang

ICML, 2024.

[Website] [PDF] [Code]

Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching

Kai Yan, Alex Schwing, Yu-Xiong Wang

ICML, 2024.

[Website] [PDF] [Code]

ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories

Qianlan Yang, Yu-Xiong Wang

ICML, 2024.

[Website] [PDF]

Aligning Large Multimodal Models with Factually Augmented RLHF

Zhiqing Sun, Sheng Shen, Shengcao Cao, Haotian Liu, Chunyuan Li, Yikang Shen, Chuang Gan, Liangyan Gui, Yu-Xiong Wang, Yiming Yang, Kurt Keutzer, Trevor Darrell

ACL Findings, 2024.

[Website] [PDF] [Code]

Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models

Zhipeng Bao, Yijun Li, Krishna Kumar Singh, Yu-Xiong Wang, Martial Hebert

SIGGRAPH, 2024.

[PDF]

Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion

Linzhan Mou*, Jun-Kun Chen*, Yu-Xiong Wang

CVPR, 2024.

[Website] [PDF] [Poster] [Video]

ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing

Jun-Kun Chen, Samuel Rota Bulò, Norman Müller, Lorenzo Porzi, Peter Kontschieder, Yu-Xiong Wang

CVPR, 2024.

[Website] [PDF] [Poster] [Video]

TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding

Zhihao Zhang, Shengcao Cao, Yu-Xiong Wang

CVPR, 2024.

[Website] [PDF] [Code] [Video]

Situational Awareness Matters in 3D Vision Language Reasoning

Yunze Man, Liangyan Gui, Yu-Xiong Wang

CVPR, 2024.

[Website] [PDF] [Code] [Video]

Restricted Memory Banks Improve Video Object Segmentation: A Revisit

Junbao Zhou, Ziqi Pang, Yu-Xiong Wang

CVPR, 2024.

[Website] [PDF] [Code] [Video]

Region Representations Revisited

Michal Shlapentokh-Rothman*, Ansel Blume*, Yao Xiao, Yuqun Wu, Sethurame TV, Heyi Tao, Jae Yong Lee, Wilfredo Torres, Yu-Xiong Wang, Derek Hoiem

CVPR, 2024.

[Website] [PDF] [Code] [Poster] [Video]

More Publications


Teaching