Research

Our research focuses on embodied AI, multimodal learning, and robotics. Selected publications and preprints are listed below.

Egocentric Instruction-oriented Affordance Prediction via Large Multimodal Model

Bokai Ji, Jie Gu, Xiaokang Ma, Chu Tang, Jingmin Chen, Guangxia Li

Under Review
arXiv

Generic Token Compression in Multimodal Large Language Models from an Explainability Perspective

Lei Lei, Jie Gu, Xiaokang Ma, Chu Tang, Jingmin Chen, Tong Xu

Under Review
arXiv

Stimulating Imagination: Towards General-purpose "Something Something Placement"

Jianyang Wu, Jie Gu, Xiaokang Ma, Fangzhou Qiu, Chu Tang, Jingmin Chen

IROS 2025 (Accepted)
arXiv

Gaussian Sequences: Self-bootstrapping Dynamic Gaussian Reconstruction from Monocular Casual Videos

Under Review

QCDT: Efficiently Optimize and Deploy Detection Transformers via Query Compression

Under Review