DECO: Decoupled Multimodal Diffusion Transformer for Bimanual Dexterous Manipulation with a Plugin Tactile Adapter

February 5, 2026 · Xukun Li, Yu Sun, Lei Zhang, Bosheng Huang, Yibo Peng, Yuan Meng, Haojun Jiang, Shaoxuan Xie, Guocai Yao, Alois Knoll, Zhenshan Bing, Xinlong Wang, Zhenguo Sun · 1 min read
Abstract
Bimanual dexterous manipulation relies on integrating multimodal inputs to perform complex real-world tasks. We propose DECO, a decoupled multimodal diffusion transformer that disentangles vision, proprioception, and tactile signals through specialized conditioning pathways, with a lightweight adapter for parameter-efficient injection of additional signals. Alongside DECO, we release DECO-50, a dataset for bimanual dexterous manipulation with tactile sensing that consists of 50 hours of data and over 5M frames. Experimental results show that DECO achieves a 72.25% average success rate, a 21% improvement over the baseline.
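The figures quoted in the abstract can be unpacked with a little arithmetic. The sketch below is only a reading aid: it assumes the 5M frames form a single stream when estimating an average frame rate, and because the abstract does not say whether the 21% improvement is absolute (percentage points) or relative, it computes the implied baseline under both readings. All function names are illustrative and not taken from the paper.

```python
# Figures quoted in the abstract.
HOURS = 50               # dataset duration in hours
MIN_FRAMES = 5_000_000   # "over 5M frames", so treat this as a lower bound
SUCCESS = 72.25          # DECO average success rate, in percent
GAIN = 21.0              # reported improvement over the baseline, in percent


def implied_frame_rate(frames: int, hours: float) -> float:
    """Average frames per second, assuming all frames form one stream."""
    return frames / (hours * 3600)


def baseline_if_absolute(success: float, gain_points: float) -> float:
    """Baseline success rate if the gain is in percentage points."""
    return success - gain_points


def baseline_if_relative(success: float, gain_percent: float) -> float:
    """Baseline success rate if the gain is relative to the baseline."""
    return success / (1 + gain_percent / 100)


if __name__ == "__main__":
    print(f"frame rate: at least {implied_frame_rate(MIN_FRAMES, HOURS):.1f} frames/s")
    print(f"baseline (absolute reading): {baseline_if_absolute(SUCCESS, GAIN):.2f}%")
    print(f"baseline (relative reading): {baseline_if_relative(SUCCESS, GAIN):.2f}%")
```

Under these assumptions the dataset averages at least about 27.8 frames per second, and the implied baseline is 51.25% under the absolute reading or about 59.71% under the relative one.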
Type
Publication
arXiv preprint
Released on arXiv:

DECO is a decoupled multimodal diffusion transformer for bimanual dexterous manipulation that:

  • Disentangles multimodal inputs (vision, proprioception, tactile) through specialized conditioning pathways
  • Features a lightweight tactile adapter for parameter-efficient injection of tactile signals
  • Achieves a 72.25% average success rate, a 21% improvement over the baseline
  • Introduces the DECO-50 dataset, with 50 hours of data and over 5M frames
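To make the idea of a lightweight, plug-in tactile adapter more concrete, here is a minimal sketch of one common pattern for parameter-efficient injection: a bottleneck adapter whose output projection starts at zero, so that attaching it leaves the host model's hidden state unchanged until training updates it. This page does not describe DECO's adapter internals, so the class name `TactileAdapter`, its dimensions, and the zero-initialization choice are all assumptions made for illustration, not the authors' confirmed design.

```python
import random


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def relu(vector):
    """Elementwise ReLU."""
    return [max(0.0, v) for v in vector]


class TactileAdapter:
    """Hypothetical bottleneck adapter: hidden + up(relu(down(tactile)))."""

    def __init__(self, tactile_dim, hidden_dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection: small random weights, shape (bottleneck, tactile_dim).
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(tactile_dim)]
                     for _ in range(bottleneck)]
        # Up-projection starts at zero (an assumption), so the adapter is
        # initially a no-op on the hidden state. Shape (hidden_dim, bottleneck).
        self.up = [[0.0] * bottleneck for _ in range(hidden_dim)]

    def __call__(self, hidden, tactile):
        # Residual injection of the tactile signal into the hidden state.
        delta = matvec(self.up, relu(matvec(self.down, tactile)))
        return [h + d for h, d in zip(hidden, delta)]

    def num_params(self):
        """Count the adapter's weights (no biases in this sketch)."""
        return (len(self.down) * len(self.down[0])
                + len(self.up) * len(self.up[0]))


if __name__ == "__main__":
    adapter = TactileAdapter(tactile_dim=6, hidden_dim=8, bottleneck=2)
    hidden = [0.5] * 8
    tactile = [1.0] * 6
    # With the zero-initialized up-projection, the hidden state passes through.
    print(adapter(hidden, tactile) == hidden)
    print(adapter.num_params())
```

The bottleneck keeps the number of added weights small relative to the host model, and the zero-initialized residual means the adapter can be attached to an already trained policy without changing its behavior before fine-tuning.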