ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills

Jiayuan Gu*, Fanbo Xiang*, Xuanlin Li, Zhan Ling, Xiqiang Liu, Tongzhou Mu, Yihe Tang, Stone Tao, Xinyue Wei, Yunchao Yao, Xiaodi Yuan, Pengwei Xie, Zhiao Huang, Rui Chen, Hao Su
International Conference on Learning Representations (ICLR) 2023
PDF Challenge Project Code

ManiSkill2 is a unified benchmark for learning generalizable robotic manipulation skills, powered by SAPIEN. It features 20 out-of-the-box task families with 2000+ diverse object models and 4M+ demonstration frames. Moreover, it enables fast visual-input learning: a CNN-based policy can collect samples at about 2000 FPS with 1 GPU and 16 processes on a workstation. The benchmark can be used to study a wide range of algorithms: 2D & 3D vision-based reinforcement learning, imitation learning, sense-plan-act, etc.
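The parallel sample collection behind the throughput number can be sketched in plain Python. This is a minimal illustration, not the ManiSkill2 API: the environment here is a stand-in that emits random observations, and the names `rollout_worker` and `collect_samples` are hypothetical. Each worker process steps its own environment copy and sends transitions back to a shared queue, which is how multi-process collectors keep a single learner fed.

```python
import multiprocessing as mp
import random

def rollout_worker(worker_id, num_steps, queue):
    """Step a stand-in environment and send (worker_id, obs, reward) samples back."""
    rng = random.Random(worker_id)  # per-worker seed for reproducibility
    for _ in range(num_steps):
        obs = [rng.random() for _ in range(4)]  # fake observation vector
        reward = rng.random()                   # fake reward
        queue.put((worker_id, obs, reward))
    queue.put(None)  # sentinel: this worker is done

def collect_samples(num_workers=4, steps_per_worker=100):
    """Gather samples from all workers, mimicking a multi-process collector."""
    queue = mp.Queue()
    workers = [
        mp.Process(target=rollout_worker, args=(i, steps_per_worker, queue))
        for i in range(num_workers)
    ]
    for w in workers:
        w.start()
    samples, done = [], 0
    while done < num_workers:
        item = queue.get()
        if item is None:
            done += 1
        else:
            samples.append(item)
    for w in workers:
        w.join()
    return samples

if __name__ == "__main__":
    batch = collect_samples(num_workers=4, steps_per_worker=100)
    print(len(batch))  # prints 400
```

In practice the workers would run the real simulator and the learner would batch observations through the policy network on the GPU; the queue-and-sentinel pattern stays the same.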

Close the Optical Sensing Domain Gap by Physics-Grounded Active Stereo Sensor Simulation

Xiaoshuai Zhang, Rui Chen, Ang Li, Fanbo Xiang, Yuzhe Qin, Jiayuan Gu, Zhan Ling, Minghua Liu, Peiyu Zeng, Songfang Han, Zhiao Huang, Tongzhou Mu, Jing Xu, Hao Su
IEEE Transactions on Robotics (T-RO) 2023
PDF Project Doc

SAPIEN Realistic Depth narrows the sim-to-real gap between simulated depth and real active stereovision depth sensors through a fully physics-grounded simulation pipeline. Perception and RL methods trained in simulation transfer well to the real world without any fine-tuning. The pipeline can also estimate an algorithm's real-world performance, greatly reducing the human effort of algorithm evaluation.
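At its core, a stereo depth sensor recovers depth by triangulation: depth = focal length × baseline / disparity. A toy example with made-up camera parameters (not the paper's sensor model) shows the relationship:

```python
import numpy as np

# Toy stereo triangulation; all numbers are illustrative, not a real sensor spec.
focal_px = 600.0       # focal length in pixels
baseline_m = 0.055     # distance between the two stereo cameras, in meters
disparity_px = np.array([66.0, 33.0, 16.5])  # matched pixel shifts for three points

depth_m = focal_px * baseline_m / disparity_px  # triangulated depth per point
# depth_m == [0.5, 1.0, 2.0]: larger disparity means the point is closer
```

The physics-grounded simulation in the paper goes much further, modeling the projected IR pattern, imaging noise, and the stereo-matching algorithm itself, so that the simulated depth exhibits the same artifacts as the real sensor.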

ManiSkill: Learning-from-Demonstrations Benchmark for Generalizable Manipulation Skills

Tongzhou Mu*, Zhan Ling*, Fanbo Xiang*, Derek Yang*, Xuanlin Li*, Stone Tao, Zhiao Huang, Zhiwei Jia, Hao Su
NeurIPS 2021 Datasets and Benchmarks Track
PDF Challenge Video Code

Learning to manipulate unseen objects from 3D visual inputs is crucial for robots to achieve task automation. See how we build the SAPIEN Manipulation Skill Benchmark and collect many demonstrations without human labelling. ManiSkill supports object-level variation by utilizing a rich and diverse set of articulated objects, and each task is carefully designed for learning manipulation skills on a single category of objects.

OCRTOC: A Cloud-Based Competition and Benchmark for Robotic Grasping and Manipulation

Ziyuan Liu, Wei Liu, Yuzhe Qin, Fanbo Xiang, Songyan Xin, Maximo A Roa, Berk Calli, Hao Su, Yu Sun, Ping Tan
IEEE Robotics and Automation Letters (RA-L)
PDF Challenge

We propose a cloud-based benchmark for robotic grasping and manipulation, specifically table organization tasks. With the OCRTOC benchmark, we aim to lower the barrier to conducting reproducible research on robotic grasping and to accelerate progress in this field. Using this benchmark we held a competition at IROS 2020, in which 59 teams from around the world took part.

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning

Kaichun Mo, Yuzhe Qin, Fanbo Xiang, Hao Su, Leonidas J. Guibas
Conference on Robot Learning (CoRL) 2021
PDF Code

In contrast to the vast literature on agent-object interaction, very few past works have studied object-object interaction, which also plays an important role in downstream robotic manipulation and planning tasks. In this paper, we propose a large-scale, annotation-free object-object affordance learning framework for diverse object-object interaction tasks.

MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo

Anpei Chen, Zexiang Xu, Fuqiang Zhao, Xiaoshuai Zhang, Fanbo Xiang, Jingyi Yu, Hao Su
ICCV 2021
PDF Code

We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct neural radiance fields for view synthesis. Our approach leverages plane-swept cost volumes (widely used in multi-view stereo) for geometry-aware scene reasoning, and combines this with physically based volume rendering for neural radiance field reconstruction.
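The physically based volume rendering that MVSNeRF (like NeRF) builds on reduces to a compositing sum along each camera ray: each sample's color is weighted by its opacity times the transmittance of everything in front of it. A minimal numpy sketch, where the densities and colors are toy values rather than network outputs:

```python
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Alpha-composite color samples along one ray.

    sigmas: (N,) volume densities at the N samples, front to back
    colors: (N, 3) RGB values at the samples
    deltas: (N,) distances between consecutive samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)        # opacity of each sample
    trans = np.cumprod(1.0 - alphas)               # transmittance after each sample
    trans = np.concatenate(([1.0], trans[:-1]))    # light reaching sample i
    weights = trans * alphas                       # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0) # final ray color

# Toy example: a dense red sample in front of a green one.
sigmas = np.array([5.0, 5.0])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
deltas = np.array([0.5, 0.5])
rgb = composite_ray(sigmas, colors, deltas)
# The front (red) sample dominates because it absorbs most of the light.
```

MVSNeRF's contribution is how the per-sample densities and colors are predicted, from a cost volume built across input views, rather than the compositing step itself.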

NeuTex: Neural Texture Mapping for Volumetric Neural Rendering

Fanbo Xiang, Zexiang Xu, Miloš Hašan, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Hao Su
Conference on Computer Vision and Pattern Recognition (CVPR) 2021, Oral
PDF Project Website

We present an approach that explicitly disentangles geometry, represented as a continuous 3D volume, from appearance, represented as a continuous 2D texture map. We achieve this by introducing a 3D-to-2D texture mapping (or surface parameterization) network into volumetric representations.
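The idea of a 3D-to-2D texture mapping can be illustrated with a fixed analytic parameterization in place of NeuTex's learned network. Here a spherical mapping (`sphere_uv`, a hypothetical stand-in) sends 3D surface points to UV coordinates, and appearance is looked up in an ordinary 2D texture:

```python
import numpy as np

def sphere_uv(points):
    """Map 3D points on a sphere to 2D texture coordinates.
    A fixed spherical parameterization standing in for NeuTex's learned network."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    u = (np.arctan2(y, x) / (2 * np.pi)) % 1.0  # longitude -> [0, 1)
    v = np.arccos(np.clip(z / np.linalg.norm(points, axis=1), -1, 1)) / np.pi
    return np.stack([u, v], axis=1)

def sample_texture(texture, uv):
    """Nearest-neighbor lookup into an (H, W, 3) texture map."""
    h, w, _ = texture.shape
    ij = (uv[:, ::-1] * [h - 1, w - 1]).round().astype(int)  # (v, u) -> (row, col)
    return texture[ij[:, 0], ij[:, 1]]

# Toy texture: top half red, bottom half blue.
tex = np.zeros((4, 4, 3))
tex[:2, :, 0] = 1.0
tex[2:, :, 2] = 1.0
pts = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])  # north / south pole
colors = sample_texture(tex, sphere_uv(pts))
# north pole samples the red region, south pole the blue region
```

Because appearance lives in a plain 2D map, it can be edited with standard image tools and re-rendered, which is the practical payoff of the disentanglement.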

SAPIEN: A SimulAted Part-based Interactive ENvironment

Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel Chang, Leonidas Guibas, Hao Su
Conference on Computer Vision and Pattern Recognition (CVPR) 2020, Oral
PDF Code Project Website My contribution

We constructed a PhysX-based simulation environment using the PartNet-Mobility dataset to support household robotics tasks. I am the project leader and worked on the following: (1) built the web interface for annotating the articulated-object dataset; (2) developed a high-level simulator backed by PhysX; (3) implemented OpenGL rasterization and OptiX ray tracing for scene rendering; (4) benchmarked the motion prediction task with ResNet50/PointNet++ on RGB-D/point-cloud inputs.

Copyright © 2020, Fanbo Xiang. Website generated with org-mode and Jekyll.
Please feel free to use my designs however you like. Code is available here.