
Learning from Synthetic Human Group Activities

Authors

Che-Jui Chang (Rutgers University), Danrui Li (Rutgers University), Deep Patel (NEC Laboratories), Parth Goel (Rutgers University), Honglu Zhou (NEC Laboratories), Seonghyeon Moon (Rutgers University), Samuel S. Sohn (Rutgers University), Sejong Yoon (The College of New Jersey), Vladimir Pavlovic (Rutgers University), Mubbasir Kapadia (Roblox)

Venue

Computer Vision and Pattern Recognition (CVPR), 2024

Abstract

The study of complex human interactions and group activities has become a focal point in human-centric computer vision. However, progress in related tasks is often hindered by the difficulty of obtaining large-scale labeled datasets from real-world scenarios. To address this limitation, we introduce M3Act, a synthetic data generator for multi-view, multi-group, multi-person human atomic actions and group activities. Powered by the Unity Engine, M3Act features multiple semantic groups, highly diverse and photorealistic images, and a comprehensive set of annotations, which together facilitate the learning of human-centered tasks across single-person, multi-person, and multi-group conditions. We demonstrate the advantages of M3Act in three core experiments. The results suggest that our synthetic dataset can significantly improve the performance of several downstream methods and can replace real-world datasets to reduce cost. Notably, M3Act improves the state-of-the-art MOTRv2 on the DanceTrack dataset, leading to a jump on the leaderboard from 10th to 2nd place. Moreover, M3Act opens up new research directions in controllable 3D group activity generation, for which we define multiple metrics and propose a competitive baseline. Our code and data are available at our project page.
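For context, the sketch below illustrates the usage pattern the abstract describes: augmenting a small real-world dataset with a large synthetic one when training a downstream model. This is a minimal sketch assuming a PyTorch setup; the dataset classes and sizes here are hypothetical placeholders, not M3Act's actual loaders or API.

```python
# Minimal sketch of mixing synthetic and real training data.
# Assumes PyTorch; PlaceholderDataset is a hypothetical stand-in
# for both the M3Act renders and a real-world dataset.
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class PlaceholderDataset(Dataset):
    """Stand-in loader that yields (image, annotation) pairs."""

    def __init__(self, num_samples: int):
        self.num_samples = num_samples

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, idx: int):
        image = torch.zeros(3, 224, 224)        # dummy RGB frame
        target = {"boxes": torch.zeros(0, 4)}   # dummy box annotations
        return image, target


synthetic = PlaceholderDataset(num_samples=10_000)  # e.g., synthetic renders
real = PlaceholderDataset(num_samples=1_000)        # e.g., real footage

# Train the downstream model on the union of both sources.
loader = DataLoader(
    ConcatDataset([synthetic, real]),
    batch_size=8,
    shuffle=True,
    collate_fn=lambda batch: tuple(zip(*batch)),
)

for images, targets in loader:
    pass  # forward/backward pass of the downstream model goes here
```

Simply concatenating the two sources is the most basic mixing strategy; in practice one might instead pretrain on synthetic data and fine-tune on real data, or reweight the sampling ratio between the two.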