Periodic Reporting for period 2 - 3DInAction (Understanding human action from unstructured 3D point clouds using deep learning methods)
Reporting period: 2023-01-01 to 2023-12-31
During the first period of this project, we proposed seven novel methods for point cloud processing and action understanding. Three of these have been accepted and published at CVPR, IROS, and Computers & Graphics; another three have been submitted and are currently under review; and one is in preparation for submission to ICCV in March.
Details of each paper:
* GoferBot - we proposed, constructed, and evaluated a human-robot collaborative system for the task of assembling furniture, published and presented at IROS. In this system, a human was tasked with assembling a piece of IKEA furniture, while a robot arm equipped with a Kinect camera had to infer the current human action, predict the next action, and retrieve the next assembly piece. We conducted a thorough evaluation of the system's individual components and of the system as a whole. Surprisingly, although the system performed objectively faster than command-based alternatives, humans perceived the collaboration as less fluent.
* DiGS - we proposed a novel divergence-guided neural implicit representation for point clouds, published at CVPR 2022. In this project, we take point clouds as input and train a neural network to represent a signed distance function whose zero level set is the surface from which the points were sampled (a minimal sketch of this training objective appears after this list). We demonstrated improved performance over existing state-of-the-art methods, particularly for unoriented point clouds.
* CloudWalker - we proposed a novel method for representing point clouds using random walks, published in Computers & Graphics 2022 (see the walk-extraction sketch after this list).
* IKEA Assembly in the Wild (IAW) dataset - we collected a dataset of YouTube IKEA assembly videos and annotated them with an alignment to the diagrams of the instruction manuals. We proposed a novel method to solve this alignment problem using contrastive learning (a hypothetical sketch of such an objective follows this list). Submitted and currently under review.
* GraVoS - we proposed a novel method to improve point cloud detection methods using gradient-based selection. Submitted and currently under review.
* OG surface reconstruction - we proposed a novel method for unoriented point cloud surface reconstruction guided by an octree data structure. Submitted and currently under review.
* IKEA Ego dataset - we collected and annotated a dataset of human assembly actions from an egocentric perspective using a HoloLens 2 headset. The dataset includes point clouds, RGB, depth, view, and hand-tracking information, alongside action label annotations. The paper will be submitted to ICCV in March.
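To make the DiGS description above concrete, the following is a minimal, simplified sketch of fitting a network as a signed distance function to a point cloud. The architecture, sampling scheme, and loss weights are illustrative placeholders rather than the exact DiGS formulation; in particular, the divergence term is only a schematic stand-in for the paper's divergence-guided regularization.

```python
# Minimal sketch (not the exact DiGS implementation): fit an MLP f(x) as a
# signed distance function to a point cloud. Loss terms: (1) points on the
# surface should satisfy f(x) = 0, (2) an Eikonal term encourages
# |grad f| = 1, and (3) a divergence term penalizes div(grad f), i.e. the
# Laplacian of f. All weights and sizes are illustrative assumptions.
import torch
import torch.nn as nn

sdf = nn.Sequential(
    nn.Linear(3, 256), nn.Softplus(beta=100),
    nn.Linear(256, 256), nn.Softplus(beta=100),
    nn.Linear(256, 1),
)

def gradient(f_vals, x):
    # df/dx via autograd; keep the graph so second derivatives can be taken
    return torch.autograd.grad(f_vals, x, torch.ones_like(f_vals),
                               create_graph=True)[0]

def digs_style_loss(points):
    # points: (N, 3) samples from the input point cloud
    # random off-surface samples in a [-1, 1]^3 bounding volume
    domain = (torch.rand_like(points) * 2.0 - 1.0).requires_grad_(True)

    f_surf, f_dom = sdf(points), sdf(domain)
    g_dom = gradient(f_dom, domain)

    manifold = f_surf.abs().mean()                      # f = 0 on the surface
    eikonal = ((g_dom.norm(dim=-1) - 1.0) ** 2).mean()  # |grad f| = 1
    # divergence of grad f (the Laplacian of f), penalized over the domain
    lap = sum(gradient(g_dom[:, i], domain)[:, i] for i in range(3))
    divergence = lap.abs().mean()

    return manifold + 0.1 * eikonal + 0.01 * divergence
```

At training time, one would minimize this loss over batches sampled from the input cloud and then extract the zero level set of the learned function, e.g. with marching cubes.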
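For CloudWalker, the core ingredient is a random walk over the point cloud. The sketch below is only a generic illustration of that idea, not the actual CloudWalker walk policy or feature extraction: it repeatedly steps to a nearby unvisited point using a k-nearest-neighbor lookup.

```python
# Generic sketch of extracting a random walk from a point cloud by repeatedly
# stepping to a nearby unvisited point. The walk length, neighborhood size,
# and dead-end handling are placeholder assumptions for illustration only.
import numpy as np
from scipy.spatial import cKDTree

def random_walk(points, length=32, k=8, seed=None):
    rng = np.random.default_rng(seed)
    tree = cKDTree(points)
    current = rng.integers(len(points))
    walk, visited = [current], {current}
    for _ in range(length - 1):
        # candidate next steps: the k nearest neighbors of the current point
        _, nbrs = tree.query(points[current], k=k + 1)
        candidates = [i for i in nbrs[1:] if i not in visited]  # skip self
        if not candidates:  # dead end: jump to any unvisited point
            candidates = [i for i in range(len(points)) if i not in visited]
            if not candidates:
                break
        current = int(rng.choice(candidates))
        walk.append(current)
        visited.add(current)
    return points[walk]  # (length, 3) ordered sequence of 3D coordinates
```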
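Finally, the IAW video-to-manual alignment relies on contrastive learning. As a hypothetical illustration of that general idea (the encoders producing the embeddings, the batch construction, and the temperature below are placeholder assumptions, not the actual method), an InfoNCE-style objective pulls each video-clip embedding toward its matching manual-diagram embedding and away from the other diagrams in the batch:

```python
# Hypothetical sketch of a contrastive (InfoNCE-style) alignment objective
# between video-clip embeddings and manual-diagram embeddings. The encoders
# that produce these embeddings and all hyperparameters are placeholders.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(clip_emb, diagram_emb, temperature=0.07):
    # clip_emb, diagram_emb: (B, D); row i of each forms a matching pair
    clip_emb = F.normalize(clip_emb, dim=-1)
    diagram_emb = F.normalize(diagram_emb, dim=-1)
    logits = clip_emb @ diagram_emb.t() / temperature  # (B, B) similarities
    targets = torch.arange(clip_emb.size(0), device=clip_emb.device)
    # symmetric cross-entropy: match clips to diagrams and vice versa
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```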
Exploitation: I have been contacted by IKEA digital lab representatives, and we had several meetings to discuss possible collaborations. Nothing concrete has materialized yet, but I expect it to become viable in the near future, as our research interests align well with their work.
Furthermore, I contacted Mobileye and met with some of their representatives to present our work. We discussed possibilities for collaboration that have not yet materialized. Additionally, several start-up companies have expressed interest in my work, but we are still at a very early stage of communication and I do not expect these contacts to materialize into a collaboration.
Dissemination and outreach: I co-organized the 2021 and 2022 Robotic Vision Summer Schools in Australia. Additionally, I created the Talking Papers Podcast, where I host authors of seminal papers in the fields of computer vision and machine learning. Furthermore, I presented our work at CVPR 2022 and Israel Vision Day 2021, and attended ECCV 2022 and Israel Vision Day 2022 in person.
The socio-economic and wider societal implications of the project are expected to be substantial if and when the methods are integrated into cutting-edge vision systems, for example in autonomous cars or wearable headsets. The knowledge generated by this research provides the foundation for such applications to flourish in dynamic human environments.