During the first year of the project, we analyzed in depth the state of the art in video data augmentation for action recognition. We published a survey on video data augmentation for deep learning models in the journal Future Internet. This review shows that data augmentation is moving from methods based on basic image transformations toward more complex generative and simulation-based models.
In consultation with a professional personal trainer, we selected 11 gentle gymnastic exercises to use in the data collection and test phases of our project. The image named “motions.png” depicts each of the 11 exercises.
We collected a small real-life video dataset in our laboratory, which we subsequently expanded by generating a larger synthetic dataset using our Synthetic Video Generator. Twenty subjects were recorded in a controlled environment inside our lab to generate the training/validation set. Video recordings of three additional subjects were used to create the test set; the videos in the test set were collected outdoors, under variable conditions. The image named “Collected.png” displays some example frames extracted from the collected dataset. The collected datasets are accessible online.
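The subject-wise split described above (20 subjects for training/validation, 3 different subjects for testing) can be sketched as follows. This is a minimal illustration, not the project's actual tooling: the function name `split_by_subject`, the file-naming scheme, and the number of clips per subject are all assumptions made for the example.

```python
def split_by_subject(videos, test_subjects):
    """Split (subject_id, path) pairs so that no subject appears in both sets."""
    train_val = [v for v in videos if v[0] not in test_subjects]
    test = [v for v in videos if v[0] in test_subjects]
    return train_val, test


# Hypothetical catalogue: 23 subjects with 2 clips each (file names are invented).
videos = [(s, f"subject{s:02d}_clip{c}.mp4") for s in range(1, 24) for c in range(2)]

# Subjects 21-23 stand in for the three held-out test subjects.
train_val, test = split_by_subject(videos, test_subjects={21, 22, 23})
```

Splitting by subject rather than by clip keeps every person in the test set unseen during training, so test accuracy reflects generalization to new people.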
We developed a video generator capable of producing synthetic videos of a subject performing specific actions. To test its capability, we augmented the previously collected video dataset by generating a new synthetic video dataset. Each video was generated by randomizing the avatar appearance, background image, scene illumination, animation speed, and camera position and orientation. The image named “GeneratorExamples.png” displays some examples of generated frames. The synthetic video generator and other utility scripts are accessible online.
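The per-video randomization listed above can be illustrated with a short sketch. This is not the generator's code (the generator itself is built with Unity); it only shows, in Python, the idea of drawing one random configuration per video. The class `SceneConfig`, its field names, and every numeric range are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneConfig:
    """Hypothetical per-video settings; names and ranges are illustrative only."""
    avatar_id: int
    background: str
    light_intensity: float
    animation_speed: float
    camera_position: tuple
    camera_yaw_deg: float


def sample_scene_config(rng, n_avatars=10, backgrounds=("bg_0.png", "bg_1.png")):
    """Draw one random scene configuration from assumed uniform ranges."""
    return SceneConfig(
        avatar_id=rng.randrange(n_avatars),
        background=rng.choice(backgrounds),
        light_intensity=rng.uniform(0.5, 1.5),
        animation_speed=rng.uniform(0.8, 1.2),
        camera_position=(
            rng.uniform(-1.0, 1.0),  # lateral offset
            rng.uniform(1.0, 2.0),   # height
            rng.uniform(2.0, 4.0),   # distance from the avatar
        ),
        camera_yaw_deg=rng.uniform(-30.0, 30.0),
    )


# A seeded generator makes the sampled configurations reproducible.
rng = random.Random(0)
configs = [sample_scene_config(rng) for _ in range(5)]
```

Each sampled configuration would then drive the rendering of one synthetic video; seeding the random generator allows the same synthetic dataset to be regenerated later.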
Both the collected and the augmented datasets were tested on state-of-the-art action recognition models (CNN-based and Transformer-based), and we presented the results in the paper “Synthetic data augmentation for video action classification using Unity”. The paper also contains implementation details of the synthetic data generator. At the moment, the paper is under review for publication in the journal “Transactions on Pattern Analysis and Machine Intelligence.”
We implemented the final system on the Nao robot. For the conversational module, we used a question-answering model based on ChatGPT. To monitor the actions of the subjects, we used the TimeSformer model trained on our augmented dataset. We tested the system with 9 subjects and asked them to complete a survey about their experience interacting with the robotic coach. We are now finishing a paper that presents the final system, the results obtained during the test session, and the analysis of the surveys.
Regarding the dissemination of the project, we created and published the project webpage (https://drvcoach.unica.it), which gathers the most useful information about the project. We gave several interviews to national and local media; a collection of them can be found at https://drvcoach.unica.it/news-1.html. We presented the project at the "SHARPER European Researcher's Night 2022" held at the Cagliari Botanical Garden. We also presented the project in a book chapter titled "Sensor Datasets for Human Daily Safety and Well-being", published by Springer in September 2023.