Periodic Reporting for period 1 - Econ-ML (Econometric Machine Learning for better Heterogeneity Representation)
Reporting period: 2022-10-01 to 2024-09-30
The project aimed to bridge this gap by combining the strengths of machine learning (ML) with the robust theoretical foundation of econometrics. Specifically, it sought to develop hybrid modelling frameworks that integrate ML techniques, such as Variational Autoencoders (VAEs), into econometric models such as Latent Class Choice Models (LCCMs) and Mixed Logit Models. These hybrid approaches were designed to enhance traditional behavioural choice models by improving out-of-sample generalisation, generating synthetic data, imputing missing data, and providing a more accurate representation of heterogeneity, all while maintaining interpretability consistent with economic theory.
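As a minimal sketch of the econometric building block mentioned above, the snippet below computes choice probabilities for a standard latent class choice model: class membership shares weight the logit probabilities of each class. It does not reproduce the project's VAE-based models. The function names, the inputs, and the numbers are hypothetical and chosen only for illustration.

```python
import math

def softmax(utilities):
    """Turn a list of utilities into logit probabilities (numerically stable)."""
    peak = max(utilities)
    weights = [math.exp(u - peak) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def lccm_choice_probabilities(class_scores, class_utilities):
    """Mix class-specific logit probabilities using class membership shares.

    class_scores: one membership score per latent class (hypothetical input;
        in a hybrid variant these could depend on a learned latent encoding).
    class_utilities: for each class, a list of utilities, one per alternative.
    """
    shares = softmax(class_scores)
    n_alternatives = len(class_utilities[0])
    mixed = [0.0] * n_alternatives
    for share, utilities in zip(shares, class_utilities):
        probs = softmax(utilities)
        for j in range(n_alternatives):
            mixed[j] += share * probs[j]
    return mixed

# Illustrative numbers only: two classes, three alternatives (e.g. car, bus, rail).
example = lccm_choice_probabilities(
    class_scores=[0.0, 0.0],
    class_utilities=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
)
```

Here the two classes have equal shares and opposite preferences, so the mixed probabilities for the first and third alternatives coincide. Heterogeneity enters through the class-specific utilities, which is the part that remains directly interpretable.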
By applying these advanced methodologies to real-world transport data, the project aimed to generate new insights into travellers’ behaviour and contribute to the broader field of transport modelling. The project’s goal was to improve the quality and scalability of decision-support tools for policymakers and transport planners. The developed models can also be applied beyond transportation, in fields such as marketing, finance, economics, healthcare, and environmental economics, where understanding and predicting human behaviour are equally critical.
- Hybrid Model Development: Two hybrid machine learning and discrete choice models were conceptualised and implemented: a Variational Autoencoder Latent Class Choice Model (VAE-LCCM), which integrates deep generative modelling with class-based segmentation, and a Variational Autoencoder Mixed Logit Model (VAE-MXL), which combines machine learning's generative power with the flexibility of Mixed Logit models. Results showed that both models can enhance traditional behavioural choice models by generating synthetic data, imputing missing data, and providing a more accurate representation of heterogeneity. Furthermore, they can improve the goodness-of-fit and out-of-sample generalisation of traditional discrete choice models while maintaining their behavioural and economic interpretability.
- Data Integration and Analysis: A large-scale smart card dataset was matched and integrated with national travel survey data from Denmark to analyse and quantify reporting errors in travel surveys. This integration showcased the complementary nature of the two data sources, particularly in the context of large-scale public transport networks. It also provided valuable insights into improving the quality and reliability of travel survey data, which is essential for enhancing the performance of both econometric and machine learning models and for supporting more effective and informed decision-making in transport planning.
Another study compared traditional choice set generation methods with choice sets derived from empirical smart card data in multimodal public transport networks. The findings underlined the value of smart card data for multimodal public transport route choice models, which are key tools for policy simulation and planning.
- Computational Efforts: The project used Bayesian inference techniques and variational methods to overcome computational challenges associated with large-scale modelling.
- Dissemination: The project outputs were shared through peer-reviewed publications, international conference presentations, and seminars, reaching both academic and practitioner audiences.
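To illustrate the data integration activity listed above, the following sketch pairs survey-reported trips with smart card tap-ins and measures the gap in departure time as a simple reporting error. It is only an assumption-laden example: the record layout, the linkage by a shared traveller identifier, the 30-minute window, and the sample records are all hypothetical and are not taken from the project's matching procedure.

```python
from datetime import datetime, timedelta

def match_reported_trips(survey_trips, card_taps, window_minutes=30):
    """Pair each reported trip with the nearest tap-in by the same traveller.

    Both inputs are lists of (traveller_id, departure_time) tuples.
    Returns (traveller_id, error_minutes) for each matched trip, where a
    positive error means the reported time is later than the tap-in.
    Trips with no tap-in inside the window are left unmatched.
    """
    window = timedelta(minutes=window_minutes)
    matches = []
    for traveller_id, reported in survey_trips:
        candidates = [t for cid, t in card_taps if cid == traveller_id]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda t: abs(t - reported))
        if abs(nearest - reported) <= window:
            error = (reported - nearest).total_seconds() / 60.0
            matches.append((traveller_id, error))
    return matches

# Made-up records for illustration only.
survey = [("A", datetime(2023, 3, 1, 8, 0)), ("B", datetime(2023, 3, 1, 17, 30))]
taps = [("A", datetime(2023, 3, 1, 7, 52)), ("A", datetime(2023, 3, 1, 16, 5)),
        ("B", datetime(2023, 3, 1, 19, 0))]
errors = match_reported_trips(survey, taps)
```

In this toy case traveller A's reported departure is matched to a tap-in eight minutes earlier, while traveller B's trip stays unmatched because the closest tap-in falls outside the window. A real matching would also need to prevent one tap-in from being assigned to several trips.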