Publication Detail

Usage-Aware Representation Learning for Critical Information Identification in Transportation Networks

UCD-ITS-RP-24-11

Journal Article

3 Revolutions Future Mobility Program

Suggested Citation:
Sun, Ran and Yueyue Fan (2024) Usage-Aware Representation Learning for Critical Information Identification in Transportation Networks. Institute of Transportation Studies, University of California, Davis, Journal Article UCD-ITS-RP-24-11

Extracting meaningful information from noisy, high-dimensional data is attracting increasing attention as richer and higher-resolution data is being collected and used for transportation system planning and management. Discovering critical information via effective data representation learning not only helps reduce data dimension but also enables a deeper understanding of the underlying properties of noisy data, which could in turn lead to better planning and operations decisions. In this study, we present a new perspective: unlike most existing approaches in the general data science literature, the design of a data representation should go beyond the data itself and incorporate an understanding of how the data is used in domain-specific applications. We further argue that this design philosophy is particularly important for transportation data because of the high spatial correlations induced by network interdependence. We propose a usage-aware representation learning framework that incorporates the information loss for the downstream application into the data encoding-decoding process. The proposed approach is formulated as a Stiefel manifold optimization problem. The effectiveness of the proposed framework is demonstrated in two network applications: modeling transportation network flows and estimating network-level vehicular emissions. The performance of the representation learned by our approach is compared with that of existing approaches in multiple evaluation contexts, including data reconstruction quality, clustering, anomaly detection, and critical information identification, through case studies on the Sioux Falls, Boston, and San Jose networks. The consistently good performance of our approach in these experiments indicates the importance of incorporating downstream data usage into the process of data representation learning.
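
As a rough illustration of the idea described in the abstract, and not the authors' actual formulation, the sketch below assumes a linear encoder W with orthonormal columns (a point on the Stiefel manifold), a hypothetical linear downstream map A, and a weight lam on the downstream loss. It minimizes reconstruction error plus downstream error by Riemannian gradient descent with a QR retraction and a simple backtracking step. All names (usage_aware_loss, fit_stiefel, A, lam) and the synthetic data are assumptions made for this example.

```python
import numpy as np


def usage_aware_loss(W, X, M):
    """Reconstruction loss plus downstream loss for a linear encoder W.

    With residual R = X - X W W^T, the value is
    ||R||_F^2 + lam * ||R A^T||_F^2 = trace(R M R^T), where M = I + lam * A^T A.
    """
    R = X - X @ W @ W.T
    return float(np.trace(R @ M @ R.T))


def euclidean_grad(W, X, M):
    """Gradient of usage_aware_loss with respect to W, ignoring the constraint."""
    R = X - X @ W @ W.T
    return -2.0 * (X.T @ R @ M @ W + M @ R.T @ X @ W)


def fit_stiefel(X, A, k, lam=1.0, iters=200, seed=0):
    """Riemannian gradient descent on the Stiefel manifold {W : W^T W = I_k}."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    M = np.eye(d) + lam * A.T @ A
    W, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random orthonormal start
    loss = usage_aware_loss(W, X, M)
    history = [loss]
    step = 1.0
    for _ in range(iters):
        G = euclidean_grad(W, X, M)
        WtG = W.T @ G
        rgrad = G - W @ (WtG + WtG.T) / 2.0  # project onto tangent space at W
        # Backtracking: shrink the step until the retracted point lowers the loss.
        while step > 1e-12:
            W_new, _ = np.linalg.qr(W - step * rgrad)  # QR retraction
            new_loss = usage_aware_loss(W_new, X, M)
            if new_loss < loss:
                break
            step *= 0.5
        else:
            break  # no improving step found; treat as converged
        W, loss = W_new, new_loss
        history.append(loss)
        step *= 2.0  # allow the step to grow again
    return W, history


# Synthetic stand-ins: 50 samples of 6 hypothetical link-level features,
# and a hypothetical linear downstream map with 2 outputs.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 6))
A = rng.standard_normal((2, 6))
W, history = fit_stiefel(X, A, k=2)
```

Setting lam to 0 reduces the objective to plain reconstruction error, so under these assumptions the downstream term is the only thing that separates this sketch from a usage-agnostic linear encoder.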


Key words:

representation learning, transportation networks, high-dimensional data, data reconstruction and classification, data-usage-aware