Publication Detail

A Survey of Transformer Architectures for Autonomous Driving

UCD-ITS-RP-25-106

Journal Article

Suggested Citation:
Chu, Fulin, Haoyu Li, Lili Xie, Jingyuan Zhao (2025) A Survey of Transformer Architectures for Autonomous Driving. Expert Systems with Applications 229 Part D

Transformers have emerged as a foundational paradigm in autonomous driving, enabling high-capacity modeling of complex, multimodal, and dynamic environments. Their self-attention mechanisms, scalability, and sequence modeling capabilities support spatial–temporal reasoning and long-range dependency capture. Although Transformers are increasingly adopted in core modules (such as perception, trajectory prediction, decision-making, and anomaly detection), a system-level survey of their architectural evolution and deployment challenges remains lacking. This paper presents a structured survey of Transformer models in autonomous driving, introducing a task-oriented taxonomy that spans object detection, sensor fusion, trajectory forecasting, motion planning, and intent prediction. We compare Transformer architectures with traditional deep learning models (e.g., CNNs, RNNs), highlighting advantages in global context modeling, multimodal alignment, and unified representations across heterogeneous inputs (camera, LiDAR, radar, HD maps). Beyond current applications, this work examines large-scale, end-to-end Transformer systems and their potential as foundation models for autonomous driving. We analyze design patterns involving chain-of-thought reasoning, neuro-symbolic integration, federated learning, and privacy-preserving edge deployment. Case studies from industry leaders (e.g., Tesla, Baidu, NVIDIA, Aurora) illustrate practical trade-offs and architectural adaptations. Despite progress, challenges remain in achieving real-time efficiency, robustness in open-world scenarios, interpretability in sequential decision-making, and integration with cost-sensitive sensors. We identify research gaps and propose directions in scalable Transformer design, explainable AI, and policy-aware planning. This survey aims to guide researchers and practitioners at the intersection of AI and intelligent transportation, supporting the development of interpretable, efficient, and generalizable Transformer-based autonomous driving systems.


Key words:

transformer, autonomous driving, perception, decision-making, multimodal