Passer à la navigation principale Passer à la recherche Passer au contenu principal

Foundation models for autonomous driving: A comprehensive survey

  • École de technologie supérieure
  • Mediterranean Institute of Technology
  • King Abdulaziz University

Résultats de recherche: Contribution à un journalBrève enquêteRevue par des pairs

Résumé

Large Language Models (LLMs) have showcased remarkable proficiency in various information-processing tasks. They excel at data extraction, literature summarization, content generation, predictive modeling, decision-making, and system control. Moreover, Vision-Language Models (VLMs) and Multimodal LLMs (MLLMs), collectively referred to in this work as Cross-modal Language Models (XLMs), integrate multiple data modalities with language understanding, thereby advancing Autonomous Driving Systems (ADS). On the implemented Artificial Intelligence (AI) side, we analyze core techniques such as prompt engineering, supervised fine-tuning, reinforcement learning from human feedback, knowledge distillation, quantization and pruning, and safety alignment/verification, together with edge-aware deployment strategies. On the application of AI side, we map XLMs capabilities to the driving stack, including perception, prediction, planning, control, and human–machine interaction/vehicle-to-everything, and summarize how XLMs improve scene understanding, intent forecasting, decision-making, and closed-loop control by coupling natural-language reasoning with multimodal sensory inputs, such as panoramic images, Light Detection and Ranging (LiDAR), and radar. In this survey, we synthesize the state of XLMs for ADS: we review the relevant literature on ADS and XLMs, including their architectures, tools, and frameworks. We then compare deployment approaches across the driving stack and summarize datasets, simulators, and benchmarks for both open- and closed-loop evaluation. Finally, we analyze key challenges, such as grounding and hallucination, long-tail robustness, real-time and resource constraints, safety alignment and verification, and data governance and privacy, and outline research directions toward safe, efficient, and trustworthy XLM-enabled ADS.

langue originaleAnglais
Numéro d'article114805
journalEngineering Applications of Artificial Intelligence
Volume176
Les DOIs
étatPublié - 15 juil. 2026

Empreinte digitale

Voici les principaux termes ou expressions associés à « Foundation models for autonomous driving: A comprehensive survey ». Ces libellés thématiques sont générés à partir du titre et du résumé de la publication. Ensemble, ils forment une empreinte digitale unique.

Contient cette citation