Press Release

China VLA Large Model Applications in Automotive and Robotics Research Report 2025 | Technical Origin, Development Stages, Application Cases and Core Characteristics – ResearchAndMarkets.com

DUBLIN–(BUSINESS WIRE)–The “VLA Large Model Applications in Automotive and Robotics Research Report, 2025” report has been added to ResearchAndMarkets.com’s offering.

‘VLA Large Model Applications in Automotive and Robotics Research Report, 2025’

  • The report summarizes and analyzes the technical origin, development stages, application cases and core characteristics of VLA large models.
  • It outlines 8 typical VLA implementation solutions and typical VLA large models in the intelligent driving and robotics fields, and summarizes 4 major trends in VLA development.
  • It analyzes the VLA application solutions in the field of intelligent driving of companies such as Li Auto, XPeng Motors, Chery Automobile, Geely Automobile, Xiaomi Auto, DeepRoute.ai, Baidu, Horizon Robotics, SenseTime, NVIDIA, and iMotion.
  • It catalogs more than 40 large model frameworks or solutions, including general robot foundation models, multimodal large models, data generalization models, VLM models, VLN models, VLA models, and robot world models.
  • It analyzes the large models and VLA large model application solutions of companies such as AgiBot, Galbot, Robot Era, Estun, Unitree, UBTECH, Tesla Optimus, Figure AI, Apptronik, Agility Robotics, XPeng IRON, Xiaomi CyberOne, GAC GoMate, Chery Mornine, Leju Robotics, LimX Dynamics, AI Robotics, and X Square Robot.

The concept of VLA was quickly picked up by automobile companies and applied to automotive intelligent driving. If ‘end-to-end’ was the hottest term in the intelligent driving field in 2024, ‘VLA’ is set to be the hottest term of 2025. Companies such as XPeng Motors, Li Auto, and DeepRoute.ai have released their respective VLA solutions.

Recently, research teams from McGill University, Tsinghua University, Xiaomi Corporation, and the University of Wisconsin-Madison jointly released a comprehensive review article on VLA models in the field of autonomous driving, ‘A Survey on Vision-Language-Action Models for Autonomous Driving’. The article divides the development of VLA into four stages: Pre-VLA (VLM as explainer), Modular VLA, End-to-end VLA and Augmented VLA, clearly showing the characteristics of each stage and the gradual evolution of the technology.

Over 100 robot VLA models are exploring different technical paths

Compared with automotive applications of VLA large models, which involve tens of billions of parameters and nearly 1,000 TOPS of computing power, the robotics field is more modest: dedicated AI computing chips are still optional, and the scale of training data sets is mostly between 1 million and 3 million. There is also ongoing debate over mixing real data with simulated synthetic data, and over the choice of technical routes.

One reason is that there are hundreds of millions of cars on the road, while the number of actually deployed robots is very small. Another important reason is that robot VLA models focus on exploring the microscopic world: compared with the grand automotive world model, robot application scenarios involve richer multimodal perception, more complex execution actions, and more fine-grained sensor data.

There are more than 100 VLA models and related data sets in the robotics field, new papers are constantly emerging, and various teams are exploring different paths.

Exploration 1: VTLA framework integrating tactile perception

In May 2025, research teams from the Institute of Automation of the Chinese Academy of Sciences, Samsung Beijing Research Institute, Beijing Academy of Artificial Intelligence (BAAI), and the University of Wisconsin-Madison jointly released a paper on VTLA for insertion manipulation tasks. The research shows that integrating visual and tactile perception is crucial for robots performing contact-intensive, high-precision manipulation tasks. By combining visual, tactile, and language inputs with a temporal enhancement module and a preference learning strategy, VTLA outperforms traditional imitation learning methods and single-modal models on contact-intensive insertion tasks.
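
To make this kind of architecture concrete, the following is a minimal, illustrative Python (PyTorch) sketch of fusing visual, tactile, and language inputs into a single action prediction. Every name, dimension, and the fusion scheme itself (a toy policy with a small Transformer over modality tokens) is an assumption chosen for illustration; it is not the paper's actual VTLA implementation.

    # Illustrative sketch only: NOT the actual VTLA implementation from the paper.
    import torch
    import torch.nn as nn

    class ToyVTLAPolicy(nn.Module):
        def __init__(self, embed_dim=256, action_dim=7):
            super().__init__()
            # Each modality gets its own encoder projecting into a shared embedding space.
            self.vision_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(embed_dim))
            self.tactile_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(embed_dim))
            self.language_encoder = nn.LazyLinear(embed_dim)
            # A small Transformer fuses the language token with per-timestep
            # vision/tactile tokens (a stand-in for temporal context modeling).
            layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
            self.fusion = nn.TransformerEncoder(layer, num_layers=2)
            # Action head, e.g. a 6-DoF pose delta plus a gripper command.
            self.action_head = nn.Linear(embed_dim, action_dim)

        def forward(self, images, tactile, instruction_emb):
            # images: (B, T, C, H, W); tactile: (B, T, num_taxels); instruction_emb: (B, D_text)
            B, T = images.shape[:2]
            vis = self.vision_encoder(images.flatten(0, 1)).view(B, T, -1)
            tac = self.tactile_encoder(tactile.flatten(0, 1)).view(B, T, -1)
            lang = self.language_encoder(instruction_emb).unsqueeze(1)   # (B, 1, D)
            tokens = torch.cat([lang, vis, tac], dim=1)                  # (B, 1 + 2T, D)
            fused = self.fusion(tokens)
            # Read the next action off the fused language token.
            return self.action_head(fused[:, 0])

    # Example: a batch of 2 trajectories, 4 timesteps of 64x64 RGB images and 16-taxel readings.
    policy = ToyVTLAPolicy()
    action = policy(torch.randn(2, 4, 3, 64, 64), torch.randn(2, 4, 16), torch.randn(2, 512))
    print(action.shape)  # torch.Size([2, 7])

The preference-learning strategy mentioned above, which the paper combines with this kind of multimodal fusion, is omitted here to keep the sketch short.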

Exploration 2: VLA model supporting multi-robot collaborative operation

In February 2025, Figure AI released Helix, a general Embodied AI model. Helix can run collaboratively across humanoid robots, enabling two robots to cooperate on a shared, long-horizon manipulation task. In the demonstration video, Figure AI’s robots showed smooth collaboration in a fruit-handling task: the robot on the left pulled the fruit bowl over, the robot on the right placed the fruit in it, and the robot on the left then returned the bowl to its original position.

Figure AI emphasized that this only touches ‘the surface of possibilities’, and the company is eager to see what happens when Helix is scaled up 1,000 times. According to Figure AI, Helix runs entirely on embedded low-power GPUs and can be commercially deployed immediately.

Exploration 3: Offline on-device VLA models in the robotics field

In June 2025, Google released Gemini Robotics On-Device, a multimodal VLA large model that runs locally and offline on embodied robots. The model processes visual input and natural language instructions and outputs actions, maintaining stable operation even without a network connection.

Particularly noteworthy are the model’s adaptability and versatility: Google pointed out that Gemini Robotics On-Device is the first robot VLA model to open fine-tuning to developers, enabling them to adapt the model to their specific needs and application scenarios.

VLA robots are already deployed in large numbers in automobile factories

When the macro world model of automobiles is integrated with the micro world model of robots, the true era of Embodied AI will arrive.

As Embodied AI enters the VLA stage of development, automobile companies enjoy natural first-mover advantages. The Tesla Optimus, XPeng Iron, and Xiaomi CyberOne robots draw heavily on their makers’ experience in intelligent driving, sensor technology, and machine vision, integrating technical accumulation from the intelligent driving field. The XPeng Iron robot, for example, is equipped with XPeng Motors’ AI Hawkeye vision system, end-to-end large model, Tianji AIOS, and Turing AI chip.

At the same time, automobile factories are currently the main application scenario for these robots. Tesla Optimus robots are mainly used in Tesla’s battery workshops. Apptronik is cooperating with Mercedes-Benz, and its Apollo robots have entered Mercedes-Benz factories to take part in car manufacturing, handling physical tasks such as material transport and assembly. At the model level, Apptronik has established a strategic partnership with Google DeepMind, and Apollo has integrated Google’s Gemini Robotics VLA large model.

On July 18, UBTECH released a hot-swappable autonomous battery replacement system for its Walker S2 humanoid robot, enabling the robot to replace its own battery in 3 minutes without manual intervention.

According to public reports, many car companies, including Tesla, BMW, Mercedes-Benz, BYD, Geely Zeekr, Dongfeng Liuzhou Motor, Audi FAW, FAW Hongqi, SAIC-GM, NIO, XPeng, Xiaomi, and BAIC Off-Road Vehicle, have deployed humanoid robots in their automobile factories. Humanoid robots from Figure AI, Apptronik, UBTECH, AI Robotics, and Leju are widely used in links such as automobile and parts production and assembly, logistics and transportation, equipment inspection, and factory operation and maintenance. In the near future, AI robots will be the main ‘labor force’ in ‘unmanned factories’.

Key Topics Covered:

Chapter 1 Overview of VLA Large Models

  • Basic Definition of VLA (Vision-Language-Action Model)
  • Origin and Evolution of VLA Technology
  • Classification of VLA Large Model Methods
  • Four Stages of VLA Model Development in Autonomous Driving
  • VLA Solution Application
  • Case 1: Enhancement of VLA Generalization
  • Case 2: VLA Computational Overhead
  • Core Characteristics of VLA
  • Challenges in VLA Technology Development

Chapter 2 VLA Technical Architecture, Solutions and Trends

  • Analysis of VLA Core Technical Architecture
  • VLA Decision Core – Chain-of-Thought (CoT) Technology
  • Overview of VLA Large Model Implementation Solutions
  • Solution Based on Classic Transformer Structure
  • Solution Based on Pre-trained LLM/VLM
  • Solution Based on Diffusion Model
  • LLM + Diffusion Model Solution
  • Video Generation + Inverse Kinematics Solution
  • Explicit End-to-End VLA Solution
  • Implicit End-to-End VLA Solution
  • Hierarchical End-to-End VLA Solution
  • Summary of Intelligent Driving VLA Models
  • Summary of Embodied AI VLA Models
  • VLA Development Trend

Chapter 3 VLA Large Model Application in the Automotive Field

  • Li Auto
  • XPeng Motors
  • Chery Automobile
  • Geely
  • Xiaomi Auto
  • DeepRoute.ai
  • Baidu Apollo
  • Horizon Robotics
  • SenseTime
  • NVIDIA
  • iMotion

Chapter 4 Progress of Large Models in the Robotics Field

  • Beijing Academy of Artificial Intelligence (BAAI)
  • SenseTime
  • Manycore Tech
  • Peking University
  • Renmin University
  • AgiBot
  • Unitree
  • Shanghai Jiao Tong University
  • Beijing Innovation Center of Humanoid Robotics
  • Figure AI
  • OpenAI
  • Noematrix
  • Galbot
  • Google (Gemini Robotics)
  • Shanghai AI Lab
  • DAMO Academy (Alibaba DAMO Academy)

Chapter 5 VLA Application Cases in the Robotics Field

  • AgiBot
  • Galbot
  • Robot Era
  • Estun
  • Unitree
  • UBTECH
  • Tesla (Optimus)
  • Figure AI
  • Apptronik
  • Google (Gemini Robotics, via Apollo Robot)
  • Agility Robotics
  • XPeng (IRON)
  • Xiaomi (CyberOne)
  • GAC (GoMate)
  • Mornine
  • Leju Robotics
  • LimX Dynamics
  • Quectel
  • AI Robotics
  • Meituan

For more information about this report visit https://www.researchandmarkets.com/r/ethvma

About ResearchAndMarkets.com

ResearchAndMarkets.com is the world’s leading source for international market research reports and market data. We provide you with the latest data on international and regional markets, key industries, the top companies, new products and the latest trends.

Contacts

ResearchAndMarkets.com

Laura Wood, Senior Press Manager

[email protected]

For E.S.T Office Hours Call 1-917-300-0470

For U.S./ CAN Toll Free Call 1-800-526-8630

For GMT Office Hours Call +353-1-416-8900
