National Data Administration: Driving the development of embodied intelligence with improved data engineering

2026-06-02

On May 31, the National Data Administration announced that its director, Liu Liehong, had recently told the 2026 World Intelligent Industry Expo that high-quality datasets are an important foundation for the "perception, decision-making, and execution" of embodied intelligence, and that the development of embodied intelligence must be driven by sound data engineering and systematic practice. Since the beginning of this year, there has been a series of policy developments concerning high-quality datasets, and an industrial ecosystem centered on them is taking shape. Experts say that the construction of high-quality datasets has gradually shifted from "advocating construction" to "building according to standards, piloting according to mechanisms, and promoting according to systems," and that the industry's development is expected to accelerate further.
On driving data supply with industrial applications, Liu Liehong said: "2026 will be the year of 'data element value release.' The National Data Administration will launch the 'Implementation Plan for Promoting the Construction of High-Quality Industry Datasets,' focusing on six major actions: strengthening the foundation and expanding capacity, tackling annotation challenges, improving quality and efficiency, empowering applications, providing management services, and releasing value. The plan will focus on using artificial intelligence to serve the needs of industrial development, driving data supply with industrial applications, and driving the intelligent development of industry with data, so that the 'data flywheels' of various industries turn more effectively."
Liu Liehong stated that high-quality datasets are the fundamental resource and innovation engine for the intelligent upgrading of advanced manufacturing, empowering artificial intelligence innovation and development through data.
Data from real production lines, equipment operation, and quality inspection should be systematically collected, managed, and used, so as to better support industry models and intelligent agents in understanding industrial mechanisms, adapting to industrial scenarios, and optimizing industrial processes. Investment in high-quality industry datasets should be increased, model-data resonance promoted, and the deep integration of data, models, equipment, and scenarios facilitated.
High-quality datasets are an important foundation for the "perception, decision-making, and execution" of embodied intelligence. Liu Liehong stated that the ability of embodied intelligence to adapt autonomously and execute tasks in real environments relies on high-quality multimodal training data spanning vision, touch, and audio. It is necessary to drive the development of embodied intelligence with sound data engineering and to carry out systematic practice.
High-quality datasets are the key support for accelerating the development of AI for Science. Liu Liehong stated that scientific research places higher demands on the accuracy, standardization, and credibility of data. High-quality datasets are not only the foundation for model training, pattern discovery, and result verification in science, but also the key support for moving basic research toward industrial application and making AI for Science a practical reality.
Since the beginning of this year, there have been many new developments in the field of high-quality datasets. On April 15, the National Data Administration released the "Implementation Plan for Promoting the Construction of High-Quality Industry Datasets (Draft for Comments)" and opened it for public comment.
The Ministry of Industry and Information Technology and the National Data Administration recently issued a joint notice on carrying out the 2026 "Model-Data Resonance" action, which promotes synergy and resonance between artificial intelligence models and data resources. The notice proposes that a virtuous cycle of "data-model-scenario-application" be basically formed by the end of 2026, promoting high-level empowerment of new industrialization by artificial intelligence.
At the industry platform level, the National Dataset Management Service Platform was released on April 29 and began trial operation, providing public service capabilities that cover the entire lifecycle of datasets. As of May 31, 516 certified institutions had released 1,350 datasets covering key areas such as agriculture, industrial manufacturing, transportation, and cultural tourism. As of the first quarter of this year, more than 116,000 high-quality datasets had been built nationwide, with a total volume of more than 960 PB. As of March this year, the average daily number of token calls in China had exceeded 140 trillion.
Since the beginning of this year, many regions have actively responded with their own proposals for building high-quality datasets. The "Special Action Plan for the Construction of High-Quality Industry Datasets in Shandong Province," issued by the Shandong Provincial Big Data Bureau, states that by the end of 2026, about 2 specialized datasets will be built in 16 key industries such as industrial manufacturing and transportation; by the end of 2027, a total of 50 high-quality datasets will be built. The plan also sets out specific requirements for strengthening public data supply, accelerating enterprise data development, improving the matching of data supply and demand, and developing the data annotation industry.
In addition, to thoroughly implement the national deployment on improving the data efficiency of state-owned enterprises, the Guangdong Provincial Government Service and Data Management Bureau, together with the Guangdong Provincial State-owned Assets Supervision and Administration Commission, recently officially launched the Guangdong Provincial High-Quality Data Efficiency Improvement Action for State-Owned Enterprises.
Zong Jianshu, Chief Analyst of the Computer Industry at Changjiang Securities, stated that China's large model industry continues to develop rapidly. Datasets are the fundamental resource for training and optimizing large models, and their quality and diversity directly affect model performance and effectiveness. High-quality datasets, as a key means of production for the industrialization of artificial intelligence, are expected to become the core hub connecting industry scenarios, model training, intelligent agent applications, and data value release. The construction of high-quality datasets has gradually shifted from "advocating construction" to "building according to standards, piloting according to mechanisms, and promoting according to systems," and industrial development is expected to accelerate further.
According to a research report by Jishi Information, the large-scale construction of high-quality datasets will further drive rapid growth in three billion-dollar software sub-sectors: the construction and servicing of high-quality industry datasets; industry knowledge graphs and intelligent agent knowledge bases; and synthetic data generation and data privacy protection platforms. This will inject new growth momentum into China's software industry. (Looking into the New Era)

Editor: Momo    Responsible editor: Chen Zhaozhao

Source: Xinhua News Agency

