Accelerate the construction of an independent and controllable scientific data system
2025-10-27
Scientific data is becoming increasingly fundamental and strategic as research paradigms evolve rapidly under the influence of data and artificial intelligence. It is of great significance for China in achieving high-level self-reliance in science and technology, strengthening its voice in international competition, and developing new quality productive forces. At present, the construction of China's scientific data system still faces many problems, such as fragmented management. There is an urgent need for coordinated planning, a sound scientific data management system, and end-to-end integration of policies, standards, technologies, platforms, services, and applications.
The construction of a scientific data system is of great significance
Scientific data, also known as research data, is evidence created and collected by researchers in the course of scientific and technological innovation, and it gains value as it is applied and disseminated. In terms of its attributes, scientific data is both distinct from and overlaps with other types of data, such as personal data, public data, industrial (industry) data, enterprise data, and government data. In particular, as research paradigms change and technological innovation integrates more deeply with industrial innovation, the boundary between scientific data and industrial (industry) data is becoming increasingly blurred, and scientific data is becoming more closely linked to personal behavior data. Scientific data also has the attributes of public data, but it is public data with the distinctive characteristics of scientific research.
Accelerating the construction of a scientific data system and building a governance framework that covers the full lifecycle of scientific data is of great significance at present.
First, it empowers scientific research with artificial intelligence and reshapes the paradigm of scientific discovery.
Current artificial intelligence and interdisciplinary research require high-quality data with high value density, which means breaking down data boundaries and promoting interconnectivity. A scientific data system provides systematic, standardized, and accessible training "fuel" for artificial intelligence models. This is the foundation for applying artificial intelligence to research tasks such as material design and drug screening, and thereby for accelerating the discovery of new laws and breakthroughs on major scientific problems.
Second, it defends data sovereignty amid great-power competition and safeguards national security. Scientific data has become a strategic resource. Building an independent and controllable scientific data system ensures that scientific data in key areas can be acquired, stored, and processed under independent control, and it provides an important alternative source of support, which is crucial for national security.
Third, it supports building China into a science and technology powerhouse and achieving high-level self-reliance and self-strengthening in science and technology. By building public scientific data platforms, limited research funds can be concentrated on original research and key technological breakthroughs, improving the overall efficiency of national investment in science and technology and its output. A high-quality scientific data system can also attract top scientific and technological talent from around the world, accelerate the commercialization of scientific and technological achievements, and provide strong data-driven momentum for cultivating new quality productive forces.
Although China's scientific data governance capacity has improved significantly, the construction of the scientific data system still faces problems, such as a lack of systematic planning, fragmented management, a shortage of high-quality databases, and insufficient resource investment.
First, there is a lack of systematic planning at the national level, which makes cross-departmental coordination difficult. The policies introduced so far lack a unified management framework, so responsibility is split among multiple authorities and coordination across departments is hard. Because there is no cross-departmental coordinating body, departments that hold data are "unwilling to share", "afraid to share", or "unable to share", out of risk-avoidance concerns over data control, data security, and intellectual property.
Second, scientific data management is fragmented, and there are problems at key stages of the lifecycle. In data submission, the units primarily responsible for research lack initiative in submitting data, some of the submitted data is of low quality, and there is no sustainable submission mechanism. In data sharing, a large amount of scientific data remains scattered among individual researchers, and data holders are unwilling to share it because of unclear ownership, a lack of guaranteed benefits, security concerns, and other reasons. In data application, the response to the latest application scenarios, such as empowering scientific research with artificial intelligence, has not been timely, and databases for emerging disciplines have not yet been built. In the standards system, unified interdisciplinary data standards are lacking, as are mandatory standards, which makes data difficult to integrate and use.
Third, there is a shortage of high-quality databases and excessive reliance on foreign basic software. According to 2024 statistics from the global registration platform for scientific data repositories, 3,300 databases are currently registered, of which only 63 are led by China, and problems such as incomplete data, untimely updates, and uneven quality are common.
At the same time, there is heavy reliance on foreign open-source or commercial software and services, such as GEE, PyTorch, Neo4j, and the DOI system for identifying scientific and technological resources. The resource identification service has suffered several supply disruptions, seriously affecting global access to China's resources.
Fourth, resource investment is insufficient, and incentive and guarantee mechanisms are lacking. In China, the construction of scientific data has not been given the same status as scientific research instruments, and it lacks dedicated science and technology funding. Incentive and guarantee mechanisms are missing: scientific data is not included in the evaluation system for scientific and technological achievements, economic incentives are insufficient, and researchers engaged in data work find it difficult to be assessed for professional titles. All of this has led to serious staff turnover.
To coordinate and accelerate the construction of a scientific data system, the urgent task is to break free from departmental interests and a "patching-up" mindset, with the goal of building a science and technology powerhouse. Through top-level design, institutional innovation, platform construction, and scenario-driven approaches, the aim is to achieve self-reliance and self-strengthening in scientific data.
First, strengthen top-level design, redefine "scientific data", and coordinate the allocation of resources across departments. Promote the revision of the "Scientific Data Management Measures", redefine "scientific data" and its boundaries, and refine and enrich the attributes of scientific data so that it meets the requirements of the new era. Anchor the goal of building a science and technology powerhouse and clarify the strategic positioning of scientific data as "technological infrastructure".
Strengthen coordination and collaboration among departments, establish national-level major science and technology projects, give chief scientists the authority to allocate scientific data within the project framework, and promote scientific data sharing.
Second, explore institutional innovation, promote the definition of ownership, and improve incentive and evaluation policies for researchers. Change researchers' traditional mindset of valuing papers over data, include scientific data in the category of scientific and technological achievements, and recognize the value of data as an independent achievement. Promote the commercialization of data, draw on intellectual property models, and encourage processed data products to enter the market. Reform the assessment and evaluation mechanism by incorporating data resource construction, management capability, and the effectiveness of open sharing into the evaluation system for research institutions, universities, and other organizations. In professional title evaluation, talent program selection, and performance assessment of researchers, treat the creation, maintenance, sharing, and broad impact of high-quality scientific datasets as important criteria. Establish a professional title track for data engineers and encourage researchers to participate in data governance.
Third, build a platform system, increase resource investment and integration, and improve standards and identifiers. Optimize the scientific data platform system and build a "1+M+N" hierarchical governance system for scientific data, consisting of the National Science Data Center, provincial nodes, and field centers. National-level scientific data centers focus on basic disciplines and "bottleneck" fields, while provincial scientific data centers covering the central and western regions focus on applied disciplines.
Scientific data platforms built by universities, enterprises, and other institutions should gradually be integrated with metadata frameworks. Gradually establish a hierarchical scientific data system by connecting the scientific data chains of five main entities: scientific data centers, national laboratories, major scientific and technological projects, academic journals, and enterprises. National and local governments should establish special projects to support the processing and application of scientific data, ensuring full-lifecycle management of scientific data. The National Science Data Center should provide unified public services such as identification services, security scanning, and citation tracking.
Fourth, strengthen scenario-driven approaches, expand data applications, and support the integration of technological innovation and industrial innovation. Establish a joint "data-algorithm" special project to support basic theoretical research and paradigm innovation based on the integration of high-quality scientific data and advanced algorithms. Encourage enterprises to develop vertical models using national scientific data and to explore customized data application solutions suited to their own characteristics. Encourage secondary research, product development, and service innovation based on open data. Promote the deep application of scientific data across industries and establish demonstration zones for industrial integration. (New Society)
Editor: Momo Responsible editor: Chen Zhaozhao
Source: Science and Technology Daily