China Pursues Weather Forecasting Sovereignty with Its Self-Developed CMA-RA V1.5 Dataset

As artificial intelligence reshapes the architecture of weather forecasting, meteorological data have emerged as a core strategic resource for nations. In this new era, control over high-quality atmospheric datasets is no longer merely a scientific concern but a matter of national security, technological sovereignty, and economic competitiveness.

Against this backdrop, China is accelerating the development of its own atmospheric reanalysis datasets in order to reduce long-standing reliance on European-dominated products and to align with broader national strategies on data security and technological self-reliance. 

For years, the global benchmark in climate data has been the European Centre for Medium-Range Weather Forecasts’ fifth-generation atmospheric reanalysis dataset, known as ERA5. Covering more than 80 years of historical data and continuously updated, ERA5 integrates global observations to reconstruct comprehensive climate records. It provides detailed variables including precipitation, temperature, and wind, and has become foundational to the artificial intelligence revolution in meteorology. Many leading Chinese-developed AI weather models have relied heavily on ERA5 for training.

However, dependence on external datasets raises strategic concerns. The value of meteorological data now extends far beyond routine weather forecasting. By reconstructing long-term atmospheric conditions, reanalysis datasets such as ERA5 are essential for understanding climate trends, improving forecast accuracy, and supporting disaster risk management. 

Governments worldwide use ERA5 to assess and manage risks from floods, wildfires, and other natural hazards, while insurance companies incorporate it into catastrophe modeling frameworks. The European Union has estimated that the dataset generates hundreds of millions of dollars in economic value annually. That reach is also the source of the concern: Andreas Prein, professor of weather and climate modeling at ETH Zurich, has emphasized that weather forecasting is closely tied to national security. Excessive reliance on external data sources, he warns, can leave a country in a vulnerable and reactive position.

In response to these concerns, China has moved to secure greater autonomy in atmospheric data infrastructure. In a statement released in September, the National Data Administration announced that the China Meteorological Administration (CMA) had launched a global atmospheric reanalysis system development project. 

One of its central objectives is to break China’s operational dependence on European and American reanalysis products. That same month, the CMA opened global download access to its updated dataset, CMA-RA V1.5, marking the first time this new-generation reanalysis product has been made publicly available. According to the agency, several domestic AI weather models have already begun training on the dataset.

CMA-RA V1.5 shows notable technical advances that signal China’s progression in the reanalysis field from following global leaders to matching them and, in some areas, taking the lead. One major breakthrough lies in data assimilation technology. The system incorporates a four-dimensional ensemble-variational hybrid assimilation framework, overcoming multiple technical bottlenecks. The volume of satellite data assimilated for the early 20-year period of the record increased by 13 percent, while the construction of a flow-dependent background error covariance matrix has enhanced assimilation efficiency. Product quality has surpassed that of earlier datasets such as CRA-40 and Japan’s JRA-55.
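
For readers unfamiliar with the term, the general idea behind a hybrid, flow-dependent background error covariance can be sketched in a few lines of Python. This is only a toy illustration of the textbook approach, in which a fixed climatological covariance is blended with one estimated from an ensemble of forecasts. It is not the CMA's implementation, which the source does not describe. The function names, the weight `beta`, and all numbers are hypothetical, and operational systems add steps such as localization that are omitted here.

```python
def ensemble_covariance(members):
    """Sample covariance of an ensemble given as a list of state vectors."""
    n = len(members)
    dim = len(members[0])
    mean = [sum(m[i] for m in members) / n for i in range(dim)]
    # Perturbations: each member's departure from the ensemble mean.
    perts = [[m[i] - mean[i] for i in range(dim)] for m in members]
    return [
        [sum(p[i] * p[j] for p in perts) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def hybrid_covariance(b_static, b_ensemble, beta):
    """Weighted blend of static and ensemble covariances.

    beta is the (assumed) weight given to the ensemble part, in [0, 1].
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    dim = len(b_static)
    return [
        [(1.0 - beta) * b_static[i][j] + beta * b_ensemble[i][j]
         for j in range(dim)]
        for i in range(dim)
    ]


# Made-up two-variable state and three-member ensemble, for illustration only.
members = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
b_static = [[1.0, 0.0], [0.0, 1.0]]  # static part: no cross-variable link

b_ens = ensemble_covariance(members)
b_hybrid = hybrid_covariance(b_static, b_ens, beta=0.5)
print(b_hybrid)
```

In this toy case the static matrix carries no correlation between the two variables, while the ensemble does, so the blended matrix picks up a nonzero off-diagonal term. That is the sense in which the covariance becomes "flow-dependent": it reflects the relationships present in the current ensemble of forecasts rather than only long-term averages.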

A second advance involves the integration of domestically controlled observation data. The dataset incorporates China-specific observational sources and includes independently developed radiosonde bias-correction techniques. In total, CMA-RA V1.5 assimilates data from 116 satellites encompassing 215 types of instruments, including 37 Chinese satellites covering 45 instrument categories. Domestic satellite data account for up to 18 percent of the assimilated observations, strengthening national data autonomy.

Third, the dataset achieves internationally competitive spatial resolution and timeliness. Its model resolution reaches 13 kilometers, with post-processing refinement to 10 kilometers, and it updates on an hourly basis in near real time. By comparison, ERA5 operates at a 25-kilometer resolution and is typically updated with a five-day delay. This combination of higher spatial resolution and shorter latency enhances the dataset’s suitability for both operational forecasting and AI model training.
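
To give a rough sense of what the resolution gap means for data volume, the Python sketch below estimates how many grid cells are needed to cover the globe at 10-kilometer and 25-kilometer spacing. It assumes a spherical Earth with a mean radius of 6,371 kilometers and uniform square cells, which is a simplification; the actual grids of both products differ, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def approx_cell_count(spacing_km: float) -> float:
    """Approximate number of square cells of this spacing covering the globe."""
    surface_area_km2 = 4.0 * math.pi * EARTH_RADIUS_KM ** 2
    return surface_area_km2 / spacing_km ** 2


cells_10km = approx_cell_count(10.0)
cells_25km = approx_cell_count(25.0)
ratio = cells_10km / cells_25km

print(f"{cells_10km:.2e} cells at 10 km, {cells_25km:.2e} cells at 25 km")
print(f"ratio: {ratio:.2f}")
```

Because the cell count scales with the inverse square of the spacing, a 10-kilometer grid holds 6.25 times as many points per vertical level and time step as a 25-kilometer grid under these assumptions.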

The practical applications of CMA-RA V1.5 are already expanding. The dataset now serves 18 sectors, including agriculture, energy, and transportation, and supports more than 3,600 users. In the renewable energy sector, its 100-meter wind data have improved wind farm site selection, increasing power generation efficiency by approximately 15 percent. In agriculture, downscaled temperature and precipitation data have helped optimize planting strategies, reducing annual grain losses by an estimated five million tons.

The dataset is also gaining traction in academic and entrepreneurial circles. Professor Su Hui of the Hong Kong University of Science and Technology is incorporating CMA-RA V1.5 into the work of her meteorological technology startup, Stellerus, using it to train regional AI weather models and evaluate numerical forecasting systems. She notes that one of the dataset’s key strengths is its finer global grid resolution compared with ERA5. The combination of high spatial and temporal resolution provides a vast and detailed data foundation for machine learning applications.

International industry stakeholders are also taking notice. David Whitehead, head of meteorological risk management at the Finnish listed company Vaisala Oyj, has suggested that broader international access to Chinese meteorological data could stimulate the development and brokerage of weather derivatives in global markets. 

Vaisala, which specializes in providing meteorological data for financial hedging, has already begun exploring potential applications of CMA-RA V1.5. Rémi Gandoin, product development manager at the Danish engineering consultancy C2Wind, has observed that ERA5 contains certain biases and limitations, and that integrating multiple datasets can benefit researchers studying climate change and extreme weather. Such integration can also provide wind project developers with more robust data support for engineering design and decision-making.

Looking ahead, experts increasingly argue that the future of meteorological science and climate risk management lies not in reliance on a single global dataset but in the coexistence of multiple high-quality data systems. A diversified data ecosystem enhances resilience, reduces systemic vulnerability, and supports innovation across forecasting, energy planning, disaster mitigation, and financial risk modeling. 

As artificial intelligence becomes ever more central to weather prediction and climate services, the strategic significance of independently developed atmospheric datasets will continue to grow. In this context, CMA-RA V1.5 represents not only a technical milestone but also a broader shift in how nations approach data sovereignty and strategic capability in the age of AI-driven meteorology.

Source: sina, sohu, szhk