Data mining-based machine learning methods for improving hydrological data: a case study of salinity field in the Western Arctic Ocean
Earth System Science Data ( IF 11.4 ) Pub Date : 2024-05-03 , DOI: 10.5194/essd-2024-138
Shuhao Tao , Ling Du , Jiahao Li

Abstract. The Beaufort Gyre, the largest freshwater reservoir in the Arctic Ocean, lies in the western Arctic Ocean. Long-term changes in freshwater reservoirs are critical for understanding the Arctic Ocean, so data from various sources, particularly measured or reanalyzed data, must be used to the greatest extent possible. Over the past two decades, intensive field observations and ship surveys in the western Arctic Ocean have yielded a large amount of CTD data. Multiple machine learning methods were evaluated and merged to reconstruct an annual salinity product for the western Arctic Ocean over the period 2003–2022. The data mining-based machine learning methods make use of variables determined by physical processes, such as sea level pressure, sea ice concentration, and drift. Because sea surface salinity is more susceptible to atmospheric, sea ice, and oceanic changes, and therefore more variable, we constrained the average root mean square error (RMSE) of the CTD and EN4 sea surface salinity fields to within 0.25 psu during machine learning training. The machine learning process reveals that the uncertainty in predicting sea surface salinity is 0.24 % when constrained by CTD data and falls to 0.02 % when constrained by EN4 data. During data merging and post-calibration, the weight coefficients are constrained by imposing limits on the uncertainty value. Compared with the EN4 and ORAS5 salinity products commonly used in the Arctic Ocean, our salinity product provides more accurate descriptions of the freshwater content in the Beaufort Gyre and of depth variations at its halocline base.
The potential of this approach to evaluating and integrating results from multiple machine learning methods extends beyond the salinity field to hydrometeorology, sea ice thickness, polar biogeochemistry, and other related fields. The datasets are available at https://zenodo.org/records/10990138 (Tao and Du, 2024).
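
The abstract does not say how the weight coefficients are derived, so the following is only a minimal sketch of one common way to merge several model outputs under an RMSE limit. It assumes inverse-squared-RMSE weighting, which is an illustrative choice and not necessarily the authors' scheme. The function names (`rmse`, `merge_by_uncertainty`), the model labels, and the sample salinity values are all hypothetical; only the 0.25 psu limit comes from the abstract.

```python
import numpy as np


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def merge_by_uncertainty(preds, obs, max_rmse=0.25):
    """Merge model predictions with weights limited by an RMSE threshold.

    preds    : dict mapping a (hypothetical) model name to its predictions
    obs      : reference observations at the same points (e.g. CTD salinity)
    max_rmse : models with a larger RMSE are excluded (0.25 psu in the abstract)

    Assumption: weights are proportional to 1 / RMSE**2. The source does not
    specify its weighting, so this is only one plausible choice.
    """
    errors = {name: rmse(p, obs) for name, p in preds.items()}
    kept = {name: e for name, e in errors.items() if e <= max_rmse}
    if not kept:
        raise ValueError("no model meets the RMSE threshold")

    # Guard against division by zero for a model that matches exactly.
    inverse = {name: 1.0 / max(e, 1e-12) ** 2 for name, e in kept.items()}
    total = sum(inverse.values())
    weights = {name: w / total for name, w in inverse.items()}

    merged = sum(weights[name] * np.asarray(preds[name], dtype=float)
                 for name in weights)
    return merged, weights, errors


# Made-up surface salinity values (psu) purely for illustration.
obs = np.array([30.0, 31.0, 32.0, 33.0])
preds = {
    "model_a": obs + 0.1,   # small positive bias
    "model_b": obs - 0.2,   # larger negative bias
    "model_c": obs + 0.5,   # exceeds the 0.25 psu limit and is dropped
}
merged, weights, errors = merge_by_uncertainty(preds, obs)
```

In this toy case the model whose RMSE exceeds the limit receives no weight, and the remaining weights sum to one, so the merged field stays within the range spanned by the retained models.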

Updated: 2024-05-03