【KDD19】Deep Uncertainty Quantification: A Machine Learning Approach for Weather Forecasting


address

http://urban-computing/pdf/kdd19-BinWang.pdf

ABSTRACT

The paper proposes a novel negative log-likelihood error (NLE) loss function.
It combines single-value forecasting with uncertainty quantification, which the authors call deep uncertainty quantification (DUQ).
The approach improves accuracy by 47.76%, which is a state-of-the-art result on this benchmark dataset.

1. INTRODUCTION

The most common method currently used in meteorology is numerical weather prediction (NWP), which uses physical models to simulate and predict meteorological dynamics.
NWP may not be reliable due to the instability of the underlying differential equations.
Data-driven alternatives, by contrast, require big data and a tedious amount of feature engineering.

2. RELATED WORKS

3. OUR METHOD

3.1 Problem Statement

3.1.1 Notations.

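The notation is defined for a weather station s.

E is the historical weather feature data (observations).
D is the future numerical forecast (NWP) values together with the station ID and timestamps.
X is [E, D], i.e. the model input.
Y is the target to predict, i.e. the true target values.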

3.1.2 Task Definition

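Given X, predict an estimate of Y. The true Y should lie within an interval around the estimate with a tolerated probability.

The data cover 03:00 on the current day through 15:00 on the next day, i.e. a forecast window of 37 hourly time steps.
The target values are t2m (2 m temperature), rh2m (2 m relative humidity) and w10m (10 m wind speed).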

The proposed method can be easily extended for any time interval prediction and more target variables.

3.2 Information Fusion Methodology

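Looking at the target values over the past three years, t2m shows clear seasonality, while rh2m and w10m are too noisy to draw a conclusion.

E is used to model the recent weather dynamics.

NWP is added to D so that the important information in the NWP forecast can be absorbed properly.

For each time step from 03:00 to 15:00 the next day (37 hourly steps), the mean of each target value is computed.
The figure shows clear statistical differences in mean and variance across time steps. I read the variance here loosely as how strongly the curve changes over time, which is my own interpretation rather than a definition from the paper.
The station with ID 7 is also clearly different from the other stations.

For each station, the mean and variance also differ from one time step to the next.

The final solution is to include the station ID and the time ID in D.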

3.3 Data Preprocessing

3.3.1 Missing values

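There are two kinds of missing values:

A whole day is missing (the values for that day are dropped).
A consecutive run of hours within a day is missing (filled by linear interpolation).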
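
The second case can be sketched as follows. This is only a minimal illustration of linear interpolation over a gap inside one day; the function name `fill_gaps_linear` and the use of `None` as the missing-value marker are my own assumptions, not something the paper specifies.

```python
from typing import List, Optional


def fill_gaps_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior runs of None by interpolating between known neighbours.

    Leading or trailing None values have no neighbour on one side,
    so they are left untouched.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        if gap > 1:
            step = (filled[right] - filled[left]) / gap
            for k in range(1, gap):
                filled[left + k] = filled[left] + step * k
    return filled


# Hypothetical hourly t2m readings with two consecutive missing hours.
hourly_t2m = [10.0, None, None, 16.0, 15.0]
print(fill_gaps_linear(hourly_t2m))
```

Days that are missing entirely would simply be filtered out before this step.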

3.3.2 Normalization of Continuous Variables

The input data are normalized.
The model outputs are denormalized.
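
A small sketch of this round trip is given below. The notes do not record which scaling the paper uses, so min-max scaling is assumed here, and all function names are hypothetical.

```python
def fit_min_max(values):
    """Return the (min, max) pair used for scaling; assumes a non-constant feature."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    return lo, hi


def normalize(value, lo, hi):
    """Map a raw value into [0, 1] given the training min and max."""
    return (value - lo) / (hi - lo)


def denormalize(scaled, lo, hi):
    """Inverse of normalize: map a scaled value back to physical units."""
    return scaled * (hi - lo) + lo


# Hypothetical t2m training values in degrees Celsius.
lo, hi = fit_min_max([-5.0, 0.0, 15.0])
scaled = normalize(0.0, lo, hi)
print(scaled, denormalize(scaled, lo, hi))
```

The statistics should come from the training set only and be reused unchanged when denormalizing predictions.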

3.3.3 Category Variables

two category variables

time ID
station ID

Rather than hard-coding these (e.g. one-hot or sin-cosine coding), the model learns an embedding for each of them (see the embedding layers in 3.4).
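
To make the idea concrete, here is a toy embedding lookup in plain Python. It only shows the data flow (integer ID in, dense vector out). In a real model the table rows are trainable weights updated by backpropagation; the table sizes and the embedding dimension below are assumptions chosen for illustration.

```python
import random


def make_embedding_table(num_ids, dim, seed=0):
    """Create one small random vector per ID (stand-in for trainable weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(num_ids)]


def embed(table, idx):
    """Look up the dense vector for an integer ID."""
    return table[idx]


# Assumed sizes: 10 stations, 37 forecast time steps, 2-dimensional embeddings.
station_table = make_embedding_table(num_ids=10, dim=2, seed=0)
time_table = make_embedding_table(num_ids=37, dim=2, seed=1)

# Concatenate both embeddings, as one would before feeding the decoder.
decoder_extra = embed(station_table, 7) + embed(time_table, 0)
print(decoder_extra)
```

Compared with one-hot coding, the vector length stays small and similar stations or hours can end up with similar vectors after training.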

3.3.4 Input/Output Tensors

Each dimension of the tensor is given a fixed meaning:

I is the day (date) index
T is the time index
S is the station index
N is the feature dimension

The advantage of this layout is better extensibility and easier inspection during experiments.

3.4 Model Architecture

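The model is built on the seq2seq framework.
encoder:

It takes E as input and extracts a representation c. The paper explains c as the weather-dynamics state extracted from the historical observations, and this value is used as the initial state of the decoder.
decoder:
The decoder inputs are the time ID, the station ID and the NWP forecast.

Two embedding layers are added after the time ID and the station ID to obtain their embedding representations, as shown in the figure in the paper.

The paper then briefly describes the model structure, treating the encoder and decoder inputs together as a single input X, with a point estimate as the output.
Only later in the paper does it become clear that the output for each target value consists of two numbers, a variance and a mean, which is why the paper says an interval can also be estimated.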

3.5 Learning Phase

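At each timestamp DUQ predicts two values: a mean and a variance.

y^o(t) is the true value of target variable o on day i, at station s, at time step t.
The other two quantities in the loss are the predicted variance and the predicted mean.

The whole training procedure is shown in the paper's figure; it is fairly straightforward, so I will not go through it here.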
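
Since the loss formula itself is not reproduced in these notes, the sketch below uses the standard Gaussian negative log-likelihood (with the constant term dropped) as a stand-in. Treat it as an assumption about the general shape of the NLE loss, not as the paper's exact definition; the names `gaussian_nll` and `nle_loss` are hypothetical.

```python
import math


def gaussian_nll(y_true, mean, variance, eps=1e-6):
    """Negative log-likelihood of y_true under N(mean, variance), constant dropped."""
    var = max(variance, eps)  # guard against zero or negative variance
    return 0.5 * math.log(var) + (y_true - mean) ** 2 / (2.0 * var)


def nle_loss(y_true, means, variances):
    """Average the per-point loss over all predicted time steps."""
    total = sum(gaussian_nll(y, m, v) for y, m, v in zip(y_true, means, variances))
    return total / len(y_true)


# A confident but wrong prediction is penalized far more than an uncertain one.
print(gaussian_nll(2.0, 0.0, 0.1))
print(gaussian_nll(2.0, 0.0, 4.0))
```

The first term discourages large variances, while the second term penalizes errors more heavily when the predicted variance is small, so the two outputs are learned jointly.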

3.6 Inference Phase

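Personally, this is the point where the paper finally came together for me: the output is simply a mean and a variance.

The upper and lower bounds are spelled out clearly here: the predicted variance is turned into a deviation (through the standard deviation), and the bounds are the mean minus and plus that deviation.

From left to right, the outputs are: the estimated lower bound, the estimated upper bound, the estimated target value, and the variance.

The computed upper and lower bounds must then be denormalized.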
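
A minimal sketch of this inference step follows. It assumes a Gaussian predictive distribution, so the bounds are mean ± z · standard deviation; z = 1.645 gives roughly 90% two-sided coverage under that assumption. The z value, the min-max denormalization and all names are illustrative assumptions rather than details taken from the paper.

```python
import math


def prediction_interval(mean, variance, z=1.645):
    """Return (lower, upper) bounds around the mean in normalized units."""
    std = math.sqrt(max(variance, 0.0))
    return mean - z * std, mean + z * std


def denormalize(value, lo, hi):
    """Map a min-max scaled value back to physical units (assumed scaling)."""
    return value * (hi - lo) + lo


# Hypothetical normalized model output for t2m at one time step.
lower_n, upper_n = prediction_interval(mean=0.5, variance=0.01)

# Hypothetical training range of t2m in degrees Celsius.
lower = denormalize(lower_n, lo=-10.0, hi=30.0)
upper = denormalize(upper_n, lo=-10.0, hi=30.0)
print(lower, upper)
```

Because the denormalization is monotonic, applying it to each bound separately keeps the lower bound below the upper bound.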

3.7 Inference Phase

3.8 Evaluation Metrics

3.8.1 Point Estimation Measurement.

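RMSE is used as the measure.
First, the RMSE of one target value is computed over every station and every time step.

Then the RMSE for each day is defined.

A score is also defined: the higher SSobj is, the better the ML method is compared with the NWP method.

A per-day score SSday is defined.

SSavg is the mean of SSday over all days.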
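
The formulas are not preserved in these notes, so the sketch below fills them in with common definitions: RMSE as usual, and a skill score of the form (RMSE_NWP − RMSE_ML) / RMSE_NWP, which is positive when the ML model beats NWP. Averaging SSobj over the target variables to get SSday is likewise my assumption.

```python
import math


def rmse(pred, truth):
    """Root mean squared error over paired predictions and observations."""
    squared = [(p - t) ** 2 for p, t in zip(pred, truth)]
    return math.sqrt(sum(squared) / len(squared))


def skill_score(rmse_ml, rmse_nwp):
    """Relative RMSE improvement of the ML model over NWP (assumed form of SSobj)."""
    return (rmse_nwp - rmse_ml) / rmse_nwp


def day_score(scores_per_target):
    """Assumed SSday: mean of SSobj over the target variables (t2m, rh2m, w10m)."""
    return sum(scores_per_target) / len(scores_per_target)


truth = [10.0, 12.0, 14.0]
ml_pred = [10.5, 12.0, 13.5]
nwp_pred = [11.0, 13.0, 13.0]
ss_t2m = skill_score(rmse(ml_pred, truth), rmse(nwp_pred, truth))
print(ss_t2m)
```

SSavg would then be the mean of `day_score` over all evaluated days.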

3.8.2 Prediction Interval Measurement.

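Why is a prediction interval measurement needed? Because the highlight of this paper is that it predicts a range with an upper and a lower bound.
The next question is how to count how many points fall inside that range.

The small coefficient defined here is an indicator: it is 1 if the true value lies within the estimated lower and upper bounds and 0 otherwise. It can be seen as an intermediate variable for deciding whether a point is covered.

Summing it counts, for every time step and every station, whether the target value falls within the estimated range.

The prediction interval coverage probability (PICP) is this coverage count taken over all stations and time steps and expressed as a rate, i.e. the fraction of true values covered by the prediction intervals.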
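
The indicator and the coverage rate described above can be sketched in a few lines. The function name `picp` and the flat-list inputs are simplifications of my own; in practice the values would be gathered over all stations and time steps.

```python
def picp(y_true, lower, upper):
    """Fraction of true values that fall inside their predicted interval."""
    covered = [1 if lo <= y <= hi else 0 for y, lo, hi in zip(y_true, lower, upper)]
    return sum(covered) / len(covered)


# Three hypothetical observations with their predicted bounds.
y_true = [1.0, 5.0, 10.0]
lower = [0.0, 6.0, 9.0]
upper = [2.0, 7.0, 11.0]
print(picp(y_true, lower, upper))
```

A well-calibrated model should give a PICP close to the nominal coverage of the interval it was asked to produce.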

4. EXPERIMENTS AND PERFORMANCE ANALYSIS

4.1 Baselines

4.2 Experimental Environments

GPU+Keras

4.3 Parameter Settings for Reproducibility

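The paper lists a number of parameter settings and notes that the number of epochs does not matter much, because early stopping is used.

1148 is the number of days in the training set; 87 is the number of days in the validation set.
28 is the number of preceding hours that are already known.
37 is the number of future hours to predict.
The 9 in the encoder input is the number of observation features.
The 31 in the decoder input is the 29 NWP features plus the station ID and the time ID.
The 3 in the decoder output is the true observed values of the three target variables.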
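
To keep these numbers straight, the sketch below lays them out as tensor shapes in the (I, T, S, N) order from section 3.3.4. The number of stations is not recorded in these notes, so it is a parameter here, and the value 10 in the usage line is purely an assumption.

```python
def tensor_shapes(num_days, num_stations):
    """Shapes in (day, time, station, feature) order for the three tensors."""
    return {
        "encoder_inputs": (num_days, 28, num_stations, 9),    # 28 known hours, 9 obs features
        "decoder_inputs": (num_days, 37, num_stations, 31),   # 29 NWP features + station ID + time ID
        "decoder_outputs": (num_days, 37, num_stations, 3),   # t2m, rh2m, w10m
    }


# 1148 training days as stated above; 10 stations is an assumed value.
for name, shape in tensor_shapes(num_days=1148, num_stations=10).items():
    print(name, shape)
```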

4.4 Performance analysis

4.4.1 Effect of information fusion

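Comparing models with and without obs and with and without NWP leads to the following conclusions:
Using obs alone or NWP alone does not perform well.
obs makes it possible to model part of the weather dynamics.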

4.4.2 Effect of deep learning

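Deep learning outperforms the traditional machine learning baselines.
The paper also compares different numbers of layers and different numbers of units within a layer.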

4.4.3 Effect of loss function.

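Training time is measured as the number of iterations until early stopping, and NLE loss turns out to take the longest. The paper's explanation is that NLE loss accounts for both a mean loss and a variance loss. Optimizing these two tasks together makes convergence slowest, and this joint optimization may also act as a form of regularization.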

4.4.4 Effect of ensemble.

An ensemble of several DUQ models performs better.
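
These notes do not record how the paper combines the models, so the following is only one common way to do it: treat the members as an equally weighted Gaussian mixture, average the means, and compute the mixture variance. The function name and the example numbers are hypothetical.

```python
def ensemble_gaussians(predictions):
    """Combine (mean, variance) pairs from several models into one pair.

    The combined mean is the average of the member means. The combined
    variance is the variance of the equally weighted mixture, which also
    accounts for disagreement between the members.
    """
    k = len(predictions)
    mean = sum(m for m, _ in predictions) / k
    second_moment = sum(v + m ** 2 for m, v in predictions) / k
    return mean, second_moment - mean ** 2


# Two hypothetical DUQ members predicting t2m at one time step.
print(ensemble_gaussians([(1.0, 0.5), (3.0, 0.5)]))
```

When the members disagree, the combined variance grows, which widens the resulting prediction interval.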

Quality of prediction interval

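In the figure, green is the actual obs data; on the right are the NWP data and the point estimates, together with the predicted interval estimates.
Interval estimates can support better planning, for example of travel products. w10m fluctuates strongly and is overall the hardest target to predict.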

Future works

Future work will be directed towards architecture improvement (e.g., attention mechanism), automatic hyperparameter-tuning, and theoretical comparison between NLE and MSE/MAE.
