
Previous post: Learning causal inference from open-source projects - CausalImpact and Bayesian structural time series (Part 21)

This follow-up post continues to dig into some of the finer details of the CausalImpact open-source library.


1 Tunable parameters in CausalImpact

1.1 CausalImpact's two default fitting algorithms

CausalImpact fits its model with TensorFlow Probability and offers two algorithms: Variational Inference (VI) and Hamiltonian Monte Carlo (HMC).

  • VI, Variational Inference; see the (Chinese) overview 变分推断(Variational Inference)进展简述
  • HMC, Hamiltonian Monte Carlo; see 如何简单地理解「哈密尔顿蒙特卡洛 (HMC)」?
    Both can be accelerated on GPU since they run on TensorFlow, but HMC is much slower: a moderately complex model can easily take more than an hour.

Switching between them:

# HMC
ci = CausalImpact(data, pre_period, post_period, model_args={'fit_method': 'hmc'})

# VI
ci = CausalImpact(data, pre_period, post_period, model_args={'fit_method': 'vi'})

1.2 Tunable model parameters via model_args

Within CausalImpact, model_args supports several other adjustments (from the library's docstring):

standardize: bool
   If `True`, standardizes data to have zero mean and unitary standard
   deviation.
prior_level_sd: Optional[float]
   Prior value for the local level standard deviation. If `None` then an
   automatic optimization of the local level is performed. This is
   recommended when there's uncertainty about what prior value is
   appropriate for the data.
   In general, if the covariates are expected to be good descriptors of the
   observed response then this value can be low (such as the default of
   0.01). In cases when the linear regression is not quite expected to fully
   explain the observed data, the value 0.1 can be used.
fit_method: str
   Which method to use for the Bayesian algorithm. Can be either 'vi'
   (default) or 'hmc' (more precision but much slower).
nseasons: int
   Specifies the duration of the period of the seasonal component; if input
   data is specified in terms of days, then choosing nseasons=7 adds a weekly
   seasonal effect.
season_duration: int
   Specifies how many data points each value in season spans over. A good
   example to understand this argument is to consider a hourly data as input.
   For modeling a weekly season on this data, one can specify `nseasons=7` and
   season_duration=24 which means each value that builds the season component
   is repeated for 24 data points. Default value is 1 which means the season
   component spans over just 1 point (this in practice doesn't change
   anything). If this value is specified and bigger than 1 then `nseasons`
   must be specified and bigger than 1 as well.

A few notes on these:

  • standardize = True: the input data is standardized (zero mean, unit standard deviation) before fitting;
  • nseasons: int; for example, to add a weekly seasonal effect on daily data, set:
ci = CausalImpact(data, pre_period, post_period, model_args={'nseasons': 7})
  • prior_level_sd: sets the prior standard deviation of the local level;
  • season_duration: int; to be read together with nseasons.

For example, suppose we have hourly data and want to capture a weekly seasonal effect; we would set:

df = pd.read_csv('tests/fixtures/arma_data.csv')
df = df.set_index(pd.date_range(start='20200101', periods=len(df), freq='H'))
pre_period = ['20200101 00:00:00', '20200311 23:00:00']
post_period = ['20200312 00:00:00', '20200410 23:00:00']
ci = CausalImpact(df, pre_period, post_period, model_args={'nseasons': 7, 'season_duration': 24})
print(ci.summary())

One way to think about it: with hourly data and a weekly season, each full seasonal cycle spans 7 * 24 data points, which is exactly nseasons * season_duration.
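As a quick, illustrative sanity check (my own snippet, not part of the library), the hourly index below confirms that one weekly cycle covers nseasons * season_duration = 168 rows:

import pandas as pd

nseasons, season_duration = 7, 24
idx = pd.date_range(start='20200101', periods=nseasons * season_duration, freq='H')
print(len(idx))          # 168 hourly points in one weekly cycle
print(idx[-1] - idx[0])  # 6 days 23:00:00, i.e. exactly one week of hourly steps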

1.3 Custom models in CausalImpact

If the default model (fit with either VI or HMC) does not meet your needs, you can supply your own structural time-series model:

      import tensorflow_probability as tfp

      # pre-period response and covariates (kept from the original example, not used below)
      pre_y = data.iloc[:70, 0]
      pre_X = data.iloc[:70, 1:]
      # response series used to parameterize the structural components
      obs_series = data.iloc[:, 0]
      local_linear = tfp.sts.LocalLinearTrend(observed_time_series=obs_series)
      seasonal = tfp.sts.Seasonal(num_seasons=7, observed_time_series=obs_series)
      model = tfp.sts.Sum([local_linear, seasonal], observed_time_series=obs_series)

      ci = CausalImpact(data, pre_period, post_period, model=model)
      print(ci.summary())

Custom models are also used in the examples of sections 3.7.4 and 2.3.

1.4 How CausalImpact standardizes data

Here is an example adapted from the project's test_main.py, using data shipped with the official repository:

import numpy as np
import pandas as pd
import pytest
import tensorflow as tf
import tensorflow_probability as tfp
from numpy.testing import assert_array_equal
from pandas.util.testing import assert_frame_equal

from causalimpact import CausalImpact
from causalimpact.misc import standardize

data = pd.read_csv('tests/fixtures/btc.csv', parse_dates=True, index_col='Date')
training_start = "2020-12-01"
training_end = "2021-02-05"
treatment_start = "2021-02-08"
treatment_end = "2021-02-09"
pre_period = [training_start, training_end]
post_period = [treatment_start, treatment_end]

pre_data = data.loc[pre_period[0]: pre_period[1], :]
post_data = data.loc[post_period[0]: post_period[1], :]
# standardize the pre-period data
normed_pre_data, (mu, sig) = standardize(pre_data)
# use the same mu / sig to map the raw post_data to normed_post_data

normed_post_data = (post_data - mu) / sig

Standardization uses causalimpact.misc.standardize.
If you standardize the data yourself, then when fitting the model the model arguments need to be set accordingly, e.g.:

model_args = {'fit_method': 'hmc', 'niter': 1000, 'prior_level_sd': 0.01, 'season_duration': 1, 'nseasons': 1, 'standardize': True}
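A minimal sketch, reusing mu, sig, post_data and normed_post_data from the snippet above: whatever is standardized on the way in has to be mapped back to the original scale on the way out.

# invert the standardization performed above: original = normed * sig + mu
denormed_post_data = normed_post_data * sig + mu
print(np.allclose(denormed_post_data.values, post_data.values))  # expected: True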

2 Interpreting the covariate regression coefficients

The full state-space model can be written as

$$y_t = Z_t^T \alpha_t + \beta X_t + G_t \epsilon_t$$

Let's pull out the linear-regression term $\beta X_t$ and understand it on its own.

2.1 Bayesian regression with the horseshoe prior

Two articles from COS (统计之都) are a good first introduction to the horseshoe prior:
使用Horseshoe 先验的Bayes回归及代码解析
and:
稀疏数据分析:马蹄估计量及其理论性质

The description in tensorflow_probability is also well written:

  The Cauchy scale parameters puts substantial mass near zero, encouraging
  weights to be sparse, but their heavy tails allow weights far from zero to be
  estimated without excessive shrinkage. The horseshoe can be thought of as a
  continuous relaxation of a traditional 'spike-and-slab' discrete sparsity
  prior, in which the latent Cauchy scale mixes between 'spike'
  (`scales[i] ~= 0`) and 'slab' (`scales[i] >> 0`) regimes.
  Following the recommendations in [2], `SparseLinearRegression` implements
  a horseshoe with the following adaptations:
  - The Cauchy prior on `scales[i]` is represented as an InverseGamma-Normal
    compound.
  - The `global_scale` parameter is integrated out following a `Cauchy(0.,
    scale=weights_prior_scale)` hyperprior, which is also represented as an
    InverseGamma-Normal compound.
  - All compound distributions are implemented using a non-centered
    parameterization.
  The compound, non-centered representation defines the same marginal prior as
  the original horseshoe (up to integrating out the global scale),
  but allows samplers to mix more efficiently through the heavy tails; for
  variational inference, the compound representation implicity expands the
  representational power of the variational model.
  Note that we do not yet implement the regularized ('Finnish') horseshoe,
  proposed in [2] for models with weak likelihoods, because the likelihood
  in STS models is typically Gaussian, where it's not clear that additional
  regularization is appropriate. If you need this functionality, please
  email tfprobability@tensorflow.org.

The horseshoe prior is a sparse Bayesian approach to supervised learning: by building sparsity into the prior over the model parameters, it produces sparse estimates.

The horseshoe prior belongs to the family of multivariate scale mixtures of normals, so it is closely related to other common sparse Bayesian priors such as the Laplacian prior (Lasso) and the Student-t prior.
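The prior hierarchy itself appears as images in the original post; for completeness, the standard horseshoe formulation (assumed here to be what those images show) is

$$\theta_j \mid \lambda_j, \tau \sim \mathcal{N}(0,\ \lambda_j^2 \tau^2), \qquad \lambda_j \sim C^{+}(0, 1), \qquad \tau \sim C^{+}(0, 1),$$

where $C^{+}(0,1)$ denotes the standard half-Cauchy distribution.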

Here $\lambda_j$ is called the local shrinkage parameter: each $\theta_j$ can have its own amount of shrinkage.
$\tau$ is called the global shrinkage parameter.

The probability density of the half-Cauchy distribution is:
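The density itself is shown as an image in the original post; written out (a standard result, with scale parameter $a$):

$$p(x \mid a) = \frac{2}{\pi a \left[1 + (x/a)^2\right]}, \qquad x > 0.$$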

Looking back at the three levels of the prior above, the scale $a$ in this density is replaced by the corresponding quantity at each level.

Next, consider the marginal prior distribution of $\lambda_i$.

Define $\kappa_i = 1/(1+\lambda_i^2)$. This quantity is central in Bayesian shrinkage; its meaning is discussed under the next heading, but we can already look at its prior. For a quick qualitative picture of the shape of $\kappa_i$'s prior, assume for simplicity that $\sigma = \tau = 1$; then
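The intermediate step is an image in the original post; reconstructing it under the $\sigma = \tau = 1$ assumption, with $\lambda_i \sim C^{+}(0,1)$ and $\lambda_i = \sqrt{(1-\kappa_i)/\kappa_i}$, the change of variables gives

$$p(\kappa_i) = p(\lambda_i)\left|\frac{d\lambda_i}{d\kappa_i}\right| = \frac{2}{\pi\,(1+\lambda_i^2)}\cdot\frac{1}{2\,\kappa_i^{3/2}(1-\kappa_i)^{1/2}} = \frac{1}{\pi\,\sqrt{\kappa_i(1-\kappa_i)}}.$$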


Therefore $\kappa_i \sim \mathrm{Beta}(1/2, 1/2)$. Plotting this prior shows that when both Beta parameters equal 0.5 the density is U-shaped like a horseshoe, which is why the Bayesian method based on this prior is called the horseshoe estimator.


2.2 How SparseLinearRegression computes the weights

Since CausalImpact uses SparseLinearRegression for the regression component, let's walk through how the regression weights are obtained, following the formulas below.

The tensorflow_probability source code describes it as:

This is identical to `tfp.sts.LinearRegression`, except that `SparseLinearRegression` uses a parameterization of a Horseshoe prior [1][2] to encode the assumption that many of the `weights` are zero, i.e., many of the covariate time series are irrelevant.

The local scales follow a HalfCauchy distribution:

scales[i] ~ HalfCauchy(loc=0, scale=1)
weights[i] ~ Normal(loc=0., scale=scales[i] * global_scale)

The full process for generating the regression weights, in pseudocode:

# Sample global_scale from Cauchy(0, scale=weights_prior_scale).
global_scale_variance ~ InverseGamma(alpha=0.5, beta=0.5)
global_scale_noncentered ~ HalfNormal(loc=0, scale=1)
global_scale = (global_scale_noncentered *
               sqrt(global_scale_variance) *
               weights_prior_scale)
# Sample local_scales from Cauchy(0, 1).
local_scale_variances[i] ~ InverseGamma(alpha=0.5, beta=0.5)
local_scales_noncentered[i] ~ HalfNormal(loc=0, scale=1)
local_scales[i] = local_scales_noncentered[i] * sqrt(local_scale_variances[i])
weights[i] ~ Normal(loc=0., scale=local_scales[i] * global_scale)

Reading backwards from weights[i] ~ Normal(loc=0., scale=local_scales[i] * global_scale): each weight's scale is the product of the local $\lambda_j$ and the global $\tau$.
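As a minimal runnable sketch (my own, not the library's code), the pseudocode above maps directly onto tfp.distributions; num_features below is an illustrative value, and weights_prior_scale uses the 0.1 default mentioned later:

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

num_features = 3           # illustrative
weights_prior_scale = 0.1  # default sparsity scale

# global scale ~ Cauchy(0, weights_prior_scale), via the InverseGamma-Normal compound
global_scale_variance = tfd.InverseGamma(0.5, 0.5).sample()
global_scale_noncentered = tfd.HalfNormal(scale=1.).sample()
global_scale = global_scale_noncentered * tf.sqrt(global_scale_variance) * weights_prior_scale

# local scales ~ Cauchy(0, 1), same non-centered compound, one per covariate
local_scale_variances = tfd.InverseGamma(0.5, 0.5).sample(num_features)
local_scales_noncentered = tfd.HalfNormal(scale=1.).sample(num_features)
local_scales = local_scales_noncentered * tf.sqrt(local_scale_variances)

# finally, one weight per covariate
weights = tfd.Normal(loc=0., scale=local_scales * global_scale).sample()
print(weights.numpy())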

2.3 Worked example: recovering the regression weights

Using the default VI algorithm:

import numpy as np
import pandas as pd
import pytest
import tensorflow as tf
import tensorflow_probability as tfp
from numpy.testing import assert_array_equal
from pandas.util.testing import assert_frame_equal

from causalimpact import CausalImpact
from causalimpact.misc import standardize

data = pd.read_csv('../tests/fixtures/btc.csv', parse_dates=True, index_col='Date')
training_start = "2020-12-01"
training_end = "2021-02-05"
treatment_start = "2021-02-08"
treatment_end = "2021-02-09"
pre_period = [training_start, training_end]
post_period = [treatment_start, treatment_end]


rand_data = data
pre_int_period = pre_period
post_int_period = post_period
ci = CausalImpact(rand_data, pre_int_period, post_int_period)


for name, values in ci.model_samples.items():
    print(f'{name}: {values.numpy().mean(axis=0)}')

In the official tutorial, some of the posterior standard deviations under the default algorithm used to be accessed by position:

print('Mean value of noise observation std: ', ci.model_samples[0].numpy().mean())
print('Mean value of level std: ', ci.model_samples[1].numpy().mean())
print('Mean value of linear regression x0, x1: ', ci.model_samples[2].numpy().mean(axis=0))

There are three quantities of interest: the observation-noise std, the level std, and the regression coefficients.
In the current version, ci.model_samples is a dict, which gives:

observation_noise_scale: 0.455566942691803
LocalLevel/_level_scale: 0.01129493024200201
SparseLinearRegression/_global_scale_variance: 0.8674567341804504
SparseLinearRegression/_global_scale_noncentered: 0.9352844953536987
SparseLinearRegression/_local_scale_variances: [1.4979423  1.3915676  2.5401886  0.82398146 1.3655316  1.3455669
 1.2680935  0.9430675  2.4824152 ]
SparseLinearRegression/_local_scales_noncentered: [0.6454335  1.1153866  1.5575289  0.39585915 0.85925627 0.964816
 0.7337909  0.7373393  1.4985642 ]
SparseLinearRegression/_weights_noncentered: [-0.31404912  1.3935399   1.904644   -0.769298    0.7848593   0.53083557
  0.9080193   0.37757748  1.6231401 ]

observation_noise_scale and LocalLevel/_level_scale are the same as before; the linear regression component has been replaced by the library-specific SparseLinearRegression.

To map these samples back to regression weights, two references are useful:

  • a causalimpact GitHub issue
  • the tensorflow_probability source code

First, the tensorflow_probability source, starting around line 298:

  def params_to_weights(global_scale_variance,
                        global_scale_noncentered,
                        local_scale_variances,
                        local_scales_noncentered,
                        weights_noncentered,
                        weights_prior_scale=0.1):
    """Build regression weights from model parameters."""
    global_scale = (global_scale_noncentered *
                    tf.sqrt(global_scale_variance) *
                    weights_prior_scale)
    local_scales = local_scales_noncentered * tf.sqrt(local_scale_variances)
    return weights_noncentered * local_scales * global_scale[..., tf.newaxis]

Here weights_prior_scale is a parameter of the prior distribution that determines the level of sparsity: small scales encourage the weights to be sparse. The default is 0.1.

Looking at the output, the returned weights have shape weights.shape == [num_users, num_features], unlike in the earlier version.

Next, the causalimpact issue, see:
Printing averages of the posterior does not work #24

import tensorflow as tf


weights_prior_scale = 0.1
param_samples = ci.model_samples  # posterior samples from the fitted model above
global_scale_noncentered = param_samples['SparseLinearRegression/_global_scale_noncentered']
global_scale_variance = param_samples['SparseLinearRegression/_global_scale_variance']
local_scales_noncentered = param_samples['SparseLinearRegression/_local_scales_noncentered']
local_scale_variances = param_samples['SparseLinearRegression/_local_scale_variances']
global_scale = global_scale_noncentered * tf.sqrt(global_scale_variance) * weights_prior_scale
weights_noncentered = param_samples['SparseLinearRegression/_weights_noncentered']
local_scales = local_scales_noncentered * tf.sqrt(local_scale_variances)

weights = weights_noncentered * local_scales * global_scale[..., tf.newaxis]

# mean weight per posterior sample (100 of them)
weights.numpy().mean(axis=1)

# mean weight per feature (2 in the original issue's example)
weights.numpy().mean(axis=0)

# overall average weight (a single number)
weights.numpy().mean(axis=1).mean()

Once we have the weights, how is the prediction made? The regression part of the prediction is simply:

predicted_timeseries = self.design_matrix.matmul(weights[..., tf.newaxis])
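As a rough, hand-rolled sketch (not the library's internal prediction path), the same matmul can be reproduced with the posterior-mean weights from the previous snippet; it assumes the response is the first column and all remaining columns are covariates:

import numpy as np

# posterior-mean weight per covariate, from the `weights` tensor computed above
mean_weights = weights.numpy().mean(axis=0)

# design matrix for the post period: every column except the response
X_post = data.loc[post_period[0]:post_period[1]].iloc[:, 1:].to_numpy()

# the regression contribution, analogous to design_matrix.matmul(weights[..., tf.newaxis]);
# note that with standardize=True the fitted weights apply to the standardized covariates
regression_part = X_post @ mean_weights
print(regression_part.shape)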

3 Case study: Bayesian structural models for evaluating the effect of a full-rollout marketing campaign

Reference:
https://cloud.tencent.com/developer/article/2329872

User marketing is an important way to drive retention and conversion, and message-based reach is one of its core levers, especially pushing to users during holiday ticket-purchase peaks. Channels include in-app push, WeChat mini-program subscription messages, official accounts and WeCom; goals include (but are not limited to) reminding users to buy tickets, promoting product features, and issuing coupons to drive conversion. After the holiday, we want to evaluate the effect of this round of marketing reach.

This is a fairly typical scenario where an A/B test is not suitable. First, the holiday is a traffic peak: strictly holding out 50% of users from the push could lose a batch of potential conversions, while shrinking the control group to something like control:treatment = 1:9 compromises the statistical validity of the conversion comparison. Second, holiday push strategies tend to be highly fine-grained, with dozens of messages in total, so it is hard to keep the control group "pure": users may be reached by overlapping campaigns.

Given these problems, it is hard to evaluate the conversion effect of the pushes with a traditional A/B test, so we turn to causal inference instead. The usual candidate methods and their potential issues:

  • PSM: we would need to find users in the overall population who resemble the pushed users but were not pushed, to serve as the control. However, holiday pushes usually include a fallback strategy covering more than 95% of platform users, so it is hard to find eligible, untouched users.
  • SCM: it is hard to find suitable units to synthesize a control from. When evaluating the push effect for the vacation BU, for example, it makes little sense to build a "virtual vacation BU" out of the train, flight and hotel lines, because user demand differs across product lines, and comparing vacation-order conversion against such a synthetic control is not scientifically sound.
  • DID: likewise, it is hard to find a control group that satisfies the parallel-trends assumption and sits in a similar business context for a before/after comparison.

In short, even where these traditional causal-inference methods are technically feasible, they fall short on business interpretability. Moreover, none of the three accounts for the temporal periodicity of ticket-purchasing behaviour, so even a synthesized control group may fail to match the real structural characteristics of the treatment group, biasing the estimated effect. We therefore first verify the periodicity of users' ticket purchases, and once the periodic pattern is identified, use a BSTS model with CausalImpact to predict the counterfactual. Below we use the 2022 Dragon Boat Festival train-ticket marketing push as the worked example.


Concrete implementation:

  • Since the Dragon Boat Festival itself brings natural traffic growth, the lift in paying users cannot be attributed entirely to the push. When training the time-series model we therefore used all Dragon Boat Festival data from 2019-2021 (10 days on either side of the festival) as input to the BSTS model, to learn the structural state specific to this window, and then applied this structural model to the 2022 data to predict the number of conversions after the 2022 push.
  • Finally, the difference between the actual and predicted numbers of converted users measures the effect of this marketing push.
