[Paper Reading] Search-Based Testing Approach for Deep Reinforcement Learning Agents

Contents

I. Paper Information
II. Paper Structure
III. Paper Content
- Abstract
- Translated Abstract

I. Paper Information

Title: Search-Based Testing Approach for Deep Reinforcement Learning Agents

Year: 2022

Journal/Conference: arXiv

Paper link: https://arxiv.org/abs/2206.07813

Authors: Amirhossein Zolfagharian, Manel Abdellatif, Lionel Briand, Mojtaba Bagherzadeh and Ramesh S

II. Paper Structure

1. Introduction
2. Background
	2.1 Definitions
	2.2 State Abstraction
3. Problem Definition
	3.1 RL Agent Testing Challenges
	3.2 Assumptions
4. Approach
	4.1 Reformulation as a Search Problem
	4.2 Overview of the Approach
	4.3 Initial Population
	4.4 Fitness Computations
	4.5 Search Operators
	4.6 Execution of Final Results
5. Empirical Evaluation
	5.1 Research Questions
	5.2 Case Study
	5.3 Implementation
	5.4 Evaluation and Results
6. Discussions
7. Threats to Validity
8. Related Work
9. Conclusion

III. Paper Content

Abstract

Deep Reinforcement Learning (DRL) algorithms have been increasingly employed during the last decade to solve various decision-making problems such as autonomous driving and robotics. However, these algorithms have faced great challenges when deployed in safety-critical environments since they often exhibit erroneous behaviors that can lead to potentially critical errors.

One way to assess the safety of DRL agents is to test them to detect possible faults leading to critical failures during their execution. This raises the question of how we can efficiently test DRL policies to ensure their correctness and adherence to safety requirements.

Most existing works on testing DRL agents use adversarial attacks that perturb states or actions of the agent. However, such attacks often lead to unrealistic states of the environment. Their main goal is to test the robustness of DRL agents rather than testing the compliance of agents’ policies with respect to requirements.

Due to the huge state space of DRL environments, the high cost of test execution, and the black-box nature of DRL algorithms, the exhaustive testing of DRL agents is impossible. In this paper, we propose a Search-based Testing Approach of Reinforcement Learning Agents (STARLA) to test the policy of a DRL agent by effectively searching for failing executions of the agent within a limited testing budget. We use machine learning models and a dedicated genetic algorithm to narrow the search towards faulty episodes.

We apply STARLA on a Deep-Q-Learning agent which is widely used as a benchmark and show that it significantly outperforms Random Testing by detecting more faults related to the agent’s policy. We also investigate how to extract rules that characterize faulty episodes of the DRL agent using our search results. Such rules can be used to understand the conditions under which the agent fails and thus assess its deployment risks.

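Translated Abstract

Over the last decade, Deep Reinforcement Learning (DRL) algorithms have been increasingly used to solve decision-making problems such as autonomous driving, trading decisions, and robotics. However, deploying these algorithms in safety-critical environments remains very challenging, because they often exhibit erroneous behaviors that can lead to critical errors.

One way to assess the safety of a DRL agent is to test it, in order to detect faults that could cause critical failures during execution. This raises the question of how DRL policies can be tested efficiently to ensure their correctness and adherence to safety requirements.

Most existing work on testing DRL agents relies on adversarial attacks that perturb the agent's states or actions. Such attacks, however, often lead to unrealistic environment states. Moreover, their main goal is to test the robustness of DRL agents rather than the compliance of the agents' policies with requirements.

Because of the huge state space of DRL environments, the high cost of test execution, and the black-box nature of DRL algorithms, exhaustive testing of DRL agents is impossible. The paper proposes a Search-based Testing Approach of Reinforcement Learning Agents (STARLA), which tests a DRL agent's policy by effectively searching for failing executions of the agent within a limited testing budget. It relies on machine learning models and a dedicated genetic algorithm to narrow the search toward faulty episodes, that is, sequences of states and actions produced by the DRL agent. STARLA is applied to a Deep-Q-Learning agent that is widely used as a benchmark, and the results show that it significantly outperforms random testing by detecting more faults related to the agent's policy.

The authors also investigate how to extract, from the search results, rules that characterize the faulty episodes of the DRL agent. These rules can be used to understand the conditions under which the agent fails and thus to assess the risks of deploying it.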
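
To make the idea of "searching for failing executions within a limited testing budget" more concrete, here is a minimal Python sketch. It is not STARLA and does not reproduce the paper's algorithm: the one-dimensional environment, the hand-written `policy`, the bounds, and the fitness (the smallest safety margin seen in an episode) are all assumptions made up for illustration. The sketch also searches over initial states and omits the machine learning models mentioned in the abstract. It only shows the general shape of a genetic search and a random-testing baseline that share the same execution budget.

```python
import random
from typing import List, Tuple

# All constants below are illustrative assumptions, not values from the paper.
X_BOUNDS = (-0.5, 0.5)   # allowed initial positions
V_BOUNDS = (-0.3, 0.3)   # allowed initial velocities
MAX_STEPS = 50           # episode length
SAFE_LIMIT = 1.0         # the episode fails once |x| exceeds this

State = Tuple[float, float]


def policy(x: float) -> float:
    """Toy stand-in for the agent under test: push toward the origin, ignoring velocity."""
    return -1.0 if x > 0 else 1.0


def run_episode(x: float, v: float) -> Tuple[float, bool]:
    """Execute one episode; return (smallest safety margin seen, whether it failed)."""
    margin = SAFE_LIMIT - abs(x)
    for _ in range(MAX_STEPS):
        v += 0.05 * policy(x)
        x += v
        margin = min(margin, SAFE_LIMIT - abs(x))
        if margin < 0:
            return margin, True
    return margin, False


def clip(value: float, bounds: Tuple[float, float]) -> float:
    return max(bounds[0], min(bounds[1], value))


def genetic_search(budget: int = 200, pop_size: int = 20, seed: int = 0) -> List[State]:
    """Search initial states for failing episodes using at most `budget` executions."""
    rng = random.Random(seed)
    failures: List[State] = []
    executions = 0

    def evaluate(state: State) -> float:
        nonlocal executions
        executions += 1
        margin, failed = run_episode(*state)
        if failed:
            failures.append(state)
        return margin  # lower is fitter: closer to (or past) the safety limit

    population = [(rng.uniform(*X_BOUNDS), rng.uniform(*V_BOUNDS)) for _ in range(pop_size)]
    scored = [(evaluate(s), s) for s in population]

    while executions + pop_size <= budget:
        children = []
        for _ in range(pop_size):
            # Size-2 tournament selection for each parent.
            p1 = min(rng.sample(scored, 2))[1]
            p2 = min(rng.sample(scored, 2))[1]
            # Arithmetic crossover plus Gaussian mutation, clipped to the valid range.
            w = rng.random()
            x = clip(w * p1[0] + (1 - w) * p2[0] + rng.gauss(0, 0.05), X_BOUNDS)
            v = clip(w * p1[1] + (1 - w) * p2[1] + rng.gauss(0, 0.03), V_BOUNDS)
            children.append((x, v))
        scored_children = [(evaluate(c), c) for c in children]
        # Elitist replacement: keep the pop_size individuals with the lowest margin.
        scored = sorted(scored + scored_children)[:pop_size]
    return sorted(set(failures))


def random_search(budget: int = 200, seed: int = 0) -> List[State]:
    """Baseline: sample initial states uniformly and keep those whose episode fails."""
    rng = random.Random(seed)
    states = [(rng.uniform(*X_BOUNDS), rng.uniform(*V_BOUNDS)) for _ in range(budget)]
    return [s for s in states if run_episode(*s)[1]]


if __name__ == "__main__":
    print("failures found by genetic search:", len(genetic_search()))
    print("failures found by random search:", len(random_search()))
```

In this toy setup the genetic search tends to converge on one failing region, so the failures it reports can be very similar to each other. A real approach would also need to account for the diversity of the faults it finds, which this sketch does not attempt.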
