Black-Box Stress Testing
Unhide for installation (waiting on Julia registry).
    using Distributions, Parameters, POMDPStressTesting, Latexify, PlutoUI

To find failures in a black-box autonomous system, we can use the POMDPStressTesting package, which is part of the POMDPs.jl ecosystem.
Various solvers that adhere to the POMDPs.jl interface can be used:
- MCTSPWSolver (MCTS with action progressive widening)
- TRPOSolver and PPOSolver (deep reinforcement learning policy optimization)
- CEMSolver (cross-entropy method)
- RandomSearchSolver
Simple Problem: One-Dimensional Walk
We define a simple problem for adaptive stress testing (AST) to find failures. This problem, called Walk1D, samples random walking distances from a standard normal distribution, Normal(0, 1), and treats walking past a threshold (±10 in this example) as a failure.
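For intuition, the walk can be simulated directly in plain Julia, without the package. The following is only a minimal sketch: `walk_fails` and `estimate_failure_rate` are hypothetical helpers that are not part of POMDPStressTesting, and their defaults (start at 0, threshold 10, 30 steps) are assumed to mirror the parameters defined in this notebook.

```julia
using Random

# Hypothetical helper: simulate one walk and report whether |x| reached
# the threshold within the time limit.
function walk_fails(rng::AbstractRNG; startx=0.0, threshx=10.0, endtime=30)
    x = startx
    for _ in 1:endtime
        x += randn(rng)                    # step drawn from Normal(0, 1)
        abs(x) >= threshx && return true   # crossed the ± threshold
    end
    return false
end

# Naive Monte Carlo estimate of the failure probability.
function estimate_failure_rate(n; seed=0)
    rng = MersenneTwister(seed)
    return count(_ -> walk_fails(rng), 1:n) / n
end

rate = estimate_failure_rate(100_000)
println("Estimated failure probability: ", rate)
```

Naive sampling like this spends most of its effort on walks that never fail, which is the motivation for guiding the search with AST instead.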
Gray-Box Simulator and Environment
The simulator and environment are treated as gray-box because we need access to the state-transition distributions and their associated likelihoods.
Parameters
First, we define the parameters of our simulation.
    @with_kw mutable struct Walk1DParams
        startx::Float64 = 0   # Starting x-position
        threshx::Float64 = 10 # ± boundary threshold
        endtime::Int64 = 30   # Simulation end time
    end;

Simulation
Next, we define a GrayBox.Simulation structure.
    @with_kw mutable struct Walk1DSim <: GrayBox.Simulation
        params::Walk1DParams = Walk1DParams()       # Parameters
        x::Float64 = 0                              # Current x-position
        t::Int64 = 0                                # Current time
        distribution::Distribution = Normal(0, 1)   # Transition distribution
    end;

GrayBox.environment
Then, we define our GrayBox.Environment distributions. When using the ASTSampleAction, as opposed to ASTSeedAction, we need to provide access to the sampleable environment.
    GrayBox.environment(sim::Walk1DSim) = GrayBox.Environment(:x => sim.distribution)

GrayBox.transition!
We override the transition! function from the GrayBox interface, which takes an environment sample as input. We apply the sample in our simulator, and return the log-likelihood.
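As a rough illustration of what the returned log-likelihood represents: with a Normal(0, 1) transition distribution, each sample contributes its log-density, and the log-likelihood of a trajectory is the sum of those terms. The sketch below computes this by hand in plain Julia rather than through the package. `std_normal_logpdf` is a hypothetical helper, and the step values are the first three samples of the action trace shown in the Searching for Failures section.

```julia
# Log-density of a standard normal, written out by hand.
std_normal_logpdf(x) = -0.5 * x^2 - 0.5 * log(2π)

steps = [0.015328, 0.28838, 1.19606]     # sampled walking distances
logprobs = std_normal_logpdf.(steps)     # per-step log-likelihoods
total = sum(logprobs)                    # trajectory log-likelihood

println(logprobs)
println(total)
```

The per-step values agree with the second field of each `Sample` printed in the action trace (for example, -0.919056 for the step 0.015328), which suggests that field stores the sample's log-probability.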
    function GrayBox.transition!(sim::Walk1DSim, sample::GrayBox.EnvironmentSample)
        sim.t += 1                  # Keep track of time
        sim.x += sample[:x].value   # Move agent using sampled value from input
        return logpdf(sample)::Real # Summation handled by `logpdf()`
    end

Black-Box System
The system under test, in this case a simple single-dimensional moving agent, is always treated as black-box. The following interface functions are overridden to minimally interact with the system, and use outputs from the system to determine failure event indications and distance metrics.
BlackBox.initialize!
Now we override the BlackBox interface, starting with the function that initializes the simulation object. Interface functions ending in ! may modify the sim object in place.
    function BlackBox.initialize!(sim::Walk1DSim)
        sim.t = 0
        sim.x = sim.params.startx
    end

BlackBox.distance
We define how close we are to a failure event using a non-negative distance metric.
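As a worked example, with the threshold of 10 used in this notebook the metric is max(10 - |x|, 0): it shrinks as the agent approaches either boundary and stays at zero once the boundary is crossed. The sketch below uses a hypothetical standalone function, `miss_distance`, rather than the `BlackBox` interface. The positive x-values are taken from the playback output later in the notebook, and -12.0 is a made-up value that shows the metric is symmetric.

```julia
# Hypothetical standalone version of the distance metric (threshold assumed to be 10).
miss_distance(x; threshx=10.0) = max(threshx - abs(x), 0.0)

for x in (0.0, 9.17896, 10.4903, -12.0)
    println("x = ", x, "  distance = ", miss_distance(x))
end
```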
    BlackBox.distance(sim::Walk1DSim) = max(sim.params.threshx - abs(sim.x), 0)

BlackBox.isevent
We define an indication that a failure event occurred.
    BlackBox.isevent(sim::Walk1DSim) = abs(sim.x) ≥ sim.params.threshx

BlackBox.isterminal
Similarly, we define an indication that the simulation is in a terminal state.
    function BlackBox.isterminal(sim::Walk1DSim)
        return BlackBox.isevent(sim) || sim.t ≥ sim.params.endtime
    end

BlackBox.evaluate!
Lastly, we use our defined interface to evaluate the system under test. Using the input sample, we return the log-likelihood, distance to an event, and event indication.
    function BlackBox.evaluate!(sim::Walk1DSim, sample::GrayBox.EnvironmentSample)
        logprob::Real = GrayBox.transition!(sim, sample) # Step simulation
        d::Real       = BlackBox.distance(sim)           # Calculate miss distance
        event::Bool   = BlackBox.isevent(sim)            # Check event indication
        return (logprob::Real, d::Real, event::Bool)
    end

AST Setup and Running
To set up, we instantiate our simulation object and pass it to the Markov decision process (MDP) object of the adaptive stress testing formulation. We use Monte Carlo tree search (MCTS) with progressive widening on the action space as our solver. Hyperparameters are passed to MCTSPWSolver, which is a simple wrapper around the POMDPs.jl implementation of MCTS. Lastly, we solve the MDP to produce a planner. Note that we are using the ASTSampleAction.
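Progressive widening limits how many distinct actions a tree node may hold as a function of how often it has been visited. A commonly used criterion allows a new action while the number of children is at most k * N^alpha, where N is the node's visit count. The sketch below illustrates that criterion only. `can_widen` is a hypothetical helper, and reading `k_action` and `alpha_action` as k and alpha is an assumption that this notebook does not confirm.

```julia
# Hypothetical helper: widening criterion n_children <= k * n_visits^alpha.
can_widen(n_children, n_visits; k=1.0, alpha=0.5) = n_children <= k * n_visits^alpha

# With k = 1.0 and alpha = 0.5, the action limit grows like sqrt(N).
for n_visits in (1, 10, 100, 1000)
    println("visits = ", n_visits, "  action limit ≈ ", round(1.0 * n_visits^0.5, digits=2))
end
```

Under this reading, a larger k or alpha makes the tree wider (more sampled actions per node), while smaller values concentrate iterations on fewer actions.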
    function setup_ast(seed=0)
        # Create gray-box simulation object
        sim::GrayBox.Simulation = Walk1DSim()

        # AST MDP formulation object
        mdp::ASTMDP = ASTMDP{ASTSampleAction}(sim)
        mdp.params.debug = true # record metrics
        mdp.params.top_k = 10   # record top k best trajectories
        mdp.params.seed = seed  # set RNG seed for determinism

        # Hyperparameters for MCTS-PW as the solver
        solver = MCTSPWSolver(n_iterations=1000,        # number of algorithm iterations
                              exploration_constant=1.0, # UCT exploration
                              k_action=1.0,             # action widening
                              alpha_action=0.5,         # action widening
                              depth=sim.params.endtime) # tree depth

        # Get online planner (no work done, yet)
        planner = solve(solver, mdp)

        return planner
    end;

Searching for Failures
After setup, we search for failures using the planner and output the best action trace.
    planner = setup_ast();

    action_trace = search!(planner)

ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.015328, -0.919056)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.28838, -0.96052)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.19606, -1.63422)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.609811, -1.10487)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.843249, -1.27447)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.47323, -2.00414)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(2.37764, -3.74553)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.23179, -1.67759)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.14347, -1.5727)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.31136, -1.77877)))

Playback
We can also play back specific trajectories and print intermediate x-values.

    playback_trace = playback(planner, action_trace, sim->sim.x, return_trace=true)

0.0
0.015328
0.303708
1.49977
2.10958
2.95283
4.42605
6.8037
8.03549
9.17896
10.4903

    failure_rate = print_metrics(planner)

0.8159216715195341

Other Solvers: Cross-Entropy Method
We can easily take our ASTMDP object (planner.mdp) and re-solve the MDP using a different solver, in this case the CEMSolver.
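For readers unfamiliar with the cross-entropy method, the idea is to sample candidates from a distribution, keep the best-scoring "elite" samples, refit the distribution to them, and repeat. The sketch below is a generic one-dimensional illustration in plain Julia and is not the package's CEMSolver implementation. `cem_maximize`, its keyword arguments, and the toy objective are all hypothetical.

```julia
using Random, Statistics

# Generic cross-entropy method for maximizing a scalar function f(x).
function cem_maximize(f; mu=0.0, sigma=5.0, n_samples=100, n_elite=10,
                      n_iterations=30, seed=0)
    rng = MersenneTwister(seed)
    for _ in 1:n_iterations
        xs = mu .+ sigma .* randn(rng, n_samples)          # sample candidates
        best = partialsortperm(f.(xs), 1:n_elite; rev=true) # indices of top scores
        elite = xs[best]
        mu = mean(elite)                                    # refit the sampling
        sigma = std(elite) + 1e-6                           # distribution to the elite
    end
    return mu
end

# Toy objective with its maximum at x = 3.
best_x = cem_maximize(x -> -(x - 3)^2)
println("CEM estimate of the maximizer: ", best_x)
```

In the AST setting the sampled objects are whole trajectories of environment samples rather than a single number, but the sample, select, refit loop is the same basic idea.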
    mdp = planner.mdp; # reused from above

    cem_solver = CEMSolver(n_iterations=1000, episode_length=mdp.sim.params.endtime)

CEMSolver
n_iterations: Int64 1000
episode_length: Int64 30
num_samples: Int64 100
min_elite_samples: Int64 10
max_elite_samples: Int64 9223372036854775807
elite_thresh: Float64 -0.99
weight_fn: #16 (function of type POMDPStressTesting.var"#16#22")
add_entropy: #17 (function of type POMDPStressTesting.var"#17#23")
show_progress: Bool true
verbose: Bool false

    cem_planner = solve(cem_solver, mdp);

    cem_action_trace = search!(cem_planner);

Notice the failure rate is roughly 16x that of MCTSPWSolver (about 12.88 versus 0.82).
    cem_failure_rate = print_metrics(cem_planner)

12.882493795314756

AST Reward Function
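The AST reward function gives a reward of 0 if the simulation terminates with an event, a reward of the negative distance -d if it terminates without an event, and the log-likelihood log(p) at every non-terminal step.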
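Written as a piecewise function, reading the symbols off the code in this section (p as the transition probability, e as the event indication, d as the miss distance, and τ as the terminal indication; these readings are an interpretation, since the notebook does not define the symbols explicitly):

```latex
R(p, e, d, \tau) =
\begin{cases}
0 & \text{if } \tau \land e \\
-d & \text{if } \tau \land \lnot e \\
\log(p) & \text{otherwise}
\end{cases}
```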
    function R(p,e,d,τ)
        if τ && e
            return 0
        elseif τ && !e
            return -d
        else
            return log(p)
        end
    end

References
Robert J. Moss, Ritchie Lee, Nicholas Visser, Joachim Hochwarth, James G. Lopez, and Mykel J. Kochenderfer, "Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems", Digital Avionics Systems Conference, 2020.