This repository has been archived by the owner on Jul 14, 2021. It is now read-only.

Abstract and acknowledgments sections
mossr committed Apr 28, 2021
1 parent f22162f commit 2612eeb
Showing 6 changed files with 90 additions and 59 deletions.
32 changes: 32 additions & 0 deletions chapters/abstract.tex
@@ -0,0 +1,32 @@
Before safety-critical autonomous systems are deployed into the real world, we must first validate how safe they are by stress testing the systems in simulation.
This work proposes several techniques to aid in efficient stress testing of black-box systems, especially when those systems are computationally expensive to evaluate.
We first introduce novel variants of the cross-entropy method for stochastic optimization used to find rare failure events.
The original cross-entropy method relies on a sufficient number of objective function calls to accurately estimate the optimal parameters of the underlying distribution, and it may get stuck in local minima.
% Certain objective functions may be computationally expensive to evaluate, and the cross-entropy method could potentially get stuck in local minima.
The variants we introduce address these concerns; the primary idea is to use every sample to build a surrogate model that offloads computation from an expensive system under test.
% To mitigate expensive function calls, during optimization we use every sample to build a surrogate model to approximate the objective function.
% The surrogate model augments the belief of the objective function with less expensive evaluations.
%%% We use a Gaussian process for our surrogate model to incorporate uncertainty in the predictions which is especially helpful when dealing with sparse data.
%%% To address local minima convergence, we use Gaussian mixture models to encourage exploration of the design space.
To test our approach, we created a parameterized test objective function with many local minima and a single global minimum, where the test function can be adjusted to control the spread and distinction of the minima.
Experiments were run to stress the cross-entropy method variants, and results indicate that the surrogate model-based approach reduces convergence to local minima using the same number of function evaluations.
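To make the baseline concrete, here is a minimal sketch of the standard cross-entropy method for minimization (without the surrogate-model or mixture-model variants the chapter introduces); the test function, sample sizes, and parameters below are illustrative assumptions, not the thesis's actual experimental setup.

```python
import numpy as np

def cross_entropy_minimize(f, mu, sigma, n_samples=50, n_elite=10, iters=20):
    """Standard CE-method sketch: iteratively refit a diagonal Gaussian
    to the elite (lowest-cost) fraction of the sampled population."""
    for _ in range(iters):
        X = np.random.normal(mu, sigma, size=(n_samples, len(mu)))
        costs = np.array([f(x) for x in X])
        elite = X[np.argsort(costs)[:n_elite]]          # keep lowest-cost samples
        mu = elite.mean(axis=0)                          # refit distribution params
        sigma = elite.std(axis=0) + 1e-8                 # avoid total collapse
    return mu

# Toy objective (an assumption for illustration): bowl with minimum at (1, -2)
np.random.seed(0)
f = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2
x_best = cross_entropy_minimize(f, mu=np.zeros(2), sigma=3 * np.ones(2))
# x_best converges toward the global minimum at (1, -2)
```

The surrogate-model variants described above would additionally record every `(x, f(x))` pair to fit an approximate objective, replacing some true evaluations with cheap surrogate predictions.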

To find failure events and their likelihoods in computationally expensive sequential decision making systems, we propose a modification to the black-box stress testing approach called \textit{adaptive stress testing}.
This modification generalizes adaptive stress testing to be broadly applied to episodic systems, where a reward is only received at the end of an episode.
To test this approach, we analyze an aircraft trajectory predictor from a developmental commercial flight management system which takes as input a collection of lateral waypoints and en-route environmental conditions.
The intention of this work is to find likely failures and report them back to the developers so they can address and potentially resolve shortcomings of the system before deployment.
We use a modified Monte Carlo tree search algorithm with progressive widening as our adversarial reinforcement learner and compare its performance to direct Monte Carlo simulations and to the cross-entropy method as an alternative importance sampling baseline.
The goal is to find potential problems otherwise not found by traditional requirements-based avionics testing.
Results indicate that our adaptive stress testing approach finds more failures with higher likelihoods relative to the baselines.
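The episodic reformulation described above can be sketched as follows: collect the transition path during the rollout and score it only once the episode terminates. This is a minimal illustration under assumed toy dynamics (a random walk with a failure threshold), not the flight management system interface or the MCTS learner itself.

```python
import random

def episodic_rollout(step, is_terminal, score, s0, sample_disturbance, max_depth=100):
    """Episodic stress-testing sketch (hypothetical interface): the reward
    is only received at the end of an episode, so we accumulate the path
    of states and disturbances and evaluate it after termination."""
    state, path = s0, []
    for _ in range(max_depth):
        x = sample_disturbance()          # adversarial/environment disturbance
        state = step(state, x)            # black-box system transition
        path.append((state, x))
        if is_terminal(state):
            break
    return score(path)                    # e.g., failure indicator for the episode

# Toy system (an assumption): random walk "fails" if it drifts past a threshold
random.seed(0)
reward = episodic_rollout(
    step=lambda s, x: s + x,
    is_terminal=lambda s: abs(s) > 3.0,
    score=lambda path: 1.0 if abs(path[-1][0]) > 3.0 else 0.0,
    s0=0.0,
    sample_disturbance=lambda: random.gauss(0, 1),
)
```

In the full approach, an adversarial learner such as MCTS with progressive widening chooses the disturbances to maximize this end-of-episode score, rather than sampling them blindly.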

When validating a system that relies on a static validation dataset, one could exhaustively evaluate the entire dataset, yet that process may be computationally intractable, especially when testing minor modifications to the system under test.
To address this, we reformulate the problem to intelligently select candidate validation data points that we predict are likely to cause a failure, using knowledge of the system failures experienced so far.
We propose an adaptive black-box validation framework that will learn system weaknesses over time and exploit this knowledge to propose validation samples that will likely result in a failure.
We use a low-dimensional encoded representation of inputs to train an adversarial failure classifier to intelligently select candidate failures to evaluate.
Experiments were run to test our approach against a random candidate selection process, and we also compare against full knowledge of the true system failures.
We stress test a black-box neural network classifier trained on the MNIST dataset,
and results show that, using our framework, the adversarial failure classifier selects failures about $3$ times more often than random selection.
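The selection step above can be sketched as ranking unevaluated validation points by the failure classifier's predicted failure probability. The 2D encodings and the stand-in "classifier" below are purely illustrative assumptions; the thesis uses a learned low-dimensional encoding and an adversarial failure classifier, neither of which is reproduced here.

```python
import numpy as np

def select_candidates(encodings, failure_model, n=10):
    """Rank unevaluated points by predicted failure probability and
    return the top-n candidates to evaluate on the true system."""
    scores = failure_model(encodings)      # predicted P(failure | encoding)
    return np.argsort(scores)[::-1][:n]    # highest predicted risk first

# Toy setup (an assumption): 2D encodings where failures cluster near the origin
rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 2))
toy_model = lambda Z: np.exp(-np.linalg.norm(Z, axis=1))  # stand-in classifier
picks = select_candidates(Z, toy_model, n=10)
```

Each selected candidate would then be evaluated on the true system, and the observed pass/fail outcome folded back into the classifier's training set, closing the adaptive loop.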

A motivating principle of this work is a commitment to open source software.
The core software for each of the introduced techniques has been developed as Julia packages and publicly released.
We introduce the software at a high level and discuss alternative applications from both a research and industrial perspective.
52 changes: 52 additions & 0 deletions chapters/acknowledgments.tex
@@ -0,0 +1,52 @@
% \begin{itemize}
% \item Thank Mykel.
% \begin{itemize}
% \item Saw inspiration in 2012 at MIT Lincoln Laboratory
% \item Always available
% \item High integrity
% \item A model leader
% \end{itemize}
% \item Thank Dorsa for being a secondary adviser.
% \item Thank Eva Moss.
% \begin{itemize}
% \item Supportive
% \item Interested
% \item Flexible
% \end{itemize}
% \begin{itemize}
% \item SISL: Ritchie (for AST), Anthony, Mark, Shushman Choudhury, Jayesh Gupta, Bernard Lange, Alex Koufos, Sydney Katz, SISL as a whole. Zachary Sunberg for POMDPs.jl and MCTS.jl. Tomer Arnon (CEM).
% \end{itemize}
% \begin{itemize}
% \item Sponsors: GE (Jerry, Joachim, Nick, etc.)
% \item NASA Ames: Edward Balaban.
% \item AI Center for Safety
% \end{itemize}
% \end{itemize}

I would especially like to thank Professor Mykel Kochenderfer for his continued guidance over the past decade.
Our time together at MIT Lincoln Laboratory has certainly shaped me into the researcher I am today.
The positivity in Mykel's leadership is inspirational and his high level of integrity and honesty encouraged me to always do my best.
He is a model leader, and I thank him for the incredible opportunities both at MIT Lincoln Laboratory and here at Stanford.
I'd also like to thank Professor Dorsa Sadigh for being my secondary research adviser and for her advice regarding this thesis.

As with many theses, I am standing on the shoulders of giants.
I would like to thank Dr. Ritchie Lee from NASA Ames for his original development of the adaptive stress testing idea and for his patience and guidance as he helped shape my ideas.
I want to thank Dr. Anthony Corso, Dr. Mark Koren, and Dr. Alex Koufos for always listening to my ideas, encouraging my excitement in the field of AI safety, and always providing constructive feedback. Without their advice, this work would not have been possible.
I'd also like to thank members of the Stanford Intelligent Systems Laboratory (SISL) for their encouragement and willingness to listen; particularly Bernard Lange, Dr. Shushman Choudhury, Dr. Jayesh Gupta, Sydney Katz, and Tomer Arnon.
Because this work is built on other open source tools, I'm forever indebted to the SISL members who developed the POMDPs.jl ecosystem; this includes Dr. Zachary Sunberg, Maxim Egorov, and Dr. Tim Wheeler.
I want to extend a thank you to Dr. Edward Balaban at NASA Ames for the opportunity to work on a decision-making-under-uncertainty system for a high-profile NASA mission.

% Episodic AST
Part of this work was supported by GE's Global Research Center and GE Aviation through the Stanford Center for AI Safety.
I want to thank each of these organizations for their fascinating problems and for allowing me to explore research ideas that not only fit my interests but also had large industrial impact.
I also want to thank the NASA AOSP System-Wide Safety Project for partially supporting this work and Dr. Jerry Lopez, Nicholas Visser, and Joachim Hochwarth for their engineering guidance.

My family and friends have always been there for me, even as we are physically distant.
My Mom, Dad, brothers Travis and Jake, and sister Emily are a big reason I have the core values that have helped me succeed.
Their love and support are infinite, and I cannot thank them enough for the life they've provided for me.
Everyone back in Rockport, MA and beyond has seen me grow through every phase of my life, and that bond is irreplaceable; so thank you.

Lastly---but most importantly---I want to thank my wife, Eva Moss, for always being supportive and growing with me during my graduate studies.
Eva, you always make me laugh and smile and have shaped me into a better person because of it.
Your logical thinking always helps me to check my opinions at the door.
Your flexibility in leaving our home back in Massachusetts and moving out to California tremendously helped in reducing the stress of graduate school---I love you and I am forever grateful.
17 changes: 0 additions & 17 deletions chapters/cem_variants.tex
@@ -1,20 +1,3 @@
% \begin{abstract}
% The cross-entropy (CE) method is a popular stochastic method for optimization due to its simplicity and effectiveness.
% Designed for rare-event simulations where the probability of a target event occurring is relatively small,
% the CE-method relies on enough objective function calls to accurately estimate the optimal parameters of the underlying distribution.
% Certain objective functions may be computationally expensive to evaluate, and the CE-method could potentially get stuck in local minima.
% This is compounded with the need to have an initial covariance wide enough to cover the design space of interest.
% We introduce novel variants of the CE-method to address these concerns.
% To mitigate expensive function calls, during optimization we use every sample to build a surrogate model to approximate the objective function.
% The surrogate model augments the belief of the objective function with less expensive evaluations.
% We use a Gaussian process for our surrogate model to incorporate uncertainty in the predictions which is especially helpful when dealing with sparse data.
% To address local minima convergence, we use Gaussian mixture models to encourage exploration of the design space.
% We experiment with evaluation scheduling techniques to reallocate true objective function calls earlier in the optimization when the covariance is the largest.
% To test our approach, we created a parameterized test objective function with many local minima and a single global minimum. Our test function can be adjusted to control the spread and distinction of the minima.
% Experiments were run to stress the cross-entropy method variants and results indicate that the surrogate model-based approach reduces local minima convergence using the same number of function evaluations.
% \end{abstract}


\section{Introduction}
The cross-entropy (CE) method is a probabilistic optimization approach that attempts to iteratively fit a distribution to elite samples from an initial input distribution \cite{rubinstein2004cross,rubinstein1999cross}.
The goal is to estimate a rare-event probability by minimizing the \textit{cross-entropy} between the two distributions \cite{de2005tutorial}.
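As a reminder of the standard formulation (following the cited references, with notation that may differ from the rest of the chapter: $S$ the objective, $f(\cdot;\theta)$ the parameterized sampling density, $\gamma_t$ the elite threshold at iteration $t$), the CE-method parameter update fits the distribution to the elite samples via maximum likelihood:

```latex
\hat{\theta}_{t+1} = \operatorname*{arg\,max}_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{S(x_i) \ge \gamma_t\} \, \ln f(x_i; \theta)
```

For a Gaussian $f$, this reduces to setting the mean and covariance to the sample mean and covariance of the elite set.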
25 changes: 0 additions & 25 deletions chapters/episodic_ast.tex
@@ -1,20 +1,3 @@
% \begin{abstract}
% To find failure events and their likelihoods in flight-critical systems, we investigate the use of an advanced black-box stress testing approach called adaptive stress testing.
% We analyze a trajectory predictor from a developmental commercial flight management system which takes as input a collection of lateral waypoints and en-route environmental conditions.
% Our aim is to search for failure events relating to inconsistencies in the predicted lateral trajectories.
% The intention of this work is to find likely failures and report them back to the developers so they can address and potentially resolve shortcomings of the system before deployment.
% To improve search performance, this work extends the adaptive stress testing formulation to be applied more generally to sequential decision-making problems with episodic reward by collecting the state transitions during the search and evaluating at the end of the simulated rollout.
% We use a modified Monte Carlo tree search algorithm with progressive widening as our adversarial reinforcement learner.
% The performance is compared to direct Monte Carlo simulations and to the cross-entropy method as an alternative importance sampling baseline.
% The goal is to find potential problems otherwise not found by traditional requirements-based testing.
% Results indicate that our adaptive stress testing approach finds more failures and finds failures with higher likelihood relative to the baseline approaches.
% \end{abstract}

% \begin{IEEEkeywords}
% cyber-physical systems, adaptive stress testing, black-box, Markov decision process, Monte Carlo tree search, flight management systems.
% \end{IEEEkeywords}


\section{Introduction}
\label{sec:introduction}

@@ -593,11 +576,3 @@ \section{Conclusion}
Results suggest that the AST approach finds more failures with both higher severity and higher relative likelihood.
The failure cases are provided to the system engineers to address unwanted behaviors before system deployment.
In addition to requirements-based tests, we show that AST can be used for confidence testing during development.


\section*{Acknowledgments}
The authors would like to thank GE's Global Research Center and GE Aviation for supporting this work through the Stanford Center for AI Safety.
We also thank the NASA AOSP System-Wide Safety Project for partially supporting this work.
The authors would also like to thank Nicholas Visser and Joachim Hochwarth for their contributions.
We thank Anthony Corso and Mark Koren for their feedback and the Stanford Intelligent Systems Laboratory for their development of the POMDPs.jl ecosystem and the MCTS.jl package.
We would also like to thank Mark Darnell, Scott Edwards, and Andrew Foster for their technical support.
14 changes: 0 additions & 14 deletions chapters/weakness_rec.tex
@@ -1,17 +1,3 @@
% \title{Adversarial Weakness Recognition\\ for Efficient Black-Box Validation}

% \begin{abstract}
% When validating a black-box system, exhaustively evaluating over the entire validation dataset may be computationally intractable.
% The challenge then becomes to intelligently automate selective validation given knowledge of the system failures experienced so far.
% We propose an adaptive black-box validation framework that will learn system weaknesses over time and exploit this knowledge to propose validation samples that will likely result in a failure.
% We use a low-dimensional encoded representation of inputs to train an adversarial failure classifier to intelligently select candidate failures to evaluate.
% Experiments were run to test our approach against a random candidate selection process and we also compare against full knowledge of the true system failures.
% We stress test a black-box neural network classifier trained on the MNIST dataset.
% Results show that using our framework, the adversarial failure classifier selects failures about $3$ times more often than random.

% \end{abstract}


\section{Introduction}
Finding failures in a validation dataset may be computationally expensive if we search over the entire dataset.
Then the challenge becomes how to intelligently select candidate inputs that are likely to lead to failures.
9 changes: 6 additions & 3 deletions main.tex
@@ -31,13 +31,16 @@
\principaladviser{Mykel J. Kochenderfer}
\firstreader{Dorsa Sadigh}


\beforepreface
\prefacesection{Preface}
This thesis tells you all you need to know about...
\prefacesection{Abstract}
\input{chapters/abstract}
%
\prefacesection{Acknowledgments}
I would like to thank...
\input{chapters/acknowledgments}
\afterpreface


\chapter{Introduction}
\input{chapters/introduction}

