This repository has been archived by the owner on Jul 14, 2021. It is now read-only.

Mixture model component equations separated
mossr committed May 3, 2021
1 parent aa0de28 commit e424b87
Showing 2 changed files with 13 additions and 12 deletions.
25 changes: 13 additions & 12 deletions chapters/cem_variants.tex
@@ -261,43 +261,44 @@ \section{Experiments} \label{sec:cem_experiments}


\subsection{Test Objective Function Generation}\label{sec:sierra}
To stress the cross-entropy method and its variants, we created a test objective function called \textit{sierra} that is generated from a mixture model composed of $49$ multivariate Gaussian distributions.
We chose this construction so that we can use the negative peaks of the component distributions as local minima and can force a global minimum centered at our desired $\mathbf{\widetilde{\vec{\mu}}}$.
\begin{figure*}[!t]
\centering
\resizebox{0.8\textwidth}{!}{\input{figures/cem_variants/sierra_group.tex}}
\resizebox{0.7\textwidth}{!}{\input{figures/cem_variants/sierra_group.tex}}
\caption{
\label{fig:sierra}
Example test objective functions generated using the sierra function.
}
\end{figure*}
To stress the cross-entropy method and its variants, we created a test objective function called \textit{sierra} that is generated from a mixture model composed of $49$ multivariate Gaussian distributions.
We chose this construction so that we can use the negative peaks of the component distributions as local minima and can force a global minimum centered at our desired $\mathbf{\widetilde{\vec{\mu}}}$.
The construction of the sierra test function can be controlled by parameters that define the spread of the local minima.
We first start with the center defined by a mean vector $\mathbf{\widetilde{\vec{\mu}}}$ and we use a common covariance $\mathbf{\widetilde{\mat{\Sigma}}}$:
\begin{align*}
\mathbf{\widetilde{\vec{\mu}}} &= [\mu_1, \mu_2], \quad \mathbf{\widetilde{\mat{\Sigma}}} = \begin{bmatrix}\sigma & 0\\ 0 & \sigma \end{bmatrix}
\end{align*}
Next, we use the parameter $\delta$ to control the clustered distance between symmetric points:
\begin{align*}
\mat{G} &= \left\{[+\delta, +\delta], [+\delta, -\delta], [-\delta, +\delta], [-\delta, -\delta]\right\}
\mat{G} &= \Big\{[+\delta, +\delta], [+\delta, -\delta], [-\delta, +\delta], [-\delta, -\delta]\Big\}
\end{align*}
We chose points $\mat{P}$ to fan out the clustered minima relative to the center defined by $\mathbf{\widetilde{\vec{\mu}}}$:
\begin{align*}
\mat{P} &= \left\{[0, 0], [1, 1], [2, 0], [3, 1], [0, 2], [1, 3]\right\}
\mat{P} &= \Big\{[0, 0], [1, 1], [2, 0], [3, 1], [0, 2], [1, 3]\Big\}
\end{align*}
The vector $\vec{s}$ is used to control the $\pm$ distance to create an `s' shape comprised of minima, using the standard deviation $\sigma$:
$\vec{s} = [+\sigma, -\sigma]$.
We set the following default parameters: standard deviation $\sigma=3$, spread rate $\eta=6$, and cluster distance $\delta=2$.
We can also control whether the local minima clusters ``decay'', thus making those local minima less distinct (where $\text{decay} \in \{0, 1\}$).
The parameters that define the sierra function are collected into $\vec{\theta} = \langle \mathbf{\widetilde{\vec{\mu}}}, \mathbf{\widetilde{\mat{\Sigma}}}, \mat{G}, \mat{P}, \vec{s} \rangle$.
Using these parameters, we can define the mixture model with uniform weights used by the sierra function as:
\begin{gather}
\Sierra \sim \operatorname{Mixture}\biggl(\underbrace{\Big\{ \vec{g} + s\vec{p}_i + \widetilde{\vec{\mu}} \mid (\vec{g}, \vec{p}_i, s) \in (\mat{G}, \mat{P}, \vec{s}) \Big\}}_{\text{component means}}, \underbrace{\Big\{ \widetilde{\mat{\Sigma}} \cdot i^{\text{decay}}/\eta \mid i \Big\}}_{\text{component covariances}} \mid \vec{\theta} \biggr)
\end{gather}
Using the parameters $\langle \mathbf{\widetilde{\vec{\mu}}}, \mathbf{\widetilde{\mat{\Sigma}}}, \mat{G}, \mat{P}, \vec{s} \rangle$, we can define the sierra mixture model with the set of component means $\vec{\mu}_\mathcal{S}$, component covariances $\mat{\Sigma}_\mathcal{S}$, and weights $\vec{w}_\mathcal{S}$ as:
\begin{align*}
\vec{\mu}_\mathcal{S} &= \Big\{ \vec{g} + s\vec{p}_i + \widetilde{\vec{\mu}} \mid \vec{g} \in \mat{G}, s \in \vec{s}, \vec{p}_i \in \mat{P} \Big\}\tag{component means}\\
\mat{\Sigma}_\mathcal{S} &= \Big\{ \widetilde{\mat{\Sigma}} \cdot i^{\text{decay}}/\eta \mid i \in \{1,\ldots,|\mat{P}|\} \Big\}\tag{component covariances}\\
\vec{w}_\mathcal{S} &= \Big\{ 1/m \mid m = |\mat{G}|\cdot|\vec{s}|\cdot|\mat{P}| + 1 \Big\}\tag{uniform component weights}
\end{align*}
We add a final component to be our global minimum, centered at $\mathbf{\widetilde{\vec{\mu}}}$ with the common covariance divided by $\sigma\eta$. Namely, the global minimum is $\vec{x}^* = \E\left[\Normal(\mathbf{\widetilde{\vec{\mu}}}, \mathbf{\widetilde{\mat{\Sigma}}}/(\sigma\eta))\right] = \mathbf{\widetilde{\vec{\mu}}}$.
We can now use this constant mixture model with $49$ components and define the sierra objective function $\mathcal{S}(\vec{x})$ to be the negative probability density of the mixture at input $\vec{x}$ with uniform weights (where $|\Sierra|$ denotes the number of components in the mixture model, i.e., 49):
We can now use this constant mixture model with $49$ components and define the sierra objective function $\mathcal{S}(\vec{x})$ to be the negative probability density of the mixture at input $\vec{x}$ with uniform weights (where $m=49$ denotes the number of components in the mixture model):

\begin{equation}
\mathcal{S}(\vec{x}) = -p(\vec{x}) = -\frac{1}{|\Sierra|}\sum_{j=1}^{|\Sierra|}\Normal(\vec{x} \mid \vec{\mu}_j, \mat{\Sigma}_j)
\mathcal{S}(\vec{x}) = -p(\vec{x}) = -\frac{1}{m}\sum_{j=1}^{m}\Normal(\vec{x} \mid \vec{\mu}_j, \mat{\Sigma}_j)
\end{equation}
Six example objective functions generated using the sierra function are shown in \cref{fig:sierra}, sweeping over the spread rate $\eta$, with and without decay.

Binary file modified moss-mscs-thesis.pdf
Binary file not shown.
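
The separated component equations in this commit can be read as a small construction procedure. The following is a minimal NumPy sketch of that reading, not the thesis implementation: the function names `sierra_components`, `gaussian_pdf`, and `sierra` are hypothetical, the default center $[0, 0]$ and the default `decay=1` are assumptions (the text only gives defaults for $\sigma$, $\eta$, and $\delta$), and the covariance index $i$ is assumed to follow the position of $\vec{p}_i$ in $\mat{P}$.

```python
import itertools
import numpy as np

def sierra_components(mu=(0.0, 0.0), sigma=3.0, eta=6.0, delta=2.0, decay=1):
    """Build component means and covariances following the equations in the diff.

    Assumptions (not stated in the source): mu defaults to the origin and
    decay defaults to 1.
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = sigma * np.eye(2)  # common covariance diag(sigma, sigma), as written in the text
    G = [(+delta, +delta), (+delta, -delta), (-delta, +delta), (-delta, -delta)]
    P = [(0, 0), (1, 1), (2, 0), (3, 1), (0, 2), (1, 3)]
    s_vec = [+sigma, -sigma]

    means, covs = [], []
    # One component per (g, s, p_i) triple: |G| * |s| * |P| = 48 components.
    for g, s, (i, p) in itertools.product(G, s_vec, enumerate(P, start=1)):
        means.append(np.asarray(g) + s * np.asarray(p) + mu)
        covs.append(Sigma * i**decay / eta)

    # Final component: the forced global minimum at mu, covariance Sigma / (sigma * eta).
    means.append(mu)
    covs.append(Sigma / (sigma * eta))
    return np.array(means), np.array(covs)

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian at x."""
    d = x - mean
    k = len(mean)
    norm = np.sqrt((2.0 * np.pi) ** k * np.linalg.det(cov))
    return np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm

def sierra(x, means, covs):
    """Negative mixture density with uniform weights 1/m."""
    x = np.asarray(x, dtype=float)
    m = len(means)
    return -sum(gaussian_pdf(x, mu_j, S_j) for mu_j, S_j in zip(means, covs)) / m

means, covs = sierra_components()
value_at_center = sierra([0.0, 0.0], means, covs)
```

Under these assumptions the component count is $4 \cdot 2 \cdot 6 + 1 = 49$, which matches the $m$ used for the uniform weights in the new equations.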
