Green Tea

Green Tea is a constrained optimizer based on SHAC [1]. It is intended for minimizing noisy, slow-to-evaluate functions, such as:

  • Minimizing the runtime of mixed integer solvers through hyperparameter tuning
  • Tuning game-playing parameters for chess / go engines
  • Neural architecture search

[1] Manoj Kumar, George E. Dahl, Vijay Vasudevan, Mohammad Norouzi, Parallel Architecture and Hyperparameter Search via Successive Halving and Classification, https://arxiv.org/abs/1805.10255

Dependencies

Usage

./green-tea.py < [control file]

Control file syntax

The control file is a YAML file describing which executable to minimize and which parameters to optimize it over:

  • exec (required) - Path to an executable that runs the process to be minimized. It is given a YAML document containing the parameter values on standard input.
  • params (required) - The parameters to optimize over; each parameter has a few properties:
    • type (required) - uniform, normal, or integer.
    • shape (optional) - The shape of the parameter, for example [2, 8].
    • range (required) - The lower (inclusive) and upper (inclusive) bounds of the parameter. For normal parameters one can also set mean and std.
  • constraints (optional) - The global constraints to apply; each constraint is a Python expression that should return a boolean (see the sketch below).
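
The exact mechanism used to check constraints is not spelled out here, but since each constraint is an ordinary Python expression over the parameter names, a minimal sketch (a hypothetical helper, not the actual implementation) could evaluate them with the sampled values bound as local variables:

def satisfies_constraints(params, constraints):
    # Hypothetical helper: `params` is a dict such as {'x': 1.3, 'y': -0.7} and
    # `constraints` is a list of expression strings such as ['x + y > -3'].
    # Every expression must evaluate to True for the sample to be kept.
    return all(
        bool(eval(expr, {'__builtins__': {}}, dict(params)))
        for expr in constraints
    )

# satisfies_constraints({'x': 1.0, 'y': 1.5}, ['x + y > -3', 'x + y < 3'])  # True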

Example

exec: ./examples/rosenbrock.py
params:
  x:
    type: uniform
    range:
      lower: -2.0
      upper: 2.0
  y:
    type: uniform
    range:
      lower: -2.0
      upper: 2.0
constraints:
  - x + y > -3
  - x + y <  3
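
The executable named by exec receives the sampled parameter values as YAML on standard input. How it reports the evaluation value back is not documented in this section; the sketch below of a Rosenbrock script assumes the value to minimize is printed to standard output, which may differ from the actual ./examples/rosenbrock.py.

#!/usr/bin/env python3
# Hypothetical sketch of an `exec` script: read the sampled parameters as YAML
# from standard input and evaluate the Rosenbrock function. Printing the value
# to standard output is an assumption, not confirmed by the text above.
import sys
import yaml

params = yaml.safe_load(sys.stdin)
x, y = params['x'], params['y']

# Rosenbrock function: global minimum of 0 at (1, 1).
print((1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2)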

How it works

The algorithm is a variation of random sampling: after evaluating a batch of random samples we build a model that splits the batch into two sets of good and bad samples. This model is then used to reject any future random samples that would be classified as bad. This is reminiscent of the cutting-plane method for linear programming, but since we are optimizing a black-box function each cut is a classification problem.

Simplified steps (see the paper for full details, and the sketch after this list):

  1. Generate and evaluate a batch of N random samples.
  2. Classify each sample in the batch as Good if it is better than the median evaluation value of the batch; otherwise classify it as Bad.
  3. Train a model to predict whether a sample is Good or Bad based on the data set generated in step 2.
  4. Generate and evaluate a new batch of N samples, where each sample must be classified as Good by all of the classifiers trained so far.
  5. Repeat from step 2.
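
A rough Python sketch of this loop follows. The helpers sample_uniform, evaluate and make_classifier are hypothetical stand-ins, not functions from this repository, and the classifiers are assumed to expose scikit-learn-style fit / predict:

import numpy as np

def shac_like_loop(sample_uniform, evaluate, make_classifier, batch_size=16, rounds=10):
    # sample_uniform() draws one random point from the initial feasible set,
    # evaluate(x) returns the (noisy) objective value, and make_classifier()
    # returns a fresh binary classifier with fit/predict.
    classifiers = []

    def rejection_sample():
        # Keep drawing random samples until one is classified as Good (1)
        # by every classifier trained so far (step 4).
        while True:
            x = sample_uniform()
            if all(clf.predict([x])[0] == 1 for clf in classifiers):
                return x

    best_x, best_value = None, float('inf')
    for _ in range(rounds):
        # Steps 1 and 4: generate and evaluate a batch that passes every cut.
        batch = [rejection_sample() for _ in range(batch_size)]
        values = np.array([evaluate(x) for x in batch])

        # Keep track of the best sample seen so far.
        i = int(values.argmin())
        if values[i] < best_value:
            best_x, best_value = batch[i], float(values[i])

        # Step 2: samples better (lower) than the median are Good (1), the rest Bad (0).
        labels = (values < np.median(values)).astype(int)

        # Step 3: train a new classifier on this batch and add it to the cascade.
        clf = make_classifier()
        clf.fit(batch, labels)
        classifiers.append(clf)

    return best_x, best_value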

Classifier Cut

The quality of the classifier used to reduce the feasible set after each batch is paramount to the quality of the final solution. If the classifier is bad, then this method is, at best, no better than random sampling. Any classifier used should have the following properties:

  • Robust to overfitting on very small data sets; a batch is normally 12 to 20 samples.
  • Fast to evaluate, since we will be generating and rejecting a lot of random samples.
  • Produce conservative classifications, since we want to avoid removing regions that have not been properly explored yet.

These properties limit us to relatively simple algorithms. Currently the following seem feasible (a sketch of the first option follows the list):

  1. Gradient Boosting Decision Trees
  2. Logistic Regression - This may produce too aggressive cuts; consider increasing -p for this.
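
As an illustration of the first option, the sketch below fits a small, shallow gradient boosting classifier on one labelled batch using scikit-learn; the repository's actual model and hyperparameters may differ:

from sklearn.ensemble import GradientBoostingClassifier
import numpy as np

# Hypothetical labelled batch: 16 samples of (x, y) drawn uniformly, labelled
# 1 (Good) / 0 (Bad) by comparison against the batch median, as in step 2 above.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(16, 2))
values = (1.0 - X[:, 0]) ** 2 + 100.0 * (X[:, 1] - X[:, 0] ** 2) ** 2
y = (values < np.median(values)).astype(int)

# A small, shallow ensemble to limit over-fitting on so few samples.
clf = GradientBoostingClassifier(n_estimators=25, max_depth=2)
clf.fit(X, y)

# Candidate samples predicted as Bad (0) would be rejected before evaluation.
candidates = rng.uniform(-2.0, 2.0, size=(1000, 2))
accepted = candidates[clf.predict(candidates) == 1]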

Classification Classes

The percentage of samples that are classified as Good and Bad can be controlled with the -p argument, which defaults to 50:

./green-tea.py -p 75 < [control file]

It may be useful to tune this parameter for certain problems, for example when the initial feasible set is very large, the evaluation function looks parabolic, or it has many peaks and valleys.
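
In terms of the labelling step, -p can be read as the percentile threshold that separates Good from Bad. The snippet below is a hypothetical illustration of that reading (assuming lower evaluation values are better); the implementation's exact semantics may differ:

import numpy as np

values = np.array([3.1, 0.4, 7.8, 1.2, 5.5, 0.9, 2.6, 4.3])

# Default -p 50: samples below the median are labelled Good.
good_50 = values < np.percentile(values, 50)   # 4 of 8 samples

# -p 75: a larger share of the batch is labelled Good, giving a more
# conservative cut of the feasible set.
good_75 = values < np.percentile(values, 75)   # 6 of 8 samples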

TODO

  • Add support for constraints.
  • Add support for additional parameter types, for example integer.
  • Investigate fuzzy classification to help with noisy evaluation functions.
  • Optimize random sample generation by tightening bounds based on classifier fit.
