Commit ce77a31

chore: Examples pruning (#1)
* add all legacy examples to this repo
1 parent 1990d8d commit ce77a31

365 files changed, +26610 -2 lines changed

.gitignore

+50
@@ -0,0 +1,50 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+
+# All log files
+*.log
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# pyenv
+.python-version
+
+# dotenv
+.env
+
+# virtualenv
+.venv
+venv/
+ENV/
+
+# mypy
+.mypy_cache/
+
+# Determined distributable package
+determined-*.tar.gz
+
+# All Python wheels
+*.whl
+
+# Node modules
+node_modules/
+
+# VSCode
+.vscode/
+
+# JetBrains IDEs (e.g., PyCharm and GoLand)
+.idea/
+
+# gobin directory used for tests
+gobin
+
+# MacOS system files
+*.DS_Store
+.dccache
+
+# Hydra output
+model_hub/mmdetection/hydra/outputs
+
+build/

README.md

+79-2
@@ -1,3 +1,80 @@
-# Determined Examples
+# Determined Legacy Examples
 
-[CycleGAN](cyclegan)
+This repository contains Determined examples that are no longer actively maintained by the Determined team.
+
+## Tutorials
+
+| Example | Dataset | Framework |
+|:-------------------------------------------------------------:|:----------------:|:---------------------:|
+| [fashion\_mnist\_tf\_keras](tutorials/fashion_mnist_tf_keras) | Fashion MNIST | TensorFlow (tf.keras) |
+
+## Computer Vision
+
+| Example | Dataset | Framework |
+|:----------------------------------------------------------------------------:|:----------------------------:|:----------------------------------------:|
+| [cifar10\_pytorch](computer_vision/cifar10_pytorch) | CIFAR-10 | PyTorch |
+| [cifar10\_pytorch\_inference](computer_vision/cifar10_pytorch_inference) | CIFAR-10 | PyTorch |
+| [fasterrcnn\_coco\_pytorch](computer_vision/fasterrcnn_coco_pytorch) | Penn-Fudan Dataset | PyTorch |
+| [mmdetection\_pytorch](computer_vision/mmdetection_pytorch) | COCO | PyTorch |
+| [detr\_coco\_pytorch](computer_vision/detr_coco_pytorch) | COCO | PyTorch |
+| [deformabledetr\_coco\_pytorch](computer_vision/deformabledetr_coco_pytorch) | COCO | PyTorch |
+| [iris\_tf\_keras](computer_vision/iris_tf_keras) | Iris Dataset | TensorFlow (tf.keras) |
+| [unets\_tf\_keras](computer_vision/unets_tf_keras) | Oxford-IIIT Pet Dataset | TensorFlow (tf.keras) |
+| [efficientdet\_pytorch](computer_vision/efficientdet_pytorch) | COCO | PyTorch |
+| [byol\_pytorch](computer_vision/byol_pytorch) | CIFAR-10 / STL-10 / ImageNet | PyTorch |
+| [deepspeed\_cifar10\_cpu\_offloading](deepspeed/cifar10_cpu_offloading) | CIFAR-10 | PyTorch (DeepSpeed) |
+
+## Natural Language Processing (NLP)
+
+| Example | Dataset | Framework |
+|:--------------------------------------------------:|:----------:|:---------:|
+| [albert\_squad\_pytorch](nlp/albert_squad_pytorch) | SQuAD | PyTorch |
+| [bert\_glue\_pytorch](nlp/bert_glue_pytorch) | GLUE | PyTorch |
+| [word\_language\_model](nlp/word_language_model) | WikiText-2 | PyTorch |
+
+## HP Search Benchmarks
+
+| Example | Dataset | Framework |
+|:-------------------------------------------------------------------------------:|:---------------------:|:---------:|
+| [darts\_cifar10\_pytorch](hp_search_benchmarks/darts_cifar10_pytorch) | CIFAR-10 | PyTorch |
+| [darts\_penntreebank\_pytorch](hp_search_benchmarks/darts_penntreebank_pytorch) | Penn Treebank Dataset | PyTorch |
+
+## Neural Architecture Search (NAS)
+
+| Example | Dataset | Framework |
+|:---------------------------------:|:-------:|:---------:|
+| [gaea\_pytorch](nas/gaea_pytorch) | DARTS | PyTorch |
+
+## Meta Learning
+
+| Example | Dataset | Framework |
+|:----------------------------------------------------------------------:|:--------:|:---------:|
+| [protonet\_omniglot\_pytorch](meta_learning/protonet_omniglot_pytorch) | Omniglot | PyTorch |
+
+## Generative Adversarial Networks (GAN)
+
+| Example | Dataset | Framework |
+|:----------------------------------------------|:----------------:|:---------------------:|
+| [dcgan\_tf\_keras](gan/dcgan_tf_keras) | MNIST | TensorFlow (tf.keras) |
+| [gan\_mnist\_pytorch](gan/gan_mnist_pytorch) | MNIST | PyTorch |
+| [deepspeed\_dcgan](deepspeed/deepspeed_dcgan) | MNIST / CIFAR-10 | PyTorch (DeepSpeed) |
+| [pix2pix\_tf\_keras](gan/pix2pix_tf_keras) | pix2pix | TensorFlow (tf.keras) |
+| [cyclegan](gan/cyclegan) | monet2photo | PyTorch |
+
+## Custom Reducers
+
+| Example | Dataset | Framework |
+|:--------------------------------------------------------------------------:|:-------:|:----------:|
+| [custom\_reducers\_mnist\_pytorch](features/custom_reducers_mnist_pytorch) | MNIST | PyTorch |
+
+## HP Search Constraints
+
+| Example | Dataset | Framework |
+|:------------------------------------------------------------------------:|:-------:|:----------:|
+| [hp\_constraints\_mnist\_pytorch](features/hp_constraints_mnist_pytorch) | MNIST | PyTorch |
+
+## Custom Search Method
+
+| Example | Dataset | Framework |
+|:------------------------------------------------------------------------:|:-------:|:----------:|
+| [asha\_search\_method](custom_search_method/asha_search_method) | MNIST | PyTorch |
+58
@@ -0,0 +1,58 @@
+# PyTorch Bootstrap Your Own Latent (BYOL) Example
+
+This example shows how to perform self-supervised image classifier training with BYOL using
+Determined's PyTorch API. This example is based on the [byol-pytorch](https://github.com/lucidrains/byol-pytorch/tree/master/byol_pytorch) package.
+
+Original BYOL paper: https://arxiv.org/abs/2006.0
+
+Code and configuration details are also sourced from the following BYOL implementations:
+- (JAX, paper authors) https://github.com/deepmind/deepmind-research/tree/master/byol
+- (PyTorch) https://github.com/untitled-ai/self_supervised
+
+# Files
+* [backbone.py](backbone.py): Backbone registry.
+* [data.py](data.py): Dataset downloading and metadata registry.
+* [evaluate_result.py](evaluate_result.py): Kicks off an evaluation run, for longer training of classifier heads.
+* [generate_blob_list.py](generate_blob_list.py): Script to generate a blob list from a GCS bucket + prefix. Used to support GCS streaming for the ImageNet dataset.
+* [model_def.py](model_def.py): Core trial and callback definitions. This is the entrypoint for trials.
+* [optim.py](optim.py): Optimizer definitions and utilities.
+* [reducers.py](reducers.py): Custom reducers used for evaluation metrics.
+* [startup-hook.sh](startup-hook.sh): Determined runs this script automatically during startup of every container launched for this experiment. It installs some additional dependencies.
+* [utils.py](utils.py): Simple utility functions and classes.
+
+# Configuration Files
+* [const-cifar10.yaml](const-cifar10.yaml): Train with CIFAR-10 on a single GPU with constant hyperparameter values.
+* [distributed-stl10.yaml](distributed-stl10.yaml): Train with STL-10 using 8-GPU distributed training with constant hyperparameter values.
+* [distributed-imagenet.yaml](distributed-imagenet.yaml): Train with ImageNet using 64-GPU distributed training with constant hyperparameter values.
+
+# Data
+This repo uses three datasets:
+- CIFAR-10 (32x32, 10 classes), automatically downloaded via torchvision.
+- STL-10 (96x96, 10 classes), automatically downloaded via torchvision.
+- ImageNet-1k (1000 classes), which must be stored in a GCS bucket along with a blob index. Information on downloading ImageNet-1k is available at the [ImageNet website](https://image-net.org/download.php). See `distributed-imagenet.yaml` for an example bucket configuration, and `generate_blob_list.py` for a script to generate the blob list.
+
+# To Run
+If you have not yet installed Determined, installation instructions can be found under `docs/install-admin.html` or at https://docs.determined.ai/latest/index.html
+
+Run the following command to kick off self-supervised training: `det -m <master host:port> experiment create -f config/const-cifar10.yaml .`
+
+The other configurations can be run by specifying the appropriate configuration file in place of `const-cifar10.yaml`.
+
+
+To run classifier training and validation on a completed self-supervised training run:
+
+1. Find the experiment ID of your self-supervised training.
+2. Run `python evaluate_result.py --experiment-id=<id> --classifier-train-epochs=<number>`
+
+This is necessary for ImageNet, where `hyperparameters.validate_with_classifier` is set to `false` during self-supervised training because of the time it takes to train the classifier. The other configs set `hyperparameters.validate_with_classifier` to `true` to collect `test_accuracy` during self-supervised training.
+
+
+## Results
+
+For `const-cifar10.yaml` and `distributed-stl10.yaml`, results were taken from the best `test_accuracy` achieved over the self-supervised training duration. For `distributed-imagenet.yaml`, the result was taken from running `evaluate_result.py` for 80 classifier training epochs.
+
+| Config file | Test Accuracy (%) |
+| ----------- | ------------- |
+| const-cifar10.yaml | 74.91 |
+| distributed-stl10.yaml | 91.10 |
+| distributed-imagenet.yaml | 71.37 |
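
The two commands in the "To Run" section above can be assembled programmatically, which helps when scripting several runs. The following is a minimal sketch, not part of the commit: the helper names are hypothetical, and the only flags used are the ones quoted in the README text. The master address and experiment ID in the usage lines are placeholders.

```python
import shlex
from typing import List


def create_experiment_cmd(master: str, config_path: str, context_dir: str = ".") -> List[str]:
    """Build the `det ... experiment create` invocation quoted in the README."""
    return ["det", "-m", master, "experiment", "create", "-f", config_path, context_dir]


def evaluate_result_cmd(experiment_id: int, classifier_train_epochs: int) -> List[str]:
    """Build the `evaluate_result.py` invocation quoted in the README."""
    return [
        "python",
        "evaluate_result.py",
        f"--experiment-id={experiment_id}",
        f"--classifier-train-epochs={classifier_train_epochs}",
    ]


if __name__ == "__main__":
    # Placeholder values: "localhost:8080" and 42 are hypothetical.
    print(shlex.join(create_experiment_cmd("localhost:8080", "config/const-cifar10.yaml")))
    print(shlex.join(evaluate_result_cmd(experiment_id=42, classifier_train_epochs=80)))
```

Returning argument lists rather than a single string keeps the commands safe to pass to `subprocess.run` without shell quoting issues.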
+27
@@ -0,0 +1,27 @@
+from dataclasses import dataclass
+from typing import Callable
+
+import torch.nn as nn
+import torchvision.models as models
+
+
+@dataclass
+class BackboneMetadata:
+    feature_size: int
+    build_fn: Callable[[], nn.Module]
+
+
+BACKBONE_METADATA_BY_NAME = {
+    "resnet18": BackboneMetadata(
+        feature_size=512, build_fn=lambda: models.resnet18(pretrained=True)
+    ),
+    "resnet34": BackboneMetadata(
+        feature_size=512, build_fn=lambda: models.resnet34(pretrained=True)
+    ),
+    "resnet50": BackboneMetadata(
+        feature_size=2048, build_fn=lambda: models.resnet50(pretrained=True)
+    ),
+    "resnet101": BackboneMetadata(
+        feature_size=2048, build_fn=lambda: models.resnet101(pretrained=True)
+    ),
+}
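
A registry like the one above is typically consumed by looking up a name from the experiment config (for example `backbone_name: resnet18`) and using `feature_size` to size the classifier head. The sketch below illustrates that lookup pattern under stated assumptions: `get_backbone_metadata` and `REGISTRY` are hypothetical names that do not appear in the commit, and `build_fn` returns a string label instead of a real `nn.Module` so that the example runs without torch or torchvision installed.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BackboneMetadata:
    feature_size: int
    # Stand-in for Callable[[], nn.Module]; avoids a torch dependency in this sketch.
    build_fn: Callable[[], object]


# Hypothetical stand-in registry mirroring two entries of the real one.
REGISTRY: Dict[str, BackboneMetadata] = {
    "resnet18": BackboneMetadata(feature_size=512, build_fn=lambda: "resnet18-model"),
    "resnet50": BackboneMetadata(feature_size=2048, build_fn=lambda: "resnet50-model"),
}


def get_backbone_metadata(
    name: str, registry: Dict[str, BackboneMetadata] = REGISTRY
) -> BackboneMetadata:
    """Look up a backbone by name, failing with a message that lists the valid names."""
    try:
        return registry[name]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(f"Unknown backbone {name!r}; expected one of: {known}") from None


metadata = get_backbone_metadata("resnet18")
backbone = metadata.build_fn()  # built lazily, only when requested
head_input_size = metadata.feature_size  # width of the features fed to a classifier head
```

Storing a zero-argument `build_fn` rather than a constructed model means that importing the registry never instantiates or downloads a network. Only the backbone actually selected is built.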
@@ -0,0 +1,81 @@
+name: cifar10_byol_const
+entrypoint: model_def:BYOLTrial
+records_per_epoch: 45000
+resources:
+  slots_per_trial: 1
+  shm_size: 17179869184
+min_validation_period:
+  epochs: 2
+
+data:
+  dataset_name: cifar10
+  download_dir: /data
+  num_workers: 8
+  validation_subset_size: 5000
+  eval_transform:
+    resize_short_edge: 32
+    center_crop_size: 32
+  train_transform1:
+    random_crop_size: 32
+    random_crop_min_scale: 0.2
+    random_hflip_prob: 0.5
+    color_jitter_prob: 0.8
+    color_jitter_brightness: 0.4
+    color_jitter_contrast: 0.4
+    color_jitter_saturation: 0.2
+    color_jitter_hue: 0.1
+    grayscale_prob: 0.2
+    gaussian_blur_prob: 1.0
+    gaussian_blur_kernel_size: 3
+    gaussian_blur_min_std: 0.1
+    gaussian_blur_max_std: 2.0
+    solarization_prob: 0.0
+  train_transform2:
+    random_crop_size: 32
+    random_crop_min_scale: 0.2
+    random_hflip_prob: 0.5
+    color_jitter_prob: 0.8
+    color_jitter_brightness: 0.4
+    color_jitter_contrast: 0.4
+    color_jitter_saturation: 0.2
+    color_jitter_hue: 0.1
+    grayscale_prob: 0.2
+    gaussian_blur_prob: 0.1
+    gaussian_blur_kernel_size: 3
+    gaussian_blur_min_std: 0.1
+    gaussian_blur_max_std: 2.0
+    solarization_prob: 0.2
+
+hyperparameters:
+  training_mode: SELF_SUPERVISED
+  validate_with_classifier: true
+  backbone_name: resnet18
+  global_batch_size: 256
+  classifier:
+    learning_rates: [0.1, 0.05, 0.025, 0.01, 0.005]
+    logit_clipping:
+      enabled: true
+      alpha: 20
+    logit_regularization_beta: 1e-2
+    momentum: 0.9
+    train_epochs: 4
+  self_supervised:
+    lars_eta: 0.001
+    momentum: 0.9
+    moving_average_decay_base: 0.996
+    weight_decay: 1.5e-6
+    learning_rate:
+      base: 0.2
+      base_batch_size: 256
+      warmup_epochs: 10
+
+searcher:
+  name: single
+  metric: test_accuracy
+  smaller_is_better: false
+  max_length:
+    epochs: 100
+
+bind_mounts:
+  - host_path: /tmp
+    container_path: /data
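
The `self_supervised` block above names a base learning rate, a base batch size, a warmup length, and a base moving-average decay. The commit excerpt does not show how `model_def.py` or `optim.py` consume these values, so the sketch below is an assumption based on the schedules described in the BYOL paper: linear scaling of the learning rate with batch size, linear warmup followed by cosine decay, and a target-network decay that rises from its base value toward 1 along a cosine curve. All function names are hypothetical.

```python
import math


def scaled_learning_rate(base: float, base_batch_size: int, global_batch_size: int) -> float:
    """Linear scaling rule (assumed): lr grows in proportion to the global batch size."""
    return base * global_batch_size / base_batch_size


def warmup_cosine_lr(step: int, total_steps: int, warmup_steps: int, peak_lr: float) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def ema_decay(step: int, total_steps: int, base_decay: float) -> float:
    """BYOL paper schedule: tau = 1 - (1 - tau_base) * (cos(pi * k / K) + 1) / 2."""
    return 1.0 - (1.0 - base_decay) * (math.cos(math.pi * step / total_steps) + 1.0) / 2.0


# Values taken from the config above.
steps_per_epoch = 45000 // 256           # records_per_epoch // global_batch_size
total_steps = 100 * steps_per_epoch      # searcher.max_length.epochs
warmup_steps = 10 * steps_per_epoch      # learning_rate.warmup_epochs
peak_lr = scaled_learning_rate(base=0.2, base_batch_size=256, global_batch_size=256)

lr_mid_warmup = warmup_cosine_lr(warmup_steps // 2, total_steps, warmup_steps, peak_lr)
tau_start = ema_decay(0, total_steps, base_decay=0.996)
tau_end = ema_decay(total_steps, total_steps, base_decay=0.996)
```

With this config the global batch size equals the base batch size, so the peak learning rate stays at the base value of 0.2. The scaling only matters for the distributed configs, where the global batch size is presumably larger.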
