# fix: loader, etc, etc #1174

**Merged** — 9 commits, merged Apr 14, 2021. Changes shown from 7 commits.
23 changes: 17 additions & 6 deletions .github/workflows/dl_cpu.yml
```diff
@@ -109,9 +109,20 @@ jobs:
       - name: check examples
         env:
           REQUIREMENTS: ${{ matrix.requirements }}
-        run: |
-          pytest .
-          pip install -e .
-          cd examples && bash mnist_stages/run_config.sh && cd ..
-          cd examples && bash mnist_stages/run_hydra.sh && cd ..
-          cd examples && bash mnist_stages/run_tune.sh && cd ..
+        uses: nick-invision/retry@v2
+        with:
+          timeout_minutes: 15
+          max_attempts: 4
+          retry_on: timeout
+          command: |
+            pytest .
+            pip install -e .
+            cd examples && bash mnist_stages/run_config.sh && cd ..
+            cd examples && bash mnist_stages/run_hydra.sh && cd ..
+            cd examples && bash mnist_stages/run_tune.sh && cd ..
+        # run: |
+        #   pytest .
+        #   pip install -e .
+        #   cd examples && bash mnist_stages/run_config.sh && cd ..
+        #   cd examples && bash mnist_stages/run_hydra.sh && cd ..
+        #   cd examples && bash mnist_stages/run_tune.sh && cd ..
```
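
The `check examples` step now runs under [nick-invision/retry](https://github.com/nick-invision/retry): with `retry_on: timeout`, a command that exceeds `timeout_minutes` (15) is retried up to `max_attempts` (4) times, while an ordinary non-zero exit still fails the job immediately. The previous `run:` block is kept commented out for reference.
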
21 changes: 14 additions & 7 deletions CHANGELOG.md
```diff
@@ -12,26 +12,32 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Nifti Reader (NiftiReader) ([#1151](https://github.com/catalyst-team/catalyst/pull/1151))
 - CMC score and callback for ReID task (ReidCMCMetric and ReidCMCScoreCallback) ([#1170](https://github.com/catalyst-team/catalyst/pull/1170))
 - Market1501 metric learning datasets (Market1501MLDataset and Market1501QGDataset) ([#1170](https://github.com/catalyst-team/catalyst/pull/1170))
+- extra kwargs support for Engines ([#1156](https://github.com/catalyst-team/catalyst/pull/1156))
+- engines exception for unknown model type ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
+- a few docs to the supported loggers ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
 
 ### Changed
 
--
+- ``TensorboardLogger`` switched from ``global_batch_step`` counter to ``global_sample_step`` one ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
+- ``TensorboardLogger`` logs loader metric ``on_loader_end`` rather than ``on_epoch_end`` ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
+- ``prefix`` renamed to ``metric_key`` for ``MetricAggregationCallback`` ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
+- ``micro``, ``macro`` and ``weighted`` aggregations renamed to ``_micro``, ``_macro`` and ``_weighted`` ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
 
 ### Removed
 
--
+- auto ``torch.sigmoid`` usage for ``metrics.AUCMetric`` and ``metrics.auc`` ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
 
 ### Fixed
 
--
+- hitrate calculation issue ([#1155](https://github.com/catalyst-team/catalyst/issues/1155))
+- ILoader wrapper usage issue with Runner ([#1174](https://github.com/catalyst-team/catalyst/issues/1174))
 
 ## [21.03.2] - 2021-03-29
 
 ### Fixed
 
 - minimal requirements issue ([#1147](https://github.com/catalyst-team/catalyst/issues/1147))
-- nested dicts in `loaders_params`/`samplers_params` overriding fixed ([#1150](https://github.com/catalyst-team/catalyst/pull/1150))
-- fixed hitrate calculation issue ([#1155]) (https://github.com/catalyst-team/catalyst/issues/1155)
+- nested dicts in `loaders_params`/`samplers_params` overriding ([#1150](https://github.com/catalyst-team/catalyst/pull/1150))
 
 ## [21.03.1] - 2021-03-28
@@ -46,7 +52,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - FactorizedLinear to contrib ([1142](https://github.com/catalyst-team/catalyst/pull/1142))
 - Extra init params for ``ConsoleLogger`` ([1142](https://github.com/catalyst-team/catalyst/pull/1142))
 - Tracing, Quantization, Onnx, Pruninng Callbacks ([1127](https://github.com/catalyst-team/catalyst/pull/1127))
-- `_key_value` for schedulers in case of multiple optimizers fixed ([#1146](https://github.com/catalyst-team/catalyst/pull/1146))
+
 
 ### Changed
@@ -63,7 +69,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - BatchLimitLoaderWrapper logic for loaders with shuffle flag ([#1136](https://github.com/catalyst-team/catalyst/issues/1136))
 - config description in the examples ([1142](https://github.com/catalyst-team/catalyst/pull/1142))
 - Config API deprecated parsings logic ([1142](https://github.com/catalyst-team/catalyst/pull/1142)) ([1138](https://github.com/catalyst-team/catalyst/pull/1138))
-- RecSys metrics Top_k calculations ([#1140] (https://github.com/catalyst-team/catalyst/pull/1140))
+- RecSys metrics Top_k calculations ([#1140](https://github.com/catalyst-team/catalyst/pull/1140))
+- `_key_value` for schedulers in case of multiple optimizers ([#1146](https://github.com/catalyst-team/catalyst/pull/1146))
 
 ## [21.03] - 2021-03-13 ([#1095](https://github.com/catalyst-team/catalyst/issues/1095))
```
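
Two of the `#1174` entries above change user-facing APIs; a minimal sketch of how they surface in user code, assuming illustrative names (`loss_cls`, `loss_reg`) and tensor shapes that are not part of this PR. The `TensorboardLogger` changes only affect how logged steps are counted and should need no code changes.

```python
import torch
from catalyst import dl, metrics

# ``prefix`` renamed to ``metric_key`` in MetricAggregationCallback:
aggregation = dl.MetricAggregationCallback(
    metric_key="loss",                 # was ``prefix`` before #1174
    metrics=["loss_cls", "loss_reg"],  # hypothetical per-task losses
    mode="mean",
)

# ``metrics.auc`` (and ``metrics.AUCMetric``) no longer apply
# ``torch.sigmoid`` internally, so callers normalize logits themselves:
logits = torch.randn(8, 2)                     # raw per-class scores
targets = torch.randint(0, 2, (8, 2)).float()  # binary labels
auc_per_class = metrics.auc(torch.sigmoid(logits), targets)
```
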
166 changes: 84 additions & 82 deletions README.md
@@ -181,6 +181,90 @@

### Minimal Examples

<details>
<summary>CustomRunner – PyTorch for-loop decomposition</summary>
<p>

```python
import os
from torch import nn, optim
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.data.transforms import ToTensor
from catalyst.contrib.datasets import MNIST

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32
    ),
}

class CustomRunner(dl.Runner):
    def predict_batch(self, batch):
        # model inference step
        return self.model(batch[0].to(self.device))

    def on_loader_start(self, runner):
        super().on_loader_start(runner)
        self.meters = {
            key: metrics.AdditiveValueMetric(compute_on_call=False)
            for key in ["loss", "accuracy01", "accuracy03"]
        }
        # --- PR review comment (Member) on lines +214 to +219: ---
        # "What about removing on_loader_start/on_loader_end methods?
        #  I think it can make the example a little bit simpler?"

    def handle_batch(self, batch):
        # model train/valid step
        # unpack the batch
        x, y = batch
        # run model forward pass
        logits = self.model(x)
        # compute the loss
        loss = F.cross_entropy(logits, y)
        # compute other metrics of interest
        accuracy01, accuracy03 = metrics.accuracy(logits, y, topk=(1, 3))
        # log metrics
        self.batch_metrics.update(
            {"loss": loss, "accuracy01": accuracy01, "accuracy03": accuracy03}
        )
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.meters[key].update(self.batch_metrics[key].item(), self.batch_size)
        # run model backward pass
        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

    def on_loader_end(self, runner):
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.loader_metrics[key] = self.meters[key].compute()[0]
        super().on_loader_end(runner)

runner = CustomRunner()
# model training
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logs",
    num_epochs=5,
    verbose=True,
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
)
# model inference
for logits in runner.predict_loader(loader=loaders["valid"]):
    assert logits.detach().cpu().numpy().shape[-1] == 10
```
</p>
</details>

<details>
<summary>ML - linear regression</summary>
<p>
@@ -493,88 +577,6 @@ runner.train(
</details>


*(Removed: the “Custom batch-metrics logging” example — a near-verbatim duplicate of the CustomRunner block above, without the final `predict_loader` inference loop.)*


<details>
<summary>CV - MNIST classification</summary>
<p>
2 changes: 1 addition & 1 deletion catalyst/__version__.py
```diff
@@ -1 +1 @@
-__version__ = "21.03.2"
+__version__ = "21.04"
```
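
A quick sanity check of the version bump — a minimal sketch assuming an editable install (`pip install -e .`) of this branch:

```python
import catalyst

assert catalyst.__version__ == "21.04"
```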