Proposal: Pluggable ModelTrainer train function #3084

Merged
merged 15 commits into from
Mar 24, 2023


plonerma
Collaborator

@plonerma plonerma commented Feb 6, 2023

This PR aims to make the ModelTrainer easier to adapt to a wider range of training settings. I propose the following changes to move in this direction. The proposed architecture as well as the concrete implementation are very much up for discussion.

Plugin System

A plugin system has been introduced to replace the long train function and disentangle its components.
Plugins can hook into the events produced by the training loops to alter the behavior of the training procedure, produce performance scores, track results, etc.
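As a rough illustration of the idea (not the implementation in this PR), an event-dispatching trainer with one attached plugin might look like the sketch below. All names here (`MiniTrainer`, `register_hook`, `dispatch`, `LossCollector`, and the `before_epoch`/`after_epoch` events) are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class MiniTrainer:
    """Hypothetical stand-in for a trainer that dispatches events to hooks."""

    def __init__(self) -> None:
        # Maps an event name to the callbacks registered for it.
        self._hooks: DefaultDict[str, List[Callable[..., Any]]] = defaultdict(list)

    def register_hook(self, event: str, callback: Callable[..., Any]) -> None:
        self._hooks[event].append(callback)

    def dispatch(self, event: str, **kwargs: Any) -> None:
        # Call every callback registered for this event, in registration order.
        for callback in self._hooks[event]:
            callback(**kwargs)

    def train(self, epochs: int) -> None:
        for epoch in range(1, epochs + 1):
            self.dispatch("before_epoch", epoch=epoch)
            loss = 1.0 / epoch  # placeholder for a real training step
            self.dispatch("after_epoch", epoch=epoch, loss=loss)


class LossCollector:
    """Hypothetical plugin that records the loss reported after each epoch."""

    def __init__(self) -> None:
        self.losses: List[float] = []

    def attach_to(self, trainer: MiniTrainer) -> None:
        trainer.register_hook("after_epoch", self.on_after_epoch)

    def on_after_epoch(self, epoch: int, loss: float) -> None:
        self.losses.append(loss)


trainer = MiniTrainer()
collector = LossCollector()
collector.attach_to(trainer)
trainer.train(epochs=3)
```

The training loop only announces what happened; everything that reacts to it lives in the plugin, so behavior can be added or removed without editing `train`.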

Logging

Logging may refer to regular evaluations and irregular events. Currently there is no dedicated mechanism for logging regular evaluations.

With this PR, the results of model evaluations are published via the metric_recorded event in the ModelTrainer. A dedicated plugin logs regular performance reports to the flair logger.
Loss files and other artifacts produced by the training loop will also be handled in dedicated plugins.

As before, irregular events are logged via the logging module to the flair logger wherever they occur (directly in the trainer or in separate plugins). A plugin takes care of opening and closing a file handler.
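A plugin that manages a log file could be sketched roughly as follows, using only the standard `logging` module. The class name `LogFilePlugin`, its `on_training_start`/`on_training_end` methods, and the logger name `demo_logger` are assumptions made for this example, not names taken from this PR.

```python
import logging
import tempfile
from pathlib import Path
from typing import Optional


class LogFilePlugin:
    """Hypothetical plugin that attaches a file handler for one training run."""

    def __init__(self, logger_name: str, log_path: Path) -> None:
        self.logger = logging.getLogger(logger_name)
        self.log_path = log_path
        self.handler: Optional[logging.FileHandler] = None

    def on_training_start(self) -> None:
        # Open the log file and start routing the logger's records to it.
        self.handler = logging.FileHandler(self.log_path, mode="w", encoding="utf-8")
        self.handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        self.logger.addHandler(self.handler)

    def on_training_end(self) -> None:
        # Detach and close the handler so later messages are not written to the file.
        if self.handler is not None:
            self.logger.removeHandler(self.handler)
            self.handler.close()
            self.handler = None


log_dir = Path(tempfile.mkdtemp())
demo_logger = logging.getLogger("demo_logger")  # stand-in for the flair logger
demo_logger.setLevel(logging.INFO)

plugin = LogFilePlugin("demo_logger", log_dir / "training.log")
plugin.on_training_start()
demo_logger.info("learning rate reduced")
plugin.on_training_end()
demo_logger.info("logged after the handler was closed")
```

Code that emits irregular events keeps calling the logger as usual; only the plugin knows that a file is involved.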

Metrics

All metrics are published as metric_recorded events in the trainer. By applying hooks to this event, these metrics can be recorded to files, TensorBoard, and/or the shell. This also makes it possible to replace TensorBoard with a different logging framework.

Performance values are published with a metric name, the value, the type of value, the (wall)time of evaluation as well as the current step.
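The fields listed above could be carried in a small record type and fanned out to any number of subscribers. The following is a minimal sketch under that assumption; `MetricEvent`, `MetricBus`, the field names, and the `"scalar"` type label are hypothetical and not taken from this PR.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MetricEvent:
    """Hypothetical record for one published metric value."""

    name: str          # metric name, e.g. "dev/loss"
    value: float       # the recorded value
    value_type: str    # kind of value, e.g. "scalar"
    global_step: int   # training step at which the value was recorded
    walltime: float = field(default_factory=time.time)  # time of evaluation


class MetricBus:
    """Hypothetical publisher that forwards each metric to all subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[MetricEvent], None]] = []

    def subscribe(self, callback: Callable[[MetricEvent], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: MetricEvent) -> None:
        for callback in self._subscribers:
            callback(event)


history: List[MetricEvent] = []
bus = MetricBus()
bus.subscribe(history.append)  # e.g. keep records in memory or write them to a file
bus.subscribe(lambda e: print(f"step {e.global_step}: {e.name} = {e.value:.4f}"))

bus.publish(MetricEvent("dev/loss", 0.4213, "scalar", global_step=100))
```

Swapping the logging backend then means subscribing a different callback; the code that publishes metrics does not change.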

@plonerma plonerma marked this pull request as draft February 6, 2023 16:14
@plonerma plonerma changed the title Pluggable ModelTrainer train function Proposal: Pluggable ModelTrainer train function Feb 6, 2023
@alanakbik alanakbik changed the base branch from master to pluggable_trainer March 24, 2023 11:43
@alanakbik alanakbik marked this pull request as ready for review March 24, 2023 11:44
@alanakbik alanakbik merged commit a92619b into flairNLP:pluggable_trainer Mar 24, 2023