Baal in Production Notebook | Classification | NLP | Hugging Face #245

Merged
merged 4 commits into from
Jan 28, 2023
3 changes: 2 additions & 1 deletion docs/tutorials/index.md
@@ -6,7 +6,8 @@ latter on how we integrate with other common frameworks such as Label Studio, Hu
## :material-file-tree: How to

* [Run an active learning experiment](notebooks/active_learning_process.ipynb)
* [Active learning in production](notebooks/baal_prod_cls.ipynb)
* [Active learning in production (Image Classification)](notebooks/production/baal_prod_cls.ipynb)
* [Active learning in production (Text Classification)](notebooks/production/baal_prod_cls_nlp_hf.ipynb)
* [Deep Ensembles](../notebooks/deep_ensemble.ipynb)

## :material-file-tree: Compatibility
4 changes: 3 additions & 1 deletion mkdocs.yml
@@ -91,8 +91,10 @@ nav:
- Compatibility:
- HuggingFace: notebooks/compatibility/nlp_classification.ipynb
- Scikit-learn: notebooks/compatibility/sklearn_tutorial.ipynb
- Production use cases:
- Computer vision: notebooks/production/baal_prod_cls.ipynb
- Text classification: notebooks/production/baal_prod_cls_nlp_hf.ipynb
- Active learning for research: notebooks/active_learning_process.ipynb
- Active learning for production: notebooks/baal_prod_cls.ipynb
- Deep Ensembles for active learning: notebooks/deep_ensemble.ipynb
- Research:
- research/index.md
@@ -3,13 +3,10 @@
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"pycharm": {
"name": "#%% md\n"
}
"collapsed": true
},
"source": [
"# Use Baal in production (Classification)\n",
"# Use Baal in production (Image classification)\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/baal-org/baal/blob/master/notebooks/baal_prod_cls.ipynb)\n",
"\n",
@@ -35,8 +32,7 @@
"execution_count": 1,
"metadata": {
"pycharm": {
"is_executing": false,
"name": "#%%\n"
"is_executing": false
}
},
"outputs": [
@@ -60,11 +56,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"Introducing `baal.active.FileDataset` and `baal.active.ActiveLearningDataset`\n",
"\n",
@@ -84,8 +76,7 @@
"execution_count": 2,
"metadata": {
"pycharm": {
"is_executing": false,
"name": "#%%\n"
"is_executing": false
}
},
"outputs": [],
@@ -114,11 +105,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"\n",
"We now have two unlabeled datasets : train and validation. We encapsulate the training dataset in a \n",
@@ -139,8 +126,7 @@
"execution_count": 3,
"metadata": {
"pycharm": {
"is_executing": false,
"name": "#%%\n"
"is_executing": false
}
},
"outputs": [],
@@ -167,11 +153,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"### Heuristics\n",
"\n",
@@ -185,8 +167,7 @@
"execution_count": 4,
"metadata": {
"pycharm": {
"is_executing": false,
"name": "#%%\n"
"is_executing": false
}
},
"outputs": [],
@@ -197,11 +178,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"### Oracle\n",
"When the AL process requires a new item to be labeled, we need to provide an Oracle. In your case, the Oracle will\n",
@@ -213,8 +190,7 @@
"execution_count": 5,
"metadata": {
"pycharm": {
"is_executing": false,
"name": "#%%\n"
"is_executing": false
}
},
"outputs": [],
@@ -227,11 +203,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"### Labeling process\n",
"The labeling will go like this:\n",
@@ -248,8 +220,7 @@
"execution_count": 6,
"metadata": {
"pycharm": {
"is_executing": false,
"name": "#%%\n"
"is_executing": false
}
},
"outputs": [
@@ -282,8 +253,7 @@
"execution_count": 7,
"metadata": {
"pycharm": {
"is_executing": false,
"name": "#%%\n"
"is_executing": false
}
},
"outputs": [
@@ -326,11 +296,7 @@
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -355,11 +321,7 @@
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -383,7 +345,6 @@
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n",
"is_executing": true
}
},
@@ -415,11 +376,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"And we're done!\n",
"Be sure to save the dataset and the model.\n"
@@ -428,11 +385,7 @@
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"metadata": {},
"outputs": [],
"source": [
"torch.save({\n",
@@ -444,11 +397,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"metadata": {},
"source": [
"## Support\n",
"Submit an issue or reach us on our Slack!"
@@ -476,4 +425,4 @@
},
"nbformat": 4,
"nbformat_minor": 1
}
}