Commit: Add some links to the post
Co-authored-by: Brian Chen <[email protected]>
theabhirath and ToucheSir authored Jun 12, 2023
1 parent 140e104 commit 32a2f8c
Showing 1 changed file with 3 additions and 3 deletions: blogposts/2023-06-07-metalhead-v0.8.md
Metalhead v0.8.0 ships with more exported models than any other previous Metalhead release:
- [EfficientNetv2 and MNASNet](https://github.com/FluxML/Metalhead.jl/pull/198)
- [The ViT model introduced in v0.7 is now more robust](https://github.com/FluxML/Metalhead.jl/pull/230) and comes with an option for [loading pre-trained weights on ImageNet](https://github.com/FluxML/Metalhead.jl/pull/235)

In Metalhead v0.7, support was added for pre-trained VGG and ResNet models. v0.8.0 takes this further by adding support for Wide ResNets (an architecture Metalhead did not previously support), certain configurations of ResNeXt, and SqueezeNet. This makes it easier for users to get started with transfer learning tasks. We also now export the [`backbone` and `classifier`](https://fluxml.ai/Metalhead.jl/v0.8/api/utilities) functions, which return the feature extractor and classifier head portions of the model, respectively. This should make it easier for users to hit the ground running.
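
As a minimal sketch of how these exports support transfer learning (the `ResNet(18)` constructor and the 512-wide penultimate layer are our assumptions for this example — check the linked utilities docs for the exact API):

```julia
using Flux, Metalhead

# Build an untrained ResNet-18; pass `pretrain = true` for ImageNet weights.
model = ResNet(18)

# `backbone` returns the feature extractor, `classifier` the head.
feat = backbone(model)
head = classifier(model)

# Transfer learning: keep the backbone, attach a fresh 10-class head.
# The 512-feature width is specific to ResNet-18's last stage (an assumption
# here — inspect `feat` to confirm for other configurations).
finetune = Chain(feat, AdaptiveMeanPool((1, 1)), Flux.flatten, Dense(512 => 10))

x = rand(Float32, 224, 224, 3, 1)   # one 224×224 RGB image in WHCN order
y = finetune(x)                     # 10 logits for the single image
```

From here, `finetune` can be trained like any other Flux model, optionally freezing the backbone's parameters.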

Metalhead is always looking for contributors to help add pre-trained weights for the models. To learn how you can help with this effort, please check out the contributor’s guide in the documentation. We will be happy to help you work through any issues you encounter!

## A `Layers` module to make it easier to build custom models

Previously, Metalhead v0.7 introduced the `Layers` module but it was not publicly documented as the internals were still being polished. With Metalhead v0.8.0, it has reached a point of stability. The [`Layers`](https://fluxml.ai/Metalhead.jl/v0.8/api/layers) module exposes functionality that allows users to build custom neural network models very easily. Some notable improvements are:

1. Stochastic Depth and DropBlock layers were added and are now fully featured ([#174](https://github.com/FluxML/Metalhead.jl/pull/174), [#200](https://github.com/FluxML/Metalhead.jl/pull/200)). In particular, Stochastic Depth now supports batch mode. Note that v0.7 used the term `DropPath`; this was renamed to `StochasticDepth` to remove any ambiguity over what the layer is used for.
2. The `create_classifier` function now distinguishes between no Dropout and Dropout of rate 0, and it also supports an expanded classifier with an additional `Dense` layer in between ([#198](https://github.com/FluxML/Metalhead.jl/pull/198)).
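
For example, the classifier from item 2 might be used like this (a sketch based on our reading of the v0.8 `Layers` API — keyword names such as `dropout_rate` should be verified against the documentation):

```julia
using Flux, Metalhead
using Metalhead.Layers: create_classifier

# A head mapping 512 backbone features to 10 classes, with dropout.
# Note the new distinction: `dropout_rate = nothing` (the default) adds no
# Dropout layer at all, while `dropout_rate = 0.0` adds a Dropout of rate 0.
head = create_classifier(512, 10; dropout_rate = 0.2)

x = rand(Float32, 7, 7, 512, 1)   # a typical backbone output for 224×224 input
y = head(x)
```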

One of the major changes Metalhead v0.8 introduces is the concept of *builders*. These are functions that return closures over the **stage and block indices**. Together, these two numbers uniquely identify a block: the stage index says which stage of the model is being referred to, and the block index locates the specific block inside that stage. Builders make it possible to write functions that construct blocks purely from the stage and block indices, abstracting away the construction of the layers as well as details like calculating strides, output feature maps, or probabilities for stochastic depth/DropBlock. These builders have been used to rewrite the underlying implementations of the three largest and most commonly used CNN families Metalhead supports: ResNets, MobileNets and EfficientNets.

The biggest beneficiary of this has been the MobileNet and EfficientNet model families, which are constructed from a [single function](https://github.com/FluxML/Metalhead.jl/pull/200) using nothing but a uniform configuration dictionary format. Yep, you read that right. That’s six different models expressed using a single underlying function which is just about sixty lines of code. As a user, all you need to do is change the configuration dictionary and watch the magic happen. This means that conceptualising new models has become as simple as visualising the model structure as a dictionary of configurations (in short, just the novelties), and then watching the function take your configurations and produce the model.
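
The builder idea itself can be illustrated with a toy, self-contained sketch (all names below are invented for illustration; Metalhead's real builders differ in detail):

```julia
# `make_builder` closes over a per-stage configuration and returns a function
# of the (stage, block) indices that "constructs" one block. Here a block is
# just a NamedTuple recording the derived stride and channel count.
function make_builder(config)
    return function build_block(stage, block)
        channels = config[stage].channels
        # Downsample in the first block of every stage after the first.
        stride = (stage > 1 && block == 1) ? 2 : 1
        return (; stage, block, channels, stride)
    end
end

# Changing only this configuration changes the whole "model".
config = [(channels = 64, repeats = 2), (channels = 128, repeats = 2)]
builder = make_builder(config)
blocks = [builder(s, b) for s in eachindex(config) for b in 1:config[s].repeats]
```

In real Metalhead builders, the returned closure constructs actual Flux layers, but the division of labour is the same: the configuration describes the novelties, and the builder derives everything else from the two indices.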

There were improvements for the ResNet family of models as well: the stride and feature map calculations in the ResNet block builders are now callbacks, so users can easily supply custom stride/feature map calculations. To learn more about builders and how they work for the different model families, check out the documentation! We are still working on documenting all of the features, so if you find something unexplained, open an issue and we will be happy to assist you and to improve the documentation!

