Upgrade torch and correct dim mismatch #3

Open · wants to merge 1 commit into main

Conversation

@KE7 commented Apr 16, 2023

No description provided.

@anentropic commented Apr 18, 2023

+1 on this; maximum versions of dependencies shouldn't be specified in library code (unless the library is known to be incompatible with a specific version). It just makes life awkward for consumers of the library.
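
As an illustration only (placeholder version numbers, not this repository's actual pins), the difference looks like this in a setup.py fragment:

# Hypothetical install_requires fragment: pin a minimum version, but leave
# the maximum open unless a known incompatibility forces an upper bound.
install_requires = [
    "torch>=1.10",           # open-ended: consumers can use newer torch releases
    # "torch>=1.10,<=1.11",  # an upper pin like this blocks upgrades for consumers
]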

@@ -3,6 +3,7 @@
 # Copyright (C) 2022 Apple Inc. All Rights Reserved.
 #

+import einops

This looks like a third-party dependency that needs adding to requirements.txt and setup.py?
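
If so, a sketch of the corresponding declarations (not the actual diff in this PR):

# Sketch only: mirror the new import in both dependency lists.
# requirements.txt would gain the line:
#     einops
# and setup.py's install_requires would gain:
install_requires = ["einops"]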

@mikowals

The shape mismatch errors should be fixed in the models, not in the tests. The _register_load_state_dict_pre_hook() logic is not correct: it is currently applied only in the base DistilBert model, which skips the pre_classifier and classifier weights added by the other models. Until the model __init__s are fixed, it is correct to leave the test failing.

test_distilbert.py will pass by adding a pre-hook in DistilBertForSequenceClassification's __init__:

self._register_load_state_dict_pre_hook(linear_to_conv2d_map)

I think the same needs to be done for all the models that create pre_classifier and classifier weights. At least, that appears to be the intention of linear_to_conv2d_map, which expects to handle layers with classifier.weights in the name.
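
For illustration, a minimal sketch of the registration described above. It uses a subclass so the existing layer setup does not have to be reproduced here; the ane_transformers.huggingface.distilbert import path is an assumption about where these names live.

# Illustrative sketch, not the repository's actual fix: register the
# Linear -> Conv2d weight-remapping hook on the model that owns the
# pre_classifier/classifier weights, not only on the base DistilBert model.
# The import path below is an assumption.
from ane_transformers.huggingface.distilbert import (
    DistilBertForSequenceClassification,
    linear_to_conv2d_map,
)

class PatchedDistilBertForSequenceClassification(DistilBertForSequenceClassification):
    def __init__(self, config):
        super().__init__(config)
        # Remap incoming nn.Linear-layout checkpoint weights (including the
        # classifier weights) to the nn.Conv2d layout these models expect.
        self._register_load_state_dict_pre_hook(linear_to_conv2d_map)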
