Improve CI speed and resolve issues of run_quantization_check
#1682
I have analyzed the time cost in `run_quantization_check`.

Test command: `CUDA_VISIBLE_DEVICES= KERAS_BACKEND=jax pytest keras_nlp/src/ -k backbone_basics`

The configurations compared:

- `run_quantization_check=False`
- `run_quantization_check=True` without calling `quantize`
- `run_quantization_check=True` with calling `quantize`
- `run_quantization_check=True` with calling `quantize` + saving

Obviously, the bottleneck is the underlying computation when calling `Model.quantize`.

Here, I propose an improvement: pre-configure a `DTypePolicyMap` and use it to instantiate the model, which avoids the quantization-related computation. This should improve the speed of CI.
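The idea can be sketched with a simplified stand-in (plain Python, illustrative only, not the real `keras.dtype_policies.DTypePolicyMap` API): the per-layer quantization policies are decided up front, and each layer looks its policy up at construction time, so the model never builds full-precision weights that then have to be re-quantized.

```python
import re

# Simplified stand-in for a dtype-policy map (illustrative only, not the
# real Keras API): layer paths are matched against stored patterns, with
# a float32 fallback for layers that stay unquantized.
class SimplePolicyMap:
    def __init__(self, default_policy="float32"):
        self.default_policy = default_policy
        self._policies = {}

    def __setitem__(self, path_pattern, policy):
        self._policies[path_pattern] = policy

    def __getitem__(self, layer_path):
        for pattern, policy in self._policies.items():
            if re.fullmatch(pattern, layer_path):
                return policy
        return self.default_policy

# Pre-configure the map once (hypothetical layer paths)...
policy_map = SimplePolicyMap()
policy_map[r".*ffw_.*"] = "int8_from_float32"

# ...then each layer picks up its policy at construction time, instead of
# being built in float32 and converted afterwards by Model.quantize().
print(policy_map["decoder_layer_0/ffw_gate"])  # int8_from_float32
print(policy_map["token_embedding"])           # float32
```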
Some minor bugs have been spotted and resolved too:

- `Backbone.get_config()` should consider that `self.dtype_policy` may already be a `DTypePolicyMap`.
- `self.embeddings_layer_norm` in `BloomBackbone` has been renamed to avoid a duplicated layer name that breaks `DTypePolicyMap`.
- `OPTBackbone.get_config` missed the `super()` call.
- `XLNetBackbone` fails to pass `run_quantization_check`. (Will try to fix it in the future.)

EDITED: The CI is much faster now. (JAX: ~27 mins -> ~18 mins)
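The `get_config` fixes above follow a standard serialization pattern. A minimal plain-Python sketch (hypothetical classes, not the actual keras_nlp ones) of why skipping `super().get_config()` is a bug:

```python
# Hypothetical classes illustrating the missing-super() bug: forgetting
# super().get_config() silently drops the base config keys (such as the
# dtype policy), so the model cannot be rebuilt faithfully from its
# serialized config.
class Base:
    def get_config(self):
        return {"name": "backbone", "dtype": "float32"}

class BrokenChild(Base):
    def get_config(self):
        # Bug: the base keys ("name", "dtype") are lost.
        return {"units": 64}

class FixedChild(Base):
    def get_config(self):
        config = super().get_config()  # keep the base keys
        config.update({"units": 64})
        return config

print(sorted(BrokenChild().get_config()))  # ['units']
print(sorted(FixedChild().get_config()))   # ['dtype', 'name', 'units']
```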