Post-training Activation Pruning algorithm #2683

Merged
76 commits
73f5e66
first commit
yujiepan-work Jul 16, 2024
a2b6f94
can work except ignored_scope
yujiepan-work Jul 16, 2024
42ef72c
refine code structure and add comments
yujiepan-work Jul 16, 2024
8f5bd9f
disable grad in do_inference
yujiepan-work Jul 16, 2024
6773497
add copyright
yujiepan-work Jul 16, 2024
9631b09
minor style change
yujiepan-work Jul 16, 2024
43de3bc
style fix
yujiepan-work Jul 16, 2024
2d7c1fc
fix "find activation port id"
yujiepan-work Jul 16, 2024
0dcd0f2
minor doc change
yujiepan-work Jul 16, 2024
7fa081d
use numpy's quantile
yujiepan-work Jul 16, 2024
668a434
change method order
yujiepan-work Jul 16, 2024
3712610
algo test
yujiepan-work Jul 16, 2024
b1b66d6
fix test on newer cpu as it might use bf16 for ov inference
yujiepan-work Jul 16, 2024
66f59c0
add tests
yujiepan-work Jul 16, 2024
e1d5f79
upload dots
yujiepan-work Jul 16, 2024
0930870
conformance test
yujiepan-work Jul 16, 2024
fad6961
type hint fix for older python
yujiepan-work Jul 16, 2024
6c8c1e1
typo
yujiepan-work Jul 16, 2024
9158b58
leave abstractmethod empty
yujiepan-work Jul 16, 2024
b6353cd
move ref dot files to "experimental subfolder"
yujiepan-work Jul 16, 2024
d1a99bb
fix type hint for older python
yujiepan-work Jul 16, 2024
0702b4a
tests for "no layers matched"
yujiepan-work Jul 16, 2024
c52c022
use PTTransformationLayout instead of TransformationLayout
yujiepan-work Jul 16, 2024
5cff914
initialize sparsifier as frozen, with running_threshold is -inf
yujiepan-work Jul 16, 2024
796e5bc
clean unnecessary codes
yujiepan-work Jul 16, 2024
1f4fe99
use get_nodes_by_metatypes instead of topological sort
yujiepan-work Jul 16, 2024
5838d9d
check sparsifier does not change model output before calibration
yujiepan-work Jul 16, 2024
0308586
runnable conformance test for lm
yujiepan-work Jul 16, 2024
3519661
add conformance test for deit-small
yujiepan-work Jul 16, 2024
b679130
add tests
yujiepan-work Jul 16, 2024
91bb3f7
support subset_size
yujiepan-work Jul 16, 2024
90bd849
fix graph match for weightsdecompressor and eager mode attention
yujiepan-work Jul 16, 2024
3113e66
temporarily fix the timm & graph match issues
yujiepan-work Jul 16, 2024
e307ac8
update metric
yujiepan-work Jul 16, 2024
ff00a1e
re-enable graph comparison for dummy llama
yujiepan-work Jul 16, 2024
982024d
re-order staticmethods
yujiepan-work Jul 16, 2024
8f500f7
refactor pipelines with added cuda_torch backend tests
yujiepan-work Jul 16, 2024
109d305
bugfix
yujiepan-work Jul 16, 2024
f546826
add tests on sparsifier pattern count
yujiepan-work Jul 16, 2024
85b2683
update metric
yujiepan-work Jul 16, 2024
e2fee6a
adjust atol
yujiepan-work Jul 16, 2024
c4993ac
ref metric fix
yujiepan-work Jul 16, 2024
5101282
attempt to solve the "import file mismatch" error in testing
yujiepan-work Jul 16, 2024
3806ef4
switch to "TargetScope" in interface
yujiepan-work Jul 16, 2024
400d1da
rename ref dot files
yujiepan-work Jul 16, 2024
fe1cedf
misc order change
yujiepan-work Jul 16, 2024
c3835ef
add documentation
yujiepan-work Jul 16, 2024
1c3c10e
update metric
yujiepan-work Jul 16, 2024
2201d88
update documentation
yujiepan-work Jul 16, 2024
b6b2a35
minor variable name fix
yujiepan-work Jul 16, 2024
acf6510
simplify _get_target_sparsity_by_node
yujiepan-work Jul 16, 2024
9dbeea7
use nncf's quantile impl
yujiepan-work Jul 16, 2024
591e4db
refine TargetScope docstring
yujiepan-work Jul 16, 2024
9439b82
use higher precision to calculate running_threshold
yujiepan-work Jul 16, 2024
a90dc18
delete `apply_sparsifiers` as it is not needed
yujiepan-work Jul 16, 2024
53f6b26
make `freeze` a property
yujiepan-work Jul 16, 2024
8a5fc4a
enhance reproducibility
yujiepan-work Jul 16, 2024
6e35851
use fp16 on cuda
yujiepan-work Jul 16, 2024
9171f0c
fix int8+sparse export
yujiepan-work Jul 16, 2024
b652785
update metric
yujiepan-work Jul 16, 2024
c67753f
update ref metric
yujiepan-work Jul 16, 2024
303a66a
Initial documentation of sparsify_activations algorithm
yujiepan-work Jul 16, 2024
e90f39a
Revise ActivationSparsity.md
vuiseng9 Jul 16, 2024
4b8eb09
update readme
yujiepan-work Jul 16, 2024
238d6eb
style fix
yujiepan-work Jul 16, 2024
cb39cfd
update main readme
yujiepan-work Jul 16, 2024
893947f
fix equation
yujiepan-work Jul 16, 2024
93844a9
update readme
yujiepan-work Jul 16, 2024
35e279c
documentation update
yujiepan-work Jul 16, 2024
6491ba8
update arxiv link
yujiepan-work Jul 16, 2024
aa67251
mention dejavu for acceleration example
yujiepan-work Jul 16, 2024
b1ea929
Revise ActivationSparsity.md
vuiseng9 Jul 16, 2024
d8a0ef6
Revise ActivationSparsity.md
vuiseng9 Jul 16, 2024
98bd3ba
style fix
yujiepan-work Jul 16, 2024
0613dce
fix compress_weights name
yujiepan-work Jul 16, 2024
5f04275
minor fix for "L"inear and parentheses for citation
yujiepan-work Jul 16, 2024
update ref metric
yujiepan-work committed Jul 16, 2024
commit c67753f504745403299c922ca2cbe1d47b0abc3f
@@ -4,26 +4,26 @@ tinyllama_backend_FP32:
   num_int8: 0
   num_sparse_activations: 0
 tinyllama_ffn_sparse20_backend_CUDA_TORCH:
-  metric_value: 0.7858
-  atol: 0.02
+  metric_value: 0.7818
+  atol: 0.025
   num_int4: 0
   num_int8: 0
   num_sparse_activations: 44
 tinyllama_ffn_sparse20_backend_TORCH:
-  metric_value: 0.7882
-  atol: 0.02
+  metric_value: 0.7879
+  atol: 0.025
   num_int4: 0
   num_int8: 0
   num_sparse_activations: 44
 tinyllama_int8_asym_data_free_ffn_sparse20_backend_CUDA_TORCH:
   metric_value: 0.8044
-  atol: 0.02
+  atol: 0.025
   num_int4: 0
   num_int8: 312
   num_sparse_activations: 44
 tinyllama_int8_asym_data_free_ffn_sparse20_backend_TORCH:
-  metric_value: 0.7977
-  atol: 0.02
+  metric_value: 0.7846
+  atol: 0.030
   num_int4: 0
   num_int8: 312
   num_sparse_activations: 44
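
For readers unfamiliar with reference files of this kind, here is a minimal sketch of how a `metric_value` / `atol` pair is typically consumed. It assumes the conformance test accepts a measured metric when its absolute difference from `metric_value` is at most `atol`; this page does not show the actual comparison code. The dictionary layout and the `within_tolerance` helper are hypothetical, and only the numbers are taken from the diff above.

```python
import math

# Hypothetical in-memory copy of one reference entry; the numbers come from
# the diff above, the dictionary layout is an assumption for illustration.
reference = {
    "tinyllama_ffn_sparse20_backend_TORCH": {"metric_value": 0.7879, "atol": 0.025},
}


def within_tolerance(measured: float, entry: dict) -> bool:
    """Return True when |measured - metric_value| <= atol (assumed semantics)."""
    return math.isclose(
        measured,
        entry["metric_value"],
        rel_tol=0.0,  # disable the relative check so only atol applies
        abs_tol=entry["atol"],
    )


entry = reference["tinyllama_ffn_sparse20_backend_TORCH"]
print(within_tolerance(0.7882, entry))  # previous reference value for this case
print(within_tolerance(0.8200, entry))  # a value further from the reference
```

Under this reading, widening `atol` from 0.02 to 0.025 or 0.030 loosens the acceptance band around each reference metric, which is one way to absorb run-to-run variation between backends.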