Remove f16, bf16 from node's evaluate methods v2 #22674
Conversation
Since this change, most operations that override Node::evaluate no longer instantiate evaluate methods for f16 and bf16. There are a few exceptions; we still keep f16 (and/or bf16) evaluates for the following operations:
- Ceiling
- Convert
- FakeConvert

The primary reason is to reduce binary size: the change saves around 200 KB. The change is transparent to the caller, so you can still evaluate f16/bf16 operations, but internally they'll be executed with f32 precision (see the sketch below).

Ticket: CVS-108489
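To illustrate the caller-transparent behavior, here is a minimal sketch of the element-by-element round trip for a generic Abs-like op. `ov::float16` is the real OpenVINO half type; the function itself is illustrative and not the PR's code:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

#include "openvino/core/type/float16.hpp"

// Illustrative sketch: the caller still hands in f16 data, but the
// arithmetic runs in f32; each element is upcast, computed, and downcast.
std::vector<ov::float16> evaluate_abs_f16(const std::vector<ov::float16>& in) {
    std::vector<ov::float16> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const float x = static_cast<float>(in[i]);  // f16 -> f32
        out[i] = ov::float16(std::fabs(x));         // compute in f32, then f32 -> f16
    }
    return out;
}
```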
Force-pushed from cef8252 to 2c14955
Force-pushed from 2d3c74c to a9a0849
#include "ov_ops/type_relaxed.hpp" | ||
|
||
const ov::element::TypeVector& ov::util::unsupported_types() { | ||
static const ov::element::TypeVector types{ov::element::f16, ov::element::bf16}; |
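As a hedged usage sketch (the helper below is hypothetical and not part of this diff), a caller could consult `unsupported_types()` to decide whether the f32 fallback applies to a given element type:

```cpp
#include <algorithm>

#include "openvino/core/type/element_type.hpp"
// unsupported_types() is declared in the header added by this PR.

// Hypothetical helper: true if the element type is one of those whose
// evaluate methods were removed and therefore needs the f32 round trip.
bool needs_f32_fallback(const ov::element::Type& et) {
    const auto& types = ov::util::unsupported_types();
    return std::find(types.begin(), types.end(), et) != types.end();
}
```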
Just an idea: as far as I know, some plugins use conversion from i64 to i32. Would it be useful to apply the same approach (as with f16 -> f32) for those element types?
The goal was to remove f16 and bf16 from the core ops to reduce binary size, but constant folding and shape inference (which run before the plugin conversion is applied) still require these precisions for calculations; that is why values are converted to f32 without changing the model. In core, when an operator does calculations on f16/bf16, the values are converted to float and then back (element by element).
The i64 and i32 types, by contrast, are common in shape calculations (constant folding and shape inference), and applying a conversion there may introduce data copies; I think native support is the better option there. See the sketch below.
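To make the data-copy point concrete, here is a minimal sketch (an assumed helper, not OpenVINO code) of what lowering i64 shapes to i32 would entail:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: lowering an i64 shape tensor to i32 needs a second
// buffer and a per-element copy (and the reverse on the way back), which
// native i64 evaluates avoid entirely. Note the potential truncation, too.
std::vector<int32_t> downcast_shape(const std::vector<int64_t>& shape_i64) {
    std::vector<int32_t> shape_i32(shape_i64.size());  // extra allocation
    for (std::size_t i = 0; i < shape_i64.size(); ++i) {
        shape_i32[i] = static_cast<int32_t>(shape_i64[i]);  // extra copy, may truncate
    }
    return shape_i32;
}
```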
What about a further reduction at the cost of removing i32 and using i64 for it? And u32 -> u64? Or other integer types like i8 and i16?
…ding is omitted (#26756)
Details: This is a modification of #22674. f16 LLM compilation time on ARM is unreasonably long (llama was tested). The perf report shows that every ConstantFolding transformation takes several seconds even if the graph is not modified. The root cause is the util::convert_to_supported_precision call, which happens even when constant folding is skipped. The suggested fix is to skip the util::convert_to_supported_precision call if folding is not applied.
Tickets: CVS-152428
Co-authored-by: Aleksandr Voron <[email protected]>
Co-authored-by: Andrii Staikov <[email protected]>
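A rough sketch of the fix's shape, under the assumption that folding status is known at the call site. The helper and the converter's signature below are assumptions; only the name util::convert_to_supported_precision comes from the commit message:

```cpp
#include <memory>

#include "openvino/core/node.hpp"

// Both declarations are assumptions for this sketch: the helper is
// hypothetical, and the converter's real signature may differ.
bool try_constant_fold(const std::shared_ptr<ov::Node>& node);
namespace ov::util {
void convert_to_supported_precision(const std::shared_ptr<ov::Node>& node);
}

// Illustrative control flow, not the actual diff: run the conversion only
// when folding changed something, so an unmodified graph skips the pass.
void fold_and_convert(const std::shared_ptr<ov::Node>& node) {
    if (try_constant_fold(node)) {
        ov::util::convert_to_supported_precision(node);
    }
}
```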