[Release_v2160] Update Release notes #3380
base: release_v2160
Conversation
@alexsu52, @ljaljushkin, @l-bat, @nikita-savelyevv, @andreyanufr, @andrey-churkin, @daniil-lyakhov, @kshpv, @AlexanderDokuchaev, @anzr299 please fill the document with your changes for the upcoming release.
- Features:
  - ...
- Fixes:
  - Fixed occasional failures of the weight compression algorithm on ARM CPUs.
- Fixes:
  - Fixed occasional failures of the weight compression algorithm on ARM CPUs.
- Improvements:
  - Reduced the run time and peak memory of the mixed precision assignment procedure during weight compression in the OpenVINO backend. Overall compression time reduction in the mixed precision case is about 20-40%; peak memory reduction is about 20%.
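For context on the improvement above, here is a minimal usage sketch (not part of this PR) of weight compression on an OpenVINO model with `ratio` below 1.0, which is what triggers the mixed precision assignment procedure; the model path and parameter values are placeholders.

```python
import nncf
import openvino as ov

# Load an OpenVINO IR model (the path is a placeholder for illustration).
model = ov.Core().read_model("model.xml")

# With ratio < 1.0 the mixed precision assignment procedure decides which
# weights are compressed to int4 and which stay in the backup precision
# (int8 by default).
compressed_model = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    ratio=0.8,       # ~80% of weights in int4, the rest in int8
    group_size=128,
)
ov.save_model(compressed_model, "model_int4.xml")
```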
ReleaseNotes.md (Outdated)
- General:
  - ...
- Features:
  - (Torch) Introduced a novel weight compression method for Large Language Models (LLMs) that significantly improves accuracy with int4 weights. Leveraging Quantization-Aware Training (QAT) and absorbable LoRA adapters, this approach can achieve a 2x reduction in accuracy loss during compression compared to the best post-training weight compression technique in NNCF (Scale Estimation + AWQ + GPTQ). The `nncf.compress_weights` API now includes a new `compression_format` option, `CompressionFormat.FQ_LORA`, for this QAT method, and a sample compression pipeline with preview support is available [here](examples/llm_compression/torch/qat_with_lora).
- Fixes:
  - Fixed occasional failures of the weight compression algorithm on ARM CPUs.
  - (Torch) Fixed weight compression for float16/bfloat16 models.
reworked FQ + Lora
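As a reading aid for the FQ_LORA item, here is a minimal sketch of how the new `compression_format` option could be used on a Hugging Face Torch model. The model id, calibration texts, and the omitted fine-tuning loop are illustrative placeholders, not part of this PR; the full preview pipeline is in examples/llm_compression/torch/qat_with_lora.

```python
import nncf
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"  # illustrative small model, not from the PR
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A handful of tokenized samples to initialize quantization parameters
# (the real pipeline uses a proper calibration set and preprocessing).
texts = ["NNCF compresses LLM weights to int4.", "QAT with LoRA recovers accuracy."]
calibration_dataset = nncf.Dataset([tokenizer(t, return_tensors="pt") for t in texts])

# Insert FakeQuantize operations plus absorbable LoRA adapters for int4 QAT.
model = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_ASYM,
    group_size=64,
    dataset=calibration_dataset,
    compression_format=nncf.CompressionFormat.FQ_LORA,
)

# ... a regular fine-tuning loop over `model` (the QAT stage) would go here ...
```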
Requirements:
- Updated PyTorch (2.6.0) and Torchvision (0.21.0) versions.
  - (Torch) Fixed weight compression for float16/bfloat16 models.
- Improvements:
  - Reduced the run time and peak memory of the mixed precision assignment procedure during weight compression in the OpenVINO backend. Overall compression time reduction in the mixed precision case is about 20-40%; peak memory reduction is about 20%.
  - (TorchFX, Experimental) Added quantization support for [TorchFX](https://pytorch.org/docs/stable/fx.html) models exported with dynamic shapes.
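A hypothetical sketch of the TorchFX flow mentioned above: a toy model is exported with a dynamic batch dimension via torch.export and the resulting graph is quantized with `nncf.quantize`. The model, calibration data, and dimension name are made up for illustration, and the exact export path supported by NNCF may differ.

```python
import torch
import nncf

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 4)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyModel().eval()
example_input = torch.randn(2, 8)

# Export to a torch.fx.GraphModule with a dynamic batch dimension.
batch = torch.export.Dim("batch")
exported_program = torch.export.export(
    model, (example_input,), dynamic_shapes={"x": {0: batch}}
)
fx_model = exported_program.module()

# Post-training quantization of the exported graph with a small calibration set.
calibration_dataset = nncf.Dataset([torch.randn(4, 8) for _ in range(10)])
quantized_fx_model = nncf.quantize(fx_model, calibration_dataset)
```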
  - ...
- Features:
  - (Torch) Introduced a novel weight compression method to significantly improve the accuracy of Large Language Models (LLMs) with int4 weights. Leveraging Quantization-Aware Training (QAT) and absorbable LoRA adapters, this approach can achieve a 2x reduction in accuracy loss during compression compared to the best post-training weight compression technique in NNCF (Scale Estimation + AWQ + GPTQ). The `nncf.compress_weights` API now includes a new `compression_format` option, `CompressionFormat.FQ_LORA`, for this QAT method, and a sample compression pipeline with preview support is available [here](examples/llm_compression/torch/qat_with_lora).
  - (Torch) Added support for 4-bit weight compression, along with the AWQ and Scale Estimation data-aware methods, to reduce quality loss after compression.
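A minimal sketch, assuming the existing data-aware options of `nncf.compress_weights`, of the Torch 4-bit compression with AWQ and Scale Estimation enabled; the model id and calibration texts are illustrative placeholders, not from this PR.

```python
import nncf
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"  # illustrative, not from the PR
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

texts = ["Calibration sample one.", "Calibration sample two."]
dataset = nncf.Dataset([tokenizer(t, return_tensors="pt") for t in texts])

compressed_model = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    group_size=64,
    dataset=dataset,        # the data-aware methods below need calibration data
    awq=True,               # Activation-aware Weight Quantization
    scale_estimation=True,  # Scale Estimation to reduce quantization error
)
```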
Changes
Reason for changes
Related tickets
For the contributors:
- Please add your changes (as a commit to the branch) to the list according to the template and previous notes.
- Do not add tests-related notes.
- Provide the list of the PRs (for all your notes) in the comment for the discussion.