Sync with r3.1 (#1921)
* update dlrmv2 BKC (#1476)

* use merge-emb-cat for int8 since acc issue is fixed in IPEX (#1477)

* forcing numpy to use a specific version (#1474)

* Restrict vit training to single socket (#1484)

* update dlrm/bert distribute training BKC (#1486)

* change batch size for int8 resnet50 and ssd-resnet34 (#1487)

* Added command to remove existing logs in output_dir (#1475)

* Added command to remove existing logs in output_dir

* Fix condition to checkout OneAPI tools repository (#1490)

* Hz/dlrm ddp (#1496)

* fix dlrm ddp

* fix time computation

---------

Co-authored-by: Weizhuo Zhang <[email protected]>

* fix dlrm-v1 int8 thp (#1497)

* also use merged-emb-cat in dlrm-v2 int8 thp (#1498)

* updated tpp files for 2.12.1 release (#1479)

* updated tpp files

* added yolo5

* P0 models list (#1500)

* P0 models list

* replace master w/ tag

* correct framework name

---------

Co-authored-by: Jitendra Patil <[email protected]>

* another update to TPPs (#1503)

* Fixing SSD-Resnet34 training quickstart script to run right number of instances (#1493)

* Container GHA Pipeline Reformat (#1462)

* swap runner to mlops runner

* change ipex base image (#1440)

* rename tests yaml (#1450)

* New Test Runner (#1461)

* add execute perms to quickstart and add 140 tests to pytorch resnet with new runner

* add tests per new format

* add flex140 support

* Update Test Runner (#1467)

* Flex 140 tests for P0 (#1469)

* add previous m3 commits (#1478)

* GHA tests for flex 140 (#1499)

* add previous m3 commits (#1478)

* Added command to remove existing logs in output_dir (#1475)

* address PR review (#1501)

* remove makefile

* Remove caas reference (#1502)

* Add previous m3 commits in baremetal readme (#1480)

---------

Co-authored-by: Srikanth Ramakrishna <[email protected]>
Co-authored-by: mahathis <[email protected]>

* refine dlrm ddp dataloader (#1504)

Co-authored-by: Weizhuo Zhang <[email protected]>

* workaround oneccl bad termination issue for RN50 distributed training (#1508)

* Fix Test Pipeline (#1514)

* fix test pipeline

* Update container-pipeline-tester.yml

* Bump mlflow in /datasets/cloud_data_connector/samples/interoperability (#1492)

Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.5.0 to 2.6.0.

* Bump mlflow in /datasets/cloud_data_connector/samples/azure (#1491)

Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.5.0 to 2.6.0.

* Zufang/readme update for itex (#1485)

* add link to int8 PB for onednn graph

* refine readme for onednn graph option

* MaskRCNN GPU training (#1513)

* maskrcnn model demo zero-bkc

* update readme

* added license header

* added HW requirement

* update docs

* support bf32 for SD finetune (#1521)

* fix dlrm-v1 ddp hang (#1512)

* fix dlrm-v1 ddp hang

* comment out two more barrier

* modify multi-node scripts for resnet50, maskrcnn and stable diffusion (#1537)

* modify multi-node scripts for RNNT and ssd-resnet34 (#1539)

* Modify Test Runner Run Dir from GHA (#1541)

* Adjust test runner path to be MLOps root

* move to test-runner dir

* merge dir and parent_dir

* remove parent dir

* use full paths

* get test name for artifact upload

* Stop workload tests and security scans on open PR event (#1543)

* PVC P0 RN50 PYT Inference (#1494)

* updated to latest BKC to add multi-card multi-tile support

* PVC P0 PYT BERT Large (#1495)

* Add support for multi-card multi-tile

* PVC P0 PYT DLRM training (#1518)

* add dlrm training pvc support

* Fix LKG pipeline for the new ipex and itex conda installers (#1525)

* update LKG pipeline for the new ipex and itex conda installers

* remove version subdir from bom file

* TF RN50V1_5 P0 Max Inference (#1529)

Co-authored-by: Mahathi Vatsal <[email protected]>

* modify SD inference scripts (#1553)

* Updates + New Transfer Learning Notebooks from TLT team (#1522)

* Update for all transfer learning notebooks
---------

Co-authored-by: Harsha Ramayanam <[email protected]>

* Rename the repo to Intel AI Reference Models for rebranding (#1473)

* Rename the repo for rebranding, remove k8s and tools directories

* Update README.md
---------

Co-authored-by: Clayne Robison <[email protected]>

* modify SD finetune scripts (#1555)

* fix CrossEntropyLoss target for dummy inputs (#1556)

* PVC P0 PYT DLRM inference (#1517)

* add dlrm pvc inference support

* Modified baremetal README(bert and rn50) for max series (#1520)

* fix itex (#1527)

* fix itex

* fix key error

* PVC BERT-Large P0 TF (#1534)

* adapt new BKC
* Modified Bert large TF baremetal readme for Mx series (#1557)
---------
Co-authored-by: Mahathi Vatsal <[email protected]>

* Add optimizations for BF16 transformer inference (#1489)

* Change the order of operations from dense->concat->split_heads to dense->split_heads->concat (attention_layer.py)

* Change the order of operations when calculating encoder-decoder k, v caches to avoid using matmul ops between large matrices (transformer.py)

* Reduce number of occurrences of split_heads with encoder-decoder k, v caches by performing split_heads in transformer.py

* Update README.md (#1516)

Change BFloat16 model description to point to new frozen graph.

* make llama training max-step flexible (#1563)

* Fix drops logic to be avoided if any workload test fails (#1564)

* Create selective PR validations tags-based (#1549)

* Create selective PR validations tags-based

* Add edit_pull_request as trigger for the PR validations

---------

Co-authored-by: Wafaa Taie <[email protected]>

* PYT CPU Automation (#1544)

* make initial changes

* add tests for new base container

* add more new tests

* remove env var

* add more tests

* more test added

* add dlrm inference build and test

* add more tests

* add more tests

* add another model test

* add final tests

* add devcatalog (#1566)

* Change success condition from previous jobs before doing drop (#1568)

* Added readme for DLRM pytorch MAX series (#1570)

* update docs for AI Tools (#1567)

* ViT Train : Enable multi instance training for Tensorflow Vision Transformer model (#1569)

* Revert "Restrict vit training to single socket (#1484)"

This reverts commit 8d4eb7a.

* Add multi instance support

* Update README.md

* Remove useless files and add license title for DLRM v2 (#1574)

* Gda/step url (#1560)

* Added step url to result table

* Remove continue on error from workload tests

* Run performance checks even when a workload test failed

* add driver setup doc (#1571)

Co-authored-by: Jitendra Patil <[email protected]>

* LLM models using ipex.optimize_transformers for bf16/int8 (#1562)

* init pr

* revise v1, local test passed

* TF ResNet50v1.5 Fix (#1575)

* tf cpu r50 inf fixed

* actions.json added

* newline added for actions

* venv installation added

* pip install fixed

* venv pip install fixed

* test workflow reverted back

---------

Co-authored-by: Jitendra Patil <[email protected]>

* added int8 support for graphsage (#1536)

* MEMREC DLRM Inference (#1547)

* Added omp_num_threads and cores_per_instance as env variable (#1573)

* resnet50v1.5 and bert large inference models

* TF Stable Diffusion: Download model files in start.sh and log latency & throughput (#1581)

* adding model download to start.sh to avoid failure during multi-instance execution

* downloading the clip tokenizer in start.sh also

* script changes to report both latency and throughput

* Doc fixes (#1582)

* add torchrec dlrm to the models table

* fix paths to run quickstart scripts

* Add RN50 INT8 Calibration file (#1545)

* Jupyter notebook for AI Reference models (#1583)

* Added jupyter notebook for AI Reference models

* Added README for AI_Reference jupyter notebook

* Supports Resnet50 v1.5 and mobilenet v1 inference workloads

* Upgrade Pillow version to 10.0.1 to fix high severity CVEs (#1584)

* remove workflows

* correct_release_tag (#1587)

* correct_release_tag

* revert a change

* Unexpose old TF models (#1593)

* tf cpu distilbert inf (#1612)

* 3D Unet MLPerf Inference Workload (#1595)

* 3D Unet MLPerf added

* docker compose added

* batch size corrected

* numactl added

* ubuntu Dockerfile updated

* output dir changed in tests

* yes flag added in Dockerfile

* default OS added

* BERT Large Inference CPU Workload Added (#1594)

* BERT Large Inf added

* TCMALLOC added

* ubuntu Dockerfile updated

* TCMalloc location updated

* test file updated

* yes flag added in Dockerfile

* TF CPU Bert Large Training Workload (#1596)

* bert large pretraining added

* extra OS removed from r50 inf service

* ubuntu Dockerfile updated

* ssh helper script added

* yum non-interactive update

* output dir fixed

* TF CPU DIEN Inference Workload (#1598)

* TF CPU MobileNet V1 Inference Workload (#1600)

* TF CPU ResNet v1.5 Training Workload (#1604)

* TF CPU SSD ResNet-34 Inference Workload (#1606)

* TF CPU SSD ResNet-34 Training Workload (#1607)

* TF CPU Transformer MLPerf Inference Workload (#1597)

* TF CPU Transformer MLPerf Training (#1608)

* TF CPU DistilBERT fixed (#1629)

* TF CPU SSD MobileNet Inference Workload (#1605)

* Fixed typo in readme for framework (#1631)

* Checkpoints added for TF CPU Workloads (#1637)

* TF CPU Dev Catalog READMEs Updated. (#1652)

* EMR PYT RN50 Infer (#1624)

---------

Co-authored-by: Jitendra Patil <[email protected]>

* EMR PYT RN50 Train (#1625)

* build rn50 train centos

* comment conda lines

* comment conda lines

* remove fp16 test,add devcat and intel-openmp

* add changes to ubuntu

* remove commented lines

* add cpu to tag

* EMR PYT ResNext Infer (#1619)

* add initial commits for emr resnext

* add dockerfiles

* build resnext

* remove extra precisions

* add devcat,openmp and more tests

* add cpu to tag

* EMR PYT MaskRCNN Infer (#1621)

* add maskrcnn inference

* add cmake and bf32 tests

* add openmpi,more tests and devcat

* add cpu to tag

* rearrange pip installs

* EMR PYT MaskRCNN Train (#1622)

* build  maskrcnn training

* add cmake

* correct image names

* add ld_preload, devcat and tests changes

* EMR PYT SSD-ResNet34 Train (#1618)

* add compose changes

* ssd-resnet34 build

* correct tests file

* add more tests and devcat

* EMR PYT SSD-ResNet34 Infer (#1617)

* build ssd-resnet34 images

* add bf32 tests and update DEVCATALOG.md

* rename devcatalogs (#1671)

* EMR PYT BERT Large Infer (#1623)

* add bert-large build

* correct paths

* add devcatalog

* add more tests

* Rename EMR_DEVCATALOG.md to DEVCATALOG.md

* Update DEVCATALOG.md

* PYT EMR BERT-Large Train (#1639)

* build bert-large training

* add pretrained model env

* remove idsid

* add more tests and devcatalog

* correct env and rename

* Delete EMR_DEVCATALOG.md

* Update DEVCATALOG.md

* EMR PYT Distilbert Infer (#1620)

* build distilbert images

* validate distilbert

* add more tests and devcatalog

* remove MZ reference

* modify env params

* uncomment and remove idsid

* clarify core per instance

* clarify hf_datasets

* remove void env var

* Rename EMR_DEVCATALOG.md to DEVCATALOG.md

* Update DEVCATALOG.md

* EMR PYT RNNT Inference (#1616)

* add dockerfiles for rnnt

* fix pytorch binding error

* copy diff file to inference

* add librosa

* add more tests and devcatalog

* correct realtime cmd

* Rename EMR_DEVCATALOG.md to DEVCATALOG.md

* Update DEVCATALOG.md

* EMR PYT RNNT Train (#1615)

* Rename EMR_DEVCATALOG.md to DEVCATALOG.md

* Update DEVCATALOG.md

* EMR PYT DLRM Infer (#1626)

* Update DEVCATALOG.md

* EMR PYT DLRM Train (#1628)

* build dlrm training

* add num_batch

* add tcmalloc

* add tcmalloc

* add devcatalog

* re-locate the file

* Rename EMR_DEVCATALOG.md to DEVCATALOG.md

* Update DEVCATALOG.md

* make batch flexible (#1635)

* Change dataset for Transfer Learning LLM Notebook (#1576)

* update llm notebook with code alpaca

* push updates

* fixed broken link

* Refactor Transfer Learning Notebook folder to match TLT structure + small diff (#1670)

* refactor to match TL structure + small diff

* fixed table structure

* add landing page doc (#1653)

* add landing page doc

* simplify and add r3.1

* add precisions

* add tf landing page

* add precisions

---------

Co-authored-by: Jitendra Patil <[email protected]>

* r3.1 fixes (#1679)

* TF CPU ResNet 50 v1.5 Inference Model Checkpoints fixed (#1663)

* spr removed from workdir

* R50 Inf fixed

* fixed rn50 error

* fixed in docker compose yaml

---------

Co-authored-by: Sharvil Shah <[email protected]>

* make minor corrections in devcatalog README (#1680)

* Remove old TF models (#1673)

* remove ResNet50, FasterCNN, RFCN, NCF, Wide and deep Large dataset training, waveNet, Inception v4, mlperf GNMT models

* remove relevant unit tests and update coverage percentage

* refine changes based on feedback (#1684)

* fix typo (#1686)

* fixing docker image names

* fixing TF centos docker images link

* unset KMP AFFINITY for accuracy scripts (#1689)

* Bump mlflow in /datasets/cloud_data_connector/samples/azure (#1698)

Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.6.0 to 2.8.1.
- [Release notes](https://github.com/mlflow/mlflow/releases)
- [Changelog](https://github.com/mlflow/mlflow/blob/master/CHANGELOG.md)
- [Commits](mlflow/mlflow@v2.6.0...v2.8.1)

---
updated-dependencies:
- dependency-name: mlflow
  dependency-type: direct:production

* Bump mlflow in /datasets/cloud_data_connector/samples/interoperability (#1697)

Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.6.0 to 2.8.1.
- [Release notes](https://github.com/mlflow/mlflow/releases)
- [Changelog](https://github.com/mlflow/mlflow/blob/master/CHANGELOG.md)
- [Commits](mlflow/mlflow@v2.6.0...v2.8.1)

---
updated-dependencies:
- dependency-name: mlflow
  dependency-type: direct:production

* add optional args to devcatalog pages (#1750)

* add optional env for tf

* add optional args

* add v2 to table (#1762)

* remove space

* Corrected updating OMP NUM THREADS (#1759)

* validate omp_num_threads and cores_per_instance

* revert changes

* validate omp num threads and cores per instance (#1789)

* add omp_num_threads and cores_per_instance (#1809)

* wsl2 documentation (#1815)

* add wsl2 stable diffusion documentation

* add wsl2 stable diffusion documentation

* add wsl2 base doc

* make minor tabular changes

* make minor tabular changes

* add ssh instructions

* re-word example

* update torch version outputs

* add entry in main readme

---------

Co-authored-by: Jitendra Patil <[email protected]>

* fix dlrm normal training to support SNC mode (#1699)

Co-authored-by: Weizhuo Zhang <[email protected]>

* update torch-ccl branch (#1716)

* Pytorch ResNext32x16d baremetal EMR tests (#1640)

* Pytorch RN50 baremetal EMR inference and training tests (#1641)

* PyTorch SSD-Resnet34 EMR baremetal training and inference tests (#1647)

* PyTorch distilBERT baremetal tests (#1650)

* PyTorch BERT_LARGE_SQUAD inf baremetal tests (#1651)

* PyTorch BERT_LARGE Training baremetal tests (#1654)

* Pytorch MaskRCNN baremetal EMR tests (#1646)

* PyTorch DLRM baremetal tests (#1649)

* PyTorch RNN-T EMR baremetal tests (#1648)

* remove TF yolo v5, add cpu stable diffusion

* [Zero-BKC][ITEX][GPU]add itex stable diffusion,EfficientNet,wide and deep inference for ats-m (#1642)

Co-authored-by: XumingGai <[email protected]>

* remove pb files from the github repo (#1730)

* modify rn50 training script (#1732)

* Refactor new zero bkcs scripts for TF ResNet50 inf and Mask-RCNN to models_v2 (#1695)

* move new zero bkcs for TF resnet50 inf and maskrcnn to models_v2

* GHA tests for dGPU zero copy BKC format workloads (#1744)
---------

Co-authored-by: Mahathi Vatsal <[email protected]>

* modify Stable Diffusion finetune script (#1746)

* add utilities to parse result for pytorch (#1729)

* Wliao2/add rn50 (#1693)

* add dgpu resnet 50

* update intel tf version to be the latest (#1748)

* Enable Inductor path for Bert_large inference and training (#1733)

* Init Bert-large files from inductor path

* cherry pick Enable int8-mixed-bf16 for 5 transformer models (#1720)

* modify README

* Enable Inductor path for Distilbert-base inference (#1734)

* Init Bert-large files from inductor path

* cherry pick Enable int8-mixed-bf16 for 5 transformer models (#1720)

* modify README

* Init Distilbert base models

* Init DLRM Script (#1739)

* Enable Inductor path for RN50 inference and training (#1718)

* Enable Inductor path for RN50 inference and training

* add bf32

* add README for Torch inductor

---------

Co-authored-by: leslie-fang-intel <[email protected]>

* Move and add license headers P1 ATS-M ITEX models (#1754)

* move and add license headers to stable diffusion model

* move and add license headers to efficientnet

* update maskrcnn inference

* move and update license headers for wide and deep model

* Remove MLFlow dependency. Updates on functional tests (#1756)

* add v2 to table (#1762)

* Weizhuoz/fix bert accuracy (#1761)

* fix bert-large accuracy read issue

* fix bert_large accuracy issue

* inductor int8 could not use model.eval()

* remove ssdmobilenet and yolov4 (#1757)

* update maskrcnn, bert-large training (#1666)

* update maskrcnn training

* update maskrcnn and add bert-large

* move maskrcnn training to model_v2, update license

* code review changes for bert large training

---------

Co-authored-by: Wafaa Taie <[email protected]>

* update mlflow version to use the latest (#1768)

* [Zero-BKC][ITEX][GPU] Add resnet50 and 3d-unet training (#1664)

* add 3d-unet gpu training

* add gpu resnet50 training

* update license headers and move scripts to models_v2

* code review changes for 3d-unet training, update license, move to models_v2

* changes in readme for code review

---------

Co-authored-by: Wafaa Taie <[email protected]>

* enable distributed training for DLRMv2 and some fix for inductor path (#1770)

* enable distributed training for DLRMv2 and some fix for inductor path

* add missing files

* remove license headers from .txt files in data-connector (#1774)

* fix data loader (#1780)

* Fix for TL Notebooks GHA (#1773)

* fixed file paths

* changed venv creation

* trying venv

* trying no venv

* reverted venv3

* testing apt get update

* added apt-get install

* venv3 --> venv

* uncommented apt

* without virtualenv

* added pip install virtualenv

* downgraded PyYaml to be compatible with tf models official

* 2.12.0 --> 2.12.1

* removed package versions

* added back tf official version

* flipped

* addressed review comments

* Fix inductor path int8 bf16 realtime issue (#1767)

* Update inference_performance.sh (#1784)

* fix acc (#1798)

* add warm up iter for inference throughput (#1799)

* Predownload weights for SDv2.1 (#1797)

* predownload weights for SDv2.1

* update hash

* fix DLRM V1 syntax error (#1760)

* fix DLRM V1 syntax error

* fix dlrm inductor numpy.bool_ error (#1448)

* fix DLRM V1 train_ld issue

* Update dlrm_s_pytorch.py

---------

Co-authored-by: Chunyuan WU <[email protected]>
Co-authored-by: zhuhaozhe <[email protected]>

* Fix bert large inductor int8 accuracy failure in last batch (#1803)

* fix bert large inductor int8 accuracy issue

* Format fixes

---------

Co-authored-by: jianan-gu <[email protected]>

* add bert (#1701)

* add bert IPEX

Co-authored-by: Wafaa Taie <[email protected]>
Co-authored-by: Mahathi Vatsal <[email protected]>

* Wliao2/add dlrm kaggle (#1711)

* add dlrm kaggle

* fix license issue

* Update README.md

* remove unused files

* update

* add licence header for modified files

---------

Co-authored-by: Mahathi Vatsal <[email protected]>

* update bert-large for ARC (#1816)

* update scripts and Readme for ARC

* Update README.md

* Update quickstart scripts with env variables for Stable Diffusion and ResNet50v1.5 (#1791)

* adding env vars for SD and RN50

* updating accuracy quickstart and other minor changes

* using default values only if the env var are not set from the cmd line

* fix for coverage tests

* Added GHA for ITEX wide deep large (#1820)

* Added GHA for ITEX wide deep large

* Added stable diffusion inference ITEX (#1819)

* Added stable diffusion inference

* Changed file permissions

* Added GHA for EfficientNet ITEX (#1821)

* Added GHA for EfficientNet ITEX

* Update run_test.sh

* Update setup.sh

* Update README.md

* Added GHA for ITEX bert large Training (#1822)

* ssdmobilenet int8 accuracy fix (#1811)

* ssdmobilenet int8 accuracy fix

* added change in quickstart accuracy script

* modified public bucket link

* modified new args in unit test

* fixed unit test

* add BKC for DLRM-V2 convergence test (#1824)

* bert-large inductor uses int8_bf16 mix (#1792)

* Bert-large inductor use int8-bf16 mix

* ipex uses int8_bf16 mix in if

* merge develop

* inductor uses int8-bf16 mix

* Distilbert int8 optimization (#1830)

* optimize distilbert int8

* re-calibrate distilbert

* fix for inductor (#1834)

* TF- DistilBERT - Update quickstart scripts with env vars (#1818)

* Add Env variables to quickstart scripts

* Update # of cores for throughput script

* add distilbert (#1702)

* add distilbert

* Corrected refactored path for distilbert

* Added intel license header

* Update README.md

---------

Co-authored-by: Mahathi Vatsal <[email protected]>

* Update dlrm_s_pytorch.py (#1843)

* Wliao2/add stable diffusion (#1705)

* add stable_diffusion

* update some typo

* fix license issue

* update stable diffusion

* update for acc

* verify the result

* Refactored to new folder

* Update README.md

* add support for ARC

---------

Co-authored-by: Mahathi Vatsal <[email protected]>

* Fix Bert Large Int8 Latency Issue (#1859)

Co-authored-by: jianan-gu <[email protected]>

* [DistilBert] modify for masked_fill default value (#1868)

* Nhatle/bert large training x3 vs x1 (#1776)

* set num-inter-threads=2

* bert-large squad: Binding process to cores on 1 socket

* Enable multi-instance training for bert-large squad

* Fix in case users only run 1 instance

* Fix benchmark_command

* Molly/ddp bkc update (#1873)

* make num_iter flexible

* bugfix for bert-large ddp

* bkc for rn50 ddp training update

* bkc for rn50 ddp training update

* bkc for dlrm_v1 ddp training update

---------

Co-authored-by: WeizhuoZhang-intel <[email protected]>

* Corrected IPEX installer version (#1878)

* Changed IPEX installer version

* Update dlrm_s_pytorch.py (#1879)

* Update AI Bundle version in tests setup files for CI/CD pipeline (#1881)

* doc: document models_v2 contribution guideline (#1855)

Signed-off-by: Dmitry Rogozhkin <[email protected]>

* Molly/inductor fp16 (#1875)

* make num_iter flexible

* bugfix for bert-large ddp

* bkc for rn50 ddp training update

* bkc for rn50 ddp training update

* bkc for dlrm_v1 ddp training update

* rn50 fp16 torch.compile enabled

* fp16 autocast fix

* fix RN50 for fp16 torch.compile (#1849)

* enable stable-diffusion fp16 inductor path

* vit, bert-large fp16 enable

* merge to latest transformers patch

* Update enable_ipex_for_transformers.diff

* Update enable_ipex_for_transformers.diff

---------

Co-authored-by: Cao E <[email protected]>
Co-authored-by: WeizhuoZhang-intel <[email protected]>

* Fix in case mpi_num_processes_per_socket=1 (#1885)

* Fix in case mpi_num_processes_per_socket=1

* small fix

* Update dlrm_s_pytorch.py (#1890)

* Modified dataset path (#1894)

* added GHA for ITEX bert large

* Stable Diffusion PYT Flex and Max (#1853)

* validate sd pyt
* add max tests and dockerfile

* Bump scipy in /models_v2/pytorch/stable_diffusion/inference/gpu (#1847)

Bumps [scipy](https://github.com/scipy/scipy) from 1.9.1 to 1.11.1.
- [Release notes](https://github.com/scipy/scipy/releases)
- [Commits](scipy/scipy@v1.9.1...v1.11.1)

---
updated-dependencies:
- dependency-name: scipy
  dependency-type: direct:production

* Bump gitpython in /models_v2/pytorch/distilbert/inference/gpu (#1846)

Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.30 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](gitpython-developers/GitPython@3.1.30...3.1.41)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: direct:production

* Bump transformers in /models_v2/pytorch/distilbert/inference/gpu (#1840)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.25.1 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.25.1...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production

* Bump transformers in /models_v2/pytorch/bert_large/inference/gpu (#1810)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.11.0 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.11.0...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production

* Bump transformers (#1786)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.0 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.30.0...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production

* Bump transformers (#1785)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.0 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.30.0...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production

* Added GHA tests for stable diffusion  (#1887)

* validate sd pyt

* update for ARC (#1781)

* update for ARC

* update log

* refactor to models_v2

* update path due to refactor

* sync with 2.1rc3

* Wliao2/add dlrm (#1704)

* add dlrm v2

* fix license issue

* Update dlrm_dataloader.py

* Update dist_models.py

* Update dlrm_dataloader.py

* Update dist_models.py

* refactored to a new folder

* update Readme

---------

Co-authored-by: Mahathi Vatsal <[email protected]>

* Max 3D-Unet container support (#1832)

* add maskrcnn container support

* add 3d-unet container support

* Added GHA for 3d Unet Training ITEX (#1823)

* Added Resnet50v1.5 and maskrcnn train GHA test (#1751)

* Refactored resnet50v1_5 for Zero copy BKC format (#1897)

* Added Necessary Metadata and Bug Fixes for Transfer Learning Notebooks (#1825)

* fixed file paths

* changed venv creation

* added pip install virtualenv

* downgraded PyYaml to be compatible with tf models official

* 2.12.0 --> 2.12.1

* added back tf official version

* addressed review comments

* fixed version for fsspec, removed llm test

* changed accelerate version

* fix for tf-models-official

* added needed metadata

* made tests significantly less expensive

* fixed zip extract to tar extract

* fixed sms download

* fixed typo for csv path name

* addressed review comments

* simplified if statement

* Added oneapi path (#1902)

* validate sd pyt

* Update README.md with ipex version (#1903)

* Update README.md with ipex version

* Max MaskRCNN container support  (#1831)

* add maskrcnn container support

* Max RN50 container validation (#1829)

* validate container for zero-bkc for rn50 max container

* Max BERT-Large container support (#1833)

* add bert-large container support

* Flex Wide and deep container  (#1851)

* validate zero-copy bkc for itex stable diffusion

* validate zero-copy bkc for flex container (#1772)

* validate zero-copy bkc

* EfficientNet Container for flex (#1771)

* validate zero-copy bkc efficientnet

* TF MaskRCNN container for Flex GPU (#1755)

* adapt zero-copy bkc and validate maskrcnn

* validate bert-large inference PYT PVC (#1841)

* validate bert-large inference

* validate bert-large container PVC pytorch (#1838)

* validate bert-large container

* RN50 PYT Max container (#1904)

* validate refactor of zero-bkc training

* Latest updates to TF RN50 for Flex series (#1813)

* adapt zero-copy bkc for image build

* Update README.md for IPEX versions (#1907)

* Update README.md

* not cast and randomized crossnet bias for inductor and make warmup iters as an arg (#1906)

* resolve merge conflicts (#1911)

* Wliao2/add ssdmbv1 (#1817)

* add ssd-mobilenetv1

Co-authored-by: Mahathi Vatsal <[email protected]>

* add IPEX Max 3dunet (#1706)

* add 3dunet for IPEX for 3DUnet for Max series

* Updated baremetal readme for distil bert IPEX (#1908)

* Updated baremetal readme distil bert for IPEX

* DistilBERT  inference container PYT Flex and Max (#1854)

* add functional support

* docker: fix broken docker-compose.yml (#1913)

Fixes: 473d3b3 ("DistilBERT inference container PYT Flex and Max (#1854)")

Signed-off-by: Dmitry Rogozhkin <[email protected]>

* docker/flex: fix build and run for tf maskrcnn (#1896)

Signed-off-by: Dmitry Rogozhkin <[email protected]>

* Flex PYT DLRM-v1 inference (#1895)

* build dlrmv1 container

* Updated baremetal readme for DLRM v1 (#1909)

* Updated baremetal readme for DLRM v1

* Update README.md (#1914)

* remove extra test (#1916)

* release docs for containers (#1915)

* Update release container table

* Updated main README table (#1919)

* Updated main README table

* clean up workflows

* restore git submodules

---------

Signed-off-by: Dmitry Rogozhkin <[email protected]>
Co-authored-by: jianan-gu <[email protected]>
Co-authored-by: zhuhaozhe <[email protected]>
Co-authored-by: Om Thakkar <[email protected]>
Co-authored-by: sachinmuradi <[email protected]>
Co-authored-by: Cao E <[email protected]>
Co-authored-by: mahathis <[email protected]>
Co-authored-by: lerealno <[email protected]>
Co-authored-by: DiweiSun <[email protected]>
Co-authored-by: zengxian <[email protected]>
Co-authored-by: Weizhuo Zhang <[email protected]>
Co-authored-by: Jitendra Patil <[email protected]>
Co-authored-by: Srikanth Ramakrishna <[email protected]>
Co-authored-by: Mahmoud Abuzaina <[email protected]>
Co-authored-by: Tyler Titsworth <[email protected]>
Co-authored-by: jiayisunx <[email protected]>
Co-authored-by: zofia <[email protected]>
Co-authored-by: Mahathi Vatsal <[email protected]>
Co-authored-by: okhleif-IL <[email protected]>
Co-authored-by: Harsha Ramayanam <[email protected]>
Co-authored-by: Clayne Robison <[email protected]>
Co-authored-by: jianyizh <[email protected]>
Co-authored-by: nhatle <[email protected]>
Co-authored-by: gera-aldama <[email protected]>
Co-authored-by: Real Novo, Luis <[email protected]>
Co-authored-by: Sharvil Shah <[email protected]>
Co-authored-by: Ashiq Imran <[email protected]>
Co-authored-by: Gopi Krishna Jha <[email protected]>
Co-authored-by: leslie-fang-intel <[email protected]>
Co-authored-by: Sharvil Shah <[email protected]>
Co-authored-by: Nick Camarena <[email protected]>
Co-authored-by: xiangdong <[email protected]>
Co-authored-by: wenjun liu <[email protected]>
Co-authored-by: XumingGai <[email protected]>
Co-authored-by: wincent8 <[email protected]>
Co-authored-by: Jesus Herrera Ledon <[email protected]>
Co-authored-by: XumingGai <[email protected]>
Co-authored-by: Chunyuan WU <[email protected]>
Co-authored-by: Syed Shahbaaz Ahmed <[email protected]>
Co-authored-by: Xuan Liao <[email protected]>
Co-authored-by: Dmitry Rogozhkin <[email protected]>
Showing 1,048 changed files with 68,017 additions and 25,603 deletions.
407 changes: 407 additions & 0 deletions CONTRIBUTING.md

Large diffs are not rendered by default.

191 changes: 0 additions & 191 deletions Contribute.md

This file was deleted.

14 changes: 14 additions & 0 deletions Makefile
@@ -53,6 +53,20 @@ unit_test:
	@echo "Running unit tests..."
	tox -e py3-py.test

test_tl_tf_notebook: venv
	@. $(ACTIVATE) && pip install -r docs/notebooks/transfer_learning/requirements.txt && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/text_classification/tfhub_bert_text_classification/BERT_Binary_Text_Classification.ipynb remove_for_custom_dataset && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/image_classification/tf_image_classification/Image_Classification_Transfer_Learning.ipynb remove_for_custom_dataset && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/image_classification/huggingface_image_classification/HuggingFace_Image_Classification_Transfer_Learning.ipynb && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/question_answering/BERT_Question_Answering.ipynb && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/text_classification/tfhub_bert_text_classification/BERT_Multi_Text_Classification.ipynb remove_for_custom_dataset

test_tl_pyt_notebook: venv
	@. $(ACTIVATE) && pip install -r docs/notebooks/transfer_learning/requirements.txt && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/image_classification/pytorch_image_classification/PyTorch_Image_Classification_Transfer_Learning.ipynb remove_for_custom_dataset && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/object_detection/pytorch_object_detection/PyTorch_Object_Detection_Transfer_Learning.ipynb remove_for_custom_dataset && \
	bash docs/notebooks/transfer_learning/run_tl_notebooks.sh $(CURDIR)/docs/notebooks/transfer_learning/text_classification/pytorch_text_classification/PyTorch_Text_Classifier_fine_tuning.ipynb remove_for_custom_dataset

test: lint unit_test

clean:
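
Both new targets chain their notebook runs with `&& \`, so each recipe is a single shell command list that stops at the first failing step. The sketch below illustrates that behaviour in isolation. `run_notebook` is a hypothetical stand-in for `run_tl_notebooks.sh`, whose real behaviour is not shown in this hunk, and the notebook names are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for run_tl_notebooks.sh (an assumption for illustration):
# "passes" unless the notebook name contains "broken".
run_notebook() {
    case "$1" in
        *broken*) echo "FAIL $1"; return 1 ;;
        *)        echo "PASS $1"; return 0 ;;
    esac
}

# Same shape as the Makefile recipes: every step runs only if the previous
# one succeeded, so third.ipynb is never reached here.
chain_output=$(run_notebook first.ipynb && run_notebook broken.ipynb && run_notebook third.ipynb)
chain_status=$?

echo "$chain_output"
echo "exit status: $chain_status"
```

Under this pattern, a failure in an early notebook hides the results of the later ones, and the non-zero status is what `make` would report for the target.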
0 comments on commit 36d8b40