Develop upstream sync 240521 #2548

Merged
merged 482 commits into from
Jun 5, 2024
09675d1
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
7201a35
Update TFRT dependency to use revision
tensorflower-gardener May 16, 2024
b55d4f1
Reverts adf106883e61d1c0d5583b08491fec91cd74420e
anlunx May 16, 2024
4e800be
Check for `GetEnableMemories()` outside `PyDeviceList::PopulateMemory…
junwhanahn May 16, 2024
099c465
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
c13cbeb
Propagate error to output if the input buffer has error.
May 16, 2024
04ab4ee
[IFRT] Wrap RemapPlan::mappings with shared_ptr
hyeontaek May 16, 2024
c0cbc2f
Update to Triton version that contains previously patched fixes.
chsigg May 16, 2024
b12b5e9
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
ecaea0e
PR #12538: [GPU] Fix OSS compilation of previously disabled tests.
sergachev May 16, 2024
6ffda33
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
6c538a6
[XLA] [NFC] Serialize all autotuning results
cheshire May 16, 2024
31c77e9
[XLA:GPU] Remove preprocessor `#if GOOGLE_CUDA`s and similar from Gem…
thomasjoerg May 16, 2024
1f94489
[XLA:GPU] Use IndexingMap::RemoveUnusedSymbols instead of WithoutRang…
tdanyluk May 16, 2024
6c2e476
Update GraphDef version to 1864.
tensorflower-gardener May 16, 2024
175131a
compat: Update forward compatibility horizon to 2024-05-16
tensorflower-gardener May 16, 2024
5720ac7
[XLA:GPU] Remove GpuStatus.
May 16, 2024
bf59de7
Automated Code Change
tensorflower-gardener May 16, 2024
b3388cd
Automated Code Change
tensorflower-gardener May 16, 2024
31f4770
PR #12501: [NFC] Fix mistypes in scatter expander.
sergachev May 16, 2024
9d9aa16
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
e641445
XNNPack weight cache provider: refactor code.
qukhan May 16, 2024
62fe318
[XLA:GPU] Remove GOOGLE_CUDA and TENSORFLOW_USE_ROCM guards from hlo_…
thomasjoerg May 16, 2024
b43e061
Optionally dump the MLIR compilation pipeline steps.
jreiffers May 16, 2024
f9d867e
XNNPack weight cache provider: update the internal options object to …
qukhan May 16, 2024
5855701
Merge pull request #67683 from Intel-tensorflow:aimran/oneDNN_3.4.1
tensorflower-gardener May 16, 2024
feae9a4
Use apply_indexing for input index computations.
jreiffers May 16, 2024
8621ef5
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
3fa9094
XNNPack weight cache: make logging message format consistent across t…
qukhan May 16, 2024
15da74a
[XLA:GPU] Remove GOOGLE_CUDA and TENSORFLOW_USE_ROCM defines from gem…
thomasjoerg May 16, 2024
ed5ebc6
Add a function to get delegate options using the C API.
qukhan May 16, 2024
228e8a4
PR #11895: Offloading 3/3: enable rematerialization using xla flags
zhenying-liu May 16, 2024
c9e8a7f
[XLA:GPU] Use IndexingMap's simplifier in tiling
tdanyluk May 16, 2024
dbf4e38
PR #12550: [ROCM] Avoid hard coded specifications for mi300 transpose…
Ruturaj4 May 16, 2024
f6a7049
PR #12494: [ROCM] GPU performance cost model specs for MI300
Ruturaj4 May 16, 2024
18d913b
Merge pull request #66087 from Intel-tensorflow:amin/int8-fp32-matmul…
tensorflower-gardener May 16, 2024
7ba79f0
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
3c92d1e
Add a new test case in pjrt_util_test
deqiangc May 16, 2024
92b7d2c
[XLA:GPU][NFC] Print non-trivial reduction for all-reduce key.
golechwierowicz May 16, 2024
73cf4c1
[XLA:GPU] Introduce TiledHloComputation
olegshyshkov May 16, 2024
23efbfe
[XLA:GPU][NFC] Remove unused includes in `symbolic_tile.h`.
bchetioui May 16, 2024
6bf19f6
Use StreamExecutorInterface::CreateEvent to create stream_executor::E…
klucke May 16, 2024
e8dcadf
Integrate LLVM at llvm/llvm-project@e27f9bb31984
gribozavr May 16, 2024
41b7a8b
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
0cdc6a5
[XLA:GPU] [NFC] Better error message for slow autotuned kernels
cheshire May 16, 2024
bdec6a7
Removed all the visibilities to `//learning/infra/mira/distributed` i…
May 16, 2024
0a7be3e
[XLA:GPU] #includes cleanup for xla/service/gpu/tests/gemm_rewrite_te…
kuym May 16, 2024
7f174f1
[XLA:GPU] Clang-tidy cleanup for xla/service/gpu/model/indexing_test_…
kuym May 16, 2024
8fb6297
[XLA:GPU] Clang-tidy cleanup for xla/service/gpu/model/indexing_analy…
kuym May 16, 2024
a6f1278
Remove duplicate dependency from PJRT test
beckerhe May 16, 2024
5b91995
[XLA:GPU] Clang-tidy cleanup for xla/service/gpu/model/gpu_performanc…
kuym May 16, 2024
be9bf7d
[XLA:GPU] Clang-tidy cleanup for xla/service/gpu/llvm_gpu_backend/gpu…
kuym May 16, 2024
5aade4c
Modify docker pull backoff in CI
MichaelHudgins May 16, 2024
c05f5ad
Change DebugString call to proto2::TextFormat::PrintToString.
matthiaskramm May 16, 2024
4c5b519
Make buffer_comparator_kernel target a gpu_kernel_library
beckerhe May 16, 2024
e1cd322
[XLA:GPU] Remove filter checking if an indexing map is tileable in `S…
bchetioui May 16, 2024
da32b63
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
8d97018
Enables additional solver output if we're removing user shardings, wh…
tensorflower-gardener May 16, 2024
62abb36
Add a command line parameter in benchmark_tool to set the XNNPack cac…
qukhan May 16, 2024
74fd07c
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
3321aa4
Update ops-related pbtxt files.
tensorflower-gardener May 16, 2024
2c05f78
Add missing rocm_config.h include to amdgpu_compiler.cc
beckerhe May 16, 2024
4ce1a06
Reverts c031655ca9514e7946bb87eda17cf5932c381a8b
tensorflower-gardener May 16, 2024
ebff15d
Allow overriding some methods in all_gather_decomposer.
Tongfei-Guo May 16, 2024
4308554
[xla_client] Port all custom calls used in test to FFI
ezhulenev May 16, 2024
3b1d506
Fix typo in comment.
anshumang May 16, 2024
17ea55b
Move schema_fbs_with_mutable to mlir/lite/schema.
tensorflower-gardener May 16, 2024
163edc2
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
fbe2a68
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
caf11ae
[xla] Update XLA custom call documentation to use XLA FFI
ezhulenev May 16, 2024
3eef59e
Introduce ExecutionStreamAssignment.
tensorflower-gardener May 16, 2024
d1459f2
Remove broken TSL tests from `TARGET_FILTERS`, tag them as `no_oss` i…
ddunl May 16, 2024
d835f88
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
01e821d
Integrate LLVM at llvm/llvm-project@a383b3cca338
MaskRay May 16, 2024
632a143
Remove the test only tag from odml converter target.
LukeBoyer May 16, 2024
1ab908f
Express tests tolerances for exhaustive tests relative to the accurac…
tensorflower-gardener May 16, 2024
1c969dd
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
1392171
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
6ec9024
Fix typo `s/Recieved/Received/`.
kenjitoyama May 16, 2024
9c5e56c
Use absl::Status instead of xla::Status now that they're identical.
klucke May 16, 2024
db6be90
Remove unnecessary flags from XLA ARM build now that hermetic Python …
ddunl May 16, 2024
ba66234
Integrate LLVM at llvm/llvm-project@c86a53d75995
MaskRay May 16, 2024
b1c954c
Fix the unused return value error in XNNPACK weightcache test.
jaeyoo May 17, 2024
9fedb4a
Update TFRT dependency to use revision
tensorflower-gardener May 17, 2024
7414b5a
Clean up the build tags for the gpu runner test target. Removes this …
tensorflower-gardener May 17, 2024
68a8100
Minor change to add logic for finding all lines with same id.
tensorflower-gardener May 17, 2024
ce5b30c
Enable weight-only quantization with StableHLO opset in TF Quantizer
doyeonkim0 May 17, 2024
5005bae
Open source pybind lib for populating sparse core layouts.
tensorflower-gardener May 17, 2024
d021d96
Merge pull request #67038 from tensorflow:dependabot/pip/werkzeug-3.0.3
tensorflower-gardener May 17, 2024
4136130
[IFRT] Use llvm::DenseSet instead of llvm::SmallSet when verifying de…
ICGog May 17, 2024
d8237f6
Move variable declaration into preprocessor branch
beckerhe May 17, 2024
55d75e9
Normalize layouts for Scatter.
akuegel May 17, 2024
88f115a
Fix string-replace mistake in ROCm code
beckerhe May 17, 2024
8a74312
Add missing ROCm dependencies in xla/tests
beckerhe May 17, 2024
24a30f6
Revive the PLATFORM_GOOGLE code in the ROCm wrappers
beckerhe May 17, 2024
07e48f1
Fix scatter bounds checks for unsigned indices.
jreiffers May 17, 2024
60c746c
Fix TF Grappler ROCm build issue
beckerhe May 17, 2024
98e84d2
Disable NCCL persistent plan allocator on ROCm
beckerhe May 17, 2024
f699a42
Add missing ROCm dependency in gpu/runtime
beckerhe May 17, 2024
7b3ae21
Automated Code Change
tensorflower-gardener May 17, 2024
3fd41a5
ROCm dependency fixes in service/gpu
beckerhe May 17, 2024
2a66297
Disable MultiOutputLoopFeedingMap with MLIR emitters.
jreiffers May 17, 2024
e22b292
Exclude //xla/service/gpu/... targets for CPU CI runs.
thomasjoerg May 17, 2024
04386ec
Move unique_indices from Gather to Scatter where it belongs.
akuegel May 17, 2024
2fdd8e9
Integrate LLVM at llvm/llvm-project@5b7088c3619e
gribozavr May 17, 2024
245ed12
Fix ROCm XLA profiling code
beckerhe May 17, 2024
69c461a
PR #11767: Add SYCL build script
ShengYang1 May 17, 2024
d3ead79
compat: Update forward compatibility horizon to 2024-05-17
tensorflower-gardener May 17, 2024
4cdcd36
Update GraphDef version to 1865.
tensorflower-gardener May 17, 2024
fed68cd
Automated Code Change
tensorflower-gardener May 17, 2024
d065e49
Fix various code style and build issues in stream_executor/rocm
beckerhe May 17, 2024
eb2fec4
Fix compiler errors in TFLite XNNPack delegate weight_cache_test.
qukhan May 17, 2024
8874bae
[XLA:GPU] Support fusion of dynamic-slice into triton gemm.
tdanyluk May 17, 2024
6c1a8e4
Followup to Scatter Layout normalization
akuegel May 17, 2024
abb166c
Automated Code Change
tensorflower-gardener May 17, 2024
cd5779d
PR #12515: [ROCm] Fix build break introduced in ffa7bb5 and df736d7
hsharsha May 17, 2024
ce8a3bd
Fix dependencies in ROCM Triton emitter
beckerhe May 17, 2024
a316efc
Automated Code Change
tensorflower-gardener May 17, 2024
a81a393
Integrate LLVM at llvm/llvm-project@5a20a07fce88
gribozavr May 17, 2024
c287cd4
IsSimplifiedScatter should also return false if update_window_dims is…
akuegel May 17, 2024
ce41ee4
PR #12520: [GPU] Add new flag xla_gpu_exclude_nondeterministic_ops.
sergachev May 17, 2024
2a6c0d6
[XLA:GPU] Add support for destructuring collapsing reshapes in symbol…
bchetioui May 17, 2024
d12d0e0
Integrate LLVM at llvm/llvm-project@a68d20e98605
gribozavr May 17, 2024
42ac0dd
Clean-up TFLite benchmark tools TFLite resolver.
qukhan May 17, 2024
4d73219
Integrate Triton up to 25b4212a9
gflegar May 17, 2024
ee44de4
[XLA:GPU][NFC] Make SymbolicTile's `DestructureSummation` util more i…
bchetioui May 17, 2024
45fa10c
Make returned future immdiately ready if it is TPU used only.
deqiangc May 17, 2024
d1d37b5
[stream_executor:host] Rework HostExecutor to avoid depending on LLVM…
ezhulenev May 17, 2024
317bee1
Fix up some unreachable code warnings on mobile platforms
tensorflower-gardener May 17, 2024
37dba8b
disables `detect_odr_violation` for `validator_runner_test`
tensorflower-gardener May 17, 2024
db3425c
Migrate away from llvm::StringRef::equals
tensorflower-gardener May 17, 2024
e116891
[IFRT] Allow kReuseInput in RemapArrays if the number of input arrays…
hyeontaek May 17, 2024
b960aaf
PR #12637: [ROCm] Fix build break due to 63c33b and f5ab2a4
hsharsha May 17, 2024
d4dc6cd
[XLA:GPU][NFC] Print reachability between two instructions on VLOG.
golechwierowicz May 17, 2024
dcb77c5
[Function inlining] Add flag to enable pruning before function calls …
mrry May 17, 2024
a8b7969
Remove duplicate test macros from TSL
beckerhe May 17, 2024
5e1bc5b
Integrate LLVM at llvm/llvm-project@371eccd5dfed
gribozavr May 17, 2024
bc8d197
[xla:python] NFC: Fix all warnings in xla_compiler.cc
ezhulenev May 17, 2024
59683af
Use default relative error tolerance for Cbrt.
tensorflower-gardener May 17, 2024
43e54f6
Improve errors upon service shutdown.
tensorflower-gardener May 17, 2024
d928140
Remove reference to `mlir::Operations` in `Thunks`.
tensorflower-gardener May 17, 2024
da07e34
[XLA:CPU] Remove unnecessary include of absl/log/check.h.
sgerrard May 17, 2024
df8d64e
Fix Flatbuffer's upstream error on `GetTemporaryPointer()`
jaeyoo May 17, 2024
672697d
Integrate LLVM at llvm/llvm-project@1e5f29af81a5
MaskRay May 17, 2024
7325fef
Use StreamExecutorInterface::CreateEvent to create events in send & r…
klucke May 17, 2024
7423675
Tag `//xla/tools/hlo_opt:hlo_opt_gpu` with the `gpu` tag to avoid bui…
ddunl May 17, 2024
a1e0516
Migrate coord service and tests to use absl libraries (mutex, thread …
tensorflower-gardener May 18, 2024
e8372d5
Adopts the syntax 'xla_tpu_auto_spmd_partitioning_memory_budget_ratio…
tensorflower-gardener May 18, 2024
17c7b86
Fix tf.lite's lite_v2_test by using the new Keras 3 API.
jaeyoo May 18, 2024
b409682
[XLA:SPMD] Support partitioning kCall.
Tongfei-Guo May 18, 2024
d3f2cdb
In PjRtStreamExecutorBuffer::Delete, fix a bug causing memory corrupt…
yifjiang May 18, 2024
4a6548c
[xla:ffi] Fix return types in `xla::ffi::Expected<T, E>`.
tensorflower-gardener May 18, 2024
3c5c0ed
[XLA] Support more cases in IotaTileAssignment::Transpose.
cezheng May 18, 2024
2d26ae9
Add ReadFileTraceMetadata function to support trace viewer processes …
zzzaries May 18, 2024
52da3cb
Automated Code Change
tensorflower-gardener May 18, 2024
12dfa7e
Fix keras model saving error into TF SavedModel format.
jaeyoo May 18, 2024
a254edd
Fix nightly build / cmake build error due to missing argument
jaeyoo May 18, 2024
62593a1
Update GraphDef version to 1866.
tensorflower-gardener May 18, 2024
80e0637
compat: Update forward compatibility horizon to 2024-05-18
tensorflower-gardener May 18, 2024
03e7962
Add unit test coverage for IFRT call op kernel impl
tensorflower-gardener May 18, 2024
4c655fd
Automated Code Change
tensorflower-gardener May 18, 2024
530da27
Move `ForAllThunks` to its own file.
tensorflower-gardener May 18, 2024
7020566
Extract `ExecutionStreamIds` from `Thunks` instead of the now-depreca…
tensorflower-gardener May 18, 2024
4f584ae
Automated Code Change
tensorflower-gardener May 18, 2024
113315c
Automated Code Change
tensorflower-gardener May 18, 2024
81e1e20
Automated Code Change
tensorflower-gardener May 18, 2024
ed39df3
disable flaky mwms_pjrt_gpu_test_xla_2gpu test
SeeForTwo May 18, 2024
c57ab0a
[XLA] Error out when encountering errors during flag parsing
cheshire May 18, 2024
626f974
compat: Update forward compatibility horizon to 2024-05-19
tensorflower-gardener May 19, 2024
e667913
Update GraphDef version to 1867.
tensorflower-gardener May 19, 2024
08d2478
Cleanup dependency on tracing.h
cliveverghese May 19, 2024
8776f75
Automated Code Change
tensorflower-gardener May 19, 2024
e33a5ea
Migrate coord grpc client and service to use absl libraries (mutex, t…
tensorflower-gardener May 19, 2024
ac1b250
Refactor common test code to a util
tensorflower-gardener May 20, 2024
ff7a81e
Automated Code Change
tensorflower-gardener May 20, 2024
f7d9db7
Automated Code Change
tensorflower-gardener May 20, 2024
887a686
Automated Code Change
tensorflower-gardener May 20, 2024
4cf0de1
Update GraphDef version to 1868.
tensorflower-gardener May 20, 2024
52f378f
compat: Update forward compatibility horizon to 2024-05-20
tensorflower-gardener May 20, 2024
718f820
[xla:ffi] Fix msan error for //xla/tests:custom_call_test_cpu.
penpornk May 20, 2024
312661b
Print mismatches for UnorderedElements() of different sizes.
tensorflower-gardener May 20, 2024
d0e5774
Add float/double template specializations for `TensorEq` matcher.
tensorflower-gardener May 20, 2024
62c13e8
Compute stats should clone the feature configs before updating them.
tensorflower-gardener May 20, 2024
943e193
Clean up include and build file
tensorflower-gardener May 20, 2024
c0da871
Migrate away from llvm::StringRef::equals
tensorflower-gardener May 20, 2024
0f3594e
Update np.float_ and np.complex_ to match Numpy 2.0.
kanglant May 20, 2024
9631d5f
Reenable/tag previously broken tests on ARM builds of XLA
ddunl May 20, 2024
ac0724c
Simplify JAX lowering rules for cumulative sum
cheshire May 20, 2024
647277e
Remove unnecessary uses of `gpu_any` backend
ddunl May 20, 2024
138d910
Use AllocateDestinationBuffer instead of CreateUninitializedBuffer for
yifjiang May 20, 2024
e817f66
Remove the use of xla::OkStatus now that it's just an alias to absl::…
klucke May 20, 2024
777c11a
Use absl::Status instead of xla::Status now that they're identical.
klucke May 20, 2024
dbb2747
Use absl::Status instead of xla::Status now that they're identical.
klucke May 20, 2024
190d56b
Temporarily disable xnnpack cache test on Android.
sirakiin May 20, 2024
8f3b195
[XLA] Support transposing iota tile assignment with tile dimensions c…
cezheng May 20, 2024
6fda9f2
Add missing parentheses in XLA custom calls doc.
dfm May 20, 2024
c67475d
Replace the use of xla::OkStatus with absl::OkStatus now that they're…
klucke May 20, 2024
f2fd583
Replace the use of xla::OkStatus with absl::OkStatus now that they're…
klucke May 20, 2024
11fd24a
Move Event creation into each StreamExecutorInterface derived class.
klucke May 20, 2024
1dff6b3
Remove `PjRtStreamExecutorMemorySpace` specialization from `PjRtStrea…
junwhanahn May 20, 2024
c00fe0a
Automated Code Change
klucke May 20, 2024
42f89ec
Cleanup dependency on tracing.h
cliveverghese May 20, 2024
f4b0af1
Use == instead of equals method (deprecated)
May 20, 2024
ba8c22b
Replace the use of xla::OkStatus with absl::OkStatus now that they're…
klucke May 20, 2024
b8f00f0
Remove never called TpuExecutor_DeallocateEvent.
klucke May 20, 2024
1d1b6a7
When pruning function bodies, enable pruning side-effect-free "statef…
mrry May 20, 2024
98a7f0b
Automated Code Change
tensorflower-gardener May 20, 2024
651e9b8
Add SavedModel to StableHLO Converter to TensorFlow pip package
sdasgup3 May 20, 2024
43d498c
Automated Code Change
tensorflower-gardener May 20, 2024
4825606
Add support for building and pushing to Artifact Registry for Linux A…
quoctruong May 20, 2024
dec844f
Add restore_uid to TPUEmbeddingShardedVariable.
tensorflower-gardener May 20, 2024
e03a6f2
[XLA:GPU] Clang-tidy cleanup for xla/service/gpu/amdgpu_compiler.cc
kuym May 20, 2024
e922bfc
[XLA:GPU] Clang-tidy cleanup for xla/service/gpu/gpu_conv_rewriter_te…
kuym May 21, 2024
bdfcc50
PR #12848: [XLA:GPU] Add force_update function for conditional commands
shawnwang18 May 21, 2024
32e5b6f
[XLA:GPU] Clang-tidy cleanup for xla/service/gpu/cudnn_workspace_rewr…
kuym May 21, 2024
97a794b
Introduce MakeConstantWithShape in hlo_creation_util file.
fhoushmand May 21, 2024
0cefd08
PR #12845: [XLA:GPU] add workspace rewrite for FP8 gemm
shawnwang18 May 21, 2024
a51ee7d
Internal change only.
jreiffers May 21, 2024
fde16c7
Rename Buffer to GetBuffer.
jreiffers May 21, 2024
f53aae3
Automated Code Change
tensorflower-gardener May 21, 2024
ea5adb0
MLIR emitters: Vectorize transposes with small element types.
jreiffers May 21, 2024
aa8bb6a
Automated Code Change
tensorflower-gardener May 21, 2024
c0ac2d3
Update GraphDef version to 1869.
tensorflower-gardener May 21, 2024
1eb72a6
compat: Update forward compatibility horizon to 2024-05-21
tensorflower-gardener May 21, 2024
688998a
Add unbounded dynamism test for FftOp.
ghpvnist May 21, 2024
f30bdc8
Automated Code Change
tensorflower-gardener May 21, 2024
104615d
Re-land: [XLA:GPU] Store fusion_roots and fusion_heroes as HloInstruc…
olegshyshkov May 21, 2024
b204051
[Triton] Re-enable createRemoveLayoutConversionPass.
Moerafaat May 21, 2024
aef9a9e
[xla:gpu] Fail gracefully (i.e. don't segfault) if Triton MLIR isn't …
chr1sj0nes May 21, 2024
1f5ff66
Automated Code Change
akuegel May 21, 2024
39a2d69
Replace all remaining uses of affine.apply with apply_indexing.
jreiffers May 21, 2024
35bc8b9
Remove the use of xla::OkStatus now that it's just an alias to absl::…
klucke May 21, 2024
6a90cce
Reverts 42ac0dde8f7855c87dfea2217b03b8ee7e829ae9
qukhan May 21, 2024
a68d77d
PR #12897: [GPU] Fix cuDNN GEMM scalar constants test condition.
sergachev May 21, 2024
7990149
Rename `RewriteSumIf` to `RemoveSummands`.
jreiffers May 21, 2024
da437e3
[Triton] Fix a use-after-free bug in LinearLayout::compose.
olegshyshkov May 21, 2024
4001ea6
Tighten reduction loop bounds for MLIR emitter.
jreiffers May 21, 2024
0f7658d
Split sum-modification from sum-visitation.
jreiffers May 21, 2024
aa9f630
Move schema_fbs_with_reflection to mlir/lite/schema.
tensorflower-gardener May 21, 2024
d83a73e
Migrate away from llvm::StringRef::equals
tensorflower-gardener May 21, 2024
54cfe0f
Align DRQ TransposeConv XNNPack test with reference TFLite implementa…
ablavatski May 21, 2024
1ce2d0f
Merge remote-tracking branch 'upstream/master' into develop-upstream-…
draganmladjenovic May 21, 2024
84d5270
Merge fixes after 240521
draganmladjenovic May 21, 2024
6af30d6
Fix tensorflow/core/kernels/conv_ops_gpu.cc build failure
draganmladjenovic May 21, 2024
cf09f6a
[ROCM][DO NOT UPSTREAM] Pin libamdhip64.so version
draganmladjenovic May 25, 2024
4da4f80
Fix //tensorflow/core/common_runtime/gpu:gpu_device_test_gpu failure
draganmladjenovic May 28, 2024
4736848
[ROCm] Fix profiler deadlock due to 3d7a4720b02e643faccd13f9a450213b5…
draganmladjenovic May 29, 2024
cc4a257
Disable some failing XLA tests
draganmladjenovic Jun 3, 2024
b3fcb02
Fix build error in third_party/xla/xla/service/memory_space_assignmen…
draganmladjenovic Jun 4, 2024
a344018
Fix xla/service/gpu/gpu_compiler_test.cc build
draganmladjenovic Jun 4, 2024
The diff you're trying to view is too large. We only load the first 3000 changed files.
3 changes: 3 additions & 0 deletions RELEASE.md
@@ -31,6 +31,7 @@
been added to TF binary distributions (Python wheels).
* Replace `DebuggerOptions` of TensorFlow Quantizer, and migrate to
`DebuggerConfig` of StableHLO Quantizer.
+ * Add TensorFlow to StableHLO converter to TensorFlow pip package.

## Keras

@@ -87,6 +88,8 @@
* The Python TF Lite Interpreter bindings now have an option
`experimental_default_delegate_latest_features` to enable all default
delegate features.
+ * Flatbuffer version update:
+ * `GetTemporaryPointer()` bug fixed.

* `tf.data`
* Add `wait` to `tf.data.Dataset.load`. If `True`, for snapshots written
8 changes: 7 additions & 1 deletion ci/official/containers/linux_arm64/build.sh
@@ -40,11 +40,15 @@ else
fi
fi

+ # TODO(b/341050361): When these steps are verified, removed the GCR image code.
+ AR_IMAGE_PATH="us-central1-docker.pkg.dev/tensorflow-sigs/tensorflow/build-arm64"
+
# Build for both JAX and TF usage. We do these in one place because they share
# almost all of the same cache layers
export DOCKER_BUILDKIT=1
for target in jax tf; do
IMAGE="gcr.io/tensorflow-sigs/build-arm64:$target-$TAG"
+ AR_IMAGE="$AR_IMAGE_PATH:$target-$TAG"
docker pull "$IMAGE" || true
# Due to some flakiness of resources pulled in the build, allow the docker
# command to reattempt build a few times in the case of failure (b/302558736)
@@ -55,7 +59,7 @@ for target in jax tf; do
--build-arg REQUIREMENTS_FILE=jax.requirements.txt \
--target=$target \
--cache-from "$IMAGE" \
- -t "$IMAGE" . && break
+ -t "$IMAGE" -t "$AR_IMAGE" . && break
done
final=$?
if [ $final -ne 0 ]; then
@@ -66,5 +70,7 @@ for target in jax tf; do
if [[ -n "$KOKORO_BUILD_ID" ]]; then
gcloud auth configure-docker
docker push "$IMAGE"
+ gcloud auth configure-docker us-central1-docker.pkg.dev
+ docker push "$AR_IMAGE"
fi
done
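Reviewer note on the dual-registry tagging in this file: the change passes two `-t` flags so that one build produces both the GCR and the Artifact Registry image names. A minimal sketch of how the two names are derived from the same target and tag is shown here. The registry paths are taken from the diff; the helper `build_tag_args`, the variable `GCR_IMAGE_PATH`, and the array `TAG_ARGS` are hypothetical names introduced for illustration and are not part of this PR.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: build_tag_args, GCR_IMAGE_PATH and TAG_ARGS are
# hypothetical names, not part of the PR.
GCR_IMAGE_PATH="gcr.io/tensorflow-sigs/build-arm64"
AR_IMAGE_PATH="us-central1-docker.pkg.dev/tensorflow-sigs/tensorflow/build-arm64"

# Fill the global array TAG_ARGS with one "-t <image>" pair per registry,
# so a single `docker build` can tag the result for both destinations.
build_tag_args() {
  local target="$1" tag="$2"
  TAG_ARGS=(
    -t "$GCR_IMAGE_PATH:$target-$tag"
    -t "$AR_IMAGE_PATH:$target-$tag"
  )
}

# Print the arguments that would be handed to `docker build` for one target.
build_tag_args jax latest
printf '%s\n' "${TAG_ARGS[@]}"
```

Keeping both names derived from one `$target-$TAG` suffix means the two registries cannot drift apart while the GCR path is still in use.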
5 changes: 3 additions & 2 deletions ci/official/utilities/setup_docker.sh
@@ -14,11 +14,12 @@
# limitations under the License.
# ==============================================================================
if [[ "$TFCI_DOCKER_PULL_ENABLE" == 1 ]]; then
- # Simple retry logic for docker-pull errors. Sleeps for 15s if a pull fails.
+ # Simple retry logic for docker-pull errors. Sleeps if a pull fails.
# Pulling an already-pulled container image will finish instantly, so
# repeating the command costs nothing.
docker pull "$TFCI_DOCKER_IMAGE" || sleep 15
- docker pull "$TFCI_DOCKER_IMAGE" || sleep 15
+ docker pull "$TFCI_DOCKER_IMAGE" || sleep 30
+ docker pull "$TFCI_DOCKER_IMAGE" || sleep 60
docker pull "$TFCI_DOCKER_IMAGE"
fi
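
Reviewer note on the retry change in this file: the escalating delays (15s, 30s, 60s, then a final attempt) could also be expressed as a loop. The sketch below is illustrative only. `retry_with_backoff` is a hypothetical helper that does not appear in this PR, and it differs from the unrolled script in one respect: it stops at the first success, whereas the script re-runs the pull each time and relies on an already-pulled image finishing instantly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): run a command, sleeping for each
# listed delay after a failed attempt, then make one final attempt whose
# exit status becomes the function's exit status.
retry_with_backoff() {
  local delays="$1"
  shift
  local delay
  # $delays is intentionally unquoted so it splits into individual delays.
  for delay in $delays; do
    "$@" && return 0
    sleep "$delay"
  done
  "$@"
}

# Usage mirroring the delays in this hunk (commented out so the sketch has
# no side effects when sourced):
# retry_with_backoff "15 30 60" docker pull "$TFCI_DOCKER_IMAGE"
```

The unrolled form in the PR is arguably easier to read in a short CI script; a helper like this only pays off if the delay schedule is reused or tuned often.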

6 changes: 3 additions & 3 deletions requirements_lock_3_10.txt
@@ -522,9 +522,9 @@ urllib3==2.2.0 \
--hash=sha256:051d961ad0c62a94e50ecf1af379c3aba230c66c710493493560c0c223c49f20 \
--hash=sha256:ce3711610ddce217e6d113a2732fafad960a03fd0318c91faa79481e35c11224
# via requests
- werkzeug==3.0.1 \
- --hash=sha256:507e811ecea72b18a404947aded4b3390e1db8f826b494d76550ef45bb3b1dcc \
- --hash=sha256:90a285dc0e42ad56b34e696398b8122ee4c681833fb35b8334a095d82c56da10
+ werkzeug==3.0.3 \
+ --hash=sha256:097e5bfda9f0aba8da6b8545146def481d06aa7d3266e7448e2cccf67dd8bd18 \
+ --hash=sha256:fc9645dc43e03e4d630d23143a04a7f947a9a3b5727cd535fdfe155a17cc48c8
# via tb-nightly
wheel==0.41.3 \
--hash=sha256:488609bc63a29322326e05560731bf7bfea8e48ad646e1f5e40d366607de0942 \
6 changes: 3 additions & 3 deletions requirements_lock_3_11.txt
@@ -522,9 +522,9 @@ urllib3==2.2.0 \
--hash=sha256:051d961ad0c62a94e50ecf1af379c3aba230c66c710493493560c0c223c49f20 \
--hash=sha256:ce3711610ddce217e6d113a2732fafad960a03fd0318c91faa79481e35c11224
# via requests
- werkzeug==3.0.1 \
- --hash=sha256:507e811ecea72b18a404947aded4b3390e1db8f826b494d76550ef45bb3b1dcc \
- --hash=sha256:90a285dc0e42ad56b34e696398b8122ee4c681833fb35b8334a095d82c56da10
+ werkzeug==3.0.3 \
+ --hash=sha256:097e5bfda9f0aba8da6b8545146def481d06aa7d3266e7448e2cccf67dd8bd18 \
+ --hash=sha256:fc9645dc43e03e4d630d23143a04a7f947a9a3b5727cd535fdfe155a17cc48c8
# via tb-nightly
wheel==0.41.3 \
--hash=sha256:488609bc63a29322326e05560731bf7bfea8e48ad646e1f5e40d366607de0942 \
6 changes: 3 additions & 3 deletions requirements_lock_3_12.txt
@@ -530,9 +530,9 @@ urllib3==2.2.0 \
--hash=sha256:051d961ad0c62a94e50ecf1af379c3aba230c66c710493493560c0c223c49f20 \
--hash=sha256:ce3711610ddce217e6d113a2732fafad960a03fd0318c91faa79481e35c11224
# via requests
- werkzeug==3.0.1 \
- --hash=sha256:507e811ecea72b18a404947aded4b3390e1db8f826b494d76550ef45bb3b1dcc \
- --hash=sha256:90a285dc0e42ad56b34e696398b8122ee4c681833fb35b8334a095d82c56da10
+ werkzeug==3.0.3 \
+ --hash=sha256:097e5bfda9f0aba8da6b8545146def481d06aa7d3266e7448e2cccf67dd8bd18 \
+ --hash=sha256:fc9645dc43e03e4d630d23143a04a7f947a9a3b5727cd535fdfe155a17cc48c8
# via tb-nightly
wheel==0.41.3 \
--hash=sha256:488609bc63a29322326e05560731bf7bfea8e48ad646e1f5e40d366607de0942 \
6 changes: 3 additions & 3 deletions requirements_lock_3_9.txt
@@ -526,9 +526,9 @@ urllib3==2.2.0 \
--hash=sha256:051d961ad0c62a94e50ecf1af379c3aba230c66c710493493560c0c223c49f20 \
--hash=sha256:ce3711610ddce217e6d113a2732fafad960a03fd0318c91faa79481e35c11224
# via requests
- werkzeug==3.0.1 \
- --hash=sha256:507e811ecea72b18a404947aded4b3390e1db8f826b494d76550ef45bb3b1dcc \
- --hash=sha256:90a285dc0e42ad56b34e696398b8122ee4c681833fb35b8334a095d82c56da10
+ werkzeug==3.0.3 \
+ --hash=sha256:097e5bfda9f0aba8da6b8545146def481d06aa7d3266e7448e2cccf67dd8bd18 \
+ --hash=sha256:fc9645dc43e03e4d630d23143a04a7f947a9a3b5727cd535fdfe155a17cc48c8
# via tb-nightly
wheel==0.41.3 \
--hash=sha256:488609bc63a29322326e05560731bf7bfea8e48ad646e1f5e40d366607de0942 \
2 changes: 2 additions & 0 deletions tensorflow/BUILD
@@ -1382,6 +1382,7 @@ tf_cc_shared_library(
"//tensorflow/compiler/mlir/quantization/common/quantization_lib:quantization_config",
"//tensorflow/compiler/mlir/lite/sparsity:sparsify_model",
"//tensorflow/compiler/mlir/quantization/stablehlo/python:pywrap_quantization_lib_impl",
"//tensorflow/compiler/mlir/quantization/tensorflow_to_stablehlo/python:pywrap_tensorflow_to_stablehlo_lib_impl",
"//tensorflow/compiler/mlir/quantization/tensorflow/calibrator:custom_aggregator_op",
"//tensorflow/compiler/mlir/quantization/tensorflow/python:quantize_model_cc_impl",
"//tensorflow/compiler/mlir/quantization/tensorflow:passes",
@@ -1416,6 +1417,7 @@ tf_cc_shared_library(
"//tensorflow/core/grappler:grappler_item_builder",
"//tensorflow/core/kernels:data_service_ops",
"//tensorflow/core/kernels:dataset_ops",
"//tensorflow/core/tpu/kernels:sparse_core_layout",
"//tensorflow/core/platform:logging",
"//tensorflow/core/platform:path",
"//tensorflow/core/platform:stacktrace_handler",
3 changes: 3 additions & 0 deletions tensorflow/c/experimental/ops/gen/common/case_format.cc
@@ -14,6 +14,9 @@ limitations under the License.
==============================================================================*/
#include "tensorflow/c/experimental/ops/gen/common/case_format.h"

#include "tensorflow/core/platform/str_util.h"
#include "tensorflow/core/platform/types.h"

namespace tensorflow {
namespace generator {

Original file line number Diff line number Diff line change
@@ -15,6 +15,7 @@ limitations under the License.
#include "tensorflow/c/experimental/ops/gen/common/case_format.h"

#include "tensorflow/core/platform/test.h"
#include "tensorflow/core/platform/types.h"

namespace tensorflow {
namespace generator {
10 changes: 8 additions & 2 deletions tensorflow/c/experimental/ops/gen/common/controller.cc
@@ -15,11 +15,17 @@ limitations under the License.
#include "tensorflow/c/experimental/ops/gen/common/controller.h"

#include "absl/strings/substitute.h"
#include "tensorflow/c/experimental/ops/gen/common/path_config.h"
#include "tensorflow/c/experimental/ops/gen/common/source_code.h"
#include "tensorflow/c/experimental/ops/gen/model/op_spec.h"
#include "tensorflow/core/framework/api_def.pb.h"
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/lib/io/path.h"
#include "tensorflow/core/lib/strings/str_util.h"
#include "tensorflow/core/framework/op_def.pb.h"
#include "tensorflow/core/framework/op_gen_lib.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/path.h"
#include "tsl/platform/status.h"

namespace tensorflow {
namespace generator {
2 changes: 2 additions & 0 deletions tensorflow/c/experimental/ops/gen/common/path_config.cc
@@ -16,7 +16,9 @@ limitations under the License.

#include <iostream>

#include "absl/strings/str_join.h"
#include "tensorflow/core/lib/strings/str_util.h"
#include "tensorflow/core/platform/types.h"

namespace tensorflow {
namespace generator {
3 changes: 3 additions & 0 deletions tensorflow/c/experimental/ops/gen/common/source_code.cc
@@ -14,9 +14,12 @@ limitations under the License.
==============================================================================*/
#include "tensorflow/c/experimental/ops/gen/common/source_code.h"

#include "absl/strings/ascii.h"
#include "absl/strings/match.h"
#include "absl/strings/str_cat.h"
#include "tensorflow/core/lib/strings/str_util.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/stringpiece.h"

namespace tensorflow {
namespace generator {
2 changes: 2 additions & 0 deletions tensorflow/c/experimental/ops/gen/common/view_util.cc
@@ -14,7 +14,9 @@ limitations under the License.
==============================================================================*/
#include "tensorflow/c/experimental/ops/gen/common/view_util.h"

#include "absl/strings/str_join.h"
#include "absl/strings/substitute.h"
#include "tensorflow/core/platform/types.h"

namespace tensorflow {
namespace generator {
28 changes: 8 additions & 20 deletions tensorflow/c/experimental/stream_executor/stream_executor.cc
@@ -407,10 +407,6 @@ class CStreamExecutor : public StreamExecutor {
return stream_executor_->host_callback(&device_, stream_handle,
&HostCallbackTrampoline, ctx);
}
absl::Status AllocateEvent(Event* event) override {
DCHECK(event != nullptr);
return static_cast<CEvent*>(event->implementation())->Create();
}
absl::Status DeallocateEvent(Event* event) override {
static_cast<CEvent*>(event->implementation())->Destroy();
return absl::OkStatus();
@@ -438,14 +434,6 @@ class CStreamExecutor : public StreamExecutor {
stream_executor_->get_event_status(&device_, event_handle);
return SEEventStatusToEventStatus(event_status);
}
bool AllocateStream(Stream* stream) override {
DCHECK(stream != nullptr);
absl::Status status =
static_cast<CStream*>(stream->implementation())->Create();
// TODO(annarev): update AllocateStream to return status instead
// (similar to AllocateEvent).
return status.ok();
}
void DeallocateStream(Stream* stream) override {
static_cast<CStream*>(stream->implementation())->Destroy();
}
@@ -559,18 +547,18 @@ class CStreamExecutor : public StreamExecutor {
return builder.Build();
}

// Each call creates a new instance of the platform-specific implementation of
// the corresponding interface type.
std::unique_ptr<EventInterface> CreateEventImplementation() override {
return std::unique_ptr<EventInterface>(
new CEvent(&device_, stream_executor_));
absl::StatusOr<std::unique_ptr<Event>> CreateEvent() override {
auto c_event = std::make_unique<CEvent>(&device_, stream_executor_);
TF_RETURN_IF_ERROR(c_event->Create());
return std::make_unique<Event>(this, std::move(c_event));
}

absl::StatusOr<std::unique_ptr<Stream>> CreateStream(
std::optional<std::variant<StreamPriority, int>> priority =
std::nullopt) override {
auto stream = std::make_unique<Stream>(
this, std::make_unique<CStream>(&device_, stream_executor_));
TF_RETURN_IF_ERROR(stream->Initialize(priority));
auto c_stream = std::make_unique<CStream>(&device_, stream_executor_);
TF_RETURN_IF_ERROR(c_stream->Create());
auto stream = std::make_unique<Stream>(this, std::move(c_stream));
return std::move(stream);
}
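
Note on the hunk above: `CreateEvent()` and `CreateStream()` now construct the backend object, call its `Create()`, and wrap it in the public type only once that call succeeds, so a caller never holds a half-initialized `Event` or `Stream`. The following is a minimal standalone sketch of that pattern, not the StreamExecutor code. `OwnedOrError`, `FakeEvent`, and `CreateFakeEvent` are hypothetical names made up for illustration; `OwnedOrError` only stands in for `absl::StatusOr<std::unique_ptr<T>>` so the sketch builds without Abseil.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for absl::StatusOr<std::unique_ptr<T>>.
template <typename T>
struct OwnedOrError {
  std::unique_ptr<T> value;  // Non-null on success.
  std::string error;         // Non-empty on failure.
  bool ok() const { return value != nullptr; }
};

// Hypothetical backend resource, playing the role CEvent plays in the diff.
class FakeEvent {
 public:
  explicit FakeEvent(bool fail) : fail_(fail) {}

  // Returns an empty string on success, an error message otherwise.
  std::string Create() {
    if (fail_) return "backend create_event failed";
    created_ = true;
    return "";
  }

  bool created() const { return created_; }

 private:
  bool fail_;
  bool created_ = false;
};

// Factory: construct, initialize, and only then hand ownership to the caller.
// On failure the partially built object is destroyed before returning.
OwnedOrError<FakeEvent> CreateFakeEvent(bool fail) {
  auto event = std::make_unique<FakeEvent>(fail);
  std::string err = event->Create();
  if (!err.empty()) return {nullptr, err};
  return {std::move(event), ""};
}
```

With this shape, the separate `Init()` call that the old tests had to remember disappears: a successful return already implies an initialized object.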

19 changes: 8 additions & 11 deletions tensorflow/c/experimental/stream_executor/stream_executor_test.cc
@@ -342,11 +342,10 @@ TEST_F(StreamExecutorTest, CreateEvent) {

StreamExecutor* executor = GetExecutor(0);
ASSERT_FALSE(event_created);
Event* event = new Event(executor);
event->Init();
TF_ASSERT_OK_AND_ASSIGN(auto event, executor->CreateEvent());
ASSERT_TRUE(event_created);
ASSERT_FALSE(event_deleted);
delete event;
event.reset();
ASSERT_TRUE(event_deleted);
}

@@ -365,11 +364,10 @@ TEST_F(StreamExecutorTest, PollForEventStatus) {
};

StreamExecutor* executor = GetExecutor(0);
Event event(executor);
event.Init();
ASSERT_EQ(event.PollForStatus(), Event::Status::kComplete);
TF_ASSERT_OK_AND_ASSIGN(auto event, executor->CreateEvent());
ASSERT_EQ(event->PollForStatus(), Event::Status::kComplete);
event_status = SE_EVENT_ERROR;
ASSERT_EQ(event.PollForStatus(), Event::Status::kError);
ASSERT_EQ(event->PollForStatus(), Event::Status::kError);
}

TEST_F(StreamExecutorTest, RecordAndWaitForEvent) {
@@ -403,14 +401,13 @@ TEST_F(StreamExecutorTest, RecordAndWaitForEvent) {
};

StreamExecutor* executor = GetExecutor(0);
Event event(executor);
event.Init();
TF_ASSERT_OK_AND_ASSIGN(auto event, executor->CreateEvent());
TF_ASSERT_OK_AND_ASSIGN(auto stream, executor->CreateStream());
ASSERT_FALSE(record_called);
TF_ASSERT_OK(stream->RecordEvent(&event));
TF_ASSERT_OK(stream->RecordEvent(event.get()));
ASSERT_TRUE(record_called);
ASSERT_FALSE(wait_called);
TF_ASSERT_OK(stream->WaitFor(&event));
TF_ASSERT_OK(stream->WaitFor(event.get()));
ASSERT_TRUE(wait_called);
}

3 changes: 3 additions & 0 deletions tensorflow/compiler/jit/BUILD
@@ -199,6 +199,7 @@ cc_library(
"//tensorflow/core/tpu:tpu_node_device_util",
"//tensorflow/core/tpu:virtual_device",
"@com_google_absl//absl/types:optional",
"@local_tsl//tsl/platform:statusor",
"@local_xla//xla/stream_executor/tpu:c_api_conversions",
"@local_xla//xla/stream_executor/tpu:status_helper",
"@local_xla//xla/stream_executor/tpu:tpu_api",
@@ -314,6 +315,7 @@ cc_library(
"//tensorflow/core/common_runtime:dma_helper",
"//tensorflow/core/framework:allocator",
"@com_google_absl//absl/synchronization",
"@local_tsl//tsl/platform:statusor",
"@local_xla//xla:util",
"@local_xla//xla/client:global_data",
"@local_xla//xla/client:local_client",
@@ -1149,6 +1151,7 @@ cc_library(
"@com_google_absl//absl/algorithm:container",
"@com_google_absl//absl/container:flat_hash_map",
"@com_google_absl//absl/container:flat_hash_set",
"@com_google_absl//absl/numeric:bits",
"@com_google_absl//absl/strings",
"@com_google_absl//absl/types:span",
"@local_xla//xla:status_macros",
17 changes: 2 additions & 15 deletions tensorflow/compiler/jit/device_util.h
@@ -20,6 +20,7 @@ limitations under the License.
#include <memory>

#include "absl/container/flat_hash_map.h"
#include "absl/numeric/bits.h"
#include "absl/strings/string_view.h"
#include "absl/types/span.h"
#include "tensorflow/compiler/tf2xla/xla_op_registry.h"
@@ -79,7 +80,7 @@ class DeviceSet {
uint64 only_lowest_bit_set = word & -word;
// The number of trailing zeros in a non-zero word is the index of the
// least significant 1.
int bit_index = ctz_uint64(word);
int bit_index = absl::countr_zero(word);
if (!func(DeviceId(word_index * kWordSize + bit_index))) {
return;
}
@@ -89,20 +90,6 @@ class DeviceSet {
}

private:
static int ctz_uint64(uint64 x) {
DCHECK_NE(x, 0);
#ifdef __GNUC__
return __builtin_ctzl(x);
#else
int result = 0u;
while ((x & 1u) == 0u) {
x >>= 1;
++result;
}
return result;
#endif
}

absl::InlinedVector<uint64, 1> storage_;

const int kWordSize = 64;
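
Note on the hunk above: the hand-written `ctz_uint64` is replaced by `absl::countr_zero`. One plausible benefit, stated here as an observation rather than the author's rationale, is that `__builtin_ctzl` takes an `unsigned long`, which is only 32 bits wide on some platforms, whereas `absl::countr_zero` is defined for the full width of its unsigned argument. The sketch below shows the set-bit iteration that `DeviceSet::ForEach` relies on. It is a standalone illustration under assumptions: `CountTrailingZeros64` and `SetBitIndices` are hypothetical helpers, the former a portable stand-in for `absl::countr_zero` so the sketch builds without Abseil, and the lowest bit is cleared with `word &= word - 1` rather than the `word & -word` expression visible in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable stand-in for absl::countr_zero on a non-zero 64-bit word:
// the number of trailing zeros equals the index of the least significant 1.
int CountTrailingZeros64(std::uint64_t x) {
  assert(x != 0);
  int result = 0;
  while ((x & 1u) == 0u) {
    x >>= 1;
    ++result;
  }
  return result;
}

// Returns the indices of all set bits in `word`, lowest index first.
std::vector<int> SetBitIndices(std::uint64_t word) {
  std::vector<int> indices;
  while (word != 0) {
    indices.push_back(CountTrailingZeros64(word));
    word &= word - 1;  // Clear the lowest set bit and continue.
  }
  return indices;
}
```

The loop runs once per set bit rather than once per bit position, which is why a sparse device set is cheap to walk.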
2 changes: 2 additions & 0 deletions tensorflow/compiler/jit/kernels/BUILD
@@ -59,8 +59,10 @@ cc_library(
"//tensorflow/compiler/jit:xla_compile_util",
"//tensorflow/core/platform:refcount",
"@com_google_absl//absl/status",
"@com_google_absl//absl/status:statusor",
"@com_google_absl//absl/strings",
"@local_xla//xla/pjrt:pjrt_client",
"@local_xla//xla/tsl/concurrency:async_value",
],
alwayslink = 1,
)