module 'tensorflow_addons' has no attribute 'optimizers' (tfa-nightly) #2578

asapsmc · 2021-09-27T21:58:14Z

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOs M1 (12.0 Beta Monterey)
TensorFlow version and how it was installed (source or binary): 2.5.0 (pip)
TensorFlow-Addons version and how it was installed (source or binary): tfa-nightly 0.15.0 (pip)
Python version: 3.8
Is GPU used? (yes/no): NA

Describe the bug

After installing the nightly version, I get the error: module 'tensorflow_addons' has no attribute 'optimizers'

Code to reproduce the issue

import tensorflow_addons as tfa
...
radam = tfa.optimizers.RectifiedAdam(lr=cf["lr"], clipnorm=clipnorm)

Other info / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

bhack · 2021-09-27T22:33:51Z

/cc @szutenberg

szutenberg · 2021-09-28T19:13:37Z

Hi @MR-T77

I tried to reproduce it:

tensorflow-cpu 2.5.0
tfa-nightly 0.15.0.dev20210922190150
Ubuntu 20.04.1 LTS

Everything seems to be ok.

Please check if you can run the following code:

import tensorflow_addons as tfa
print(tfa)
print(tfa.optimizers)
print(tfa.optimizers.RectifiedAdam)

What does it return?

If you're still getting an error then please attach the output from pip freeze.
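
As a quick complement to pip freeze, the sketch below lists which of the relevant distributions are installed in the active environment. It is only an illustration: the helper names (`installed_versions`, `both_addons_installed`) and the package list are assumptions of mine, and it uses nothing beyond the standard-library `importlib.metadata` (Python 3.8+). Since tensorflow-addons and tfa-nightly both ship the same `tensorflow_addons` import package, having the two installed side by side is one plausible way to end up with a broken module, though that is not confirmed as the cause here.

```python
from importlib import metadata

# Distribution names to look for; this list is an assumption, adjust as needed.
DEFAULT_NAMES = ("tensorflow", "tensorflow-macos", "tensorflow-metal",
                 "tensorflow-addons", "tfa-nightly")


def installed_versions(names=DEFAULT_NAMES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def both_addons_installed(versions):
    """True if the stable and nightly Addons distributions coexist."""
    return (versions.get("tensorflow-addons") is not None
            and versions.get("tfa-nightly") is not None)


if __name__ == "__main__":
    found = installed_versions()
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
    if both_addons_installed(found):
        print("Both tensorflow-addons and tfa-nightly are installed; "
              "consider uninstalling both and reinstalling only one.")
```

Run it in a fresh interpreter started from the same environment that runs the training script, so the result reflects what that script actually sees.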

asapsmc · 2021-09-29T11:50:08Z

Hi @szutenberg!
It seems something was broken in my conda environment (I tried so many things...). I uninstalled tensorflow-addons (0.14) and reinstalled tfa-nightly, and now I can import tfa without error. Just to be sure: whenever I update something in a conda environment, I immediately run code on top of it without restarting the terminal or VS Code (I'm not sure whether this is the best process or whether I should restart something).
Nevertheless, although I can import addons, I cannot use them. I always get the following error (I tried other optimizers too, with the same result):

Traceback (most recent call last):
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/machine/.vscode/extensions/ms-python.python-2021.9.1246542782/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/Users/machine/.vscode/extensions/ms-python.python-2021.9.1246542782/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/Users/machine/.vscode/extensions/ms-python.python-2021.9.1246542782/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/machine/Projects/finetune-asp/src/ISMIR2020_v2.py", line 471, in <module>
    main()
  File "/Users/machine/Projects/finetune-asp/src/ISMIR2020_v2.py", line 466, in main
    new_train(dataset, 'TCNv2', cpu=False, addons=True)
  File "/Users/machine/Projects/finetune-asp/src/ISMIR2020_v2.py", line 329, in new_train
    history = model.fit(train, steps_per_epoch=len(train), epochs=cf["num_epochs"], shuffle=True,
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/site-packages/tensorflow/python/keras/engine/training.py", line 1183, in fit
    tmp_logs = self.train_function(iterator)
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py", line 889, in __call__
    result = self._call(*args, **kwds)
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py", line 950, in _call
    return self._stateless_fn(*args, **kwds)
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 3023, in __call__
    return graph_function._call_flat(
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 1960, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 591, in call
    outputs = execute.execute(
  File "/Users/machine/miniforge3/envs/tf/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation model/conv_1_3x3_conv/Conv2D/ReadVariableOp: Could not satisfy explicit device specification '' because the node {{colocation_node model/conv_1_3x3_conv/Conv2D/ReadVariableOp}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0]. 
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
Equal: CPU 
AssignSubVariableOp: GPU CPU 
AssignVariableOp: GPU CPU 
GreaterEqual: GPU CPU 
FloorDiv: CPU 
Sqrt: GPU CPU 
NoOp: GPU CPU 
Pow: GPU CPU 
Mul: CPU 
Cast: GPU CPU 
Identity: GPU CPU 
SelectV2: GPU CPU 
ReadVariableOp: GPU CPU 
RealDiv: GPU CPU 
Sub: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Square: GPU CPU 
_Arg: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  model_conv_1_3x3_conv_conv2d_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_mul_5_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_mul_8_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_sub_10_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  model/conv_1_3x3_conv/Conv2D/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/Identity (Identity) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_5 (Cast) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Pow (Pow) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Pow_1 (Pow) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_1/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_1 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_2/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_2 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_3/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_3 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_1/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_1 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_2/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_2 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_4/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_4 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_1 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_2 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_5 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_6/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_6 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_7/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_7 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_3 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_8/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_8 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_3 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_9/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_9 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_4 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_4 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_5 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Sqrt (Sqrt) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/GreaterEqual (GreaterEqual) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Const (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_5/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/mul_5 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_6 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_1 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_1 (ReadVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_7 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_8/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/mul_8 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Square (Square) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_9 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_2 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_1 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_2 (ReadVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_10 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Sqrt_1 (Sqrt) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_11 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_3 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_6 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_12 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignSubVariableOp (AssignSubVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_4/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_4 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_7/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_7 (Cast) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_8/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_4 (ReadVariableOp) 
  Lookahead/Lookahead/update/sub_10/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/sub_10 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_13 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_5 (ReadVariableOp) 
  Lookahead/Lookahead/update/add_5 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/floordiv (FloorDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_14 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Equal (Equal) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2_1/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/SelectV2_1 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_2 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2_2/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/SelectV2_2 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_3 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps_1 (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps_2 (NoOp) /job:localhost/replica:0/task:0/device:GPU:0

         [[{{node model/conv_1_3x3_conv/Conv2D/ReadVariableOp}}]] [Op:__inference_train_function_30042]

bhack · 2021-09-29T12:06:35Z

/cc @lgeiger Can you replicate this on your M1?

szutenberg · 2021-09-30T05:48:32Z

@MR-T77 It looks like there are no issues with TFA itself; instead, some GPU kernels are missing, which breaks colocation.

You need to check which dtypes you use with Equal, Mul and FloorDiv. You can dump the graphs (set TF_DUMP_GRAPH_PREFIX and turn on vlog) and check them in the pbtxt files.
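
For reference, these three ops are the ones that the "Colocation Debug Info" block in the traceback above lists with CPU as their only supported device. A minimal sketch for pulling such ops out of a pasted error message follows; the function name `cpu_only_ops` and the line pattern are assumptions based on the log format shown in this thread, not a TensorFlow API.

```python
import re

# Matches lines such as "FloorDiv: CPU" or "Sqrt: GPU CPU" from the
# "Colocation group had the following types and supported devices" block.
_OP_LINE = re.compile(r"^\s*(\w+):((?:\s+(?:CPU|GPU))+)\s*$")


def cpu_only_ops(error_text):
    """Return op types whose supported-device list does not include GPU."""
    result = []
    for line in error_text.splitlines():
        match = _OP_LINE.match(line)
        if match:
            op_type, devices = match.group(1), match.group(2).split()
            if "GPU" not in devices:
                result.append(op_type)
    return result


sample = """\
Equal: CPU
AssignVariableOp: GPU CPU
FloorDiv: CPU
Sqrt: GPU CPU
Mul: CPU
"""
print(cpu_only_ops(sample))  # ['Equal', 'FloorDiv', 'Mul']
```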

asapsmc · 2021-09-30T10:05:28Z

@szutenberg I'm a newbie at this kind of thing, sorry. Could you please point me to a more detailed description of what I need to do? Thanks in advance.

asapsmc · 2021-09-30T10:34:58Z

Also, I'm not using any custom code here. It's a simple BLSTM (3-layer keras Bidirectional(simpleRNN) with a dense output)

szutenberg · 2021-09-30T19:29:22Z

@MR-T77 maybe the easiest would be to provide the code so that we can reproduce it locally.

If you want to debug it on your own, set TF_CPP_MAX_VLOG_LEVEL to 10 and TF_DUMP_GRAPH_PREFIX=tmp. You should see a tmp dir after running the script. Reading placer_input.pbtxt will answer my question (simply grep -A 20 -rn FloorDiv). You'll see the dtypes.
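
A rough sketch of those steps in Python, in case it helps. The two environment variable names come from the instructions above; to be safe they should be set before TensorFlow is imported (or exported in the shell before launching the script). The helper `find_op_blocks` is a hypothetical stand-in for `grep -A 20 -rn FloorDiv`, and the exact names of the dumped files may differ between TensorFlow versions.

```python
import os
from pathlib import Path

# Set these before `import tensorflow` so the C++ runtime sees them.
os.environ.setdefault("TF_CPP_MAX_VLOG_LEVEL", "10")
os.environ.setdefault("TF_DUMP_GRAPH_PREFIX", "tmp")


def find_op_blocks(dump_dir, needle="FloorDiv", after=20):
    """Rough equivalent of `grep -A 20 -rn FloorDiv` over dumped .pbtxt files.

    Returns (path, line_number, lines) tuples, where `lines` is the matching
    line plus up to `after` lines that follow it.
    """
    hits = []
    for path in sorted(Path(dump_dir).rglob("*.pbtxt")):
        lines = path.read_text(errors="replace").splitlines()
        for index, line in enumerate(lines):
            if needle in line:
                hits.append((str(path), index + 1, lines[index:index + after + 1]))
    return hits


if __name__ == "__main__":
    # Run this after the training script has produced the dump directory.
    for path, line_number, block in find_op_blocks("tmp"):
        print(f"{path}:{line_number}")
        print("\n".join(block))
        print("--")
```

In the printed blocks, the `T` attribute of each node should show the dtype in question.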

Anyway I'm traveling now so I'm not able to help you further until ~12th October.

asapsmc · 2021-09-30T20:15:09Z

@szutenberg thank you so much for your availability to help! I'll try to work out what's wrong before 12th October. If I'm unsuccessful, I'll get in touch again.

asapsmc · 2021-09-30T20:17:19Z

@szutenberg anyway, I'll leave the code I'm using here, just in case it's an easy thing you can spot:

import os
import pickle
import numpy as np
from tensorflow.keras.utils import Sequence
from tensorflow.keras.layers import (Dense, Input)
from tensorflow.keras.layers import SimpleRNN, Bidirectional, Masking, LSTM  # For BLSTM
from tensorflow.keras.models import Sequential, Model
import madmom

import tensorflow.keras.backend as K
import tensorflow as tf

import tensorflow_addons as tfa

from modules.utils import PKL_PATH

tf.config.set_soft_device_placement(True)
# GENERAL CONSTANTS
FPS = 100  # set the frame rate as FPS frames per second
MASK_VALUE = -1

lr = 0.05
num_epochs = 50


class Fold(object):

    def __init__(self, folds, fold):
        self.folds = folds
        self.fold = fold

    @property
    def test(self):
        # fold N for testing
        return np.unique(self.folds[self.fold])

    @property
    def val(self):
        # fold N+1 for validation
        return np.unique(self.folds[(self.fold + 1) % len(self.folds)])

    @property
    def train(self):
        # all remaining folds for training
        train = np.hstack(self.folds)
        train = np.setdiff1d(train, self.val)
        train = np.setdiff1d(train, self.test)
        return train


class DataSequence_BLSTM(Sequence):

    mask_value = -999  # only needed for batch sizes > 1

    def __init__(self, x, y, batch_size=1, max_seq_length=None, fps=FPS):
        self.x = x
        self.y = [madmom.utils.quantize_events(o, fps=fps, length=len(d))
                  for o, d in zip(y, self.x)]
        self.batch_size = batch_size
        self.max_seq_length = max_seq_length

    def __len__(self):
        return int(np.ceil(len(self.x) / float(self.batch_size)))

    def __getitem__(self, idx):
        # determine which sequence(s) to use
        x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        # pad them if needed
        if self.batch_size > 1:
            x = tf.keras.preprocessing.sequence.pad_sequences(
                x, maxlen=self.max_seq_length, dtype=np.float32, truncating='post', value=self.mask_value)
            y = tf.keras.preprocessing.sequence.pad_sequences(
                y, maxlen=self.max_seq_length, dtype=np.int32, truncating='post', value=self.mask_value)
        return np.array(x), np.array(y)[..., np.newaxis]


def simple_BLSTM(dataset, cpu=False):
    train_db = pickle.load(open('%s/%s.pkl' % (PKL_PATH, dataset), 'rb'))
    num_fold = 0
    fold = Fold(train_db.folds, num_fold)
    train = DataSequence_BLSTM([train_db.x[i] for i in fold.train],
                               [train_db.annotations[i] for i in fold.train],
                               batch_size=1, max_seq_length=60 * FPS)
    val = DataSequence_BLSTM([train_db.x[i] for i in fold.val],
                             [train_db.annotations[i] for i in fold.val],
                             batch_size=1, max_seq_length=60 * FPS)
    input_layer = Input((None, train[0][0].shape[-1]))
    masked = Masking(mask_value=-999)(input_layer)
    blstm_1 = Bidirectional(SimpleRNN(units=25, return_sequences=True))(masked)
    blstm_2 = Bidirectional(SimpleRNN(units=25, return_sequences=True))(blstm_1)
    blstm_3 = Bidirectional(SimpleRNN(units=25, return_sequences=True))(blstm_2)
    output_layer = Dense(1, name='output', activation='sigmoid')(blstm_3)
    model = Model(input_layer, output_layer)
    radam = tfa.optimizers.RectifiedAdam(lr=lr, clipnorm=0.5)
    ranger = tfa.optimizers.Lookahead(radam, sync_period=6, slow_step_size=0.5)
    model.compile(optimizer=ranger, loss=K.binary_crossentropy, metrics=['binary_accuracy'])
    history = model.fit(train, steps_per_epoch=len(train), epochs=num_epochs, shuffle=True,
                        validation_data=val, validation_steps=len(val),
                        verbose=True)
    return True


def main():
    tf.config.set_soft_device_placement(True)
    dataset = "traintest_smallsmc"
    simple_BLSTM(dataset)


if __name__ == "__main__":

    main()

bhack · 2021-09-30T20:40:15Z

Have you already tried with https://www.tensorflow.org/api_docs/python/tf/config/set_soft_device_placement ?

bhack · 2021-09-30T20:44:37Z

@szutenberg Could it be a side effect of your introduced var.device?

asapsmc · 2021-09-30T20:53:23Z

Have you already tried with https://www.tensorflow.org/api_docs/python/tf/config/set_soft_device_placement ?

Yes, it's there in the code: "tf.config.set_soft_device_placement(True)"

bhack · 2021-09-30T20:55:40Z

@MR-T77 Yes, sorry, I missed it; it's just that your code is not well formatted.

szutenberg · 2021-10-13T19:11:50Z

Hi @MR-T77

I'm back. Have you solved the issue?

Today I tried to reproduce your problem and unfortunately, the code requires pickle which is not attached. I created dummy training data and everything works fine - FloorDiv (T=INT64) is placed on GPU.

Graphs don't contain name "conv_1_3x3_conv" so probably the code I got did not produce the error message you attached.

Please provide full code and all required files together with frozen pip list (pip freeze).

asapsmc · 2021-10-14T21:24:24Z

Hi @szutenberg!
Unfortunately no, I have not solved the issue, although I have tried everything I could.
But right now, with this short snippet (from 30 September), the behaviour is different from the initial one:
now training seems to start, but it just stalls after "Epoch 1/50" appears.
I'm attaching the missing pickle here, as well as the frozen pip list and an export of "conda list --explicit".
traintest_smallsmc.pkl.zip
condaenv.txt
pipfreeze.txt

asapsmc · 2021-10-14T21:43:50Z

And to be complete, in my original code (the one with conv layers) I am still getting the same error as initially reported ("Cannot assign a device for operation..."). Nevertheless, if I replace the optimizer with the plain keras.optimizers.Adam, I can train the model.
I attach the code here:
problem_TCN.py.zip
Thanks in advance.

szutenberg · 2021-10-17T11:37:35Z

Hi @MR-T77 ,

I'm sorry but it seems that one more file is missing: definitions.py
Traceback (most recent call last):
  File "problem_TCN.py", line 183, in <module>
    simple_TCN(dataset)
  File "problem_TCN.py", line 135, in simple_TCN
    train_db = pickle.load(open('%s.pkl' % (dataset), 'rb'))
ModuleNotFoundError: No module named 'definitions'

Could you make sure that it reproduces on google colab?

asapsmc · 2021-10-18T10:13:33Z

Hi @szutenberg,
I'm so sorry about that. The pickle file was saved from a module named definitions.py; that's why loading it asks for that file, although the script itself does not need it.
I re-saved the pickle from "main", and I attach it together with a more complete problem_TCN.py.
traintest_smallsmc.pkl.zip (https://github.com/tensorflow/addons/files/7363832/traintest_smallsmc.pkl.zip)
problem_TCN.py.zip

szutenberg · 2021-10-18T18:43:35Z

Hi @MR-T77 ,

Unfortunately now I have another error: AttributeError: Can't get attribute 'Dataset' on <module '__main__' from 'problem_TCN.py'>
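
Some background that may explain both this error and the earlier missing `definitions` module: pickle stores instances of custom classes by reference to the defining module and class name, so loading the file needs that same class to be importable under the same path. One way around it, sketched below under assumptions, is to convert the object into plain built-in containers before saving. The attribute names `x`, `annotations` and `folds` are the ones the script in this thread reads from the loaded object; `to_plain` and the `SimpleNamespace` stand-in are hypothetical.

```python
import pickle
from types import SimpleNamespace


def to_plain(db):
    """Copy the fields the training script uses into a plain dict.

    A dict of lists (or NumPy arrays) can be unpickled without the original
    Dataset class being importable.
    """
    return {
        "x": list(db.x),
        "annotations": list(db.annotations),
        "folds": [list(fold) for fold in db.folds],
    }


# Stand-in for the real dataset object, which is not available here.
db = SimpleNamespace(x=[[0.1, 0.2]], annotations=[[0.5]], folds=[[0], [1]])

blob = pickle.dumps(to_plain(db))   # what would be written to the .pkl file
restored = pickle.loads(blob)       # loads without any custom class
print(sorted(restored))             # ['annotations', 'folds', 'x']
```

With a file saved this way, the training script would read `train_db["x"]` rather than `train_db.x`.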

Could you please prepare a Google Colab notebook that demonstrates your problem? Note that you don't need to provide real data. Dummy training data is enough: you just need to make sure that the shapes match.

Thanks!

bhack · 2021-10-18T18:58:17Z

Yes, a Colab with dummy input data is the best thing to share, so we can verify whether this is specific to your macOS M1 platform or a more general issue.

asapsmc · 2021-10-19T16:21:15Z

Hi @szutenberg and @bhack: sorry for my late reply. I was trying to generate dummy data, but I couldn't do it without further errors (I'm a newbie).
I really hope you can test everything with this Google Colab (otherwise, please tell me what to do). You'll just have to upload the pkl file into your runtime.
In Colab I don't get any errors; I can run this exact code.
But there I'm using a whole different set of libraries (e.g. no tf-metal) and versions (tf, tf-addons).

szutenberg · 2021-10-19T17:41:26Z

Hi @MR-T77

I took the code from colab and was able to run it in my virtual env (Ubuntu 20.04):
tensorflow-gpu 2.5.0
tfa-nightly 0.15.0.dev20210922190150

All Equal ops are placed on GPU and everything works fine.

asapsmc · 2021-10-19T18:10:38Z

Hi @szutenberg,

So, do you think it is some clash between libraries in my environment, or is there some other reason?

bhack · 2021-10-19T18:16:49Z

@lgeiger Can you try to see if you can reproduce this on your M1 ?

asapsmc · 2021-10-20T18:54:50Z

I just want to clarify that I can run this exact code with no problems if I use tf.keras.optimizers.Adam. If I start using tf-addons optimizers (e.g. RectifiedAdam) I get the above errors.
Following @szutenberg's request, I already sent the pip freeze list, and I attach it here again. Do you think any of these libraries/versions could be causing this clash with tf-addons?
pipfreeze.txt

bhack · 2021-10-20T18:58:05Z

Can you check your device placement:

https://www.tensorflow.org/api_docs/python/tf/debugging/set_log_device_placement
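
The log this produces can be very long. A small sketch for summarising it by device and op type, assuming each entry has the form `name: (OpType): /job:.../device:TYPE:N`; the helper name `summarize_placement` and the regular expression are my own and not part of TensorFlow.

```python
import re
from collections import Counter

# One entry looks like:
#   args_0: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
_ENTRY = re.compile(r"(\S+):\s+\((\w+)\):\s+\S*/device:(\w+:\d+)")


def summarize_placement(log_text):
    """Count logged ops per (device, op type) pair."""
    counts = Counter()
    for _name, op_type, device in _ENTRY.findall(log_text):
        counts[(device, op_type)] += 1
    return counts


sample = (
    "args_0: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0 "
    "GeneratorDataset: (GeneratorDataset): /job:localhost/replica:0/task:0/device:CPU:0 "
    "assignvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0"
)
for (device, op_type), count in sorted(summarize_placement(sample).items()):
    print(device, op_type, count)
```

Pasting the captured log into `summarize_placement` should make it easier to see at a glance which op types end up on CPU:0 rather than GPU:0.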

asapsmc · 2021-10-21T11:22:16Z

Hi @bhack , I did as you said and got this:

warnings.warn( args_0: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0 GeneratorDataset: (GeneratorDataset): /job:localhost/replica:0/task:0/device:CPU:0 NoOp: (NoOp): /job:localhost/replica:0/task:0/device:CPU:0 Identity: (Identity): /job:localhost/replica:0/task:0/device:CPU:0 FakeSink0: (Identity): /job:localhost/replica:0/task:0/device:CPU:0 identity_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:CPU:0 args_0: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0 GeneratorDataset: (GeneratorDataset): /job:localhost/replica:0/task:0/device:CPU:0 NoOp: (NoOp): /job:localhost/replica:0/task:0/device:CPU:0 Identity: (Identity): /job:localhost/replica:0/task:0/device:CPU:0 FakeSink0: (Identity): /job:localhost/replica:0/task:0/device:CPU:0 identity_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:CPU:0 Epoch 1/50 assignvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:0 iter/Initializer/zeros: (Const): /job:localhost/replica:0/task:0/device:GPU:0 iterator: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0 iterator_1: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0 model_conv_1_convolution_conv2d_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_conv_1_convolution_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_conv_2_convolution_conv2d_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_conv_2_convolution_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_conv_3_convolution_conv2d_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_conv_3_convolution_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): 
/job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_2_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_2_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_2_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_2_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_4_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_4_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_4_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_4_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_8_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_8_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_8_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_8_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_16_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_16_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
model_tcn_16_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_16_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_32_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_32_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_32_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_32_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_64_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_64_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_64_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_64_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_128_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_128_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_128_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_128_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_256_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_256_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_256_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
model_tcn_256_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_512_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_512_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_512_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_512_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1024_dilated_conv_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1024_dilated_conv_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1024_conv_1x1_conv1d_expanddims_1_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_tcn_1024_conv_1x1_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_output_tensordot_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 model_output_biasadd_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 assignaddvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 assignaddvariableop_1_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_cast_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_cast_2_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_cast_3_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_cast_4_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_cast_6_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_1_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_1_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_1_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_2_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_2_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_2_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_3_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_3_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_3_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_4_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_4_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_4_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_5_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_5_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_5_sub_10_readvariableop_resource: (_Arg): 
/job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_6_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_6_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_6_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_7_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_7_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_7_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_8_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_8_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_8_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_9_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_9_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_9_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_10_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_10_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_10_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_11_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_11_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_11_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_12_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_12_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_12_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_13_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_13_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_13_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_14_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_14_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_14_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_15_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_15_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_15_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_16_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_16_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_16_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_17_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_17_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_17_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_18_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_18_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_18_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_19_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_19_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_19_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_20_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_20_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_20_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_21_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_21_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_21_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_22_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_22_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_22_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_23_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_23_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_23_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_24_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_24_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_24_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_25_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_25_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_25_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_26_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_26_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_26_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_27_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_27_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_27_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_28_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_28_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_28_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_29_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_29_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_29_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_30_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_30_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_30_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_31_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_31_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_31_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_32_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_32_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_32_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_33_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_33_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_33_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_34_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_34_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_34_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_35_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_35_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_35_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_36_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_36_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_36_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_37_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_37_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_37_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_38_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_38_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_38_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_39_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_39_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_39_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_40_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_40_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_40_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_41_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_41_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_41_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_42_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_42_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_42_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_43_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_43_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_43_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_44_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_44_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_44_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_45_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_45_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_45_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_46_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_46_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_46_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_47_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_47_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_47_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_48_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_48_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_48_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_49_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_49_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_49_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_50_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_50_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_50_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_51_mul_5_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 
lookahead_lookahead_update_51_mul_8_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 lookahead_lookahead_update_51_sub_10_readvariableop_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 assignaddvariableop_2_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 assignaddvariableop_3_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 assignaddvariableop_4_resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0 IteratorGetNext: (IteratorGetNext): /job:localhost/replica:0/task:0/device:CPU:0

After this I get the error message:

Exception has occurred: InvalidArgumentError       (note: full exception trace is shown but execution is paused at: <module>)
Cannot assign a device for operation model/conv_1_convolution/Conv2D/ReadVariableOp: Could not satisfy explicit device specification '' because the node {{colocation_node model/conv_1_convolution/Conv2D/ReadVariableOp}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0]. 
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
Equal: CPU 
AssignSubVariableOp: GPU CPU 
AssignVariableOp: GPU CPU 
GreaterEqual: GPU CPU 
FloorDiv: CPU 
Sqrt: GPU CPU 
NoOp: GPU CPU 
Pow: GPU CPU 
Mul: CPU 
Cast: GPU CPU 
Identity: GPU CPU 
SelectV2: GPU CPU 
ReadVariableOp: GPU CPU 
RealDiv: GPU CPU 
Sub: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Square: GPU CPU 
_Arg: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  model_conv_1_convolution_conv2d_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_mul_5_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_mul_8_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_sub_10_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  model/conv_1_convolution/Conv2D/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/Identity (Identity) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_5 (Cast) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Pow (Pow) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Pow_1 (Pow) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_1/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_1 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_2/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_2 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_3/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_3 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_1/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_1 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_2/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_2 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_4/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_4 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_1 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_2 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_5 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_6/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_6 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_7/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_7 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_3 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_8/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_8 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_3 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_9/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_9 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_4 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_4 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_5 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Sqrt (Sqrt) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/GreaterEqual (GreaterEqual) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Const (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_5/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/mul_5 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_6 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_1 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_1 (ReadVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_7 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_8/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/mul_8 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Square (Square) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_9 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_2 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_1 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_2 (ReadVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_10 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Sqrt_1 (Sqrt) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_11 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_3 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_6 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_12 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignSubVariableOp (AssignSubVariableOp)/job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_4/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_4 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_7/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_7 (Cast) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_8/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_4 (ReadVariableOp) 
  Lookahead/Lookahead/update/sub_10/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/sub_10 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_13 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_5 (ReadVariableOp) 
  Lookahead/Lookahead/update/add_5 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/floordiv (FloorDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_14 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Equal (Equal) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2_1/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/SelectV2_1 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_2 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2_2/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/SelectV2_2 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_3 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps_1 (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps_2 (NoOp) /job:localhost/replica:0/task:0/device:GPU:0

	 [[{{node model/conv_1_convolution/Conv2D/ReadVariableOp}}]] [Op:__inference_train_function_15035]

lgeiger · 2021-10-25T15:45:36Z

Do you have tensorflow-metal installed? If so, could you try uninstalling it and if not could you try installing it? Just to make sure that this has nothing to do with some ops not being supported by metal.

asapsmc · 2021-10-25T16:26:10Z

@lgeiger: you are right. When I uninstalled tensorflow-metal, I stopped getting the error. But of course, everything now runs only on the CPU.
Do you think this is my only option, ie, running everything on CPU?

lgeiger · 2021-10-25T16:33:12Z

"Do you think this is my only option, ie, running everything on CPU?"

For now, unfortunately yes. It seems like some operation is not yet supported by the metal device, but I am not sure whether the TFA optimizer could be rewritten either to avoid this op or to not require it to be placed on the same device as the other ops in the colocation group.
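
One way to narrow down which operation is involved is to read the "Colocation group had the following types and supported devices" list in the error above: each line names an op type and the device types it reports as supported. The sketch below parses an abbreviated copy of that list and prints the op types that do not list GPU. The helper names (`parse_supported_devices`, `ops_missing_device`) are hypothetical and only illustrate how to read the log; they are not part of TensorFlow or TFA.

```python
import re

# Abbreviated copy of the supported-devices list from the error message above.
DEBUG_INFO = """\
Equal: CPU
AssignSubVariableOp: GPU CPU
GreaterEqual: GPU CPU
FloorDiv: CPU
Sqrt: GPU CPU
Mul: CPU
Cast: GPU CPU
ReadVariableOp: GPU CPU
"""


def parse_supported_devices(text):
    """Map each op type to the set of device types listed for it."""
    supported = {}
    for line in text.splitlines():
        # Lines look like "OpType: GPU CPU"; anything else is ignored.
        match = re.fullmatch(r"\s*(\w+):\s*((?:[A-Z]+\s*)+)", line)
        if match:
            supported[match.group(1)] = set(match.group(2).split())
    return supported


def ops_missing_device(supported, device="GPU"):
    """Return the op types whose supported set does not include `device`."""
    return sorted(op for op, devices in supported.items() if device not in devices)


supported = parse_supported_devices(DEBUG_INFO)
cpu_only = ops_missing_device(supported)
print(cpu_only)
```

For the log in this thread, the op types that list only CPU are Equal, FloorDiv and Mul, while the group as a whole was pinned to GPU:0, which is consistent with the explanation above. Whether those kernels are missing in general or only for a particular dtype cannot be told from this log alone.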

asapsmc · 2021-10-25T16:56:00Z

Thank you for your answer. Nevertheless, given the inability of Apple to provide support for developers, I hope you find some ingenious solution on your side.

bhack · 2021-10-25T16:58:39Z

The official Apple support is at https://developer.apple.com/forums/tags/tensorflow-metal

asapsmc · 2021-10-25T17:01:47Z

@bhack I know, I've been trying but they just don't give support.

bhack · 2021-10-25T17:03:06Z

What is your thread there?

bhack · 2021-10-25T17:05:11Z

If it is this one, https://developer.apple.com/forums/thread/692818, I suppose that being just 4 days old, with a Saturday and Sunday in between, it has not been a very long wait.

asapsmc · 2021-10-25T17:23:25Z

That's not mine. But you can check this one (very similar to my problem), which was posted 3 months ago. I have 2 other threads (my user is the same, so you can search by that) posted almost 1 month ago, also not solved by Apple. Besides this, I also submitted the issue through Feedback Assistant, but I have not received any feedback.
So, 1 month or even 3 months seems more than sufficient time for a company like Apple to solve these issues, or at least to provide some feedback.
Wouldn't you agree?

bhack · 2021-10-25T17:42:21Z

tensorflow-metal is a closed-source plugin package, so we don't have many alternatives other than asking Apple.

asapsmc · 2021-10-25T17:51:24Z

@bhack I understand that and I sincerely thank you for your help (which I requested because I thought it was something related to tensorflow-addons). If I could change my computer, I'd also achieve a solution, but I can't.
Your support has been awesome (as opposed to Apple support) and it allowed me to identify the source of the problem.

bhack · 2021-10-25T19:17:45Z

Have you tried with tensorflow-macos 2.6?

Edit:
We are going to have a release with #2583

asapsmc · 2021-10-25T20:15:59Z

Yes, I've been trying with 2.6.

bhack · 2021-10-25T20:23:24Z

OK, I'm going to close this. Please add a comment later if you have any news.

asapsmc · 2022-05-27T17:59:36Z

Ok, I'm back to this after a while.
So, I've been getting spurious errors while doing model.fit with the Lookahead optimizer. I'm fine-tuning with big datasets, and my code breaks while fitting to different files in a non-reproducible way, i.e. each time I run it, it breaks on a different file and on a different operation.
From the tracebacks, these errors appear to be related to the Lookahead optimizer.
Let me try to explain this new info clearly.
I've tried 2 different combinations of tf + tf-addons versions (conda environments), and I got the same type of errors in both, probably more frequently with the pylast conda environment:

pylast: tensorflow-macos 2.9.0, tensorflow-metal 0.5.0, tensorflow-addons 0.17.0
py39deps26-source: tensorflow-macos 2.6.0, tensorflow-metal 0.2.0, tensorflow-addons 0.15.0.dev0
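
When comparing environments like the two listed above, it can help to print the versions from inside the running interpreter, since a conda environment can end up with a different package than expected (as happened earlier in this thread). This is a minimal sketch using only the standard library; the package list is an assumption based on the names mentioned in this thread, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata

# Distribution names mentioned in this thread (assumed; adjust as needed).
PACKAGES = ["tensorflow-macos", "tensorflow-metal", "tensorflow-addons", "tfa-nightly"]


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(PACKAGES)
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

Running it once per environment gives a version list that can be pasted into a report next to the traceback.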

The base code is always the same: I use tf.config.set_soft_device_placement(True) and also wrap every call to TensorFlow in with tf.device('/cpu:0'):, otherwise I get errors. As explained before, my code just loads a model and fine-tunes it on each file of a dataset.

Here is a pair of example error outputs (obtained with the pylast conda environment):

File "/Users/machine/Projects/finetune-asp/src/finetune_IMR2020.py", line 138, in finetune_dataset_db
    history = model.fit(ft, steps_per_epoch=len(ft), epochs=ft_cfg["num_epochs"], shuffle=True,
  File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'Lookahead/Lookahead/update_64/mul_11' defined at (most recent call last):
    
    File "/Users/machine/Projects/finetune-asp/src/finetune_IMR2020.py", line 138, in finetune_dataset_db
      history = model.fit(ft, steps_per_epoch=len(ft), epochs=ft_cfg["num_epochs"], shuffle=True,
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1409, in fit
      tmp_logs = self.train_function(iterator)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1051, in train_function
      return step_function(self, iterator)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1040, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1030, in run_step
      outputs = model.train_step(data)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 893, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 539, in minimize
      return self.apply_gradients(grads_and_vars, name=name)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow_addons/optimizers/lookahead.py", line 104, in apply_gradients
      return super().apply_gradients(grads_and_vars, name, **kwargs)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 678, in apply_gradients
      return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 723, in _distributed_apply
      update_op = distribution.extended.update(
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 706, in apply_grad_to_update_var
      update_op = self._resource_apply_dense(grad, var, **apply_kwargs)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow_addons/optimizers/lookahead.py", line 130, in _resource_apply_dense
      train_op = self._optimizer._resource_apply_dense(
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow_addons/optimizers/rectified_adam.py", line 249, in _resource_apply_dense
      coef["r_t"] * m_corr_t / (v_corr_t + coef["epsilon_t"]),
Node: 'Lookahead/Lookahead/update_64/mul_11'
Incompatible shapes: [0] vs. [5,40,20]
	 [[{{node Lookahead/Lookahead/update_64/mul_11}}]] [Op:__inference_train_function_30821]

and

File "/Users/machine/Projects/finetune-asp/src/finetune_IMR2020.py", line 138, in finetune_dataset_db
    history = model.fit(ft, steps_per_epoch=len(ft), epochs=ft_cfg["num_epochs"], shuffle=True,
  File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'Lookahead/Lookahead/update_26/mul_11' defined at (most recent call last):

    File "/Users/machine/Projects/finetune-asp/src/finetune_IMR2020.py", line 138, in finetune_dataset_db
      history = model.fit(ft, steps_per_epoch=len(ft), epochs=ft_cfg["num_epochs"], shuffle=True,
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1409, in fit
      tmp_logs = self.train_function(iterator)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1051, in train_function
      return step_function(self, iterator)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1040, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 1030, in run_step
      outputs = model.train_step(data)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/engine/training.py", line 893, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 539, in minimize
      return self.apply_gradients(grads_and_vars, name=name)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow_addons/optimizers/lookahead.py", line 104, in apply_gradients
      return super().apply_gradients(grads_and_vars, name, **kwargs)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 678, in apply_gradients
      return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 723, in _distributed_apply
      update_op = distribution.extended.update(
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 706, in apply_grad_to_update_var
      update_op = self._resource_apply_dense(grad, var, **apply_kwargs)
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow_addons/optimizers/lookahead.py", line 130, in _resource_apply_dense
      train_op = self._optimizer._resource_apply_dense(
    File "/Users/machine/miniforge3/envs/pylast/lib/python3.9/site-packages/tensorflow_addons/optimizers/rectified_adam.py", line 249, in _resource_apply_dense
      coef["r_t"] * m_corr_t / (v_corr_t + coef["epsilon_t"]),
Node: 'Lookahead/Lookahead/update_26/mul_11'
Incompatible shapes: [0] vs. [1,40,20]
	 [[{{node Lookahead/Lookahead/update_26/mul_11}}]] [Op:__inference_train_function_1406468]
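
Both tracebacks end the same way: an elementwise op reached from rectified_adam.py line 249 gets one operand of shape [0] (an empty tensor) and one with a full shape such as [5,40,20]. As a rough illustration of why such a pair cannot be combined, here is a plain-Python sketch of the usual NumPy-style broadcasting rule. `broadcast_shape` is a hypothetical helper written for this note, not part of TensorFlow or TensorFlow Addons, and it does not explain why one operand arrives empty.

```python
def broadcast_shape(a, b):
    """Return the broadcast result shape of two shapes, or raise ValueError.

    Hypothetical helper for illustration only; it mirrors the standard
    NumPy-style rule: trailing dimensions must be equal, or one of them is 1.
    """
    result = []
    # Walk both shapes from the trailing dimension, padding the shorter with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(f"Incompatible shapes: {list(a)} vs. {list(b)}")
    return tuple(reversed(result))


# A size-1 operand broadcasts against the variable's shape.
ok_shape = broadcast_shape((1,), (5, 40, 20))

# An empty operand (size 0 in the trailing dimension) does not.
try:
    broadcast_shape((0,), (5, 40, 20))
    failed = False
except ValueError as err:
    failed = True
    message = str(err)
```

A trailing dimension of 0 is neither equal to 20 nor equal to 1, so the multiply is rejected no matter which file is being fitted.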

bhack · 2022-05-27T18:12:27Z

@MR-T77 Just to be sure that it is reproducible in a controlled Linux environment, can you test the same with Docker + TFA from pip:

https://www.tensorflow.org/install/docker

That way we could rule out that it is related only to tensorflow-macos.

asapsmc · 2022-05-27T18:28:01Z

I'm afraid I can't really help. I installed Docker and tried to pull and run the latest tensorflow image, and got this error:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.
qemu: uncaught target signal 6 (Aborted) - core dumped

bhack · 2022-05-27T19:14:11Z

Yes, probably AVX emulation support still hasn't been added to QEMU for M1, or it is a Docker-specific issue on M1 like docker/for-mac#6111.

If you can isolate the case and reproduce your error with minimal code (e.g. as a test), we could try to run it on Linux.

Since tensorflow-macos and tensorflow-metal are closed-source packages, we cannot do anything here unless we can reproduce the issue on another platform.

asapsmc · 2022-05-27T20:20:55Z

Well, I was just trying to provide further details, to see if it would help.
The code I sent in October generates this bug, but it seems it only happens when using Lookahead on M1.

bhack · 2022-05-27T20:29:02Z

OK, please also post this on https://developer.apple.com/forums/tags/tensorflow-metal

asapsmc · 2022-05-31T11:11:49Z

I submitted it to https://developer.apple.com/forums/thread/706952

asapsmc · 2022-06-23T15:55:48Z

Just to keep this info updated: I have a GitHub repo with a stripped-down version of my code (with the needed audio data) that does reproduce the issue (on an M1 Mac). I also shared this with Apple, but unfortunately they're not very responsive.

asapsmc · 2022-06-23T23:32:02Z

@MR-T77 maybe the easiest would be to provide the code so that we can reproduce it locally.

If you want to debug it on your own then set TF_CPP_MAX_VLOG_LEVEL to 10, and TF_DUMP_GRAPH_PREFIX=tmp. You should see tmp dir after running the script. Reading placer_input.pbtxt will answer to my question (simply grep -A 20 -rn FloorDiv). You'll see dtypes.

Anyway I'm traveling now so I'm not able to help you further until ~12th October.

Hi again, I'm back at this as the problem remains (and it's even more frequent after I updated the conda environment), and Apple Developer Forums are not responding.
I was trying to generate some debug info (to see if I could figure out some way out of this), but I can't generate the pbtxt file.

I'm using this code:

from network_definitions import *
import tensorflow as tf


os.environ["TF_CPP_MIN_LOG_LEVEL"] = "10"
os.environ["TF_DUMP_GRAPH_PREFIX"] = 'tbdump'
#os.environ["XLA_FLAGS"] = "--xla_dump_to=/tbdump/generated"
tf.debugging.set_log_device_placement(False)
tf.config.set_soft_device_placement(True)
tf.debugging.experimental.enable_dump_debug_info(
    './tbdump',
    tensor_debug_mode="FULL_HEALTH",
    circular_buffer_size=-1)


if __name__ == "__main__":

    big_dataset = 'gtzan'
    small_dataset = 'traintest_smallsmc'
    dataset = big_dataset
    # data_aug='NODAUG'# to run without data augmentation
    finetune_db(dataset, data_aug='DAUG', load_pkl=True)

but after starting tensorboard --logdir /tbdump and accessing tensorboard on localhost, I always get the message "Debugger V2 is inactive because no data is available."

I can see that the following types of files are being created in the tbdump folder, but no pbtxt file:

tfdbg_events.xxx...xxx.graphs
tfdbg_events.xxx...xxx.source_files
tfdbg_events.xxx...xxx.execution
tfdbg_events.xxx...xxx.stack_frames
tfdbg_events.xxx...xxx.graph_execution_traces
tfdbg_events.xxx...xxx.metadata

Any idea on how to make this work?
This code and the needed data (as well as the pip freeze info) are available at this GitHub repo
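
A few things in the script above may explain the missing pbtxt file, though none of this was confirmed in the thread. The tfdbg_events.* files come from tf.debugging.experimental.enable_dump_debug_info (Debugger V2), which is a separate mechanism from the graph dump suggested earlier. That suggestion named TF_CPP_MAX_VLOG_LEVEL, while the script sets TF_CPP_MIN_LOG_LEVEL, and it sets the variables after TensorFlow has already been imported. The script also writes to ./tbdump while TensorBoard is started with --logdir /tbdump, which is a different path unless the script runs from the filesystem root. A minimal sketch of setting the two variables first, assuming they need to be in place before TensorFlow is imported (`configure_graph_dump` is a hypothetical helper name):

```python
import os


def configure_graph_dump(prefix="tbdump", vlog_level=10):
    """Set the two debug environment variables named earlier in the thread.

    Hypothetical helper. Call it before the first `import tensorflow` in the
    process, on the assumption that TensorFlow reads these values early.
    Returns the absolute dump directory so the same path can be reused later.
    """
    os.environ["TF_CPP_MAX_VLOG_LEVEL"] = str(vlog_level)
    os.environ["TF_DUMP_GRAPH_PREFIX"] = prefix
    return os.path.abspath(prefix)


dump_dir = configure_graph_dump()

# Only now import TensorFlow and anything that imports it, for example:
#   import tensorflow as tf
#   from network_definitions import *
```

Passing the returned absolute path to any tool that reads the dump avoids the ./tbdump versus /tbdump mismatch.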

seanpmorgan · 2023-03-01T04:19:50Z

TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision:
TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with a similar charter to TFA:
Keras
Keras-CV
Keras-NLP

bhack added the optimizers label Sep 27, 2021

bhack assigned szutenberg Oct 10, 2021

bhack added the macos label Oct 25, 2021

bhack closed this as completed Oct 25, 2021

bhack reopened this May 27, 2022

asapsmc mentioned this issue Oct 27, 2022

Strange behaviour: - fitting model with or without with tf.device('/cpu:0'): gives completely different losses in training #2777

Closed

seanpmorgan closed this as completed Mar 1, 2023

jqmcginnis mentioned this issue Dec 6, 2023

LST_AI installation on MAC OS CompImg/LST-AI#3

Closed
