Add Perplexity Metric #68
Conversation
Please ping us when this is ready for review, thanks!
Hey, @chenmoneygithub! This PR is ready for review. I have not added test cases yet because if there are any major changes to be made, I'll have to make the same changes in the test cases; I was waiting for an initial review. If you think the overall structure looks good, I can add unit tests now. Thanks!
Since you are overriding the Metric class's methods instead of proposing new methods, the API interface is stable, so your revision won't cause back-and-forth edits on the test file. Please add unit tests so that we can better evaluate the functionality, thanks!
Awesome! I'll add unit tests 👍🏼
@mattdangerw, @chenmoneygithub, I've added UTs. This PR is now ready for review :)
Thanks for the PR! Dropped some initial comments.
keras_nlp/metrics/perplexity.py (Outdated)

        if self.pad_token_id is not None:
            sample_weight = tf.cast(
                tf.math.logical_not(tf.equal(y_true, 0)), self._dtype
Should this be `tf.equal(y_true, self.pad_token_id)`?
Ah, yes. Fixed 👍🏼
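For reference, a tiny standalone illustration of the corrected expression, with made-up values (not the PR's actual code):

```python
import tensorflow as tf

y_true = tf.constant([[1, 3, 0], [2, 1, 3]])
pad_token_id = 0
# 1.0 for real tokens, 0.0 wherever y_true equals the padding id.
mask = tf.cast(tf.math.logical_not(tf.equal(y_true, pad_token_id)), tf.float32)
# -> [[1., 1., 0.], [1., 1., 1.]]
```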
keras_nlp/metrics/perplexity.py (Outdated)

    )

    def update_state(self, y_true, y_pred, sample_weight=None):
        # y_true shape: (bsz, seq_len), y_pred shape: (bsz, seq_len, vocab_size)
bsz is vague; please just use the full name batch_size.
keras_nlp/metrics/perplexity.py (Outdated)

        )

        # Reshape y_true and y_pred.
        y_true = tf.reshape(y_true, [-1])  # (bsz * seq_len,)
Curious - why are you doing this reshaping? I think `SparseCategoricalCrossentropy` can handle the original shape?
y_true = tf.constant([[1, 1, 0], [0, 2, 1]])
y_pred = tf.random.uniform(shape=[2, 3, 3])
entropy = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction="sum"
)
entropy(y_true, y_pred)
The above code can run, am I missing something?
Yeah, you are right. Actually, the reason why I do the reshaping is here: https://github.com/huggingface/transformers/blob/main/examples/research_projects/codeparrot/scripts/validation_loss.py#L56-L69 and https://github.com/huggingface/transformers/blob/v4.17.0/src/transformers/models/bert/modeling_bert.py#L1363.
Essentially, in PyTorch, when I compute the Cross Entropy loss and set the reduction method to `mean`, it computes the mean over the non-masked tokens. So, it sums the values of the non-masked tokens and divides by the number of non-masked tokens.
However, when I tried out some experiments with `tf.keras.losses.SparseCategoricalCrossentropy` and set the reduction method to `mean`, it sums over the non-masked tokens but divides by the number of ALL tokens (basically, the first dimension). So, initially, I'd set the reduction method to `mean`, but changed it to `sum` and handled the denominator in subsequent lines.
Now that we have set the reduction to `sum`, I can remove the lines where I do reshaping. Thanks for pointing this out!
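A minimal standalone sketch of the "sum, then divide by the non-masked token count" approach described above; the example tensors and the pad id of 0 are illustrative assumptions, not the PR's actual code:

```python
import tensorflow as tf

y_true = tf.constant([[1, 3, 0], [2, 1, 3]])           # (batch_size, seq_len), 0 = padding
y_pred = tf.random.uniform(shape=[2, 3, 4])            # (batch_size, seq_len, vocab_size) logits
mask = tf.cast(tf.not_equal(y_true, 0), tf.float32)    # 1 for real tokens, 0 for padding

crossentropy = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction="sum"
)
# Sum of per-token losses over non-masked tokens only...
loss_sum = crossentropy(y_true, y_pred, sample_weight=mask)
# ...divided by the number of non-masked tokens, mirroring PyTorch's masked "mean".
mean_loss = loss_sum / tf.reduce_sum(mask)
perplexity = tf.exp(mean_loss)
```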
@chenmoneygithub, addressed your comments! Thanks for reviewing :)
Thanks! The functionality looks good to me! Dropped some comments on the coding style.
keras_nlp/metrics/perplexity.py (Outdated)

    ```python
    # 1. update_state() and result()
    perplexity = keras_nlp.metrics.Perplexity(name="perplexity")
    target = tf.experimental.numpy.random.randint(low=0, high=10, size=(2, 5))
Prefer using this:
target = tf.random.uniform(shape=[2, 5], maxval=10, dtype=tf.int32)
since once the numpy stuff graduates from the experimental namespace, this code would break.
Additionally, we may want to format this example section like this: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/tokenizers/word_piece_tokenizer.py#L123, which is clearer about the output, and more testable.
Sure, will change 👍🏼
keras_nlp/metrics/perplexity.py (Outdated)

        pad_token_id=None,
        **kwargs,
    ):
        super(Perplexity, self).__init__(name=name, dtype=dtype, **kwargs)
Please use the Python 3 style:
super().__init__(name=name, dtype=dtype, **kwargs)
keras_nlp/metrics/perplexity.py (Outdated)

        **kwargs,
    ):
        super(Perplexity, self).__init__(name=name, dtype=dtype, **kwargs)
        if dtype is None:
Shall we move this default to the argument?
Do you mean moving it to `kwargs`?
I'm planning to handle it this way: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/tokenizers/word_piece_tokenizer.py#L177-L186.
Oh, this comment landed in an awkward position. I mean:
if dtype is None:
    self._dtype = tf.float32
I think the default is handled on the base class already. Can we remove this check entirely? And keep the error if `not self.dtype.is_floating`?
Yeah, sure. I think you are talking about this: https://github.com/keras-team/keras/blob/master/keras/metrics/base_metric.py#L122-L123.
I have used this condition:
if not tf.as_dtype(self.dtype).is_floating:
since the parent class sets it to a string ("float32") if dtype is None.
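A rough sketch of how that constructor check could look (the surrounding code and argument names are guesses, not the PR's exact implementation):

```python
import tensorflow as tf
from tensorflow import keras


class Perplexity(keras.metrics.Metric):
    def __init__(self, from_logits=False, pad_token_id=None, dtype=None, name="perplexity", **kwargs):
        super().__init__(name=name, dtype=dtype, **kwargs)
        # The base class defaults dtype to the string "float32" when dtype is None,
        # so convert it back to a DType before checking.
        if not tf.as_dtype(self.dtype).is_floating:
            raise ValueError(
                "`dtype` must be a floating point type. "
                f"Received: dtype={dtype}"
            )
```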
keras_nlp/metrics/perplexity.py (Outdated)

                f"Received: dtype={dtype}"
            )

        self.cross_entropy_loss = keras.losses.SparseCategoricalCrossentropy(
This should be a private member, since it is not directly passed from constructor argument.
Also, I think it would be good to use the metric version here if possible, https://www.tensorflow.org/api_docs/python/tf/keras/metrics/SparseCategoricalCrossentropy, and remove `_loss` from your variable names. We are dealing with a metric here, not a loss.
@mattdangerw, I gave this a look and I don't think we can use `tf.keras.metrics.SparseCategoricalCrossentropy()`, since it doesn't have the `reduction` arg; it just takes the mean of all the values. We don't want the mean because we want to handle masked tokens later.
A sample of what I mean (pun on mean unintended xD):
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True, reduction=tf.keras.losses.Reduction.NONE)
metric = tf.keras.metrics.SparseCategoricalCrossentropy(from_logits=True)
target = tf.random.uniform(shape=[2, 5], maxval=10, dtype=tf.int32, seed=42)
logits = tf.random.uniform(shape=(2, 5, 10), seed=42)
print("Print Element-wise Loss: ", loss(target, logits))
print("Metric Value: ", metric(target, logits))
Output:
Print Element-wise Loss: tf.Tensor(
[[2.6863036 2.5335345 2.2001379 2.7056365 1.8480766]
[2.319472 2.1234791 2.0424395 2.079166 2.5738573]], shape=(2, 5), dtype=float32)
Metric Value: tf.Tensor(2.3112102, shape=(), dtype=float32)
keras_nlp/metrics/perplexity.py (Outdated)

        self.pad_token_id = pad_token_id

        self.aggregate_cross_entropy_loss = self.add_weight(
should be private.
keras_nlp/metrics/perplexity.py (Outdated)

            initializer="zeros",
            dtype=self._dtype,
        )
        self.number_of_samples = self.add_weight(
Same here, should be private.
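Taken together, the comments so far point at a shape roughly like the following. This is a heavily simplified sketch with assumed names, accumulating per-token totals (which may differ from the PR's actual per-sample bookkeeping); it is not the PR's final code:

```python
import tensorflow as tf
from tensorflow import keras


class PerplexitySketch(keras.metrics.Metric):
    """Illustration only: accumulate crossentropy and token count, exponentiate in result()."""

    def __init__(self, from_logits=False, pad_token_id=None, dtype=None, name="perplexity", **kwargs):
        super().__init__(name=name, dtype=dtype, **kwargs)
        self.from_logits = from_logits
        self.pad_token_id = pad_token_id
        self._crossentropy = keras.losses.SparseCategoricalCrossentropy(
            from_logits=from_logits, reduction="sum"
        )
        # Private accumulators, exposed only through result().
        self._aggregate_crossentropy = self.add_weight(
            name="aggregate_crossentropy", initializer="zeros", dtype=self.dtype
        )
        self._number_of_tokens = self.add_weight(
            name="number_of_tokens", initializer="zeros", dtype=self.dtype
        )

    def update_state(self, y_true, y_pred, sample_weight=None):
        # y_true: (batch_size, seq_len), y_pred: (batch_size, seq_len, vocab_size)
        if self.pad_token_id is not None:
            pad_mask = tf.cast(tf.not_equal(y_true, self.pad_token_id), self.dtype)
            if sample_weight is None:
                sample_weight = pad_mask
            else:
                sample_weight = tf.cast(sample_weight, self.dtype) * pad_mask
        if sample_weight is None:
            sample_weight = tf.ones_like(y_true, dtype=self.dtype)
        self._aggregate_crossentropy.assign_add(
            self._crossentropy(y_true, y_pred, sample_weight=sample_weight)
        )
        self._number_of_tokens.assign_add(tf.reduce_sum(sample_weight))

    def result(self):
        return tf.exp(self._aggregate_crossentropy / self._number_of_tokens)
```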
keras_nlp/metrics/perplexity_test.py (Outdated)

    class PerplexityTest(tf.test.TestCase):
        @classmethod
Curious: why are you using the classmethod setUpClass instead of setUp(self)?
Also, I would prefer setting the values of x, y in separate test cases - the disadvantage is that we are copying code around, but the advantage is that each test case is clearer about its purpose and data.
setUp() runs before every unit test, whereas setUpClass() runs only once before all the UTs.
For some reason, even though I've set the random seed, setUp() gives different outputs for every test. That's why I used setUpClass().
Sure, I'll add x and y to every UT.
keras_nlp/metrics/perplexity_test.py (Outdated)

        perplexity = Perplexity(from_logits=True)

        val1 = perplexity(self.y_true_1, self.y_pred_1, self.sample_wt_1)
        self.assertAlmostEqual(val1, 9.682761)
Instead of using a random y_pred and putting a magic number here, I would prefer to arbitrarily set y_pred to some fixed numbers, and compute the expected value manually.
Ah, so do you mean that we should do something like this?
y_pred = tf.constant([[...], [...]], dtype=...)
As in, for every UT, we should set y_pred explicitly?
Yes, otherwise the test case is a bit opaque because we do not know what the input is like.
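For illustration, a test along those lines might look like the sketch below. The tensors are arbitrary fixed values, the expected value is derived inside the test from the loss definition rather than hard-coded, and whether that formula matches the PR's exact bookkeeping is an assumption:

```python
import tensorflow as tf
from tensorflow import keras

from keras_nlp.metrics.perplexity import Perplexity  # the class added by this PR


class PerplexityTestSketch(tf.test.TestCase):
    def test_fixed_inputs(self):
        # Fixed, explicit logits so the test data is visible in the test itself.
        y_true = tf.constant([[1, 3, 0], [2, 1, 3]])
        y_pred = tf.constant(
            [
                [[1.0, 4.8, 2.8, 1.2], [2.3, 1.6, 1.8, 1.9], [3.2, 1.0, 0.4, 2.7]],
                [[1.4, 1.7, 1.9, 2.6], [1.2, 1.9, 1.8, 1.5], [2.8, 1.5, 2.6, 2.8]],
            ]
        )
        # Reference value from the definition: exp of the mean per-token
        # crossentropy over all 6 (unmasked) tokens.
        crossentropy = keras.losses.SparseCategoricalCrossentropy(
            from_logits=True, reduction="sum"
        )
        expected = tf.exp(crossentropy(y_true, y_pred) / 6.0)

        perplexity = Perplexity(from_logits=True)
        self.assertAllClose(perplexity(y_true, y_pred), expected)
```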
Thanks @abheesht17! Some style comments from me as well.
# See the License for the specific language governing permissions and
# limitations under the License.

from keras_nlp.metrics.perplexity import Perplexity
You need to add an import of metrics to the init file one directory up, otherwise the imports will not work in the exported package.
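Presumably something along these lines; the exact package layout is an assumption based on the file paths in this PR:

```python
# keras_nlp/metrics/__init__.py
from keras_nlp.metrics.perplexity import Perplexity

# keras_nlp/__init__.py (one directory up)
from keras_nlp import metrics
```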
class Perplexity(keras.metrics.Metric):
    """Perplexity metric.
    This class implements the perplexity metric. In short, this class calculates
We should add a lot of line breaks here: a blank line after the one-liner, a blank line after each paragraph, and blank lines before Args: and Examples:.
keras_nlp/metrics/perplexity.py (Outdated)

    def update_state(self, y_true, y_pred, sample_weight=None):
        # y_true shape: (batch_size, seq_len)
        # y_pred shape: (batch_size, seq_len, vocab_size)
        y_true = tf.cast(y_true, self._dtype)
There is a dtype property on the metric, so this can be accessed as `self.dtype` (many other instances).
keras_nlp/metrics/perplexity.py (Outdated)

        **kwargs: Other keyword arguments.
    Examples:
    ```python
    # 1. update_state() and result()
You could do the core keras doc style here to show output...
https://github.com/keras-team/keras/blob/master/keras/metrics/metrics.py#L74
We don't have checks for this now, but we hope to add those soon. Output would make these more readable.
Also consider breaking these up with headers outside the code block, e.g.
Call the metric directly:
some code
Set padding id:
some code
@mattdangerw, @chenmoneygithub, addressed all comments. Thank you for the review!
Not sure why the unit tests are failing here. They run as expected locally.
Any idea, @mattdangerw, @chenmoneygithub? Error given here: https://github.com/keras-team/keras-nlp/runs/5797546465?check_suite_focus=true
Not sure how to fix this.
keras_nlp/metrics/perplexity.py (Outdated)

            sample_weight = tf.cast(sample_weight, self.dtype)

        # Calculate the Cross Entropy Loss.
        cross_entropy_value = tf.cast(
I checked out the source code of tf.keras.metrics.SparseCategoricalCrossentropy, and it is doing WEIGHTED_MEAN reduction (https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/metrics.py#L583), which should automatically set the divisor as the sum over masks. Could you help verify it with your unit test? Thanks!
Sure. Will try this out! Thanks!
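A quick standalone check along those lines might look like this (the tensors are illustrative; whether the two values agree is exactly what is being verified):

```python
import tensorflow as tf

target = tf.constant([[1, 3, 0], [2, 1, 3]])
logits = tf.random.uniform(shape=[2, 3, 4], seed=42)
mask = tf.cast(tf.not_equal(target, 0), tf.float32)

# Metric version: WEIGHTED_MEAN reduction with the mask as sample_weight.
metric = tf.keras.metrics.SparseCategoricalCrossentropy(from_logits=True)
metric.update_state(target, logits, sample_weight=mask)

# Manual masked mean from the element-wise loss.
loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.NONE
)
manual = tf.reduce_sum(loss(target, logits) * mask) / tf.reduce_sum(mask)

print(metric.result().numpy(), manual.numpy())  # compare the two reductions
```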
@chenmoneygithub, this particular UT failed:
____________________________________________________ PerplexityTest.test_two_inputs_from_logits ____________________________________________________
self = <keras_nlp.metrics.perplexity_test.PerplexityTest testMethod=test_two_inputs_from_logits>
def test_two_inputs_from_logits(self):
perplexity = Perplexity(from_logits=True, pad_token_id=0)
y_true_1 = tf.constant([[1, 3, 0], [2, 1, 3]])
y_pred_1 = tf.constant(
[
[
[1.034, 4.797, 2.82, 1.154],
[2.258, 1.591, 1.811, 1.852],
[3.216, 1.037, 0.3662, 2.7],
],
[
[1.363, 1.726, 1.898, 2.582],
[1.163, 1.943, 1.761, 1.497],
[2.766, 1.453, 2.61, 2.805],
],
]
)
perplexity_val = perplexity(y_true_1, y_pred_1)
self.assertAlmostEqual(perplexity_val, 2.8788896)
y_true_2 = tf.constant([[2, 0, 0], [1, 2, 3]])
y_pred_2 = tf.constant(
[
[
[2.887, 0.885, 2.973, 2.582],
[0.3838, 2.629, 1.91, 1.802],
[0.2578, 1.081, 1.125, 2.773],
],
[
[1.623, 2.784, 0.2109, 2.66],
[2.395, 2.01, 0.252, 1.828],
[0.4482, 2.629, 0.9697, 0.998],
],
]
)
perplexity_val = perplexity(y_true_2, y_pred_2)
> self.assertEqual(perplexity_val, 3.9998498)
E AssertionError: <tf.Tensor: shape=(), dtype=float32, numpy=3.3319612> != 3.9998498
keras_nlp\metrics\perplexity_test.py:132: AssertionError
I did a quick analysis on Colab. Apparently, when sample_weight is provided, the outputs of keras.losses.SparseCategoricalCrossentropy and keras.metrics.SparseCategoricalCrossentropy don't match. Have a look at this: https://colab.research.google.com/drive/1Jh44hylKVmmdqR1Z3B_redGzr1ZJdI3i?usp=sharing.
Is it some precision thingy which is messing up the values?
Let me know what the correct course of action is. Thanks!
This is pretty odd. I guess we can stick to the Loss function and open an issue for future investigation.
Coolio :). Thanks! 👍🏼
Your test failure seems to be caused by some env configuration:
I am not sure how that happens, will forward this to the team.
Hey, @chenmoneygithub. Any possible solution for this?
Hey, @chenmoneygithub! I've resolved the error we were facing with unit tests. Let me know if further changes are to be made. Thank you!
Looks good!
        perplexity_1.update_state(y_true_1, y_pred_1)
        perplexity_1.update_state(y_true_2, y_pred_2)
        self.assertAlmostEqual(
Please avoid directly checking the private members; if we think it is something necessary to check, then we should consider exposing a public interface for it.
For this case, can we just check the perplexity value?
Ah, yes. I just wanted to show that cross_entropy_for_state_1 + cross_entropy_for_state_2 = cross_entropy_for_state_3. But yeah, checking for perplexity will suffice. Changes made! I've kept the check for the number of samples, though.
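In other words, only the public accumulated value gets inspected across update_state() calls, roughly like this (hypothetical tensors, just to show the shape of the check):

```python
import tensorflow as tf

from keras_nlp.metrics.perplexity import Perplexity  # the class added by this PR

# Hypothetical inputs; the real tests use fixed constants defined per test case.
y_true_1 = tf.constant([[1, 3, 0], [2, 1, 3]])
y_pred_1 = tf.random.uniform(shape=[2, 3, 4], seed=1)
y_true_2 = tf.constant([[2, 0, 0], [1, 2, 3]])
y_pred_2 = tf.random.uniform(shape=[2, 3, 4], seed=2)

perplexity = Perplexity(from_logits=True, pad_token_id=0)
perplexity.update_state(y_true_1, y_pred_1)
perplexity.update_state(y_true_2, y_pred_2)
# Inspect only the public, accumulated value - not the private accumulators.
print(perplexity.result())
```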
Thanks for fixing! Please also apply similar changes on other test cases, basically we never want to explicitly check private fields, because private fields are subject to change without notice.
@chenmoneygithub, done! Thanks!
Thanks for fixing!
Thanks for the PR! Some drive-by comments.
keras_nlp/metrics/perplexity.py (Outdated)

    def __init__(
        self,
        name="perplexity",
The `name` argument should come last. It's a base class argument.
keras_nlp/metrics/perplexity.py (Outdated)

    def __init__(
        self,
        name="perplexity",
        dtype=None,
Same for dtype (which comes before name).
keras_nlp/metrics/perplexity.py (Outdated)

        self.pad_token_id = pad_token_id

        self._aggregate_cross_entropy = self.add_weight(
            name="aggregate_cross_entropy",
Nit: Spell "crossentropy" in a single word, for consistency. This applies to the weight name and also to Python variable names.
from tensorflow import keras


class Perplexity(keras.metrics.Metric):
Please make the metric serializable by adding a `get_config()` method.
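A sketch of what that could look like, added to a metric class that stores from_logits and pad_token_id on self in __init__ (for instance the PerplexitySketch outlined earlier in this thread); the real implementation may differ:

```python
    def get_config(self):
        # Base Metric config already carries `name` and `dtype`.
        config = super().get_config()
        config.update(
            {
                "from_logits": self.from_logits,
                "pad_token_id": self.pad_token_id,
            }
        )
        return config
```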
keras_nlp/metrics/perplexity.py (Outdated)

        from_logits: bool. If True, `y_pred` (input to `update_state()`) should
            be the logits as returned by the model. Otherwise, `y_pred` is a
            tensor of probabilities.
        pad_token_id: int. Token ID of the padding token. If provided, the mask
Also prefer "mask_token_id" over "pad_token_id"
keras_nlp/metrics/perplexity.py (Outdated)

            be the logits as returned by the model. Otherwise, `y_pred` is a
            tensor of probabilities.
        pad_token_id: int. Token ID of the padding token. If provided, the mask
            is computed by this class (all padding tokens are masked while
by -> for ?
keras_nlp/metrics/perplexity.py (Outdated)

        pad_token_id: int. Token ID of the padding token. If provided, the mask
            is computed by this class (all padding tokens are masked while
            computing the cross entropy loss). Note that if this field is
            provided, the `sample_weight` field in `update_state()` is ignored.
This behavior is problematic; we should combine the masks, not drop one of them.
What do you think is the best way to combine the masks? Element-wise maximum or element-wise addition (if both are not None)? Or do you have something else in mind?
I think we can just multiply the masks together. If a padding token id is set, that will give a mask of 1s and 0s, which could be multiplied with sample_weight.
Put one way... if a padding token has a sample weight of 0.5, it should be ignored. If a non-padding token has a sample weight of 0.5, we should still weight it by its sample weight before summing it in.
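A tiny standalone illustration of that combination, with made-up weights (not the PR's actual code):

```python
import tensorflow as tf

y_true = tf.constant([[1, 3, 0], [2, 1, 3]])
pad_token_id = 0
sample_weight = tf.constant([[1.0, 0.5, 0.5], [1.0, 1.0, 0.5]])

pad_mask = tf.cast(tf.not_equal(y_true, pad_token_id), tf.float32)
# Padding positions end up with weight 0; everything else keeps its user weight.
combined = sample_weight * pad_mask
# -> [[1. , 0.5, 0. ], [1. , 1. , 0.5]]
```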
Great! Done 👍🏼
Thank you, @fchollet, for the review comments! I've addressed all of them, save one. Want some clarification over there.
@mattdangerw, I've addressed your comment. Any further changes required?
LGTM!
* Add class for perplexity
* Fix doc-string example
* Add shape checks
* Small typo
* Small typo - 2
* Remove shape and rank checks - they don't work
* Add UTs
* Remove reshape ops
* Address review comments - I
* Add UT for no masking case
* Minor change
* Add a check for dtype
* Format code
* Address review comments - II, III
* Fix formatting issue
* Fix UTs (Attempt I)
* Fix UTs (Attempt II)
* Fix UTs (Attempt III)
* Remove UTs for private members
* Remove checks for private members in UTs
* Address review comments - IV
* Change sample_weight, mask functionality
* Format + Lint
Resolves #63.
Notebook: https://colab.research.google.com/drive/1XV1h5aeiy5IlHoQFjDTJ45hRC8wMSf16?usp=sharing
HF Reference: https://github.com/huggingface/transformers/blob/main/examples/research_projects/codeparrot/scripts/validation_loss.py#L56-L69