Add Perplexity Metric #68

Merged · 25 commits · Apr 13, 2022
1 change: 1 addition & 0 deletions keras_nlp/__init__.py
@@ -13,6 +13,7 @@
# limitations under the License.

from keras_nlp import layers
from keras_nlp import metrics
from keras_nlp import tokenizers

__version__ = "0.1.1"
15 changes: 15 additions & 0 deletions keras_nlp/metrics/__init__.py
@@ -0,0 +1,15 @@
# Copyright 2022 The KerasNLP Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from keras_nlp.metrics.perplexity import Perplexity
Member:
You need to add an import of metrics from the init file one directory up, otherwise the imports will not work on the exported package.
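
For context, a short sketch of the import path this enables once the package is built (assuming keras_nlp is installed; illustrative only):

import keras_nlp

# Without `from keras_nlp import metrics` in the top-level __init__.py,
# this attribute lookup fails on the exported package.
perplexity = keras_nlp.metrics.Perplexity()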

166 changes: 166 additions & 0 deletions keras_nlp/metrics/perplexity.py
@@ -0,0 +1,166 @@
# Copyright 2022 The KerasNLP Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Perplexity metric implementation based on `keras.metrics.Metric`."""

import tensorflow as tf
from tensorflow import keras


class Perplexity(keras.metrics.Metric):
Collaborator:
Please make the metric serializable by adding a get_config() method.
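
A minimal sketch of what such an override could look like (it assumes `__init__` also stores `from_logits` on `self`, which the current diff does not do):

def get_config(self):
    # Hypothetical sketch: serialize the constructor arguments so the
    # metric can be re-created via from_config().
    config = super().get_config()
    config.update(
        {
            "from_logits": self.from_logits,  # assumes this attribute is added in __init__
            "pad_token_id": self.pad_token_id,
        }
    )
    return config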

"""Perplexity metric.

This class implements the perplexity metric. In short, it calculates
Member:
We should add a lot of returns here. Blank line after the one liner, blank line after paragraph, blank line before Args: and Examples:

the cross entropy loss and takes its exponent.
Note: This implementation is not suitable for fixed-size windows.

Args:
name: string. Name of the metric instance.
dtype: string or tf.dtypes.Dtype. Precision of metric computation. If
not specified, it defaults to tf.float32.
from_logits: bool. If True, `y_pred` (input to `update_state()`) should
be the logits as returned by the model. Otherwise, `y_pred` is a
tensor of probabilities.
pad_token_id: int. Token ID of the padding token. If provided, the mask
Collaborator:
Also prefer "mask_token_id" over "pad_token_id"

is computed by this class (all padding tokens are masked while
Collaborator:
by -> for ?

computing the cross entropy loss). Note that if this field is
provided, the `sample_weight` field in `update_state()` is ignored.
Collaborator:
This behavior is problematic; we should combine the masks, not drop one of them.

Collaborator Author (@abheesht17, Apr 9, 2022):

What do you think is the best way to combine the masks? Element-wise maximum or element-wise addition (if both are not None)? Or do you have something else in mind?

Member:

I think we can just multiply the masks together. If the padding token is set, that gives a mask of 1s and 0s, which can be multiplied with sample_weight.

Put another way: if a padding token has a sample weight of 0.5 it should be ignored, while a non-padding token with a sample weight of 0.5 should still be weighted by that sample weight before being summed in.
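
A minimal sketch of that combination (names and values are illustrative, not the final implementation):

import tensorflow as tf

pad_token_id = 0
y_true = tf.constant([[1, 3, 0], [2, 1, 3]])
sample_weight = tf.constant([[1.0, 0.5, 1.0], [1.0, 1.0, 0.5]])

# Padding positions become 0; every other position keeps its sample weight.
pad_mask = tf.cast(tf.not_equal(y_true, pad_token_id), sample_weight.dtype)
combined_weight = pad_mask * sample_weight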

Collaborator Author:

Great! Done 👍🏼

**kwargs: Other keyword arguments.

Examples:

1. Calculate perplexity by calling update_state() and result().
1.1. `sample_weight` and `pad_token_id` are not provided.
>>> tf.random.set_seed(42)
>>> perplexity = keras_nlp.metrics.Perplexity(name="perplexity")
>>> target = tf.random.uniform(
... shape=[2, 5], maxval=10, dtype=tf.int32, seed=42)
>>> logits = tf.random.uniform(shape=(2, 5, 10), seed=42)
>>> perplexity.update_state(target, logits)
>>> perplexity.result()
<tf.Tensor: shape=(), dtype=float32, numpy=11.8781595>

1.2. `sample_weight` specified (masking token with ID 0).
>>> tf.random.set_seed(42)
>>> perplexity = keras_nlp.metrics.Perplexity(name="perplexity")
>>> target = tf.random.uniform(
... shape=[2, 5], maxval=10, dtype=tf.int32, seed=42)
>>> logits = tf.random.uniform(shape=(2, 5, 10), seed=42)
>>> sample_weight = tf.cast(
... tf.math.logical_not(tf.equal(target, 0)), tf.float32)
>>> perplexity.update_state(target, logits, sample_weight)
>>> perplexity.result()
<tf.Tensor: shape=(), dtype=float32, numpy=13.1128>

2. Call perplexity directly.
>>> tf.random.set_seed(42)
>>> perplexity = keras_nlp.metrics.Perplexity(name="perplexity")
>>> target = tf.random.uniform(
... shape=[2, 5], maxval=10, dtype=tf.int32, seed=42)
>>> logits = tf.random.uniform(shape=(2, 5, 10), seed=42)
>>> perplexity(target, logits)
<tf.Tensor: shape=(), dtype=float32, numpy=11.8781595>

3. Provide the padding token ID and let the class compute the mask on its
own.
>>> tf.random.set_seed(42)
>>> perplexity = keras_nlp.metrics.Perplexity(
... name="perplexity", pad_token_id=0)
>>> target = tf.random.uniform(
... shape=[2, 5], maxval=10, dtype=tf.int32, seed=42)
>>> logits = tf.random.uniform(shape=(2, 5, 10), seed=42)
>>> perplexity(target, logits)
<tf.Tensor: shape=(), dtype=float32, numpy=13.1128>
"""

def __init__(
self,
name="perplexity",
Collaborator:
name argument should come last. It's a base class argument.

dtype=None,
Collaborator:
Same for dtype (which comes before name).

from_logits=False,
pad_token_id=None,
**kwargs,
):
super().__init__(name=name, dtype=dtype, **kwargs)

if not tf.as_dtype(self.dtype).is_floating:
raise ValueError(
"`dtype` must be a floating point type. "
f"Received: dtype={dtype}"
)

self._cross_entropy = keras.losses.SparseCategoricalCrossentropy(
from_logits=from_logits, reduction="sum"
)

self.pad_token_id = pad_token_id

self._aggregate_cross_entropy = self.add_weight(
name="aggregate_cross_entropy",
Collaborator:
Nit: Spell "crossentropy" in a single word, for consistency. This applies to the weight name and also to Python variable names.

initializer="zeros",
dtype=self.dtype,
)
self._number_of_samples = self.add_weight(
name="number_of_samples", initializer="zeros", dtype=self.dtype
)

def update_state(self, y_true, y_pred, sample_weight=None):
# y_true shape: (batch_size, seq_len)
# y_pred shape: (batch_size, seq_len, vocab_size)
y_true = tf.cast(y_true, self.dtype)
y_pred = tf.cast(y_pred, self.dtype)
batch_size = tf.cast(tf.shape(y_true)[0], self.dtype)

if self.pad_token_id is not None:
sample_weight = tf.cast(
tf.math.logical_not(tf.equal(y_true, self.pad_token_id)),
self.dtype,
)

if sample_weight is not None:
sample_weight = tf.cast(sample_weight, self.dtype)

# Calculate the Cross Entropy Loss.
cross_entropy_value = tf.cast(
Contributor:
I checked out the source code of tf.keras.metrics.SparseCategoricalCrossentropy, and it is doing a WEIGHTED_MEAN reduction (https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/metrics.py#L583), which should automatically set the divisor to the sum over the masks. Could you help verify it with your unit test? Thanks!

Collaborator Author:
Sure. Will try this out! Thanks!

Collaborator Author:
@chenmoneygithub, this particular UT failed:

____________________________________________________ PerplexityTest.test_two_inputs_from_logits ____________________________________________________ 

self = <keras_nlp.metrics.perplexity_test.PerplexityTest testMethod=test_two_inputs_from_logits>

    def test_two_inputs_from_logits(self):
        perplexity = Perplexity(from_logits=True, pad_token_id=0)

        y_true_1 = tf.constant([[1, 3, 0], [2, 1, 3]])
        y_pred_1 = tf.constant(
            [
                [
                    [1.034, 4.797, 2.82, 1.154],
                    [2.258, 1.591, 1.811, 1.852],
                    [3.216, 1.037, 0.3662, 2.7],
                ],
                [
                    [1.363, 1.726, 1.898, 2.582],
                    [1.163, 1.943, 1.761, 1.497],
                    [2.766, 1.453, 2.61, 2.805],
                ],
            ]
        )

        perplexity_val = perplexity(y_true_1, y_pred_1)
        self.assertAlmostEqual(perplexity_val, 2.8788896)

        y_true_2 = tf.constant([[2, 0, 0], [1, 2, 3]])
        y_pred_2 = tf.constant(
            [
                [
                    [2.887, 0.885, 2.973, 2.582],
                    [0.3838, 2.629, 1.91, 1.802],
                    [0.2578, 1.081, 1.125, 2.773],
                ],
                [
                    [1.623, 2.784, 0.2109, 2.66],
                    [2.395, 2.01, 0.252, 1.828],
                    [0.4482, 2.629, 0.9697, 0.998],
                ],
            ]
        )
        perplexity_val = perplexity(y_true_2, y_pred_2)
>       self.assertEqual(perplexity_val, 3.9998498)
E       AssertionError: <tf.Tensor: shape=(), dtype=float32, numpy=3.3319612> != 3.9998498

keras_nlp\metrics\perplexity_test.py:132: AssertionError

Collaborator Author (@abheesht17, Apr 6, 2022):

I did a quick analysis on Colab. Apparently, when sample_weight is provided, the outputs of keras.losses.SparseCategoricalCrossentropy and keras.metrics.SparseCategoricalCrossentropy don't match. Have a look at this: https://colab.research.google.com/drive/1Jh44hylKVmmdqR1Z3B_redGzr1ZJdI3i?usp=sharing.
Is it some precision thingy which is messing up the values?

Let me know what the correct course of action is. Thanks!
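
A minimal reproduction sketch of the comparison described above (TensorFlow 2.x assumed; the exact values are not asserted here):

import tensorflow as tf

y_true = tf.constant([[1, 2, 0]])
y_pred = tf.random.uniform(shape=(1, 3, 4))
sample_weight = tf.constant([[1.0, 1.0, 0.0]])  # mask the padding position

# Loss object with "sum" reduction, divided by the sum of weights (as this PR does).
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True, reduction="sum")
loss_mean = loss_fn(y_true, y_pred, sample_weight=sample_weight) / tf.reduce_sum(sample_weight)

# Metric object, which applies its own weighted-mean reduction internally.
metric = tf.keras.metrics.SparseCategoricalCrossentropy(from_logits=True)
metric.update_state(y_true, y_pred, sample_weight=sample_weight)

print(float(loss_mean), float(metric.result()))  # compare the two reductions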

Contributor:
This is pretty odd. I guess we can stick to the Loss function and open an issue for future investigation.

Collaborator Author:
Coolio :). Thanks! 👍🏼

self._cross_entropy(y_true, y_pred, sample_weight=sample_weight),
self.dtype,
) # scalar

# Divide the loss by the number of non-masked tokens
if sample_weight is not None:
cross_entropy_value = cross_entropy_value / tf.reduce_sum(
sample_weight
) # scalar
else:
cross_entropy_value = cross_entropy_value / (
tf.cast(tf.shape(y_true)[0], self.dtype)
* tf.cast(tf.shape(y_true)[1], self.dtype)
) # scalar

self._aggregate_cross_entropy.assign_add(
batch_size * cross_entropy_value
)
self._number_of_samples.assign_add(batch_size)

def result(self):
if self._number_of_samples == 0:
return 0.0
perplexity_score = tf.exp(
self._aggregate_cross_entropy / self._number_of_samples
)
return perplexity_score

def reset_state(self):
self._aggregate_cross_entropy.assign(0.0)
self._number_of_samples.assign(0.0)
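
For context, a usage sketch (not part of this PR's diff) showing how the metric could be attached to a model via compile(); the toy model and its sizes are illustrative only:

import tensorflow as tf
import keras_nlp

# Toy language model: integer token IDs in, per-token logits over the vocabulary out.
vocab_size = 1000
model = tf.keras.Sequential(
    [
        tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=16),
        tf.keras.layers.Dense(vocab_size),
    ]
)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras_nlp.metrics.Perplexity(from_logits=True, pad_token_id=0)],
)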