
Avoid nan loss when there are labels with no samples in the training data. #12

Merged · 7 commits into fcakyon:main · Dec 16, 2024

Conversation

@chbeltz (Contributor) commented Nov 22, 2024

Hello there.

I ran into problems today when trying to do a test run with training data that lacked samples for one of the labels. This causes the class-balanced focal loss to come out as nan.

import torch
from balanced_loss import Loss

# class index 4 has zero samples in the training data
samples_per_class = [310.0, 2489.0, 114.0, 17.0, 0.0, 725.0]
pred = torch.tensor([[4.1951e-04, 1.6066e-02, 3.2661e-03, 5.0763e-01, 1.0739e-03, 4.7154e-01],
        [7.6719e-03, 1.1280e-01, 5.8755e-02, 5.5621e-02, 6.6679e-01, 9.8361e-02],
        [3.0145e-03, 9.3653e-01, 1.7860e-02, 2.4776e-03, 3.6712e-03, 3.6448e-02],
        [1.0764e-03, 3.8136e-03, 4.5988e-03, 8.3224e-04, 9.8502e-01, 4.6638e-03],
        [9.5827e-03, 2.3838e-02, 5.1518e-02, 1.0943e-02, 2.9569e-02, 8.7455e-01]])
yb = torch.tensor([[0., 0., 0., 0., 0., 1.],
        [1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0.],
        [1., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0.]])

focal_loss = Loss(
    loss_type="focal_loss",
    samples_per_class=samples_per_class,
    beta=0.999,  # class-balanced loss beta
    fl_gamma=2,  # focal loss gamma
    class_balanced=True,
)

# convert one-hot targets to class indices
print(focal_loss(pred, torch.argmax(yb, dim=-1).to(torch.int64)))

currently yields

tensor(nan)

/home/user/florist-environment/lib/python3.10/site-packages/balanced_loss/losses.py:111: RuntimeWarning: divide by zero encountered in divide
  weights = (1.0 - self.beta) / np.array(effective_num)
/home/user/florist-environment/lib/python3.10/site-packages/balanced_loss/losses.py:112: RuntimeWarning: invalid value encountered in divide
  weights = weights / np.sum(weights) * effective_num_classes
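The nan traces back to the class-balanced weighting itself: for a class with n samples the effective number is 1 − β^n, which is 0 when n = 0, so the per-class weight (1 − β)/(1 − β^n) is infinite, and the subsequent normalization turns it into nan. A standalone sketch of just that weight computation (mirroring the two lines from losses.py quoted above; variable names are mine):

import numpy as np

beta = 0.999
samples_per_class = np.array([310.0, 2489.0, 114.0, 17.0, 0.0, 725.0])

# effective number of samples: 1 - beta^n is 0 when n = 0
effective_num = 1.0 - np.power(beta, samples_per_class)

weights = (1.0 - beta) / effective_num  # inf at the zero-sample class
weights = weights / np.sum(weights) * len(samples_per_class)  # inf / inf -> nan

print(weights)  # [ 0.  0.  0.  0. nan  0.] -- the nan then propagates into the loss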

Adding a safe switch to the Loss class fixes this issue without changing the weights of the non-zero-sample labels relative to leaving out the zero-sample labels entirely. The loss, however, will come out larger than it would with the alternative solution of removing the offending label.

I can see that this is an edge case, but it will be helpful for me, and I imagine it might be for others as well. One could also consider raising a ValueError when zero-sample labels are supplied, hinting at the safe switch.
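For illustration, here is a minimal sketch of what I mean by a safe switch; the parameter name and placement are hypothetical, and the actual change in this PR may differ in detail:

import numpy as np

def class_balanced_weights(samples_per_class, beta, safe=False):
    # hypothetical sketch of the "safe" behavior, not the merged code
    samples = np.asarray(samples_per_class, dtype=float)
    effective_num = 1.0 - np.power(beta, samples)

    if safe:
        present = samples > 0
        weights = np.zeros_like(samples)
        weights[present] = (1.0 - beta) / effective_num[present]
        # normalize over the classes that actually occur, so their
        # relative weights match simply dropping the empty classes
        weights = weights / np.sum(weights) * np.count_nonzero(present)
    else:
        weights = (1.0 - beta) / effective_num  # inf/nan when a class has 0 samples
        weights = weights / np.sum(weights) * len(samples)
    return weights

In this sketch the non-zero-sample classes get exactly the weights they would have if the empty classes were dropped from samples_per_class, while the empty classes are simply assigned weight 0.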

@fcakyon (Owner) commented Dec 16, 2024

Hey @chbeltz thanks for your contribution!

Please reformat your code and we are good to merge 💯

@fcakyon self-requested a review on Dec 16, 2024, 18:21
@fcakyon added the enhancement label on Dec 16, 2024
…ature in Loss class for improved readability
…upgrade GitHub Actions to latest versions for improved performance and compatibility
…ce caching logic. Added installation steps for PyTorch versions 1.13.1 and 2.5.1, and included a step to display installed packages. This improves clarity and consistency across CI configurations.
…installation details for PyTorch versions 1.13.1 and 2.5.1. This enhances documentation accuracy and provides users with essential version information.
@fcakyon merged commit 20f3779 into fcakyon:main on Dec 16, 2024
@fcakyon removed their request for review on Dec 16, 2024, 18:47
@fcakyon (Owner) commented Dec 16, 2024

@chbeltz it's live in balanced-loss==0.1.1 🚀
