Add a perplexity metric #63
Hugging Face resource: https://huggingface.co/docs/transformers/perplexity
@abheesht17's colab: http://go/colabx-drive/1BH1lTw_qLK6671oWaoU15IrUKSQfd6oE
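For reference, the core definition those resources build on: perplexity is the exponential of the mean per-token negative log-likelihood (cross-entropy). A minimal sketch in TensorFlow (the function name and shapes are illustrative, not an existing API):

```python
import tensorflow as tf

def perplexity_of_sequence(y_true, logits):
    """y_true: (seq_len,) int token ids; logits: (seq_len, vocab_size)."""
    # Per-token negative log-likelihood, shape (seq_len,).
    nll = tf.keras.losses.sparse_categorical_crossentropy(
        y_true, logits, from_logits=True
    )
    # Perplexity = exp(mean NLL over the sequence).
    return tf.exp(tf.reduce_mean(nll))
```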
Looking through the colab, I think method 1 would be the correct approach, over method 2. Perplexity seems to me to be defined on a single input sequence, so averaging over all sequences in the batch (and then over all batches) seems reasonable. I think @chenmoneygithub was going to take a look here too, so tagging for thoughts.
Hey, I can't access the notebook link (http://go/colabx-drive/1BH1lTw_qLK6671oWaoU15IrUKSQfd6oE). So, putting the link here again: https://colab.research.google.com/drive/1BH1lTw_qLK6671oWaoU15IrUKSQfd6oE?usp=sharing
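Since the colab itself isn't reproduced in this thread, here is my reading of the two methods under discussion, as a hedged sketch: method 1 computes a perplexity per sequence and then averages those, while method 2 exponentiates the mean loss over every token in the batch. These generally differ, because the exponential of a mean is not the mean of exponentials.

```python
import tensorflow as tf

def perplexity_method1(y_true, logits):
    # Per-sequence perplexity first, then an average over the batch.
    # y_true: (batch, seq_len); logits: (batch, seq_len, vocab).
    nll = tf.keras.losses.sparse_categorical_crossentropy(
        y_true, logits, from_logits=True
    )  # (batch, seq_len)
    per_sequence_ppl = tf.exp(tf.reduce_mean(nll, axis=-1))  # (batch,)
    return tf.reduce_mean(per_sequence_ppl)

def perplexity_method2(y_true, logits):
    # Exponential of the mean loss over all tokens in the batch.
    nll = tf.keras.losses.sparse_categorical_crossentropy(
        y_true, logits, from_logits=True
    )
    return tf.exp(tf.reduce_mean(nll))
```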
Agreed, method 1 looks correct. One note about masking: checking
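On the masking point, one way it could enter the computation (a sketch under the assumption of a 0/1 padding mask; not the final implementation): padded positions should contribute neither to the summed loss nor to the token count, so each sequence's mean loss is taken over real tokens only.

```python
import tensorflow as tf

def masked_perplexity(y_true, logits, mask):
    # mask: (batch, seq_len), 1 for real tokens, 0 for padding (illustrative).
    nll = tf.keras.losses.sparse_categorical_crossentropy(
        y_true, logits, from_logits=True
    )
    mask = tf.cast(mask, nll.dtype)
    # Average the loss over real tokens only, per sequence.
    per_sequence_nll = tf.reduce_sum(nll * mask, axis=-1) / tf.reduce_sum(mask, axis=-1)
    return tf.reduce_mean(tf.exp(per_sequence_nll))
```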
Ah, nice 👍🏼. I'll make the necessary changes and open a PR.
Hello, @mattdangerw, @chenmoneygithub! I've made some changes to the class. Please see this notebook: https://colab.research.google.com/drive/1XV1h5aeiy5IlHoQFjDTJ45hRC8wMSf16?usp=sharing. I've followed this script: https://github.com/huggingface/transformers/blob/main/examples/research_projects/codeparrot/scripts/validation_loss.py#L56-L69. In the notebook, I've compared our results with HF's results. The perplexity scores returned by both are very close to each other! I'll open a PR for this now :D
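A rough sketch of what a streaming Keras metric along these lines could look like (the constructor arguments and internal weight names here are my assumptions, not the final keras_nlp API). It follows method 1: accumulate per-sequence perplexities and the sequence count across batches, treating `sample_weight` as a per-token mask.

```python
import tensorflow as tf

class Perplexity(tf.keras.metrics.Metric):
    """Streaming per-sequence perplexity, averaged over sequences (a sketch)."""

    def __init__(self, from_logits=True, name="perplexity", **kwargs):
        super().__init__(name=name, **kwargs)
        self.from_logits = from_logits
        self._ppl_sum = self.add_weight(name="ppl_sum", initializer="zeros")
        self._num_sequences = self.add_weight(name="num_sequences", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        # y_true: (batch, seq_len) token ids; y_pred: (batch, seq_len, vocab) logits.
        nll = tf.keras.losses.sparse_categorical_crossentropy(
            y_true, y_pred, from_logits=self.from_logits
        )
        if sample_weight is not None:
            # Treat sample_weight as a padding mask: average over real tokens only.
            mask = tf.cast(sample_weight, nll.dtype)
            per_seq_nll = tf.reduce_sum(nll * mask, axis=-1) / tf.reduce_sum(mask, axis=-1)
        else:
            per_seq_nll = tf.reduce_mean(nll, axis=-1)
        self._ppl_sum.assign_add(tf.reduce_sum(tf.exp(per_seq_nll)))
        self._num_sequences.assign_add(tf.cast(tf.shape(nll)[0], nll.dtype))

    def result(self):
        return self._ppl_sum / self._num_sequences
```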
Splitting this issue out from #38.
We should add a perplexity metric as `keras_nlp.metrics.Perplexity`.
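Once the metric lands, usage would presumably follow the standard Keras metric pattern. An illustrative example (the `from_logits` argument is an assumption, not a confirmed signature):

```python
import tensorflow as tf
import keras_nlp

# Hypothetical usage of the proposed metric.
perplexity = keras_nlp.metrics.Perplexity(from_logits=True)

y_true = tf.random.uniform((2, 5), maxval=10, dtype=tf.int32)  # (batch, seq_len) token ids
y_pred = tf.random.normal((2, 5, 10))                          # (batch, seq_len, vocab) logits

perplexity.update_state(y_true, y_pred)
print(float(perplexity.result()))
```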