
Refactor RotaryEmbedding and GPTNeoXAttention #1101

Merged (8 commits into keras-team:master on Jul 6, 2023)

Conversation

@shivance (Collaborator) commented Jun 28, 2023

@shivance changed the title from "Fix rotary embedding layer" to "Fix + Refactor RotaryEmbedding and GPTNeoXAttention" on Jun 28, 2023
@shivance requested a review from @mattdangerw on Jun 30, 2023 at 17:30
@shivance changed the title from "Fix + Refactor RotaryEmbedding and GPTNeoXAttention" to "Refactor RotaryEmbedding and GPTNeoXAttention" on Jun 30, 2023
@mattdangerw (Member) left a comment

Nice!

@mattdangerw (Member) commented

/gcbrun

@shivance (Collaborator, Author) commented Jul 1, 2023

Good! All checks pass.

@mattdangerw (Member) left a comment

Thanks!

@mattdangerw merged commit f68c256 into keras-team:master on Jul 6, 2023
@shivance added a commit to shivance/keras-nlp that referenced this pull request on Jul 6, 2023:
* fix rotary emb

* refactor + remove unnecessary typecast

* fix formatting

* refactor

* formatting fix

* refactoring rotary emb

* added a kwarg in super().__init__()
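For readers unfamiliar with the layer these commits refactor, here is a minimal NumPy sketch of rotary position embedding (RoPE) in the rotate-half formulation used by GPT-NeoX-style attention. This is illustrative only, not the code from this PR; the function name, shapes, and the `max_wavelength` name and default are assumptions based on the RoPE paper.

```python
# Minimal RoPE sketch (rotate-half variant). Not the PR's code.
import numpy as np

def rotary_embedding(x, max_wavelength=10000):
    """Apply RoPE to `x` of shape (seq_len, dim); `dim` must be even."""
    seq_len, dim = x.shape
    # One inverse frequency per feature pair.
    inv_freq = 1.0 / (max_wavelength ** (np.arange(0, dim, 2) / dim))
    # Rotation angle for every (position, frequency) pair: (seq_len, dim // 2).
    angles = np.outer(np.arange(seq_len), inv_freq)
    cos = np.concatenate([np.cos(angles), np.cos(angles)], axis=-1)
    sin = np.concatenate([np.sin(angles), np.sin(angles)], axis=-1)
    # "Rotate half": (x1, x2) -> (-x2, x1) across the split halves.
    half1, half2 = np.split(x, 2, axis=-1)
    rotated = np.concatenate([-half2, half1], axis=-1)
    return x * cos + rotated * sin

q = np.random.randn(8, 64)        # (seq_len, head_dim)
print(rotary_embedding(q).shape)  # (8, 64)
```

The last commit, "added a kwarg in super().__init__()", appears to refer to the standard Keras pattern of forwarding constructor kwargs to the base layer. A hedged sketch, with the class and argument names being assumptions:

```python
from keras import layers

class RotaryEmbeddingSketch(layers.Layer):
    # Hypothetical constructor: forwarding **kwargs lets callers pass
    # base-layer arguments such as `name` and `dtype` through.
    def __init__(self, max_wavelength=10000, **kwargs):
        super().__init__(**kwargs)
        self.max_wavelength = max_wavelength
```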
@shivance deleted the fix-attention branch on July 13, 2023