Replies: 2 comments 3 replies
-
It looks like the same comments apply to
-
Thanks for the feedback! One of the reasons why it's probably not there is that I build the whole LLM first before taking it apart and discussing the individual components. And I think the embedding layers will complain before the MHA will complain if the inputs exceed the supported context length. I can see your point, though, when looking at Chapter 3 in isolation. I would maybe say adding the truncation as a commented line would be a good compromise. This way it doesn't deviate from the book contents but provides a helpful tip to readers. What do you think?
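For example, something along these lines (just a rough sketch approximating the chapter's `CausalAttention`, not the exact notebook code), where the commented-out line in `forward` is the optional truncation hint:

```python
import torch
import torch.nn as nn


class CausalAttention(nn.Module):
    """Rough sketch approximating the chapter's CausalAttention (not the exact notebook code)."""

    def __init__(self, d_in, d_out, context_length, dropout, qkv_bias=False):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_value = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.dropout = nn.Dropout(dropout)
        # The causal mask is only context_length x context_length
        self.register_buffer(
            "mask",
            torch.triu(torch.ones(context_length, context_length), diagonal=1),
        )

    def forward(self, x):
        # Optional tip for readers (left commented out so it doesn't change
        # the book's behavior): truncate inputs longer than the mask supports.
        # x = x[:, :self.mask.shape[0], :]

        b, num_tokens, d_in = x.shape
        keys = self.W_key(x)
        queries = self.W_query(x)
        values = self.W_value(x)

        attn_scores = queries @ keys.transpose(1, 2)
        attn_scores.masked_fill_(
            self.mask.bool()[:num_tokens, :num_tokens], -torch.inf
        )
        attn_weights = torch.softmax(attn_scores / keys.shape[-1] ** 0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)
        return attn_weights @ values
```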
-
It looks to me like the `CausalAttention` implementation in https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01_main-chapter-code/ch03.ipynb does not handle `context_length` correctly: it limits the mask size to `context_length` x `context_length` but does not truncate `x` accordingly. As a result, `context_length` values greater than 1 and less than the input text length lead to a `RuntimeError`. https://colab.research.google.com/drive/1aAkYHATiSq5jWR6RxdMmVhF9Rb24R89D?usp=sharing demonstrates this problem and a possible solution. Let me know if I can help, e.g. by contributing a PR with this change.