Holy grail: "Attention Is All You Need" (Vaswani et al., 2017), the paper that started it all: https://arxiv.org/abs/1706.03762
Blog reading list
- For an intuitive, visual explanation of attention: https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
- For an intuitive, visual explanation of the transformer architecture: https://jalammar.github.io/illustrated-transformer/
- For a step-by-step walkthrough of the paper in code (The Annotated Transformer): http://nlp.seas.harvard.edu/annotated-transformer/
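For orientation while working through these, the core operation every resource above builds toward is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V (Eq. 1 of the paper). A minimal NumPy sketch (single head, no masking or batching, toy shapes chosen for illustration):

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_q, n_k) query-key similarities
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V                   # weighted sum of value vectors

# toy example: 2 queries attending over 3 key/value pairs, d_k = 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

The √d_k scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with vanishing gradients; the paper's multi-head attention just runs several of these in parallel on learned projections of Q, K, and V.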