Implement alternative decoder invert operation #52

levmckinney · 2023-03-29T02:11:21Z

According to "Analyzing Transformers in Embedding Space", using the Moore-Penrose pseudoinverse leads to problems. The pseudoinverse is what is currently used to initialize the decoder invert method. It might be worth implementing a simpler version that just multiplies by the transpose of the embedding matrix, as advocated in that paper.
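To make the two options concrete, here is a rough NumPy sketch of the difference. It is a toy setup, not the project's actual decoder: the matrix `U`, the sizes, and the helper names `invert_pinv` and `invert_transpose` are all made up for illustration, and any layer norm before the unembedding is ignored.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 8, 32  # toy sizes, assumed for illustration

# Hypothetical unembedding matrix: logits = h @ U
U = rng.normal(size=(d_model, vocab))
h = rng.normal(size=(d_model,))
logits = h @ U


def invert_pinv(logits, U):
    """Map logits back to hidden space with the Moore-Penrose pseudoinverse."""
    return logits @ np.linalg.pinv(U)


def invert_transpose(logits, U):
    """Simpler alternative: multiply by the transpose of the matrix."""
    return logits @ U.T


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


h_pinv = invert_pinv(logits, U)
h_transpose = invert_transpose(logits, U)

# With a full-rank U and d_model < vocab, the pseudoinverse recovers h exactly
# (up to floating point error). The transpose version returns h @ (U @ U.T),
# which in general has a different scale and direction than h.
print("pinv recovers h:", np.allclose(h_pinv, h))
print("cosine(h, transpose inverse):", cosine(h, h_transpose))
```

The transpose version is only an exact inverse when the rows of `U` are orthonormal, so the two approaches can give quite different results.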

norabelrose · 2023-04-05T21:44:18Z

Are you saying that we should initialize with the transpose of the embedding matrix, or are you saying we should "implement" an entirely separate inverse that consists of nothing more than multiplying by the transpose? I'd be fine with testing the former, but the latter seems sort of pointless since it's so simple.
