Transformer-decoder language models
Published:
Some notes on how transformer-decoder language models work, taking GPT-2 as an example, with lots of references for digging deeper.