I just watched "Deep Dive into LLMs like ChatGPT" by Andrej Karpathy, and things make much more sense! Is this correct about RL? (I asked ChatGPT.)
https://chatgpt.com/share/67d995f4-a818-800a-aac1-4a243e1cd676
submitted by /u/BukHunt