I built a GPT model in 10 hours

From Looking Glass Universe.

Karpathy’s tutorial: https://www.youtube.com/watch?v=kCc8FmEb1nY
Link to the Colab (make a copy of it to fill it in): https://colab.research.google.com/drive/10TjJAOO-oCGIWEScYF7HTbGInVqJm3yN?usp=sharing

GPT theory videos:
3Blue1Brown’s super helpful videos on this: https://youtu.be/eMlx5fFNoYc?si=_7XgY5nOpVaRdThU and https://youtu.be/wjZofJX0v4M?si=Cc4QkmzN6rsJZuOc
Algorithmic Simplicity’s video gives great intuition: https://youtu.be/kWLed8o5M2Y?si=2WqR04FblltNxbF-

PyTorch nn.Module tutorial: https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html

When I didn’t understand things, I mostly just asked ChatGPT to explain them to me. The 4o model has just become freely available, so give it a go!