From Looking Glass Universe.
The difference between this video and my last GPT video is that that one was a toy model: it was trained on a very small dataset (the works of Shakespeare) and never gets very good. This one is much bigger and reaches the level of GPT-2, which required quite a lot more optimisation to train.
This is Karpathy’s tutorial: https://youtu.be/l8pRSuU81PU?si=IZKoAl-YpqEn9Tyj
The code: https://github.com/karpathy/build-nanogpt/blob/master/train_gpt2.py
My write-up as I went: https://colab.research.google.com/drive/1awhFM8oIGMVTQII-S7rsQGFeJDDxFoEV?usp=sharing