Comment by ronsor
12 days ago
If you used all of Wikipedia and HN, you could easily train a model for ~$200 worth of GPU time. The model really shouldn't be bigger than a few hundred million parameters for that quantity of data.
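For anyone who wants to sanity-check that kind of estimate, here is a rough back-of-envelope sketch. It is not from the comment: it assumes the widely quoted rule of thumb of about 20 training tokens per parameter and the common approximation that training costs about 6 FLOPs per parameter per token. The corpus size, GPU throughput, and hourly price are hypothetical placeholders, so swap in your own numbers.

```python
# Back-of-envelope sizing sketch. Every constant here is an assumption
# chosen for illustration, not a figure taken from the comment.

TOKENS_PER_PARAM = 20       # assumed rule of thumb: ~20 tokens per parameter
FLOPS_PER_PARAM_TOKEN = 6   # assumed approximation: training FLOPs ~ 6 * N * D


def compute_optimal_params(tokens: float) -> float:
    """Parameter count suggested by the tokens-per-parameter rule of thumb."""
    return tokens / TOKENS_PER_PARAM


def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for one pass over the data."""
    return FLOPS_PER_PARAM_TOKEN * params * tokens


def gpu_cost_usd(flops: float, sustained_flops_per_sec: float,
                 usd_per_gpu_hour: float) -> float:
    """Dollar cost given sustained throughput and an hourly GPU price."""
    gpu_hours = flops / sustained_flops_per_sec / 3600
    return gpu_hours * usd_per_gpu_hour


# Hypothetical inputs: replace with measured values for your corpus and hardware.
corpus_tokens = 6e9          # placeholder token count, not a measured figure
params = compute_optimal_params(corpus_tokens)
flops = training_flops(params, corpus_tokens)
cost = gpu_cost_usd(flops, sustained_flops_per_sec=1e14, usd_per_gpu_hour=2.0)

print(f"params: {params:.2e}, FLOPs: {flops:.2e}, cost: ${cost:.0f}")
```

With these placeholder inputs the sketch suggests a model of about 300 million parameters. The result is only as good as the inputs, so treat it as a way to explore the trade-off rather than as a measurement.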