Comment by ronsor
16 days ago
If you used all of Wikipedia and HN, you could easily train a model for ~$200 worth of GPU time. The model really shouldn't be bigger than a few hundred million parameters for that quantity of data.
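For anyone who wants to sanity-check that sizing, here is a rough back-of-envelope sketch. It assumes the common rule of thumb of about 20 training tokens per parameter and the usual approximation of 6 * params * tokens training FLOPs. The token count, sustained GPU throughput, and hourly price below are illustrative placeholders, not measured figures, and the function names are made up for this example.

```python
def compute_optimal_params(tokens: float, tokens_per_param: float = 20.0) -> float:
    """Parameter count suggested by a tokens-per-parameter rule of thumb."""
    return tokens / tokens_per_param


def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute using the 6 * N * D estimate."""
    return 6.0 * params * tokens


def gpu_cost_usd(flops: float, flops_per_second: float, usd_per_hour: float) -> float:
    """Cost of running `flops` at a sustained throughput and hourly price."""
    hours = flops / flops_per_second / 3600.0
    return hours * usd_per_hour


if __name__ == "__main__":
    # All three inputs are assumptions; substitute your own numbers.
    corpus_tokens = 5e9        # placeholder size for the combined corpus
    sustained_flops = 1e14     # placeholder effective FLOP/s on one GPU
    price_per_hour = 2.0       # placeholder rental price in USD

    params = compute_optimal_params(corpus_tokens)
    flops = training_flops(params, corpus_tokens)
    cost = gpu_cost_usd(flops, sustained_flops, price_per_hour)

    print(f"params: {params:.3g}")
    print(f"training FLOPs: {flops:.3g}")
    print(f"estimated cost: ${cost:.2f}")
```

With those placeholder inputs the suggested size lands in the low hundreds of millions of parameters. The dollar figure moves a lot with the throughput and price you plug in, so treat it as an order-of-magnitude check only.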