Comment by Mathnerd314
14 days ago
That's what the "base" models are: pure token prediction on huge corpora. I use them a fair amount. It does take some experimentation to find input formats that work, but the base models are way smarter and don't have any refusals. Honestly it is a bit weird: everyone complains about RLHF etc., but the non-instruct models are right there if you look for them. I've been in a few Discord chats and it seems people are just spoiled; they use bad prompt formats and give up when a base model doesn't work on the first try the way an instruct model does.
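A minimal sketch of what an input format for a base model might look like, assuming the usual trick of framing the task as a document to be continued rather than an instruction to be followed. The function name, the header text, and the translation task are all hypothetical, chosen only to show the shape of the prompt; no particular model or API is implied.

```python
def build_completion_prompt(examples, query, header="English to French translations"):
    """Frame a task as a document that a base model can continue.

    Hypothetical helper: the labels and header are illustrative assumptions,
    not a format taken from any specific model's documentation.
    """
    lines = [header, ""]
    for source, target in examples:
        # Each solved example establishes the pattern the model should repeat.
        lines.append(f"English: {source}")
        lines.append(f"French: {target}")
        lines.append("")
    # End on an open field so the most likely continuation is the answer itself.
    lines.append(f"English: {query}")
    lines.append("French:")
    return "\n".join(lines)


prompt = build_completion_prompt(
    [("cheese", "fromage"), ("good morning", "bonjour")],
    "thank you",
)
print(prompt)
```

With a prompt like this you would typically cut the generation off at the first newline, since a base model will otherwise keep inventing further examples. Whether this particular layout works is exactly the kind of thing that takes experimentation.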