Comment by nutrientharvest

16 days ago

Funnily enough, of all the models I've tried, the best by far at writing porn has been not one of the ones uncensored and tuned exactly for that purpose, but stock Command R - whose landing page lists such exciting uses as "suggest example press releases" and "assign a category to a document".

> uncensored and tuned exactly for that purpose

Are they tuning too, or just removing all restrictions they can get at?

Because my worry isn't that I can't generate porn, but that censorship will mess up all the answers. This study seems to say the latter.

  • Usually "uncensored" models have been made by instruction tuning a model from scratch (i.e. starting from a pretrained-only model) on a dataset which doesn't contain refusals, so it's hard to compare directly to a "censored" model - it's a whole different thing, not an "uncensored" version of one.

    More recently, a technique called "orthogonal activation steering", aka "abliteration", has emerged; it is claimed to edit refusals out of a model without otherwise affecting it. I don't know how well that works, though, since it's only been around for a few weeks.

    • I've seen some of the "abliterated" models flat-out refuse to write novels; other times they just skip certain plot elements. Non-commercial LLMs seem to be hit or miss... (Is that a good thing? I don't know, I just screw around with them in my spare time.)

      I'll try command-r, though. It wasn't on my list because nothing about it suggested what it was good at.