Comment by impossiblefork
3 months ago
It's important to understand that if we 'align' an LLM, we are aligning it in a very total way.
When we do similar things to humans, the humans still retain internal thoughts that we cannot control. But if we add internal thoughts to an LLM, we will be able to align even those.