
Comment by lsy

12 days ago

I think it's difficult to conceptualize a program's behavior as "buggy" when the program has no specified behavior. The vexing thing about LLMs and image generators is that they ultimately have no intended purpose or principled construction as artifacts: they were mostly discovered through trial-and-error and pressed into service under various misconceptions about what they are — human imitators, fact retrievers, search engines, etc. They are really just statistical regurgitators over whatever the training dataset is, and while that type of artifact sometimes proves useful, it's not something we have yet developed any kind of principled theory around.

One example is DALL-E initially going viral due to its generated image of an astronaut riding a unicorn. Is this a "bug" because unicorns don't exist and astronauts don't ride them? One user wants facts and another wants fancy. The decision about which results are useful in which cases is still highly social and situational, so AIs should never be put in a fire-and-forget scenario, or we will see the biases Dan discusses. AIs are more properly used as sources of statistical potential that can be reviewed and discarded if they aren't useful. This isn't to say that the training sets are not biased, or that work shouldn't be done to rectify that distribution in the interest of a better society. But a lot of the problem is the idea that we can or should trust the result of an LLM or image generator as some source of truth.