Comment by Lockal

5 hours ago

Classic implementations of LLMs (like llama.cpp) and of diffusion image models allow you to specify a seed, and as long as they run the same code on the same hardware with the same level of parallelism, the result will be the same. This is even checked in autotests [1]. The thing that produces randomized results in floating-point operations (excluding bugs) is known as "stochastic rounding": it is pretty novel (from an implementation standpoint), and it too can be controlled by a seed. Other than that, I've never seen hardware that has non-deterministic (perhaps stochastic) output, but maybe we will see it in the next few years.

[1] https://github.com/ggerganov/llama.cpp/blob/master/examples/...
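To make the "controlled by a seed" point concrete, here is a minimal Python sketch of stochastic rounding (my own simplification: it rounds to the integer grid, whereas real implementations round to a low-precision float format such as bfloat16):

    import math
    import random

    def stochastic_round(x: float, rng: random.Random) -> int:
        # Round down with probability (1 - frac), up with probability frac,
        # so the *expected* value of the result equals x.
        lo = math.floor(x)
        frac = x - lo
        return lo + (1 if rng.random() < frac else 0)

    rng1, rng2 = random.Random(42), random.Random(42)
    run1 = [stochastic_round(2.3, rng1) for _ in range(10)]
    run2 = [stochastic_round(2.3, rng2) for _ in range(10)]
    assert run1 == run2  # same seed -> identical rounding decisions
    print(run1)          # a mix of 2s and 3s, averaging toward 2.3

The rounding decisions are random in distribution but fully reproducible once the RNG is seeded, which is why stochastic rounding by itself does not have to break determinism.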

Do you know why OpenAI are unable to provide a "seed" parameter that's fully deterministic? I had assumed it was for the reason I described, but I'm not confident in my assertion there.

  • The original question was about LLMs, and what OpenAI provides goes beyond a bare LLM (they have caches, "memory", the undocumented o1 chain-of-thought, and so on). Also, I am pretty sure they have collected a zoo of hardware with autoscaling and per-client parallelism. And if you change the number of threads, even basic building blocks like matrix multiplication produce different results.
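To illustrate that last point: floating-point addition is not associative, so grouping partial sums differently changes the final bits. A minimal Python sketch, with chunked serial sums standing in for actual per-thread partial reductions (the data is synthetic):

    import random

    def chunked_sum(xs, nthreads):
        # Mimic a parallel reduction: each "thread" sums its own chunk,
        # then the partial sums are combined. Changing nthreads changes
        # the grouping, and float addition is not associative.
        chunk = (len(xs) + nthreads - 1) // nthreads
        partials = [sum(xs[i:i + chunk]) for i in range(0, len(xs), chunk)]
        return sum(partials)

    random.seed(0)  # deterministic inputs, so only the grouping varies
    xs = [random.uniform(-1.0, 1.0) for _ in range(1_000_000)]

    # Typically False: the two sums differ in the last bits
    # even though the input data is bit-identical.
    print(chunked_sum(xs, 1) == chunked_sum(xs, 8))

An autoscaling service that reshapes its parallelism per request is effectively re-running this experiment with a different nthreads each time, so bit-exact reproducibility is lost even with a fixed sampling seed.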