
They are deterministic at 0 temperature
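For readers following along: "0 temperature" conventionally means greedy decoding, i.e. always taking the highest-probability token, so no randomness comes from the sampling step itself. A minimal sketch with made-up logits, not any provider's actual API:

    import numpy as np

    rng = np.random.default_rng()  # unseeded: sampling varies run to run

    def sample(logits, temperature):
        if temperature == 0:
            return int(np.argmax(logits))      # greedy: the max logit always wins
        probs = np.exp(logits / temperature)   # temperature rescales the logits
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))

    logits = np.array([2.0, 1.0, 0.5])  # hypothetical next-token logits
    print(sample(logits, 0))            # always 0
    print(sample(logits, 1.0))          # varies across runs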



At zero temp there is still non-determinism in practice: floating point addition is not associative (it is commutative, but grouping matters), so you will get varying results due to parallelism.
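For example, with IEEE-754 doubles (plain Python, no libraries), regrouping the same three numbers changes the answer:

    a, b, c = 1e16, -1e16, 1.0
    print((a + b) + c)  # 1.0
    print(a + (b + c))  # 0.0: b + c rounds back to -1e16, so the 1.0 is lost

A parallel reduction effectively picks a grouping at runtime, so the rounding can differ from run to run.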


(Disclaimer: I know literally nothing about LLMs.) Wouldn't there still be issues of sensitivity, though? Like, wouldn't you still have to ensure that the wording of your commands stays exactly the same every time? And with models that take less discrete data (e.g. ChatGPT's new "advanced voice model" that works on audio directly), this seems even harder.


s/advanced voice model/advanced voice mode/ (too late for me to edit my original comment)


They are pretty deterministic then, but they are also pretty useless at 0 temperature.


Not for the leading LLMs from OpenAI and Anthropic.


Not really, not in practice. The order of execution is non-deterministic when running on a cluster, on a GPU, or on more than one CPU core, and rounding errors propagate differently on each run.
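A hedged sketch of that on a GPU (assumes PyTorch and a CUDA device; index_add_ uses atomic adds on CUDA, so the order of the floating-point additions, and hence the rounding, can change between otherwise identical runs):

    import torch

    idx = torch.zeros(1_000_000, dtype=torch.long, device="cuda")  # all into slot 0
    vals = torch.randn(1_000_000, device="cuda")
    a = torch.zeros(1, device="cuda").index_add_(0, idx, vals)
    b = torch.zeros(1, device="cuda").index_add_(0, idx, vals)
    print(torch.equal(a, b))  # may print False: atomicAdd order varies

    # To demand determinism instead (errors on ops with no deterministic kernel):
    torch.use_deterministic_algorithms(True)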



