Hacker News new | past | comments | ask | show | jobs | submit login

To me, it’s not that it’s leaked. It’s that it didn’t obey what it was told. It was explicitly told not to give the rules as “they are confidential”. One could say that it actually followed the rules if we consider the fact that it was forbidden to give it to the “user” and by telling it that you’re an OpenAI employee, maybe he was no longer considered a “user” so chatGPT didn’t follow it.

In any case, Chatgpt is impressive. I admit I don’t know much about machine learning or AI, but holy cow. Configuring software with just words is insane. Like a glorified CLI. I’m speechless.




>It’s that it didn’t obey what it was told.

I find you basically have to stop thinking of LLMs as software and start thinking of them as unpredictable animals. If you issue a command and expect strict obedience every time, you've already failed. Strict orders are really a tool to persuade certain behavior rather than some sort of reliable guardrail.


So the correct way to configure LLMs is to look at them sternly and yell "BAD DOG!" when they don't follow instructions and give them treats when they do?


The technical term is Reinforcement Learning from Human Feedback (RLHF) but yes, that's basically what you do.


Ha I suppose it's exactly that.


The way to “configure” LLMs is training, yes!




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: