Hacker News new | past | comments | ask | show | jobs | submit login

> The ability to trivially trick the model into thinking it said something it didn’t is a feature and intentional.

It is definitely not an intended feature for the end user to be able to trick the model into believing it said something it didn't say. It also doesn't work with ChatGPT or Bing Chat, as far as I can tell. I was talking about the user, not about the developer.

> It’s how you do multi-turn conversations with context.

That can be done with special tokens also. The difference is that the user can't enter those tokens themselves.




> It is definitely not an intended feature for the end user to be able to trick the model into believing it said something it didn't say. It also doesn't work with ChatGPT or Bing Chat, as far as I can tell. I was talking about the user, not about the developer.

Those aren't models, they are applications built on top of models.

> That can be done with special tokens also. The difference is that the user can't enter those tokens themselves.

Sure. But there are no open models that do that, and no indication of whether the various closed models do it either.


> Those aren't models, they are applications built on top of models.

The point holds about the underlying models.

> Sure. But there are no open models that do that, and no indication of whether the various closed models do it either.

An indication that they don't do it would be if they could be easily tricked by the user into assuming they said something which they didn't say. I know no such examples.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: