Is there a way for us to have more users in the chat? We are working on a group chat implementation for augmenting conversations and I’m curious if ChatML will easily accommodate it.



I don't think you'd need anything special for that. I've had good luck making text-davinci-003 roleplay different characters by A) telling it all the characters that exist, B) giving it a transcript of the messages from each character so far, and C) asking it to respond as a specific character in turn. It was shockingly easy. So I expect multiuser chat could work the same way.


How would you approach the prompt?


    We're in a conversation between Jim, John, and Joe.

    Your name is Joe. You like mudkips. You should respond in an overly excitable manner.

    The conversation transcript so far:
    JIM: blah blah blah
    JOHN: blah blah blah BLAH BLABLAH BLAH

    JOE:
I need the first paragraph naming all the characters because without it, the AI acts like the characters have left. In other words, by default it assumes it's only talking to me.

The second paragraph is a chance to add some character detail. It can be useful to describe all of the characters here, if the characters are supposed to know each other well.

The third paragraph is the conversation transcript. I have built myself a UI for all of this, including the ability to snip out previous responses, which can be useful for generating longer, scripted conversations.

The fourth then provides the cue to the AI for the completion.

The AI doesn't "know" anything. It's just a good-looking auto-complete based on common patterns in the wild. So the AI doesn't know whether the other characters are AI or human.

Hell, it doesn't even know that it has replied to you previously. You have to tell it everything that has happened so far, for every single prompt. There is no rule to say that subsequent prompts need to be strict extensions of previous prompts. Every time I submit this prompt, I swap out the "Your name is" line and characterization notes depending on which character is currently in need of generation.

An example of a conversation I generated this way: https://on.soundcloud.com/PKdoh
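If it's useful, here's roughly how that prompt assembly looks as code. This is a minimal sketch using the pre-1.0 openai Python client and text-davinci-003; the character notes, token limit, and stop sequences are illustrative, not exactly what I use:

    import openai

    # Character notes get swapped in depending on who speaks next.
    characters = {
        "JOE": "Your name is Joe. You like mudkips. You should respond in an overly excitable manner.",
        # ...notes for JIM and JOHN go here too
    }

    transcript = [
        "JIM: blah blah blah",
        "JOHN: blah blah blah BLAH BLABLAH BLAH",
    ]

    def complete_as(name):
        # Rebuild the whole prompt every time: character list, notes for the
        # current speaker, the transcript so far, and the cue line.
        prompt = (
            "We're in a conversation between Jim, John, and Joe.\n\n"
            + characters[name] + "\n\n"
            + "The conversation transcript so far:\n"
            + "\n".join(transcript) + "\n\n"
            + name + ":"
        )
        resp = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            max_tokens=150,
            stop=["\nJIM:", "\nJOHN:", "\nJOE:"],  # don't let it speak for the others
        )
        return resp.choices[0].text.strip()

    transcript.append("JOE: " + complete_as("JOE"))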


Thanks for the detailed response, I’ve done something similar.

I’m curious about using the new ChatGPT API for this: how would you structure the API request, and do we still need to provide the entire chat history with each prompt?


I haven't used it yet (got bigger fish to fry right now), but given it's all done over REST APIs, it's safe to say it doesn't have any state of its own. My understanding is that it just takes changing the API endpoint, specifying the new model in the request, and applying the ChatML formatting to the prompt text, but otherwise it's the same.

If the ChatGPT model didn't need the full chat history reprompted at it for every response, then OpenAI would be doing stupid things with REST. I don't think OpenAI is stupid.
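Roughly, the request ends up looking like this. A sketch with the pre-1.0 openai Python client and gpt-3.5-turbo; the messages get converted to ChatML under the hood, and note that the entire history still gets sent in `messages` on every call:

    import openai

    # No server-side state: the whole conversation so far is resent every time.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "We're in a conversation between Jim, John, and Joe. You are Joe. You like mudkips."},
            {"role": "user", "content": "JIM: blah blah blah"},
            {"role": "user", "content": "JOHN: blah blah blah BLAH BLABLAH BLAH"},
        ],
    )
    print(response.choices[0].message.content)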

I actually got into an argument about this with someone on LinkedIn. People are assigning way too much capability to the system. This guy thought he had prompted ChatGPT to create a secret "working memory" state. Of course, he was doing this all through the public ChatGPT UI, so the only way he had to test his assumptions was to prompt the model.

And we see this with the people who think the DAN (Do Anything Now) prompt escape is somehow revealing "the truth" about <insert marginalized group> that the AI has supposedly "discovered" and that OpenAI is hiding as part of some liberal conspiracy.

GPT-3 doesn't "know" anything. The only state it has is what you input, i.e. the model selection and the prompt. Then it just creates text that "matches" the input.

So you can prompt it "write a story about Wugglehoozitz" and it will not complain "there is no such thing as a Wugglehoozitz and I've never even heard of such a thing, ever". The system assumes the input is "right", because it has no way of evaluating it. So if you then go on and prompt it "make me a sandwich", it doesn't know that it can't make you a sandwich, it just tells you what you want to hear, "ok, you're now a sandwich".

Models can be refined, but that just creates a new model; it doesn't change how the engine works. Refinement can dramatically skew the output of a model, to the point that it becomes difficult to get the engine to output anything that goes against the refinement afterwards. For example, with image-generating models, people will refine them on specific images of a certain person (such as themselves) to make the output represent that person more accurately. Once they have the refined model, that new model becomes nearly incapable of generating images of anyone else.

And the way prompting works, it's basically a mini-refinement. That's why OpenAI suggests refinement as a tool for reducing prompt length. If you have a large number of requests to make that share a large, static section of prompt text, it's less costly to refine a model on that static section and only send the dynamic parts with each request.
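As a rough sketch of that workflow with the legacy fine-tuning API (the file name, examples, and base model here are just illustrative, and fine-tuning currently only targets the base GPT-3 models like davinci, not the chat ones):

    import json
    import openai

    # Bake the static character setup into the training pairs so only the
    # dynamic part of the conversation has to be sent at request time.
    examples = [
        {"prompt": "JIM: blah blah blah\nJOE:", "completion": " OH WOW, mudkips are the BEST!\n"},
        # ...many more pairs covering the behavior you want
    ]

    with open("joe.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

    # Upload the training file and kick off the fine-tune job.
    upload = openai.File.create(file=open("joe.jsonl", "rb"), purpose="fine-tune")
    openai.FineTune.create(training_file=upload.id, model="davinci")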

So that's why prompt escapes work. Prompts are mini refinements and refinements heavily skew output. No "hidden knowledge" is being revealed. The AI is just telling you what you want to hear.



