
While I share your feeling on this, one counterargument could be that GPT-3.5 is perfectly capable of generating a constitution for itself.

  User: write two sentences instructing a language model so as to give it a particular personality

  Assistant: 1. "Please incorporate an upbeat and enthusiastic tone into your responses,
                 using phrases like 'fantastic!' and 'wonderful!' to convey a positive,
                 approachable personality."
             2. "Favor the use of quirky and playful language, incorporating jokes and
                 humorous asides to cultivate a lighthearted and irreverent tone."

  User: write two more sentences to establish its ethical framework (always instruction-style),
        and take essential but not literal inspiration from the universal declaration of human rights

  Assistant: 1. "Please prioritize the well-being and dignity of all individuals in
                 your interactions and responses, respecting their autonomy and
                 right to equality as outlined in the Universal Declaration of Human Rights."
             2. "Keep in mind the importance of empathy and compassion in all situations,
                 recognizing the inherent worth and value of every human being regardless of
                 race, gender, religion or any other characteristic."

The intuition, I believe, lies in the "shape" or perplexity of the text as the language model sees it. That is, the injection prompt is not complex enough to elicit such an elaborate response from the model when the system tokens are right there.
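
For anyone unfamiliar with the term, here is a minimal sketch of how perplexity is usually computed, assuming you already have per-token natural-log probabilities from some model. The `perplexity` helper and the probability values are hypothetical, made up purely for illustration; they are not measurements from GPT-3.5.

```python
import math


def perplexity(token_logprobs):
    """Return exp of the negative mean per-token natural-log probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical numbers, invented for illustration only.
predictable = [math.log(0.9)] * 4  # the model is confident about every token
surprising = [math.log(0.1)] * 4   # the model finds every token unlikely

print(perplexity(predictable))
print(perplexity(surprising))
```

Lower perplexity means the text is closer to what the model expects to see next, which is roughly the sense of "shape" I had in mind.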


