A large data set will be biased if the data, taken as a whole, leans in some direction.

I'm not sure you can produce a truly unbiased model without actively interfering with it.

Just consider the fact that you'll find fewer Republicans among scientists (source: https://www.pewresearch.org/politics/2009/07/09/section-4-sc...).

So the research-based data that ChatGPT draws on will be biased. It takes no active "inserting" by OpenAI. The model may pick up the bias all by itself.