These models have gone beyond the level of "token predictors". At the level of ChatGPT, the model has itself, internally, acquired "concepts" that it refers to in the conversation. It "understands" concepts like "you", "me", and "them", and can, for the most part, apply them correctly to the entities in the conversation.
I believe that answers your question. I could be wrong: errare humanum est.
It may well be that we humans simply cannot find uses of "you," "me," and "them" that require understanding of the concepts rather than statistical correlation. I suspect this because "you," "me," and "them" are very frequent words, and most of their uses are well covered by thousands of training examples.
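To make the "statistical correlation" position concrete, here is a minimal, hypothetical sketch: a bigram model over a tiny made-up corpus (the corpus and function names are illustrative, not from any real system). It reproduces frequent pronoun patterns that look like correct usage, while having no representation at all of who "me" or "you" refers to.

```python
from collections import Counter, defaultdict

# Toy corpus (invented for illustration): a few sentences with
# frequent pronoun patterns.
corpus = (
    "you told me that they would help you . "
    "i asked you to remind me about them . "
    "they promised me that you would call them ."
).split()

# Count bigram frequencies over the toy corpus.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev: str) -> str:
    """Return the most frequent continuation of `prev` in the corpus."""
    if prev not in bigrams:
        return "<unk>"
    return bigrams[prev].most_common(1)[0][0]

# Frequent patterns come out looking "correct"...
print(predict("told"))    # -> "me" (looks like mastery of pronoun usage)
print(predict("remind"))  # -> "me"

# ...but there is no notion of reference here: the model cannot decide
# whether "me" picks out the speaker or the listener, and it has no
# fallback for a context it has never counted.
print(predict("congratulate"))  # -> "<unk>"
```

The point of the sketch is that for very frequent words, sheer coverage of surface patterns can mimic correct usage, which is exactly why such examples cannot by themselves settle whether any understanding is involved.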
No, we're past that point. It's no longer the most useful way to describe these systems; we need to accept that they already have some sort of "understanding" that is very similar, if not identical, to what we ourselves mean by understanding.