This isn’t about saying “return the original text”; this is about assuming LLMs understand language, and they don’t. Telling an LLM “don’t do these things” will be about as effective as telling it “don’t hallucinate” or asking it “how many ‘r’s are in ‘strawberry’”.
To affirm or refute that claim, we’d first need to define what “understanding” means.
The example you gave can be explained in ways other than “it doesn’t understand”.
Take “how many ‘r’s in ‘strawberry’”: LLMs see tokens, not letters, and their training data contains very little information about which letters make up each token.
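To illustrate the point, here is a minimal sketch (assuming the `tiktoken` package and its `cl100k_base` encoding, neither of which is named in the post) showing that a word reaches the model as subword token ids rather than letters:

```python
# Sketch: show how a word is split into subword tokens.
# Assumes `pip install tiktoken`; the model only ever sees the token ids.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

word = "strawberry"
token_ids = enc.encode(word)

# Print each token id and the text it covers. Counting letters requires
# reasoning across these opaque pieces, which is not directly represented
# in the training signal.
for tid in token_ids:
    piece = enc.decode_single_token_bytes(tid).decode("utf-8", errors="replace")
    print(tid, repr(piece))
```

Running something like this shows the word arriving as a handful of multi-character pieces, which is why letter-counting questions are a poor probe of “understanding”.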