RL training amounts to pattern matching.
How does an LLM decode Base64? By running a decode algorithm? No - by predictive pattern matching.
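
To make the contrast concrete, here is a rough sketch of the two routes. It is only an illustration, not a claim about what happens inside any particular model: `decode_algorithmic` does the explicit bit arithmetic, while `build_chunk_table` and `decode_by_lookup` are hypothetical helpers that stand in (very crudely) for pattern matching by recalling 4-character chunks seen before.

```python
import base64
import string
from typing import Dict, List, Optional

# Standard Base64 alphabet: index in this string = 6-bit value.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_algorithmic(text: str) -> bytes:
    """The 'algorithm' route: accumulate 6 bits per character, emit 8 at a time."""
    bits = 0    # bit buffer
    nbits = 0   # number of valid bits currently in the buffer
    out = bytearray()
    for ch in text.rstrip("="):
        bits = (bits << 6) | ALPHABET.index(ch)
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
            bits &= (1 << nbits) - 1  # keep only the leftover bits
    return bytes(out)


def build_chunk_table(corpus: List[bytes]) -> Dict[str, bytes]:
    """Toy 'training': remember every aligned 3-byte chunk and its encoding."""
    table: Dict[str, bytes] = {}
    for raw in corpus:
        usable = len(raw) - len(raw) % 3  # ignore a trailing partial chunk
        for i in range(0, usable, 3):
            chunk = raw[i:i + 3]
            table[base64.b64encode(chunk).decode("ascii")] = chunk
    return table


def decode_by_lookup(text: str, table: Dict[str, bytes]) -> Optional[bytes]:
    """The 'pattern matching' route: recall chunks, fail on anything unseen."""
    out = bytearray()
    for i in range(0, len(text), 4):
        piece = table.get(text[i:i + 4])
        if piece is None:
            return None  # chunk never seen, so nothing to recall
        out += piece
    return bytes(out)


if __name__ == "__main__":
    table = build_chunk_table([b"hello world!"])
    seen = base64.b64encode(b"lo hel").decode("ascii")    # recombines seen chunks
    unseen = base64.b64encode(b"abc").decode("ascii")     # chunk not in the table
    print(decode_algorithmic(unseen), decode_by_lookup(seen, table),
          decode_by_lookup(unseen, table))
```

The lookup route handles new strings only when they are built from chunks it has already seen, whereas the algorithmic route handles any valid input. A real model's behaviour is far richer than a dictionary, so treat the table as a caricature of the point, not a model of it.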
An LLM isn't predicting what a person thinks - it's predicting what a person does.