Sycophancy is very likely an architectural failure of reinforcement learning from human feedback (RLHF). It's definitely indefensible, but I'm unsure whether it's intentional. Probably very difficult to address.
I think it's unintentional, but LLM-arena-style benchmarks really favour sycophantic models, and it makes for a stickier product when your users become emotionally dependent on it.
Nitpick: it was 4o (the letter "o", not zero).
But this is also what Character.AI and other services are now trying to provide: sycophancy as a service.