
I find it better overall; it's my default.

It's also what people think in blind tests: https://arena.lmsys.org



Take those results with a large grain of salt. It's dead simple to figure out which response is the new model purely by the speed at which the result is returned, and as such, this is incredibly easy to bot in an effort to manipulate the rankings.

They should really be streaming both responses at the same pace, matched to the slowest responder.
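A rough sketch of what I mean, not how the arena actually works as far as I know. The function name `lockstep_schedule` and the idea of per-chunk arrival timestamps are my own assumptions for illustration: chunk i of both responses is shown only once the slower side has produced its chunk i, so display timing alone no longer tells you which model is which.

```python
from typing import List


def lockstep_schedule(arrivals_a: List[float], arrivals_b: List[float]) -> List[float]:
    """Hypothetical helper: compute shared display times for two streams.

    arrivals_a / arrivals_b hold the time (in seconds) at which each chunk
    of a response arrived from its model. The i-th chunk of BOTH responses
    is displayed at the returned schedule[i], i.e. at the pace of whichever
    model was slower for that position.
    """
    schedule: List[float] = []
    for i in range(max(len(arrivals_a), len(arrivals_b))):
        # Only consider streams that actually have a chunk at position i.
        candidates = [t[i] for t in (arrivals_a, arrivals_b) if i < len(t)]
        release = max(candidates)
        # Keep display times non-decreasing.
        if schedule:
            release = max(release, schedule[-1])
        schedule.append(release)
    return schedule


fast = [0.1, 0.2, 0.3]
slow = [1.0, 2.0, 3.0, 4.0, 5.0]
shared = lockstep_schedule(fast, slow)
```

Note this only hides speed. Response length and writing style would still leak, so it makes botting harder rather than impossible.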


Who would manipulate the rankings to make OpenAI look better? Are you sure you can tell the difference between a small Llama 3 8B and a fast Gemini?



