Maybe it's a placebo, but I switched back to GPT-4. Something about GPT-4o's responses: they're too verbose and ramble on about generalities instead of capturing the nuance of the question, which is what I really want to know. Almost like 3.5 in that regard.
4o is complete trash. It literally won't listen, can't reason, and talks non-stop like an idiot, flooding you with useless info you didn't ask for. It's like that one kid who over-explains and thinks he's smart. It's not even that much faster, and the quality sacrifice makes it unusable.
It's nearly useless for me. GPT-4 used to be so good, 3.5 even, but I believe they have nerfed processing power per request. The OpenAI stack is virtually useless to me now, so I tend to rely on Claude.
I was using Claude Opus to code before and now I'm back to ChatGPT. GPT-4o is faster, doesn't generate placeholders and works way better for me because of the larger context.
Take those results with a large grain of salt. It's dead simple to figure out which response is the new model purely by the speed at which the result is returned, and as such, this is incredibly easy to bot in an effort to manipulate the rankings.
They should really be streaming the content at the same time, based on the slowest responder.
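A minimal sketch of that idea, under assumptions: both responses are available as async token streams, and the front end releases one chunk from each side only once both are ready, so the visible pace is set by the slower model. `fake_model`, `lockstep`, and the delays are hypothetical stand-ins for illustration, not anything the leaderboard actually runs.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple


async def fake_model(tokens: List[str], delay: float) -> AsyncIterator[str]:
    """Hypothetical stand-in for a model's streaming response."""
    for tok in tokens:
        await asyncio.sleep(delay)  # simulated per-token latency
        yield tok


async def _next_or_none(it: AsyncIterator[str]) -> Optional[str]:
    """Return the next chunk, or None once the stream is exhausted."""
    try:
        return await it.__anext__()
    except StopAsyncIteration:
        return None


async def lockstep(
    a: AsyncIterator[str], b: AsyncIterator[str]
) -> AsyncIterator[Tuple[Optional[str], Optional[str]]]:
    """Yield (chunk_a, chunk_b) pairs, released only when both sides are ready."""
    while True:
        # gather waits for the slower of the two streams on every step
        chunk_a, chunk_b = await asyncio.gather(_next_or_none(a), _next_or_none(b))
        if chunk_a is None and chunk_b is None:
            return
        yield chunk_a, chunk_b


async def demo() -> List[Tuple[Optional[str], Optional[str]]]:
    fast = fake_model(["a", "b", "c"], 0.001)
    slow = fake_model(["x", "y", "z"], 0.01)
    return [pair async for pair in lockstep(fast, slow)]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

This only hides per-token speed. A response that is much shorter than the other still ends early (its side yields `None` while the other continues), so length differences would remain visible.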
I find 4o better than 4 in my experience. Mostly doing code generation/correction in Python/JS, and asking science, business, finance, management, and other non-creative questions.
Yeah, it's much worse for me, worse than 3.5 even. Almost at the level of GPT-3 curie at worst.
I suspect it could be related to whatever it's using as language detection, because many others don't experience this. It glitches hard on language, often responding in the wrong one.
Sorry for the tangent, but is GPT-4 also better for you than 8x7b?
When I return to 8x7b from gpt-4 it feels like I just shook off an unbearably boring guy and met a normal one, both very similar in knowledge (and unable to perform complex tasks).
Their claim hasn't been that 4o is better than 4. Just that it's faster and cheaper. So it's better than 3.5-turbo but not as good as 4, at least from the examples I've tried out for summarization, code gen, etc.
No, their site literally says "our most powerful model" as the description for GPT4o, and it scores slightly higher than GPT-4 in their benchmarks: https://openai.com/index/hello-gpt-4o/