Hacker News
datpuz | 5 months ago | on: AI agent benchmarks are broken
Benchmarks in software have always been bullshit. AI benchmarks are even more bullshit, since they're trying to measure something significantly more subjective and nuanced than most benchmarks do.