Hacker News | KerrickStaley's comments

"Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting."

- Hacker News Guidelines https://news.ycombinator.com/newsguidelines.html


It's at least Meta-relevant. Compression Represents Intelligence Linearly (Y Huang, 2024)

Such complaints are valid for AI model releases: they tell us that the company isn't using its own models to test its own release pages.

Maybe they did get their models to test their pages, but they didn't tell their models to pretend that they're browsing on mobile using a 3G connection.

I think this speaks to the product release itself.

I think most people can speak faster than 120 WPM. For example, this site says I speak at 343 WPM (https://www.typingmaster.com/speech-speed-test/), and I self-measure 222 WPM on dense technical text.

Micro machines guy could be vibe coding at an absurd rate.

My LLM types at 2k WPM, so I use that to talk to my LLMs.

I think (without having done extensive research) that some sort of Apple hardware is your best bet right now. Apple hasn’t raised RAM upgrade prices [1] (although to be fair their RAM upgrades were hugely inflated before the crunch) and their high memory bandwidth means they do inference faster than most consumer GPUs.

I have an M4 MacBook Air with 24 GB RAM and it doesn’t feel sufficient to run a substantial coding model (in addition to all my desktop apps). I’m thinking about upgrading to an M5 MacBook Pro with much more RAM, but I think the capabilities of cloud-hosted models will always run ahead of local models and it might never be that useful to do local inference. In the cloud you can run multiple models in parallel (e.g. to work on different problems in parallel) but locally you only have a fixed amount of memory bandwidth so running multiple model instances in parallel is slower.

[1] https://9to5mac.com/2026/03/03/apple-macbook-price-increase-...



Tried this out today and it feels half-baked unfortunately. I can't get auth working (https://github.com/googleworkspace/cli/issues/198).

The decision to pass all params as a JSON string to --params makes it unfriendly for humans to experiment with, although Claude Code managed to one-shot the right command for me, so I guess this is fine. This is an intentional design per https://justin.poehnelt.com/posts/rewrite-your-cli-for-ai-ag...


Side note, a lot happens at C3 other than the talks! Art, electronic gizmos and demos of all kinds, people hacking in realtime on projects, impromptu meetups, and bumping techno music :) I'd encourage people to attend in person if they get a chance; just watching the talks online is only a fraction of the experience.


I recently designed an eval to see if LLMs can produce usable CAD models: https://kerrickstaley.com/2026/02/22/can-frontier-llms-solve...

Claude 4.6 Opus and Gemini 3.1 Pro can to some degree, although the 3D models they produce are often deficient in some way that my eval didn't capture.

My eval used OpenSCAD simply due to familiarity and not having time to experiment with build123d/CadQuery. There is an academic paper where they were successful at fine-tuning a small VLM to do CadQuery: https://arxiv.org/pdf/2505.14646


Great work - looks like a building block towards 3D-model composition integration testing. I have been looking for a solution that would allow testing how a component fits into its surrounding components. My use case would be to create a parametric boat hull and then add components that could be tested for fit in the arrangement.


Cool project, thanks for sharing!

The simulator lets the LLM request renders from different angles/times, so the LLM can get visual feedback. For failures, the simulator also returns status codes like `object_fell` or `mount_initially_collided_with_object` depending on what happened. You can see what the tool call looks like by looking at the Transcript tab, e.g. here https://kerrickstaley.com/ai-cad-design-mount-viz/gso__mug__...

I agree it's not clear how much benefit models get from iteration. Many of the successful runs are one-shots. You can see some examples of basic spatial reasoning e.g. here https://kerrickstaley.com/ai-cad-design-mount-viz/gso__mug__... :

> The initial collision is because the mount was positioned at the same height as the mug's body center (z=-22), causing overlap. I need to lower the mount significantly so the mug starts above it and drops into the cradle.


> I'll also remove the end cap to avoid it blocking the mug's descent.

Ah yes, that matches my observations. It kinda sees that the stuff it is looking for is there, but does not see enough detail to notice that not only is there an end cap in the way, but the mug is also rotated the wrong way to sit in the holder.

It feels like the "r's in strawberry" effect where the models do not have enough introspection into the raw input data.


  a = b = []
has the same semantics here as

  b = []
  a = b
which I don't find surprising.
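A minimal sketch of why the two spellings are equivalent: both names end up bound to the same list object, so mutations through one name are visible through the other.

```python
a = b = []     # one list object, two names bound to it
a.append(1)    # mutate the list through `a`
print(b)       # [1] -- `b` sees the change, since it aliases the same list
print(a is b)  # True -- identical object, not just equal values
```

The gotcha only bites when the shared value is mutable; `a = b = 0` is harmless because ints are immutable.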


A fun way to play this game with less downside is to run `set -euo pipefail` in an interactive session. Then, whenever you execute a command that returns a non-zero exit code, your shell will exit immediately.

Unfortunately, certain commands like `rg` return non-zero by design when there are no matches, even when that is the outcome you expected.
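A minimal sketch of the failure mode and the usual escape hatch, using `grep` (which, like `rg`, exits 1 on no matches): appending `|| true` treats "no matches" as success so the shell survives.

```shell
set -euo pipefail

# Without `|| true`, a no-match grep exits 1 and `set -e` kills the shell.
# With it, an empty result is just an empty string:
matches=$(printf 'foo\nbar\n' | grep baz || true)
echo "found: '$matches'"
```

The tradeoff is that `|| true` also swallows real errors (e.g. a missing file), so it's best kept to commands where non-zero genuinely means "nothing found".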


Just remember which commands those are, and type a `||` or `&&` after them each time you run one.

`<cmd> || echo $?` is a good option, if you care about the exit code.

