Hacker News | gregpr07's comments

Hey HN,

We heard a lot of complaints about Browser Use being slow. Over the last few weeks we focused on improving speed while keeping the same accuracy.

Go try it out. It's really fun to see it glide across the web.


Congratulations! I was one of the people who complained. Not only was it slow, it couldn't complete basic tasks like logging into a site.

Has it been tested in the wild on speed and accuracy?


Browser Use creator here; we are working on prototypes like this but always find ourselves stuck on the safety vs. freedom question. We are well aware of how easy it is to inject stuff into the browser and do something malicious, hence sandboxed browsers still seem like a very good idea. I guess in the long run we will not even need browsers, just an agent that does stuff in the background. Is there any good research on guardrails for preventing "go to my bank and send the money to a Nigerian prince" style prompts?
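One naive baseline (an illustration of the shape such guardrails take, not a real defense; the keyword list and names below are made up for the example): classify each proposed action and pause for human confirmation on anything that looks sensitive.

```python
# Naive action-level guardrail sketch: before the agent executes a
# step, a separate check classifies the step's description and defers
# anything that looks like an irreversible or financial action.
SENSITIVE_KEYWORDS = {"transfer", "payment", "wire", "password", "delete"}

def requires_human_confirmation(action_description: str) -> bool:
    """Return True if the proposed action should pause for a human."""
    text = action_description.lower()
    return any(keyword in text for keyword in SENSITIVE_KEYWORDS)
```

Keyword matching alone is trivially bypassed by an injected page, which is why real systems layer it with sandboxing, origin allowlists, and separating trusted instructions from untrusted page content.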


AGI was just 1 bash for loop away all this time I guess. Insane project.


Less flippantly, that was sort of my thought. I'm probably a paranoid idiot, and I'm not sure I can articulate this idea properly, but I can imagine a less concise but broader prompt, and an agent configured so it has privileges you don't want it to have, or a path to escalate them. It's not quite AGI, but it's a virus on steroids: a company or resource (think utilities) killer. I hope I'm just missing something, but these models seem pretty capable of wreaking all kinds of havoc if they just keep looping and have access nobody in their right mind wants them to have.


Just need to add ID.md, EGO.md and SUPEREGO.md and we're done.


was deeply unsettling among other things


It is, isn't it mate? Shit, I stumbled upon Ralph back in February and it shook me to the core.


Not that I want to be shaken, but what is Ralph? A quick search showed me some marketing tools, but that can't be what you're referring to, is it?


Ralph is a technique. The stupidest technique possible. Running an agent in a while true loop. https://ghuntley.com/ralph
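The entire technique fits in a few lines. A toy simulation, with a stub standing in for the real agent invocation:

```python
# The "Ralph" pattern: an unconditional loop that re-invokes the agent
# until the agent itself reports the work is done. run_agent is a stub
# standing in for a real agent/CLI call re-fed the same prompt.
def run_agent(state):
    state["tasks_left"] -= 1          # pretend one unit of progress
    return state["tasks_left"] <= 0   # agent reports completion

state = {"tasks_left": 3}
iterations = 0
while True:
    iterations += 1
    if run_agent(state):
        break
print(iterations)  # prints 3
```

The loop itself contains no intelligence at all; everything rides on the agent eventually converging, which is exactly what makes the technique feel both stupid and unsettling.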


re side-note: if you know anyone who would be willing to interact, connect me :)


i was going to ask you the same. :-)

i'm just stubborn enough to find out, though. and i still have a few contacts at the googleplex...


I really like both nodriver and pydoll. I am definitely keeping the option of switching to them open, but we just wanted to have full control for now and see how painful CDP-use is to maintain first and then reconsider.


I mean... Playwright was built and is maintained by Microsoft, so I don't think the VC money argument really makes sense here.

By the very nature of how Playwright is built, we can't contribute to it: it runs inside a JS subprocess and does not expose a bunch of CDP APIs that we NEED (for example, to make cross-origin iframes work).


Hi, the first version of Browser Use was actually built on Selenium, but we quite quickly switched to Playwright.


yeah, i noticed that. apologies if i missed a post about it... what do you wish didn't suck about selenium?


Scrolling to an element doesn't always work because the element somehow isn't ready yet. You need to add IDs to elements and select by those to ensure it works properly.
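The readiness problem is usually solved with an explicit polling wait. A generic sketch of that idea (the helper name is mine, not Selenium's API):

```python
import time

def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll predicate() until it returns something truthy, like an
    explicit Selenium wait; raise if the deadline passes first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = predicate()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError("condition not met within %.1fs" % timeout)
```

In Selenium terms this is what WebDriverWait plus expected_conditions does; Playwright's difference is that it bakes this loop into every action instead of making you write it.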


thanks! yeah, playwright was a huge improvement there -- waiting until an element was actually ready. the official posture from the selenium project ("figure it out, be explicit") wasn't always the most user friendly messaging.

having to add ids to elements is one of those classic tradeoffs -- the alternative was to use css or xpath selectors, which can be even worse, maintenance-wise. i'm secretly hoping ai code-gen apps pumped out by things like Lovable or Claude Code automagically generate element test-ids and the tests for you and we never have to worry about it again.


It’s literally the only issue I’ve ever encountered with Selenium, and having IDs would be good for more reasons than just testing, so for me it wasn’t even an actual issue.


what's the downside of using frameId/targetId + backendNodeId as stable element IDs?


i'm at the edge of my chrome internals knowledge here, but i'd answer the question with a question: isn't backendNodeId only stable within a single session?

that might not matter if the agent is re-finding the element between sessions anyway, but then you're paying a lookup cost (time + tokens) each time, compared to just using document.getElementById() on an explicit id.
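the tradeoff is easy to make concrete with a toy locator cache: entries keyed by (frameId, backendNodeId) are only trusted for the lifetime of one session/tab and have to be flushed on reset, while an explicit DOM id stays a valid key across sessions. (a sketch, not browser-use's actual implementation:)

```python
# Toy locator cache keyed by (frame_id, backend_node_id). Because
# backendNodeId isn't guaranteed stable beyond the tab's lifetime,
# the whole cache must be dropped on session reset -- unlike lookups
# by an explicit DOM id, which survive across sessions.
class LocatorCache:
    def __init__(self):
        self._by_node = {}

    def put(self, frame_id, backend_node_id, locator):
        self._by_node[(frame_id, backend_node_id)] = locator

    def get(self, frame_id, backend_node_id):
        return self._by_node.get((frame_id, backend_node_id))

    def reset(self):
        """Call on tab close / new session: every cached id is stale."""
        self._by_node.clear()
```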


iirc it's stable across sessions until the tab closes, even though their docs don't guarantee it.

we can't modify the DOM to add IDs because we'd get detected by bot blockers very quickly. we're gradually trying to get rid of all DOM tampering for that reason.


From my experience with Playwright, rr-web recordings are MUCH better than Playwright's replay traces, so we usually just use those.


What's rr-web?



That can be integrated with Playwright, or did you mean to say it is already used under the hood for their reports?


Gregor was saying it works without needing Playwright, and provides more detailed trace recordings than Playwright does.

we plan to use rr-web and maybe Browsertrix for our website archival / replay system for deterministic evals.


I think it's awesome (we are close friends with Erik from Pig, so slightly biased). One extreme is Browser Use, which is just an agent that does everything for the first time; the other extreme is Workflow Use, which is almost deterministic. I think the winning product lies somewhere in the middle. Browser Use + cache is easier to do for browser trajectories than for pure images! We will definitely try this direction!
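That middle ground can be sketched as cache-first replay with an agent fallback (hypothetical names, not the actual Workflow Use code):

```python
def run_task(task, cache, replay_step, agent):
    """Replay a cached trajectory if one exists; on any failure, fall
    back to the agent exploring from scratch and re-cache the result."""
    steps = cache.get(task)
    if steps is not None:
        try:
            for step in steps:
                replay_step(step)
            return "replayed"
        except Exception:
            pass  # cached flow broke, e.g. the site changed
    cache[task] = agent(task)  # explore and record a fresh trajectory
    return "explored"
```

The deterministic path is fast and cheap; the agent only pays the full exploration cost the first time, or when the site drifts out from under the cached steps.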


1) we made this sick function in the browser-use library which detects when there are no more requests going through - so we just reuse that!

2) yeah, good question. The end goal is to completely regenerate the flow if it breaks (let Browser Use explore the "new" website and update the original flow). But let's see, so much could be done here!
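The idea in 1) amounts to tracking in-flight requests and calling the page settled once nothing has been pending for a quiet window. A minimal sketch of that idea, not the actual browser-use implementation (the CDP event names in the docstring are the standard ones):

```python
import time

class NetworkIdleWatcher:
    """Consider the page settled once no request has been in flight
    for at least idle_ms. Hook on_request/on_response into CDP's
    Network.requestWillBeSent / Network.loadingFinished events."""
    def __init__(self, idle_ms=500):
        self.idle_s = idle_ms / 1000.0
        self.pending = set()
        self.last_settled = time.monotonic()

    def on_request(self, request_id):
        self.pending.add(request_id)

    def on_response(self, request_id):
        self.pending.discard(request_id)
        if not self.pending:
            self.last_settled = time.monotonic()

    def is_idle(self):
        return not self.pending and (
            time.monotonic() - self.last_settled >= self.idle_s)
```

A production version also has to ignore never-ending traffic (analytics beacons, websockets, long polls), otherwise waiting for zero in-flight requests can hang forever.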

What did you work on btw?


1. Oh yes, right. I remember trying it out thinking it was going to be brittle because of analytics etc., but it filters those out surprisingly well.

2. We are working on https://www.launchskylight.com/ , agentic QA. For the self-onboarding version we are using pure CUA without caching. (We wanted to avoid Playwright to make it more flexible for canvas- and iframe-based apps, where we found HTML-based approaches like browser-use limited, and to support desktop apps in the future.)

We are beta-testing caching internally for customers, and releasing it for self-onboarding soon. We use CUA actions for caching instead of Playwright. Caching with pixel-native models is definitely a bit more brittle for clicks, and we focus on purely vision-based analysis to decide whether to proceed or not. For scaling, though, I think you are 100% right: screenshots at every step for validation are okay/worth it, but running an agent non-deterministically for actions is definitely overkill for enterprise; that was what we found as well.

Gemini's video understanding is also an interesting way to analyze what went wrong in more interactive apps. Apart from that, I think we share quite a bit of the core thinking; would be interested to chat, will DM!

