MagMueller · 2025-12-08T02:48:22 1765162102

Interesting read. Agree that GUI is super hard for agents. Did you see "skills" from browser-use? We directly interact with network requests now.

MagMueller · 2025-09-06T16:48:42 1757177322

I worked for 2 years in a co-working space full of founders next to ETH Zurich. The most consistent worker? The cleaning lady. Every morning at 6 am, she did not miss a single day.

I grew up in a small village in Germany. 500 people, 5000 cows. Only farmers and a cheese factory. In the factory, we worked on Christmas, Easter, and New Year's Eve every morning at 5 am. Farmers don't take days off because cows don't take days off.

Maybe it's not the healthiest way of life, but I don't think we physically need to take time off.

MagMueller · 2025-08-25T16:37:19 1756139839

We could do a hackathon where it's only allowed to change 1 line.

MagMueller · 2025-08-24T16:46:28 1756053988

I would love to fix my docs with this. I have them in the main browser-use repo. What do you recommend so that the agent never pushes to main in browser-use, but only to its own branch?

dhorthy · 2025-08-24T16:53:39 1756054419

Yeah you can easily tweak this to push to a branch or a fork or something in the generated prompt.md

MagMueller · 2025-05-17T17:08:56 1747501736

Yes, so you can run the same form over and over again with different input variables. Very reliable, fast, and cheap.

MagMueller · 2025-05-17T17:07:51 1747501671

In the main library this feature could help you with that: https://github.com/browser-use/browser-use/pull/1437

MagMueller · 2025-02-26T18:19:11 1740593951

One option could be for the main apps like WhatsApp to have defined custom actions, which are almost like an API to the service. I think the interplay between LLM and automation scripts will succeed here:

Agent call 1: Send WhatsApp message (to=Magnus, text=hi)
Inside, you open WhatsApp and search for Magnus (without LLM)

Agent call 2: Select contact from all possible Magnus contacts
Script 3: Type the message and click send

So in total, 2 calls - with Gemini, you could already achieve this in 10-15 seconds.
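
A rough sketch of what such a custom action could look like on the inside, assuming the deterministic steps and the single LLM choice are passed in as plain functions. The names `send_whatsapp_message`, `search_contacts`, `choose_contact`, and `type_and_send` are hypothetical and not part of browser-use or any WhatsApp API:

```python
from typing import Callable, List


def send_whatsapp_message(
    to: str,
    text: str,
    search_contacts: Callable[[str], List[str]],      # script, no LLM
    choose_contact: Callable[[List[str], str], str],  # the one LLM call inside the action
    type_and_send: Callable[[str, str], None],        # script, no LLM
) -> str:
    """Run the scripted steps and use the LLM only to pick the contact."""
    # Scripted: open the app and search for the name.
    candidates = search_contacts(to)
    if not candidates:
        raise LookupError(f"no contact found for {to!r}")
    # LLM: select the right contact among all matches.
    contact = choose_contact(candidates, to)
    # Scripted: type the message and click send.
    type_and_send(contact, text)
    return contact


# Usage with stand-in functions (a real setup would drive the UI instead).
sent = []
picked = send_whatsapp_message(
    to="Magnus",
    text="hi",
    search_contacts=lambda name: ["Magnus A", "Magnus B"],
    choose_contact=lambda options, name: options[1],
    type_and_send=lambda contact, msg: sent.append((contact, msg)),
)
```

The outer agent only decides to call the action; everything except the contact choice runs without a model.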

MagMueller · 2025-02-26T18:08:25 1740593305

We see people replacing UIs and using browser-use to fill out the real UI. So there could be a world where everyone has their own UI, and you could have that filter option.

Furthermore, valid point: if Pepsi spends $1M on ads, why don't you get a piece of it if they pitch to you?

MagMueller · 2025-02-26T18:01:49 1740592909

I use browser-use. I use use-browser. I use mac-use. I use use.

MagMueller · 2025-02-26T17:57:10 1740592630

It could be useful to run a prompt/test once, get the xPaths, and rerun it deterministically. When it breaks, you know something is wrong, and the LLM could be used as a fallback to fix the script.
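
A minimal sketch of that record-and-replay idea, assuming the page is modeled as a plain dictionary from XPath to element text and the LLM repair step is an injected function. `replay`, `Page`, and `llm_fallback` are hypothetical names, not browser-use API:

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a page: maps an XPath to the text of the element found there.
Page = Dict[str, str]


def replay(
    page: Page,
    xpaths: List[str],
    llm_fallback: Callable[[Page, int], str],
) -> List[str]:
    """Walk the recorded XPaths; ask the fallback to repair any that no longer match."""
    used = []
    for step, xpath in enumerate(xpaths):
        if xpath not in page:
            # The recorded selector broke, so something on the page changed.
            # Call the (slow, costly) fallback once and keep its answer.
            xpath = llm_fallback(page, step)
            xpaths[step] = xpath  # the repaired selector is reused on the next run
        used.append(xpath)
    return used


# Recorded on a first LLM-driven run (illustrative values).
recorded = ["//input[@id='name']", "//button[@id='send']"]
# The page later renamed the button, so the second selector is stale.
changed_page = {"//input[@id='name']": "", "//button[@id='submit']": "Send"}

result = replay(changed_page, recorded, lambda page, step: "//button[@id='submit']")
```

The happy path never touches the model; the fallback runs only on the step that broke, which also tells you where the page changed.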