I worked for 2 years in a co-working space full of founders next to ETH Zurich. The most consistent worker? The cleaning lady. Every morning at 6 am, she did not miss a single day.
I grew up in a small village in Germany. 500 people, 5000 cows. Only farmers and a cheese factory. In the factory, we worked on Christmas, Easter, and New Year's Eve every morning at 5 am. Farmers don't take days off because cows don't take days off.
Maybe it's not the most healthy way of life. I don't think it physically requires us to take time.
I would love to fix my docs with this.
I have them in the main browser-use repo.
What do you recommend that the agent does never push to main browser-use, but only to its own branch?
One option could be for the main apps like WhatsApp to have defined custom actions, which are almost like an API to the service. I think the interplay between LLM and automation scripts will succeed here:
Agent call 1:
Send WhatsApp message (to=Magnus, text=hi)
Inside, you open WhatsApp and search for Magnus (without LLM)
Agent call 2:
Select contact from all possible Magnus contacts
Script 3: Type the message and click send
So in total, 2 calls - with Gemini, you could already achieve this in 10-15 seconds.
We see people replacing UIs and using browser-use to fill out the real UI. So there could be a world where everyone has their own UI, and you could have that filter option.
Furthermore, valid point: if Pepsi spends $1M on ads, why don't you get a piece of it if they pitch to you?
It could be useful to run a prompt/test once, get the xPaths, and rerun it deterministically.
When it breaks, you know something is wrong, and the LLM could be used as a fallback to fix the script.
reply