Still some tweaks to the final result, but I am guessing with the ARC-AGI benchmark jumping so much, the model's visual abilities are allowing it to do this well.
Animated SVGs are one of the examples in the press release. Which is fine, I just think the weird SVG benchmark is now dead. Gemini has beaten the benchmark, and now the differences just come down to taste.
I don't know if it got these abilities through generalization or if Google gave it a dedicated animated-SVG RL suite that got it to improve so much between models.
Regardless, we need a new vibe-check benchmark à la the pelican on a bicycle.
What benchmark, though? There is very clearly a lot of room for improvement in its SVG making capabilities. The fact that it can now, finally, make a pelican on a bike that isn’t completely wrong is not an indicator that SVG generation is now a solved problem.
Unfortunately, it still fails my personal SVG benchmark (an educational 2D cross-section of the human heart), even after multiple iterations and screenshot feedback. Oh well, back to the (human) drawing board.
I'm thinking now that as models get better and better at generating SVGs, there could be a point where we can use them to just make arbitrary UIs and interactive media with raw SVGs in realtime (like flash games).
That's one dimension before another long-term milestone: realtime generation of 3D mesh content during gameplay.
Which is the "left brain" approach, versus the "right brain" approach of coming at dynamic videogames from the diffusion-model direction, which the Gemini Genie thing seems to be about.
On the other hand, creation of other vector image formats (e.g. "create a postscript file showing a walrus brushing its teeth") hasn't improved nearly so much.
Perhaps they're deliberately optimising for SVG generation.
To show newbies how to use vim. Currently it's not complete and has major issues. So if you want to try it, give it a go, but please hold your judgement, as not all shortcuts have been added.
I have found GPT 5.3-Codex to do exceedingly well when working with graphics rendering pipelines. They must have better training data or RL approaches than Anthropic, as I gave the same prompt and config to Opus 4.6 and it seems to have added unwanted rendering artifacts. This may just be an issue specific to my use case, but since OpenAI is partnered with MSFT, which makes lots of games, I wonder whether this is an area they have invested in heavily.
While I think the use of the term “terrorist” is unwarranted, I do think DeFlock is seeking political change. The decision to use Flock is a government policy choice, right?
>> “I was stunned to learn late yesterday that after convening a task force of local and national experts, Mayor Johnston has been negotiating secretly with the discredited CEO of Flock Safety and signing another unilateral extension of this mass surveillance contract with no public process and no vote from the City Council or input from his own task force,” Councilmember Sarah Parady told The Denver Gazette.
What is the point of this comment? Are you saying that DeFlock are not terrorists but are terrorist-adjacent? Why respond to someone defining terrorism by pointing out that two words at the end of the definition also apply to DeFlock? Do those not apply to basically everyone who participates in their country's society, including literally everyone who votes and all politicians?
I am very curious whether this app is making money, or whether users are just using the two generators and then leaving. If so, I am very impressed with your wrapper around the image gen models.
This could be the future of film. Instead of prompting where you don't know what the model will produce, you could use fine-grained motion controls to get the shot you are looking for. If you want to adjust the shot after, you could just checkpoint the model there, by taking a screenshot, and rerun. Crazy.
"create a svg of a unicorn playing xbox"
https://www.svgviewer.dev/s/NeKACuHj