I like the idea a lot more than other implementations (although I still think the original Google Glass was great), but I do feel the market for these, outside of the bankers and financial types who love to show off with tech, is primarily dorks like myself, and dorks like myself often like to be able to fiddle. The fact that I can't be trusted to install applications I can make as I go with an easy-to-use API seems like a mistake. I see the Even Hub, but that seems far off considering there are no details about it.
I bought this but ultimately returned it, as it didn't really solve any problems: it's a complete walled garden with very sparse functionality right now.
It's a cool form factor, but the built-in transcription, AI, etc. are not very well implemented, and I cannot imagine a user viewing this as essential rather than a novelty gadget.
If I instead don't, and let you know that the key is there in the source code, hopefully at least one deserving person might learn how to look through source code better, and none of the lazy people get it :)
I love how every player in this space is just building exactly the same product, and none of them seems to have a compelling pitch for why someone would need their product.
I'm not sure how many actually use teleprompters, because it regularly bothers me how many public figures are staring at their notes on the podium throughout their speeches.
Mind you, I grew up in the handful-of-index-cards-and-memorise-the-damn-speech era.
The only thing that matters is how easily I can customize what is shown on the screen. Everything else is probably just annoying, like the translation or map feature, which I assume will be finicky and useless. If the ring had four-way arrows and ok/back buttons, and the glasses had a proper SDK for creating basic navigation and retrieval, such as the ability to communicate with HTTP APIs, there would be no limit to the useful things I could create. However, if I'm forced to use built-in applications only, I have very little faith that it would actually be useful, considering how bad the last generation of applications for these devices was.
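For concreteness, an app like the one described above is tiny. Here is a rough sketch against an imagined SDK: the ring's four-way/ok/back input is simulated with keyboard keys and the display is faked with print(), since there is no public API to target, and the two plain-text HTTP endpoints are arbitrary examples.

```python
# Sketch of the kind of app being asked for: four-way/ok/back navigation plus
# HTTP retrieval. The "ring" input is simulated with keyboard keys (u/d/o/b);
# a real SDK would deliver button events and a call to push text to the display,
# both of which are assumptions here, not an actual G1/G2 API.
import requests

MENU = {
    "Weather": "https://wttr.in/?format=3",     # any plain-text HTTP endpoint works
    "Joke": "https://icanhazdadjoke.com/",      # returns text with the Accept header below
}

def fetch(url: str) -> str:
    return requests.get(url, headers={"Accept": "text/plain"}, timeout=10).text.strip()

def show(text: str) -> None:
    # Stand-in for pushing a few short lines to the glasses' display.
    print("\n[display]\n" + text + "\n")

def main() -> None:
    items = list(MENU)
    cursor = 0
    while True:
        show("\n".join(("> " if i == cursor else "  ") + name for i, name in enumerate(items)))
        key = input("u=up d=down o=ok b=back: ").strip().lower()
        if key == "u":
            cursor = (cursor - 1) % len(items)
        elif key == "d":
            cursor = (cursor + 1) % len(items)
        elif key == "o":
            show(fetch(MENU[items[cursor]]))
        elif key == "b":
            break

if __name__ == "__main__":
    main()
```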
I assumed that the display is actually watchable, as it seems to be. I have multiple AR and VR glasses, and I have never gotten fed up with them due to an inability to view the content properly. But having to struggle with getting it to show the content I want...
For instance, the teleprompter is terrible and buggy when it tries to follow along based on voice. A simple clicker for moving forward in a text file would be better than how it currently works.
How many people say they lost interest due to ocular issues versus complaints that it’s just not useful?
Seriously. A simple file browser with support for text files only would be more useful than the finicky G1 apps.
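The clicker-plus-text-file idea from the comments above really is that simple. A minimal sketch, with Enter standing in for the clicker and print() standing in for the glasses' display (the page size and line width are guesses, not the real display limits):

```python
# Rough sketch of the "clicker + text file" teleprompter idea: split a file into
# screen-sized pages and advance on a keypress. Enter stands in for whatever the
# ring/touch bar actually exposes, and print() stands in for the HUD.
import sys
import textwrap

LINES_PER_PAGE = 5    # small HUDs show only a few short lines; adjust to taste
CHARS_PER_LINE = 40

def pages(path: str):
    text = open(path, encoding="utf-8").read()
    lines = textwrap.wrap(text, width=CHARS_PER_LINE)   # flattens paragraphs; fine for a sketch
    for i in range(0, len(lines), LINES_PER_PAGE):
        yield "\n".join(lines[i:i + LINES_PER_PAGE])

def main() -> None:
    for page in pages(sys.argv[1]):
        print("\n" + page)
        input("-- click (Enter) for next page --")

if __name__ == "__main__":
    main()
```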
Of course visual issues could occur for someone, but it's so aggravating that they can't just put in some sort of proper customization for the content.
There's a Bluetooth API for the G1 that's been pretty fully reverse engineered. It wouldn't be at all difficult to put an HTTP wrapper around it if somebody hasn't already. I suspect you could even vibe code it in a weekend.
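A sketch of what that wrapper could look like: a tiny local web server that takes POSTed text and writes it to the glasses over BLE. The device address, characteristic UUID, and payload framing below are placeholders, not the actual reverse-engineered G1 protocol.

```python
# HTTP-to-BLE bridge sketch. Substitute whatever the reverse-engineered G1
# protocol actually uses (packet headers, chunking, left/right arms, etc.);
# the address, UUID, and raw UTF-8 payload here are placeholders.
from aiohttp import web
from bleak import BleakClient

GLASSES_ADDRESS = "AA:BB:CC:DD:EE:FF"                    # placeholder BLE address
TX_CHAR_UUID = "00000000-0000-0000-0000-000000000000"    # placeholder characteristic

async def send_text(request: web.Request) -> web.Response:
    text = await request.text()
    async with BleakClient(GLASSES_ADDRESS) as client:
        # Real packets will need framing/chunking; raw UTF-8 is just a stand-in.
        await client.write_gatt_char(TX_CHAR_UUID, text.encode("utf-8"))
    return web.Response(text="sent\n")

app = web.Application()
app.add_routes([web.post("/display", send_text)])

if __name__ == "__main__":
    # Usage: curl -X POST --data 'hello from HTTP' http://localhost:8080/display
    web.run_app(app, port=8080)
```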
I really like the aesthetic of these, both the glasses themselves and the UI. However, I have the same problem with these as with smartwatches: the apps don't solve any of my problems.
I've been in many situations where I wanted translations, and I can't think of one where I'd actually want to rely on either glasses or AirPods working like they do in the demos.
The crux of it for me:
- if it's not a person, it will be out of sync and you'll be stopping it every 10 seconds to get the translation. You could just as well use your phone; it would be the same, and there's a strong chance the media is already playing from there, so having the translation embedded would be an option.
- with a person, the other person needs to understand when your translation is going on and when it's over, so they know when to expect an answer or when they can go on. Having a phone in plain sight is actually great for that.
- the other person has no way to check if your translation is completely out of whack. Most of the time they have some vague understanding, even if they can't really speak the language. Having the translation in the glasses removes any possible check.
There are a ton of smaller points, but all in all the barrier for a translation device to become magic and just work plugged into your ear or glasses is so high that I don't expect anything to beat a smartphone within my lifetime.
Some of your points are already addressed by current implementations. AirPods live translate uses your phone to display what you say to the other person, and the other person's speech is played to your AirPods. I think the main issue is that there is a massive delay and Apple's translation models are inferior to ChatGPT's. The other thing is that the AirPods don't really add much; it works the same as if you had the translation app open and both people were talking into it.
Aircaps demos show it to be pretty fast and almost real time. Meta's live captioning works really fast and is supposed to be able to pick out who is talking in a noisy environment by having you look at the person.
I think most of your issues are just a matter of the models improving and running faster. I've found translations tend not to be out of whack, but that's something that can't really be solved except by having better translation models. In the case of AirPods live translate, the app shows both people's text.
That's understating the lag. Faster will always be better, but even "real time" still requires the other person to complete their sentence before you get a translation (there is the edge case of the other language having a similar grammatical structure and word order, but IMHO that's rare), and you catch up from there. That's enough lag to warrant putting the whole translation process literally on the table.
I do see real improvements in the models; for IRL translation I just think phones are already very good at this, and improving from there will be exponentially difficult.
IMHO it's the same for "bots" intervening in meetings (commenting/reacting on exchanges, etc.). Interfacing multiple humans in the same scene is always a delicate problem.
I have the G1 glasses and unfortunately the microphones are terrible, so the live translation feature barely works. Even if you sit in a quiet room and try to make conditions perfect, the accuracy of transcription is very low. If you try to use it out on the street it rarely gets even a single word correct.
This is the sad reality of most of these AI products: they just pile poorly implemented features onto the hardware. It seems like if they picked just one of these features and did it well, the glasses would actually be useful.
Meta has a model just for isolating speech in noisy environments (the “live captioning feature”) and it seems that’s also the main feature of the Aircaps glasses. Translation is a relatively solved problem. The issue is isolating the conversation.
I've found Meta is pretty good about delivering only what it promises, and as a result, even though they probably have the best hardware and software stack of any glasses, the stuff you can do with the Ray-Ban Display is extremely limited.
Is it even possible to translate in real time? In many languages and sentences, the meaning and translation need to change completely because of one additional word at the very end. Any accurate translation would need to either wait for the end of the sentence or correct itself after the fact.
Live translation is a well solved problem by this point — the translation will update as it goes, so while you may have a mistranslation visible during the sentence, it will correct when the last word is spoken. The user does need to have awareness of this but in my experience it works well.
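The update-and-correct behaviour is easiest to see with a verb-final sentence. A toy illustration, with canned strings standing in for a real streaming translation model:

```python
# Minimal illustration of "the translation updates as it goes": each new partial
# transcript re-translates the whole sentence so far and overwrites the display
# line, so an interim mistranslation is replaced once the final word arrives.
# translate() here is a stand-in for a real streaming MT model.
import time

PARTIALS = [
    "Ich habe den Vertrag",
    "Ich habe den Vertrag nicht",
    "Ich habe den Vertrag nicht unterschrieben",   # verb arrives last, meaning firms up
]

CANNED = {  # stand-in translations keyed by partial transcript
    PARTIALS[0]: "I have the contract",
    PARTIALS[1]: "I don't have the contract",      # plausible interim guess
    PARTIALS[2]: "I did not sign the contract",    # corrected once the verb is known
}

def translate(partial: str) -> str:
    return CANNED[partial]

for partial in PARTIALS:
    # \r overwrites the line in place, mimicking the HUD swapping interim text.
    print("\r" + translate(partial).ljust(60), end="", flush=True)
    time.sleep(1)
print()
```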
Bear in mind that simultaneous interpretation by humans (eg with a headset at a meeting of an international organisation) has been a thing for decades.
And I hate the aesthetics of them (for me), which is going to be a huge problem for the smart glasses world. Glasses dramatically change how you look, so few frames look good on more than a handful of face types, and that's not even considering differences in personal style. Unless you come up with a core that can be used in a bunch of different frame shapes, I can't see any of these being long-term products.
The hardware can look amazing, but if the software doesn't offer something meaningfully better than pulling out your phone, it ends up as an expensive novelty
I don't have time to fiddle around with some locked-in ecosystem in exchange for a little more productivity or the ability to pretend not to be using my computer. And I don't even have a day job.
If it were just a heads-up display for Android like Xreal, but low-power and wireless, that might be cool for when I'm driving. But everyone wants to make AI glasses locked into their own ecosystem. Everyone wants to displace the smartphone, from the Rabbit R1 to the new Ray-Bans. It's impossible.
In the end this tech will all get democratized and open sourced anyways, so I have to hand it to Meta and others for throwing money around and doing all this free R&D for the greater good.
Nice idea, but no world-locked rendering (that's hard, so we'll let them off).
However, you are limited in what you can do.
There are no speakers, which they pitch as a "simpler, quieter interface". That's great, but it means that _all_ your interactions are visual, even if they don't need to be.
I'm also not sure about the microphone setup: if you're doing a voice assistant, you need beamforming/steering (rough sketch of the idea below).
However, the online context in "conversate" mode is quite nice, and I wonder how useful it is. They hint at proper context control ("we can remember your previous conversations"), but that's a largely unsolved problem on large machines, let alone on-device.
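On the beamforming point, a toy delay-and-sum example: delay each microphone signal so the wanted direction lines up, then average, which reinforces the target and partially cancels uncorrelated noise. The geometry, sample rate, and steering delay here are arbitrary.

```python
# Toy delay-and-sum beamformer to show what "beamforming/steering" buys a
# multi-mic voice assistant. Two simulated mics, one synthetic talker, and a
# fixed integer-sample steering delay; everything here is illustrative.
import numpy as np

fs = 16_000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 300 * t)                  # stand-in for the wanted talker
rng = np.random.default_rng(0)

steer_delay = 3                                       # inter-mic delay (samples) for the target
mic1 = speech + 0.5 * rng.standard_normal(fs)
mic2 = np.roll(speech, steer_delay) + 0.5 * rng.standard_normal(fs)

aligned = np.roll(mic2, -steer_delay)                 # steer: undo the target's delay on mic2
beam = 0.5 * (mic1 + aligned)                         # sum: speech adds coherently, noise doesn't

def snr(x):
    # Rough SNR estimate against the known clean signal.
    noise = x - speech
    return 10 * np.log10(np.mean(speech**2) / np.mean(noise**2))

print(f"single mic SNR ~ {snr(mic1):.1f} dB, beamformed ~ {snr(beam):.1f} dB")
```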
I've never used any smart glasses, but I do wear prescription glasses ("dumb" glasses?). Don't these smart glasses products all clash with the need for prescription lenses? I mean, either they each have to offer the entire possible range of correction profiles, for use instead of what people wear now, or they need to be attachable overlays for regular prescription glasses, which is complicated and doesn't look like what the providers are doing ATM. Or am I getting it wrong?
For me it's like the Pebble of smart glasses land: simple and elegant.
Less is more: just calendar, tasks, notes, and AI. The rest I can do on my laptop or phone (with or without other display glasses).
I do wish there were a way to use the LLM on my Android phone with it and, if possible, to write my own app for it, so I'm not dependent on the internet and can have my HUD/G2 as a lightweight, custom-made AI assistant.
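That wish is plausible on the phone side. If something like llama.cpp's OpenAI-compatible server were running locally on the phone (for example under Termux), a companion app would only need a plain HTTP call, with no cloud involved. A sketch, with the endpoint, port, and model name as assumptions:

```python
# Sketch of the "local LLM on my Android phone" idea: talk to an
# OpenAI-compatible completion server running on the phone itself.
# The URL, port, and model name are assumptions, not a documented setup.
import requests

LOCAL_LLM = "http://127.0.0.1:8080/v1/chat/completions"

def ask(prompt: str) -> str:
    resp = requests.post(
        LOCAL_LLM,
        json={
            "model": "local",   # many local servers accept an arbitrary model name
            "messages": [
                {"role": "system", "content": "Answer in at most two short lines for a small HUD."},
                {"role": "user", "content": prompt},
            ],
            "max_tokens": 64,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"].strip()

if __name__ == "__main__":
    print(ask("Summarise my next calendar item in one line."))
```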
It's similar to how people felt when Google Glass first showed up. Until there's some universally understood signal like a visible recording light (that can't be turned off), I think that unease is going to stick around