1. Proprietary Data (YouTube, Docs, Gmail, cloud logs, Waymo, website analytics, ads, search, the list is huge)
2. Commercial Datacenters (they're ahead at least)
3. Chip production (Google is manufacturing proprietary chips)
4. Consumer OS (Chrome, Android)
5. Consumer Hardware (Pixel)
Basically Google has access to data that OpenAI will never have access to, can lower costs below what OpenAI can, and is already a leader in all the places where OpenAI will need massive capex to catch up.
You can't train LLMs on proprietary data, at least not if you want to make that LLM as accessible as Gemini. Otherwise random people can ask it your home address.
So it matters less than one would think. Also, ChatGPT can already do 'internet search' as a tool, so it already has access to, say, the Google Maps POI database of SMBs.
ChatGPT also gets a lot of proprietary data of its own. People use it as a Google replacement.
>You can't train LLMs on proprietary data, at least not if you want to make that LLM as accessible as Gemini. Otherwise random people can ask it your home address.
If this is your only criterion, I think you misunderstand what proprietary data is and the ways companies can mitigate that risk at the inference stage.