It works fine, although with a bit more latency than non-local models. However, swap usage goes way beyond what I’m comfortable with, so I’ll continue to use smaller models for the foreseeable future.
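
If you want to see how much swap pressure a single request actually adds on your own machine, here's a minimal sketch using the `psutil` package; `run_model` is a hypothetical stand-in for whatever local inference call you use, not anything from a specific runtime.

```python
import psutil

def swap_used_gb() -> float:
    """Current swap usage in gigabytes."""
    return psutil.swap_memory().used / 1024**3

def measure_swap_growth(run) -> float:
    """Run a callable and report how much swap usage grew, in GB."""
    before = swap_used_gb()
    run()
    return swap_used_gb() - before

# Example with a hypothetical local-inference call:
# growth = measure_swap_growth(lambda: run_model("Summarize this document"))
# print(f"Swap grew by {growth:.2f} GB during the request")
```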
Hopefully other quantizations of these OpenAI models will be available soon.