Ollama works on Windows and Linux as well, but doesn't (yet) have GPU support on those platforms. You have to compile it yourself (it's a simple `go build .`), but it should work fine (albeit slowly). The benefit is you can still pull the llama2 model really easily (with `ollama pull llama2`) and even use it with other runners.
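For anyone who wants to try it, the build-from-source flow described above might look roughly like this (the repo URL matches the issue link elsewhere in this thread; the exact sequence is a sketch, not official install docs):

```shell
# Clone and build Ollama from source (CPU-only on Linux/Windows, so expect it to be slow)
git clone https://github.com/jmorganca/ollama.git
cd ollama
go build .            # produces an `ollama` binary in the current directory

# Start the server in one terminal...
./ollama serve

# ...then pull and run the model from another terminal
./ollama pull llama2
./ollama run llama2 "Why is the sky blue?"
```

You need a recent Go toolchain installed for the `go build .` step; everything else is handled by the module dependencies.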
DISCLAIMER: I'm one of the developers behind Ollama.
> DISCLAIMER: I'm one of the developers behind Ollama.
I've got a feature suggestion - would it be possible to have the ollama CLI automatically start up the GUI/daemon if it's not running? There's only so much stuff one can keep in a MacBook Air's auto start.
Good suggestion! This is definitely on the radar, so that running `ollama` will start the server when it's needed (instead of erroring!): https://github.com/jmorganca/ollama/issues/47
I think you'd need to offload the model into CoreML to do so, right? My understanding is that none of the current popular inference frameworks do this (not yet, at least).
DISCLAIMER: I'm one of the developers behind Ollama.