I thought it was going to be using local hardware but lost interest when I saw it requires an API key. How’s the state of the art currently for that use case?
If you have a multimodal model running locally, say for example with llamafile, ollama or lmstudio, you can easily configure llm to connect to it instead. The llm package is more like a Swiss Army knife for connecting to whatever model (local or remote) you like and interacting with it.
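As a minimal sketch, assuming you've installed the llm-ollama plugin (`llm install llm-ollama`) and Ollama is already running with a model pulled locally (the model name "llama3.2" here is just an example), the Python API can talk to it like any other model:

```python
import llm

# Resolve the locally served model through the llm-ollama plugin
# (assumes Ollama is running and "llama3.2" has been pulled).
model = llm.get_model("llama3.2")

# Prompt it exactly as you would a remote, API-key-backed model.
response = model.prompt("Summarise what the llm CLI does in one sentence.")
print(response.text())
```

The same model id works from the command line (`llm -m llama3.2 "..."`), so switching between local and hosted models is mostly a matter of which plugin is installed and which `-m` value you pass.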
I’m working on plugins for that at the moment. The local vision models have got pretty good - I’m impressed by Llama 3.2 and Phi 3.5 Vision.