
      I’ve been trying LM Studio, which provides a GUI on top of llama.cpp and can download models from Hugging Face and run them. It was a lot easier than building these things myself, though I’d love to see a F/OSS GUI for llama.cpp.


        I guess the immediate purpose is to keep queries private, though how much that matters will vary person by person and org by org. If privacy isn’t a concern, I don’t know what other use there is. Self-hosting will probably be routine one day, and at that point it won’t be special, and may not even be worth the effort. You could already run your own email / Mastodon / IRC nodes. Almost no one does, because running the compute is not the interesting part.

        The other purpose would be to train it on your own data: the “fine-tuning” trap. The immediate problem is hardware requirements (just to do one run). The longer-term problem is keeping that model up to date, and fine-tuning it so that it is better than the original model, or better than your previous run. At that point you are effectively OpenAI (or similar): you will do many runs in many ways and need to track and evaluate them. That is ML evaluation.

        This is the gap, and it has “we have McDonald’s at home” energy, or that moment when people thought they could buy the yellow Google Search Appliance box to get Google-quality search internally, on their own data. It never really worked.