I'm still messing around with self-hosting LLMs. Right now I've settled on using Lumo from Proton if I use an LLM.
When I have run LLMs myself, I used KoboldCpp. It works pretty well, though it depends on what you're doing and which models you use. I forget which models I've been using off the top of my head.