If you’ve ever looked into running AI models on your own hardware, you’ve almost certainly come across Ollama. It’s the default recommendation in many places, it’s what most YouTube tutorials point you toward, and it’s the tool that many self-hosting guides treat as the starting point for local inference. There’s a good reason for that: getting a model running with Ollama is as easy as typing “ollama run gpt-oss:20b” and waiting. It’s the Docker of local LLMs, and that comparison isn’t a coincidence, given that some of the people behind Ollama came from the Docker world.