How to run Ollama on Docker

Science & Technology

Ollama runs great on Docker, but there are just a couple of things to keep in mind. This video covers them all.
Visit hub.docker.com/r/ollama/ollama for more details.
Be sure to sign up to my monthly newsletter at technovangelist.com/newsletter
And if interested in supporting me, sign up for my patreon at / technovangelist
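
The basic commands from the Docker Hub page above look roughly like this (the model name at the end is just an example):

    # CPU only: keep models in a named volume so they survive container and image upgrades
    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

    # Nvidia GPU (requires the NVIDIA Container Toolkit on the host)
    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

    # Run a model inside the running container
    docker exec -it ollama ollama run llama2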

Comments: 116

  • @technovangelist
    @technovangelist • 3 months ago

    Someone just commented about finding another way to upgrade the container. I can't find the comment now, so if this was you, post again. But no, do not upgrade the install inside a container; that's a whole lot of work for no benefit. The models are stored in the volume you mounted as part of the install, so deleting the image will not affect the models. If you have gone against the recommendations and stored models inside the container, then the best approach is to move them to the correct spot and update the container.
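
    In practice that upgrade usually looks something like this; because the models live in the mounted volume, removing the old container and image doesn't touch them:

      docker pull ollama/ollama                   # grab the newest image
      docker stop ollama && docker rm ollama      # remove the old container (models stay in the volume)
      docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama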

  • @Kimomaru
    @Kimomaru • 29 days ago

    I really wish more videos were made like this. No nonsense, gets straight to the point, clear, concise. Thank you.

  • @technovangelist
    @technovangelist • 29 days ago

    And yet some complain that I take too long and waste time. But thank you so much for the comment. I do appreciate it.

  • @ToddWBucy-lf8yz
    @ToddWBucy-lf8yz • 4 days ago

    Thank you Sir! You just took the mystery out of how to set this up right. I love me some Docker. It really helps to keep the work stuff separated from the personal project stuff.

  • @ErnestOak
    @ErnestOak • 3 months ago

    Does it make sense to use ollama in production as a server?

  • @mercadolibreventas
    @mercadolibreventas • 3 months ago

    Matt, you're a great teacher; no one explains things like you do. They just read the command in one sentence and do not explain the actual function of that command in parts. Lots of videos show how to do something and 75% never work. So thanks so much!

  • @EcomGraduates
    @EcomGraduates • 3 months ago

    How you speak in your videos is refreshing thank you 🙏🏻

  • @Makumazaan
    @Makumazaan • 1 month ago

    much respect for the way you deliver information

  • @ashwah
    @ashwah • 15 days ago

    Thanks Matt this helped me understand the Docker side of things. Namely keeping the models in a volume. I will restructure my project based on this. Keep it up ❤

  • @xXWillyxWonkaXx
    @xXWillyxWonkaXx • 3 months ago

    Straight to the point, no fluff, very informative. Very updated. You just earned a fan/subscriber. Howdy Matt 🎩

  • @technovangelist
    @technovangelist • 3 months ago

    There are some who say I am all fluff, but I try to always be closer to your observation.

  • @MohammadhosseinMalekpour
    @MohammadhosseinMalekpour • 13 days ago

    Thanks, Matt! It was a straightforward tutorial.

  • @sampellino
    @sampellino • 1 month ago

    A fantastic, clear instructional. Thank you so much! This helped me a ton.

  • @TimothyGraupmann
    @TimothyGraupmann • 3 months ago

    Learned that containers can be remote and the alias. Yet another great video! I need to take advantage of that. I have a bunch of RPI security cameras and remote containers might make administration even easier!

  • @tristanbob
    @tristanbob • 3 months ago

    This is my new favorite channel! I learned like 10 things just in this video. I love learning about AI, modern tools such as docker and tailscale, and modern hosting platforms and services. Thank you!

  • @technovangelist
    @technovangelist • 3 months ago

    you left off the most important part.... NERF can be expensed!!

  • @tristanbob
    @tristanbob • 3 months ago

    Good point! So I learned 11 things :) @technovangelist

  • @robertdolovcak9860
    @robertdolovcak9860 • 3 months ago

    Nice and clear tutorial. Thanks! 😀

  • @Slimpickens45
    @Slimpickens45 • 3 months ago

    🔥good stuff as always Matt!

  • @fuba44
    @fuba44 • 3 months ago

    This is my new favorite content; the way you explain it just beams directly into my brain and I get it right away. Thank you. Is there a way to show support, donations or similar?

  • @technovangelist
    @technovangelist • 3 months ago

    Folks have asked me about that. I’ll be looking into something like Patreon soon.

  • @technovangelist
    @technovangelist • 3 months ago

    The big thing for now is to just share the video with everyone you know.

  • @technovangelist
    @technovangelist • 1 month ago

    Well I do have that patreon now. Just set it up: patreon.com/technovangelist

  • @sushicommander
    @sushicommander • 3 months ago

    Great video. Now I'm curious about how you set up ollama on brev... What is your recommended setup & host service for using Ollama as an endpoint?

  • @tiredofeverythingnew
    @tiredofeverythingnew • 3 months ago

    In the realm of ones and zeros and LLM models, Matt is the undisputed sovereign.

  • @technovangelist
    @technovangelist • 3 months ago

    wow, you are too kind

  • @bjaburg
    @bjaburg • 3 months ago

    There are not many people that can explain these steps in such an easy and entertaining way as you do, Matt. I often pride myself on being able to do so, but you can be my teacher. I often find myself watching the progress bar because I don't want it to end (seriously :-))! A request: could you do an explainer video on how to train a model (say Microsoft/Phi-2) on your own dataset and deploy the trained model? OpenAI makes it super easy by deploying a JSONL file and after a while it 'returns' the trained model. But I want to train my own models. I have been looking around YT but get lost in parameters, incorrect JSONL files (or CSV), etc. Surely, this must be easier. (Hopefully your answer is "it is easier, and don't call me Shirley".) Thanks so much again. You have a happy subscriber (and many more to come). Kind regards, Bas

  • @mohammedibrahim-hd2rs
    @mohammedibrahim-hd2rs • 6 days ago

    you're amazing bro

  • @JM-sn5eb
    @JM-sn5eb • 12 days ago

    This is exactly what I've been looking for! Could you please tell (or maybe create a video on) how to use ollama completely offline? I have a PC that I cannot connect to the internet.

  • @AnkitK-wi3wk
    @AnkitK-wi3wk • 21 days ago

    Hi Matt, your videos are super useful and right on point. Thank you for putting this together. I have a quick question on this topic. I have created a RAG Streamlit app in Python using Ollama llama3 and ChromaDB. The app runs fine on my Mac localhost, but I wanted to create a Docker image of this app. I am unable to figure out how to include Ollama llama3 in my Docker image. Can you help point to any resources which can guide me on this, or cover this in one of your topics? Again, thanks a mil for the content. Great stuff!!! Cheers

  • @chandup
    @chandup • 2 months ago

    Nice video. Could you also please make a demo video on how to use ollama via nix (nix shell or on nixos)?

  • @Lemure_Noah
    @Lemure_Noah • 3 months ago

    Excellent, Matt! For some reason, I had to run docker commands with "sudo" to use my GPUs.

  • @gokudomatic
    @gokudomatic • 3 months ago

    That sounds like your user is not in the right group. I had issues like that once, and it was a matter of not being in the docker group. Now I can use my GPU in my docker container.
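
    On most Linux distros that fix is something along these lines (log out and back in, or use newgrp, for it to take effect):

      sudo usermod -aG docker $USER    # add your user to the docker group
      newgrp docker                    # pick up the new group in the current shell
      docker run --rm hello-world      # verify docker now works without sudo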

  • @technovangelist
    @technovangelist • 3 months ago

    Good answer. I knew it but couldn't remember, and this is what I remember.

  • @devgoneweird
    @devgoneweird • 2 months ago

    Is it possible to limit the resource consumption of ollama? I'm looking for a way to run a background computation and I don't really care how much time it takes (as long as it can process a stream's average load), but it would be annoying if it hung the main activity on the machine.
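
    Docker itself can cap what the container may use; a rough sketch (the limits are just example values, and generation will slow down accordingly):

      docker run -d --cpus="4" --memory="8g" \
        -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama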

  • @CaptZenPetabyte
    @CaptZenPetabyte • 1 day ago

    I've been running a lot via Docker, but when I found out about the difficulty of GPU pass-through (on any machine) I have been swapping things over to Proxmox, which does have GPU pass-through *and* can also use the CPU to emulate a GPU as needed... what do you think about running on Proxmox?

  • @95jack44
    @95jack44 • 3 months ago

    Searching for a full airgap install on docker to use on Kubernetes. This is a start ^^. Thx

  • @ricardofernandez2286
    @ricardofernandez2286 • 1 month ago

    Hi Matt, thank you for such a clear and concise explanation!! I have a question that may or may not apply in this context, and I'll let you be the judge of it. I'm running on CPU on an 8 virtual core server with 30Gb RAM and an NVMe disk on Ubuntu 22.04, and the performance is kind of poor (and I clearly understand that a GPU will be the straightforward way to solve this). But I've noticed that when I run the models, for example Mistral 7b, ollama only uses about half the CPUs available and less than 1 Gb of RAM. I'm not sure why it is not using all the resources available, or if using them would improve the performance. Anyway it would be great to have your advice on this, and if it is something that can be improved/configured, how would you suggest doing it? Thank you very much!!!

  • @technovangelist
    @technovangelist • 1 month ago

    You will need a GPU. Maybe a faster CPU would help, but the GPU is going to be the easier approach. You will see 1 or 2 orders of magnitude improvement adding even a pretty cheap GPU from Nvidia or AMD.

  • @ricardofernandez2286
    @ricardofernandez2286 • 1 month ago

    @technovangelist Thank you! I know the GPU is the natural way to go. I was just wondering why it is using less than half the resources available when it has plenty of extra CPU and RAM, and whether using these idle resources could improve the performance at least by some percentage. And unfortunately I can't add a GPU to this current configuration. My CPUs are AMD EPYC 7282 16-Core Processors, which I think are quite nice CPUs. Thank you!!
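
    One thing worth checking: ollama appears to default its thread count to the number of physical cores it detects, which inside a VM can look like half your vCPUs, and models are memory-mapped so the reported RAM use looks low. You can experiment with the num_thread request option (hedged example; whether more threads actually help depends on memory bandwidth):

      curl http://localhost:11434/api/generate -d '{
        "model": "mistral",
        "prompt": "Why is the sky blue?",
        "options": { "num_thread": 8 }
      }'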

  • @xDiKsDe
    @xDiKsDe • 2 months ago

    Hey Matt, appreciate your content - it has been very helpful for getting everything running so far! I am on a Windows 11 PC and managed to get ollama + anythingllm running on Docker and communicating with each other. Now I want to try to get LLMs from Hugging Face to run in the dockerized ollama. I saw how it works if you have ollama installed directly on the system. But how do I approach this with Docker? Thanks in advance and keep it up 👏

  • @technovangelist
    @technovangelist • 2 months ago

    Is the model not already in the library? You can import it, but it can be a bit of extra work. Check out the import doc in the docs folder of the ollama repository.

  • @xDiKsDe
    @xDiKsDe • 2 months ago

    Ah yes they are, but I meant custom trained LLMs - I stumbled across the open_llm_leaderboard and wanted to give those a try - will check out the import doc, thanks! @technovangelist

  • @s.b.605
    @s.b.605 • 1 month ago

    How do you swap models in the same container? I think I'm doing it wrong and it's affecting my container memory.

  • @Tarun_Mamidi
    @Tarun_Mamidi • 3 months ago

    Cool tutorial. Can you also show how we can integrate the ollama Docker container with other programs, say a langchain script inside Docker? How do we connect them, together or separately? Thanks!

  • @technovangelist
    @technovangelist • 3 months ago

    Would love to see a good example of using langchain. Often folks use it for RAG where it only adds complexity. Do you have a good use case?

  • @brentfergs
    @brentfergs • 1 month ago

    Great video as always Matt, I love them. I would like to know how to load a custom model in docker with a model file. Thank you so much.

  • @technovangelist
    @technovangelist • 1 month ago

    Same way as without Docker: you create the model using the modelfile, then run it. Or am I missing something?
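
    With the container from this video, that might look something like the following ("mymodel" and the paths are just placeholders):

      docker cp ./Modelfile ollama:/root/Modelfile                      # copy the modelfile into the container
      docker exec -it ollama ollama create mymodel -f /root/Modelfile   # build the custom model from it
      docker exec -it ollama ollama run mymodel                         # run it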

  • @mrRobot-qm3pq
    @mrRobot-qm3pq • 3 months ago

    Does it consume less resources and run better with OrbStack instead of with Docker Desktop?

  • @nagasaivishnu9680
    @nagasaivishnu9680 • 3 months ago

    Running the Docker container as the root user is not secure. Is there any way to run it as a non-root user?

  • @vishalnagda7
    @vishalnagda7 • 2 months ago

    Could you kindly assist me in clarifying how to specify the model name when running the ollama Docker command? For instance, I aim to utilize the mistral and llama2:13b models in my project. Thus, I request our dev-ops team to launch an ollama container configured with these specific models.

  • @michaelberg7201
    @michaelberg7201 • 3 months ago

    I recently had the opportunity to try Ollama in Docker and it worked pretty much as shown in this video. I do think it would be nice if it was somehow possible to start a container and have it ready to serve a model immediately, but I couldn't find an easy way to do this. You basically have to run one docker command to start Ollama, then wait a bit, then run another docker exec -it command to tell Ollama to load whatever model you happen to need. How do I achieve the same thing using just one single docker command?

  • @carlosmendez3363
    @carlosmendez3363 • 1 day ago

    docker-compose
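
    A rough compose sketch along those lines; the second service just pulls an example model once the server is up (it may need a retry or healthcheck if the server is slow to start):

      services:
        ollama:
          image: ollama/ollama
          ports:
            - "11434:11434"
          volumes:
            - ollama:/root/.ollama
        model-puller:
          image: ollama/ollama
          depends_on:
            - ollama
          environment:
            - OLLAMA_HOST=http://ollama:11434
          entrypoint: ["ollama", "pull", "llama2"]
          restart: "no"
      volumes:
        ollama: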

  • @alibahrami6810
    @alibahrami6810 • 3 months ago

    Is it possible to manage multiple instances of ollama on Docker to scale ollama for production? How?

  • @technovangelist
    @technovangelist • 3 months ago

    You could but it will result in lower performance for everyone.

  • @RupertoCamarena
    @RupertoCamarena • 2 months ago

    Did you hear about Jan AI? Would be good to get a tutorial for Docker. Thanks

  • @Lemure_Noah
    @Lemure_Noah • 3 months ago

    I would like to suggest that ollama support embeddings, when it becomes available through the REST API. If they really chose nomic-ai/nomic-embed-text-v1.5-GGUF, it would be perfect, as this model is multi-language.

  • @technovangelist
    @technovangelist • 3 months ago

    It does support embeddings. Using Nomic-embed-text. Check out the previous video. It covers that topic.
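
    For reference, once the model is pulled, the embeddings endpoint can be called like this (model and prompt are just examples):

      docker exec -it ollama ollama pull nomic-embed-text
      curl http://localhost:11434/api/embeddings -d '{
        "model": "nomic-embed-text",
        "prompt": "The sky is blue because of Rayleigh scattering"
      }'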

  • @csepartha
    @csepartha • 3 months ago

    Kindly make a tutorial on fine-tuning an open source LLM on data from many PDFs. The fine-tuned LLM must be able to answer questions from the PDFs accurately.

  • @akshaypachbudhe3319
    @akshaypachbudhe3319 • 2 months ago

    How do I connect this ollama server with a Streamlit app and run both on Docker?

  • @kevyyar
    @kevyyar • 3 months ago

    Just found this channel. Could you make a video tutorial on how to use it inside VS Code for code completions?

  • @michaeldoyle4222
    @michaeldoyle4222 • 1 month ago

    Any idea where I can see docker logs for local install (i.e. not docker) on mac....

  • @technovangelist
    @technovangelist • 1 month ago

    If it’s a local install that isn’t docker there is no docker log

  • @MarvinBo
    @MarvinBo • 3 months ago

    Make your Ollama even better by installing Open WebUI in a second container. This even runs on my Raspi5!

  • @technovangelist
    @technovangelist • 3 months ago

    Some like the webui. But that's a personal thing. It's an alternative.
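
    For anyone who wants to try that setup, the Open WebUI project documents a companion container roughly like this (check their repo for the current flags; the host-gateway line is for reaching an ollama instance on the host):

      docker run -d -p 3000:8080 \
        --add-host=host.docker.internal:host-gateway \
        -v open-webui:/app/backend/data \
        --name open-webui ghcr.io/open-webui/open-webui:main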

  • @AdrienSales
    @AdrienSales • 3 months ago

    Hi, would you also share podman commands? Did you give it a try?

  • @technovangelist
    @technovangelist • 3 months ago

    I tried it a bit when Bret Fisher and I had them on the show we hosted together, but I didn't have much reason to stop using docker. I didn't see any benefit.

  • @AdrienSales
    @AdrienSales • 3 months ago

    @technovangelist Thanks for the feedback. It was not about dropping docker, but rather being sure both work, since in some cases podman is used (because of the rootless mode, e.g.) and not docker. So it may help some of us spread ollama even in these cases in an enterprise context ;-p

  • @technovangelist
    @technovangelist • 3 months ago

    I thought they were supposed to be command line compatible. Should be the same, right? Try it and let us know.
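
    The podman form should be a near drop-in swap; an untested sketch (podman wants the fully qualified image name):

      podman run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama docker.io/ollama/ollama
      podman exec -it ollama ollama run llama2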

  • @AlokSaboo
    @AlokSaboo • 3 months ago

    Loved the video…can you do something similar for LocalAI. Thanks!

  • @technovangelist
    @technovangelist • 3 months ago

    Hmm. Never heard of it before now. I’ll take a look

  • @AlokSaboo
    @AlokSaboo • 3 months ago

    @technovangelist github.com/mudler/LocalAI - Similar to Ollama in many respects. One more tool for you to learn :)

  • @kiranpallayil8650
    @kiranpallayil8650 • 6 days ago

    would ollama still work on a machine with no graphics card?

  • @technovangelist
    @technovangelist • 6 days ago

    Absolutely. It will just be 1-2 orders of magnitude slower. The work models do requires a lot of math that GPUs really help accelerate.

  • @SharunKumar
    @SharunKumar • 3 months ago

    For Windows, the recommended way would be to use WSL(2), since that's a container in itself

  • @technovangelist
    @technovangelist • 3 months ago

    Well, the recommended way on Windows is the native install, but after that it's Docker. And WSL is a VM, not a container. Ubuntu on WSL is a container that runs inside the WSL VM.

  • @kaaveh
    @kaaveh • 3 months ago

    I wish there was a clean way to launch an Ollama Docker container with a preconfigured set of models, so it would serve and then immediately pull the models. We are overriding the image's entrypoint right now to run a shell script that does this…
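
    For anyone wanting to do the same, a minimal sketch of that approach (the script name and model list are just examples; make the script executable first):

      #!/bin/sh
      # entrypoint.sh: start the server, pull the models we want, then keep serving
      ollama serve &
      sleep 5                        # crude wait; polling /api/tags would be more robust
      ollama pull llama2
      ollama pull nomic-embed-text
      wait

      docker run -d -v ollama:/root/.ollama -v ./entrypoint.sh:/entrypoint.sh \
        -p 11434:11434 --entrypoint /entrypoint.sh --name ollama ollama/ollama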

  • @ravitejarao6201
    @ravitejarao6201 • 3 months ago

    Hi bro. When I try to deploy ollama on AWS Lambda with an ECR Docker image I am getting an error, can you please help me? Error: http:connecterror:[errno 111] connection refused. Thank you

  • @technovangelist
    @technovangelist • 3 months ago

    Need a lot more info. Where do you see that? When in the process? Is running a container like that even going to be possible? Do you have access to a GPU with Lambda? If not, it's going to be an expensive way to go.

  • @bodyguardik
    @bodyguardik • 3 months ago

    In the WSL2 Docker version, DON'T PUT MODELS OUTSIDE WSL2 on a mounted Windows drive: I/O performance will be 15x slower.

  • @technovangelist
    @technovangelist • 3 months ago

    Yup. Pretty standard stuff for docker and virtualization. Docker on wsl with Ubuntu means the ollama container is running in the Ubuntu container on the wsl virtual machine. Each level of abstraction slows things down. And translation between levels is going to be slow.
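
    In other words, prefer a named volume (which lives inside the Linux filesystem) over a bind mount to a Windows path; a quick comparison:

      # fast: named volume stored inside the WSL2 / Linux VM filesystem
      docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

      # slow: bind-mounting a Windows drive crosses the 9p filesystem boundary on every read
      docker run -d -v /mnt/c/ollama-models:/root/.ollama -p 11434:11434 --name ollama ollama/ollama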

  • @Vinn.V
    @Vinn.V • 1 day ago

    It's better to write a Dockerfile and package it as a Docker image.

  • @lancemarchetti8673
    @lancemarchetti8673 • 3 months ago

    Hey Guys.... Mistral just launched their new model named Large!

  • @arberstudio
    @arberstudio • 2 months ago

    Some of the model links are broken, so I had to add them to requirements and edit the Dockerfile.

  • @technovangelist
    @technovangelist • 2 months ago

    What do you mean by that? Is this something in a file you made?

  • @arberstudio
    @arberstudio • 2 months ago

    @technovangelist I was referring to the Ollama WebUI, perhaps this isn't the same repo?

  • @technovangelist
    @technovangelist • 2 months ago

    Different product, made by unrelated folks.

  • @bobuputheeckal2693
    @bobuputheeckal2693 • 17 days ago

    How do I run it as a Dockerfile?

  • @technovangelist
    @technovangelist • 17 days ago

    Yes that’s what this video shows

  • @bobuputheeckal2693
    @bobuputheeckal2693 • 17 days ago

    @technovangelist I mean, how to run it as a Dockerfile, not as a set of docker commands.

  • @technovangelist
    @technovangelist • 17 days ago

    Docker commands run an image that was built using a Dockerfile.
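
    If the goal is to bake a model into an image, a hedged Dockerfile sketch (the model is just an example; this makes the image large and the model won't live in a shared volume, so build-time pulling is fragile and usually not the recommended route):

      FROM ollama/ollama
      # Start the server long enough to pull a model at build time, then stop it
      RUN ollama serve & SERVER=$! ; sleep 5 && ollama pull llama2 && kill $SERVER
      EXPOSE 11434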

  • @basilbrush7878
    @basilbrush7878 • 2 months ago

    Mac not allowing GPU pass through is a huge limitation

  • @technovangelist
    @technovangelist • 2 months ago

    Docker has known about the issue for a long time. But mostly it's because there aren't Linux drivers for the Apple Silicon GPU.

  • @florentflote
    @florentflote • 3 months ago

  • @95jack44
    @95jack44 • 3 months ago

    If anyone has insights on a particular LLM model that has a low hallucination rate for Kubernetes-native resource generation, please leave me a comment ;-). Thx

  • @technovangelist
    @technovangelist • 3 months ago

    Usually when someone has a hard time with the output of a model it points to a bad prompt rather than a bad model.

  • @jbo8540
    @jbo8540 • 3 months ago

    Mistral:Instruct is a solid choice for a range of tasks

  • @themax2go
    @themax2go21 күн бұрын

    Waaiiiit wait wait a sec... I specifically remember in a vid (don't remember which, it's been months) that on Mac, in order for ollama to utilize "metal" 3d acceleration, it needs to run in docker... strange 🫤

  • @technovangelist
    @technovangelist • 21 days ago

    Sorry. You must have remembered that wrong. Docker on Mac with Apple Silicon has no access to the GPU. And ollama doesn't work with the GPU on Intel Macs either.
