ChatGPT - but Open Sourced | Running HuggingChat locally (VM) | Chat-UI + Inference Server + LLM

Science & Technology

In this video you will learn how to run HuggingChat, an open-source ChatGPT alternative, locally (on a VM) and interact with the Open Assistant model, or with any other LLM, in two variants.
Variant 1: Run just the Chat-UI locally and use a remote inference endpoint from Hugging Face.
Variant 2: Run the whole stack (the Chat-UI, the Text Generation Inference server and the Open Assistant LLM) on your virtual machine.
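
For orientation, here is a rough sketch of how the two variants differ in the Chat-UI configuration. The variable names (MONGODB_URL, HF_ACCESS_TOKEN, MODELS) and the model id follow the public chat-ui repository and are assumptions for illustration, not the exact values used in the video:

```bash
# .env.local for chat-ui (sketch; check the repo's .env for your version's exact syntax)
MONGODB_URL=mongodb://localhost:27017   # chat-ui stores conversations in MongoDB
HF_ACCESS_TOKEN=hf_xxx                  # Hugging Face token, needed for the hosted endpoint (variant 1)

# Variant 1: omit "endpoints" and chat-ui calls the hosted Hugging Face inference
# endpoint for the model named below.
# Variant 2: add "endpoints" pointing at your local Text Generation Inference
# server (port 8080 here is an assumption).
MODELS='[{"name":"OpenAssistant/oasst-sft-6-llama-30b-xor","endpoints":[{"url":"http://127.0.0.1:8080"}]}]'
```
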
Chapters in this video:
0:00 - Intro and Explanation
0:48 - Live Demo: Running HuggingChat locally
4:53 - Installing the Chat-UI
10:19 - Run the Chat-UI
11:26 - Text Generation Inference Server
12:57 - Set up the Inference Server
15:37 - Run the Inference Server
16:22 - Testing the Open Assistant Model
17:08 - Connect Chat-UI to Inference Server
18:52 - Outro
Video related links:
- SSH into Remote VM with VS Code: • SSH into Remote VM wit...
- Detailed instructions and Scripts (Free): www.blueantoinette.com/2023/0...
- Downloadable Installation Scripts (Paid): www.blueantoinette.com/produc...
About us:
- Homepage: www.blueantoinette.com/
- Contact us: www.blueantoinette.com/contac...
- Twitter: @blueantoinette_
- Consulting Hour: www.blueantoinette.com/produc...
Hashtags:
#chatgpt #opensource #huggingchat #openassistant

Comments: 44

  • @BlueAntoinette (1 year ago)

    In the meantime we have created a HuggingChat plugin for aitom8, our professional AI automation software. It allows you to install HuggingChat with just one command. Everything explained in this video is still valid and fully functional, but please consider this video to further improve your efficiency: kzread.info/dash/bejne/eoNluJmkfLTbZtY.html

  • @kaitglynn2472 (11 months ago)

    Thank you so much for this wealth of knowledge!! Spectacular job!

  • @BlueAntoinette (11 months ago)

    Kait, I thank you so much!

  • @itsmith32 (1 year ago)

    Thank you so much! Great job

  • @BlueAntoinette (1 year ago)

    Thx! 😀

  • @BlueAntoinette (11 months ago)

    Update: New video about running Code Llama locally available: kzread.info/dash/bejne/n5ylmKSKiJPFgJM.html

  • @chuizitech (11 months ago)

    Thanks, brother! I was just about to set up a private, self-hosted deployment.

  • @BlueAntoinette (11 months ago)

    You're welcome.

  • @user-fu7er9gl1g (1 year ago)

    Thanks for this detailed tutorial. Would you mind sharing the scripts that you created?

  • @BlueAntoinette (1 year ago)

    Hi, I have now added a link to my instructions and scripts in the video description. You can also access them directly on our site at this link: www.blueantoinette.com/2023/05/09/chatgpt-but-open-sourced-running-huggingchat-locally-vm/

  • @MultiTheflyer (1 year ago)

    Thank you!!! This has been super useful. I'm trying to use this front end, but I'd like to use the OpenAI APIs as a backend, because they currently support function calling (I don't know of any other model that does). I'm quite new to programming in general and have no experience with Docker. My understanding, though, is that the Hugging Face Chat-UI front end cannot be "edited" and can only be deployed as is, because it's already in a container. Is that correct? I'd like to change it slightly so that it shows when a function is being called, etc., but it seems that's not possible, right? Thanks again for the useful tutorial, it really did open up a new world of possibilities for me.

  • @BlueAntoinette (1 year ago)

    Not quite right. I do not run the Chat-UI in a container; instead I run its source code directly with npm run, please check this out again in the video. If you want to make changes to the source code, simply clone or fork the repo and adapt it to your needs. The Chat-UI is written in TypeScript and it uses Svelte and Tailwind, so you will want to make yourself familiar with these technologies.
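
    For readers who want to try this, a minimal sketch of what that looks like, assuming Node.js, npm and a running MongoDB instance are already in place (the default Vite dev port 5173 is an assumption, not a value from the video):

```bash
# Clone the public chat-ui repository and run it from source (no container).
git clone https://github.com/huggingface/chat-ui.git
cd chat-ui

# Put MONGODB_URL, your tokens and the MODELS definition into .env.local first.
npm install
npm run dev        # dev server, listens on port 5173 by default
```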

  • @MultiTheflyer (1 year ago)

    @@BlueAntoinette thank you!

  • @oryxchannel (1 year ago)

    I wonder if you could build a privacy filter out of diverse prompt clusters. Tell it that it's all PII or something, so that the VM isn't able to read your data. Tunneling and all that. This may be an added privacy solution, or maybe it won't work at all. But the fact that it's on a Google virtual machine does not mean that it is "local" or "private". Also, if you have the support for it, an MMLU AI benchmark video would be helpful.

  • @BlueAntoinette (11 months ago)

    Interesting…

  • @thevadimb (1 year ago)

    First, thank you for your video and for sharing your experience! A question - why did you allocate two GPUs? Why do you need more than one for simple inference purposes?

  • @BlueAntoinette (1 year ago)

    Well, this was a little bit of trial and error. I first increased the number of GPUs and then, because it still did not work, the CPUs and RAM, which eventually turned out to be the deciding factor. So potentially you can get away with just one GPU, but I did not test that.
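
    For reference, creating such a VM on Google Cloud looks roughly like the command below. The machine type, GPU type, image and disk size are illustrative assumptions, not the exact specs used in the video:

```bash
# Provision a GPU VM on GCP (adjust zone, sizes and image to your needs).
gcloud compute instances create hugging-chat-vm \
  --zone=us-central1-a \
  --machine-type=n1-standard-16 \
  --accelerator=type=nvidia-tesla-t4,count=2 \
  --maintenance-policy=TERMINATE \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud \
  --boot-disk-size=200GB
```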

  • @thevadimb (1 year ago)

    @@BlueAntoinette Thank you!

  • @BlueAntoinette (1 year ago)

    @@thevadimb FYI, I have now tried it with just one GPU, but this results in an error: "AssertionError: Each process is one gpu". Then I tried to reduce the number of shards to 1, but this results in waiting endlessly with the message "Waiting for shard 0 to be ready...". Therefore the only reliable configuration so far is the one that I show in the video (with 2 GPUs).
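
    For context, a sketch of that 2-GPU / 2-shard setup with the official Text Generation Inference Docker image; the model id and host port are placeholders, not necessarily the ones used in the video:

```bash
# Start the inference server with one shard per GPU.
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 \
  --num-shard 2

# Quick smoke test once the shards report ready:
curl http://127.0.0.1:8080/generate \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello, who are you?", "parameters": {"max_new_tokens": 50}}'
```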

  • @thevadimb (1 year ago)

    @@BlueAntoinette Thank you for devoting your time to checking this point. It is a bit weird that it requires at least two GPUs. HF did tremendous work building this server, so it is strange that after all this profound design they ended up with such a restriction. I would bet that there is some hidden configuration setting... Probably 🙂

  • @BlueAntoinette (1 year ago)

    @@thevadimb Well, apparently they optimized it for their needs. Maybe there are settings for this, or it requires changes to the code and a rebuild of their Docker image. However, that's beyond the time I can spend on it for free.

  • @eduardmart1237 (1 year ago)

    Is it possible to train it on custom data? What are the ways to do it? Does it support any languages other than English?

  • @BlueAntoinette (1 year ago)

    Theoretically you can run it with any model; however, I have only tested it with the Open Assistant model so far.

  • @ShravaniSreeRavinuthala (2 months ago)

    Thank you for this video. I am trying to use the UI with my custom backend server, which has a RAG setup in it, but all it needs as parameters are the queries. From what I have explored, it looks like I have to make changes in the source code. Is there any easier way to achieve this?

  • @BlueAntoinette (2 months ago)

    I did the same with my RAG backend and had to make changes to the source code as well. Learn more about my solution here: aitomChat - Talk with documents | Retrieval Augmented Generation (RAG) | Huggingchat extension kzread.info/dash/bejne/oGpntaaegd3deMY.html

  • @frankwilder6860 (11 months ago)

    Is there an easy way to run the HuggingChat UI on port 80 with SSL encryption?

  • @BlueAntoinette (11 months ago)

    Yes, you can set up an NGINX reverse proxy with SSL encryption. I fully automated this process in this video: kzread.info/dash/bejne/qGR4lNSHeNC5dJc.htmlsi=xjU2QGt_vQHXaBgj With this approach it takes just one command!
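
    For readers who prefer to set this up by hand, a rough sketch of the manual steps on Ubuntu; the domain name and the Chat-UI port 5173 are assumptions:

```bash
# Reverse-proxy the local Chat-UI through NGINX on port 80.
sudo apt-get install -y nginx
sudo tee /etc/nginx/sites-available/chat-ui >/dev/null <<'EOF'
server {
    listen 80;
    server_name chat.example.com;
    location / {
        proxy_pass http://127.0.0.1:5173;
        proxy_set_header Host $host;
    }
}
EOF
sudo ln -s /etc/nginx/sites-available/chat-ui /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

# Obtain a certificate; certbot rewrites the server block for HTTPS.
sudo apt-get install -y certbot python3-certbot-nginx
sudo certbot --nginx -d chat.example.com
```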

  • @user-zz9pb9yg9q (7 months ago)

    Damn, almost 600 USD monthly for the inference server alone.

  • @BlueAntoinette (7 months ago)

    Well, only if you choose variant 2. Actually, the high costs for variant 2 are caused by the required hardware; the inference server (the software) comes at no cost (apart from integration, automation, maintenance, etc.). LLMs are very resource intensive, and cloud providers charge a lot for the required GPUs. Alternatively, you can stick with the remote endpoints (variant 1).

  • @deathybrs (1 year ago)

    I am a little curious - why a VM?

  • @BlueAntoinette (1 year ago)

    You mean in contrast to running on your local machine? Well, there are several reasons. For example, if you do not have sufficient hardware resources on your local machine, which is especially likely when you choose variant 2. Or if you want to make it publicly available with SSL encryption and a reverse proxy.

  • @fredguth1315 (1 year ago)

    Also, if you are developing as a team, a VM is handy for keeping the environment in sync.

  • @deathybrs (1 year ago)

    At the end of the day, I think maybe I should have been more clear in my question. Why a VM *before* explaining how to set it up *without* a VM? I understand the value of a VM, but there aren't many videos explaining how to do this, so why *start* with the VM explanation rather than explaining how to get it set up in our native environment first?

  • @BlueAntoinette (1 year ago)

    @@deathybrs When it comes down to the Chat-UI, there is no difference between installing on a VM and on a local Linux-compatible machine. If you don't want to use a VM, then you don't have to for the UI part. If you run Windows locally, you could use WSL as well. If you want to discuss your situation in more detail, feel free to share your native environment.

  • @deathybrs (1 year ago)

    @@BlueAntoinette I really appreciate that, thanks! At this point I'm not actually ready to set it up, as my use case for AI is 90% diffusion rather than LLMs, and I suspect that unless I need it soon, the tech will have changed shape so much by the time I do get there that a video made now would not be applicable enough to be worth your time. But as I said, your kindness is certainly appreciated!

  • @GAMINGDEADCRACKER (1 year ago)

    May I get your email address? I want to know more about it.

  • @BlueAntoinette (1 year ago)

    Yes, please find my contact details here: www.blueantoinette.com/contact-us/ Thx!
