
Launch DIAL Chat with a Self-Hosted Model

Introduction

In this tutorial, you will learn how to quickly launch DIAL Chat with a self-hosted model powered by vLLM.

Prerequisites

Docker Engine installed on your machine (Docker Compose version 2.20.0 or later).

Refer to the Docker documentation.

Step 1: Get DIAL

Clone the repository containing the tutorials and change to the following directory:

cd dial-docker-compose/vllm

Step 2: Choose a model to run

vLLM supports a wide range of popular open-source models. We'll demonstrate how to integrate a Hugging Face chat model served by vLLM into the DIAL Platform.
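As an illustration, a minimal .env file might pin the model to serve. The VLLM_CHAT_MODEL variable name comes from this tutorial; the specific Hugging Face model ID below is an illustrative choice, not a recommendation:

```shell
# .env — Hugging Face model ID for vLLM to serve (illustrative example)
VLLM_CHAT_MODEL=HuggingFaceH4/zephyr-7b-beta
```

Any chat model supported by vLLM can be substituted here; larger models need correspondingly more download time, disk space, and memory.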

Step 3: Launch DIAL Chat

  1. Configure the .env file in the current directory according to the type of model you've chosen.

  2. Then run the following command to start the vLLM server and the core DIAL Platform components:

    docker compose up --abort-on-container-exit

    Keep in mind that a typical lightweight Hugging Face model is a few gigabytes in size, so the first run may take several minutes or more to download it, depending on your internet bandwidth and the model you choose.

  3. Finally, open http://localhost:3000/ in your browser to launch the DIAL Chat application, then select the self-hosted chat model deployment corresponding to VLLM_CHAT_MODEL to start a conversation.
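Once the containers are up, you can also sanity-check the model from the command line through the DIAL Core API. The sketch below assumes the defaults of the docker-compose quickstart setup: DIAL Core listening on port 8080, the out-of-the-box dial_api_key API key, and a hypothetical deployment name — check your compose file and core configuration for the actual values:

```shell
# Send a test chat completion through DIAL Core.
# Port, API key, and deployment name are assumptions from the default
# docker-compose setup; replace them with the values in your configuration.
curl http://localhost:8080/openai/deployments/self-hosted-vllm-chat/chat/completions \
  -H "Api-Key: dial_api_key" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```

A JSON response containing the model's reply confirms that DIAL Core can reach the vLLM server.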