Skip to main content

Models Servings

Introduction

The Model Servings page enables DIAL admins to deploy and manage containers for AI models listed at NVIDIA NIM and Hugging Face.

How to Use Models

To be able to use AI models in DIAL, you need adapters. Model adapters unify the APIs of respective AI models to align with the Unified Protocol of DIAL Core. DIAL includes adapters for Azure OpenAI models, GCP Vertex AI models, and AWS Bedrock models. You can also create custom adapters for other AI models with DIAL SDK.

You can use DIAL OpenAI adapter to work with compatible models listed on Hugging Face or NVIDIA NIM. For other models not compatible with OpenAI API, you need to create custom adapters.

To enable a model in DIAL:
  1. Add and run a model serving container with an OpenAI-compatible model from Hugging Face or NIM.
  2. Unless it is a part of your DIAL setup, create a new adapter based on DIAL Azure OpenAI Adapter and add it in Builders/Adapters.
  3. In Entities/Models, create a new model entity:
    • As a Source Type, select your OpenAI adapter.
    • As an Override Name, use the model name from the running model serving container. You can find it in the container logs.
    • Add Upstream Endpoint with the URL of your model serving running container. Follow this pattern: http://<container_url>/openai/v1/chat/completions.
  4. Now the AI model is available for users and apps based on your permissions model.

Main Screen

On the main screen, you can view existing and add new AI model servings.

Model servings grid
FieldDescription
Display NameName of the model serving rendered on UI.
DescriptionBrief description of the model serving.
Source TypeSource type of the model (NIM or Hugging Face).
StatusCurrent status of the model serving.
IDUnique identifier for the model serving.
Container URLURL of the container where the model is hosted.
Available for a running container.
MaintainerPerson or team responsible for maintaining the model serving.
Create timeDate and time when the model serving was created.
Update timeDate and time when the model serving was last updated.

Create

On the main screen, click the Create button to open the Create Model Serving form.

To create a new model serving:
  1. Click the Create button on the main screen to open the Create Model Serving form.
  2. Fill in the required fields in the form:
    • ID: Unique identifier for the model serving.
    • Display Name: Enter a name for the model serving.
    • Description: Provide a brief description of the model serving.
    • Source Type: Select the source type (NIM or Hugging Face).
    • Hugging Face Model Name: Applies to Hugging Face source type. Enter the name of the model from Hugging Face.
    • Docker Image URI: Applies to NIM source type. Enter the Docker image URI for the model.
  3. Click the Create button to submit the form and create the model serving.

Configuration Screen

Click any model serving from the main screen to open its configuration screen.

Actions

In the header of the Configuration screen, you can find the following action buttons:

ActionDescription
Create ModelAvailable for running model servings.
Click to create a new model deployment using this selected model serving.
Run/StopClick to start or stop the selected model serving.
DeleteClick to delete the selected model serving.

To Create Model

You can use a running model serving container to create a new model deployment in DIAL. Once created, the model deployment appears in Entities/Models. Refer to How to Use Models section for more details on how to enable models in DIAL.

  1. In the Configuration screen of the running model serving, click the Create Model button in the header.
  2. In the Create Model dialog, fill in the form fields:
    • ID: Unique identifier for the model deployment.
    • Display Name: Enter a name for the model deployment.
    • Display Version: Specify a version of the model deployment.
    • Description: Provide a brief description of the model deployment.
  3. Click the Create button to submit the form and create the model deployment. Repeat these steps to create more model deployments if needed.

Properties

In the Properties tab, you can view and edit the selected model serving container settings.

PropertyRequiredEditableDescription
ID-NoUnique identifier of the model serving container.
Type-NoContainer by default.
Creation Time-NoDate and time when the model serving container was created.
Updated Time-NoDate and time when the model serving container was last updated.
Status-NoCurrent status of the model serving container.
URL-NoURL of the container where the model is hosted.
Display NameYesYesName of the model serving container rendered in UI.
DescriptionNoYesBrief description of the model serving container.
MaintainerNoYesPerson or team responsible for maintaining the model serving container.
Source TypeYesYesSource type of the model (NIM or Hugging Face).
Hugging Face model nameConditionalYesApplies to Hugging Face source type.
The name of the model from Hugging Face.
Docker Image URIConditionalYesApplies to NIM source type.
The Docker image URI for the model.
Endpoint ConfigurationNoYesPort configuration for the model serving.
Environment VariablesNoYesList of environment variables for the model serving.
ResourcesNoYesResource allocation settings for the model serving (CPU, Memory, GPU).
ConfigurationNoAdditional configuration settings for the model serving container.

Advanced users with technical expertise can work with model serving properties in the table or a JSON editor view modes. It is useful for advanced scenarios of bulk updates, copy/paste between environments, or tweaking settings not exposed on UI.

Execution log

In the Execution Log tab, you can view the logs related to the operations and activities of the selected model serving.

Events

In the Events tab, you can view the event history related to the selected model serving.