Models Servings

Introduction

The Model Servings page enables DIAL admins to deploy and manage containers for AI models listed at NVIDIA NIM and Hugging Face.

How to Use Models

To be able to use AI models in DIAL, you need adapters. Model adapters unify the APIs of respective AI models to align with the Unified Protocol of DIAL Core. DIAL includes adapters for Azure OpenAI models, GCP Vertex AI models, and AWS Bedrock models. You can also create custom adapters for other AI models with DIAL SDK.

You can use DIAL OpenAI adapter to work with compatible models listed on Hugging Face or NVIDIA NIM. For other models not compatible with OpenAI API, you need to create custom adapters.

To enable a model in DIAL:

Add and run a model serving container with an OpenAI-compatible model from Hugging Face or NIM.
Unless it is a part of your DIAL setup, create a new adapter based on DIAL Azure OpenAI Adapter and add it in Builders/Adapters.
In Entities/Models, create a new model entity:
- As a Source Type, select your OpenAI adapter.
- As an Override Name, use the model name from the running model serving container. You can find it in the container logs.
- Add Upstream Endpoint with the URL of your model serving running container. Follow this pattern: http://<container_url>/openai/v1/chat/completions.
Now the AI model is available for users and apps based on your permissions model.

Main Screen

On the main screen, you can view existing and add new AI model servings.

Model servings grid

Field	Description
Display Name	Name of the model serving rendered on UI.
Description	Brief description of the model serving.
Source Type	Source type of the model (NIM or Hugging Face).
Status	Current status of the model serving.
ID	Unique identifier for the model serving.
Container URL	URL of the container where the model is hosted. Available for a running container.
Maintainer	Person or team responsible for maintaining the model serving.
Create time	Date and time when the model serving was created.
Update time	Date and time when the model serving was last updated.

Create

On the main screen, click the Create button to open the Create Model Serving form.

To create a new model serving:

Click the Create button on the main screen to open the Create Model Serving form.
Fill in the required fields in the form:
- ID: Unique identifier for the model serving.
- Display Name: Enter a name for the model serving.
- Description: Provide a brief description of the model serving.
- Source Type: Select the source type (NIM or Hugging Face).
- Hugging Face Model Name: Applies to Hugging Face source type. Enter the name of the model from Hugging Face.
- Docker Image URI: Applies to NIM source type. Enter the Docker image URI for the model.
Click the Create button to submit the form and create the model serving.

Configuration Screen

Click any model serving from the main screen to open its configuration screen.

Actions

In the header of the Configuration screen, you can find the following action buttons:

Action	Description
Create Model	Available for running model servings. Click to create a new model deployment using this selected model serving.
Run/Stop	Click to start or stop the selected model serving.
Delete	Click to delete the selected model serving.

To Create Model

You can use a running model serving container to create a new model deployment in DIAL. Once created, the model deployment appears in Entities/Models. Refer to How to Use Models section for more details on how to enable models in DIAL.

In the Configuration screen of the running model serving, click the Create Model button in the header.
In the Create Model dialog, fill in the form fields:
- ID: Unique identifier for the model deployment.
- Display Name: Enter a name for the model deployment.
- Display Version: Specify a version of the model deployment.
- Description: Provide a brief description of the model deployment.
Click the Create button to submit the form and create the model deployment. Repeat these steps to create more model deployments if needed.

Properties

In the Properties tab, you can view and edit the selected model serving container settings.

Property	Required	Editable	Description
ID	-	No	Unique identifier of the model serving container.
Type	-	No	Container by default.
Creation Time	-	No	Date and time when the model serving container was created.
Updated Time	-	No	Date and time when the model serving container was last updated.
Status	-	No	Current status of the model serving container.
URL	-	No	URL of the container where the model is hosted.
Display Name	Yes	Yes	Name of the model serving container rendered in UI.
Description	No	Yes	Brief description of the model serving container.
Maintainer	No	Yes	Person or team responsible for maintaining the model serving container.
Source Type	Yes	Yes	Source type of the model (NIM or Hugging Face).
Hugging Face model name	Conditional	Yes	Applies to Hugging Face source type. The name of the model from Hugging Face.
Docker Image URI	Conditional	Yes	Applies to NIM source type. The Docker image URI for the model.
Endpoint Configuration	No	Yes	Port configuration for the model serving.
Environment Variables	No	Yes	List of environment variables for the model serving.
Resources	No	Yes	Resource allocation settings for the model serving (CPU, Memory, GPU).
Configuration	No	Additional configuration settings for the model serving container.

Advanced users with technical expertise can work with model serving properties in the table or a JSON editor view modes. It is useful for advanced scenarios of bulk updates, copy/paste between environments, or tweaking settings not exposed on UI.

Execution log

In the Execution Log tab, you can view the logs related to the operations and activities of the selected model serving.

Events

In the Events tab, you can view the event history related to the selected model serving.

Introduction​

How to Use Models​

To enable a model in DIAL:​

Main Screen​

Model servings grid​

Create​

To create a new model serving:​

Configuration Screen​

Actions​

To Create Model​

Properties​

Execution log​

Events​