Skip to main content

DIAL as Application Server

Introduction

DIAL can act as an application server facilitating the development, hosting, deployment, and management of GenAI applications.

Why use DIAL?

DIAL provides developers with a comprehensive environment for creating GenAI applications, along with robust middleware and tools to manage applications at every stage of their lifecycle. By leveraging the DIAL technology, you can focus on innovation and core functionality without the need to reinvent the wheel.

Develop

The DIAL platform can be used as a development studio to create GenAI applications. Developers can choose between two primary approaches:

  • Developing from scratch: You can develop an application yourself from ground up. DIAL includes SDK and a unified API to streamline this process. When developing apps, you can use a technology of your preference, be it any LLM framework, LlamaIndex, LangChain, Semantic Kernel, vector DBs or any other. Apps created with DIAL API and SDK are fully compatible with the unified protocol of DIAL, which enables calling them by other applications as agents (building blocks) in a custom multi-agent workflow.
  • Using predefined application templates: DIAL includes predefined templates, known as Application Types, to simplify the creation of specific types of applications. These templates include Quick Apps, Code Apps, and Mind Maps. Application types are based on schemas that define the structure of an application, including application's UI and editor. Additionally, DIAL allows developers to create custom application types, enabling end-users to add specialized apps using no-code editors on UI. Custom application types can support a wide variety of logic, custom UI designs (including non-conversational interfaces), and tailored UI wizards.

The platform also includes a rich set of tools and features to support the development of powerful, multi-agent applications:

  • API: DIAL follows an API-First approach, providing access to all its features via the Unified API.
  • SDK: A framework for creating applications and model adapters for DIAL. Applications and model adapters implemented using this framework are fully compatible with DIAL API, which is based on the Azure OpenAI API.
  • Unified Protocol: The DIAL Core unified protocol is fully compatible with the OpenAI API, making it easier to develop and integrate new applications. This single, standardized protocol is used for all applications and models deployed within DIAL. It supports a wide range of features, including MCP server calling, tool calling, streaming, seeds, multi-modality, and more.
  • Agents: You can use DIAL Unified API to call any other deployed agent (yours or created by other users based on your permissions) directly from your application's code. Similarly, you can allow others to use your apps as agents, promoting reusability and enabling the creation of multi-agent workflows.
  • Application Runners: DIAL supports the concept of application runners which process parameters of specific application types. Refer to Schema-Rich Applications for more details.
  • Experimentation and Prototyping: Application Runners enable users to quickly prototype apps in no-code editors and test them before deployment. Code apps enable power users to write and execute custom Python code directly within the DIAL Chat application and deploy it on the platform's infrastructure in a few clicks. Once ready, applications can be published on the Marketplace or shared with specific users.
  • Middleware: DIAL provides a robust middleware available out of the box to create powerful apps, enabling you to focus on your core business needs without having to reinvent the wheel. The middleware includes:
    • Language models: DIAL supports connectivity with leading LLM vendors, allowing you to configure the system to match your preferences. You can also integrate language models from the open-source community, alternative vendors, fine-tuned micro models, self-hosted models, or those listed on platforms like HuggingFace or DeepSeek.
    • Interceptors: Interceptors let you add custom logic to incoming and outgoing requests for models and applications. This enables functionality such as PII obfuscation, guardrails, safety checks, and more.
    • RBAC: DIAL integrates with various identity providers (IDPs), allowing you to implement and support a custom RBAC system tailored to your organizational needs.
    • Rate Limit: DIAL allows you to define flexible rate limits for JWT and API keys, giving you control over the usage of your applications and models.
    • Load balancer: DIAL includes a powerful load balancer with PTU (Processing Time Units) support, enabling efficient distribution of requests to LLMs across various resources. This helps prevent bottlenecks, improves fault tolerance, and optimizes costs.
    • Observability: With tools powered by OpenTelemetry, DIAL provides insights into your system's performance and health, helping you monitor and optimize your applications.
    • Evaluation toolkit: A suite of tools designed to evaluate the retrieval and generation capabilities of RAG-like (Retrieval-Augmented Generation) applications.
    • Additional Tools: DIAL also includes tools for collecting and visualizing usage analytics, managing logs, debugging applications, and more.
  • Real-time Co-development: DIAL supports real-time collaboration during application development. By granting WRITE permissions, you can enable other users to work on the same application simultaneously, streamlining teamwork and accelerating development.

Host and Deploy

Quick apps, Code apps and Mind Maps

Applications of standard types (e.g. Quick Apps, Code Apps, and Mind Maps) are automatically hosted and deployed on the DIAL infrastructure. This eliminates the need for developers to manage tasks like hosting, scaling, file storage, and application management, as these are handled seamlessly by DIAL.

Apps hosted and deployed outside DIAL

DIAL also enables the usage of applications that are deployed and hosted outside its infrastructure. These external applications can be enabled through the DIAL API, the DIAL Chat UI wizards, direct modifications to the DIAL Core configuration, or via the DIAL Admin.

Refer to Tutorials for Developers to learn more.

Test and Evaluate

DIAL provides a range of tools to help developers test, experiment, and evaluate applications before they are released. These tools ensure that applications meet quality standards and perform as expected. Key features include:

  • Evaluation toolkit: DIAL offers a dedicated UI and library to calculate metrics and evaluate the quality of information retrieval and generation for RAG-like (Retrieval-Augmented Generation) applications. Watch a demo video.
  • Access to application logs: You can use the DIAL API and DIAL Chat UI to access and review logs for Code Apps, enabling efficient debugging and monitoring.
  • Observability and troubleshooting: DIAL leverages OpenTelemetry (OTEL) to enable system monitoring. A comprehensive metrics and tracing capabilities help identify bottlenecks, analyze health and performance, visualize metrics in tools like Grafana and PowerBI. Refer to Observability for more details.
  • Usage analytics: DIAL includes a specialized service called DIAL Analytics Realtime, which leverages techniques such as embedding algorithms, clustering, and lightweight self-hosted language models to analyze chat completion logs. The extracted insights can be visualized in tools like Grafana, providing actionable analytics for optimization. Refer to Analytics for more details.

Release

Before an application is released, it is accessible only to you as its creator. Once published, system administrators can access and manage the application through the API, DIAL Admin Panel or DIAL Chat Admin space.

Once your application is ready to be released, you can share it with specific users or publish it on the DIAL Marketplace to make it available to a broader audience.

Sharing

Sharing allows you to grant READ or WRITE access to specific users. You can share applications via the DIAL Chat UI and API, and you can revoke access at any time. Additionally, you can grant permissions to allow users to re-share the application with others.

Publishing

Publishing makes your application available on the DIAL Marketplace, where it can be accessed by users with specific roles or by all users, depending on the rules defined in your publication request. Published applications are accessible through the Marketplace, DIAL Chat, and the API. Publishing requires approval from a platform administrator, and you can revoke access to a published application at any time after it has been approved.

Operate

DIAL provides a comprehensive set of tools for application owners to access, manage, monitor, and optimize their applications throughout the entire lifecycle.

Access and Management

Users can access and manage their applications though the DIAL Chat Workspace or API. These interfaces allow users to perform a variety of actions, including editing applications, revoking usage, collaborating with others, and managing application settings.

Monitoring

DIAL offers robust monitoring capabilities to help developers and administrators analyze application performance and usage patterns:

The Analytics Realtime component enables the analysis of chat completion logs and extracts actionable insights. These insights may include any calculated statistics such as user activity, usage patterns, conversation topics, or sentiment analysis and can be visualized using tools like Grafana and PowerBI.

You can use DIAL API and DIAL Chat UI to view logs of Code apps, aiding in debugging and monitoring.

Administrators can view both real-time and historical metrics for applications. They can monitor usage patterns, enforce SLAs, optimize costs, and troubleshoot anomalies via the DIAL Admin Panel.

Evaluation

To optimize application performance, developers can use the Evaluation Toolkit to fine-tune retrieval and generation behaviors. This toolkit allows for the comparison of application outputs against ground truth data, as well as the calculation of evaluation metrics.

Watch a demo video.

Auto-Scaling

DIAL provides automatic scaling for Code Apps, dynamically adjusting resources based on real-time workload demands. This ensures optimal performance and cost efficiency, even during periods of high usage.