Interceptors

Introduction

Refer to DIAL Interceptors Python SDK for comprehensive information, configuration details, and implementation examples.

Interceptors can be seen as middleware that modifies incoming requests or outgoing responses according to specific logic. In DIAL, interceptors facilitate the implementation of a Responsible AI approach and enforce compliance with internal and external privacy regulations and policies.

For example, interceptors can block requests that violate specific regulations, relate to restricted domains, or could lead to data leaks or biased responses. Interceptors can also restrict applications or models to responding only on specific subjects, anonymize Personally Identifiable Information (PII) in user requests, or cache LLM responses.

Watch a demo video to learn more about interceptors.

Technically speaking, interceptors in DIAL are components inserted into deployments (applications or model adapters) that can be called before or after chat completion requests.

Interceptors in DIAL can be classified into the following categories:

Pre-interceptors that only modify the incoming request from the client (e.g. rejecting requests following certain criteria)
Post-interceptors that only modify the response received from the upstream (e.g. censoring the response)
Generic interceptors that modify both the incoming request and the response from the upstream (e.g. caching the responses)

For example, to implement PII anonymization for all data sent to models through DIAL, you can use a generic interceptor that employs locally deployed NLP models to obfuscate PII in requests by replacing it with tokens (the pre-interceptor role) and to decode it in responses (the post-interceptor role), so that personal data is anonymized before it reaches the model.
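
The obfuscate-and-restore round trip can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the class name, the method names, and the token format are hypothetical, the code does not use the DIAL Interceptors SDK, and a simple e-mail regex stands in for the NLP models mentioned above.

```python
import re

# Hypothetical, deliberately simple detector: matches e-mail addresses only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class PiiAnonymizer:
    """Toy generic interceptor: obfuscates on the way in, restores on the way out."""

    def __init__(self):
        self._token_to_value = {}  # e.g. "[PII_0]" -> "jane.doe@example.com"

    def on_request(self, text: str) -> str:
        """Pre-interceptor role: replace each detected value with a token."""
        def replace(match):
            token = f"[PII_{len(self._token_to_value)}]"
            self._token_to_value[token] = match.group(0)
            return token

        return EMAIL_RE.sub(replace, text)

    def on_response(self, text: str) -> str:
        """Post-interceptor role: put the original values back."""
        for token, value in self._token_to_value.items():
            text = text.replace(token, value)
        return text


anonymizer = PiiAnonymizer()
sent_upstream = anonymizer.on_request("Write to jane.doe@example.com today")
shown_to_user = anonymizer.on_response("I emailed [PII_0].")
```

The token map lives only inside the interceptor instance, which is why the same component has to handle both directions of the exchange.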

For illustration, the diagram below shows the request flow when two interceptors are configured. Every request and response goes through DIAL Core (omitted from the diagram for brevity):

Client -> (original request) ->
  Interceptor 1 -> (modified request #1) ->
    Interceptor 2 -> (modified request #2) ->
      Upstream -> (original response) ->
    Interceptor 2 -> (modified response #1) ->
  Interceptor 1 -> (modified response #2) ->
Client
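
The nesting in the diagram behaves like ordinary function wrapping. The sketch below is an in-process analogy only: the helper names are hypothetical, and the hops through DIAL Core between the steps are left out just as they are in the diagram.

```python
from typing import Callable

Handler = Callable[[str], str]


def make_interceptor(name: str, next_handler: Handler) -> Handler:
    """Wrap next_handler so `name` sees the request before it and the response after it."""
    def handler(request: str) -> str:
        modified_request = f"{request} > {name}"   # request side
        response = next_handler(modified_request)
        return f"{response} > {name}"              # response side

    return handler


def upstream(request: str) -> str:
    """Stand-in for the target deployment."""
    return f"[{request}] upstream"


# Build the chain from the inside out: interceptor 2 sits closest to the upstream.
chain = make_interceptor("i1", make_interceptor("i2", upstream))
result = chain("client")
print(result)
```

The request is tagged by i1 and then i2, while the response is tagged by i2 and then i1, which mirrors the indentation of the diagram.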

DIAL Core handles chat completion requests from interceptors through the endpoint /openai/deployments/interceptor/chat/completions. It uses the reserved deployment name interceptor to handle requests from all interceptors. Upon receiving a request, DIAL Core identifies the next interceptor based on the per-request API key. The final step in the sequence is always the target deployment (application or model).
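
One way to picture key-based routing is a lookup table from per-request keys to the next hop. The toy model below is an assumption made for illustration and is not DIAL Core's implementation: the class, its methods, the key format, and the idea of issuing one key per hop are all hypothetical.

```python
import secrets

RESERVED_DEPLOYMENT = "interceptor"


class ToyRouter:
    """Toy stand-in for the routing described above."""

    def __init__(self):
        self._next_by_key = {}  # per-request key -> (chain, index of the next hop)

    def issue_key(self, chain, next_index):
        """Create the key handed to a hop; it encodes where that hop's call goes next."""
        key = secrets.token_hex(8)
        self._next_by_key[key] = (chain, next_index)
        return key

    def resolve(self, deployment, key):
        """Resolve a call addressed to the reserved deployment name."""
        if deployment != RESERVED_DEPLOYMENT:
            raise ValueError(f"expected reserved deployment name, got {deployment!r}")
        chain, index = self._next_by_key[key]
        return chain[index]


router = ToyRouter()
# The target deployment is always the last element of the chain.
chain = ["gpt-cache", "pii-anonymizer", "chat-gpt-4"]
key_for_cache = router.issue_key(chain, next_index=1)
key_for_pii = router.issue_key(chain, next_index=2)
```

Every interceptor calls the same reserved deployment name; only the key it presents decides whether the call lands on another interceptor or on the target deployment.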

Interceptors SDK

You can use the DIAL Interceptors Python SDK to create custom interceptors. Refer to Examples for sample implementations.

DIAL Core Configuration

Interceptors can be defined and assigned in DIAL Core dynamic settings. DIAL administrators can add and assign interceptors via the DIAL Admin Panel.

Refer to Interceptors SDK for detailed configuration guidelines and examples.

Step 1: Declaration

Interceptors that you want to use with your deployments (applications or models) can be defined in the interceptors section in the DIAL Core dynamic settings.

Refer to DIAL Core documentation for a detailed description of configuration parameters.

Example of the DIAL Core dynamic settings configuration

{
  "interceptors": {
    "gpt-cache": {
      "endpoint": "${INTERCEPTOR_SERVICE_URL}/openai/deployments/gpt-cache/chat/completions",
      "description": "description"
    },
    "pii-anonymizer": {
      "endpoint": "${INTERCEPTOR_SERVICE_URL}/openai/deployments/pii-anonymizer/chat/completions",
      "description": "description"
    }
  }
}
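
To see what the declared endpoints look like once the placeholder is filled in, the following sketch parses a shortened copy of the configuration and substitutes a value. How DIAL Core itself resolves ${INTERCEPTOR_SERVICE_URL} is not covered here; the helper name and the base URL http://localhost:4088 are hypothetical.

```python
import json
from string import Template

SETTINGS = """
{
  "interceptors": {
    "gpt-cache": {
      "endpoint": "${INTERCEPTOR_SERVICE_URL}/openai/deployments/gpt-cache/chat/completions",
      "description": "description"
    }
  }
}
"""


def resolved_endpoints(settings_text: str, variables: dict) -> dict:
    """Return {interceptor name: endpoint} with ${...} placeholders filled in."""
    interceptors = json.loads(settings_text)["interceptors"]
    return {
        name: Template(entry["endpoint"]).substitute(variables)
        for name, entry in interceptors.items()
    }


# Hypothetical base URL of the service hosting the interceptors.
endpoints = resolved_endpoints(SETTINGS, {"INTERCEPTOR_SERVICE_URL": "http://localhost:4088"})
print(endpoints)
```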

Step 2: Usage

Once you have declared your interceptors, you can use them as global, application type, or local interceptors.

When all categories of interceptors are configured, they are triggered in the following order:

Chat completion request: global interceptor -> application type interceptor -> local interceptor
Response for the chat completion request: local interceptor -> application type interceptor -> global interceptor

Global interceptors tend to have the strictest rules. They receive the original input first and examine the response last.
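
The ordering rule can be written down as two small helper functions. The function names are hypothetical; the only assumption beyond the text above is that interceptors within one category run in the order in which they are listed.

```python
def request_order(global_ids, app_type_ids, local_ids):
    """Order in which interceptors see the chat completion request."""
    return [*global_ids, *app_type_ids, *local_ids]


def response_order(global_ids, app_type_ids, local_ids):
    """Order in which interceptors see the response: the request order reversed."""
    return list(reversed(request_order(global_ids, app_type_ids, local_ids)))


on_request = request_order(["g1", "g2"], ["t1"], ["l1"])
on_response = response_order(["g1", "g2"], ["t1"], ["l1"])
```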

Local Interceptors

Local interceptors are configured for and applied to a specific instance of an application. They can be set by a DIAL admin when an application is published or modified, by the application author when creating an application via the DIAL Core API, or in the applications section of the DIAL Core dynamic settings.

Configuration of a local interceptor for applications in DIAL Core dynamic settings

Refer to Interceptors SDK for a complete example.

{
  "applications": {
    "app": {
      "endpoint": "http://localhost:7001/openai/deployments/10k/chat/completions",
      "displayName": "App",
      "iconUrl": "https://host/app.svg",
      "interceptors": ["interceptor-id"]
    }
  }
}
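
Because an application refers to interceptors by ID, a quick consistency check before deploying a configuration can catch typos. The helper below is a hypothetical sketch; whether DIAL Core performs a similar validation itself is not stated here.

```python
def undeclared_interceptors(settings: dict) -> dict:
    """Return {application name: [interceptor IDs missing from the interceptors section]}."""
    declared = set(settings.get("interceptors", {}))
    problems = {}
    for app_name, app in settings.get("applications", {}).items():
        missing = [i for i in app.get("interceptors", []) if i not in declared]
        if missing:
            problems[app_name] = missing
    return problems


settings = {
    "interceptors": {"gpt-cache": {"endpoint": "http://localhost:4088/gpt-cache"}},
    "applications": {
        "app": {"interceptors": ["gpt-cache", "interceptor-id"]},
        "other-app": {"interceptors": ["gpt-cache"]},
    },
}
problems = undeclared_interceptors(settings)
print(problems)
```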

Global Interceptors

Global interceptors apply to any deployment in DIAL and tend to have the strictest rules, because they receive the original input first and examine the response last.

Global interceptors can be defined in the globalInterceptors section in the DIAL Core dynamic settings.

DIAL Core dynamic settings configuration example

{
  "globalInterceptors": ["interceptor-id", "interceptor-id2"]
}

Application Type Interceptors

Application Type interceptors apply to schema-rich applications.

To enable Application Type Interceptors

In the DIAL Core dynamic settings, provide a JSON schema for the specific application type that includes the "dial:applicationTypeInterceptors" property with a list of interceptors.

{
  "applicationTypeSchemas": [
    {
      "$schema": "https://dial.epam.com/application_type_schemas/schema#",
      "$id": "https://mydial.somewhere.com/custom_application_schemas/specific_application_type",
      "dial:applicationTypeEditorUrl": "https://mydial.somewhere.com/custom_application_schemas/schema",
      "dial:applicationTypeViewerUrl": "https://mydial.somewhere.com/custom_application_schemas/viewer",
      "dial:applicationTypeDisplayName": "Specific Application Type",
      "dial:applicationTypeCompletionEndpoint": "http://specific_application_service/openai/v1/completion",
      "dial:applicationTypeInterceptors": [
        "interceptor-id",
        "interceptor-id2"
      ]
    }
  ]
}

Flow

To demonstrate the flow, let's take two local interceptors, gpt-cache and pii-anonymizer, configured for the GPT-4 model:

{
  "models": {
    "chat-gpt-4": {
      "interceptors": ["gpt-cache", "pii-anonymizer"]
    }
  }
}

DIAL Core receives a request from DIAL Chat to query the GPT-4 model.
The first interceptor, gpt-cache, checks the cache for the request. If found, the response is returned to DIAL Core; if not, the request is forwarded to pii-anonymizer.
The pii-anonymizer interceptor anonymizes any personally identifiable information (PII) in the request and forwards it to GPT-4.
After all interceptors have processed the request, DIAL Core sends it directly to the GPT-4 model.
DIAL Core retrieves the response from GPT-4 and forwards it to pii-anonymizer.
The pii-anonymizer interceptor restores the original PII in the response and passes it to gpt-cache.
The gpt-cache interceptor stores the response in the cache and returns it to DIAL Core.
DIAL Core sends the final response back to DIAL Chat.
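
The steps above can be simulated in-process to check the order of calls and the cache short-circuit. Everything in this sketch is a toy stand-in: the function names, the fixed PII table, and the echoing model are hypothetical, and the hops through DIAL Core are collapsed into direct function calls.

```python
trace = []          # which components handled each request, in order
seen_by_model = []  # what actually reached the model
cache = {}

# Hypothetical fixed table standing in for a real PII detector.
KNOWN_PII = {"bob@example.com": "[PII_0]"}


def model(request):
    trace.append("model")
    seen_by_model.append(request)
    return f"echo: {request}"


def pii_anonymizer(request, next_step):
    trace.append("pii-anonymizer")
    for value, token in KNOWN_PII.items():
        request = request.replace(value, token)    # anonymize the request
    response = next_step(request)
    for value, token in KNOWN_PII.items():
        response = response.replace(token, value)  # restore PII in the response
    return response


def gpt_cache(request, next_step):
    trace.append("gpt-cache")
    if request in cache:
        return cache[request]  # cache hit: the later steps are skipped
    response = next_step(request)
    cache[request] = response
    return response


def handle(request):
    """Toy stand-in for DIAL Core dispatching the chain configured for chat-gpt-4."""
    return gpt_cache(request, lambda r: pii_anonymizer(r, model))


first = handle("Mail bob@example.com")
second = handle("Mail bob@example.com")  # identical request, served from the cache
```

The first call passes through both interceptors and the model, while the second call stops at gpt-cache. Note that in this ordering the cache stores the restored response, since gpt-cache sits outside pii-anonymizer.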
