Access Control for End Users

Introduction

End-users authenticate in DIAL using JWT access tokens. This document explains how to grant JWT-authenticated users access to DIAL Core resources and how to define usage limits for them.

Step 1: Get JWT from IDP

When users log in, DIAL uses JWT to authenticate them. JWT tokens that represent users are issued by identity service providers (IDPs).

Refer to Configure IDPs to view supported IDPs and learn how to configure them to work with DIAL for user authentication and authorization.

Step 2: Enable Access to Resources

Access Control Overview

Access control in DIAL rests on three concepts: Objects of access (what is protected), Subjects of access (who is given access), and Actions (what kind of access is given). Objects are entities such as Models, Applications, Toolsets, Files, Prompts, and Conversations. Subjects are actors that access Objects in order to create, update, delete, read, or use them.

End-users are Subjects that authorize in DIAL Core using JWT access tokens to access DIAL Core resources.

Enable Access

By default, all authenticated users have access to public resources that are not restricted by roles.

Refer to Access Control to learn more about private and public spaces.

To enable user access to additional resources in DIAL (such as applications, toolsets, routes, or language models), you need to associate the JWT with a specific resource. This can be done via claims that identity service providers include in the JWT.

To associate a JWT with a specific resource, use a specific claim value from JWT as a value of the userRoles parameter of a corresponding deployment in DIAL Core configuration.
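To find out which claim values a token actually carries, it can help to look at its payload. The following Python sketch decodes the payload of a JWT without verifying the signature, so it is suitable for inspection only. The claim name roles and the sample token are hypothetical; which claim holds the role values depends on how your IDP is configured.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT without verifying its signature.

    For inspection only: unverified claims must never be trusted for
    authorization decisions.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # base64url payloads omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Helper used only to build a sample token for this sketch."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# A sample unsigned token; the "roles" claim name is hypothetical.
sample_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"sub": "user-1", "roles": ["myRole"]}),
    "signature",
])
claims = decode_jwt_claims(sample_token)
```

Any value found in the role claim of a real token is a candidate for the userRoles parameter.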

In the following example, a JWT with the claim value myRole has access to the chat-gpt-35-turbo language model. Using the same pattern, you can define user access to applications, toolsets, and routes.

"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
}
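The role check implied by this configuration can be sketched in Python. This is an illustration under stated assumptions, not DIAL Core's implementation: has_access is a hypothetical helper, and it assumes that a deployment without userRoles is open to every authenticated user, in line with the default described above.

```python
from typing import Iterable


def has_access(user_roles: Iterable[str], deployment: dict) -> bool:
    """Return True if any role from the JWT matches the deployment's userRoles.

    Assumption: a deployment with no userRoles is not limited by roles.
    """
    allowed = set(deployment.get("userRoles", []))
    if not allowed:
        return True
    return bool(allowed & set(user_roles))


models = {
    "chat-gpt-35-turbo": {"userRoles": ["myRole"]},
}

can_use = has_access(["myRole", "otherRole"], models["chat-gpt-35-turbo"])
cannot_use = has_access(["otherRole"], models["chat-gpt-35-turbo"])
```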

Step 3: Add Limits

You can define limits on the number of requests to models, the number of allowed tokens, cost, and sharing for JWT. This is done in the roles section of DIAL Core configuration.

In the roles section, you can define roles matching the claim values provided in JWT and set limits for these roles.

How it works

If limits are not defined for a specific role, the limits of the default role apply. If limits are not defined for the default role, the value is unlimited.

All limits operate in parallel, with requests needing to satisfy all applicable limits. When a request is rejected, users receive notifications explaining which specific constraint was triggered.
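The fallback rule above can be sketched as follows. This is a minimal illustration, not DIAL Core's implementation: resolve_limit is a hypothetical helper, and it assumes that the fallback is applied to each limit value individually and that the default role is stored under the name default in the roles section.

```python
import math


def resolve_limit(roles: dict, role: str, deployment: str, key: str) -> float:
    """Look up a limit for a role, then for the default role, else unlimited."""
    for name in (role, "default"):  # "default" as the role name is an assumption
        limits = roles.get(name, {}).get("limits", {}).get(deployment, {})
        value = limits.get(key)
        if value is not None:
            return float(value)
    return math.inf  # no limit defined anywhere: unlimited


roles = {
    "myRole": {"limits": {"chat-gpt-35-turbo": {"requestHour": "100"}}},
    "default": {"limits": {"chat-gpt-35-turbo": {"requestDay": "1000"}}},
}

hourly = resolve_limit(roles, "myRole", "chat-gpt-35-turbo", "requestHour")
daily = resolve_limit(roles, "myRole", "chat-gpt-35-turbo", "requestDay")
per_minute = resolve_limit(roles, "myRole", "chat-gpt-35-turbo", "minute")
```

Here hourly comes from myRole, daily falls back to the default role, and per_minute is unlimited because neither role defines it.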

Request Limit

A role can be configured with a limit on the number of requests to a specific model in a given timeframe.

How it works

A request limit controls the number of requests that can be sent to a specific resource within a defined timeframe. The system tracks the number of requests made, enforcing limits across hourly or daily intervals. When a limit is reached, additional requests are rejected until the time window refreshes.

Lower-level settings (requestHour) are bounded by higher-level settings (requestDay). In other words, you can have unlimited calls per hour (requestHour is null) until you hit the daily limit.
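The interaction between the hourly and daily limits can be sketched as below. This is an illustration only: RequestLimiter is a hypothetical class, and it assumes fixed hourly and daily windows counted from time zero, which may differ from how DIAL Core tracks windows. A limit of None stands for an unset (null) value.

```python
from typing import Optional


class RequestLimiter:
    """Counts requests per hour and per day; a request must satisfy both."""

    def __init__(self, request_hour: Optional[int], request_day: Optional[int]):
        # Window length in seconds -> limit (None means unlimited).
        self.limits = {3600: request_hour, 86400: request_day}
        self.counts = {}  # (window_seconds, window_index) -> request count

    def try_acquire(self, now: float) -> bool:
        keys = []
        for window, limit in self.limits.items():
            key = (window, int(now // window))
            if limit is not None and self.counts.get(key, 0) >= limit:
                return False  # rejected until this window refreshes
            keys.append(key)
        # Count the request only when every limit allowed it.
        for key in keys:
            self.counts[key] = self.counts.get(key, 0) + 1
        return True


# Unlimited per hour, three requests per day.
limiter = RequestLimiter(request_hour=None, request_day=3)
results = [limiter.try_acquire(now=t) for t in (0, 1, 2, 3)]
next_day = limiter.try_acquire(now=86400)
```

The fourth request is rejected by the daily limit even though the hourly limit is unset, and requests are accepted again once the next daily window starts. The sketch never discards old counters, which a long-running service would need to do.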

DIAL Core configuration

Refer to DIAL Core documentation to see configuration guidelines for token and request limits.

Request limits can be defined in the roles.<role_name>.limits section of the DIAL Core dynamic settings. The following values are available:

  • requestHour: Total requests per hour that can be sent to a specific resource. Default: refer to the default logic description.
  • requestDay: Total requests per day that can be sent to a specific resource. Default: refer to the default logic description.

In this example, we define request limits for the myRole role for the chat-gpt-35-turbo model.

"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "limits": {
      "chat-gpt-35-turbo": {
        "requestHour": "100000",
        "requestDay": "10000000"
      }
    }
  }
}

Token Limit

Token rate limiting controls the volume of tokens processed by AI models within specific timeframes.

How it works

A token limit controls the number of tokens that can be sent to a specific resource within a defined timeframe. The system tracks the number of tokens used, enforcing limits across minute, daily, weekly, or monthly intervals. When a limit is reached, additional requests are rejected until the time window refreshes.

Lower-level settings (e.g., minute) are bounded by higher-level settings (e.g., day, week, month). In other words, you can have unlimited tokens per minute (minute is null) until you hit the daily limit.

DIAL Core configuration

Refer to DIAL Core documentation to see configuration guidelines for token and request limits.

Token limits can be defined in the roles.<role_name>.limits section of the DIAL Core dynamic settings. The following values are available:

  • minute: Total tokens per minute that can be sent to a specific resource, managed via a floating window approach for well-distributed rate limiting. Default: refer to the default logic description.
  • day: Total tokens per day that can be sent to a specific resource, managed via a floating window approach for balanced rate limiting. Default: refer to the default logic description.
  • week: Total tokens per week that can be sent to a specific resource, managed via a floating window approach for balanced rate limiting. Default: refer to the default logic description.
  • month: Total tokens per month that can be sent to a specific resource, managed via a floating window approach for balanced rate limiting. Default: refer to the default logic description.

In this example, we define token limits for the myRole role for the chat-gpt-35-turbo model.

"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "limits": {
      "chat-gpt-35-turbo": {
        "minute": "10000",
        "day": "10000000",
        "week": "50000000",
        "month": "200000000"
      }
    }
  }
}
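To give an intuition for a floating (sliding) window, here is a generic Python sketch. It is not DIAL Core's implementation: SlidingTokenWindow is a hypothetical class, and it assumes that token usage is recorded after a response is produced and that a new request is allowed as long as the tokens used in the last window are still below the limit.

```python
from collections import deque


class SlidingTokenWindow:
    """Tracks tokens used during the last `window_seconds` seconds."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # (timestamp, tokens), oldest first
        self.total = 0

    def _evict(self, now: float) -> None:
        # Drop usage that has moved out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, tokens = self.events.popleft()
            self.total -= tokens

    def allows_request(self, now: float) -> bool:
        self._evict(now)
        return self.total < self.limit

    def record(self, now: float, tokens: int) -> None:
        self._evict(now)
        self.events.append((now, tokens))
        self.total += tokens


# 10000 tokens per minute, as in the "minute" value of the example above.
window = SlidingTokenWindow(limit=10000, window_seconds=60)
window.record(now=0, tokens=6000)
allowed_at_10 = window.allows_request(now=10)   # 6000 used so far
window.record(now=10, tokens=5000)
allowed_at_20 = window.allows_request(now=20)   # 11000 used in the last minute
allowed_at_60 = window.allows_request(now=60)   # usage from t=0 has expired
```

Unlike a fixed window, capacity here is released gradually as older usage ages out, rather than all at once at a window boundary.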

Cost Limit

Cost-based rate limiting addresses financial governance by setting monetary consumption limits across all models available to a specific role. Unlike token limits, which vary in financial impact depending on the model used, cost limits directly control actual spending.

How it works

The system calculates the cost of each model interaction in real time and tracks currency-based consumption across the specified time intervals. Because this calculation accounts for the different pricing of each model, organizations can apply consistent budget controls regardless of which models are used.
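The idea of summing spend across models against one role budget can be sketched as follows. All prices, the other-model name, and the helper functions are hypothetical and chosen only for illustration; actual pricing comes from your DIAL configuration, and this is not DIAL Core's implementation.

```python
from decimal import Decimal
from typing import Optional

# Hypothetical prices in USD per 1000 tokens.
PRICING = {
    "chat-gpt-35-turbo": {"prompt": Decimal("0.0005"), "completion": Decimal("0.0015")},
    "other-model": {"prompt": Decimal("0.01"), "completion": Decimal("0.03")},
}


def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Cost of one interaction, based on the model's own pricing."""
    price = PRICING[model]
    total = prompt_tokens * price["prompt"] + completion_tokens * price["completion"]
    return total / 1000


def within_budget(spent_usd: Decimal, limit_usd: Optional[Decimal]) -> bool:
    """A limit of None means no cost limit is set for this interval."""
    return limit_usd is None or spent_usd < limit_usd


# The same token counts cost different amounts on different models.
usage = [("chat-gpt-35-turbo", 1000, 1000), ("other-model", 1000, 1000)]
spent = sum((request_cost(m, p, c) for m, p, c in usage), Decimal("0"))

minute_limit = Decimal("0.069")  # one budget for the role, across all models
ok_now = within_budget(spent, minute_limit)
ok_after_more = within_budget(spent + request_cost("other-model", 1000, 1000), minute_limit)
```

Decimal is used instead of float so that small currency amounts add up exactly.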

DIAL Core configuration

Refer to DIAL Core documentation to see configuration guidelines for cost limits.

Cost limits can be defined in the roles.<role_name>.costLimit section of the DIAL Core dynamic settings. The following values are available:

  • minute: Total cost limit per minute, in USD, applied to a specific role across all models. Default: refer to the default logic description.
  • day: Total cost limit per day, in USD, applied to a specific role across all models. Default: refer to the default logic description.
  • week: Total cost limit per week, in USD, applied to a specific role across all models. Default: refer to the default logic description.
  • month: Total cost limit per month, in USD, applied to a specific role across all models. Default: refer to the default logic description.

In this example, we define cost limits for the myRole role. The limits are set directly in the costLimit section and apply across all models available to the role.

"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "costLimit": {
      "minute": 0.069,
      "day": 100.00,
      "week": 500.00,
      "month": 2000.00
    }
  }
}

Sharing Limits

In DIAL, you can share your private resources with other users or applications. In DIAL Core configuration, you can define sharing limits for roles to restrict the maximum number of users who can accept an invitation link for a resource (APPLICATION or FILE) being shared.

  • Refer to DIAL Core documentation to see configuration guidelines for sharing limits.
  • Refer to Sharing to learn more about the sharing feature.
  • Refer to Access Control to learn more about access control and Public and Private resources in DIAL.

How it works

When a resource is shared, an invitation link is generated. The invitation_ttl parameter sets the duration (in hours) for which this link remains valid. After this period, the link expires and can no longer be used to access the shared resource.

The max_accepted_users parameter limits the number of unique users who can accept the invitation link and gain access to the shared resource. Once this limit is reached, no additional users can accept the invitation, even if the link is still valid.
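The two checks described above can be sketched in Python. This is an illustration, not DIAL Core's implementation: the Invitation class and its field names other than the two parameters are hypothetical, and it assumes that a user who has already accepted the invitation does not count against the limit a second time.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Invitation:
    created_at: float                        # seconds since some epoch
    invitation_ttl: float = 72               # hours
    max_accepted_users: Optional[int] = 10   # None means unlimited
    accepted: set = field(default_factory=set)

    def accept(self, user_id: str, now: float) -> bool:
        # The link expires after invitation_ttl hours.
        if now >= self.created_at + self.invitation_ttl * 3600:
            return False
        # Assumption: repeat acceptance by the same user is a no-op.
        if user_id in self.accepted:
            return True
        # Only a limited number of unique users may accept.
        if (self.max_accepted_users is not None
                and len(self.accepted) >= self.max_accepted_users):
            return False
        self.accepted.add(user_id)
        return True


invitation = Invitation(created_at=0, invitation_ttl=24, max_accepted_users=2)
outcomes = [
    invitation.accept("alice", now=10),
    invitation.accept("bob", now=20),
    invitation.accept("alice", now=30),   # already accepted
    invitation.accept("carol", now=40),   # limit of 2 unique users reached
]
expired = Invitation(created_at=0, invitation_ttl=24).accept("dave", now=24 * 3600)
```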

DIAL Core configuration

Refer to DIAL Core documentation to see configuration guidelines for sharing limits.

Sharing limits can be defined in the roles.<role_name>.share section of the DIAL Core dynamic settings. The following values are available:

  • invitation_ttl: TTL of the invitation link. Default: 72 (hrs)
  • max_accepted_users: The maximum number of users who can accept an invitation link for a resource being shared. The limit is applied to the shared resource. Default: 10 for APPLICATION and UNLIMITED for other resource types.

In this example, we define sharing limits for the myRole role for the APPLICATION and FILE resource types.

"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": { // the name of the role
    "share": {
      "APPLICATION": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      },
      "FILE": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      }
    }
  }
}

Full Configuration Example

// Example extract from aidial.config.json
"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "limits": {
      "chat-gpt-35-turbo": {
        "requestHour": "100000",
        "requestDay": "10000000",
        "minute": "100000", // number of tokens per minute
        "day": "10000000",
        "week": "10000000",
        "month": "10000000"
      }
    },
    "costLimit": {
      "minute": 10.00, // cost per minute in USD
      "day": 100.00,
      "week": 500.00,
      "month": 2000.00
    },
    "share": {
      "APPLICATION": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      },
      "FILE": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      }
    }
  }
}