Access Control for End Users
Introduction
End-users authenticate in DIAL using JWT access tokens. In this document, you can learn how to define access to DIAL Core resources and usage limits for JWT.
Step 1: Get JWT from IDP
When users log in, DIAL uses JWT to authenticate them. JWT tokens that represent users are issued by identity service providers (IDPs).
Refer to Configure IDPs to view supported IDPs and learn how to configure them to work with DIAL for user authentication and authorization.
Step 2: Enable Access to Resources
Access Control Overview
Access control in DIAL rests on three concepts: Objects of access (what is protected), Subjects of access (who is given access), and Actions (what kind of access is given). Objects are entities such as Models, Applications, Toolsets, Files, Prompts, and Conversations. Subjects are actors who access Objects to create, update, delete, read, or use them.
End-users are Subjects that authorize in DIAL Core using JWT access tokens to access DIAL Core resources.
- Refer to Authentication to learn more about authentication in DIAL.
- Refer to Access Control to learn more about access control in DIAL.
Enable Access
By default, all authenticated users have access to public resources that are not restricted by roles.
Refer to Access Control to learn more about private and public spaces.
To give users access to additional resources in DIAL (such as applications, toolsets, routes, or language models), associate the JWT with the resource. This is done via claims that identity service providers include in the JWT.
To associate a JWT with a specific resource, use a claim value from the JWT as a value of the userRoles parameter of the corresponding deployment in the DIAL Core configuration.
In the following example, a JWT with the claim value myRole has access to the chat-gpt-35-turbo language model. Using the same pattern, you can define user access to applications, toolsets, and routes.
"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
}
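The role check described above can be sketched in Python. This is only an illustration of the matching rule, not DIAL Core's implementation: the has_access helper is hypothetical, and the sketch assumes the role values have already been extracted from the JWT claim configured for your IDP.

```python
from typing import List


def has_access(deployment: dict, jwt_roles: List[str]) -> bool:
    """Return True if any role from the JWT is listed in the deployment's userRoles."""
    allowed = deployment.get("userRoles")
    if not allowed:
        # No userRoles restriction: treated here as a public deployment
        # that every authenticated user may access.
        return True
    return any(role in allowed for role in jwt_roles)


config = {"models": {"chat-gpt-35-turbo": {"userRoles": ["myRole"]}}}
model = config["models"]["chat-gpt-35-turbo"]

print(has_access(model, ["myRole"]))     # True
print(has_access(model, ["otherRole"]))  # False
```

A user whose JWT carries several role values gets access as soon as one of them matches.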
Step 3: Add Limits
You can define limits on the number of requests to models, the number of allowed tokens, cost, and sharing for a JWT. This is done in the roles section of the DIAL Core configuration.
In the roles section, you can define roles that match the claim values provided in the JWT and set limits for these roles.
- Refer to Roles to learn more about them.
- Refer to DIAL Core documentation to see configuration guidelines for roles.
How it works
If limits are not defined for a specific role, the limits of the default role apply. If limits are not defined for the default role either, usage is unlimited.
All limits operate in parallel, with requests needing to satisfy all applicable limits. When a request is rejected, users receive notifications explaining which specific constraint was triggered.
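The fallback order can be sketched as follows. This is a simplified illustration, not DIAL Core code: the resolve_limit helper is hypothetical, None stands for "unlimited", and the sketch assumes the fallback to the default role is evaluated per limit value, which the description above does not spell out.

```python
from typing import Optional

UNLIMITED = None  # this sketch represents "no limit" as None


def resolve_limit(roles: dict, role_name: str, deployment: str, key: str) -> Optional[int]:
    """Look up a limit for the role, then for the "default" role, else unlimited."""
    for name in (role_name, "default"):
        value = roles.get(name, {}).get("limits", {}).get(deployment, {}).get(key)
        if value is not None:
            return int(value)
    return UNLIMITED


roles = {
    "myRole": {"limits": {"chat-gpt-35-turbo": {"requestHour": "100"}}},
    "default": {"limits": {"chat-gpt-35-turbo": {"requestHour": "10", "requestDay": "50"}}},
}

print(resolve_limit(roles, "myRole", "chat-gpt-35-turbo", "requestHour"))  # 100
print(resolve_limit(roles, "myRole", "chat-gpt-35-turbo", "requestDay"))   # 50
print(resolve_limit(roles, "myRole", "chat-gpt-35-turbo", "minute"))       # None
```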
Request Limit
A role can be configured with a limit on the number of requests to a specific model in a given timeframe.
How it works
A request limit controls the number of requests that can be sent to a specific resource within a defined timeframe. The system tracks the number of requests made, enforcing limits across hourly or daily intervals. When a limit is reached, additional requests are rejected until the time window refreshes.
Lower-level settings (requestHour) are capped by higher-level settings (requestDay). In other words, you can have unlimited calls per hour (requestHour is NULL) until you hit the daily limit.
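The interaction between the hourly and daily limits can be sketched like this. The request_allowed helper and its counters are hypothetical; the sketch assumes the number of requests already made in the current hour and day is tracked elsewhere, and it uses None for an unset (NULL) limit.

```python
from typing import Optional, Tuple


def request_allowed(used_hour: int, used_day: int, limits: dict) -> Tuple[bool, Optional[str]]:
    """Check a new request against both limits; return the name of the limit that was hit."""
    checks = (("requestHour", used_hour), ("requestDay", used_day))
    for key, used in checks:
        limit = limits.get(key)
        # A missing or None limit means "unlimited" for that interval.
        if limit is not None and used >= int(limit):
            return False, key
    return True, None


# Hourly limit is unset, so only the daily limit can reject a request.
limits = {"requestHour": None, "requestDay": "3"}
print(request_allowed(2, 2, limits))  # (True, None)
print(request_allowed(3, 3, limits))  # (False, 'requestDay')
```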
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for token and request limits.
Request limits can be defined in the roles.<role_name>.limits section of the DIAL Core dynamic settings. The following values are available:
- requestHour: Total requests per hour that can be sent to a specific resource. Default: refer to the default logic description.
- requestDay: Total requests per day that can be sent to a specific resource. Default: refer to the default logic description.
In this example, we define request limits for the myRole role for the chat-gpt-35-turbo model.
"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "limits": {
      "chat-gpt-35-turbo": {
        "requestHour": "100000",
        "requestDay": "10000000"
      }
    }
  }
}
Token Limit
Token rate limiting controls the volume of tokens processed by AI models within specific timeframes.
How it works
A token limit controls the number of tokens that can be sent to a specific resource within a defined timeframe. The system tracks the number of tokens used, enforcing limits across minute, daily, weekly, or monthly intervals. When a limit is reached, additional requests are rejected until the time window refreshes.
Lower-level settings (e.g., minute) are capped by higher-level settings (e.g., day, week, month). In other words, you can have unlimited tokens per minute (minute is NULL) until you hit the daily limit.
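A floating (sliding) window over several intervals can be sketched as below. This is a generic illustration of the technique under stated assumptions, not DIAL Core's implementation: the TokenWindow class and both helper functions are hypothetical, timestamps are plain seconds, None means an unset limit, and the check is assumed to happen before a request is processed, using tokens recorded from earlier requests.

```python
from collections import deque


class TokenWindow:
    """Sliding-window token counter for one interval (hypothetical helper)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit            # None means unlimited
        self.window = window_seconds
        self.events = deque()         # (timestamp, tokens) pairs inside the window
        self.total = 0

    def _evict(self, now):
        # Drop usage records that have slid out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, tokens = self.events.popleft()
            self.total -= tokens

    def allows(self, now):
        self._evict(now)
        return self.limit is None or self.total < self.limit

    def record(self, now, tokens):
        self._evict(now)
        self.events.append((now, tokens))
        self.total += tokens


def tokens_allowed(windows, now):
    # A request must satisfy every configured interval.
    return all(w.allows(now) for w in windows.values())


def record_usage(windows, now, tokens):
    for w in windows.values():
        w.record(now, tokens)


# minute is unset (unlimited), so only the daily limit of 1000 tokens applies.
windows = {"minute": TokenWindow(None, 60), "day": TokenWindow(1000, 86400)}
record_usage(windows, 0, 600)
print(tokens_allowed(windows, 1))      # True
record_usage(windows, 10, 500)
print(tokens_allowed(windows, 20))     # False
print(tokens_allowed(windows, 86401))  # True
```

The last call succeeds because the 600 tokens recorded at second 0 have left the daily window, which brings the tracked total back under the limit.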
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for token and request limits.
Token limits can be defined in the roles.<role_name>.limits section of the DIAL Core dynamic settings. The following values are available:
- minute: Total tokens per minute that can be sent to a specific resource, managed via a floating window approach for well-distributed rate limiting. Default: refer to the default logic description.
- day: Total tokens per day that can be sent to a specific resource, managed via a floating window approach for balanced rate limiting. Default: refer to the default logic description.
- week: Total tokens per week that can be sent to a specific resource, managed via a floating window approach for balanced rate limiting. Default: refer to the default logic description.
- month: Total tokens per month that can be sent to a specific resource, managed via a floating window approach for balanced rate limiting. Default: refer to the default logic description.
In this example, we define token limits for the myRole role for the chat-gpt-35-turbo model.
"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "limits": {
      "chat-gpt-35-turbo": {
        "minute": "10000",
        "day": "10000000",
        "week": "50000000",
        "month": "200000000"
      }
    }
  }
}
Cost Limit
Cost-based rate limiting addresses financial governance by setting monetary consumption limits across all models available to a specific role. Unlike token limits, which vary in financial impact depending on the model used, cost limits directly control actual spending.
How it works
The system calculates the financial impact of each model interaction in real time and tracks currency-based consumption across the specified time intervals. This automatically accounts for the varying pricing of different models, so organizations can apply consistent budget controls regardless of which models are used.
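To illustrate why a cost limit behaves differently from a token limit, here is a minimal sketch that converts token usage into USD and compares the role's total spend with a limit. The prices, the second model name, and both helper functions are hypothetical and serve only as an example; how DIAL Core obtains pricing is not covered here.

```python
# Hypothetical prices in USD per 1,000 tokens.
PRICING = {
    "chat-gpt-35-turbo": {"prompt": 0.0005, "completion": 0.0015},
    "another-model": {"prompt": 0.01, "completion": 0.03},
}


def request_cost(model, prompt_tokens, completion_tokens):
    """Cost of a single request in USD, based on the model's token prices."""
    price = PRICING[model]
    return (prompt_tokens * price["prompt"] + completion_tokens * price["completion"]) / 1000


def within_cost_limit(spent, cost_limit, interval):
    """True if the role's spend in the interval is still below its limit (None = unlimited)."""
    limit = cost_limit.get(interval)
    return limit is None or spent < limit


# Spend is summed across all models used by the role.
spent = request_cost("chat-gpt-35-turbo", 2000, 1000) + request_cost("another-model", 1000, 1000)
print(within_cost_limit(spent, {"minute": 0.069}, "minute"))  # True
print(within_cost_limit(0.07, {"minute": 0.069}, "minute"))   # False
```

The same number of tokens costs far more on the second model, which is why a single cost limit per role gives tighter budget control than per-model token limits.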
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for cost limits.
Cost limits can be defined in the roles.<role_name>.costLimit section of the DIAL Core dynamic settings. The following values are available:
- minute: Total cost limit per minute, in USD, applied to a specific role across all models. Default: refer to the default logic description.
- day: Total cost limit per day, in USD, applied to a specific role across all models. Default: refer to the default logic description.
- week: Total cost limit per week, in USD, applied to a specific role across all models. Default: refer to the default logic description.
- month: Total cost limit per month, in USD, applied to a specific role across all models. Default: refer to the default logic description.
In this example, we define cost limits for the myRole role. The limits apply across all models available to the role, including chat-gpt-35-turbo.
"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "costLimit": {
      "minute": 0.069,
      "day": 100.00,
      "week": 500.00,
      "month": 2000.00
    }
  }
}
Sharing Limits
In DIAL, you can share your private resources with other users or applications. In DIAL Core configuration, you can define sharing limits for roles to restrict the maximum number of users who can accept an invitation link for a resource (APPLICATION or FILE) being shared.
- Refer to DIAL Core documentation to see configuration guidelines for sharing limits.
- Refer to Sharing to learn more about the sharing feature.
- Refer to Access Control to learn more about access control, Public and Private resources in DIAL.
How it works
When a resource is shared, an invitation link is generated. The invitation_ttl parameter sets the duration (in hours) for which this link remains valid. After this period, the link expires and can no longer be used to access the shared resource.
The max_accepted_users parameter limits the number of unique users who can accept the invitation link and gain access to the shared resource. Once this limit is reached, no additional users can accept the invitation, even if the link is still valid.
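The two sharing rules can be sketched together as follows. The Invitation class is a hypothetical model written for illustration, not part of DIAL; it assumes that a user who has already accepted the link does not count against the limit a second time.

```python
from datetime import datetime, timedelta, timezone


class Invitation:
    """Invitation link with a TTL in hours and a cap on unique accepting users."""

    def __init__(self, created_at, invitation_ttl_hours, max_accepted_users=None):
        self.expires_at = created_at + timedelta(hours=invitation_ttl_hours)
        self.max_accepted_users = max_accepted_users  # None means unlimited
        self.accepted = set()

    def accept(self, user_id, now):
        if now >= self.expires_at:
            return False  # the link has expired
        if user_id in self.accepted:
            return True   # already accepted; not counted twice
        if self.max_accepted_users is not None and len(self.accepted) >= self.max_accepted_users:
            return False  # the cap on unique users is reached
        self.accepted.add(user_id)
        return True


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
invitation = Invitation(created, invitation_ttl_hours=24, max_accepted_users=2)

print(invitation.accept("alice", created + timedelta(hours=1)))   # True
print(invitation.accept("bob", created + timedelta(hours=2)))     # True
print(invitation.accept("carol", created + timedelta(hours=3)))   # False
print(invitation.accept("alice", created + timedelta(hours=25)))  # False
```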
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for sharing limits.
Sharing limits can be defined in the roles.<role_name>.share section of the DIAL Core dynamic settings. The following values are available:
- invitation_ttl: TTL of the invitation link, in hours. Default: 72.
- max_accepted_users: The maximum number of users who can accept an invitation link for a resource being shared. The limit is applied to the shared resource. Default: 10 for APPLICATION and UNLIMITED for other resource types.
"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": { // the name of the role
    "share": {
      "APPLICATION": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      },
      "FILE": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      }
    }
  }
}
Full Configuration Example
// Example extract from aidial.config.json
"models": {
  "chat-gpt-35-turbo": {
    "userRoles": [
      "myRole"
    ]
  }
},
"roles": {
  "myRole": {
    "limits": {
      "chat-gpt-35-turbo": {
        "requestHour": "100000",
        "requestDay": "10000000",
        "minute": "100000", // number of tokens per minute
        "day": "10000000",
        "week": "10000000",
        "month": "10000000"
      }
    },
    "costLimit": {
      "minute": 10.00, // cost per minute in USD
      "day": 100.00,
      "week": 500.00,
      "month": 2000.00
    },
    "share": {
      "APPLICATION": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      },
      "FILE": {
        "invitation_ttl": "24",
        "max_accepted_users": "10"
      }
    }
  }
}