Access Control for API Keys
Introduction
API Keys can be used by external applications to access DIAL Core resources such as models, applications, toolsets and routes. DIAL Core uses them for server-to-server authentication and access control. In this document, you can learn how to define API keys, give access to resources in DIAL and define access limits.
Step 1: Define API Keys
Before you can use API keys, you need to define them. API keys can be defined in the DIAL Core dynamic settings or by DIAL administrators in DIAL Admin.
DIAL Core configuration
API keys can be defined in the keys.<core_key> section in the DIAL Core configuration file.
Refer to the DIAL Core documentation for a description of the API key configuration parameters.
Requirements:
- An API key should be a secure random key of at least 128 bits.
- An API key must be associated with a project and a role; otherwise, the key is invalid. Refer to Roles to learn more about them.
In the following example, the "myApiKey" API key is created for the "MyProject" project with the "myRole" role:
//Example extract from aidial.config.json
"keys": {
"myApiKey": {
"project": "MyProject",
"role": "myRole"
}
}
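As an illustration of the requirements above, the following Python sketch generates a random key with at least 128 bits of entropy and builds a keys fragment in the same shape as the example. The helper names generate_api_key and build_keys_section are hypothetical and are not part of DIAL; only the keys, project, and role names come from the configuration shown above.

```python
import json
import secrets


def generate_api_key(num_bytes=16):
    """Return a URL-safe random key with num_bytes * 8 bits of entropy."""
    if num_bytes < 16:
        raise ValueError("use at least 16 bytes (128 bits)")
    return secrets.token_urlsafe(num_bytes)


def build_keys_section(api_key, project, role):
    """Build a "keys" fragment; both project and role are required."""
    if not project or not role:
        raise ValueError("an API key needs both a project and a role")
    return {"keys": {api_key: {"project": project, "role": role}}}


if __name__ == "__main__":
    # Print a fragment that could be pasted into the configuration file.
    key = generate_api_key()
    print(json.dumps(build_keys_section(key, "MyProject", "myRole"), indent=2))
```

The secrets module is used rather than random because it draws from the operating system's cryptographically secure source.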
Step 2: Define Access (Optional)
Access Control Overview
Access control in DIAL rests on three concepts: Objects of access (what we protect), Subjects of access (who we give access to), and Actions (what kind of access is given). Objects are entities such as Models, Applications, Toolsets, Files, Prompts, and Conversations. Subjects are actors who access Objects in order to create, update, delete, read, or use them.
API Keys are Subjects used by external applications to access DIAL Core resources. DIAL Core uses them for server-to-server authentication and access control.
- Refer to Authentication to learn more about authentication in DIAL.
- Refer to Access Control to learn more about access control in DIAL.
The configuration in the previous step gives the API key access to its own private space and to resources in the public space that are either not limited by roles or available to the "myRole" role.
DIAL Core configuration
To provide access to additional resources in DIAL Core, you need to associate the role assigned to the API key with specific resources. You can do this by adding the API key role to the userRoles parameter of a corresponding deployment in DIAL Core configuration.
In the following example, the "myRole" role is granted access to the chat-gpt-35-turbo language model. Using the same pattern, you can define access to applications, toolsets, and routes.
//Example extract from aidial.config.json
"keys": {
"myApiKey": {
"project": "MyProject",
"role": "myRole"
}
},
"models": {
"chat-gpt-35-turbo": {
"userRoles": [
"myRole"
]
}
}
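The relationship between a key, its role, and the userRoles of a deployment can be sketched in Python as follows. This is an illustrative model of the rule described in this section, not DIAL Core's actual implementation. The function can_use_model and the "open-model" deployment are hypothetical, and treating a missing userRoles list as "not limited by roles" is an assumption.

```python
CONFIG = {
    "keys": {"myApiKey": {"project": "MyProject", "role": "myRole"}},
    "models": {
        "chat-gpt-35-turbo": {"userRoles": ["myRole"]},
        "open-model": {},  # hypothetical deployment without role restrictions
    },
}


def can_use_model(config, api_key, model):
    """Return True if the role attached to api_key may use the model."""
    key = config.get("keys", {}).get(api_key)
    # A key without both a project and a role is invalid.
    if key is None or not key.get("project") or not key.get("role"):
        return False
    deployment = config.get("models", {}).get(model)
    if deployment is None:
        return False
    allowed_roles = deployment.get("userRoles")
    if not allowed_roles:
        # ASSUMPTION: no userRoles means the deployment is not limited by roles.
        return True
    return key["role"] in allowed_roles
```

For example, can_use_model(CONFIG, "myApiKey", "chat-gpt-35-turbo") evaluates to True, while an unknown key is always rejected.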
Step 3: Define Limits (Optional)
You can define limits on the number of requests to models, the number of allowed tokens, cost, and sharing for API keys. This is done in the roles section of the DIAL Core configuration, where limits are defined for the specific roles assigned to API keys.
- Refer to Roles to learn more about them.
- Refer to DIAL Core documentation to see configuration guidelines for roles.
How it works
If limits are not defined for a specific role, the limits of the default role apply. If limits are not defined for the default role either, usage is unlimited.
All limits operate in parallel, with requests needing to satisfy all applicable limits. When a request is rejected, users receive notifications explaining which specific constraint was triggered.
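The fallback order described above can be sketched in Python as follows. This is one possible reading of the rule, not DIAL Core's implementation: it assumes the default role is stored under the name "default" and that the fallback happens for the whole limits block of a deployment rather than field by field. The function names are hypothetical.

```python
import math


def effective_limits(roles, role, deployment, default_role="default"):
    """Return the limits block for the role, else for the default role, else {}."""
    for name in (role, default_role):
        limits = roles.get(name, {}).get("limits", {}).get(deployment)
        if limits is not None:
            return limits
    return {}


def limit_value(limits, field):
    """Return the numeric limit, or infinity when the field is not set."""
    raw = limits.get(field)
    return math.inf if raw is None else float(raw)
```

With this reading, a role that defines only a minute limit for a deployment has no daily limit, even if the default role defines one.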
Request Limit
A role can be configured with a limit on the number of requests to a specific model within a given timeframe.
How it works
A request limit controls the number of requests that can be sent to a specific resource within a defined timeframe. The system tracks the number of requests made, enforcing limits across hourly or daily intervals. When a limit is reached, additional requests are rejected until the time window refreshes.
Lower-level settings (requestHour) are bounded by higher-level settings (requestDay). In other words, you can have unlimited calls per hour (requestHour is NULL) until you hit the daily limit.
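The interaction between the hourly and daily limits can be sketched in Python as follows. The function violated_request_limit is hypothetical, and the counters are assumed to be tracked elsewhere; None stands for a limit that is not set (NULL).

```python
def violated_request_limit(used_hour, used_day, request_hour=None, request_day=None):
    """Return the name of the limit that blocks the next request, or None.

    used_hour and used_day are the requests already counted in the current
    hourly and daily windows. A limit of None means "unlimited".
    """
    if request_hour is not None and used_hour >= request_hour:
        return "requestHour"
    if request_day is not None and used_day >= request_day:
        return "requestDay"
    return None
```

With requestHour unset, a request is rejected only once the daily counter reaches requestDay, which matches the behavior described above.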
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for token and requests limits.
Request limits can be defined in the roles.<role_name>.limits section of the DIAL Core dynamic settings. The following values are available:
- requestHour: Total requests per hour that can be sent to a specific resource. Default: refer to the default logic description.
- requestDay: Total requests per day that can be sent to a specific resource. Default: refer to the default logic description.
In this example, we define request limits for the myRole role for the chat-gpt-35-turbo model.
"keys": {
"myApiKey": {
"project": "MyProject",
"role": "myRole"
}
},
"models": {
"chat-gpt-35-turbo": {
"userRoles": [
"myRole"
]
}
},
"roles": {
"myRole": {
"limits": {
"chat-gpt-35-turbo": {
"requestHour": "100000",
"requestDay": "10000000"
}
}
}
}
Token Limit
Token rate limiting controls the volume of tokens processed by AI models within specific timeframes.
How it works
A token limit controls the number of tokens that can be sent to a specific resource within a defined timeframe. The system tracks the number of tokens used, enforcing limits across minute, daily, weekly, or monthly intervals. When a limit is reached, additional requests are rejected until the time window refreshes.
Lower-level settings (e.g., minute) are bounded by higher-level settings (e.g., day, week, month). In other words, you can have unlimited tokens per minute (minute is NULL) until you hit the daily limit.
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for token and requests limits.
Token limits can be defined in the roles.<role_name>.limits section of the DIAL Core dynamic settings. The following values are available:
- minute: Total tokens per minute that can be sent to a specific resource, managed via floating window approach for well-distributed rate limiting. Default: refer to the default logic description.
- day: Total tokens per day that can be sent to a specific resource, managed via floating window approach for balanced rate limiting. Default: refer to the default logic description.
- week: Total tokens per week that can be sent to a specific resource, managed via floating window approach for balanced rate limiting. Default: refer to the default logic description.
- month: Total tokens per month that can be sent to a specific resource, managed via floating window approach for balanced rate limiting. Default: refer to the default logic description.
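To make the floating window idea concrete, here is a minimal Python sketch of a sliding-window token counter. It is a generic illustration of the technique, not DIAL Core's implementation, which may bucket or account for usage differently. The class FloatingWindowLimiter and its methods are hypothetical names.

```python
from collections import deque


class FloatingWindowLimiter:
    """Allow at most `limit` tokens within any trailing `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # (timestamp, tokens), oldest first
        self.total = 0         # tokens currently inside the window

    def _evict(self, now):
        # Drop usage records that have fallen out of the trailing window.
        while self.events and self.events[0][0] <= now - self.window:
            _, tokens = self.events.popleft()
            self.total -= tokens

    def try_consume(self, tokens, now):
        """Record the usage and return True if it fits, else return False."""
        self._evict(now)
        if self.total + tokens > self.limit:
            return False
        self.events.append((now, tokens))
        self.total += tokens
        return True
```

Because old usage leaves the window gradually instead of all at once at a fixed boundary, capacity is released smoothly over time. One limiter instance per interval (minute, day, week, month) could be combined so that a request must pass all of them.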
In this example, we define token limits for the myRole role for the chat-gpt-35-turbo model.
"keys": {
"myApiKey": {
"project": "MyProject",
"role": "myRole"
}
},
"models": {
"chat-gpt-35-turbo": {
"userRoles": [
"myRole"
]
}
},
"roles": {
"myRole": {
"limits": {
"chat-gpt-35-turbo": {
"minute": "10000",
"day": "10000000",
"week": "50000000",
"month": "200000000"
}
}
}
}
Cost Limit
Cost-based rate limiting addresses financial governance by setting monetary consumption limits across all models available to a specific role. Unlike token limits, which vary in financial impact depending on the model used, cost limits directly control actual spending.
How it works
The system calculates the financial impact of each model interaction in real time and tracks currency-based consumption across the specified time intervals. This approach automatically accounts for the varying pricing of different models, allowing organizations to implement consistent budget controls regardless of which models are used.
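The following Python sketch shows why a cost limit behaves differently from a token limit: the same number of tokens produces a different cost depending on the model. The prices, the "larger-model" name, and both function names are hypothetical and chosen only for illustration; they do not reflect real pricing or DIAL Core internals.

```python
# Hypothetical prices in USD per 1,000 tokens.
PRICES_PER_1K_TOKENS = {
    "chat-gpt-35-turbo": {"prompt": 0.0015, "completion": 0.002},
    "larger-model": {"prompt": 0.03, "completion": 0.06},
}


def request_cost(model, prompt_tokens, completion_tokens):
    """Return the cost in USD of one request to the given model."""
    price = PRICES_PER_1K_TOKENS[model]
    return (prompt_tokens * price["prompt"]
            + completion_tokens * price["completion"]) / 1000


def within_cost_limit(spent, next_cost, limit):
    """Return True if adding next_cost keeps the role within its limit.

    `spent` is the role's accumulated cost across all models in the current
    interval. A limit of None means "unlimited".
    """
    return limit is None or spent + next_cost <= limit
```

Here, 1,000 prompt tokens and 1,000 completion tokens cost far less on the first model than on the second, yet both requests draw from the same per-role budget.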
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for cost limits.
Cost limits can be defined in the roles.<role_name>.costLimit section of the DIAL Core dynamic settings. The following values are available:
- minute: Total cost limits per minute in USD applied for a specific role across all models. Default: refer to the default logic description.
- day: Total cost limits per day in USD applied for a specific role across all models. Default: refer to the default logic description.
- week: Total cost limits per week in USD applied for a specific role across all models. Default: refer to the default logic description.
- month: Total cost limits per month in USD applied for a specific role across all models. Default: refer to the default logic description.
In this example, we define cost limits for the myRole role. The limits apply across all models available to the role, here chat-gpt-35-turbo.
"keys": {
"myApiKey": {
"project": "MyProject",
"role": "myRole"
}
},
"models": {
"chat-gpt-35-turbo": {
"userRoles": [
"myRole"
]
}
},
"roles": {
"myRole": {
"costLimit": {
"minute": 0.069,
"day": 100.00,
"week": 500.00,
"month": 2000.00
}
}
}
Sharing Limits
In DIAL, you can share your private resources with other users or applications. In DIAL Core configuration, you can define sharing limits for roles to restrict the maximum number of users who can accept a sharing link for a resource and the time-to-live (TTL) of the sharing invitation link.
- Refer to DIAL Core documentation to see configuration guidelines for sharing limits.
- Refer to Sharing to learn more about the sharing feature.
- Refer to Access Control to learn more about access control, Public and Private resources in DIAL.
How it works
When a resource is shared, an invitation link is generated. The invitation_ttl parameter sets the duration (in hours) for which this link remains valid. After this period, the link expires and can no longer be used to access the shared resource.
The max_accepted_users parameter limits the number of unique users who can accept the invitation link and gain access to the shared resource. Once this limit is reached, no additional users can accept the invitation, even if the link is still valid.
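The two parameters can be modeled with a short Python sketch. This is an illustration of the rules described above, not DIAL Core's implementation. The Invitation class is hypothetical, time is expressed in hours for simplicity, and letting a user who has already accepted accept again without consuming another slot is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Invitation:
    created_at_hours: float
    invitation_ttl: float = 72                 # link lifetime in hours
    max_accepted_users: Optional[int] = None   # None means unlimited
    accepted: set = field(default_factory=set)

    def accept(self, user, now_hours):
        """Return True if the user gains access through this link."""
        # An expired link can no longer be used by anyone.
        if now_hours - self.created_at_hours > self.invitation_ttl:
            return False
        # ASSUMPTION: a repeat acceptance does not count as a new user.
        if user in self.accepted:
            return True
        # The cap counts unique users, even while the link is still valid.
        if (self.max_accepted_users is not None
                and len(self.accepted) >= self.max_accepted_users):
            return False
        self.accepted.add(user)
        return True
```

With invitation_ttl set to 24 and max_accepted_users set to 2, a third distinct user is rejected within the first day, and every user is rejected after 24 hours.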
DIAL Core configuration
Refer to DIAL Core documentation to see configuration guidelines for sharing limits.
Sharing limits can be defined in the roles.<role_name>.share section of the DIAL Core dynamic settings. The following values are available:
- invitation_ttl: TTL of the invitation link. Default: 72 (hrs)
- max_accepted_users: The maximum number of users who can accept an invitation link for a resource being shared. The limit is applied to the shared resource. Default: 10 for APPLICATION and UNLIMITED for other resource types.
"keys": {
"myApiKey": {
"project": "MyProject",
"role": "myRole"
}
},
"models": {
"chat-gpt-35-turbo": {
"userRoles": [
"myRole"
]
}
},
"roles": {
"myRole": { // the name of the role
"share": {
"APPLICATION": {
"invitation_ttl": "24",
"max_accepted_users": "10"
},
"FILE": {
"invitation_ttl": "24",
"max_accepted_users": "10"
}
}
}
}
Full Configuration Example
//Example extract from aidial.config.json
"keys": {
"myApiKey": { //API key
"project": "MyProject",
"role": "myRole" // the name of the role
}
},
"models": {
"chat-gpt-35-turbo": {
"userRoles": [
"myRole"
]
}
},
"roles": {
"myRole": {
"limits": {
"chat-gpt-35-turbo": {
"requestHour": "100000",
"requestDay": "10000000",
"minute": "100000", //number of tokens per minute
"day": "10000000",
"week": "10000000",
"month": "10000000"
}
},
"costLimit": {
"minute": 10.00, //cost per minute in USD
"day": 100.00,
"week": 500.00,
"month": 2000.00
},
"share": {
"APPLICATION": {
"invitation_ttl": "24",
"max_accepted_users": "10"
},
"FILE": {
"invitation_ttl": "24",
"max_accepted_users": "10"
}
}
}
}
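Before deploying a configuration like the one above, a few of the rules from this document can be checked with a small script. The Python sketch below is a hypothetical helper, not a DIAL tool. It assumes the configuration has already been parsed into a dictionary (the // comments in the extract above would need to be removed first) and checks only two things: every key has a project and a role, and every value in a limits block is a non-negative integer.

```python
def validate_config(config):
    """Return a list of human-readable problems found in the config."""
    problems = []
    # Rule from Step 1: a key without a project or a role is invalid.
    for name, key in config.get("keys", {}).items():
        for required in ("project", "role"):
            if not key.get(required):
                problems.append(f"key {name}: missing {required}")
    # Request and token limits should be non-negative whole numbers.
    for role_name, role in config.get("roles", {}).items():
        for deployment, limits in role.get("limits", {}).items():
            for limit_name, raw in limits.items():
                try:
                    valid = int(raw) >= 0
                except (TypeError, ValueError):
                    valid = False
                if not valid:
                    problems.append(
                        f"role {role_name}: limits.{deployment}.{limit_name} "
                        "is not a non-negative integer"
                    )
    return problems
```

An empty list means that none of the checked rules were violated; it does not guarantee that DIAL Core will accept the configuration.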