Per-Request Keys
Per-request keys are used to manage access to user files for applications, enable open telemetry for tracing and realize cost control in a lifespan of a particular request. They also play a key role for external applications accessing language models and applications deployed in AI DIAL.
Per-request keys are generated by AI DIAL Core, when it is making a request to the application and is valid only during the lifetime of this particular request from the Core to the application.
How It Works
- Let's consider that AI DIAL Core can receive an initial request from AI DIAL Chat or any of its clients. Such request we will call root requests. Example:
AI DIAL Chat --> AI DIAL Core
- AI DIAL Core can perform a series of requests to realize the initial request - let's call them non-root requests. Example:
AI DIAL Chat --> AI DIAL Core --> Application A --> AI DIAL Core --> Application B...
Root Request:
- AI DIAL Core checks that the caller (chat user or application) has enough permissions and limits based on the configuration to perform the request and generates a new per-request key specifically for this request.
- AI DIAL Core stores an additional data associated with the per-request key either in memory or in Redis. The data may include but is not limited to: JWT of the user or API key of the application,
trace-id
,core-parent-span-id
, list of the user files attached to the request. - AI DIAL Core passes the per-request key to the application or model adapter in the API-key header. The application or model adapter then send the per-request key back to AI DIAL Core using the same API-key header for model requests or file storage API requests.
- AI DIAL Core invalidates the per-request key when the request is completed.
Non-root Requests:
- For all non-root requests, AI DIAL Core checks, that the incoming per-request key is valid and known (the one from the root request).
- It checks the permissions of the caller and then generates and saves a new outgoing per-request key. This key is also associated with specific data: all data from the root request (its parent request) and the necessary data for the current request.
- AI DIAL Core passes this new outgoing per-request key in the request header. The application or model adapter then send the per-request key back to AI DIAL Core using the same API-key header for model requests or file storage API requests.
- AI DIAL Core invalidates the outgoing per-request key when the request is completed.
Calculating Costs & Limits
Per-request keys can be used to attribute costs and limits incurred by a request to a model to the user or application that initiated the request.
To attribute costs and limits to the request originator, traceparent
should be included in the request header. trace-id
is saved throughout the lifespan of the request in memory or in Redis.
Gathering Statistics
For calculating statistics, AI DIAL Core uses trace-id
and core-parent-span-id
that are stored in the additional data linked to the per-request key.
Files Sharing
Initially, AI DIAL Core verifies the permissions and limits of the request originator using the JWT or API Key associated with the initial request. It then generates a per-request key and links specific user files to it. Throughout the duration of the per-request key and across the entire call stack, AI DIAL Core uses this associated data to determine access rights for the application. If the application attempts to share a file which is not in the list of files associated with the per-request key or directly accessible to this application, the request results in 403 Forbidden error.
To share the output files with the user, AI DIAL Core grants a full access to the authorized application to a specific output folder in the user's bucket.
To provide the path of the folder for output files, we add appdata
field to the response for the GET /v1/bucket. If this request is made with a per-request key the response will contain appdata
with the path to the shared folder, including the user's bucket and the correct deployment-id
for the application as registered in the AI DIAL Core configs.
{
"bucket": "{application-bucket-id}",
"appdata": "{user-bucket-id}/appdata/{deployment-id}",
}
Telemetry Tracing
For tracing open telemetry, traceparent
should be included in the request header. The open telemetry tracing does not interfere with the limits, statistics or file sharing.
Access and Cost Control for External Applications
Applications in AI DIAL can use routes
for communication through registered in AI DIAL Core endpoints, which may not necessarily adhere to the AI DIAL API. Routes, therefore, act as a bridging mechanism between the AI DIAL Core and external applications, facilitating seamless interactions.
Once a route with a designated endpoint is set up in AI DIAL Core, it allows client applications, such as AI DIAL Chat for example, to interact with this endpoint. Essentially, AI DIAL Core functions as an intermediary, handling authentication and authorization between the client and the external application linked to the route.
External applications, do not have direct access to the resources within AI DIAL. Still, they might need to retrieve user data or interact with other conversational agents and language models available in AI DIAL to perform their functions.
Per-request keys are issued for routes to enable:
- Access to language models and applications
- A dedicated workspace within a BLOB store for routes, allowing them to read and write files under
/Keys/<route_name>/
. - The ability to fetch user information via the
/v1/user/info
endpoint.
To manage access and control costs for external applications behind routes, it is possible to assign specific roles for routes.
In the following example, a route myApp
has a user role app_user
assigned to it. This means, that a user with app_user
role can access myApp
route within the defined limits requestsPerMin
.
{
"routes": {
"myApp": {
"userRoles": ["app_user"] // user must have app_user role in order to access the route
}
},
"roles": {
"app_user": {
"limits": {
"myApp": {
"requestsPerMin": "1000", // user with the app_user role can call up to 1000 requests per min for the route myApp
}
}
}
}
}
Refer to AI DIAL Core config to see the full example.
Example
For instance, a user of AI DIAL Chat may request an external RAG (Retrieval-Augmented Generation) application to generate a response to a prompt based on an attached file. In this scenario, AI DIAL Chat uses a designated route to interact with the external RAG application. A per-request key is specifically generated for this interaction to ensure secure and authorized communication.
Note: Access to models deployed in AI DIAL and request limits are determined by the roles set up in AI DIAL Core, which are assigned to both the route and the user.
The external RAG application, upon receiving the request, uses the per-request key to call a language model within AI DIAL. This model processes the attached file and generates the necessary response. Subsequently, the response is relayed back to the user in AI DIAL Chat, completing the interaction loop.