Where to select the model
Open the Model tab
Click the Model tab in the agent settings. The configuration dialog displays three columns: providers, available models, and behavior parameters.
Select the provider
In the left column, choose the provider: OpenAI, Anthropic, Google, or OpenRouter. The center column updates with the available models for the selected provider.
How providers work
Timely.ai offers two modes of model access: using the platform’s credits (access via Timely) or connecting your own API key from an external provider.
Option 1 — Using Timely credits
Each model consumes a fixed number of credits per execution. You do not need to create accounts with providers — Timely manages authentication and billing.
Select any available model
All models listed in the interface are immediately available if you have credits in the workspace.
Check the credit cost
Each model displays the estimated cost per execution. Actual consumption varies based on prompt size, conversation history, and the number of tool calls in the same execution.
Monitor consumption
Track credit consumption in Settings > Billing to adjust the model as conversation volume grows.
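To see how per-execution credit costs add up at scale, here is a minimal sketch. The 4-credit figure for Claude Sonnet 4.6 comes from this document; the other credit values, model identifiers, and volumes are assumed example numbers, not official Timely pricing.

```python
# Hypothetical illustration of credit consumption per agent.
# Only the Sonnet figure (4 credits/execution) is stated in the docs;
# the rest are assumed example values.
MODEL_CREDITS = {
    "claude-sonnet-4.6": 4,   # recommended default (per the docs)
    "gemini-2.5-flash": 1,    # assumed cheaper tier
    "claude-opus-4.6": 15,    # assumed premium tier
}

def monthly_credits(model: str, executions_per_day: int, days: int = 30) -> int:
    """Estimate monthly credit spend for one agent."""
    return MODEL_CREDITS[model] * executions_per_day * days

# A high-volume FAQ agent vs. a low-volume complex-analysis agent:
print(monthly_credits("gemini-2.5-flash", executions_per_day=500))  # 15000
print(monthly_credits("claude-opus-4.6", executions_per_day=20))    # 9000
```

Note that this is only a baseline: as the document says, actual consumption also varies with prompt size, conversation history, and tool calls, so compare the estimate against Settings > Billing regularly.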
Option 2 — Connecting your own provider
Workspaces on the Enterprise plan can configure custom providers, connecting models with their own API keys or open-source models hosted on internal infrastructure.
Add the provider credentials
Enter the provider’s API key (OpenAI, Anthropic, or Google). The key is stored with encryption and used exclusively for your workspace’s executions.
Select the custom provider in the agent
When configuring the agent model, the custom provider appears as an additional option in the providers column.
Custom provider configuration requires an Enterprise plan. Contact support to enable this feature in your workspace.
Switching model per agent
Each agent can use a different model within the same workspace. This allows you to optimize cost and performance per use case:
- High-volume FAQ agents use cheaper models (Kimi K2.5 or Gemini 2.5 Flash)
- Complex analysis agents use high-capacity models (Claude Opus 4.6 or Gemini 3.1 Pro)
- General support agents use balanced models (Claude Sonnet 4.6, marked as recommended)
Two provider-specific caveats affect the temperature setting:
- OpenAI GPT-5.x models disable temperature — the slider becomes inactive automatically
- Anthropic models with active reasoning force temperature to 1.0 — the temperature control has no effect
Best practices
- Always start with Claude Sonnet 4.6 — it is the platform’s recommended model for most use cases, balancing quality, cost (4 credits/execution), speed, and tool support
- Test any model change in the Internal chat with representative questions before applying to production
- Monitor latency after switching to larger models — models such as Claude Opus 4.6 and Gemini 3.1 Pro have higher latency that may affect the experience on messaging channels
- Do not switch models based on cost alone — compare response quality in the real scenarios of your use case before migrating