AI functions are experimental. Set allow_experimental_ai_functions to enable them.

AI functions can return unpredictable outputs. The result depends heavily on the quality of the prompt and the model used.
- Quota enforcement: Per-query limits on tokens (ai_function_max_input_tokens_per_query, ai_function_max_output_tokens_per_query) and API calls (ai_function_max_api_calls_per_query).
- Retry with backoff: Transient failures are retried (ai_function_max_retries) with exponential backoff (ai_function_retry_initial_delay_ms).
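As a rough sketch, these limits can be adjusted for a session with ordinary SET statements. The setting names are the ones listed above; the numeric values are placeholders chosen for illustration, not recommended or default values.

```sql
-- Illustrative values only (assumptions); choose limits that fit your provider budget.
SET ai_function_max_input_tokens_per_query = 100000;
SET ai_function_max_output_tokens_per_query = 20000;
SET ai_function_max_api_calls_per_query = 200;

-- Retry transient failures up to 3 times, starting from a 500 ms delay
-- that grows exponentially between attempts.
SET ai_function_max_retries = 3;
SET ai_function_retry_initial_delay_ms = 500;
```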
Configuration
AI functions resolve provider credentials and configuration from a named collection. To choose which named collection supplies the credentials, use the ai_function_credentials setting.
Example statement to create a named collection with provider credentials:
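A minimal sketch of such a statement, assuming a collection named ai_creds (a hypothetical name), the public OpenAI chat completions URL as the endpoint, and a placeholder key. Whether the endpoint should be the full path or a base URL is an assumption here; check it against your provider.

```sql
-- ai_creds is a hypothetical collection name; replace the key placeholder.
CREATE NAMED COLLECTION ai_creds AS
    provider = 'openai',
    endpoint = 'https://api.openai.com/v1/chat/completions',  -- assumed full-path form
    model = 'gpt-4o-mini',
    api_key = '<your-api-key>',
    max_tokens = 1024;
```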
Select the collection with the ai_function_credentials setting, for the session or for a single query:
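Both forms might look like the following sketch, where ai_creds is a hypothetical collection name and the prompt is only an illustration.

```sql
-- Session level: applies to every later query in this session.
SET allow_experimental_ai_functions = 1;
SET ai_function_credentials = 'ai_creds';

-- Query level: applies to this statement only.
SELECT aiGenerate('Say hello in one short sentence.') AS reply
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_creds';
```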
If ai_function_credentials is empty (the default), an exception is raised.
Named collection parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | String | — | Model provider. Supported: 'openai', 'anthropic'. See note below. |
| endpoint | String | — | API endpoint URL. |
| model | String | — | Model name (e.g. 'gpt-4o-mini', 'text-embedding-3-small'). |
| api_key | String | — | Authentication key for the provider. Optional: when omitted, the auth header is not sent, which allows targeting OpenAI-compatible servers that do not require authentication. |
| max_tokens | UInt64 | 1024 | Maximum number of output tokens per API call. |
| api_version | String | — | API version string. Used by Anthropic ('2023-06-01'). |
Any OpenAI-compatible API (e.g. vLLM, Ollama, LiteLLM) can be used by setting provider = 'openai' and pointing the endpoint to your service.

Query-level settings
Which named collection to use is controlled by the ai_function_credentials setting. Other AI-related settings are listed in Settings under the ai_function_ prefix.
Use in DEFAULT and MATERIALIZED columns
The ai_function_credentials setting is read when the default expression is evaluated, NOT when the column is defined. The collection name is not stored in the column definition:
When the expression is evaluated, both allow_experimental_ai_functions and ai_function_credentials must be set, and the evaluating user must hold GRANT NAMED COLLECTION on the collection (resolving the credentials runs a NAMED COLLECTION access check). If any of these is missing, an exception is raised (SUPPORT_IS_DISABLED, an empty-credentials error, or ACCESS_DENIED, respectively).
A DEFAULT column is evaluated at INSERT, so both settings must be set in the inserting session or query:
A MATERIALIZED column is computed at INSERT like a DEFAULT column, and is also recomputed by mutations such as ALTER TABLE ... MATERIALIZE COLUMN. Mutations run outside a user session and do not inherit a query’s SETTINGS clause, but they do inherit settings from a settings profile. For mutation-driven recomputation to succeed, set both settings in a settings profile and grant NAMED COLLECTION to the table owner.
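The rules above can be sketched as follows. The table reviews, the collection ai_creds, the profile ai_profile, and the user table_owner are all hypothetical names, and enabling the experimental setting before CREATE TABLE is an assumption made to be safe.

```sql
SET allow_experimental_ai_functions = 1;

CREATE TABLE reviews
(
    id UInt64,
    body String,
    -- Evaluated at INSERT time; the collection name is not stored here.
    sentiment String DEFAULT aiClassify(body, ['positive', 'negative', 'neutral'])
)
ENGINE = MergeTree
ORDER BY id;

-- Both settings are supplied by the inserting query.
INSERT INTO reviews (id, body)
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_creds'
VALUES (1, 'Arrived quickly and works well.');

-- For a MATERIALIZED column recomputed by ALTER TABLE ... MATERIALIZE COLUMN,
-- the settings would come from a settings profile instead of the query.
CREATE SETTINGS PROFILE ai_profile
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_creds'
TO table_owner;

-- The evaluating user also needs access to the collection.
GRANT NAMED COLLECTION ON ai_creds TO table_owner;
```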
Restricting endpoint hosts
The endpoint URL in an AI named collection is an outbound destination that the server connects to under its own identity. If the named collection specifies an api_key, the request headers carry it. By default, ClickHouse permits any host. To restrict functions to a specific set of providers, configure remote_url_allow_hosts in the server config, e.g.:
Supported providers
| Provider | provider value | Chat functions | Notes |
|---|---|---|---|
| OpenAI | 'openai' | Yes | Default provider. |
| Anthropic | 'anthropic' | Yes | Uses /v1/messages endpoint. |
Observability
AI function activity is tracked through ClickHouse ProfileEvents:

| ProfileEvent | Description |
|---|---|
| AIAPICalls | Number of HTTP requests made to the AI provider. |
| AIInputTokens | Total input tokens consumed. |
| AIOutputTokens | Total output tokens consumed. |
| AIRowsProcessed | Number of rows that received a result. |
| AIRowsSkipped | Number of rows skipped (quota exceeded, or error with ai_function_throw_on_error = 0). |
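One way to inspect these counters, sketched under the assumption that they surface like other ProfileEvents: system.events holds server-wide totals (counters that are still zero do not appear), and the ProfileEvents map in system.query_log holds per-query values.

```sql
-- Server-wide totals since startup.
SELECT event, value
FROM system.events
WHERE event IN ('AIAPICalls', 'AIInputTokens', 'AIOutputTokens',
                'AIRowsProcessed', 'AIRowsSkipped');

-- Per-query usage for the most recent queries that called an AI provider.
SELECT
    query,
    ProfileEvents['AIAPICalls'] AS api_calls,
    ProfileEvents['AIInputTokens'] AS input_tokens,
    ProfileEvents['AIOutputTokens'] AS output_tokens
FROM system.query_log
WHERE type = 'QueryFinish' AND ProfileEvents['AIAPICalls'] > 0
ORDER BY event_time DESC
LIMIT 10;
```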
aiClassify
Introduced in: v26.4.0

Classifies the given text into one of the provided categories using an LLM provider. The function sends the text together with a fixed classification prompt and a JSON-schema response format that constrains the model to return exactly one of the supplied labels. When the response is a JSON object of the form {"category": "..."}, the label string is unwrapped and returned.
Provider credentials and configuration are taken from the named collection specified by the ai_function_credentials setting.
Syntax
Aliases: AIClassify
Arguments
- text — Text to classify. String
- categories — Constant list of candidate category labels. Array(String)
- temperature — Sampling temperature controlling randomness. Default: 0.0. Float64
ai_function_throw_on_error is disabled. String
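A hedged sketch of a call, with the argument order taken from the list above. The collection name ai_creds and the input text are assumptions; no output is shown because the result depends on the model.

```sql
-- ai_creds is a hypothetical named collection holding provider credentials.
SELECT aiClassify(
    'The delivery was late and the box was damaged.',
    ['positive', 'negative', 'neutral']
) AS sentiment
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_creds';
```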
Examples
Classify sentiment
Query
Response
Query
Response
aiEmbed
Introduced in: v26.6.0

Generates an embedding vector for the given text using the configured AI provider. The function sends the text to the configured embedding endpoint and returns the resulting vector as Array(Float32).
Within a single block of rows, inputs are grouped into batches of up to ai_function_embedding_max_batch_size entries per HTTP request to reduce per-call overhead.
Provider credentials and configuration are taken from the named collection specified by the ai_function_credentials setting.
The optional dimensions argument, when supported by the model (e.g. OpenAI’s text-embedding-3-*),
requests a vector of the given size; otherwise the model’s native size is returned.
Syntax
Arguments
- text — Text to embed. String
- dimensions — Optional target dimensionality for the output vector. 0 or omitted means the model’s native size. UInt64
ai_function_throw_on_error is disabled, or a quota was exceeded with ai_function_throw_on_quota_exceeded disabled. Array(Float32)
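As an illustration, the sketch below requests a 256-dimensional vector and checks its length. The collection name ai_embed_creds is hypothetical and is assumed to point at an embedding model that honors the dimensions argument.

```sql
-- ai_embed_creds is a hypothetical named collection for an embedding model.
SELECT length(aiEmbed('ClickHouse is a column-oriented database.', 256)) AS dims
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_embed_creds';
```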
Examples
Embed a single string
Query
Response
Query
Response
Query
Response
aiExtract
Introduced in: v26.4.0

Extracts structured information from unstructured text using an LLM provider. The second argument may be either a free-form natural-language instruction (e.g. 'the main complaint') or a JSON-encoded schema of the form '{"field_a": "description of field a", "field_b": "description of field b"}'.
In instruction mode, the function returns the extracted value as a plain string, or an empty string if nothing was found.
In schema mode, the function returns a JSON object string whose keys match the requested schema; missing fields are null.
Provider credentials and configuration are taken from the named collection specified by the ai_function_credentials setting.
Syntax
Aliases: AIExtract
Arguments
- text — Text to extract information from. String
- instruction_or_schema — Free-form extraction instruction, or a constant JSON object describing the fields to extract. const String
- temperature — Sampling temperature controlling randomness. Default: 0.0. const Float64
ai_function_throw_on_error is disabled. String
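The two modes might be exercised as in this sketch. The review text, the schema field names, and the collection name ai_creds are made up for illustration.

```sql
SELECT
    -- Instruction mode: returns a plain string.
    aiExtract(review, 'the main complaint') AS complaint,
    -- Schema mode: returns a JSON object string with the requested keys.
    aiExtract(review, '{"product": "name of the product", "rating": "star rating, if any"}') AS fields
FROM (SELECT 'The X200 headphones broke after two days. One star.' AS review)
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_creds';
```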
Examples
Free-form instruction
Query
Response
Query
Response
aiGenerate
Introduced in: v26.4.0

Generates free-form text content from a prompt using an LLM provider. The function sends the prompt to the configured AI provider and returns the generated text. An optional system prompt can be provided to guide the model’s behavior (e.g. tone, format, role). If no system prompt is given, the default system prompt is: "You are a helpful assistant. Provide a clear and concise response."
Provider credentials and configuration are taken from the named collection specified by the ai_function_credentials setting.
Syntax
Aliases: AIGenerate
Arguments
- prompt — The user prompt or question to send to the model. String
- system_prompt — Optional constant system-level instruction that guides the model’s behavior (e.g. persona, output format), sent along with each prompt. String
- temperature — Sampling temperature controlling randomness. Default: 0.7. Float64
ai_function_throw_on_error is disabled. String
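A sketch of a call that overrides both optional arguments, assuming they are positional in the order listed above. The prompts and the collection name ai_creds are illustrative.

```sql
SELECT aiGenerate(
    'Explain what a primary key is in one sentence.',  -- prompt
    'You are a terse database tutor.',                  -- system_prompt (constant)
    0.2                                                 -- temperature
) AS answer
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_creds';
```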
Examples
Simple question
Query
Response
Query
Response
Query
Response
aiTranslate
Introduced in: v26.4.0

Translates the given text into the specified target language using an LLM provider. Additional style or dialect instructions may be passed as a third argument (e.g. 'keep technical terms untranslated').
Provider credentials and configuration are taken from the named collection specified by the ai_function_credentials setting.
Syntax
Aliases: AITranslate
Arguments
- text — Text to translate. String
- target_language — Target language name or BCP-47 code (e.g. 'French', 'es-MX'). String
- instructions — Optional constant additional instructions for the translator. String
- temperature — Sampling temperature controlling randomness. Default: 0.3. Float64
ai_function_throw_on_error is disabled. String
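For illustration, the sketch below translates one sentence twice, once by language name and once by BCP-47 code with extra instructions. The input text and the collection name ai_creds are assumptions.

```sql
SELECT
    aiTranslate(msg, 'French') AS fr,
    aiTranslate(msg, 'es-MX', 'keep technical terms untranslated') AS es_mx
FROM (SELECT 'Restart the server after changing the config file.' AS msg)
SETTINGS allow_experimental_ai_functions = 1, ai_function_credentials = 'ai_creds';
```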
Examples
Translate to French
Query
Response
Query
Response