> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-fix-nav-issues.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

> Documentation for the Apache Arrow Flight interface in ClickHouse, allowing Flight SQL clients to connect to ClickHouse

# Arrow Flight interface

<h2 id="overview">
  Overview
</h2>

ClickHouse supports the [Apache Arrow Flight](https://arrow.apache.org/docs/format/Flight.html) protocol — a high-performance RPC framework for efficient columnar data transport using the [Arrow IPC](https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc) format over [gRPC](https://grpc.io/).

The implementation includes support for [Arrow Flight SQL](https://arrow.apache.org/docs/format/FlightSql.html), enabling BI tools and applications that speak the Flight SQL protocol to query ClickHouse directly.

Key capabilities:

* Execute SQL queries and retrieve results in Apache Arrow format.
* Insert data into tables using the Arrow format.
* Query metadata (catalogs, schemas, tables, primary keys) via Flight SQL commands.
* Create, bind, execute, and close server-side prepared statements via Flight SQL.
* Manage sessions and settings via Flight SQL actions.
* TLS encryption and username/password authentication.
* Incremental result retrieval via `PollFlightInfo`.
* Query cancellation via `CancelFlightInfo`.

<h2 id="enabling-server">
  Enabling the Arrow Flight Server
</h2>

To enable the Arrow Flight server, add the `arrowflight_port` setting to the ClickHouse server configuration:

```xml theme={null}
<clickhouse>
    <arrowflight_port>9090</arrowflight_port>
</clickhouse>
```

Upon startup, a log message confirms the interface is active:

```text theme={null}
{} <Information> Application: Arrow Flight compatibility protocol: 0.0.0.0:9090
```

<h2 id="tls-configuration">
  TLS configuration
</h2>

To enable TLS for the Arrow Flight interface, configure the following settings:

```xml theme={null}
<clickhouse>
    <arrowflight_port>9090</arrowflight_port>
    <arrowflight>
        <enable_ssl>true</enable_ssl>
        <ssl_cert_file>/path/to/server-cert.pem</ssl_cert_file>
        <ssl_key_file>/path/to/server-key.pem</ssl_key_file>
    </arrowflight>
</clickhouse>
```

When TLS is enabled, clients must connect using the `grpc+tls://` scheme instead of `grpc://`.

<h2 id="authentication">
  Authentication
</h2>

The Arrow Flight interface supports two authentication methods:

<h3 id="basic-auth">
  Basic Authentication
</h3>

Clients authenticate with a username and password via the standard HTTP `Authorization: Basic` header. On successful authentication, the server returns a Bearer token in the response header.

<h3 id="bearer-auth">
  Bearer Token Authentication
</h3>

Subsequent requests can use the Bearer token returned from Basic authentication via the `Authorization: Bearer <token>` header. The token is automatically refreshed on each use and expires based on the `default_session_timeout` server setting (default: 60 seconds).

<h3 id="auth-python-example">
  Python Example
</h3>

```python theme={null}
import pyarrow.flight as flight

client = flight.FlightClient("grpc://localhost:9090")

# Basic auth returns a bearer token for subsequent calls
token_pair = client.authenticate_basic_token("default", "")
options = flight.FlightCallOptions(headers=[token_pair])
```

With TLS:

```python theme={null}
import pyarrow.flight as flight

with open("ca-cert.pem", "rb") as f:
    tls_root_certs = f.read()

client = flight.FlightClient(
    "grpc+tls://localhost:9090",
    tls_root_certs=tls_root_certs,
)

token_pair = client.authenticate_basic_token("default", "password")
options = flight.FlightCallOptions(headers=[token_pair])
```

<h2 id="session-management">
  Session Management
</h2>

The Arrow Flight interface supports ClickHouse sessions through custom gRPC metadata headers:

| Header                         | Description                                                                                                                           |
| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------- |
| `x-clickhouse-session-id`      | Session identifier. If provided, multiple requests share the same session state (temporary tables, settings).                         |
| `x-clickhouse-session-timeout` | Session timeout in seconds. Must not exceed `max_session_timeout`.                                                                    |
| `x-clickhouse-session-check`   | Set to `1` to check if the session exists without creating one.                                                                       |
| `x-clickhouse-session-close`   | Set to `1` to close the session after the request completes. Requires `enable_arrow_close_session` to be `true` in the server config. |

<Note>
  Because Arrow Flight uses gRPC over HTTP/2, metadata header names are case-sensitive and must be specified in lowercase exactly as shown (e.g., `x-clickhouse-session-id`, not `X-ClickHouse-Session-Id`). This is required by [RFC 9113, Section 8.2](https://www.rfc-editor.org/rfc/rfc9113#section-8.2), which mandates that HTTP/2 field names contain only lowercase characters. This differs from HTTP/1.1 where header names are case-insensitive.
</Note>

Sessions allow setting persistent ClickHouse settings via the `SetSessionOptions` action (see [DoAction](#doaction)).

<h2 id="configuration-reference">
  Server Configuration Reference
</h2>

| Setting                                                       | Default | Description                                                                                                                                                                                                                                                                                                                                                |
| ------------------------------------------------------------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `arrowflight_port`                                            | —       | Port for the Arrow Flight server. The server starts only if this setting is specified.                                                                                                                                                                                                                                                                     |
| `arrowflight.enable_ssl`                                      | `false` | Enable TLS encryption.                                                                                                                                                                                                                                                                                                                                     |
| `arrowflight.ssl_cert_file`                                   | —       | Path to the TLS certificate file. Required when TLS is enabled.                                                                                                                                                                                                                                                                                            |
| `arrowflight.ssl_key_file`                                    | —       | Path to the TLS private key file. Required when TLS is enabled.                                                                                                                                                                                                                                                                                            |
| `arrowflight.tickets_lifetime_seconds`                        | `600`   | Time in seconds before flight tickets expire and are cleaned up. Set to `0` to disable automatic ticket expiration.                                                                                                                                                                                                                                        |
| `arrowflight.cancel_ticket_after_do_get`                      | `false` | If `true`, tickets are cancelled immediately after being consumed by `DoGet`, freeing memory.                                                                                                                                                                                                                                                              |
| `arrowflight.poll_descriptors_lifetime_seconds`               | `600`   | Time in seconds before poll descriptors expire. Set to `0` to disable automatic expiration.                                                                                                                                                                                                                                                                |
| `arrowflight.cancel_flight_descriptor_after_poll_flight_info` | `false` | If `true`, poll descriptors are cancelled after being consumed by `PollFlightInfo`.                                                                                                                                                                                                                                                                        |
| `arrowflight.max_prepared_statements_per_user`                | `100`   | Maximum number of open prepared statements per user. Set to `0` to disable the limit.                                                                                                                                                                                                                                                                      |
| `arrowflight.prepared_statements_lifetime_seconds`            | `-1`    | Prepared statement lifetime mode. `> 0`: use this value as lifetime and refresh expiration on each request for both session-bound and session-less statements. `0`: disable automatic expiration. `-1`: for session-bound statements, use session timeout as lifetime and refresh it on each request; session-less statements do not expire automatically. |
| `enable_arrow_close_session`                                  | `true`  | Allow clients to close sessions via the `x-clickhouse-session-close` header.                                                                                                                                                                                                                                                                               |
| `default_session_timeout`                                     | `60`    | Default session timeout in seconds. Also controls Bearer token expiration.                                                                                                                                                                                                                                                                                 |
| `max_session_timeout`                                         | `3600`  | Maximum allowed session timeout in seconds.                                                                                                                                                                                                                                                                                                                |

<h2 id="rpc-methods">
  Supported RPC Methods
</h2>

<h3 id="getflightinfo">
  GetFlightInfo
</h3>

Executes a query and returns a `FlightInfo` containing the result schema, endpoints with tickets for data retrieval, row count, and byte count.

Accepts a `FlightDescriptor` which can be:

* **PATH descriptor**: A single-component path interpreted as a table name. Generates `SELECT * FROM <table>`.
* **CMD descriptor**: Either a raw SQL query string, or a serialized Flight SQL protobuf command (see [Flight SQL Commands](#flight-sql-commands)).

The query is executed fully, and results are stored in server-side tickets. Each block of data produces a separate endpoint/ticket, allowing clients to retrieve data in parallel.

```python theme={null}
# Query by table name
descriptor = flight.FlightDescriptor.for_path("my_table")
info = client.get_flight_info(descriptor, options)

# Query by SQL
descriptor = flight.FlightDescriptor.for_command(
    "SELECT * FROM my_table WHERE id > 100"
)
info = client.get_flight_info(descriptor, options)

# Retrieve results
for endpoint in info.endpoints:
    reader = client.do_get(endpoint.ticket, options)
    table = reader.read_all()
    print(table.to_pandas())
```

<h3 id="pollflightinfo">
  PollFlightInfo
</h3>

Enables incremental result retrieval for long-running queries. Instead of waiting for the entire query to complete (as `GetFlightInfo` does), `PollFlightInfo` returns results block by block.

On the first call, the query starts executing. The response includes:

* A `FlightInfo` with endpoints for any data blocks available so far.
* A `FlightDescriptor` for the next poll (if more results are expected).

Subsequent calls with the returned descriptor retrieve additional blocks. When no more data is available, the response contains no next descriptor.

<Note>
  The current implementation blocks until a data block is available rather than returning immediately with no data.
</Note>

<h3 id="getschema">
  GetSchema
</h3>

Returns the Arrow schema for a query result without executing the full query. Accepts the same descriptor types as `GetFlightInfo`.

```python theme={null}
descriptor = flight.FlightDescriptor.for_command(
    "SELECT 1 AS x, 'hello' AS y"
)
schema_result = client.get_schema(descriptor, options)
schema = schema_result.schema
print(schema)  # x: int32, y: string
```

<h3 id="doget">
  DoGet
</h3>

Retrieves data for a given ticket. Accepts either:

* A ticket returned by `GetFlightInfo` or `PollFlightInfo`.
* A raw SQL query string as the ticket value.

```python theme={null}
# Using a ticket from GetFlightInfo
reader = client.do_get(endpoint.ticket, options)
table = reader.read_all()

# Using a raw SQL query as ticket
ticket = flight.Ticket("SELECT number FROM system.numbers LIMIT 10")
reader = client.do_get(ticket, options)
table = reader.read_all()
```

<h3 id="doput">
  DoPut
</h3>

Sends data to ClickHouse. Accepts a `FlightDescriptor` and a stream of Arrow record batches.

**Insert by table name** (PATH descriptor):

```python theme={null}
schema = pa.schema([("id", pa.int64()), ("name", pa.string())])
batch = pa.record_batch(
    [pa.array([1, 2, 3]), pa.array(["Alice", "Bob", "Charlie"])],
    schema=schema,
)

descriptor = flight.FlightDescriptor.for_path("my_table")
writer, _ = client.do_put(descriptor, schema, options)
writer.write_batch(batch)
writer.close()
```

**Insert by SQL** (CMD descriptor):

```python theme={null}
descriptor = flight.FlightDescriptor.for_command(
    "INSERT INTO my_table FORMAT Arrow"
)
writer, _ = client.do_put(descriptor, schema, options)
writer.write_batch(batch)
writer.close()
```

**Execute DDL/DML via Flight SQL `CommandStatementUpdate`:**

Flight SQL clients use `CommandStatementUpdate` to execute DDL/DML statements (CREATE, INSERT, ALTER, etc.). The response includes the affected row count.

**Bulk ingest via Flight SQL `CommandStatementIngest`:**

Only appending to existing tables is supported (`TABLE_NOT_EXIST_OPTION_FAIL` + `TABLE_EXISTS_OPTION_APPEND`). Catalogs and temporary tables are not supported for this command.

`transaction_id` is not supported for `CommandStatementUpdate` or `CommandStatementIngest`. If provided, ClickHouse returns a `NotImplemented` error.

<Note>
  Only the `Arrow` format is accepted for data transfer. Specifying other formats in SQL (e.g., `FORMAT JSON`) results in an error.
</Note>

<h3 id="doaction">
  DoAction
</h3>

Executes named actions. The following actions are supported:

<h4 id="cancelflightinfo">
  CancelFlightInfo
</h4>

Cancels a running query associated with a `FlightInfo`. The query ID is extracted from the `FlightInfo`'s `app_metadata` field. Also cancels any poll descriptors associated with the query.

```python theme={null}
# Start a long-running query via PollFlightInfo, then cancel it
cancel_request = flight.CancelFlightInfoRequest(info)
result = client.cancel_flight_info(cancel_request, options)
# result.status is CancelStatus.CANCELLED if successful
```

<h4 id="setsessionoptions">
  SetSessionOptions
</h4>

Sets ClickHouse server settings for the current session. Requires a session ID to be set via the `x-clickhouse-session-id` header.

Supported value types: string, boolean, integer, double, and string lists.

If a setting name is unknown, the error `INVALID_NAME` is returned. If a value cannot be parsed, the error `INVALID_VALUE` is returned.

<h4 id="getsessionoptions">
  GetSessionOptions
</h4>

Returns all current ClickHouse settings and their values for the session. Returns a map of setting names to string values (queries `system.settings` internally).

<h4 id="createpreparedstatement">
  CreatePreparedStatement
</h4>

Creates a server-side prepared statement and returns a statement handle. The request contains the SQL query text with `?` placeholders.

`transaction_id` is not supported for this action. If it is provided, ClickHouse returns a `NotImplemented` error.

For query statements, the response may include:

* `dataset_schema`: schema of the result set.
* `parameter_schema`: schema of statement parameters.

If schema inference fails for a valid query (for example, when replacing placeholders with `NULL` is not valid for that query), ClickHouse still creates the prepared statement and returns the handle without `dataset_schema`.

Prepared statements are owned by the authenticated user, not by a single session. If you open multiple sessions as the same user, you can execute, re-bind, and close the same statement handle from any of those sessions.

Other users cannot execute, bind, or close a statement handle they did not create.

`arrowflight.prepared_statements_lifetime_seconds` controls expiration behavior:

* `> 0`: use the configured value as statement lifetime. Expiration is refreshed on each request for both session-bound and session-less statements.
* `0`: prepared statements do not expire automatically.
* `-1` (default): if the statement is created in a session, its lifetime follows that session timeout and is refreshed on each request in that session. If the statement is created without a session, it does not expire automatically.

Expired statements are removed and no longer count toward `arrowflight.max_prepared_statements_per_user`.

<h4 id="closepreparedstatement">
  ClosePreparedStatement
</h4>

Closes a prepared statement and releases the associated server-side resources when the request contains a non-empty statement handle.

ClickHouse also supports bulk close with `ClosePreparedStatement` when the handle is empty:

* If `x-clickhouse-session-id` is present, it closes all prepared statements for the authenticated user in that session.
* If no session ID is present, it closes only session-less prepared statements for the authenticated user.

If a prepared statement is created in a session (via `x-clickhouse-session-id`), it is also closed automatically when that session is closed.

<h2 id="flight-sql-commands">
  Flight SQL Commands
</h2>

When a `CMD` descriptor contains a serialized [Flight SQL protobuf](https://arrow.apache.org/docs/format/FlightSql.html) message, ClickHouse handles the following commands:

<h3 id="flightsql-getflightinfo">
  Supported via GetFlightInfo / GetSchema
</h3>

| Command                         | Description                                                                                       |
| ------------------------------- | ------------------------------------------------------------------------------------------------- |
| `CommandStatementQuery`         | Execute an arbitrary SQL query. `transaction_id` is not supported.                                |
| `CommandGetSqlInfo`             | Retrieve server metadata (name, version, Arrow version, capabilities).                            |
| `CommandGetCatalogs`            | List catalogs. Returns an empty result (ClickHouse does not use catalogs).                        |
| `CommandGetDbSchemas`           | List databases. Supports optional `db_schema_filter_pattern` (SQL `LIKE` pattern).                |
| `CommandGetTables`              | List tables. Supports filters for schema, table name, table types, and optional schema inclusion. |
| `CommandGetTableTypes`          | List table engine types (from `system.table_engines`).                                            |
| `CommandGetPrimaryKeys`         | Retrieve primary key columns for a specified table.                                               |
| `CommandPreparedStatementQuery` | Execute a prepared `SELECT`-style statement by handle.                                            |

<h3 id="flightsql-doput">
  Supported via DoPut
</h3>

| Command                          | Description                                                                                                                                                                                                                                                                 |
| -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `CommandStatementUpdate`         | Execute a DDL/DML statement (CREATE, INSERT, ALTER, etc.). Returns affected row count. `transaction_id` is not supported.                                                                                                                                                   |
| `CommandStatementIngest`         | Bulk insert Arrow data into an existing table. Only append mode is supported. `transaction_id` is not supported.                                                                                                                                                            |
| `CommandPreparedStatementQuery`  | Bind parameter values for a prepared statement when sent via `DoPut`, then return `DoPutPreparedStatementResult` with the statement handle. Only one parameter set (one row) is accepted, and the number of bound values must exactly match the number of `?` placeholders. |
| `CommandPreparedStatementUpdate` | Execute a prepared DDL/DML statement by handle and return affected row count.                                                                                                                                                                                               |

<h3 id="flightsql-not-implemented">
  Unsupported in ClickHouse
</h3>

These commands map to features that ClickHouse does not provide, so they are not supported by the Arrow Flight SQL interface.

| Command                         | Reason                                                                                                                                |
| ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| `CommandGetCrossReference`      | ClickHouse is not a relational database and does not implement foreign key constraints, so cross-reference metadata is not available. |
| `CommandGetExportedKeys`        | ClickHouse is not a relational database and does not implement foreign key constraints, so exported-key metadata is not available.    |
| `CommandGetImportedKeys`        | ClickHouse is not a relational database and does not implement foreign key constraints, so imported-key metadata is not available.    |
| `CommandStatementSubstraitPlan` | ClickHouse does not support Substrait plans.                                                                                          |

<h2 id="complete-example">
  Complete Example
</h2>

```python title="Query" theme={null}
import pyarrow as pa
import pyarrow.flight as flight

# Connect and authenticate
client = flight.FlightClient("grpc://localhost:9090")
token = client.authenticate_basic_token("default", "")
options = flight.FlightCallOptions(headers=[token])

# Insert data using DoPut with a PATH descriptor
schema = pa.schema([("id", pa.uint32()), ("value", pa.string())])
batch = pa.record_batch(
    [pa.array([1, 2, 3], type=pa.uint32()), pa.array(["a", "b", "c"])],
    schema=schema,
)
descriptor = flight.FlightDescriptor.for_path("test")
writer, _ = client.do_put(descriptor, schema, options)
writer.write_batch(batch)
writer.close()

# Query data using GetFlightInfo + DoGet
descriptor = flight.FlightDescriptor.for_command(
    "SELECT * FROM test ORDER BY id"
)
info = client.get_flight_info(descriptor, options)
for endpoint in info.endpoints:
    reader = client.do_get(endpoint.ticket, options)
    table = reader.read_all()
    print(table.to_pandas())
```

```text title="Response" theme={null}
   id value
0   1     a
1   2     b
2   3     c
```

<h2 id="data-format">
  Data Format
</h2>

All data is transferred in Apache Arrow IPC format. Only the `Arrow` format is supported — specifying other ClickHouse formats (e.g., `FORMAT JSON`, `FORMAT CSV`) results in an error.

ClickHouse data types are mapped to Arrow types during serialization. The setting `output_format_arrow_unsupported_types_as_binary` controls whether unsupported ClickHouse types are serialized as binary blobs.

<h2 id="compatibility">
  Compatibility
</h2>

The Arrow Flight interface is compatible with any client or tool that supports the Arrow Flight or Arrow Flight SQL protocol, including:

* Python (`pyarrow`)
* Java (`org.apache.arrow.flight`)
* C++ (`arrow::flight`)
* Go (`apache/arrow/go`)
* ADBC (Arrow Database Connectivity) drivers
* DBeaver, and other tools with Flight SQL support

If a native ClickHouse connector is available for your tool (e.g., JDBC, ODBC, native protocol), prefer using it unless Arrow Flight is specifically required for performance or format compatibility.

<h2 id="client-side">
  Client-side ArrowFlight features
</h2>

ClickHouse can also act as a Flight client to read data from external Arrow Flight servers. See:

* [ArrowFlight table engine](/reference/engines/table-engines/integrations/arrowflight)
* [arrowFlight table function](/reference/functions/table-functions/arrowflight)

<h2 id="see-also">
  See also
</h2>

* [Apache Arrow Flight specification](https://arrow.apache.org/docs/format/Flight.html)
* [Apache Arrow Flight SQL specification](https://arrow.apache.org/docs/format/FlightSql.html)
* [Arrow format in ClickHouse](/reference/formats/Arrow/Arrow)
