> Data ingestion with Vector for ClickStack - The ClickHouse Observability Stack

# Ingesting with Vector

export const Image = ({img, alt, size}) => {
  return <Frame>
      <img src={img} alt={alt} />
    </Frame>;
};

[Vector](https://vector.dev) is a high-performance, vendor-neutral observability data pipeline. It is commonly used to collect, transform, and route logs and metrics from a wide range of sources, and is especially popular for log ingestion due to its flexibility and low resource footprint.

When using Vector with ClickStack, users are responsible for defining their own schemas. These schemas may follow OpenTelemetry conventions, but they can also be entirely custom, representing user-defined event structures. In practice, Vector ingestion is most commonly used for **logs**, where users want full control over parsing and enrichment before data is written to ClickHouse.

This guide focuses on onboarding data into ClickStack using Vector for both ClickStack Open Source and Managed ClickStack. For simplicity, it doesn't cover Vector sources or pipeline configuration in depth. Instead, it focuses on configuring the **sink** that writes data into ClickHouse and ensuring the resulting schema is compatible with ClickStack.

The only strict requirement for ClickStack, whether using the open-source or managed deployment, is that the data includes a **timestamp column** (or equivalent time field), which can be declared when configuring the data source in the ClickStack UI.

<h2 id="sending-data-with-vector">
  Sending data with Vector
</h2>

<br />

<Tabs>
  <Tab title="Managed ClickStack">
    The following guide assumes you have already created a Managed ClickStack service and recorded your service credentials. If you haven't, follow the [Getting Started](/clickstack/getting-started/managed) guide for Managed ClickStack until prompted to configure Vector.

    <Steps>
      <Step>
        <h3 id="create-a-database-table">
          Create a database and table
        </h3>

        Vector requires a table and schema to be defined prior to data ingestion.

        First create a database. This can be done via the [ClickHouse Cloud console](/products/cloud/features/sql-console-features/sql-console).

        In the example below, we use `logs`:

        ```sql theme={null}
        CREATE DATABASE IF NOT EXISTS logs
        ```

        Create a table for your data. This should match the output schema of your data. The example below assumes a classic Nginx structure. Adjust accordingly to your data, adhering to [schema best practices](/concepts/best-practices/select-data-type). We **strongly recommend** familiarizing yourself with the [concept of Primary keys](/concepts/core-concepts/primary-indexes), selecting your primary key based on the guidelines outlined [here](/clickstack/managing/performance-tuning#choosing-a-primary-key).

        ```sql theme={null}
        CREATE TABLE logs.nginx_logs
        (
            `time_local` DateTime,
            `remote_addr` IPv4,
            `remote_user` LowCardinality(String),
            `request` String,
            `status` UInt16,
            `body_bytes_sent` UInt64,
            `http_referer` String,
            `http_user_agent` String,
            `http_x_forwarded_for` LowCardinality(String),
            `request_time` Float32,
            `upstream_response_time` Float32,
            `http_host` String
        )
        ENGINE = MergeTree
        ORDER BY (toStartOfMinute(time_local), status, remote_addr)
        ```

        <Info>
          **Nginx primary key**

          The primary key above assumes typical access patterns in the ClickStack UI for Nginx logs, but may need to be adjusted depending on your workload in production environments.
        </Info>
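
        To confirm that the table matches what your pipeline emits, the column list can be read back over the ClickHouse HTTP interface. This is a minimal sketch, assuming the `default` user and the `<CLICKHOUSE_ENDPOINT>` and `<CLICKHOUSE_PASSWORD>` placeholders used elsewhere in this guide. The `ch_query` helper is a hypothetical name used only for illustration.

        ```bash
        # Placeholders: replace with your Managed ClickStack service details.
        CLICKHOUSE_ENDPOINT="<CLICKHOUSE_ENDPOINT>"
        CLICKHOUSE_USER="default"
        CLICKHOUSE_PASSWORD="<CLICKHOUSE_PASSWORD>"

        # Hypothetical helper: POST a SQL statement to the ClickHouse HTTP interface.
        ch_query() {
          curl -sS --user "${CLICKHOUSE_USER}:${CLICKHOUSE_PASSWORD}" \
            --data-binary "$1" "${CLICKHOUSE_ENDPOINT}/"
        }

        # List the column names and types of the table created above.
        ch_query "DESCRIBE TABLE logs.nginx_logs" \
          || echo "query failed: check the endpoint and credentials" >&2
        ```

        Each column name returned should correspond to a field name produced by your Vector pipeline.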
      </Step>

      <Step>
        <h3 id="add-clickhouse-sink-to-config">
          Add ClickHouse sink to Vector configuration
        </h3>

        Modify your Vector configuration to include the ClickHouse sink, updating the `inputs` field to receive events from your existing pipelines.

        This configuration assumes that your upstream Vector pipeline has already **prepared the data to match the target ClickHouse schema**, meaning that fields are parsed, named correctly, and typed appropriately for insertion. See the [**Nginx example below**](#example-dataset-with-vector) for a complete illustration of parsing and normalizing raw log lines into a schema suitable for ClickStack.

        ```yaml theme={null}
        sinks:
          clickhouse:
            type: clickhouse
            inputs:
              - your_input
            endpoint: "<CLICKHOUSE_ENDPOINT>"
            database: logs
            format: json_each_row
            table: nginx_logs
            skip_unknown_fields: true
            auth:
              strategy: "basic"
              user: "default"
              password: "<CLICKHOUSE_PASSWORD>"
        ```

        We recommend the **`json_each_row`** format, which encodes each event as a single JSON object per row. It is the default format for ClickStack when ingesting JSON data, and should be preferred over alternatives such as JSON objects encoded as strings.

        The ClickHouse sink also supports **Arrow stream encoding** (currently in beta). This can offer higher throughput but comes with important constraints: the database and table must be static, as the schema is fetched once at startup, and dynamic routing isn't supported. For this reason, Arrow encoding is best suited for fixed, well-defined ingestion pipelines.

        We recommend reviewing the available sink configuration options in the [Vector documentation](https://vector.dev/docs/reference/configuration/sinks/clickhouse).

        <Note>
          The example above uses the default user for Managed ClickStack. For production deployments, we recommend [creating a dedicated ingestion user](/clickstack/ingesting-data/collector#creating-an-ingestion-user) with appropriate permissions and limits.
        </Note>
      </Step>

      <Step>
        <h3 id="navigate-to-clickstack-ui">
          Navigate to the ClickStack UI
        </h3>

        Navigate to your Managed ClickStack service and select "ClickStack" from the left-hand menu. If you’ve already completed the onboarding, this will launch the ClickStack UI in a new tab, and you will be automatically authenticated. If not, you can proceed through the onboarding and select “Launch ClickStack” once you’ve selected Vector as your input source.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/launch-clickstack-vector.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=de1740efed86706a6c58568e0360f587" alt="Launch ClickStack for vector" size="lg" width="1920" height="918" data-path="images/clickstack/launch-clickstack-vector.png" />
      </Step>

      <Step>
        <h3 id="create-a-datasource-managed">
          Create a datasource
        </h3>

        Create a logs data source. If no data sources exist, you will be prompted to create one on your first login. Otherwise, navigate to Team Settings and add a new data source.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/Y9kcWM6RbYppspJn/images/clickstack/create-vector-datasource.png?fit=max&auto=format&n=Y9kcWM6RbYppspJn&q=85&s=ae39a46fcf0945b3e1e30e6211e136b3" alt="Create datasource - vector" size="lg" width="3600" height="1938" data-path="images/clickstack/create-vector-datasource.png" />

        The configuration above assumes an Nginx-style schema with a `time_local` column used as the timestamp. This should be, where possible, the timestamp column declared in the primary key. This column is mandatory.

        We also recommend updating the `Default SELECT` to explicitly define which columns are returned in the logs view. If additional fields are available, such as service name, log level, or a body column, these can also be configured. The timestamp display column can also be overridden if it differs from the column used in the table's primary key and configured above.

        In the example above, a `Body` column doesn't exist in the data. Instead, it is defined using a SQL expression that reconstructs an Nginx log line from the available fields.

        For other possible options, see the [configuration reference](/clickstack/managing/config).
      </Step>

      <Step>
        <h3 id="explore-the-data-managed">
          Explore the data
        </h3>

        Navigate to the logs view to explore the data and begin using ClickStack.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/nginx-logs-vector-search.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=05c303e7bacb5ec7765367bcaed8cd11" alt="Nginx logs in ClickStack" size="lg" width="3600" height="1906" data-path="images/clickstack/nginx-logs-vector-search.png" />
      </Step>
    </Steps>
  </Tab>

  <Tab title="Open Source ClickStack">
    <Steps>
      <Step>
        <h3 id="create-a-database-table-oss">
          Create a database and table
        </h3>

        Vector requires a table and schema to be defined prior to data ingestion.

        First create a database. This can be done via the [ClickHouse Web user interface](/concepts/features/interfaces/http#web-ui) at [http://localhost:8123/play](http://localhost:8123/play). Use the default username and password `api:api`.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/play-ui-clickstack.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=ada17bdc9d2bab9225d770395cb11a33" alt="Play UI ClickStack" size="lg" width="3600" height="1918" data-path="images/clickstack/play-ui-clickstack.png" />

        In the example below, we use `logs`:

        ```sql theme={null}
        CREATE DATABASE IF NOT EXISTS logs
        ```

        Create a table for your data. This should match the output schema of your data. The example below assumes a classic Nginx structure. Adjust accordingly to your data, adhering to [schema best practices](/concepts/best-practices/select-data-type). We **strongly recommend** familiarizing yourself with the [concept of Primary keys](/concepts/core-concepts/primary-indexes), selecting your primary key based on the guidelines outlined [here](/clickstack/managing/performance-tuning#choosing-a-primary-key).

        ```sql theme={null}
        CREATE TABLE logs.nginx_logs
        (
            `time_local` DateTime,
            `remote_addr` IPv4,
            `remote_user` LowCardinality(String),
            `request` String,
            `status` UInt16,
            `body_bytes_sent` UInt64,
            `http_referer` String,
            `http_user_agent` String,
            `http_x_forwarded_for` LowCardinality(String),
            `request_time` Float32,
            `upstream_response_time` Float32,
            `http_host` String
        )
        ENGINE = MergeTree
        ORDER BY (toStartOfMinute(time_local), status, remote_addr)
        ```

        <Info>
          **Nginx primary key**

          The primary key above assumes typical access patterns in the ClickStack UI for Nginx logs, but may need to be adjusted depending on your workload in production environments.
        </Info>
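
        To check which key the table actually uses, the sorting and primary key expressions can be read back from `system.tables`. This is a minimal sketch, assuming the local defaults used in this guide (`http://localhost:8123` and `api:api`). The `show_table_keys` helper is a hypothetical name used only for illustration.

        ```bash
        # Hypothetical helper: print the sorting and primary key of a table.
        # Arguments: database name, table name.
        show_table_keys() {
          local database="$1" table="$2"
          curl -sS --user "api:api" "http://localhost:8123/" --data-binary \
            "SELECT sorting_key, primary_key FROM system.tables WHERE database = '${database}' AND name = '${table}' FORMAT Vertical"
        }

        show_table_keys logs nginx_logs \
          || echo "query failed: is ClickHouse reachable on localhost:8123?" >&2
        ```

        If you later change the `ORDER BY` clause for your workload, re-running this is a quick way to confirm the table was recreated with the intended key.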
      </Step>

      <Step>
        <h3 id="add-clickhouse-config-to-sink-oss">
          Add ClickHouse sink to Vector configuration
        </h3>

        Vector should write directly to ClickHouse, bypassing the OTLP endpoint exposed by the ClickStack collector.

        Modify your Vector configuration to include the ClickHouse sink, updating the `inputs` field to receive events from your existing pipelines.

        This configuration assumes that your upstream Vector pipeline has already **prepared the data to match the target ClickHouse schema**, meaning that fields are parsed, named correctly, and typed appropriately for insertion. See the [**Nginx example below**](#example-dataset-with-vector) for a complete illustration of parsing and normalizing raw log lines into a schema suitable for ClickStack.

        ```yaml theme={null}
        sinks:
          clickhouse:
            type: clickhouse
            inputs:
              - your_input
            endpoint: "http://localhost:8123"
            database: logs
            format: json_each_row
            table: nginx_logs
            skip_unknown_fields: true
            auth:
              strategy: "basic"
              user: "api"
              password: "api"
        ```

        We recommend the **`json_each_row`** format, which encodes each event as a single JSON object per row. It is the default format for ClickStack when ingesting JSON data, and should be preferred over alternatives such as JSON objects encoded as strings.
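
        To see what this format looks like on the wire, a single row can be inserted by hand over the ClickHouse HTTP interface. This is only a sketch of the kind of request involved, not Vector's exact request. It assumes the local defaults used in this guide (`http://localhost:8123` and `api:api`), the sample values are invented for illustration, and the command writes one test row into `logs.nginx_logs`.

        ```bash
        # One event as a single JSON object on one line; keys map to column names.
        # "extra_field" is not a column. It is dropped because the request sets
        # input_format_skip_unknown_fields=1, the ClickHouse setting behind skip_unknown_fields.
        ROW='{"time_local":"2025-10-20 11:00:00","remote_addr":"203.0.113.7","request":"GET / HTTP/1.1","status":200,"body_bytes_sent":612,"extra_field":"ignored"}'

        # The INSERT statement goes in the URL (spaces encoded as %20); the row goes in the body.
        printf '%s\n' "$ROW" | curl -sS --user "api:api" --data-binary @- \
          "http://localhost:8123/?input_format_skip_unknown_fields=1&query=INSERT%20INTO%20logs.nginx_logs%20FORMAT%20JSONEachRow" \
          || echo "insert failed: is ClickHouse reachable on localhost:8123?" >&2
        ```

        Columns missing from the JSON object, such as `http_host` here, are filled with their default values. A type mismatch, for example a non-numeric `status`, is rejected by ClickHouse, which is why the upstream pipeline needs to emit correctly typed fields.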

        The ClickHouse sink also supports **Arrow stream encoding** (currently in beta). This can offer higher throughput but comes with important constraints: the database and table must be static, as the schema is fetched once at startup, and dynamic routing isn't supported. For this reason, Arrow encoding is best suited for fixed, well-defined ingestion pipelines.

        We recommend reviewing the available sink configuration options in the [Vector documentation](https://vector.dev/docs/reference/configuration/sinks/clickhouse).

        <Note>
          The example above uses the `api` user for ClickStack Open Source. For production deployments, we recommend [creating a dedicated ingestion user](/clickstack/ingesting-data/collector#creating-an-ingestion-user) with appropriate permissions and limits. The configuration also assumes that Vector runs on the same host as ClickStack, which is unlikely in production. In that case, we recommend sending data over the secure HTTPS port 8443.
        </Note>
      </Step>

      <Step>
        <h3 id="navigate-to-clickstack-ui-oss">
          Navigate to the ClickStack UI
        </h3>

        Navigate to the ClickStack UI at [http://localhost:8080](http://localhost:8080). Create a user if you haven't completed the onboarding.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/Wpmp4N2VLv_V8ziJ/images/use-cases/observability/hyperdx-login.png?fit=max&auto=format&n=Wpmp4N2VLv_V8ziJ&q=85&s=a4a7f0f11f4ba3b35b9a6c6613b62f5e" alt="ClickStack login" size="lg" width="3600" height="1900" data-path="images/use-cases/observability/hyperdx-login.png" />
      </Step>

      <Step>
        <h3 id="create-a-datasource-oss">
          Create a datasource
        </h3>

        Navigate to Team Settings and add a new data source.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/Y9kcWM6RbYppspJn/images/clickstack/create-vector-datasource-oss.png?fit=max&auto=format&n=Y9kcWM6RbYppspJn&q=85&s=09df9e222c584d4201647b4f10e3f9dc" alt="Create datasource - vector" size="lg" width="3600" height="1940" data-path="images/clickstack/create-vector-datasource-oss.png" />

        The configuration above assumes an Nginx-style schema with a `time_local` column used as the timestamp. This should be, where possible, the timestamp column declared in the primary key. This column is mandatory.

        We also recommend updating the `Default SELECT` to explicitly define which columns are returned in the logs view. If additional fields are available, such as service name, log level, or a body column, these can also be configured. The timestamp display column can also be overridden if it differs from the column used in the table's primary key and configured above.

        In the example above, a `Body` column doesn't exist in the data. Instead, it is defined using a SQL expression that reconstructs an Nginx log line from the available fields.

        For other possible options, see the [configuration reference](/clickstack/managing/config).
      </Step>

      <Step>
        <h3 id="explore-the-data-oss">
          Explore the data
        </h3>

        Navigate to the logs view to explore the data and begin using ClickStack.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/nginx-logs-vector-search.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=05c303e7bacb5ec7765367bcaed8cd11" alt="Nginx logs in ClickStack" size="lg" width="3600" height="1906" data-path="images/clickstack/nginx-logs-vector-search.png" />
      </Step>
    </Steps>
  </Tab>
</Tabs>

<h2 id="example-dataset-with-vector">
  Example dataset with Vector
</h2>

For a more complete example, we use an **Nginx log file** below.

<Tabs>
  <Tab title="Managed ClickStack">
    The following guide assumes you have already created a Managed ClickStack service and recorded your service credentials. If you haven't, follow the [Getting Started](/clickstack/getting-started/managed) guide for Managed ClickStack until prompted to configure Vector.

    <Steps>
      <Step>
        <h3 id="installing-vector">
          Installing Vector
        </h3>

        Before proceeding, ensure that **Vector is installed** on the system where you plan to run your ingestion pipeline. Follow the [official Vector installation guide](https://vector.dev/docs/setup/installation/) to install a prebuilt binary or package appropriate for your environment.

        Once installed, verify that the `vector` binary is available on your path before continuing with the configuration steps below.
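
        One way to run this check from a shell is sketched below. The `check_vector` function name is hypothetical and used only for illustration.

        ```bash
        # Hypothetical helper: succeed and print the version only if vector is on PATH.
        check_vector() {
          command -v vector >/dev/null 2>&1 && vector --version
        }

        check_vector || echo "vector not found on PATH; install it before continuing" >&2
        ```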

        Vector can be installed on the same instance as your ClickStack OTel collector.

        Follow best practices for architecture and security when [moving Vector to production](https://vector.dev/docs/setup/going-to-prod/).
      </Step>

      <Step>
        <h3 id="download-the-sample-data">
          Download the sample data
        </h3>

        To experiment with a sample dataset, download the following Nginx access log.

        ```bash theme={null}
        curl -O https://datasets-documentation.s3.eu-west-3.amazonaws.com/clickstack-integrations/access.log
        ```

        <Note>
          This data has been collected from an Nginx instance configured to output logs in JSON format for easier parsing. For the Nginx configuration for these logs, see ["Monitoring Nginx Logs with ClickStack"](/clickstack/integration-examples/nginx-logs#configure-nginx).
        </Note>
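
        Before configuring the pipeline, it can help to look at the raw data. The sketch below prints the first record and the number of lines in the file. The `preview_log` helper is a hypothetical name used only for illustration. Each line is expected to be a single JSON object, including the `time_local` field parsed by the Vector transform later in this guide.

        ```bash
        # Hypothetical helper: show the first record and the line count of a log file.
        preview_log() {
          local file="$1"
          [ -f "$file" ] || { echo "$file not found; download it first" >&2; return 1; }
          head -n 1 "$file"
          echo "lines: $(grep -c '' "$file")"
        }

        preview_log access.log || true
        ```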
      </Step>

      <Step>
        <h3 id="create-database-table-nginx-managed">
          Create a database and table
        </h3>

        Vector requires a table and schema to be defined prior to data ingestion.

        First create a database. This can be done via the [ClickHouse Cloud console](/products/cloud/features/sql-console-features/sql-console).

        Create a database `logs`:

        ```sql theme={null}
        CREATE DATABASE IF NOT EXISTS logs
        ```

        Create a table for your data.

        ```sql theme={null}
        CREATE TABLE logs.nginx_logs
        (
            `time_local` DateTime,
            `remote_addr` IPv4,
            `remote_user` LowCardinality(String),
            `request` String,
            `status` UInt16,
            `body_bytes_sent` UInt64,
            `http_referer` String,
            `http_user_agent` String,
            `http_x_forwarded_for` LowCardinality(String),
            `request_time` Float32,
            `upstream_response_time` Float32,
            `http_host` String
        )
        ENGINE = MergeTree
        ORDER BY (toStartOfMinute(time_local), status, remote_addr)
        ```

        <Info>
          **Nginx primary key**

          The primary key above assumes typical access patterns in the ClickStack UI for Nginx logs, but may need to be adjusted depending on your workload in production environments.
        </Info>
      </Step>

      <Step>
        <h3 id="copy-vector-configuration">
          Copy Vector configuration
        </h3>

        Copy the Vector configuration below into a file named `nginx.yaml`, replacing `<CLICKHOUSE_ENDPOINT>` and `<CLICKHOUSE_PASSWORD>` with your service details.

        ```yaml theme={null}
        data_dir: ./.vector-data
        sources:
          nginx_logs:
            type: file
            include:
              - access.log
            read_from: beginning

        transforms:
          decode_json:
            type: remap
            inputs:
              - nginx_logs
            source: |
              . = parse_json!(to_string!(.message))
              ts = parse_timestamp!(.time_local, format: "%d/%b/%Y:%H:%M:%S %z")
              # ClickHouse-friendly DateTime format
              .time_local = format_timestamp!(ts, format: "%F %T")

        sinks:
          clickhouse:
            type: clickhouse
            inputs:
              - decode_json
            endpoint: "<CLICKHOUSE_ENDPOINT>"
            database: logs
            format: json_each_row
            table: nginx_logs
            skip_unknown_fields: true
            auth:
              strategy: "basic"
              user: "default"
              password: "<CLICKHOUSE_PASSWORD>"
        ```

        <Note>
          The example above uses the default user for Managed ClickStack. For production deployments, we recommend [creating a dedicated ingestion user](/clickstack/ingesting-data/collector#creating-an-ingestion-user) with appropriate permissions and limits.
        </Note>
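
        Before starting the pipeline, the file can be checked with Vector's `validate` subcommand. This is a minimal sketch, assuming `nginx.yaml` is in the current directory and the placeholders have already been replaced. The `validate_config` helper is a hypothetical name used only for illustration.

        ```bash
        # Hypothetical helper: check a Vector config without running the pipeline.
        # --no-environment skips checks that need a live environment, such as sink health checks.
        validate_config() {
          vector validate --no-environment "$1"
        }

        validate_config nginx.yaml \
          || echo "validation failed (or vector is not installed); see the messages above" >&2
        ```

        Dropping `--no-environment` should additionally exercise the environment checks, which is useful once the endpoint and credentials are in place.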
      </Step>

      <Step>
        <h3 id="start-vector">
          Start Vector
        </h3>

        Start Vector with the following command, creating the data directory first to record file offsets.

        ```bash theme={null}
        mkdir -p ./.vector-data
        vector --config nginx.yaml
        ```
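
        With Vector running, a second terminal can be used to confirm that rows are arriving. This is a minimal sketch, assuming the `default` user and the same `<CLICKHOUSE_ENDPOINT>` and `<CLICKHOUSE_PASSWORD>` placeholders as the configuration above. The `count_rows` helper is a hypothetical name used only for illustration.

        ```bash
        # Placeholders: replace with your Managed ClickStack service details.
        CLICKHOUSE_ENDPOINT="<CLICKHOUSE_ENDPOINT>"
        CLICKHOUSE_PASSWORD="<CLICKHOUSE_PASSWORD>"

        # Hypothetical helper: count the rows in a table via the ClickHouse HTTP interface.
        count_rows() {
          curl -sS --user "default:${CLICKHOUSE_PASSWORD}" \
            --data-binary "SELECT count() FROM $1" "${CLICKHOUSE_ENDPOINT}/"
        }

        count_rows logs.nginx_logs \
          || echo "query failed: check the endpoint and credentials" >&2
        ```

        The sink writes in batches, so the count may lag briefly behind what Vector has read from the file.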
      </Step>

      <Step>
        <h3 id="navigate-to-clickstack-ui-nginx-managed">
          Navigate to the ClickStack UI
        </h3>

        Navigate to your Managed ClickStack service and select "ClickStack" from the left-hand menu. If you’ve already completed the onboarding, this will launch the ClickStack UI in a new tab, and you will be automatically authenticated. If not, you can proceed through the onboarding and select “Launch ClickStack” once you’ve selected Vector as your input source.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/launch-clickstack-vector.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=de1740efed86706a6c58568e0360f587" alt="Launch ClickStack for vector" size="lg" width="1920" height="918" data-path="images/clickstack/launch-clickstack-vector.png" />
      </Step>

      <Step>
        <h3 id="create-a-datasource-nginx-managed">
          Create a datasource
        </h3>

        Create a logs data source. If no data sources exist, you will be prompted to create one on first login. Otherwise, navigate to Team Settings and add a new data source.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/Y9kcWM6RbYppspJn/images/clickstack/create-vector-datasource.png?fit=max&auto=format&n=Y9kcWM6RbYppspJn&q=85&s=ae39a46fcf0945b3e1e30e6211e136b3" alt="Create datasource - vector" size="lg" width="3600" height="1938" data-path="images/clickstack/create-vector-datasource.png" />

        The configuration assumes the Nginx schema with a `time_local` column used as the timestamp. This is the timestamp column declared in the primary key. This column is mandatory.

        We have also specified the default select to be `time_local, remote_addr, status, request`, which defines which columns are returned in the logs view.

        In the example above, a `Body` column doesn't exist in the data. Instead, it is defined as the SQL expression:

        ```sql theme={null}
        concat(
          remote_addr, ' ',
          remote_user, ' ',
          '[', formatDateTime(time_local, '%d/%b/%Y:%H:%M:%S %z'), '] ',
          '"', request, '" ',
          toString(status), ' ',
          toString(body_bytes_sent), ' ',
          '"', http_referer, '" ',
          '"', http_user_agent, '" ',
          '"', http_x_forwarded_for, '" ',
          toString(request_time), ' ',
          toString(upstream_response_time), ' ',
          '"', http_host, '"'
        )
        ```

        This reconstructs the log line from the structured fields.

        For other possible options, see the [configuration reference](/clickstack/managing/config).
      </Step>

      <Step>
        <h3 id="explore-the-data-nginx-managed">
          Explore the data
        </h3>

        Navigate to the search view for `October 20th, 2025` to explore the data and begin using ClickStack.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/nginx-logs-vector-search.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=05c303e7bacb5ec7765367bcaed8cd11" alt="HyperDX UI" size="lg" width="3600" height="1906" data-path="images/clickstack/nginx-logs-vector-search.png" />
      </Step>
    </Steps>
  </Tab>

  <Tab title="Open Source ClickStack">
    The following guide assumes you have set up ClickStack Open Source with the [Getting Started guide](/clickstack/getting-started/oss).

    <Steps>
      <Step>
        <h3 id="installing-vector-oss">
          Installing Vector
        </h3>

        Before proceeding, ensure that **Vector is installed** on the system where you plan to run your ingestion pipeline. Follow the [official Vector installation guide](https://vector.dev/docs/setup/installation/) to install a prebuilt binary or package appropriate for your environment.

        Once installed, verify that the `vector` binary is available on your path before continuing with the configuration steps below.

        Vector can be installed on the same instance as your ClickStack OTel collector.

        Follow best practices for architecture and security when [moving Vector to production](https://vector.dev/docs/setup/going-to-prod/).
      </Step>

      <Step>
        <h3 id="download-the-sample-data-oss">
          Download the sample data
        </h3>

        To experiment with a sample dataset, download the following Nginx access log.

        ```bash theme={null}
        curl -O https://datasets-documentation.s3.eu-west-3.amazonaws.com/clickstack-integrations/access.log
        ```

        <Note>
          This data has been collected from an Nginx instance configured to output logs in JSON format for easier parsing. For the Nginx configuration for these logs, see ["Monitoring Nginx Logs with ClickStack"](/clickstack/integration-examples/nginx-logs#configure-nginx).
        </Note>
      </Step>

      <Step>
        <h3 id="create-a-database-table-nginx-oss">
          Create a database and table
        </h3>

        Vector requires a table and schema to be defined prior to data ingestion.

        First create a database. This can be done via the [ClickHouse Web user interface](/concepts/features/interfaces/http#web-ui) at [http://localhost:8123/play](http://localhost:8123/play). Use the default username and password `api:api`.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/play-ui-clickstack.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=ada17bdc9d2bab9225d770395cb11a33" alt="Play UI ClickStack" size="lg" width="3600" height="1918" data-path="images/clickstack/play-ui-clickstack.png" />

        Create a database `logs`:

        ```sql theme={null}
        CREATE DATABASE IF NOT EXISTS logs
        ```

        Create a table for your data.

        ```sql theme={null}
        CREATE TABLE logs.nginx_logs
        (
            `time_local` DateTime,
            `remote_addr` IPv4,
            `remote_user` LowCardinality(String),
            `request` String,
            `status` UInt16,
            `body_bytes_sent` UInt64,
            `http_referer` String,
            `http_user_agent` String,
            `http_x_forwarded_for` LowCardinality(String),
            `request_time` Float32,
            `upstream_response_time` Float32,
            `http_host` String
        )
        ENGINE = MergeTree
        ORDER BY (toStartOfMinute(time_local), status, remote_addr)
        ```

        <Info>
          **Nginx primary key**

          The primary key above assumes typical access patterns in the ClickStack UI for Nginx logs, but may need to be adjusted depending on your workload in production environments.
        </Info>
      </Step>

      <Step>
        <h3 id="copy-vector-configuration-nginx-oss">
          Copy Vector configuration
        </h3>

        Vector should write directly to ClickHouse, bypassing the OTLP endpoint exposed by the ClickStack collector.

        Copy the Vector configuration below into a file named `nginx.yaml`.

        ```yaml theme={null}
        data_dir: ./.vector-data
        sources:
          nginx_logs:
            type: file
            include:
              - access.log
            read_from: beginning

        transforms:
          decode_json:
            type: remap
            inputs:
              - nginx_logs
            source: |
              . = parse_json!(to_string!(.message))
              ts = parse_timestamp!(.time_local, format: "%d/%b/%Y:%H:%M:%S %z")
              # ClickHouse-friendly DateTime format
              .time_local = format_timestamp!(ts, format: "%F %T")

        sinks:
          clickhouse:
            type: clickhouse
            inputs:
              - decode_json
            endpoint: "http://localhost:8123"
            database: logs
            format: json_each_row
            table: nginx_logs
            skip_unknown_fields: true
            auth:
              strategy: "basic"
              user: "api"
              password: "api"
        ```

        <Note>
          The example above uses the `api` user for ClickStack Open Source. For production deployments, we recommend [creating a dedicated ingestion user](/clickstack/ingesting-data/collector#creating-an-ingestion-user) with appropriate permissions and limits. The configuration also assumes that Vector runs on the same host as ClickStack, which is unlikely in production. In that case, we recommend sending data over the secure HTTPS port 8443.
        </Note>
      </Step>

      <Step>
        <h3 id="start-vector-oss-nginx">
          Start Vector
        </h3>

        Start Vector with the following command, creating the data directory first to record file offsets.

        ```bash theme={null}
        mkdir -p ./.vector-data
        vector --config nginx.yaml
        ```
      </Step>

      <Step>
        <h3 id="navigate-to-clickstack-ui-nginx-oss">
          Navigate to the ClickStack UI
        </h3>

        Navigate to the ClickStack UI at [http://localhost:8080](http://localhost:8080). Create a user if you haven't completed the onboarding.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/Wpmp4N2VLv_V8ziJ/images/use-cases/observability/hyperdx-login.png?fit=max&auto=format&n=Wpmp4N2VLv_V8ziJ&q=85&s=a4a7f0f11f4ba3b35b9a6c6613b62f5e" alt="ClickStack login" size="lg" width="3600" height="1900" data-path="images/use-cases/observability/hyperdx-login.png" />
      </Step>

      <Step>
        <h3 id="create-a-datasource-nginx-oss">
          Create a datasource
        </h3>

        Create a logs data source via `Team -> Sources`.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/Y9kcWM6RbYppspJn/images/clickstack/create-vector-datasource-oss.png?fit=max&auto=format&n=Y9kcWM6RbYppspJn&q=85&s=09df9e222c584d4201647b4f10e3f9dc" alt="Create datasource - vector" size="lg" width="3600" height="1940" data-path="images/clickstack/create-vector-datasource-oss.png" />

        The configuration assumes the Nginx schema with a `time_local` column used as the timestamp. This is the timestamp column declared in the primary key. This column is mandatory.

        We have also specified the default select to be `time_local, remote_addr, status, request`, which defines which columns are returned in the logs view.

        In the example above, a `Body` column doesn't exist in the data. Instead, it is defined as the SQL expression:

        ```sql theme={null}
        concat(
          remote_addr, ' ',
          remote_user, ' ',
          '[', formatDateTime(time_local, '%d/%b/%Y:%H:%M:%S %z'), '] ',
          '"', request, '" ',
          toString(status), ' ',
          toString(body_bytes_sent), ' ',
          '"', http_referer, '" ',
          '"', http_user_agent, '" ',
          '"', http_x_forwarded_for, '" ',
          toString(request_time), ' ',
          toString(upstream_response_time), ' ',
          '"', http_host, '"'
        )
        ```

        This reconstructs the log line from the structured fields.

        For other possible options, see the [configuration reference](/clickstack/managing/config).
      </Step>

      <Step>
        <h3 id="explore-the-data-nginx-oss">
          Explore the data
        </h3>

        Navigate to the search view for `October 20th, 2025` to explore the data and begin using ClickStack.

        <Image img="https://mintcdn.com/private-7c7dfe99-fix-nav-issues/zXCQbzXFHfeD9FBK/images/clickstack/nginx-logs-vector-search.png?fit=max&auto=format&n=zXCQbzXFHfeD9FBK&q=85&s=05c303e7bacb5ec7765367bcaed8cd11" alt="HyperDX UI" size="lg" width="3600" height="1906" data-path="images/clickstack/nginx-logs-vector-search.png" />
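
        If the search view appears empty, it can help to confirm which time range the ingested rows cover. This is a minimal sketch, assuming the local defaults used in this guide (`http://localhost:8123` and `api:api`). The `time_range` helper is a hypothetical name used only for illustration.

        ```bash
        # Hypothetical helper: print the earliest timestamp, latest timestamp, and row count of a table.
        time_range() {
          curl -sS --user "api:api" "http://localhost:8123/" \
            --data-binary "SELECT min(time_local), max(time_local), count() FROM $1"
        }

        time_range logs.nginx_logs \
          || echo "query failed: is ClickHouse reachable on localhost:8123?" >&2
        ```

        Set the time picker in the search view to a window that includes the returned timestamps.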
      </Step>
    </Steps>
  </Tab>
</Tabs>
