Configure dataflow endpoints for Azure Data Explorer

Important

Azure IoT Operations Preview – enabled by Azure Arc is currently in preview. You shouldn't use this preview software in production environments.

You'll need to deploy a new Azure IoT Operations installation when a generally available release becomes available. You won't be able to upgrade a preview installation.

For legal terms that apply to Azure features that are in beta, in preview, or otherwise not yet released into general availability, see the Supplemental Terms of Use for Microsoft Azure Previews.

To send data to Azure Data Explorer in Azure IoT Operations Preview, you can configure a dataflow endpoint. This configuration allows you to specify the destination endpoint, authentication method, table, and other settings.

Prerequisites

Create an Azure Data Explorer database

  1. In the Azure portal, create a database in your Azure Data Explorer full cluster.

  2. Create a table in your database for the data. You can use the Azure portal and create columns manually, or you can use KQL in the query tab. For example, to create a table for sample thermostat data, run the following command:

    .create table thermostat (
        externalAssetId: string,
        assetName: string,
        CurrentTemperature: real,
        Pressure: real,
        MqttTopic: string,
        Timestamp: datetime
    )
    
  3. Enable streaming ingestion on your table and database. In the query tab, run the following command, substituting <DATABASE_NAME> with your database name:

    .alter database ['<DATABASE_NAME>'] policy streamingingestion enable
    

    Alternatively, enable streaming ingestion on the entire cluster. See Enable streaming ingestion on an existing cluster.

  4. In the Azure portal, go to the Arc-connected Kubernetes cluster and select Settings > Extensions. In the extension list, find your Azure IoT Operations extension and copy its name.

  5. In your Azure Data Explorer database (not the cluster), under Overview, select Permissions > Add > Ingestor. Search for the Azure IoT Operations extension name, then add it.

Create an Azure Data Explorer dataflow endpoint

Create the dataflow endpoint resource with your cluster and database information. We suggest using the managed identity of the Azure Arc-enabled Kubernetes cluster. This approach is secure and eliminates the need for secret management. Replace the placeholder values like <ENDPOINT_NAME> with your own.

  1. In the operations experience, select the Dataflow endpoints tab.

  2. Under Create new dataflow endpoint, select Azure Data Explorer > New.

    Screenshot using operations experience to create an Azure Data Explorer dataflow endpoint.

  3. Enter the following settings for the endpoint:

    Setting | Description
    Name | The name of the dataflow endpoint.
    Host | The hostname of the Azure Data Explorer endpoint in the format <cluster>.<region>.kusto.windows.net.
    Authentication method | The method used for authentication. Choose System assigned managed identity or User assigned managed identity.
    Client ID | The client ID of the user-assigned managed identity. Required if using User assigned managed identity.
    Tenant ID | The tenant ID of the user-assigned managed identity. Required if using User assigned managed identity.
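
The same endpoint can also be described as a Kubernetes manifest. The following is a minimal sketch, assuming the preview connectivity.iotoperations.azure.com/v1beta1 schema for the DataflowEndpoint resource and the default azure-iot-operations namespace. The API version, namespace, and field names aren't stated in this article, so check them against the custom resource definition installed on your cluster. <ENDPOINT_NAME>, <CLUSTER>, <REGION>, and <DATABASE_NAME> are placeholders.

```yaml
# Sketch only: apiVersion, namespace, and field names are assumptions
# based on the preview DataflowEndpoint schema. Verify before use.
apiVersion: connectivity.iotoperations.azure.com/v1beta1
kind: DataflowEndpoint
metadata:
  name: <ENDPOINT_NAME>
  namespace: azure-iot-operations
spec:
  endpointType: DataExplorer
  dataExplorerSettings:
    # Cluster URI. The https:// prefix is an assumption here.
    host: https://<CLUSTER>.<REGION>.kusto.windows.net
    # Database that contains the target table.
    database: <DATABASE_NAME>
    authentication:
      # Uses the system-assigned managed identity, as suggested above.
      method: SystemAssignedManagedIdentity
      systemAssignedManagedIdentitySettings: {}
```

If you save the manifest to a file, for example adx-endpoint.yaml (a hypothetical file name), running kubectl apply -f adx-endpoint.yaml should create the resource.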

Available authentication methods

The following authentication methods are available for Azure Data Explorer endpoints. For more information about enabling secure settings by configuring an Azure Key Vault and enabling workload identities, see Enable secure settings in Azure IoT Operations Preview deployment.

Permissions

To use these authentication methods, the Azure IoT Operations Arc extension must be given Ingestor permission on the Azure Data Explorer database. For more information, see Manage Azure Data Explorer database permissions.

System-assigned managed identity

Using the system-assigned managed identity is the recommended authentication method for Azure IoT Operations. Azure IoT Operations creates the managed identity automatically and assigns it to the Azure Arc-enabled Kubernetes cluster. It eliminates the need for secret management and allows for seamless authentication.

In the DataflowEndpoint resource, specify the managed identity authentication method. In most cases, you don't need to specify other settings. This configuration creates a managed identity with the default audience https://api.kusto.windows.net.

In the operations experience dataflow endpoint settings page, select the Basic tab then choose Authentication method > System assigned managed identity.

If you need to override the system-assigned managed identity audience, you can specify the audience setting. In most cases, you don't need to: when no audience is specified, the default audience https://api.kusto.windows.net is used.
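
As a rough illustration of the audience override in a Kubernetes manifest, the fragment below shows only the authentication part of the endpoint spec. It's a sketch: the field names assume the preview DataflowEndpoint schema, and <AUDIENCE_URL> is a placeholder.

```yaml
# Fragment of spec.dataExplorerSettings. Field names are assumptions
# based on the preview DataflowEndpoint schema. Verify against your CRD.
dataExplorerSettings:
  authentication:
    method: SystemAssignedManagedIdentity
    systemAssignedManagedIdentitySettings:
      # Optional. Omit this line to keep the default audience,
      # https://api.kusto.windows.net.
      audience: https://<AUDIENCE_URL>
```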

User-assigned managed identity

To use a user-assigned managed identity for authentication, you must first deploy Azure IoT Operations with secure settings enabled. To learn more, see Enable secure settings in Azure IoT Operations Preview deployment.

Then, specify the user-assigned managed identity authentication method along with the client ID, tenant ID, and scope of the managed identity.

In the operations experience dataflow endpoint settings page, select the Basic tab then choose Authentication method > User assigned managed identity.

Enter the client ID and tenant ID of the user-assigned managed identity in the appropriate fields.

Here, the scope is optional and defaults to https://api.kusto.windows.net/.default. If you need to override the default scope, specify the scope setting via Bicep or Kubernetes.
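
For the Kubernetes route, a user-assigned managed identity configuration might look like the following fragment. This is a sketch, not a confirmed schema: the field names assume the preview DataflowEndpoint resource, and <CLIENT_ID>, <TENANT_ID>, and <SCOPE_URL> are placeholders.

```yaml
# Fragment of spec.dataExplorerSettings. Field names are assumptions
# based on the preview DataflowEndpoint schema. Verify against your CRD.
dataExplorerSettings:
  authentication:
    method: UserAssignedManagedIdentity
    userAssignedManagedIdentitySettings:
      clientId: <CLIENT_ID>
      tenantId: <TENANT_ID>
      # Optional. Uncomment only to override the default scope,
      # https://api.kusto.windows.net/.default.
      # scope: https://<SCOPE_URL>
```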

Advanced settings

You can set advanced settings for the Azure Data Explorer endpoint, such as the batching latency and message count.

Use the batching settings to configure the maximum number of messages and the maximum latency before the messages are sent to the destination. This setting is useful when you want to optimize for network bandwidth and reduce the number of requests to the destination.

Field | Description | Required
latencySeconds | The maximum number of seconds to wait before sending the messages to the destination. The default value is 60 seconds. | No
maxMessages | The maximum number of messages to accumulate before sending them to the destination. The default value is 100000 messages. | No

For example, to set the maximum number of messages to 1000 and the maximum latency to 100 seconds, use the following settings:

In the operations experience, select the Advanced tab for the dataflow endpoint.

Screenshot using operations experience to set Azure Data Explorer advanced settings.
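
In a Kubernetes manifest, the same example values (1000 messages, 100 seconds) could be expressed as shown below. The latencySeconds and maxMessages names come from the table above; the enclosing batching and dataExplorerSettings keys are assumptions based on the preview DataflowEndpoint schema.

```yaml
# Fragment of spec.dataExplorerSettings. The batching key is an
# assumption based on the preview schema. Verify against your CRD.
dataExplorerSettings:
  batching:
    # Send a batch after at most 100 seconds...
    latencySeconds: 100
    # ...or as soon as 1000 messages have accumulated.
    maxMessages: 1000
```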

Next steps

To learn more about dataflows, see Create a dataflow.