共用方式為


WATERMARK 子句

適用於:核取記號為「是」 Databricks SQL 核取記號為「是」 Databricks Runtime 12.0 和更新版本

在 select 語句中新增 watermark 作為一個關聯。 子 WATERMARK 句僅適用於具狀態串流數據的查詢,其中包括數據流聯結和匯總。

語法

from_item
{ table_name [ TABLESAMPLE clause ] [ watermark_clause ] [ table_alias ] |
  JOIN clause |
  [ LATERAL ] table_valued_function [ table_alias ] |
  VALUE clause |
  [ LATERAL ] ( query ) [ TABLESAMPLE clause ] [ watermark_clause ] [ table_alias ] }

watermark_clause
  WATERMARK named_expression DELAY OF interval

Parameters

  • named_expression

    提供型 timestamp別 值的表達式。 表達式必須是對現有 column的參考,或者是對現有 column進行的決定性轉換。 表示式會新增時間戳類型的 column,用來追蹤 watermark。 新增的 column 可供查詢。

  • interval_clause

    定義 watermark延遲閾值的間隔常值。 必須是小於一個月的正值。

範例

-- Creating a streaming table performing time window row count, with defining watermark from existing column
> CREATE OR REFRESH STREAMING TABLE window_agg_1
  AS SELECT window(ts, '10 seconds') as w, count(*) as CNT
  FROM
  STREAM stream_source WATERMARK ts DELAY OF INTERVAL 10 SECONDS AS stream
  GROUP BY window(ts, '10 seconds');

-- Creating a streaming table performing time window row count, with deriving a new timestamp column to define watermark
> CREATE OR REFRESH STREAMING TABLE window_agg_2
  AS SELECT window(ts, '10 seconds') as w, count(*) as CNT
  FROM
  STREAM stream_source WATERMARK to_timestamp(ts_str) AS ts DELAY OF INTERVAL 10 SECONDS AS stream
  GROUP BY window(ts, '10 seconds');