Query SQL Server with Azure Databricks

11/22/2024

This article shows how to connect Azure Databricks to Microsoft SQL Server to read and write data.

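Important

The configurations described in this article are Experimental. Experimental features are provided as-is and are not supported by Databricks through customer technical support. To get full query federation support, use Lakehouse Federation instead, which lets your Azure Databricks users take advantage of Unity Catalog syntax and data governance tools.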

Configure a connection to SQL Server

In Databricks Runtime 11.3 LTS and above, you can use the sqlserver keyword to connect to SQL Server with the included driver. When working with DataFrames, use the following syntax:

Python

remote_table = (spark.read
  .format("sqlserver")
  .option("host", "hostName")
  .option("port", "port") # optional, can use default port 1433 if omitted
  .option("user", "username")
  .option("password", "password")
  .option("database", "databaseName")
  .option("dbtable", "schemaName.tableName") # (if schemaName not provided, default to "dbo")
  .load()
)

Scala

val remote_table = spark.read
  .format("sqlserver")
  .option("host", "hostName")
  .option("port", "port") // optional, can use default port 1433 if omitted
  .option("user", "username")
  .option("password", "password")
  .option("database", "databaseName")
  .option("dbtable", "schemaName.tableName") // (if schemaName not provided, default to "dbo")
  .load()
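
The options above can also be collected in one place so that the defaults mentioned in the comments (port 1433 and the dbo schema) are handled consistently. The following is a minimal sketch, not part of any Databricks API: the helper name sqlserver_options and its parameters are hypothetical, and it only assembles the option keys shown in the examples above.

```python
def sqlserver_options(host, user, password, database, table, schema=None, port=None):
    """Build the option dict for the "sqlserver" format.

    Hypothetical helper for illustration; only the option keys
    (host, port, user, password, database, dbtable) come from the article.
    """
    options = {
        "host": host,
        "user": user,
        "password": password,
        "database": database,
        # Qualify the table only when a schema is given; per the article,
        # an unqualified table name defaults to the "dbo" schema.
        "dbtable": f"{schema}.{table}" if schema else table,
    }
    # The port is optional; per the article, 1433 is used when it is omitted.
    if port is not None:
        options["port"] = str(port)
    return options


opts = sqlserver_options(
    "hostName", "username", "password", "databaseName",
    "tableName", schema="schemaName",
)

# In a Databricks notebook, where `spark` is predefined:
# remote_table = spark.read.format("sqlserver").options(**opts).load()
```

Keeping the options in a plain dict means the same values can be reused for several tables by changing only the table argument.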

When working with SQL, specify sqlserver in the USING clause and pass the options when you create the table, as shown in the following example:

DROP TABLE IF EXISTS sqlserver_table;
CREATE TABLE sqlserver_table
USING sqlserver
OPTIONS (
  dbtable '<schema-name.table-name>',
  host '<host-name>',
  port '1433',
  database '<database-name>',
  user '<username>',
  password '<password>'
);

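Use the legacy JDBC driver

In Databricks Runtime 10.4 LTS and below, you must specify the driver and configurations using the JDBC settings. The following example queries SQL Server using its JDBC driver. For details on reading, writing, configuring parallelism, and query pushdown, see Query databases using JDBC.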

Python

driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"

database_host = "<database-host-url>"
database_port = "1433" # update if you use a non-default port
database_name = "<database-name>"
table = "<table-name>"
user = "<username>"
password = "<password>"

url = f"jdbc:sqlserver://{database_host}:{database_port};database={database_name}"

remote_table = (spark.read
  .format("jdbc")
  .option("driver", driver)
  .option("url", url)
  .option("dbtable", table)
  .option("user", user)
  .option("password", password)
  .load()
)

Scala

val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"

val database_host = "<database-host-url>"
val database_port = "1433" // update if you use a non-default port
val database_name = "<database-name>"
val table = "<table-name>"
val user = "<username>"
val password = "<password>"

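// The s-interpolator needs a leading $ on each placeholder; without it the braces are kept literally.
val url = s"jdbc:sqlserver://${database_host}:${database_port};database=${database_name}"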

val remote_table = spark.read
  .format("jdbc")
  .option("driver", driver)
  .option("url", url)
  .option("dbtable", table)
  .option("user", user)
  .option("password", password)
  .load()
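
The JDBC URL in the examples above follows the pattern jdbc:sqlserver://host:port;database=name, and further connection properties can be appended as ;key=value pairs. As a rough Python sketch of that pattern, the hypothetical helper below (sqlserver_jdbc_url is not a Databricks or driver API) builds the URL and passes extra properties through unchecked. The encrypt and loginTimeout properties used in the example are assumed to be accepted by your version of the Microsoft JDBC driver; check the driver documentation before relying on them.

```python
def sqlserver_jdbc_url(host, database, port=1433, **properties):
    """Assemble a SQL Server JDBC URL in the same shape as the article's example.

    Hypothetical helper for illustration. Extra keyword arguments are appended
    as ;key=value connection properties without validation.
    """
    url = f"jdbc:sqlserver://{host}:{port};database={database}"
    for key, value in properties.items():
        url += f";{key}={value}"
    return url


# Same URL as the article's example, using the default port.
url = sqlserver_jdbc_url("<database-host-url>", "<database-name>")

# With additional connection properties (assumed to be supported by the driver).
secure_url = sqlserver_jdbc_url(
    "<database-host-url>", "<database-name>",
    encrypt="true", loginTimeout=30,
)

# In a Databricks notebook, pass the result as the "url" option:
# spark.read.format("jdbc").option("url", url) ...
```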
