New-AzureHDInsightSqoopJobDefinition
Defines a new Sqoop job.
Note
The cmdlets referenced in this documentation are for managing legacy Azure resources that use Azure Service Manager (ASM) APIs. This legacy PowerShell module isn't recommended when creating new resources since ASM is scheduled for retirement. For more information, see Azure Service Manager retirement.
The Az PowerShell module is the recommended PowerShell module for managing Azure Resource Manager (ARM) resources with PowerShell.
Syntax
New-AzureHDInsightSqoopJobDefinition
[-Command <String>]
[-File <String>]
[-Files <String[]>]
[-StatusFolder <String>]
[-Profile <AzureSMProfile>]
[<CommonParameters>]
Description
This version of Azure PowerShell HDInsight is deprecated. These cmdlets will be removed by January 1, 2017. Please use the newer version of Azure PowerShell HDInsight.
For information about how to use the new HDInsight to create a cluster, see Create Linux-based clusters in HDInsight using Azure PowerShell (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-create-linux-clusters-azure-powershell/). For information about how to submit jobs by using Azure PowerShell and other approaches, see Submit Hadoop jobs in HDInsight (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-submit-hadoop-jobs-programmatically/). For reference information about Azure PowerShell HDInsight, see Azure HDInsight Cmdlets.
The New-AzureHDInsightSqoopJobDefinition cmdlet creates a Sqoop job to run on an Azure HDInsight cluster.
Sqoop is a tool to transfer data between Hadoop clusters and relational databases. You can use Sqoop to import data from a SQL Server database to a Hadoop Distributed File System (HDFS), transform the data with Hadoop MapReduce, and then export the data from the HDFS back to the SQL Server database.
Examples
Example 1: Import data
PS C:\>$SqoopJobDef = New-AzureHDInsightSqoopJobDefinition -Command "import --connect jdbc:sqlserver://<SQLDatabaseServerName>.database.windows.net:1433;username=<SQLDatabasUsername>@<SQLDatabaseServerName>; password=<SQLDatabasePassword>; database=<SQLDatabaseDatabaseName> --table <TableName> --target-dir wasb://<ContainerName>@<WindowsAzureStorageAccountName>.blob.core.windows.net/<Path>"
This command defines a Sqoop job that imports all of the rows in a table from an AzureSQL Server database to an HDInsight cluster, and then stores the job definition in the $SqoopJobDef variable.
Parameters
-Command
Specifies a Sqoop command and its arguments.
Type: | String |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-File
Specifies the path to a script file that contains the commands to run. The script file must be located on WASB.
Type: | String |
Aliases: | QueryFile |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-Files
Specifies the collection of WASB files that are required for a job.
Type: | String[] |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-Profile
Specifies the Azure profile from which this cmdlet reads. If you do not specify a profile, this cmdlet reads from the local default profile.
Type: | AzureSMProfile |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-StatusFolder
Specifies the location of the folder that contains standard outputs and error outputs for a job, including its exit code and task logs.
Type: | String |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |