Uploading data with Windows PowerShell
From: Developing big data solutions on Microsoft Azure HDInsight
The Azure module for Windows PowerShell includes a range of cmdlets that you can use to work with Azure services programmatically, including Azure storage. You can run PowerShell scripts interactively at a Windows command prompt or in the PowerShell console. Additionally, you can edit and run PowerShell scripts in the Windows PowerShell Integrated Scripting Environment (ISE), which provides IntelliSense and other user interface enhancements that make it easier to write PowerShell code. You can schedule the execution of PowerShell scripts using Windows Task Scheduler, SQL Server Agent, or other tools as described in Building end-to-end solutions using HDInsight.
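Once the Azure module is installed (as described in the next paragraph), one quick way to see which storage-related cmdlets it provides is to query the module with the Get-Command cmdlet. The name filter shown here is just an illustrative pattern.
# List the Azure storage cmdlets available in the Azure module.
Get-Command -Module Azure -Name *AzureStorage*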
Before you use PowerShell to work with HDInsight, you must configure the PowerShell environment to connect to your Azure subscription. To do this, first download and install the Azure PowerShell module, which is available through the Microsoft Web Platform Installer. For more details, see How to install and configure Azure PowerShell.
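As an example of this configuration step, one approach is to sign in with your Azure credentials and then select the subscription that contains the storage account you want to use. The subscription name shown here is a placeholder.
# Sign in to Azure and choose the subscription that contains the storage account.
Add-AzureAccount
Select-AzureSubscription -SubscriptionName "your-subscription-name"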
To upload data files to the Azure blob store, you can use the Set-AzureStorageBlobContent cmdlet, as shown in the following code example.
# Azure subscription-specific variables.
$storageAccountName = "storage-account-name"
$containerName = "container-name"
# Find the local folder where this PowerShell script is stored.
$currentLocation = Get-location
$thisfolder = Split-Path -Parent $currentLocation
# Upload files in data subfolder to Azure.
$localfolder = "$thisfolder\data"
$destfolder = "data"
$storageAccountKey = (Get-AzureStorageKey -StorageAccountName $storageAccountName).Primary
$blobContext = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
$files = Get-ChildItem $localFolder
foreach($file in $files)
{
$fileName = "$localFolder\$file"
$blobName = "$destfolder/$file"
write-host "copying $fileName to $blobName"
Set-AzureStorageBlobContent -File $filename -Container $containerName -Blob $blobName -Context $blobContext -Force
}
write-host "All files in $localFolder uploaded to $containerName!"
Note that the code uses the New-AzureStorageContext cmdlet to create a context for the Azure storage account where the files are to be uploaded. This context requires the access key for the storage account, which is obtained using the Get-AzureStorageKey cmdlet. Authentication to obtain the key is based on the credentials or certificate used to connect the local PowerShell environment with the Azure subscription.
The code shown above also iterates over all of the files to be uploaded and uses the Set-AzureStorageBlobContent cmdlet to upload each one in turn, so that each file can be stored under a specific path that includes the destination folder name. If the files you need to upload are already in a folder structure that matches the required target paths, you can use the following code to upload them all in one operation instead of iterating over them in your PowerShell script.
cd [root-data-folder]
ls -Recurse -Path $localFolder | Set-AzureStorageBlobContent -Container $containerName -Context $blobContext
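Whichever approach you use, you can optionally confirm that the upload succeeded by listing the blobs in the target container with the Get-AzureStorageBlob cmdlet, as in the following example.
# List the blobs in the container to verify the transfer.
Get-AzureStorageBlob -Container $containerName -Context $blobContext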