Hello Diego STUCCHI,
Welcome to Microsoft Q&A, and thank you for posting your question here.
I understand that you would like to upload multiple folders for a remote job using the Azure Machine Learning Python SDK (v2).
When working with the Azure Machine Learning Python SDK (v2) to upload multiple folders for a remote job, there are several methods to consider. Each method has its pros and cons, and it’s important to choose the one that best fits your needs.
Methods to Upload Multiple Folders
- Using a ZIP File
- Using a Custom Docker Image
- Using AzureML Data Assets
- Using the code_paths Parameter
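Best/Optimal Approach: the code_paths Parameter
The code_paths parameter is intended to let you pass multiple folders directly, which avoids the hassle of compressing files or building custom containers. Note that the command() reference documents a single code path, so confirm that your installed azure-ai-ml version accepts code_paths before relying on it. This is how you can implement it: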
    from azure.ai.ml import command, Input

    job = command(
        inputs=dict(
            train_data=Input(type="uri_file", path="path/to/train_data"),
            test_data=Input(type="uri_file", path="path/to/test_data"),
        ),
        code_paths=[
            "path/to/source",
            "path/to/experiment",
            "path/to/utils",
        ],
        command='python path/to/experiment/main.py --train-data ${{inputs.train_data}} --test-data ${{inputs.test_data}}',
        environment="azureml:<your-environment-name>",
        experiment_name='my_experiment',
    )

    # ml_client is assumed to be an already authenticated azure.ai.ml.MLClient
    ml_client.create_or_update(job)
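As a fallback that relies only on the code argument of command(), which takes a single path, one option is to copy the folders into one staging directory and point code at it. The following is a minimal sketch under that assumption; the helper name stage_code_folders and the directory name "staging" are hypothetical and not part of the SDK.

```python
import shutil
from pathlib import Path


def stage_code_folders(folders, staging_dir):
    """Copy each folder into staging_dir, keeping its relative path.

    Hypothetical helper for illustration; not part of azure-ai-ml.
    """
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    for folder in folders:
        src = Path(folder)
        if src.is_absolute():
            raise ValueError(f"Expected a relative path, got: {src}")
        if not src.is_dir():
            raise FileNotFoundError(f"Not a directory: {src}")
        # Keep the relative layout (e.g. path/to/utils) so that script
        # paths used in the job command stay the same after staging.
        shutil.copytree(src, staging / src, dirs_exist_ok=True)
    return staging


# Example usage (paths taken from the answer above):
# staged = stage_code_folders(
#     ["path/to/source", "path/to/experiment", "path/to/utils"], "staging"
# )
# job = command(code=str(staged), command="python path/to/experiment/main.py", ...)
```

Because the relative layout is preserved, the job command can keep referring to path/to/experiment/main.py. Note that dirs_exist_ok requires Python 3.8 or later.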
Using a ZIP File
Should there be any issues with the code_paths parameter, another practical solution is to consolidate your directories into a single ZIP file, which preserves your codebase structure.
Create the ZIP File: Use the following bash command to create a ZIP file containing all your folders:
    zip -r code_archive.zip path/to/source path/to/experiment path/to/utils
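If the zip command is not available (for example on Windows), the same archive can be built from Python with the standard zipfile module. This is a rough equivalent only, assuming relative folder paths; the helper name zip_folders is hypothetical.

```python
import zipfile
from pathlib import Path


def zip_folders(folders, archive_path):
    """Write every file under the given folders into one ZIP archive.

    Hypothetical helper for illustration. Unlike `zip -r`, empty
    directories are skipped because only files are written.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder in folders:
            root = Path(folder)
            if not root.is_dir():
                raise FileNotFoundError(f"Not a directory: {root}")
            for file in sorted(root.rglob("*")):
                if file.is_file():
                    # Store each file under its relative path so the
                    # folder structure is preserved inside the archive.
                    archive.write(file, arcname=file.as_posix())
    return Path(archive_path)


# Example usage (paths taken from the bash command above):
# zip_folders(
#     ["path/to/source", "path/to/experiment", "path/to/utils"],
#     "code_archive.zip",
# )
```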
Modify Your Python Script: Use the ZIP file as the code path in your script:
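    from azure.ai.ml import command, Input

    job = command(
        inputs=dict(
            train_data=Input(type="uri_file", path="path/to/train_data"),
            test_data=Input(type="uri_file", path="path/to/test_data"),
        ),
        code="path/to/code_archive.zip",  # Use the ZIP file as the code path
        # Assumption: the archive may be uploaded as a single file and not unpacked
        # on the compute target. If so, extract it before running the script, e.g.
        # prefix the command with: python -m zipfile -e code_archive.zip . &&
        command='python path/to/experiment/main.py --train-data ${{inputs.train_data}} --test-data ${{inputs.test_data}}',
        environment="azureml:<your-environment-name>",
        experiment_name='my_experiment',
    )

    # ml_client is assumed to be an already authenticated azure.ai.ml.MLClient
    ml_client.create_or_update(job)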
For more details, see the Documentation - https://zcusa.951200.xyz/en-us/azure/machine-learning/how-to-use-azureml-sdk and Azure ML Job Submission Best Practices - https://zcusa.951200.xyz/en-us/azure/machine-learning/how-to-submit-jobs
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
Please don't forget to close the thread by upvoting this reply and accepting it as an answer if it is helpful.