In the current Azure Document Intelligence workflow, you cannot train custom models from purely local files. Unlike the “test/try” flow (which supports uploading local files on the fly for inference), the training process requires an Azure Blob Storage location. That is why Document Intelligence Studio requires you to specify a Blob container that holds your training files.
What if you don’t want to keep files in Blob Storage long-term?
A common workaround is to use a temporary container in an Azure Storage account. You can upload your on-premises training files to this container just before training and then remove them afterward. Here’s a suggested approach:
- Create a temporary container in Azure Storage.
- Upload your training files from your on-premises location to the container.
- Generate a SAS URL for that container, granting read and list permissions so the Document Intelligence service can enumerate and read your files (the storage-side steps are sketched in code after this list).
- Point Document Intelligence Studio (or the API/SDK) at that container via its SAS URL for training (the SDK call is sketched further below).
- Delete or clean up the container/files once training is finished, if you don’t need to keep them online.
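If you’d rather script the storage side than click through the portal, here is a minimal sketch using the azure-storage-blob Python package. The account URL, account key, container name, and local folder are placeholders for your own values, and the SAS is deliberately short-lived and limited to read/list:

```python
import os
from datetime import datetime, timedelta, timezone

from azure.storage.blob import (
    BlobServiceClient,
    ContainerSasPermissions,
    generate_container_sas,
)

ACCOUNT_URL = "https://<your-account>.blob.core.windows.net"  # placeholder
ACCOUNT_KEY = os.environ["AZURE_STORAGE_ACCOUNT_KEY"]         # placeholder
CONTAINER_NAME = "di-training-temp"                           # placeholder
LOCAL_TRAINING_DIR = r"C:\training-files"                     # placeholder

service = BlobServiceClient(account_url=ACCOUNT_URL, credential=ACCOUNT_KEY)

# 1. Create the temporary container (raises if it already exists).
container = service.create_container(CONTAINER_NAME)

# 2. Upload every file from the local training folder.
for file_name in os.listdir(LOCAL_TRAINING_DIR):
    path = os.path.join(LOCAL_TRAINING_DIR, file_name)
    if os.path.isfile(path):
        with open(path, "rb") as data:
            container.upload_blob(name=file_name, data=data, overwrite=True)

# 3. Generate a short-lived container SAS with read + list permissions.
sas_token = generate_container_sas(
    account_name=service.account_name,
    container_name=CONTAINER_NAME,
    account_key=ACCOUNT_KEY,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=4),
)
container_sas_url = f"{ACCOUNT_URL}/{CONTAINER_NAME}?{sas_token}"
print(container_sas_url)  # paste into Document Intelligence Studio or pass to the SDK

# 4. After training has finished, remove the temporary container.
# service.delete_container(CONTAINER_NAME)
```

Keeping the SAS expiry short (a few hours is usually enough for a training run) limits exposure if the URL leaks, and read/list is all the training service needs.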
By doing this, you keep your training documents in Blob Storage only for as long as you need them, and you avoid persisting files in the cloud any longer than necessary. Unfortunately, there isn’t a direct local-file-based training mechanism right now; Blob Storage is still required for the service to access your training documents.
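For the training call itself, a minimal sketch with the azure-ai-documentintelligence Python package could look like the following. The endpoint, key, model ID, and SAS URL are placeholders, and the class names assume that package; the older azure-ai-formrecognizer SDK uses slightly different names (DocumentModelAdministrationClient and a blob_container_url keyword on begin_build_document_model).

```python
from azure.ai.documentintelligence import DocumentIntelligenceAdministrationClient
from azure.ai.documentintelligence.models import (
    AzureBlobContentSource,
    BuildDocumentModelRequest,
    DocumentBuildMode,
)
from azure.core.credentials import AzureKeyCredential

DI_ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
DI_KEY = "<your-document-intelligence-key>"                           # placeholder
container_sas_url = "<container SAS URL from the previous step>"      # placeholder

admin_client = DocumentIntelligenceAdministrationClient(
    endpoint=DI_ENDPOINT, credential=AzureKeyCredential(DI_KEY)
)

# Kick off custom model training against the temporary container.
poller = admin_client.begin_build_document_model(
    BuildDocumentModelRequest(
        model_id="my-temp-trained-model",        # placeholder model ID
        build_mode=DocumentBuildMode.TEMPLATE,   # or DocumentBuildMode.NEURAL
        azure_blob_source=AzureBlobContentSource(container_url=container_sas_url),
    )
)
model = poller.result()
print(f"Trained model: {model.model_id}")
```

Once poller.result() returns, the trained model is stored in the Document Intelligence resource itself, so the temporary container and its files can be deleted without affecting the model.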