
ADF dataflow not writing blob but no error

Moyer, Todd 85 Reputation points
7 Jan 2025, 4:25 pm

We have an Azure Data Factory dataflow that is supposed to write a single blob, with the path and name taken from a data column. This works in some parts of our system but appears to have a sneaky bug: the dataflow runs and reports that thousands of rows were written successfully, yet the blob/file is nowhere in the storage container. (I'm searching in Storage Explorer with the filter in Flat list mode, sorted by Last Modified date.)

The path and file name are passed in as pipeline expressions. The problem seems to be sensitive to underscores and slashes ("_", "/") in the path. The path and file name are concatenated as follows:

derive(fileName = concat("/", $sinkPath, "/", $sinkFileName))

This would be a lot easier to debug if the log had more info on where the dataflow sink was actually written. And if it isn't able to write, it should not report "success".
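For reference, here is a minimal sketch (azure-storage-blob Python SDK; the connection string and container name are placeholders, not values from our pipeline) of listing the container's most recently modified blobs, to confirm the file really isn't sitting under some unexpected prefix:

# Minimal sketch: list the most recently modified blobs in the container to see
# where (if anywhere) the dataflow sink actually wrote. The connection string
# and container name are placeholders.
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    "<your-storage-connection-string>", "<your-container>"
)

# Sort all blobs by last-modified time, newest first, and print the top few.
blobs = sorted(container.list_blobs(), key=lambda b: b.last_modified, reverse=True)
for blob in blobs[:20]:
    print(blob.last_modified, blob.name, blob.size)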


2 answers

  1. Moyer, Todd 85 Reputation points
    9 Jan 2025, 4:08 pm

    The attached pipeline isolates the problem. The leading slash and underscores are not the cause. It looks like having a dataflow where the source and sink use the same path is the cause. This is confusing because I thought the path was just part of the blob name and had little significance otherwise.

    To run:

    1. Upload test.csv to <your storage>/<your container>/test_in/test.csv
    2. You will need to swap in your own Azure blob storage linked service for the datasets and dataflow.
    3. Swap in your container name for the datasets and the pipeline.
    4. Run the pipeline.

    Expected result: The pipeline says it ran successfully and wrote a record, but <your storage>/<your container>/test_in/out.csv was not written.

    To show that the problem is related to the sink path being the same as the source path...

    1. Go into sink1 in the dataflow. Change the Column data from fileOutSamePath to fileOutDiffPath.
    2. Rerun.

    New expected result: <your storage>/<your container>/test_out/out.csv was written.
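
    If it helps to automate that check between runs, a small sketch along these lines (azure-storage-blob Python SDK; the connection string and container name are placeholders) reports which of the two output blobs was actually written:

    # Minimal sketch: check whether the sink wrote out.csv next to the source
    # (test_in/) or to the separate folder (test_out/). The connection string
    # and container name are placeholders.
    from azure.storage.blob import ContainerClient

    container = ContainerClient.from_connection_string(
        "<your-storage-connection-string>", "<your-container>"
    )

    for path in ("test_in/out.csv", "test_out/out.csv"):
        exists = container.get_blob_client(path).exists()
        print(path, "written" if exists else "missing")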

    PL_test.json.txt

    testDf.json.txt

    testCsv.json.txt

    generic_csv_internal.json.txt

    test.csv.txt


  2. Vinodh247 27,106 Reputation points MVP
    9 Jan 2025, 12:38 am

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    Focus on removing the leading slash and validating how slashes are concatenated. Also, ensure you’re checking logs (both in the Data Flow Debug output and ADF Monitor) to confirm the exact path. If the sink can’t write due to an invalid path, it’ll sometimes fail silently but still report “success” rows in the row count.

    Potential causes:

    1. Leading slash issue: prepending "/" in concat("/", $sinkPath, "/", $sinkFileName) can produce an unexpected path. If the dataflow sink interprets the leading slash as an absolute path, it could be "missing" your intended container/folder structure (see the sketch after this list).
    2. Special characters in the path: underscores and slashes in the path can be mishandled depending on how the mapping dataflow sink is configured or how the underlying file system interprets them.
    3. Silent failures: dataflows sometimes write "success" rows to the logs but never produce a physical file if the path or the sink configuration is invalid.
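
    To illustrate the leading-slash point, here is a small Python sketch (an illustration only, not ADF expression code; the sample values are made up) of how the current concatenation yields a path starting with "/", and how trimming and collapsing slashes before the value reaches the sink avoids that:

    # Illustration only: mirrors concat("/", $sinkPath, "/", $sinkFileName) in
    # Python to show the leading slash, plus a normalization that strips it and
    # collapses any doubled slashes. The sink path/file values are made up.
    sink_path = "test_out"
    sink_file_name = "out.csv"

    current = "/" + sink_path + "/" + sink_file_name  # "/test_out/out.csv"

    def join_blob_path(*parts: str) -> str:
        # Join path segments, dropping empty segments and any leading slash.
        segments = [s for part in parts for s in part.split("/") if s]
        return "/".join(segments)

    suggested = join_blob_path(sink_path, sink_file_name)  # "test_out/out.csv"

    print(current)
    print(suggested)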

    Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.

