Getting different results each time I query a global temporary table (transform query)

PRADEEP KUMAR Namani 0 Reputation points
2024-11-25T10:03:05.2566667+00:00

Hi Team,

We are running some logic in an Azure Databricks notebook using PySpark. First, we read data from ADLS and load it into a global temporary table to perform data quality checks. We then recreate the same temporary table. Next, we use these temporary tables in Spark SQL transformations, execute the query, and load the results into another global temporary table. Finally, the result set is written back to ADLS.

Problem: When reading the transformed data, one of the column values is inconsistent. Each time we query the data, the column alternates between returning null and the expected value.

The screenshots below show two runs of the same query.

RUN 1: WrittenAmount is returned as null.

[Screenshot: RUN 1 output]

RUN 2: WrittenAmount is returned with a value.

[Screenshot: RUN 2 output]

Please help me with this. Thank you in advance.

Azure Databricks

1 answer

  1. Chandra Boorla 6,460 Reputation points Microsoft Vendor
    2025-01-02T10:38:59.9033333+00:00

    @PRADEEP KUMAR Namani

    I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Issue:

    We are running some logic in an Azure Databricks notebook using PySpark. First, we read data from ADLS and load it into a global temporary table to perform data quality checks. We then recreate the same temporary table. Next, we use these temporary tables in Spark SQL transformations, execute the query, and load the results into another global temporary table. Finally, the result set is written back to ADLS.

    Problem: When reading the transformed data, one of the column values is inconsistent. Each time we query the data, the column alternates between returning null and the expected value.

    Solution:

    To address the inconsistent data, I write the data out immediately after reading it, and then read the written data back before continuing with the rest of the logic. This resolved the problem.
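
    The write-then-read-back step described above can be sketched as follows. This is only an illustration under assumptions that are not stated in the post: the helper name `materialize_and_reload`, the `staging_path` and `view_name` parameters, and the Parquet format are all hypothetical. A plausible reason the approach helps is that Spark evaluates DataFrames and temporary views lazily, so each query may recompute the original read; writing the data out and reloading it gives later queries a fixed copy to work from.

    ```python
    from typing import TYPE_CHECKING

    if TYPE_CHECKING:  # imported for type hints only
        from pyspark.sql import DataFrame, SparkSession


    def materialize_and_reload(
        spark: "SparkSession",
        df: "DataFrame",
        staging_path: str,  # hypothetical location for the intermediate copy
        view_name: str,     # hypothetical global temp view name
    ) -> "DataFrame":
        """Write df to storage, read it back, and register the reloaded copy."""
        # Write: forces Spark to compute the DataFrame once and persist the result.
        # Assumption: staging_path differs from the path df was originally read
        # from, and Parquet is an acceptable intermediate format.
        df.write.mode("overwrite").parquet(staging_path)

        # Read: the reloaded DataFrame depends only on the files just written,
        # not on the original read or on any earlier temp view definition.
        reloaded = spark.read.parquet(staging_path)

        # Register the reloaded copy for the downstream Spark SQL transformations.
        reloaded.createOrReplaceGlobalTempView(view_name)
        return reloaded
    ```

    In a notebook, this might be called as `materialize_and_reload(spark, source_df, staging_path, "source_checked")`, after which Spark SQL would reference the view as `global_temp.source_checked`. The view name here is only a placeholder.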

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    Hope this helps.

    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further queries, do let us know.

