Transform data with Spark in Azure Synapse Analytics
Data engineers commonly need to transform large volumes of data. Apache Spark pools in Azure Synapse Analytics provide a distributed processing platform that they can use to accomplish this goal.
Learning objectives
In this module, you will learn how to:
- Use Apache Spark to modify and save dataframes
- Partition data files for improved performance and scalability.
- Transform data with SQL
Prerequisites
Before taking this module, you should be familiar with Apache Spark pools in Azure Synapse Analytics. Consider completing the Analyze data with Apache Spark in Azure Synapse Analytics module first.