Condividi tramite


The PIVOT Operator

In my next few posts, I'm going to look at how SQL Server implements the PIVOT and UNPIVOT operators.  Let's begin with the PIVOT operator.  The PIVOT operator takes a normalized table and transforms it into a new table where the columns of the new table are derived from the values in the original table.  For example, suppose we want to store annual sales data by employee.  We might create a schema such as the following:

CREATE TABLE Sales (EmpId INT, Yr INT, Sales MONEY)
INSERT Sales VALUES(1, 2005, 12000)
INSERT Sales VALUES(1, 2006, 18000)
INSERT Sales VALUES(1, 2007, 25000)
INSERT Sales VALUES(2, 2005, 15000)
INSERT Sales VALUES(2, 2006, 6000)
INSERT Sales VALUES(3, 2006, 20000)
INSERT Sales VALUES(3, 2007, 24000)

Notice that this schema has one row per employee per year.  Moreover, notice that in the sample data employees 2 and 3 only have sales data for two of the three years worth of data.  Now suppose that we'd like to transform this data into a table that has one row per employee with all three years of sales data in each row.  We can achieve this conversion very easily using PIVOT:

SELECT EmpId, [2005], [2006], [2007]
FROM (SELECT EmpId, Yr, Sales FROM Sales) AS s
PIVOT (SUM(Sales) FOR Yr IN ([2005], [2006], [2007])) AS p

I'm not going to delve into the PIVOT syntax which is already documented in Books Online.  Suffice it to say that this statement sums up the sales for each employee for each of the specified years and outputs one row per employee.  The resulting output is:

 EmpId       2005                  2006                  2007
----------- --------------------- --------------------- ---------------------
1           12000.00              18000.00              25000.00
2           15000.00              6000.00               NULL
3           NULL                  20000.00              24000.00

Notice that SQL Server inserts NULLs for the missing sales data for employees 2 and 3.

The SUM keyword (or some other aggregate) is required.  If the Sales table includes multiple rows for a particular employee for a particular year, PIVOT does aggregate them - in this case by summing them -  into a single data point in the result.  Of course, in this example, since the entry in each "cell" of the output table is the result of summing a single input row, we could just as easily have used another aggregate such as MIN or MAX.  I've used SUM since it is more intuitive.

This PIVOT example is reversible.  The information in the output table can be used to reconstruct the original input table using an UNPIVOT operation (which I will cover in a later post).  However, not all PIVOT operations are reversible.  To be reversible, a PIVOT operation must meet the following criteria:

  • All of the input data must be transformed. If we include a filter of any kind including on the IN clause, some data may be omitted from the PIVOT result. For example, if we altered the above example only to output sales for 2006 and 2007, clearly we could not reconstruct the 2005 sales data from the result.
  • Each cell in the output table must derive from a single input row. If multiple input rows are aggregated into a single cell, there is no way to reconstruct the original input rows.
  • The aggregate function must be an identity function (when used on a single input row). SUM, MIN, MAX, and AVG all return the single input value unchanged and, thus, can be reversed. COUNT does not return its input value unchanged and, thus, cannot be reversed.

Here is an example of a non-reversible PIVOT operation.  This example, calculates the total sales for all employees for all three years.  It does not itemize the output by employee.

SELECT [2005], [2006], [2007]
FROM (SELECT Yr, Sales FROM Sales) AS s
PIVOT (SUM(Sales) FOR Yr IN ([2005], [2006], [2007])) AS p

Here is the output.  Each cell represents the sum of two or three rows from the input table.

 2005                  2006                  2007
--------------------- --------------------- ---------------------
27000.00              44000.00              49000.00

In my next post, I'll look at some example PIVOT query plans.

Comments

  • Anonymous
    July 09, 2007
    In my last post , I gave an overview of the PIVOT operator. In this post, I'm going to take a look at

  • Anonymous
    July 09, 2007
    In my last post , I gave an overview of the PIVOT operator. In this post, I'm going to take a look

  • Anonymous
    July 17, 2007
    The UNPIVOT operator is the opposite of the PIVOT operator. As I explained in my earlier post , the PIVOT

  • Anonymous
    July 17, 2007
    The UNPIVOT operator is the opposite of the PIVOT operator. As I explained in my earlier post , the PIVOT

  • Anonymous
    July 28, 2007
    Great Post...Is there any way to add row-level aggregations to the end?  Following your example, I would like to add another column for Total Employee Sales that sums the years by employee.  And the parallel question would be can I add a row with column-level aggregations?  For example, a new row with null employee id that sums the sales by year.Any help would be appreciated.

  • Anonymous
    August 01, 2007
    For your first question, the best I can do with PIVOT is:SELECT EmpId, [2005], [2006], [2007], [0] AS [Total]FROM (   SELECT EmpId, Yr, Sales FROM Sales   UNION ALL   SELECT EmpId, 0, Sum(Sales) FROM Sales GROUP BY EmpId   ) AS sPIVOT (SUM(Sales) FOR Yr IN ([2005], [2006], [2007], [0])) AS pWe can get a better plan by rewriting it without PIVOT as:SELECT EmpId,   SUM(CASE WHEN Yr = 2005 THEN Sales END) AS [2005],   SUM(CASE WHEN Yr = 2006 THEN Sales END) AS [2006],   SUM(CASE WHEN Yr = 2007 THEN Sales END) AS [2007],   SUM(Sales) AS [Total]FROM SalesGROUP BY EmpIdFor your second question, we can use GROUP BY WITH ROLLUP:SELECT EmpId, SUM([2005]), SUM([2006]), SUM([2007])FROM (SELECT EmpId, Yr, Sales FROM Sales) AS sPIVOT (SUM(Sales) FOR Yr IN ([2005], [2006], [2007])) AS pGROUP BY EmpId WITH ROLLUPPerhaps there is a better solution (to either question) that another reader might suggest?

  • Anonymous
    June 17, 2010
    it's a great topic, hm..I want to make table like this using pivot, is that posible? thx b4    StudentName  | AssignmentName |  Grade    ---------------------------------------    StudentA     | Assignment 1   | 100    StudentA     | Assignment 2   | 80    StudentA     | Total          | 180    StudentB     | Assignment 1   | 100    StudentB     | Assignment 2   | 80    StudentB     | Assignment 3   | 100    StudentB     | Total          | 280

  • Anonymous
    June 22, 2010
    Use ROLLUP or GROUPING SETS.  See blogs.msdn.com/.../aggregation-with-rollup.aspx or blogs.msdn.com/.../grouping-sets-in-sql-server-2008.aspx.

  • Anonymous
    October 28, 2011
    is it possible to use join statements in pivot operator ?

  • Anonymous
    October 30, 2011
    Yes, joins are possible in the source query which can be any legal SQL query.