and the borrower is servant to the lender.
He that soweth iniquity shall reap vanity:
and the rod of his anger shall fail."
-- Proverbs 22:7-8
RANKing OVER data sets
We are still on the SELECT clause.
- FROM
- ON
- JOIN
- WHERE
- GROUP BY
- WITH CUBE or WITH ROLLUP
- HAVING
- SELECT
- DISTINCT
- ORDER BY
- TOP
The easy way to think of the OVER clause is that it is a GROUP BY applied to the data set being SELECTed (note this is not what is actually happening, but it is the easy way to think about it).
RANKing
Given the following table (or data set) of playwrights:
CREATE TABLE playwrights ( id [int] IDENTITY(1,1) ,name [varchar](100) ,birthYear [int] ,birthEra [char](2));
INSERT INTO playwrights
(name, birthYear, birthEra)
VALUES
('Oscar Wilde', 1854, 'AD')
,('Euripides', 480, 'BC')
,('Aeschylus', 525, 'BC')
,('Sophocles', 497, 'BC')
,('William Shakespeare', 1564, 'AD')
,('Jacopone da Todi', 1230, 'AD')
;
We see that the playwrights have a natural ranking by birthEra and birthYear, so if we wanted to rank them by when they were born we could do something like the following for the BC era playwrights:
SELECT
*
,RANK() OVER (ORDER BY birthYear DESC) AS rank
FROM playwrights
WHERE birthEra = 'BC'
;
(on SQL Fiddle)
Looking at the results we see that the rank returned is ordered by the the birthYear in descending order, for the BC era this is what we want.
Now for AD era we want the birthYear to be ranked in ascending order, like this:
SELECT
*
,RANK() OVER (ORDER BY birthYear ASC) AS rank
FROM playwrights
WHERE birthEra = 'AD'
;
(on SQL Fiddle)
If we want to combine the results into one, we can UNION the two sets and do a RANKing over the results:
SELECT
RANK() OVER (ORDER BY birthEra DESC, year_rank ASC) AS rank
,*
FROM (
SELECT
*
,RANK() OVER (ORDER BY birthYear DESC) AS year_rank
FROM playwrights
WHERE birthEra = 'BC'
UNION ALL
SELECT
*
,RANK() OVER (ORDER BY birthYear ASC) AS year_rank
FROM playwrights
WHERE birthEra = 'AD'
) AS needed_for_sub_select
;
(on SQL Fiddle)
The sub select will allow us to rank each era in the correct way (BC by DESC and AD by ASC). We combine each result from the two sub selects with a UNION ALL, this gives us one big data set which is piped to the outer SELECT statement. Given this big data set we can now do an OVER clause on the UNION of the two sets in the sub select with a RANKing by the birthEra descending (so that BC is before AD) followed by year_rank. This is a fairly clean and readable way to solve the problem (plus it is very fast, more on that later).