Sunday, August 25, 2013

Developer Freedom is a Blank Text Editor

"Though age from folly could not give me freedom,
It does from childishness.
"
-- Shakespeare, Antony and Cleopatra
Act I Scene III Lines 57-58

"Ye shall know the truth, and the truth shall set you free."
-- John 8:32

"Freedom! Horrible, horrible freedom!"
-- Simpsons 3F31

For a software developer there is nothing as freeing or as scary as a blank text editor.  Just looking at it is both liberating and intimidating at the same time.


(or to not start any wars)


With a blank text editor you can create anything and everything.  This is how Google and Facebook got their start.  The software used on the Space Shuttle, Unix, and even the Internet itself all got their starts from a blank text editor.  Think about this next time someone tries to sell you a tool to limit your freedom.

Saturday, August 17, 2013

Learn You Some SQL For Great Good - Unlearn What You Have Learned OR How Select Logically Works (the SELECT clause--OVER Ranking clause)

"The rich ruleth over the poor,
and the borrower is servant to the lender.
He that soweth iniquity shall reap vanity:
and the rod of his anger shall fail.
"
-- Proverbs 22:7-8

RANKing OVER data sets


We are still on the SELECT clause.

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP
We are going to talk about what I think is one of the most useful functionality available in modern SQL, the OVER clause.  For this post we'll look at the ranking function of RANK.

The easy way to think of the OVER clause is that it is a GROUP  BY applied to the data set being SELECTed (note this is not what is actually happening, but it is the easy way to think about it).

RANKing


Given the following table (or data set) of playwrights:


CREATE TABLE playwrights (  id [int] IDENTITY(1,1) ,name [varchar](100) ,birthYear [int] ,birthEra [char](2));


INSERT INTO playwrights
(name, birthYear, birthEra)
VALUES
  ('Oscar Wilde', 1854, 'AD')
 ,('Euripides', 480, 'BC')
 ,('Aeschylus', 525, 'BC')
 ,('Sophocles', 497, 'BC')
 ,('William Shakespeare', 1564, 'AD')
 ,('Jacopone da Todi', 1230, 'AD')

;

We see that the playwrights have a natural ranking by birthEra and birthYear, so if we wanted to rank them by when they were born we could do something like the following for the BC era playwrights:

SELECT
   *
  ,RANK() OVER (ORDER BY birthYear DESC) AS rank
  FROM playwrights
  WHERE birthEra = 'BC'
;


(on SQL Fiddle)

Looking at the results we see that the rank returned is ordered by the the birthYear in descending order, for the BC era this is what we want.

Now for AD era we want the birthYear to be ranked in ascending order, like this:

SELECT
   *
  ,RANK() OVER (ORDER BY birthYear ASC) AS rank
  FROM playwrights
  WHERE birthEra = 'AD'
;


(on SQL Fiddle)

If we want to combine the results into one, we can UNION the two sets and do a RANKing over the results:

SELECT
   RANK() OVER (ORDER BY birthEra DESC, year_rank ASC) AS rank
  ,*
  FROM (

SELECT
   *
  ,RANK() OVER (ORDER BY birthYear DESC) AS year_rank
  FROM playwrights
  WHERE birthEra = 'BC'

UNION ALL

SELECT
   *
  ,RANK() OVER (ORDER BY birthYear ASC) AS year_rank
  FROM playwrights
  WHERE birthEra = 'AD'

) AS needed_for_sub_select
;


(on SQL Fiddle)

The sub select will allow us to rank each era in the correct way (BC by DESC and AD by ASC).  We combine each result from the two sub selects with a UNION ALL, this gives us one big data set which is piped to the outer SELECT statement.  Given this big data set we can now do an OVER clause on the UNION of the two sets in the sub select with a RANKing by the birthEra descending (so that BC is before AD) followed by year_rank.  This is a fairly clean and readable way to solve the problem (plus it is very fast, more on that later).

Sunday, August 4, 2013

Learn You Some SQL For Great Good - Unlearn What You Have Learned OR How Select Logically Works (the SELECT clause)

"Though thanks to all, must I select from all. The rest
Shall bear the business in some other fight,
"
-- Shakespeare, Coriolanus
Act I, Scene VI, Lines 81-82

SELECT information on selections


We are finally on the term SELECT.  Yep, this is normally the first thing covered when one learns about SQL, but as you now know SELECT is logically the 8th thing that gets looked at when doing a SELECT SQL query.

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP
SELECT has a lot of different things going on, so we'll spend a bit of time on SELECT.  For this post we'll just cover projections.

Projections OR how to impress people when talking about SELECT


A projection is a fancy relational algebra term for something that is very easy to understand.  A project can be thought of as a list of things that one wants from a set.

Given the following tables (or data sets):

CREATE TABLE plays (
  id [int] IDENTITY(1,1)
 ,name [varchar](100)
 ,playwrightId [int]
);

CREATE TABLE playwrights (
  id [int] IDENTITY(1,1)
 ,name [varchar](100)
 ,birthYear [int]
 ,birthEra [char](2)
);

We can say I am only interested in the name column (or attribute) of the plays table with the following query.



SELECT name
  FROM plays
;

(on SQL Fiddle)

This will limit are results to just the plays in the plays table.

Like wise we can join the two tables together and say which columns (attributes) we want for each.



SELECT
   p.name AS play
  ,w.name AS playwright
  FROM plays AS p
  INNER JOIN playwrights AS w
    ON p.playwrightId = w.id
;

(on SQL Fiddle)

Notice we just do not speak of the things we do not want, SQL is a declarative language (yep, even Forbes think declarative programming is a good thing), this means we describe what we want and not how to do it.

Also notice the AS term, this term allows us to alias things.  If we did not have the alias of p and w on the plays and playwrights tables we would not be able to distinguish between the plays' name column and the playwrights' name column or the id on plays and the id on playwrights

SELECT
   name AS play
  ,name AS playwright
  FROM plays
  INNER JOIN playwrights
    ON playwrightId = id
;

(on SQL Fiddle)

We get the following error:

Ambiguous column name 'id'.: SELECT name AS play ,name AS playwright FROM plays INNER JOIN playwrights ON playwrightId = id

There is no way for the database management system to be able to tell the difference between the plays' id and the playwrights' id.  To fix this we can do the following.

SELECT
   plays.name AS play
  ,playwrights.name AS playwright
  FROM plays
  INNER JOIN playwrights
    ON plays.playwrightId = playwrights.id
;

(on SQL Fiddle)

By saying which table they come from we are able to use column name and id in the query above.  This is a bit wordy which is why we have the AS term.  AS allows us to alias the fully name of the table as a different value.  Let's look back at the original query.

SELECT
   p.name AS play
  ,w.name AS playwright
  FROM plays AS p
  INNER JOIN playwrights AS w
    ON p.playwrightId = w.id
;

(on SQL Fiddle)

In the above query we have alias plays AS p and playwrights AS w.  Now when we go to use the playwrights id we can simply say w.id, likewise p.name is the plays' name column and w.name is the playwrights' name.  Notice that we further alias the projections of p.name AS play and w.name AS playwright, this will change the attributes on the result set of the query to play for p.name and playwright for w.name.

Sunday, July 28, 2013

Learn You Some SQL For Great Good - Unlearn What You Have Learned OR How Select Logically Works (the HAVING clause)

"O Cassius, Brutus gave the word too early,
Who, having some advantage on Octavius,
Took it too eagerly; his soldiers fell to spoil,
Whilst we by Antony are all enclosed.
"
-- Shakespeare, Julius Caesar
Act V, Scene III, Lines 5-8

HAVING and a group


Next on our trip down the SELECT we find ourselves at HAVING.

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP
HAVING can best be thought of as a WHERE clause for the GROUP BY.  HAVING applies a filter on the groups created by the GROUP BY clause.

A common need is to find when you have more than one of something in your data set.


SELECT
   a.name AS playwright
  ,COUNT(*) AS number_of_plays
  FROM plays AS p
  INNER JOIN playwrights AS a
    ON p.playwrightId = a.id
  WHERE a.birthEra = 'AD'
  GROUP BY a.name
  HAVING COUNT(*) > 1
;

(on SQL Fiddle)

By using the HAVING clause we are able to filter on the groups that the GROUP BY gives us, in the above we filter out the playwrights that have less than 1 play in our data set.  This filtering takes place after the filtering we do with the INNER JOIN (which gets rid of Jacopone da Todi) and the WHERE clause filtering which gets rid of the ancient playwrights.

Given the data set below.

CREATE TABLE plays (
  id [int] IDENTITY(1,1)
 ,name [varchar](100)
 ,playwrightId [int]
);

CREATE TABLE playwrights (
  id [int] IDENTITY(1,1)
 ,name [varchar](100)
 ,birthYear [int]
 ,birthEra [char](2)
);

INSERT INTO playwrights
(name, birthYear, birthEra)
VALUES
  ('Oscar Wilde', 1854, 'AD')
 ,('Euripides', 480, 'BC')
 ,('Aeschylus', 525, 'BC')
 ,('Sophocles', 497, 'BC')
 ,('William Shakespeare', 1564, 'AD')
 ,('Jacopone da Todi', 1230, 'AD')
;

INSERT INTO plays
(name, playwrightId)
VALUES
  ('The Importance of Being Earnest', 1)
 ,('Medea', 2)
 ,('Women of Troy', 2)
 ,('Orestes', 2)
 ,('Phoenician Women', 2)
 ,('Bacchae', 2)
 ,('Seven against Thebes', 3)
 ,('Agamemnon', 3)
 ,('The Libation Bearers', 3)
 ,('The Eumenides', 3)
 ,('Prometheus Bound', 3)
 ,('Oedipus Tyrannus', 4)
 ,('Oedipus at Colonus', 4)
 ,('Antigone', 4)
 ,('Electra', 4)
 ,('The Tempest', 5)
 ,('A Midsummer Night''s Dream', 5)
 ,('The Merchant of Venice', 5)
 ,('As You Like It', 5)
 ,('Romeo and Juliet', 5)
 ,('Julius Caesar', 5)
 ,('Macbeth', 5)
 ,('Hamlet', 5)
 ,('Othello', 5)
 ,('The Second Shepherds'' Play', 0)
 ,('Edward III', 0)
;

(on SQL Fiddle)

Learn You Some SQL For Great Good - Unlearn What You Have Learned OR How Select Logically Works (the WITH CUBE or WITH ROLLUP clause)

"They came upon me as a wide breaking in of waters:
in the desolation they rolled themselves upon me.
"
-- Job 30:14

We are now onto WITH CUBE and WITH ROLLUP in our trip down the SELECT clause.

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP
I would be lying if I say I have ever used these.  Instead of bsing my way through them, I will instead give links to others that seem to know what they are talking about.  At sometime if I actually use them I may come back and put actual content.

Pinal Dave is one of the first people I look to when I need to know something about SQL Server here is a link to his  blog on WITH ROLLUP: http://blog.sqlauthority.com/2010/02/24/sql-server-introduction-to-rollup-clause/

Craig Freedman has a blog entry on WITH ROLLUP: http://blogs.msdn.com/b/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx and another on WITH CUBE: http://blogs.msdn.com/b/craigfr/archive/2007/09/27/aggregation-with-cube.aspx