Saturday, July 6, 2013

Learn You Some SQL For Great Good - Unlearn What You Have Learned OR How Select Logically Works (the WHERE clause--NULL)

"We hold the promise, who shall make it null? If God be for us, who can be against us?"
-- Saint Augustine
The Confessions of Saint Augustine, Book XII

Who Could Make It NULL?


We are going to cover one last thing before we are done with the WHERE clause in the SELECT statement.

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP

I Don't Know


A criticism of popular governments is that people felt that they should always have an opinion on everything.  Often it is OK to simply say "I don't know".  An unknown represents a missing or unexplored part.



On old maps, the unknown sections of the map would have the caption, here be dragons.

In SQL an unknown is represented by a value of NULL.  We have already seen NULL values when we were doing LEFT OUTER and FULL OUTER JOINs.  Those places where we do not have a related value across the tables, those values came back as NULL, meaning unknown.

Working with the Unknown


In SQL NULL does not equal NULL, meaning that an unknown value does not equal an unknown value (which makes sense if you really think about it).  This is important, since if you have NULL values you need to use the special IS NULL operator to check if a value is NULL.

Here is some SQL (that has not been full gone over yet) to show the point:

SELECT 
  CASE
    WHEN NULL = NULL THEN 'NULL = NULL'
    ELSE 'NULL <> NULL'
  END AS NullEqual
 ,CASE
    WHEN NULL IS NULL THEN 'NULL IS NULL'
    ELSE 'NULL IS NOT NULL'
  END AS NullIsNull
;

This will give us the output: NULL <> NULL (meaning NULL does not equal NULL) and NULL IS NULL (meaning NULL has the value of NULL).

(on SQL Fiddle)

Data Quality is Job Number 1


Say that you want to be able to get the values in one table that do not link to the values in another table, how would you do that?


SELECT
   p.name AS play
  FROM plays AS p
  LEFT JOIN playwrights AS a
    ON p.playwrightId = a.id
  WHERE a.id IS NULL
;

(on SQL Fiddle)

The WHERE clause on the query would filter off all the plays which have known playwrights leaving us with just The Second Sheperd's Play and Edward III.

Likewise we can use this pattern to find all the playwrights without known plays.


SELECT
   a.name AS playwright
  FROM playwrights AS a
  LEFT JOIN plays AS p
    ON p.playwrightId = a.id
  WHERE p.playwrightId IS NULL
;

(on SQL Fiddle)

This will give us just the playwright Jacopone da Todi, who has no known plays.