Tuesday, September 14, 2010

Complex SQL Updates with the Case Statement

One of the most important rules-of-thumb for database performance and data quality is to do as much as you can in as few transactions as possible.  When making Updates to data sometimes you need to do different things based on different conditions.  In order to make these complex Updates and to do them in as few transaction as possible, you'll need some way to make a decision.  One way to make a decision in SQL is by using the Case statement.

Given the following table:
idcountrystatezip
2USAIL60191
3UKAA0
5USACA90210
7UK?99999

Say the state and zip columns are now made nullable so that garbage data will stop being put into it (e.g., UK having state AA and ?, and zip of 0 and 99999).  You'll want to correct the data you have so that you no longer have garbage data in the table.  To do so what you'll want to do is null the state and zip when the country is not USA.


update location
  set
    state = case
      when country = 'USA' then state
      when country <> 'USA' then null
    end,
    zip = case
      when country = 'USA' then zip
      when country <> 'USA' then null
    end


or you can do


update location
  set
    state = case
      when country = 'USA' then state
      else null
    end,
    zip = case
      when country = 'USA' then zip
      else null
    end


or you can replace null with ''.


The table should look like this now:
idcountrystatezip
2USAIL60191
3UK(null)(null)
5USACA90210
7UK(null)(null)

You can also used any other value you can obtain from the table in the Update statement, like this.

Given the following table again:
idcountrystatezip
2USAIL60191
3UKAA0
5USACA90210
7UK?99999

We can do something like this (assuming that state and zip are Character data types).

update location
  set
    state = case
      when country = 'USA' then state
      when country <> 'USA' then country
    end,
    zip = case
      when country = 'USA' then zip
      when country <> 'USA' then country
    end


The table should look like this now:
idcountrystatezip
2USAIL60191
3UKUKUK
5USACA90210
7UKUKUK

Remember if you can picture what you want the data to look like then you can do it in SQL, it is a declarative programming language after all.