--- Frank Lloyd Wright
"Less is more."
-- Ludwig Mies van der Rohe
It happens, your on a project and some architect comes up with some "great" idea. Something along the lines of, "Instead of deleting data we'll do a "soft" delete, that way if we need to unefface the data will not have to go to the backed up source data or the data warehouse or one of the many other copies we have". There you are, stuck living with faking a deletion of data, adding unnecessary complexity to your system.
One of the great things about a "soft" delete is now you need to check if the record already exist as a "soft" deleted record before you can insert any new data. Great. So how do you do this?
The wrong way to do it would be to do a SELECT and then either an INSERT or UPDATE based on the results of the SELECT. This causes two different interaction with the database.
The better way to handle a "soft" delete is to use a MERGE for an Upsert.
Say you have the following tables.
CREATE TABLE dbo.stage
(
id [int] NOT NULL,
first_name [varchar] (100) NOT NULL,
last_name [varchar] (100) NOT NULL
);
CREATE TABLE dbo.party
(
id [int] NOT NULL,
first_name [varchar] (100) NOT NULL,
last_name [varchar] (100) NOT NULL,
is_deleted [char] DEFAULT('N') NOT NULL
);
With the following data.
INSERT INTO dbo.stage
(
id
,first_name
,last_name
)
VALUES
(1, 'Mike', 'Harris')
,(2, 'Kelsey', 'Harris')
,(3, 'Jack', 'Harris')
;
INSERT INTO dbo.party
(
id
,first_name
,last_name
,is_deleted
)
VALUES
(1, 'Mike', 'Harris', 'Y')
,(2, 'Kelsey', 'Hopkins', 'N')
,(4, 'Tom', 'Smith', 'N')
;
Staging
ID | FIRST_NAME | LAST_NAME |
---|---|---|
1 | Mike | Harris |
2 | Kelsey | Harris |
3 | Jack | Harris |
Party with "soft" delete
ID | FIRST_NAME | LAST_NAME | IS_DELETED |
---|---|---|---|
1 | Mike | Harris | Y |
2 | Kelsey | Hopkins | N |
4 | Tom | Smith | N |
Looking at the data above, we see that the record with ID value of 1 has been "soft" deleted.
Say we want to merge the staging data into the Party data. Assuming we have a full data set, we can "delete" any of the records not found in the staging data.
In order to merge the staging data into the Party data with "soft" deletes we would use the following MERGE statement.
MERGE dbo.party AS TARGET
USING dbo.stage AS SOURCE
ON TARGET.id = SOURCE.id
WHEN MATCHED
AND (TARGET.first_name <> SOURCE.first_name
OR TARGET.last_name <> SOURCE.last_name)
OR TARGET.is_deleted = 'Y' THEN
UPDATE
SET TARGET.first_name = SOURCE.first_name
,TARGET.last_name = SOURCE.last_name
,TARGET.is_deleted = 'N'
WHEN NOT MATCHED BY TARGET THEN
INSERT (id, first_name, last_name)
VALUES (SOURCE.id, SOURCE.first_name, SOURCE.last_name)
WHEN NOT MATCHED BY SOURCE THEN
UPDATE
SET TARGET.is_deleted = 'Y'
;
The statements in bold are what make this MERGE statement use "soft" deletes. On the WHEN MATCHED we want to include an OR TARGET.is_deleted = 'Y' (or whatever kind of delete indicator you are using), this will allow for an UPDATE to take place when the record that is being inserted is a "soft" deleted record. The WHEN NOT MATCHED BY SOURCE THEN must do an UPDATE with SET TARGET.is_deleted = 'Y' (or whatever kind of delete indicator you are using) instead of the typical DELETE.
Party with "soft" delete
ID | FIRST_NAME | LAST_NAME | IS_DELETED |
---|---|---|---|
1 | Mike | Harris | N |
2 | Kelsey | Harris | N |
4 | Tom | Smith | Y |
3 | Jack | Harris | N |
In the table above, Mike has been un-"soft" deleted, Kelsey's last name has changed to Harris, Tom has been "soft" deleted, and Jack has been inserted into the table.
Well, that is how one does a "soft" delete in SQL Server.
I used SQL Fiddle to test this out, the sample code can be found here.