MERGE gives better OUTPUT options

June 12, 2012

MERGE is very cool. There are a ton of useful things about it – mostly around the fact that you can implement a ton of change against a table all at once. This is great for data warehousing, handling changes made to relational databases by applications, all kinds of things.

One of the more subtle things about MERGE is the power of the OUTPUT clause. Useful for logging.

 TSQL2sDay150x150

If you’re not familiar with the OUTPUT clause, you really should be – it basically makes your DML (INSERT/DELETE/UPDATE/MERGE) statement return data back to you. This is a great way of returning identity values from INSERT commands (so much better than SCOPE_IDENTITY() or the older (and worse) @@IDENTITY, because you can get lots of rows back). You can even use it to grab default values that are set using non-deterministic functions like NEWID() – things you couldn’t normally get back without running another query (or with a trigger, I guess, but that’s not pretty).

That inserted table I referenced – that’s part of the ‘behind-the-scenes’ work that goes on with all DML changes. When you insert data, this internal table called inserted gets populated with rows, and then used to inflict the appropriate inserts on the various structures that store data (HoBTs – the Heaps or B-Trees used to store data as tables and indexes). When deleting, the deleted table gets populated. Updates get a matching row in both tables (although this doesn’t mean that an update is a delete followed by an inserted, it’s just the way it’s handled with these tables). These tables can be referenced by the OUTPUT clause, which can show you the before and after for any DML statement. Useful stuff.

MERGE is slightly different though.

With MERGE, you get a mix of entries. Your MERGE statement might be doing some INSERTs, some UPDATEs and some DELETEs. One of the most common examples of MERGE is to perform an UPSERT command, where data is updated if it already exists, or inserted if it’s new. And in a single operation too. Here, you can see the usefulness of the deleted and inserted tables, which clearly reflect the type of operation (but then again, MERGE lets you use an extra column called $action to show this).

image

(Don’t worry about the fact that I turned on IDENTITY_INSERT, that’s just so that I could insert the values)

One of the things I love about MERGE is that it feels almost cursor-like – the UPDATE bit feels like “WHERE CURRENT OF …”, and the INSERT bit feels like a single-row insert. And it is – but into the inserted and deleted tables. The operations to maintain the HoBTs are still done using the whole set of changes, which is very cool.

And $action – very convenient.

But as cool as $action is, that’s not the point of my post. If it were, I hope you’d all be disappointed, as you can’t really go near the MERGE statement without learning about it.

The subtle thing that I love about MERGE with OUTPUT is that you can hook into more than just inserted and deleted.

Did you notice in my earlier query that my source table had a ‘src’ field, that wasn’t used in the insert? Normally, this would be somewhat pointless to include in my source query. But with MERGE, I can put that in the OUTPUT clause.

image

This is useful stuff, particularly when you’re needing to audit the changes. Suppose your query involved consolidating data from a number of sources, but you didn’t need to insert that into the actual table, just into a table for audit. This is now very doable, either using the INTO clause of OUTPUT, or surrounding the whole MERGE statement in brackets (parentheses if you’re American) and using a regular INSERT statement.

This is also doable if you’re using MERGE to just do INSERTs.

In case you hadn’t realised, you can use MERGE in place of an INSERT statement. It’s just like the UPSERT-style statement we’ve just seen, except that we want nothing to match. That’s easy to do, we just use ON 1=2.

This is obviously more convoluted than a straight INSERT. And it’s slightly more effort for the database engine too. But, if you want the extra audit capabilities, the ability to hook into the other source columns is definitely useful.

Oh, and before people ask if you can also hook into the target table’s columns… Yes, of course. That’s what deleted and inserted give you.

This Post Has 23 Comments

  1. Greg Lucas

    Rob,
    Great post, I use the OUTPUT clause quite regularly – especially to populate rollback tables on production data changes but I never knew about $action or the ability to reference other columns from the WITH block.
    Thanks

  2. Rob Farley

    Thanks Greg. And it’s worth remembering that this is only available with MERGE – the other commands only let you hook into inserted and deleted.

  3. SomewhereSomehow

    Hi Rob. Very interesting scenario of using merge for audit. Goes to my KB.
    Maybe, it is worth saying, the case of using merge for insert is very convenient to get rid of cursors when copying linked entities. I blogged about it sometime ago. For example, if we have classical scheme: Customer<- Order <- OrderDetail and a need to copy orders from one client to another. With help of merge we may get a table of links: (OrderID_old, OrderID_new). And after that, use it in join, to copy all the order details at once, in one single query.

  4. Ilmar

    Very nice article!  I came accross it while searching for an elegant solution to a problem very similar to the one SomewhereSomewhat describes.  Definately added to my toolkit.

  5. Rob Farley

    Yes, definitely.

  6. Rob Farley

    Ilmar – you’re more than welcome. Good you found it helpful.

  7. Deege

    Hi Rob,
    Is multiple ‘OUTPUT INTO’ clauses allowed
    UPDATE …
    OUTPUT Deleted.*
    INTO @oldValues
    OUTPUT Inserted.*
    INTO @newValues
    WHERE id = @id
    Currently using
    UPDATE …
    OUTPUT Deleted.*
    INTO @oldValues
    OUTPUT Inserted.*
    INTO @newValues
    WHERE id = @id;
    INSERT @newValues…
    SELECT *
    FROM   ..
    WHERE  id = @id
    I am able to do a OUTPUT INTO and OUTPUT but gives an error when I use multiple OUTPUT INTO

  8. Rob Farley

    Hi Deege,
    This is correct. You can use only one OUTPUT .. INTO and only one OUTPUT clause. But you can use one of each in a single query.
    Rob

  9. Lally

    Rob, you are a god.
    Well done on being the only person on the net to point this out. Saved me hours.

  10. Rob Farley

    Not a god. Just happy to help.

  11. Shane

    Rob thanks for the reminder about $action, I often forget that is available. However I wanted to make one correction. The only DML statement that you can’t use source data in the output clause is the INSERT statement. You can capture data form the source for UPDATE and DELETE statements as well as MERGE. It is also important to point out that you can add variables, functions, and constants in the output clause caolumns as well. Here are examples:
    DECLARE @Colors TABLE (
    ID INT IDENTITY PRIMARY KEY,
    Color VARCHAR(50)
    )
    ;
    DECLARE @DataSource TABLE (
    ID INT IDENTITY PRIMARY KEY,
    Color VARCHAR(50),
    DateModified DATE,
    [Action] VARCHAR(50)
    )
    ;
    DECLARE @Audit TABLE (
    ID INT,
    PreviousColor VARCHAR(50),
    NewColor VARCHAR(50),
    OrignalDateModified DATE,
    DateModified DATE,
    [Action] VARCHAR(50)
    )
    ;
    INSERT INTO
    @Colors (Color)
    VALUES
    (‘Red’),
    (‘Blue’),
    (‘Yellow’)
    ;
    INSERT INTO
    @DataSource (Color, DateModified, [Action])
    VALUES
    (‘Yellow’, ‘20130207’, ‘Update’),
    (‘Red’, ‘20130210’, ‘Update’),
    (”, ‘20130212’, ‘Delete’)
    ;
    UPDATE
    @Colors
    SET
    Color = DS.Color
    OUTPUT
    INSERTED.ID,
    DELETED.Color,
    INSERTED.Color,
    DS.DateModified,
    SYSDATETIME(),
    ‘Updated’
    INTO
    @Audit
    FROM
    @Colors C
    JOIN @DataSource DS
    ON DS.ID = C.ID
    AND DS.[Action] = ‘Update’
    DELETE
    C
    OUTPUT
    DELETED.ID,
    DELETED.Color,
    DS.Color,
    DS.DateModified,
    SYSDATETIME(),
    ‘Deleted’
    INTO
    @Audit
    FROM
    @Colors C
    JOIN @DataSource DS
    ON DS.ID = C.ID
    AND DS.[Action] = ‘Delete’
    SELECT
    *
    FROM
    @Audit

  12. K Aditya Vardhan

    Hi Rob,,
    I am using MERGE Statement in which there are three tablse source target and history table
    Source –It acts as an intermediate  stage table
    Target –It would be the final table
    History– It would be used as an audit purpose which captures the updated records which also consists of two not null columns
    When there is no data in target table and History table the output port is sending the updated records with NULL value for which it is throwing error but actually it is not updating any of records how can i avoid such kind of situation..Its only happening when data in target and History are not available when there is data in target the functionality is working fine
    Can you suggest me a solution how to fix this

  13. Rob Farley

    Hi Aditya – Can you send me your code? My email address is on the right…

  14. Mark Kennedy

    You can access data other than the inserted/deleted in an output clause, but it has to be joined.
    There’s examples on the OUTPUT clause MSDN page, example E is where I got the idea to try self join.
    http://msdn.microsoft.com/en-gb/library/ms177564(v=sql.105).aspx
    but try this:
    UPDATE #tmp
    SET someval += 1
    OUTPUT deleted.*, inserted.someval, t.id
    FROM #tmp
    INNER JOIN #tmp as t ON #tmp.id = t.id
    I don’t know why exactly, but its probably something to do with not being able to use values on the table directly being changed, but joining to its self seems to work.
    I did this on SQL server 2008R2
    Saves having to delve into using merge statements if you just want to get other data in an OUTPUT from a simple update.

  15. Rob Farley

    Thanks Mark – that’s an interesting idea. 🙂

  16. Sean Peffer

    Thanks for this great explaantion of the OUTPUT/$ACTION – very useful.
    I used the MERGE statement to update a SCD2 Dimension and it’s great that it can be done in a single step iso multiple steps and temp tables. I now require to capture the Audit info as well and this function helps with this requirement as well. Here’s an exmaple used inside a stored proc to return the audit info
    DECLARE @Results TABLE(Action VARCHAR(20));
    /…….Merge statements here…../
    OUTPUT $action INTO @Results;
    SELECT @Insert=[INSERT],
    @Update=[UPDATE],
    @Delete=[DELETE]
    FROM    (SELECT Action,
        1 ROWS
    FROM  @Results) P
    PIVOT (COUNT(ROWS)
    FOR  Action IN  ([INSERT], [UPDATE], [DELETE])) AS pvt;

  17. Dipankar Saha

    Hi Rob,
    I was working on a requirement where i need to access inserted.* values and values from the Select List in the output clause for an insert statement. I am working in 2005 so i cannot use merge statement which i know is the other way around but wanted to clear my doubt that its techinically not possible or i am missing something.
    PFB, the code part.
          INSERT INTO ExternalEmployeeNumber(ExternalEmployeeNumber,DateUsed)
           SELECT  dateadd(second, ROWVAL, DateVale)
           FROM
           (   SELECT B.[FIRST NAME] FNAME, B.[LAST NAME] LNAME , B.EMAIL EMAIL , GETDATE() DateVale , ROW_NUMBER ( )
    OVER (ORDER BY B.[FIRST NAME], B.[LAST NAME] , B.EMAIL) ROWVAL
    FROM dbo.[EOP_REMs$] B where B.[last name] is not null and B.[first name] is not null and B.[email] is not null
    and NOT EXISTS ( SELECT 1 FROM employee E WHERE B.[first name] = E.[firstname]  AND B.[email] = E.[email] AND B.[last name] = E.[lastname] )
           ) X
           OUTPUT INSERTED.ExternalEmployeeNumber , FNAME , LNAME , EMAIL
                 INTO #EmployeeNew([Employee#],[FirstName],[LastName],[Email]);
    In the code part above, i am using Fname, Lname and Email other than the inserted column.
    It is giving me an error , invalid reference of the Fname, Lname, Email column name ?
    Can you help me out. As i said i implemented this using Merge and its working but i need to implement this in SQL 2005.
    Thanks and Regards
    Dipz

  18. Rob Farley

    Sorry Dipz, you don’t have access to that information in an INSERT statement.

  19. Pete Nightingale

    Interesting but this method does not actually allow you to pickup the values applied with an IDENTITY column as they so not appear in the INSERTED table.
    Suppose I had better write a loop then, good job I am covering a rare event.

  20. Peter Nightingale

    Correction, the IDENTITY column does appear in the INSERTED table, but not in the target table.
    So v useful actually. (ahem)

  21. Rob Farley

    Hi Pete – yes. 🙂

  22. Ritesh Patel

    Hi Rob,
    The audit table part of your query does not quiet work as expected. Although it is capturing the changes into #someaudittable it is still also performing the merge action on the #tmp table.
    Is there any way to prevent the actual merge on #tmp and just getting the ‘proposed’ changes into #someaudittable?
    Thanks,
    Ritesh

  23. Karl Ninh

    Can you output to a file?

Leave a Reply

LobsterPot Blogs

Blog posts by Rob Farley and other LobsterPot Solutions team members.

Search