Getting to grips with Dynamic SQL: Working with datasets

Introduction

Previously I covered how to validate the values that are being received by your dynamic SQL to help ensure that malicious code is even harder to be run. In this article I'm going to cover how you can use a dataset to create a dynamic statement, or dynamic statements. This will likely be as you want to perform a specific task against several objects in your database, or server, and instead of writing the same statement for each object you use a dynamic one to do the work. These are, in my opinion, the more valid reason for dynamic statements as such dynamic statements can be extremely useful for maintenance and administrative tasks.

The 2 methods I'm going to cover here are using a CURSOR and using string aggregation. Both have their benefits and cons. A CURSOR is often easier for many to understand from a logic perspective and many find easier to write. Performance wise they are slowly, as each (set of) statement(s) needs is run separately in their own batch, which comes with a cost. As, however, each (set of) statement(s) are run in their own scope this does mean that just because a prior statement fails doesn't mean the entire process will, as the the error may well not cause the outer batch to fail.

Conversely string aggregation is often faster, as it's executed in a single dynamic batch. As, however, all the statements are in a single batch then if a prior statement fails then the entire batch will unless you include error handing (in the form of a TRY...CATCH) within the dynamic statement itself as well. This can significantly increase the level of understanding required to ensure that errors are caught correctly and that the right one(s) are ignored or handled gracefully, while others are THROWN. I, personally, prefer the string aggregation method, due to the performance benefit, however, unless you are well versed in Dynamic SQL already and comfortable with your ability then using a CURSOR is likely the better choice. If performance really is an issue then you can then implement a string aggregation method.

There are, however, some exceptions where string aggregation can't be used. Some statements require them to be the only statement in the batch, such as DDL statements for Stored Procedures. As such there will be times were you cannot use string aggregation unless you go down a dynamic dynamic SQL approach; yep that means creating a dynamic statement that creates dynamic statements. That isn't something I plan to cover in these articles at this time.

Before you start writing the dynamic SQL

Prior to either method, the first thing you'll want to do is work out what your statement would be if it wasn't dynamic. This is true for all the prior examples, but I'm going to reiterate it here.

Let's assume, for the purposes of this article, that you want to DELETE any rows in any table on a specific schema, where the value of the LastActive is over 3 years ago,, which is perhaps part of your data retention policy (we'll assume this won't break any foreign keys here). The first step would be to write out the statement for a single table:

DELETE
FROM MySchema.MyTable
WHERE LastActive <= DATEADD(YEAR, -3, GETDATE());

A pretty simple statement. Next you want to design the statement as if it were a dynamic statement using scalar values (as shown in the prior articles). I'm going to wrap the value injected in braces ({}) here, as it may not be a variable we'll be injecting:

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

SET @SQL = N'DELETE' + @CRLF +
           N'FROM MySchema.' + QUOTENAME({TableName}) + @CRLF +
           N'WHERE LastActive <= DATEADD(YEAR, -3, GETDATE());';

This gives you the template you are aiming for, and can be used in both methods.

Using a CURSOR to run each statement individually

As I mentioned, this is the normally the easier method to use for those new to dynamic SQL, and even those more well versed. It's often easier to read for many, and can be easier to debug, as as error in one batch doesn't cause the other to fail or be rolled back. Though Cursors do have a "bad" reputation this (in my opinion) is undeserved. The problem isn't Cursors, they have their uses (such as here), it's that they are misused by developers for tasks that should not be using an iterative method; for example getting a cumulative sum/count when all supported versions of SQL Server support windowed aggregates and even prior you had the "quirky update" and a triangular join. Here we have an example where although a Set Based solution will likely be faster, we're not dealing with data in the same way, and so an iterative solution is rarely frowned upon.

The idea behind using a CURSOR is actually very simple. Firstly declare the cursor to get the needed data, then use a WHILE for each value in the CURSOR to run a dynamic statement like that shown in the very first example in this set of articles. If the values you are looping through for your dynamic statement is coming from user input, for example a Stored Procedure with a table type parameter, then any validation should likely be performed outside the WHILE and probably in the statement setting the values for the CURSOR. Though you can do the validation within the WHILE, and likely THROW your own error, this will also incur the additional cost of validating each object individually which will result in further performance degradation when compared to a string aggregation solution. You can see how either method would work, however, in Example 2.

Using string aggregation to run all the statements in a single batch

This method can often be far more complex, especially when introducing error handling, but will often come with performance benefits; especially if dealing with a large dataset.

The method you use to implement the string aggregation will vary depending on what version of SQL Server you are using. STRING_AGG is only available on SQL Server 2017 onwards, and can't be nested. On the other hand, FOR XML PATH is available in every supported version of SQL Server and can be nested. For me that means that I find myself using both methods, either as I am forced on older versions of SQL Server, or because I need more than one set of aggregation in a single statement (not something I will be covering in this article). Unsurprisingly many find STRING_AGG easier to understand, but it's important to understand how the FOR XML PATH method works even on SQL Server 2017 onwards, you may well find you have need to use it. Explaining how it works will be in a separate article though, which I'll reference here when released.

Here, instead, you would need to directly use the dataset to build the entire statement, effecting every dynamic object, in a single run. It's also very important to validate your inputs here if they are coming from a user as a single invalid object will cause the entire dynamic batch to not run. This is because the dynamic batch would be parsed when you attempt to run it and would error at that stage should an invalid object be referenced. This can mean that debugging can be difficult if the statement is poorly written (Formatting is Important) when trying to work out where something like an invalid object error is being thrown. Likewise, if a single statement in the batch fails, this could cause the entire batch to be rolled back and the rest of it not run. This means you may well need to also ensure you add appropriate TRY...CATCH statements in the dynamic batch, as demonstrated in Example 4 and Example 5.

Example 1: Running each statement in a CURSOR

Here we loop through each table returned within the CURSOR and DELETE delete old rows.

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10),
        @TableName sysname;

DECLARE TableCursor CURSOR FOR   
SELECT t.[name]
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id
WHERE s.[name] = N'MySchema'
  AND EXISTS (SELECT 1
              FROM sys.columns c
              WHERE c.object_id = t.object_id);
  
OPEN TableCursor;
  
FETCH NEXT FROM TableCursor   
INTO @TableName;

WHILE @@FETCH_STATUS <> 0 BEGIN

    SET @SQL = N'DELETE' + @CRLF +
               N'FROM MySchema.' + QUOTENAME(@TableName) + @CRLF +
               N'WHERE LastActive <= DATEADD(YEAR, -3, GETDATE());';

    EXEC sys.sp_executesql @SQL;

    FETCH NEXT FROM TableCursor   
    INTO @TableName;

END;

CLOSE TableCursor;
DEALLOCATE TableCursor;

Example 2: Validating the object within the CURSOR

Here the objects from the table variable are validated within the CURSOR and will THROW custom errors if the table/column does not exist. Note that the value for the error has to be assigned to a variable prior to the check, so that a meaningful message can be presented to the application layer.

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10),
        @TableName sysname,
        @Error nvarchar(255);

DECLARE TableCursor CURSOR FOR   
SELECT Tablename
FROM @TableList;

OPEN TableCursor;
  
FETCH NEXT FROM TableCursor   
INTO @TableName;

WHILE @@FETCH_STATUS <> 0 BEGIN

    SET @Error =  N'Table ''MySchema.' + @TableName + N''' is invalid. The dynamic statement has not been run, however, the batch was not aborted.';
    IF NOT EXISTS (SELECT 1
                   FROM sys.schemas s
                        JOIN sys.tables t ON s.schema_id = t.schema_id
                   WHERE s.[name] = N'MySchema'
                     AND t.[name] = @TableName)
        THROW 87001, @Error, 10;

    SET @Error =  N'Table ''MySchema.' + @TableName + N''' does not have a column ''LastActive''. The dynamic statement has not been run, however, the batch was not aborted.'; 
    IF NOT EXISTS (SELECT 1
                   FROM sys.schemas s
                        JOIN sys.tables t ON s.schema_id = t.schema_id
                        JOIN sys.columns c ON t.object_id = c.object_id
                   WHERE t.[name] = @TableName
                     AND c.[name] = N'LastActive')
        THROW 87002, @Error, 10;

    SET @SQL = N'DELETE' + @CRLF +
               N'FROM MySchema.' + QUOTENAME(@TableName) + @CRLF +
               N'WHERE LastActive <= DATEADD(YEAR, -3, GETDATE());';

    EXEC sys.sp_executesql @SQL;

    FETCH NEXT FROM TableCursor   
    INTO @TableName;

END;

CLOSE TableCursor;
DEALLOCATE TableCursor;

Example 3: STRING_AGG to create all the statements in a single batch

STING_AGG is used to assign an entire dynamic batch to a variable to be executed.

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

SELECT @SQL = STRING_AGG(N'DELETE' + @CRLF +
                         N'FROM MySchema.' + QUOTENAME(t.[name]) + @CRLF +
                         N'WHERE LastActive <= DATEADD(YEAR, -3, GETDATE());',@CRLF) WITHIN GROUP (ORDER BY t.object_id)
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id
WHERE s.[name] = N'MySchema'
  AND EXISTS (SELECT 1
              FROM sys.columns c
              WHERE c.object_id = t.object_id);

EXEC sys.sp_executesql @SQL;

Example 4: FOR XML PATH creating a batch with a TRY...CATCH

FOR XML PATH is used to create the dynamic batch statement, which includes a TRY...CATCH statement which results in the entire batch being rolled back in the event of a failure.

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10),
        @Active date = DATEADD(YEAR, -3, GETDATE());

SELECT @SQL = N'BEGIN TRY' + @CRLF +
              N'    BEGIN TRANSACTION DeleteInactive;' + @CRLF +
              STRING_AGG(N'    DELETE' + @CRLF +
                         N'    FROM MySchema.' + QUOTENAME(t.[name]) + @CRLF +
                         N'    WHERE LastActive <= @Active;',@CRLF) WITHIN GROUP (ORDER BY t.object_id) +
              N'    COMMIT TRANSACTION DeleteInactive;' + @CRLF +
              N'END TRY' + @CRLF +
              N'BEGIN CATCH' + @CRLF +
              N'    ROLLBACK TRANSACTION DeleteInactive;' + @CRLF +
              N'    THROW' + @CRLF +
              N'END CATCH;'
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id
WHERE s.[name] = N'MySchema'
  AND EXISTS (SELECT 1
              FROM sys.columns c
              WHERE c.object_id = t.object_id);

EXEC sys.sp_executesql @SQL, N'@Active date', @Active;

Example 5: FOR XML PATH creating a batch with multiple TRY...CATCHes

Instead, in this example, the TRY...CATCH is wrapped around each individual DELETE and a PRINT is used to provide the error. The error number is also returned to the outer scope with an OUTPUT parameter. (This method is used in sp_foreachdatabase)

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10),
        @Active date = DATEADD(YEAR, -3, GETDATE()),
        @RC int;

SELECT @SQL = N'DECLARE @Error nvarchar(4000);' + @CRLF +
              STRING_AGG(N'BEGIN TRY' + @CRLF + 
                         N'    BEGIN TRANSACTION;' + @CRLF +
                         N'    DELETE' + @CRLF +
                         N'    FROM MySchema.' + QUOTENAME(t.[name]) + @CRLF +
                         N'    WHERE LastActive <= @Active;' + @CRLF +
                         N'    COMMIT TRANSACTION;' + @CRLF +
                         N'END TRY' + @CRLF +
                         N'BEGIN CATCH' + @CRLF +
                         N'    ROLLBACK TRANSACTION;' + @CRLF +
                         N'    SET @Error = N''Msg '' + CONVERT(nvarchar(6),ERROR_NUMBER()) + '', Level '' + CONVERT(nvarchar(6),ERROR_SEVERITY()) + '', State '' + CONVERT(nvarchar(6),ERROR_STATE()) + '', Line '' + CONVERT(nvarchar(6),ERROR_LINE()) + NCHAR(13) + NCHAR(10) +' + @CRLF +
                         N'                 ERROR_MESSAGE();' + @CRLF +
                         N'    PRINT @Error;' + @CRLF +
                         N'    SET @RC = ERROR_NUMBER();' + @CRLF +
                         N'END CATCH;',@CRLF) WITHIN GROUP (ORDER BY t.object_id)
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id
WHERE s.[name] = N'MySchema'
  AND EXISTS (SELECT 1
              FROM sys.columns c
              WHERE c.object_id = t.object_id);

EXEC sys.sp_executesql @SQL, N'@Active date, @RC int OUTPUT', @Active, @RC OUTPUT;

Leave a Reply

Your email address will not be published. Required fields are marked *