Fundamentals: The semicolon (;) is a statement terminator

The title pretty much says it all here, when you’re writing T-SQL, the correct place to place the semicolon (;) is at the end of all you’re statements.

Something I see quite often is batches where people don’t terminate any of their statements, but do start statements that use a CTE or Merge statement with a semicolon. This means you see syntax like the below:

DECLARE @MyVariable int
SET @MyVariable = 5

;WITH MyCTE AS(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY CONVERT(date,MyDate) ORDER BY MyDate DESC) AS RN
    FROM MyTable
    WHERE MyColumn = @MyVariable)
SELECT *
FROM MyCTE
WHERE RN = 1

There are several problems, in my opinion, about not correctly terminating your statements, but also beginning statements with a semicolon that require the previous statement to be properly terminated.

Firstly, not terminating your statements is a deprecated feature in SQL Server. This means that should you have SQL that isn’t properly terminated, and Microsoft do (eventually) remove the functionality of not terminating your statements, then your query will fail. There is no alternative to this, Microsoft only offers one solution: “End Transact-SQL statements with a semicolon ( ; ).

The likely reason that this is deprecated isn’t because the parser is being “less” clever but that T-SQL is becoming more complex. The WITH keyword can appear in several places and has different meanings. For example it’s used at the start of a statement to denote that a CTE is being declared. It can be used at the end of a DDL statement to specify that versioning is enabled, the type of compression or if the table is memory optimised. It’s also used for query hints when referencing a table (most commonly seen when people are abusing the NOLOCK hint). In all these cases, WITH has a different meaning, and the parser needs to be able to know which one it has. As a result, if the first “word” in the statement is WITH, then SQL Server knows that it’s the declaration of a CTE; but to safely know it’s the first word in the statement you need to tell SQL Server it is, by properly terminating the prior statement.

When you write code, I hope that you properly format it. From the Parser’s point of view though, it doesn’t “care” about white space and line breaks the same way we do. Take the statements below:

CREATE TABLE dbo.MyTable (ID int,
                          MyColumn varchar(10))
                          
WITH CTE AS(
    SELECT MAX(ID)
    FROM dbo.OtherTable)
INSERT INTO dbo.MyTable
SELECT ID, N'OtherTable'
FROM CTE

To us it is easy to see that WITH is the declaration of a CTE. It’s on a separate line, and has a gap between it and the end of the DDL statement. To the parser, however, WITH is a perfectly acceptable keyword to be placed after varchar(10)). As a result, without the statement being terminated the WITH keyword meaning is ambiguous. It’s just like if someone gave you the date “03/01/2019”. For some of us that would represent the date 03 January 2019 but for others it would be March 01 2019. Unless someone explicitly tells you the format of the date, or uses an unambiguous format (yyyyMMdd or yyyy-MM-ddThh:mm:ss.nnnnnnn) then you don’t know what it represents. For the parser, putting the semicolon is your way of unambiguously telling it that WITH is the first word in the statement so represents the declaration of a CTe.

If you do try to run the above, the WITH will actually be interpreted as part of the CREATE TABLE statement, and therefore you’ll get the error below:

Msg 102, Level 15, State 1, Line 4
Incorrect syntax near ‘CTE’.

If you use WITH as part of a DDL statement, the first thing expected afterwards is a left parenthesis, so having CTE makes no sense to the parser. The more common error you see, however, when you forget to properly terminate your statement is very explicit on what you need to do:

Incorrect syntax near the keyword ‘with’. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

To be just that little be explicit though, notice it states “the previous statement must be terminated with a semicolon“. It doesn’t say the “statement must be started with a semicolon”. This is very much telling you that you need to put the semicolon at the end of the last statement, not at the start of the one that begins with WITH.

Using a semicolon at the start of a statement also teaches some bad habits. Unlike other RDBMS, SQL Server does not support the declaration of a CTE inside the statement (i.e. like PostgreSQL). This does not mean, however, that I haven’t (often) seen people write statements that look like this (I am not suggesting that this syntax would even be roughly correct in PostgreSQL):

SELECT *
FROM (
    ;WITH CTE AS(
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY MyDate DESC) AS RN
        FROM MyTable)
     SELECT *
     FROM CTE 
     WHERE RN = 1) MT
     JOIN dbo.OtherTable OT ON MT.ID = OT.ParentID

To those of us that know T-SQL, you can see multiple multiple problems here, however, the one I want to draw attention to is the use of the semicolon in the middle of the statement. A semicolon is a statement terminator, which means that in reality you have 2 statements there. One that ends with FROM ( and another that begins with WITH CTE AS. To those of us that know T-SQL well, we know that both ; and WITH (when declaring a CTE) don’t go in the middle of a statement, however, to those that are starting with the language, they don’t know that. Having statements that start with ;WITH can make them think that it’s not WITH that’s the keyword but that it’s ;WITH.

I’ve even seen someone (though this has only happened once) who had a query that looked like this:

SELECT *
FROM dbo.MyTable MT ;WITH (NOLOCK)
     JOIN dbo.OtherTable OT ;WITH (NOLOCK) ON OT.ID = MT.ID
WHERE MT.[Name] = 'Steve'

I appreciate that we all have preferences on how we format our SQL too, and that some prefer to start all their statements with a semicolon instead. Although this does work, to a degree, this will always leave your last statement un-terminated (meaning it could fail in the future) and you have a superfluous ; at the very start of your SQL (as there was no prior statement to be terminated). Taking the time to ensure you terminate all your statements now, at the end, is really important for future proofing, and to aid the parser when it needs it. Don’t wait until you have to, take the time to do it as you write your SQL. No, you probably don’t need to go through your entire application and correct everything to be properly terminated at the moment, but when you need to change something, make sure you add them then to the SQL you’re working with. Then, when eventually the “optionalness” of terminating your statements is removed, at least you’ve got some of the work already done.

Leave a Reply

Your email address will not be published. Required fields are marked *