Union All Handling in SQL Server

Ryan Chong
CERTIFIED EXPERT
(NIV) Hebrews 10:35 So do not throw away your confidence; it will be richly rewarded.
Published:
Edited by: Andrew Leniart
The UNION ALL (or UNION) operator is used to combine different result sets. This article focuses on handling the UNION ALL operator in a better way.
The UNION ALL (or UNION) operator is used to combine different result sets.

An example of UNION ALL could be as simple as shown below:

SELECT col1 FROM table1
UNION ALL
SELECT col1 FROM table2

If you're not familiar with the UNION ALL (or UNION) operator, you could refer to the explanation here.

I'm going to demonstrate the following works in MS SQL Server 2017. Some of the scripts showing could be compatible but if it's not, I will demonstrate compatible scripts separately.

Prepare the data


You can refer to my previous articles on how to load data into SQL Server as we will be using the same tables for demonstration.


For this article, we will be using both of the tables listed below:

  • TableExcel (previously named as "Sheet1$" in the article)
  • TableJSON

Work it out!


First, let's try to do a basic UNION ALL for these 2 tables:

SELECT * FROM TableExcel
UNION ALL
SELECT * FROM TableJSON

This is something very basic and you should see 50 records being returned.


But if you have noticed it when you ordering the records, there were "duplicate" records being found. This can be identified by simply sorting the columns:

SELECT * FROM TableExcel
UNION ALL
SELECT * FROM TableJSON
ORDER BY [Dates], [Type], [Code]


And in fact, this is because both table's records are identical, which means one record was actually from table: TableExcel, and another one from table: TableJSON.

Indication of the source data


To differentiate the records, we could add in one more column into each table and set its flag accordingly. For example, we will create a new column called as "Flag", and set it as "Excel" for the table: TableExcel, and set it as "JSON" for the table: TableJSON.

We could achieve that by running the following scripts:

ALTER TABLE TableExcel ADD Flag VARCHAR(10);
GO

ALTER TABLE TableJSON ADD Flag VARCHAR(10);
GO

UPDATE TableExcel SET Flag = 'Excel';

UPDATE TableJSON SET Flag = 'JSON';

Once this is done, you can then run the script below and the "duplicate" records will no longer exist:

SELECT * FROM TableExcel
UNION ALL
SELECT * FROM TableJSON
ORDER BY [Dates], [Type], [Code], [Flag]


So far so good?


The examples above demonstrate the basic way of handling data using the UNION ALL. But now I would like to further show the handling for the issues below:

a) If the tables have different number of columns

Let's say table: TableExcel got an extra column: ExtraCol, which is not available in table: TableJSON. In that case, we are combining the data from these two tables, so you would probably see an error as shown below:


To resolve this:

i) If this is an extra column that is necessary to be selected

Depends on the requirement, we could simply define an expression, such as a NULL value for column ExtraCol for those tables that does not have this column. For example, we would like to select the Dates, Type, Code, Amount, ExtraCol and Flag from the tables.

SELECT * FROM TableExcel
UNION ALL
SELECT *, NULL ExtraCol FROM TableJSON
ORDER BY [Dates], [Type], [Code], [Flag]


ii) If this is an extra column that is not necessary to be selected

We could simply remove it by specifying the columns in the Select query. For example, we would like to select the Dates, Type, Code, Amount and Flag from the tables.

SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel
UNION ALL
SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON
ORDER BY [Dates], [Type], [Code], [Flag]


b) What if we need to combine the data from multiple sources?

This would be simple by repeating the UNION ALL operator to another source. Assuming we have a third table with the exact table structure: TableOther, then we could use the following scripts:

SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel
UNION ALL
SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON
UNION ALL
SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableOther
ORDER BY [Dates], [Type], [Code], [Flag]


c) If the data type of one source table is different

The expressions (or what we usually called as fields or columns) to be UNION ALL (or UNION) must have the same data type, else you could face an error when you try to execute it.

SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel
UNION ALL
SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON
UNION ALL
SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableInvalid
ORDER BY [Dates], [Type], [Code], [Flag]


Let's say we got an invalid entry for the column: Dates in table: TableInvalid, which apparently the field's data type is VARCHAR(50) and an invalid date was being saved.


To resolve the issue:

Make sure the expressions to be UNION ALL are all convertible into the same data type in which some casting or comparison of value at row level could be necessary. For example, TRY_CAST function could be used to correct the invalid value.

SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel
UNION ALL
SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON
UNION ALL
SELECT TRY_CAST([Dates] AS DATETIME), [Type], [Code], [Amount], [Flag] FROM TableInvalid
ORDER BY [Dates], [Type], [Code], [Flag]


Similar conditions applied to the cases where there are multiple tables that come with different data types, the casting technique could be used.

d) Summation or grouping of selected records from UNION ALL result

This can be done easily and treat it as a normal source table for the summation or grouping. For example, we could create a Common Table Expression for illustration:

;WITH cte AS
(
    SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel
    UNION ALL
    SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON
    UNION ALL
    SELECT TRY_CAST([Dates] AS DATETIME), [Type], [Code], [Amount], [Flag] FROM TableInvalid
)
SELECT [Dates], [Type], SUM([Amount]) Amount
FROM cte
GROUP BY [Dates], [Type]
ORDER BY [Dates], [Type]


e) Verify the table and make sure it exists before running the query - with the Temporary Table approach

If we are trying to refer to an invalid object in the UNION ALL operator, whether it's wrongly named or does not exist, it would probably return with the following error:

;WITH cte AS
(
    SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel
    UNION ALL
    SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON
    UNION ALL
    SELECT TRY_CAST([Dates] AS DATETIME), [Type], [Code], Amount, [Flag] FROM TableInvalid
    UNION ALL
    SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableDeleted
)
SELECT [Dates], [Type], SUM([Amount]) Amount
FROM cte
GROUP BY [Dates], [Type]
ORDER BY [Dates], [Type]


The easy way is probably to verify the OBJECT_ID of the table, and if it's returning as NULL means the object does not exist. But since the verification needs to be done at the individual database object, the existing script needs to be amended.

In a context that UNION ALL not going to be used any longer, the solution would be to append the valid table's records into a temporary table.

DECLARE @sql VARCHAR(MAX)

DROP TABLE IF EXISTS #TableOutput

CREATE TABLE #TableOutput
(
    [Dates] [datetime] NULL,
    [Type] [varchar](1) NULL,
    [Code] [varchar](20) NULL,
    [Price] [money] NULL,
    [Unit] [int] NULL,
    [SubTotal] [money] NULL,
    [Brokeage Rate] [money] NULL,
    [Tax] [money] NULL,
    [Stamp Duty] [money] NULL,
    [Clearing Fee] [money] NULL,
    [Amount] [money] NULL,
    [Svc Cost] [money] NULL,
    [Flag] [varchar](10) NULL
)

IF OBJECT_ID('TableExcel', 'U') IS NOT NULL
    BEGIN
        SET @sql = 'INSERT INTO #TableOutput
                    SELECT * FROM TableExcel'
        EXEC(@sql)
    END

IF OBJECT_ID('TableJSON', 'U') IS NOT NULL
    BEGIN
        SET @sql = 'INSERT INTO #TableOutput
                    SELECT * FROM TableJSON'
        EXEC(@sql)
    END

IF OBJECT_ID('TableInvalid', 'U') IS NOT NULL
    BEGIN
        SET @sql = 'INSERT INTO #TableOutput
                    SELECT TRY_CAST([Dates] AS DATETIME), [Type], [Code],
                    [Price], [Unit], [SubTotal], [Brokeage Rate], [Tax],
                    [Stamp Duty], [Clearing Fee], [Amount], [Svc Cost], [Flag]
                    FROM TableInvalid'
        EXEC(@sql)
    END

IF OBJECT_ID('TableDeleted', 'U') IS NOT NULL
    BEGIN
        SET @sql = 'INSERT INTO #TableOutput
                    SELECT * FROM TableDeleted'
        EXEC(@sql)
    END

SELECT [Dates], [Type], SUM([Amount]) Amount
FROM #TableOutput
GROUP BY [Dates], [Type]
ORDER BY [Dates], [Type]

By running the script above, you should able to get the same result as what we have been executed in d) Summation or grouping of selected records from UNION ALL operator.


Do note the different script has been applied for UNION ALL TableInvalid to handle the invalid value of different data type.

f) Verify the table and make sure it exists before running the query - with Dynamic SQL string approach 

Another possible way I think could be more direct is to generate the dynamic SQL string and execute it. This could be done by using the following script:

DECLARE @sql VARCHAR(MAX) = ''

IF OBJECT_ID('TableExcel', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel'

IF OBJECT_ID('TableJSON', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON'

IF OBJECT_ID('TableInvalid', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT TRY_CAST([Dates] AS DATETIME), [Type], [Code], [Amount], [Flag] FROM TableInvalid'

IF OBJECT_ID('TableDeleted', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableDeleted'

SET @sql = '
;WITH cte AS
(
    ' + @sql + '
)
SELECT [Dates], [Type], SUM([Amount]) Amount
FROM cte
GROUP BY [Dates], [Type]
ORDER BY [Dates], [Type]
'
EXEC(@sql)

By running the script above, you should able to get the same result as what we have been executing in d) Summation or grouping of selected records from UNION ALL operator.


Again, do note a different script has been applied for UNION ALL TableInvalid to handle the invalid value of different data type.

In addition, you could see we are using CHAR(13) in the script, which is actually adding the carriage return to the output string. This could be helpful especially when debugging is necessary.

g) What if the source data has come from View and not Table?

Well, this is also a straight forward change for us, where we just need to change a bit the script of OBJECT_ID validation to as follows:

IF OBJECT_ID('ViewExcel', 'V') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM ViewExcel'

Do note the change of the second parameter for the OBJECT_ID function that passing in. For the full list of references, you can refer here.

h) Better handling and report an error if a Table (or View) does not exist

For this approach, we could probably re-use the script of OBJECT_ID function with a bit of amendment:

DECLARE @sql VARCHAR(MAX) = ''
DECLARE @msg VARCHAR(MAX) = ''

IF OBJECT_ID('TableExcel', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableExcel'
ELSE
    SET @msg = @msg + CASE WHEN @msg <> '' THEN CHAR(13) + '' ELSE '' END + 'Invalid object name: TableExcel'

IF OBJECT_ID('TableJSON', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableJSON'
ELSE
    SET @msg = @msg + CASE WHEN @msg <> '' THEN CHAR(13) + '' ELSE '' END + 'Invalid object name: TableJSON'

IF OBJECT_ID('TableInvalid', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT TRY_CAST([Dates] AS DATETIME), [Type], [Code], [Amount], [Flag] FROM TableInvalid'
ELSE
    SET @msg = @msg + CASE WHEN @msg <> '' THEN CHAR(13) + '' ELSE '' END + 'Invalid object name: TableInvalid'

IF OBJECT_ID('TableDeleted', 'U') IS NOT NULL
    SET @sql = @sql + CASE WHEN @sql <> '' THEN CHAR(13) + ' UNION ALL ' + CHAR(13) ELSE '' END + 'SELECT [Dates], [Type], [Code], [Amount], [Flag] FROM TableDeleted'
ELSE
    SET @msg = @msg + CASE WHEN @msg <> '' THEN CHAR(13) + '' ELSE '' END + 'Invalid object name: TableDeleted'

IF @msg = ''
    SET @sql = '
    ;WITH cte AS
    (
        ' + @sql + '
    )
    SELECT [Dates], [Type], SUM([Amount]) Amount
    FROM cte
    GROUP BY [Dates], [Type]
    ORDER BY [Dates], [Type]
    '
ELSE
    SET @sql = 'SELECT ''' + @msg + ''' Error '

EXEC(@sql)

So, instead of getting the result, you show see the result below being return since we are UNION ALL with an invalid table: TableDeleted




As a summary, I would like to highlight some of the major points when we are using UNION ALL (or UNION clause) in SQL:

  • All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists
  • Make sure the expressions to be UNION ALL are all convertible into the same data type in which some casting or comparison of value at row level could be necessary

Finally, I hope you enjoyed this article!
1
136 Views
Ryan Chong
CERTIFIED EXPERT
(NIV) Hebrews 10:35 So do not throw away your confidence; it will be richly rewarded.

Comments (0)

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.

Get access with a 7-day free trial.
You Belong in the World's Smartest IT Community