[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 605
  • Last Modified:

Union query is very sloooooow

I have a union query.  It is based upon two queries, each brings fields from multiple tables, some of which are calculated.
One query it is based upon returns about 1500 records in a few seconds.
The other query returns about 33000 records in, say, 20 seconds.
The union (SELECT * FROM AuftragCostEigenes  UNION ALL SELECT * FROM AuftragCostAkkordant) takes over five minutes to complete.
There is no sorting in either base query.

As the total time for the individual queries is dramatically less than the union query does anyone have any ideas why / suggestions that could improve matters.
0
AndyAinscow
Asked:
AndyAinscow
  • 14
  • 9
  • 6
  • +2
1 Solution
 
GRayLCommented:
Try putting a sort in both base queries.  Just out of curiosity, remove the ALL and see what that does to the time.
0
 
harfangCommented:
[Ray, the order by clause of both queries is skipped in a UNION query. Also, removing the ALL can only make things worse (it will have compare all 34500 records to determine uniqueness).]


Hello AndyAinscow

I'm not going to ask *why* you need to combine 1500 records with 33000 others in a UNION, but I should as this seems a clear sign of bad design. However, when working with medium-sized to large recordsets, you will prefer temp tables:

    DROP TABLE zttblTempUnion;
    SELECT * INTO zttblTempUnion FROM AuftragCostEigenes;
    INSERT INTO zttblTempUnion SELECT * FROM AuftragCostAkkordant;
    CREATE INDEX ...?

Cheers!
(°v°)
0
 
GRayLCommented:
harfang:  Maybe partly so, but I get this from Help:

If you want to specify sorting in a union query, add a single ORDER BY clause to the end of the last SELECT statement. In the ORDER BY clause, specify the field name to sort, which must come from the first SELECT statement. For information on the ORDER BY clause, click .


0
Transaction-level recovery for Oracle database

Veeam Explore for Oracle delivers low RTOs and RPOs with agentless transaction log backup and transaction-level recovery of Oracle databases. You can restore the database to a precise point in time, even to a specific transaction.

 
AndyAinscowAuthor Commented:
OK.
Adding sort and removing the ALL make little difference, a couple of seconds here and there.

ALL - in any case that is REQUIRED else any duplicate records are suppressed by the UNION (I got burnt by that many years ago - I found out why the result was not correct by comparing every single record from the individual queries and the union).

<I'm not going to ask *why* you need to combine 1500 records with 33000 others in a UNION>
Actually the UNION query is in a stripped down version.  The end result is to calculate the cost of a 'piece of work'.  The work can combine materials purchased, machine hours, employee time (both own employees and contactors with different valuations to cover social and other costs....) and so on.
Maybe a different table structure would be better for this (but might be worse for other things) - BUT the table structure exists and I have to work with it.

I suspect having to fill a temporary table is a really bad option due to database bloating  (If I really do need to go that way, which I might, then I expect I would initially use a disconnected recordset).


What I am doing here is a worse case to test performance, in reality there would be a lot of pre-filtering.
0
 
AndyAinscowAuthor Commented:
To re-iterate.
I am really puzzled by the discrepancy in performance.  A few seconds plus a few more seconds for the individual queries wheras the union takes an order of magnitude longer.
0
 
GRayLCommented:
Have you tried compact and repair?  You may have a limiting memory size that will accomodate either of the two queries without paging to virtual memory on the disk, but with the union query, you may require that disk access which as we all know is somewhat slooowweerrr.  I have half a gig memory,and with a 6k records table and a 12k records table the union all query ran in about 4 seconds.
0
 
AndyAinscowAuthor Commented:
compact and repair - yes did that before I asked the question, both on BE and FE.

Memory - its running (both DB's local) on my laptop, Win XP with 256 MB.  According to task manager roughly half of memory is in use by OP system and running progs.


I'll be on line again tomorrow, its getting on for midnight here now.
0
 
AndyAinscowAuthor Commented:
ps.
Each record isn't that big, just a few numeric fields, a date and a small text string.
0
 
Scott McDaniel (Microsoft Access MVP - EE MVE )Infotrakker SoftwareCommented:
harfang is on the mark, and you are correct in regards to bloating. Often devs will use a temporary db for their temp tables, and perform a compact/repair on that db as needed.
0
 
harfangCommented:
AndyAinscow

The best answer I can provide is that the Union basically prevent the use of any indexes. Being faced with a union query would be the human equivalent of getting two address books and being asked: "Please combine all Swiss addresses from the first with all Company addressed from the second!" Each of the individual queries might be doable, thanks to proper indexes and flags, but you will not be able to just use pointers to the information, you will need to copy everything down.

That is what's happening. Instead of a dynamic keyset, you get a snapshot. Worse, you get loosly typed fields, and all records will have to be checked for possible conversion (because Field1 in the first query is a 10-character text field, but in the second, it's a date field, for instance).

So Jet will execute each query speedily, but it will also have to parse all records, copy the fields, do the conversions, check for errors, and decide what to show you first. All that whithout indexes.

In essence, I'm convinced that a Union query is not the proper tool in your case. You should perform the price calculations for each different cost center, and add them up only in the end. If this is for a report, you should use subqueries for  each cost center, and not one gigantic query for all details of different nature. Just running that union query is sloooow. Imagine what it will be like to create groups and subgroups on it (without indexes)...

If you really need it for some reason, use a temp table, even in a temp db, which you can delete and recreate at each startup if you like. But you should probably find a way to avoid it alltogether.

(°v°)
0
 
GRayLCommented:
C'mon Markus:  (because Field1 in the first query is a 10-character text field, but in the second, it's a date field, for instance)  -  you of all people should know that the only type conversion allowed in UNION queries is text to number and vice versa.  Your example, according to MS Help would not even run.  Must be getting late over there!
0
 
harfangCommented:
Huh? If you mean "allowed if you want a meaningful result", I agree, but the mechanism is simply this: Jet will cast most mixed type column to string – even just date and number. This works in Northwind.mdb:

    SELECT Discontinued From Products
    UNION SELECT OrderDate FROM Orders
    UNION SELECT CompanyName FROM Customers

Mixing a boolean, a date and text is silly, of course, but it's still the type of things that Jet has to manage in Union queries... I believe the only rule is that the number of columns must match.

Cheers!
(°v°)
0
 
AndyAinscowAuthor Commented:
Harfang - what you say makes sense re the indexs.
I always make certain fields are of correct types and in the same order for UNION queries.
Dynamic vs Snaphot sets - no problem, it is for a report so a snapshot is suitable.

I don't think using a temp table (DB) is suitable (write to disc) I may well have to use a disconnected set (or other in memory collection).


I'l look into the subquery / subreport.  It may be suitable for this particular case.


ps.  We are each by a big lake at opposite ends of the same country.
0
 
harfangCommented:
Andy, I had thought Germany. Didn't look up your profile... Is business good on your side?
(°v°)
0
 
puppydogbuddyCommented:
Andy,
Based on the following excerpt from this link and other links, there are few things you can try that might improve your union query performance.
         http://archive.baarns.com/access/faq/ad_quer.asp#12

The purpose of a UNION is to take several dissimilar tables, or query results, and create a single unified result. To perform this step, SQL needs to know how you want the result table to look. You use the first SELECT fields list for this determination. Therefore, the fields list in the first SELECT must allow for all the fields that will be needed, including those fields that are not in the first statement but will appear later.

1. try placing the longest most complicated select statement as the first select statement.
2. consider enumerating your field lists instead of using wildcards.
3. eliminate columns with aggregate functions (sum, count, etc.) unless they are absolutely needed in the final result....perhaps you could leave the calcs/aggregates out of the union query, but include the calcs/aggregates as part of a select query that uses the union query as a "raw data" source.
0
 
harfangCommented:
PDB,

I would suggest the exact opposite of your point 3. The UNION should come at the latest possible moment, meaning that all aggregate functions and groupings should be made in the individual queries. Using aggregate functions based on a union query is particularily bad if you have groupings, because no indexes are available.

Just my opinion, of course,
(°v°)
0
 
puppydogbuddyCommented:
Harfang,
You are probably correct (as usual), but we will not know for sure unless Andy gives it a try.

                                   PDB
0
 
harfangCommented:
PDB, thanks ;) and yes, of course, each situation is different. Cheers!
(°v°)
0
 
AndyAinscowAuthor Commented:
<Is business good on your side?>
Can't complain but always interested in possible extra clients.


PDB.
1)  Next to no difference noted, I tried that earlier just didn't report the results here.
2)  I'll give it a try but I don't expect any benefit (maybe I'll be pleasantly suprised)
3)  I agree with harfang re the aggragate/union order of action, I would be astonished if that made an improvement.
0
 
AndyAinscowAuthor Commented:
OK, tried the SELECT a, b, c instead of SELECT * - no improvement.
0
 
AndyAinscowAuthor Commented:
I've just tried creating a new DB and importing the tables from the BE and the queries from the FE.
The performance is still catastrophic.
0
 
AndyAinscowAuthor Commented:
PDB
I've just checked your link out - no new info (for me) that I can see there.


I think I'm going to have to do something in memory with individual queries rather than the union.
0
 
AndyAinscowAuthor Commented:
I've had another look at the tables the queries are based upon.  Adding some more indexs might help speed things up (not all fields it filters on are indexed).
0
 
GRayLCommented:
Markus:  MS Access Help says:

Note   Each SELECT statement must return the same number of fields, in the same order. The corresponding fields are required to have compatible data types with one exception: You can use a Number and a Text field as corresponding fields.

AndyAinscow:  I'm still suspicious of memory.  Try cutting the larger table in two and running the UNION.  Eventually you should get to a point where there will be a remarkable difference in execution speed.  The OS can be quite a memory hog, and all the TSR programs we have in the start menu these days blows me away.  
0
 
harfangCommented:
Ray,

This might be a vesion issue. A2k, Jet SQL 4.0, DAO 3.6 and ADO 2.1 all state:
All queries in a UNION operation must request the same number of fields; however, the fields do not have to be of the same size or data type.

Which version's help file were you referring to, and did you try if it's enforced?
(°v°)
0
 
harfangCommented:
Hello AndyAinscow

If you want to play with temp tables, you can use this module to create and manage a linked temp database. This avoids bloating (most of it anyway). I created the module a few months back and cleaned it up for publishing here.

Have fun!

'-------------------------------------------------------------------------------
' Class Module  claTempDB
' Author        Markus G Fischer, aka harfang (°v°)
' Created       September 2005
' Updated       May 2006, MGF: cleanup, comments
' Copyright     Markus G Fischer, Geneva 2005-2006
' Distribution  free if proper credit is given
'-------------------------------------------------------------------------------
'
' Purpose
' ¯¯¯¯¯¯¯
' Implement a local temp database for use with temp tables. This prevents
' "bloating" of the front-end while providing all the benefits of locally
' stored temp tables.
'
'
' Implementation
' ¯¯¯¯¯¯¯¯¯¯¯¯¯¯
' The temp database is created at each startup in the users temp folder, as
' returned by Environ("TEMP"), using the same name as the current database,
' with two leading tilde, e.g. "~~MyFE.mdb".
'
'
' Usage
' ¯¯¯¯¯
'
' • Create a global variable for the class module:
'   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
'   Global Temp As claTempDB
'
'
' • Create temporary tables with MakeTable()
'   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
'   Temp.MakeTable(TableName, Query [,Index [,Index ...]])
'
'   TableName:  name of the temp table, e.g. "zttblAllDetails"
'   Query:      SQL string, name of a query, or name of a table
'   Index...:   optional name(s) of fields to index
'
'   Creates:    the table in the temp database, and
'               a linked table in the current database
'   Returns:    the number of records transfered, or -1 on failure
'
'   WARNING:    if the table exists, it is overwritten!
'
'   Example:
'
'       strSQL _
'           = " SELECT CompanyName, CustomerID As ID FROM Customers" _
'           & " UNION SELECT CompanyName, SupplierID FROM Suppliers"
'
'       If Temp.MakeTable("zttblAll", strSQL, "CompanyName", "ID") > 0 Then
'
'           Temp.DB.TableDefs("zttblAll").Fields("ID").Required = True
'           DoCmd.OpenTable "zttblAll"
'
'       End If
'
' • Delete them with DropTable()
'   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
'   Temp.DropTable(TableName)
'
'   TableName:  name of the temp table to delete
'
'   Deletes:    the table in both the temp and the current databases
'   Returns:    True on success or table not found, False on error
'
'
' • Access the temp database with Temp.DB
'   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
'   Any more complex operations can be executed through either the linked
'   tables in the current database, or in the temp database, using the DB
'   property. For example:
'
'   Debug.Print Temp.DB.Name, Temp.DB.TableDefs.Count
'   Set recTable = Temp.DB.OpenRecordset("zttblAny", dbOpenTable)
'
'
' • Delete the temp database by releasing the global variable
'   ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
'   Set Temp = Nothing
'
'
' Disclaimer
' ¯¯¯¯¯¯¯¯¯¯
' The method DropTable(), called by MakeTable(), actually deletes the named
' table in the arguments. You should therefore use a very strict naming
' convention, for example by using the suffix "zttbl" – meaning non-data (z)
' temporary (t) table (tbl) – for all temporary tables.
'

Option Compare Database
Option Explicit

Dim mstrFileName    As String           ' path and name of temp database
Dim mdbTemp         As DAO.Database     ' open database object

Property Get DB() As DAO.Database
    Set DB = mdbTemp
End Property

Private Function Drop(pdb As Database, pstrTable As String) As Boolean
'
' Drop a table from a database, does not fail if it did not exist.
'
On Error Resume Next

    pdb.Execute "DROP TABLE " & pstrTable
    If Err.Number = 3376 Then Err.Clear   ' does not exist
    Drop = (Err.Number = 0)
   
End Function

Public Function DropTable(TableName As String) As Boolean
'
' Drops a table from both the temp and the current database
' (see also description above)
'
    If Drop(CurrentDb, TableName) Then
        If Drop(mdbTemp, TableName) Then
            DropTable = True
            Exit Function
        End If
    End If
    If Err Then MsgBox Err.Description, vbExclamation, "DropTable()": Err.Clear
   
End Function

Public Function MakeTable( _
    ByVal TableName As String, _
    ByVal Query As String, _
    ParamArray IndexFields() _
    ) As Long
'
' Creates a temp table in the temp database, a linked table in the current
' database, and basic indexed on selected fields.
' (see also description above)
'
    Dim tdf As TableDef
    Dim intF As Integer
   
On Error GoTo Failure
   
    ' [Table Name] brackets
    If InStr(TableName, " ") And Left(Trim(TableName), 1) <> "[" Then
        TableName = "[" & TableName & "]"
    End If
   
    ' Subquery for SQL strings, and brackets for [Query names]
    If InStr(Query, "SELECT ") Then
        Query = "(" & Query & ") As TMP"
    ElseIf InStr(Query, " ") And Left(Trim(Query), 1) <> "[" Then
        Query = "[" & Query & "]"
    End If
   
    ' [Index Field] brackets
    For intF = 0 To UBound(IndexFields)
        If InStr(IndexFields(intF), " ") _
        And Left(Trim(IndexFields(intF)), 1) <> "[" Then
            IndexFields(intF) = "[" & IndexFields(intF) & "]"
        End If
    Next intF
   
    ' build full make-table query
    Query _
        = " SELECT * INTO " & TableName _
        & " IN '" & mstrFileName & "'" _
        & " FROM " & Query
   
    ' drop existing tables
    If Not DropTable(TableName) Then GoTo Failure
   
    With CurrentDb
       
        ' make the temp table
        .Execute Query
        MakeTable = .RecordsAffected
       
        ' and the linked table
        Set tdf = .CreateTableDef(TableName)
        tdf.Connect = ";DATABASE=" & mstrFileName
        tdf.SourceTableName = TableName
        .TableDefs.Append tdf
       
    End With
   
    ' index all selected fields
    For intF = 0 To UBound(IndexFields)
        Query _
            = " CREATE INDEX " & IndexFields(intF) _
            & " ON " & TableName _
            & " (" & IndexFields(intF) & ")"
        Debug.Print Query
        mdbTemp.Execute Query
    Next intF
       
    Exit Function
   
Failure:
    If Err Then MsgBox Err.Description, vbExclamation, "MakeTable()": Err.Clear
    MakeTable = -1
   
End Function

Private Sub Class_Initialize()
'
' Generate the temp database name and create, overwriting if needed.
'
    mstrFileName = CurrentDb.Name
    mstrFileName = Mid(mstrFileName, InStrRev(mstrFileName, "\") + 1)
    mstrFileName = Environ("TEMP") & "\~~" & mstrFileName
    If Len(Dir$(mstrFileName)) Then Kill mstrFileName
    Set mdbTemp = DBEngine.CreateDatabase(mstrFileName, dbLangGeneral)
   
End Sub

Private Sub Class_Terminate()
'
' Close the temp database and delete it.
'
On Error Resume Next

    If Not mdbTemp Is Nothing Then
        mdbTemp.Close
        Set mdbTemp = Nothing
    End If
    If Len(Dir$(mstrFileName)) Then Kill mstrFileName
   
End Sub
'-------------------------------------------------------------------------------


You could also add this to a normal module (the above  is a class module):

Global Temp As New claTempDB

Function TempCleanup()

    Dim intT As Integer
    Dim strTable As String

    Set Temp = Nothing
    With CurrentDb
        For intT = .TableDefs.Count - 1 To 0 Step -1
            strTable = .TableDefs(intT).Name
            If Left(strTable, 5) = "zttbl" Then _
                .TableDefs.Delete strTable
        Next intT
    End With
   
End Function


(°v°)
0
 
AndyAinscowAuthor Commented:
harfang - interesting piece of code.

(For this case of little use - only the data/queries is in access DBs.  The user interface is coded in C++.  Don't bother trying to convert it, I already have C++ code to cope with creating/modifying/linking tables and DBs.)


What I have noticed is something I can't explain.
In the test UNION query and its two base queries.
Each query on its own runs rapidly but in the UNION the performance is terrible.
Now I have had a look at the tables and added an index to one field that one query was filtering on.
No noticable difference to the individual queries BUT the union query is dramatically faster (not really acceptable but a lot better).

The only explanation I can come up with is that the jet engine isn't running each query individually and filling a 'pipe' filled with the output (which was how I viewed it).
0
 
harfangCommented:
Andy,

I think you are right. The default mechanism of the UNION query involves removing duplicates, so that Jet could very well run all queries simultaneously. The modifier ALL turns the removal off, but perhaps not the central mechanism.

If you are serious about improving your query's performance, check out these links:

Information about query performance in an Access database
http://support.microsoft.com/?kbid=209126

Use Microsoft Jet's ShowPlan to write more efficient queries
http://builder.com.com/5102-6388-5064388.html

The ShowPlan feature might give you the answer to your last question...

Happy hacking!
(°v°)
0
 
AndyAinscowAuthor Commented:
First link.
This looks as if it could be relevant.
Avoid calculated fields in subqueries. If you add a query that contains a calculated field to another query, the expression in the calculated field may slow performance in the top-level query.
0
 
AndyAinscowAuthor Commented:
Second link.
Very interesting.  (Why can't M$ have tools like this directly available from the access interface, rather than having to set registry keys...)
The info in the link would be better if the listings it refers to where actually there.  I'll have to study that more closely.
0
 
GRayLCommented:
Markus:  A2K with SP3.  To be honest, I did not test the 'allegation'.
0
 
AndyAinscowAuthor Commented:
Thanks to everyone.

I have made collections in memory to store and manipulate the results of individual queries to provide the information I require.  It takes a few seconds to run which is acceptable given what it is doing.
0

Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

  • 14
  • 9
  • 6
  • +2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now