Solved

# tSQL reporting problem (grouping vs detailed)

Posted on 2008-06-10
360 Views
I have 2 reports that I run on a daily basis.  One is a detailed report of 'self trades'.  The other is a summation of the same 'self trades'.  so if I have 100 self trades for this trader on this endpoint and symbol in the detailed, I would have one record in the summation with a total selftrade count of 100.  The detailed rpt runs, loading current day to working tables, then the summation runs off the same working tables.  It has come to my attention that the detailed and the summation are not equating.
Here is a specific example:

from the summation:
CBOT   abc        ZTU8      5738     1782        99      50       0.017253398     0.028058361
CBOT   def        ZTU8       6216     2166       58       38       0.009330759     0.01754386

yet in the detailed report, there are 19 trades for trader abc with a volume of 63, and 15 trades for the trader def, with a volume of 32.  The two together, a total volume of 95.  see the selftrade count and volume for each id --
the summation:
count      volume
abc     50           99
def     38           58

the detailed
count      volume
abc      19          63
def       15          32

- my summation says i have 50 trades by abc on that symbol and endpoint.   my detailed has only 19.
- my summation says i have 38 trades by def on that symbol and endpoint.    my detailed has only 15.

this is my detailed logic:
CREATE proc detailedProc
AS
SET NOCOUNT ON

INSERT INTO workingDB.dbo.epSelfTrade SELECT DISTINCT EndPoint FROM SERVER1.database1.dbo.EPGroups WHERE EndPoint NOT IN (
'....a list of endpoints we ignore for this report....')

inserttime,source
from database.dbo.ordertable a with (nolock)
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = a.endpoint)
AND symbol NOT LIKE '%-%'
AND NOT EXISTS (select 1 orderno from database.dbo.ordertable b with (nolock) where a.origorderno = b.origorderno
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = b.endpoint)
AND symbol LIKE '%-%'   )
inserttime,source
FROM database.dbo.ordertable t1 WITH (NOLOCK)
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = t1.endpoint)
AND symbol LIKE '%-%'

SELECT DISTINCT b.endpoint,b.trader AS Btrader, b.acct AS BAcctID,b.Client AS BClientID, b.OrigOrderNo AS BOrigOrderNo,s.trader AS Strader,s.acct AS SAcctID,s.Client AS SClientID, s.OrigOrderNo AS SOrigOrderNo,b.Symbol, b.Quantity, b.Price,
INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
ON s.endpoint = b.endpoint
AND s.Quantity = b.Quantity
AND s.Price = b.Price
AND s.side = 'S'
AND b.side = 'B'

SET NOCOUNT OFF
GO

--summation procedure
CREATE proc dbo.summationProc
AS

SET NOCOUNT ON

FROM (
FROM database.dbo.ordertable a WITH (NOLOCK)
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = a.EndPoint)
GROUP BY a.EndPoint,a.Trader,a.Symbol )  AS q1
INNER JOIN
INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
ON s.EndPoint = b.EndPoint
AND s.Quantity = b.Quantity
AND s.Price = b.Price
AND s.Side = 'S'
AND b.Side = 'B'
GROUP BY b.EndPoint,b.Trader,b.Symbol ) AS q2
AND q1.symbol = q2.symbol

SET NOCOUNT OFF
GO

I know it's kind of lengthy.  I apologize for that.  Does anybody see what may be causing this variance?
0
Question by:dbaSQL
[X]
###### Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points
• 15
• 11

LVL 17

Author Comment

ID: 21751091
Any input at all is very much appreciated.
0

LVL 8

Expert Comment

ID: 21751758
Yeah it is a bit lengthy. Can you publish some create tables and insert statements to run your procedures? How are you comparing two results?
0

LVL 17

Author Comment

ID: 21752188
the reports are sent out via email, as .csv's.  the comparison is simply walking thru one, then looking at the other to ensure it all adds up.  i've got both reports in front of me now, the example i gave is legit.  it doesn't add up.
0

LVL 8

Expert Comment

ID: 21753753
Filter on SelfTrades table is the same in both cases - so it is not the problem.

The detailed SQL is using distinct clause. Why? Can you run the query without DISTINCT?
0

LVL 17

Author Comment

ID: 21754198
yes, let me try this
0

LVL 17

Author Comment

ID: 21754439
well, without the distinct, i may be getting the data, BUT now i'm getting a lot of dupes in the detailed report, so it's hard to tell.

for example, this one is in there 12 times:

CBOT abc  8135  zt-01 05Mtgprap  abc 8135  zt-01  05Mtgprat  ZTU8  1  105.64  06/05/2008 10:24AM

this one is in there twice:
CBOT  jkl  8135  zn-02  05Mtgv2js  jkl      8135  zn-03      05Mth8ykt  ZNU8  1  113.17  06/05/2008 11:01AM

that's not going to work

any thoughts what may be doing this?  no changes other than commenting out DISTINCT

0

LVL 8

Expert Comment

ID: 21754653
You hit the problem. Your summary query does "group by" and publishes aggregation of details. I assume you need another aggregation but on the more detailed level then your summary.

You need to rewrite the detailed query to have sum or avg on quantity, price and group by on endPoint, Btrader etc.
0

LVL 17

Author Comment

ID: 21758624
I'm not sure I follow what you are suggesting, srnar.
0

LVL 8

Expert Comment

ID: 21758693
Sorry for confusing. I'm not able to help you more with the problem without looking into data. I try to find some more time and generate some fake data.

Otherwise we are only spending our time.
0

LVL 17

Author Comment

ID: 21758703
not confusing, really, i'm just not certain what you are suggesting.  i'd be happy to show you some data, i'm just not certain how to do so.  i don't think a cut/paste of a ton of records (detailed/summary) will go well here.
0

LVL 8

Expert Comment

ID: 21759160
3 things will be welcome:

1. create script of SelfTrades table -  can be scripted from Management studio
2. create script of ordertable -  can be also scripted from Management studio
3. select top 100 * from ordertable - can be exported into Excel via clipboard from Management studio
0

LVL 17

Author Comment

ID: 21759201
yes, srnar, i know how to script the tables.  and, i fully intended to do so, along with a capture of both the detailed data and summary data, on a specific trader/endpoint/symbol.  thus, we've got one example to work with in both detailed and summary

but.  i'd like to avoid posting it up here, if possible.

i can genericize the table def, and attach those, but what of the data?
0

LVL 8

Expert Comment

ID: 21759410
Sorry to underestimate you, dbaSQL. It is 100 records - you can change the sensitive data (names, quantities) to stay anonymous - can be done in 10 minutes. You can send it to my mail srnar at volny.cz
0

LVL 17

Author Comment

ID: 21759771
oh no, you didn't underestimate.  i just wanted you to know i was pretty comfy w/the table def and data and alll, i just didn't want to post up here.  doing this now.
0

LVL 8

Accepted Solution

srnar earned 500 total points
ID: 21786979
If you want to compare Quantity and Trade count from detail to summary you have to replace DISTINCT with aggregations - see 2 new columns newQuantity, newTrades and the long GROUP BY member set (GROUP BY b.endpoint,b.trader, b.acct,b.Client, b.OrigOrderNo,s.trader,s.acct,s.Client, s.OrigOrderNo,b.Symbol, b.Price, b.tradetime).

I hope it will be running - your samples were still far from perfect - so I had to heavily change a lot of code and data. Described change will give you when comparing same numbers - but I do not know whether they are correct - the ordertable has no natural key. Also joining on quantity or price columns is not good practice.

Also usage of identity and another column like a primary key is a bad design pattern - see this constraint:

CONSTRAINT [clidx_OrderTable_sequencenumber_timeofexec] UNIQUE  CLUSTERED
(
[SequenceNumber],

Yet I hope our effort was not useless. Enjoy.
``````SELECT b.endpoint,b.trader AS Btrader, b.acct AS BAcctID,b.Client AS BClientID, b.OrigOrderNo AS BOrigOrderNo,s.trader AS Strader,s.acct AS SAcctID,s.Client AS SClientID, s.OrigOrderNo AS SOrigOrderNo,b.Symbol, b.Price,
--new
SUM(b.Quantity) as newQuantity,
INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
ON s.endpoint = b.endpoint
AND s.Quantity = b.Quantity
AND s.Price = b.Price
AND s.side = 'S'
AND b.side = 'B'
--new
--existing
``````
0

LVL 17

Author Comment

ID: 21788448
>>Also joining on quantity or price columns is not good practice.

This particular report is of 'selftrades', which are not the norm, or standard, trading practice.  It must be joined on quantity and price in order to identify the trades we are watching for.
0

LVL 17

Author Comment

ID: 21804035
still having problems w/this.  are you avail?
0

LVL 8

Expert Comment

ID: 21809924
Not returning good data?
0

LVL 17

Author Comment

ID: 21811645
well, i'm not sure.  they are questioning this:
SUM(b.Quantity) as newQuantity,

and as i go in to check the data, i've got to question it, too. given that it is in the detailed report, not the summation, is this actually just a count of the selftrades?
0

LVL 8

Expert Comment

ID: 21811705
Yes -  newTrades = count of trades that have this same values:

which is more detailed than set (b.EndPoint,b.Trader,b.Symbol) used in your sum proc
0

LVL 17

Author Comment

ID: 21811754
>>which is more detailed than set (b.EndPoint,b.Trader,b.Symbol) used in your sum proc
can you elaborate for me, please, srnar?
0

LVL 8

Expert Comment

ID: 21811903
see 5th row of the following statement

FROM database.dbo.ordertable a WITH (NOLOCK)
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = a.EndPoint)
GROUP BY a.EndPoint,a.Trader,a.Symbol )  AS q1
INNER JOIN
INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
ON s.EndPoint = b.EndPoint
AND s.Quantity = b.Quantity
AND s.Price = b.Price
AND s.Side = 'S'
AND b.Side = 'B'
GROUP BY b.EndPoint,b.Trader,b.Symbol ) AS q2
AND q1.symbol = q2.symbol
0

LVL 17

Author Comment

ID: 21812275
yes, i can see the differences between the two, i'm just looking for a better understanding of the change
0

LVL 17

Author Comment

ID: 21865050
I still don't have a thorough understanding of the change.  Please advise if you are able.
0

LVL 8

Expert Comment

ID: 21865316
dbaSQL, simply to avoid duplicity you have to group rows on your detail report - values than can be aggregated (used SUM, COUNT function) -

SUM(b.Quantity) as newQuantity,

for more you can visit:
http://msdn.microsoft.com/en-us/library/ms173454.aspx
0

LVL 17

Author Comment

ID: 21865418
Thank you.
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
This article shows gives you an overview on SQL Server 2016 row level security. You will also get to know the usages of row-level-security and how it works
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Acâ€¦
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
###### Suggested Courses
Course of the Month5 days, 9 hours left to enroll