Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 368
  • Last Modified:

tSQL reporting problem (grouping vs detailed)

I have 2 reports that I run on a daily basis.  One is a detailed report of 'self trades'.  The other is a summation of the same 'self trades'.  so if I have 100 self trades for this trader on this endpoint and symbol in the detailed, I would have one record in the summation with a total selftrade count of 100.  The detailed rpt runs, loading current day to working tables, then the summation runs off the same working tables.  It has come to my attention that the detailed and the summation are not equating.  
Here is a specific example:

from the summation:
endpoint/trader/symbol/totalvol/totalTrades/SelfTradeVol/TotalSelfTrade/SelfTradeVolRatio/SelfTradeRatio
CBOT   abc        ZTU8      5738     1782        99      50       0.017253398     0.028058361
CBOT   def        ZTU8       6216     2166       58       38       0.009330759     0.01754386

yet in the detailed report, there are 19 trades for trader abc with a volume of 63, and 15 trades for the trader def, with a volume of 32.  The two together, a total volume of 95.  see the selftrade count and volume for each id --  
the summation:
         count      volume
abc     50           99
def     38           58

the detailed
         count      volume
abc      19          63
def       15          32

- my summation says i have 50 trades by abc on that symbol and endpoint.   my detailed has only 19.
- my summation says i have 38 trades by def on that symbol and endpoint.    my detailed has only 15.

this is my detailed logic:
CREATE proc detailedProc
AS
SET NOCOUNT ON

DELETE workingDB.dbo.epSelfTrade
INSERT INTO workingDB.dbo.epSelfTrade SELECT DISTINCT EndPoint FROM SERVER1.database1.dbo.EPGroups WHERE EndPoint NOT IN (
'....a list of endpoints we ignore for this report....')

delete workingDB.dbo.SelfTrades
/* only non-spread trades */
insert workingDB.dbo.SelfTrades
select trader,acct,prevorderno,endpoint,symbol,side,quantity,price,tradetime,clearingfee,lflag,client, orderno,origorderno,basketid,basketnumber,quantity,executedquantity,leavesquantity,ordertype, limitprice,avgfillprice,orderstatus,stopprice,timeinforce,cntrpartyID,msgID,hostname,ordermarking,
inserttime,source
from database.dbo.ordertable a with (nolock)
WHERE tradetime >= LEFT(GETDATE(),11)
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = a.endpoint)
AND symbol NOT LIKE '%-%'
AND NOT EXISTS (select 1 orderno from database.dbo.ordertable b with (nolock) where a.origorderno = b.origorderno
AND tradetime >= LEFT(GETDATE(),11)
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = b.endpoint)
AND symbol LIKE '%-%'   )
/* only spread trades */
insert workingDB.dbo.SelfTrades
SELECT trader,acct,prevorderno,endpoint,symbol,side,quantity,price,tradetime,clearingfee,lflag,client, orderno,origorderno,basketid,basketnumber,quantity,executedquantity,leavesquantity,ordertype, limitprice,avgfillprice,orderstatus,stopprice,timeinforce,cntrpartyID,msgID,hostname,ordermarking,
inserttime,source
FROM database.dbo.ordertable t1 WITH (NOLOCK)
WHERE tradetime >= LEFT(GETDATE(),11)
AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = t1.endpoint)
AND symbol LIKE '%-%'

SELECT DISTINCT b.endpoint,b.trader AS Btrader, b.acct AS BAcctID,b.Client AS BClientID, b.OrigOrderNo AS BOrigOrderNo,s.trader AS Strader,s.acct AS SAcctID,s.Client AS SClientID, s.OrigOrderNo AS SOrigOrderNo,b.Symbol, b.Quantity, b.Price,
CONVERT(VARCHAR(10),b.tradetime,101)+' '+SUBSTRING(CONVERT(VARCHAR(20),b.tradetime,100),13,7) AS tradetime
FROM workingDB.dbo.SelfTrades b WITH (NOLOCK)
INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
  ON s.endpoint = b.endpoint
  AND s.Quantity = b.Quantity
  AND s.Price = b.Price
  AND s.tradetime = b.tradetime
  AND s.side = 'S'
  AND b.side = 'B'
ORDER BY EndPoint,Btrader,b.Symbol

SET NOCOUNT OFF
GO

--summation procedure
CREATE proc dbo.summationProc
AS

SET NOCOUNT ON

SELECT q1.EndPoint,q1.Trader,q1.Symbol,q1.TotalVol,q1.TotalTrades,q2.SelfTradeVol,q2.TotalSelfTrades,
CAST(q2.SelfTradeVol AS DECIMAL(15,3))/CAST(q1.TotalVol AS DECIMAL(15,3)) AS SelfTradeVolRatio,
CAST(q2.TotalSelfTrades AS DECIMAL(15,3))/CAST(q1.TotalTrades AS DECIMAL(15,3)) AS TotalSelfTradeRatio

FROM (
  SELECT a.EndPoint,a.Trader,a.Symbol,SUM(quantity) AS TotalVol,COUNT(*) AS TotalTrades
  FROM database.dbo.ordertable a WITH (NOLOCK)
  WHERE tradetime >= LEFT(GETDATE(),11)
  AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = a.EndPoint)
  GROUP BY a.EndPoint,a.Trader,a.Symbol )  AS q1
  INNER JOIN
  (SELECT b.EndPoint,b.Trader,b.Symbol,SUM(b.quantity)AS SelfTradeVol,COUNT(*) AS TotalSelfTrades
  FROM workingDB.dbo.SelfTrades b WITH (NOLOCK)
  INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
        ON s.EndPoint = b.EndPoint
      AND s.Quantity = b.Quantity
        AND s.Price = b.Price
        AND s.tradetime = b.tradetime
        AND s.Side = 'S'
      AND b.Side = 'B'
   GROUP BY b.EndPoint,b.Trader,b.Symbol ) AS q2
   ON q1.Trader = q2.Trader
   AND q1.symbol = q2.symbol
   ORDER BY q1.EndPoint,q1.Trader,q1.Symbol

SET NOCOUNT OFF
GO



I know it's kind of lengthy.  I apologize for that.  Does anybody see what may be causing this variance?
0
dbaSQL
Asked:
dbaSQL
  • 15
  • 11
1 Solution
 
dbaSQLAuthor Commented:
Any input at all is very much appreciated.
0
 
srnarCommented:
Yeah it is a bit lengthy. Can you publish some create tables and insert statements to run your procedures? How are you comparing two results?
0
 
dbaSQLAuthor Commented:
the reports are sent out via email, as .csv's.  the comparison is simply walking thru one, then looking at the other to ensure it all adds up.  i've got both reports in front of me now, the example i gave is legit.  it doesn't add up.
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
srnarCommented:
Filter on SelfTrades table is the same in both cases - so it is not the problem.

The detailed SQL is using distinct clause. Why? Can you run the query without DISTINCT?
0
 
dbaSQLAuthor Commented:
yes, let me try this
0
 
dbaSQLAuthor Commented:
well, without the distinct, i may be getting the data, BUT now i'm getting a lot of dupes in the detailed report, so it's hard to tell.  

for example, this one is in there 12 times:
EndPoint/BTrader/BAcctID/BClientID/BOrigOrderNo/STrader/SAcctID/SClientID/SOrigOrderNo/Symbol/Quantity/Price/tradeTime

CBOT abc  8135  zt-01 05Mtgprap  abc 8135  zt-01  05Mtgprat  ZTU8  1  105.64  06/05/2008 10:24AM

this one is in there twice:
CBOT  jkl  8135  zn-02  05Mtgv2js  jkl      8135  zn-03      05Mth8ykt  ZNU8  1  113.17  06/05/2008 11:01AM

that's not going to work

any thoughts what may be doing this?  no changes other than commenting out DISTINCT

0
 
srnarCommented:
You hit the problem. Your summary query does "group by" and publishes aggregation of details. I assume you need another aggregation but on the more detailed level then your summary.

You need to rewrite the detailed query to have sum or avg on quantity, price and group by on endPoint, Btrader etc.
0
 
dbaSQLAuthor Commented:
I'm not sure I follow what you are suggesting, srnar.
0
 
srnarCommented:
Sorry for confusing. I'm not able to help you more with the problem without looking into data. I try to find some more time and generate some fake data.

Otherwise we are only spending our time.
0
 
dbaSQLAuthor Commented:
not confusing, really, i'm just not certain what you are suggesting.  i'd be happy to show you some data, i'm just not certain how to do so.  i don't think a cut/paste of a ton of records (detailed/summary) will go well here.
0
 
srnarCommented:
3 things will be welcome:

1. create script of SelfTrades table -  can be scripted from Management studio
2. create script of ordertable -  can be also scripted from Management studio
3. select top 100 * from ordertable - can be exported into Excel via clipboard from Management studio
0
 
dbaSQLAuthor Commented:
yes, srnar, i know how to script the tables.  and, i fully intended to do so, along with a capture of both the detailed data and summary data, on a specific trader/endpoint/symbol.  thus, we've got one example to work with in both detailed and summary

but.  i'd like to avoid posting it up here, if possible.  

i can genericize the table def, and attach those, but what of the data?
0
 
srnarCommented:
Sorry to underestimate you, dbaSQL. It is 100 records - you can change the sensitive data (names, quantities) to stay anonymous - can be done in 10 minutes. You can send it to my mail srnar at volny.cz
0
 
dbaSQLAuthor Commented:
oh no, you didn't underestimate.  i just wanted you to know i was pretty comfy w/the table def and data and alll, i just didn't want to post up here.  doing this now.  
0
 
srnarCommented:
If you want to compare Quantity and Trade count from detail to summary you have to replace DISTINCT with aggregations - see 2 new columns newQuantity, newTrades and the long GROUP BY member set (GROUP BY b.endpoint,b.trader, b.acct,b.Client, b.OrigOrderNo,s.trader,s.acct,s.Client, s.OrigOrderNo,b.Symbol, b.Price, b.tradetime).

I hope it will be running - your samples were still far from perfect - so I had to heavily change a lot of code and data. Described change will give you when comparing same numbers - but I do not know whether they are correct - the ordertable has no natural key. Also joining on quantity or price columns is not good practice.

Also usage of identity and another column like a primary key is a bad design pattern - see this constraint:

      CONSTRAINT [clidx_OrderTable_sequencenumber_timeofexec] UNIQUE  CLUSTERED
      (
            [SequenceNumber],
            [TradeTime])

Yet I hope our effort was not useless. Enjoy.
SELECT b.endpoint,b.trader AS Btrader, b.acct AS BAcctID,b.Client AS BClientID, b.OrigOrderNo AS BOrigOrderNo,s.trader AS Strader,s.acct AS SAcctID,s.Client AS SClientID, s.OrigOrderNo AS SOrigOrderNo,b.Symbol, b.Price,
CONVERT(VARCHAR(10),b.tradetime,101)+' '+SUBSTRING(CONVERT(VARCHAR(20),b.tradetime,100),13,7) AS tradetime,
--new
SUM(b.Quantity) as newQuantity,
COUNT(*) AS newTrades
FROM workingDB.dbo.SelfTrades b WITH (NOLOCK)
INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
  ON s.endpoint = b.endpoint
  AND s.Quantity = b.Quantity
  AND s.Price = b.Price
  AND s.tradetime = b.tradetime
  AND s.side = 'S'
  AND b.side = 'B'
--new  
GROUP BY b.endpoint,b.trader, b.acct,b.Client, b.OrigOrderNo,s.trader,s.acct,s.Client, s.OrigOrderNo,b.Symbol, b.Price,
b.tradetime
--existing
ORDER BY EndPoint,Btrader,b.Symbol

Open in new window

0
 
dbaSQLAuthor Commented:
>>Also joining on quantity or price columns is not good practice.

This particular report is of 'selftrades', which are not the norm, or standard, trading practice.  It must be joined on quantity and price in order to identify the trades we are watching for.
0
 
dbaSQLAuthor Commented:
still having problems w/this.  are you avail?
0
 
srnarCommented:
Not returning good data?
0
 
dbaSQLAuthor Commented:
well, i'm not sure.  they are questioning this:
SUM(b.Quantity) as newQuantity,
COUNT(*) AS newTrades

and as i go in to check the data, i've got to question it, too. given that it is in the detailed report, not the summation, is this actually just a count of the selftrades?  
0
 
srnarCommented:
Yes -  newTrades = count of trades that have this same values:

b.endpoint,b.trader, b.acct,b.Client, b.OrigOrderNo,s.trader,s.acct,s.Client, s.OrigOrderNo,b.Symbol, b.Price, b.tradetime

which is more detailed than set (b.EndPoint,b.Trader,b.Symbol) used in your sum proc
0
 
dbaSQLAuthor Commented:
>>which is more detailed than set (b.EndPoint,b.Trader,b.Symbol) used in your sum proc
can you elaborate for me, please, srnar?
0
 
srnarCommented:
see 5th row of the following statement

....SELECT a.EndPoint,a.Trader,a.Symbol,SUM(quantity) AS TotalVol,COUNT(*) AS TotalTrades
  FROM database.dbo.ordertable a WITH (NOLOCK)
  WHERE tradetime >= LEFT(GETDATE(),11)
  AND EXISTS (SELECT 1 FROM workingDB.dbo.epSelfTrade WHERE endpoint = a.EndPoint)
  GROUP BY a.EndPoint,a.Trader,a.Symbol )  AS q1
  INNER JOIN
  (SELECT b.EndPoint,b.Trader,b.Symbol,SUM(b.quantity)AS SelfTradeVol,COUNT(*) AS TotalSelfTrades
  FROM workingDB.dbo.SelfTrades b WITH (NOLOCK)
  INNER JOIN workingDB.dbo.SelfTrades s WITH (NOLOCK)
        ON s.EndPoint = b.EndPoint
      AND s.Quantity = b.Quantity
        AND s.Price = b.Price
        AND s.tradetime = b.tradetime
        AND s.Side = 'S'
      AND b.Side = 'B'
   GROUP BY b.EndPoint,b.Trader,b.Symbol ) AS q2
   ON q1.Trader = q2.Trader
   AND q1.symbol = q2.symbol
   ORDER BY q1.EndPoint,q1.Trader,q1.Symbol
0
 
dbaSQLAuthor Commented:
yes, i can see the differences between the two, i'm just looking for a better understanding of the change
0
 
dbaSQLAuthor Commented:
I still don't have a thorough understanding of the change.  Please advise if you are able.
0
 
srnarCommented:
dbaSQL, simply to avoid duplicity you have to group rows on your detail report - values than can be aggregated (used SUM, COUNT function) -

SUM(b.Quantity) as newQuantity,
COUNT(*) AS newTrades

for more you can visit:
http://msdn.microsoft.com/en-us/library/ms173454.aspx
0
 
dbaSQLAuthor Commented:
Thank you.
0

Featured Post

Hire Technology Freelancers with Gigs

Work with freelancers specializing in everything from database administration to programming, who have proven themselves as experts in their field. Hire the best, collaborate easily, pay securely, and get projects done right.

  • 15
  • 11
Tackle projects and never again get stuck behind a technical roadblock.
Join Now