Solved

Selecting "distinct multirow sets"

Posted on 2013-06-11
18
372 Views
Last Modified: 2013-07-06
I'm trying to find a way to get a 'distinct' list of 'sets of data' across multiple records.

Basically trying to see the distinct list of operations that are available in a bill-of-material (routing, actually).

e.g.

Part      Seq  Operation
partA    10    CUT
partA    20    Weld
partB    10    CUT
partB    20   Weld
partB    30   Grind
partC    10    CUT
partC    20    Weld

I want to see that there is two distinct sets of operations: CUT, WELD and CUT, Weld, Grind.

This is on an ancient Sybase ASE 11.9.2 server, so stuck with SQL89 syntax.

Everything I've tried gives me a distinct list of ops...

Any suggestions appreciated!

TIA.

Cheers,
EdB
0
Comment
Question by:edbored
  • 10
  • 7
18 Comments
 
LVL 40

Expert Comment

by:Sharath
ID: 39240190
Do you just want the distinct operation?
select distinct Operation from your_table

Open in new window

If this is not what you are looking for, post the expected result.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39240344
"distinct" has no parameters or conditions, so when you say "distinct multi-row sets" I'm definitely of the view that just adding "distinct" isn't going to cut it... (pardon pun)

what I believe you are asking for is:
2 operation sequences are important, and need to be treated as separate
a. cut & weld & grind
b. cut & weld

    Part      Seq  Operation
b. partA    10    CUT & weld                (partA    20    Weld)
a. partB    10    CUT & Weld & Grind (partB    20   Weld, partB    30   Grind)
b. partC    10    CUT & Weld                 (partC    20    Weld)

not sure what your 'expected results' will be.

could you confirm if my interpretation is correct and indicate what the expected results are please?

& is the small sample in the question representative of the actual data?

{+edit, needed a correction}
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39240373
here's a "first cut", I just "ground out" (forgive more puns?)
Results of the following query look like this:
PART     CUT    WELD    GRIND  CUT_SEQ WELD_SEQ  GRIND_SEQ
partA    CUT    Weld    (null)    10      20     (null)
partB    CUT    Weld    Grind     10      20      30
partC    CUT    Weld    (null)    10      20     (null)

Open in new window

and the query, which 'should work' in your ancient ASE (I hope)
select
  cut.part
, cut.operation   as cut
, weld.operation  as  weld
, grind.operation as  grind
, cut.seq   as cut_seq
, weld.seq  as weld_seq
, grind.seq as grind_seq

from (
       select * from Operations where Operation = 'CUT'
     ) as cut
left join (
           select * from Operations where Operation = 'Weld'
          ) as weld on cut.part = weld.part 
                   and cut.seq = (weld.seq-10)
left join (
           select * from Operations where Operation = 'Grind'
          ) as grind on cut.part = grind.part 
                    and cut.seq = (grind.seq-20)

Open in new window

this is operational (using sql 2008) at: http://sqlfiddle.com/#!3/756dc/7
0
 
LVL 1

Author Comment

by:edbored
ID: 39241224
Hi, thank you both for the input...

Definitely not just looking for 'distinct list': I need to see the distinct combinations of operations performed. In my example above, I indicated (for 3 sample parts) that there were two distinct combinations found:
a) cut, then  weld
b) Cut, then weld, then grind

A simple 'select distinct operation from routing_detail' gives me 96 different operations.
I need to determine the particular combinations used in the data. That is, I need to know the different combinations of operations ( "distinct multi-row sets" - not sure what else to call this!).

I'm thinking I may have to write code to parse this thing as a brute-force attack. I've got stuff like :

routing               operation                          seq_no
1LPC00406M     LASER OR PLASMA CUT    10
1LPC00406M     LASER OR PLASMA CUT    30
1LPC00406M     LASER OR PLASMA CUT    20
1LPC00406M     WELD                                  40

That is, a single part (synonymous with 'routing' in this context) with  three separate cut operations, and one weld operation.

If I run:
select  distinct  operation, count(1) as RecCount
from dbc.routing_detail
where routing like '1lpc%'
group by routing

Open in new window


I get:

operation                         RecCount
LASER OR PLASMA CUT   1
BEND                                 2
FORMING                          2
LASER OR PLASMA CUT   2
ROLLING                           2
FORMING                          3
LASER OR PLASMA CUT   3
ROLLING                           3
EXTERIOR PLATING          4
FORMING                          4
LASER OR PLASMA CUT   4
WAIT                                  4
WELD                                 4

So, "LASER OR PLASMA CUT" is used "by itself" (1 rec), and in combination with 1, 2, and 3 other operations (or "itself" multiple times).

Ultimately, I'm trying to determine the list of different combinations...

Oh, and it looks like  ASE 11.9 doesn't support   "select x.field from (select field, field from table) as x;"   - or the syntax I'm using is wrong...

Sorry for the length of this post.

Any other suggestions?

TIA.

Cheers,
EdB
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39242968
Although you have not asked for a "pivot" such as the following I think treating the data this way will provide the most reliable method for determining unique routing chains.
COUNTOF    O1_OPERATION    O2_OPERATION    O3_OPERATION           O4_OPERATION          O5_OPERATION    O6_OPERATION    O7_OPERATION    O8_OPERATION    O9_OPERATION    O10_OPERATION
2          CUT             Weld
1          CUT             Weld            Grind
1          Grind
1          LASER OR PLASMA CUT             LASER OR PLASMA CUT    LASER OR PLASMA CUT    WELD
1          LASER OR PLASMA CUT             LASER OR PLASMA CUT    WELD
1          LASER OR PLASMA CUT             WELD
3          WELD
1          Weld            Grind

Open in new window

I've combined the small pieces of information provided into a table I have now called Routings and field names have changed to the most recent headings provided. The code below make the inherent assumption that a chain of operations cannot exceed 10, if that isn't true then more joins can be added. The other key assumption is that seq_no does increment by 10 if that isn't true then this query is in real trouble. Anyway perhaps you can try it.
select
  count(*) as countof
, O1.operation  as O1_operation
, O2.operation  as O2_operation
, O3.operation  as O3_operation
, O4.operation  as O4_operation
, O5.operation  as O5_operation
, O6.operation  as O6_operation
, O7.operation  as O7_operation
, O8.operation  as O8_operation
, O9.operation  as O9_operation
, O10.operation as O10_operation
from Routings O1
left join Routings O2  on O1.routing = O2.routing  and O1.seq_no = (O2.seq_no-10)
left join Routings O3  on O1.routing = O3.routing  and O2.seq_no = (O3.seq_no-10)
left join Routings O4  on O1.routing = O4.routing  and O3.seq_no = (O4.seq_no-10)
left join Routings O5  on O1.routing = O5.routing  and O4.seq_no = (O5.seq_no-10)
left join Routings O6  on O1.routing = O6.routing  and O5.seq_no = (O6.seq_no-10)
left join Routings O7  on O1.routing = O7.routing  and O6.seq_no = (O7.seq_no-10)
left join Routings O8  on O1.routing = O8.routing  and O7.seq_no = (O8.seq_no-10)
left join Routings O9  on O1.routing = O9.routing  and O8.seq_no = (O9.seq_no-10)
left join Routings O10 on O1.routing = O10.routing and O9.seq_no = (O10.seq_no-10)
group by
  O1.operation
, O2.operation
, O3.operation
, O4.operation
, O5.operation
, O6.operation
, O7.operation
, O8.operation
, O9.operation
, O10.operation

Open in new window

There might still be ASE 11x syntax errors here (I can't test that) for example it might not like 'as ...' or it might insist that table aliases do use ' as ' - but I'm hopeful that it will at least allow the joins proposed above.

nb, I couldn't locate ASE 11x documentation, 12x is as far back as Sybase seems to go, but it really does surprise me it couldn't deal with subquery aliases, oh well.

{add this: http://sqlfiddle.com/#!3/b9292/6 (mssql 2008) }
0
 
LVL 1

Author Comment

by:edbored
ID: 39243562
I found a document comparing ms-sql 6.5 to ASE 11.9.2:

The ANSI 92 SQL language definition added a verbose form of join syntax to the SELECT statement (using the keywords JOIN, CROSS JOIN, INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN). Version 6.5 of Microsoft SQL Server added this syntax to its SQL dialect. ASE does not provide this syntax for the SELECT statement, but instead supports the traditional syntax using ‘=’, ‘*=’ and ‘=*'

So, almost, but not quite, exactly like SQL92 - except for the parts that aren't.

<sigh>

At any rate, the seq_no values are all over the map - nothing as clean as increments of 10.

OTOH, you've given me an idea - I'm booked Thursday and Friday, so it might be the weekend before I check back in.

At this point I'm thinking I might have to dump the 60K  or so records to another database - one that wasn't designed to be installed in Win95 or NT4... <g>.

Thanks for your efforts - greatly appreciated!

EdB
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39243571
OK, no problem, at least (maybe?) some progress - I'd concur with getting this data into something more "suitable".

Perhaps those op seqience numbers could be "standardized" by using something like:

row_number() over (partition by routing, order by seq_no) as my_seq_nums

then at least then the base data variability could be avoided.
also opens up possibility for orher approaches such as "for xml path" so that maybe it can be addressed without any self-joins.

I may mock-up an approach... what dbms would you be thinking of using?
0
 
LVL 1

Author Comment

by:edbored
ID: 39243591
Definitely progress - you reminded me of something I wrote for this company back in 2002 (ish) - if I can find the code, or dredge up what I did, I can probably solve this.

If I do a dump, I'll either use MS-SQL Express 2005 (or 2008), or Sybase SQL Anywhere 11. I also have Firebird 2.1 installed in a vm somewhere on my laptop. Heck, I could even throw the data into an MDB file and use Delphi ADO components to query direct.

I really appreciate your efforts - if you think there's some clever SQL that'll do this for me, I wouldn't mind seeing it (even just a mockup/pointer to "try this").

OTOH, I'm probably going to end up writing Delphi code to query raw data, and handle sorting out the "combinations" of operations manually (brute force, it ain't pretty, but it works) - so, don't put too much effort into this!

I'll let you know what I come up with, and award the points then (so I'm not "posting after awarding").

Cheers,
EdB
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39243613
can't help you with Delphi at all (never touched Firebird either) but if using sql server I'd go for 2008 if possible - and not too familiar with Sybase version details to be honest

(in reality I work most with Oracle)

However I'll knock-up something which might be useful to your thinking

out of interest how many records (roughly) are you dealing with in the routing table?
0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 39243639
btw: I'm thinking that one of the approaches here may be appropriate, so this quite broad discussion with options could be just the ticket for you:

https://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/

in essence your need (I think) to get those routing chains into a string, and then perform the aggregations. So it's really a matter of choosing how to get those strings (again - I think)

cheers. Paul
0
 
LVL 1

Author Comment

by:edbored
ID: 39243658
I just checked - the routing_detail table has about 22,000 records.

But, I have to run the same type of review in a variety of different tables. There's one with  3,000,000+ records (it's been running since 2001).

Oh, and it turns out I have MSSQL express 2005 already installed and running in a vm.

In case you were wondering - I'm trying to classify manufacturing/financial information in order to prioritize what data needs to be migrated to a modern version of an ERP/MRP system.

As far as Delphi goes - no problem, it's my tool of choice - I figure it'll take about a day to write something that will handle a reasonably generic case of "gimme the combinations"...

Hmm, 3:30 am - time to hang it up for the night.

Thanks again.
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39243703
on very limited data, but do have a look: http://sqlfiddle.com/#!3/b9292/15
-- correlated subquery/for xml path
SELECT
  [chain]
, sum(count_of) AS frequency_of_use
FROM (
      SELECT
           O1.routing
          , stuff((
                  SELECT ', ' + [operation]
                  FROM Routings AS O2
                  WHERE O1.routing = O2.routing
                  ORDER BY O2.seq_no
                  FOR XML PATH('')
                      , TYPE
                  ).value('.', 'varchar(max)'), 1, 1, '') AS [chain]
          , count(DISTINCT O1.routing) AS count_of
      FROM Routings AS O1
      GROUP BY routing
     ) AS derived
GROUP BY
   [chain]
;

-- cross apply/for xml path
SELECT
  [chain]
, count(distinct routing) AS frequency_of_use
FROM (
        SELECT
              O1.routing
            , CA.chain
        FROM Routings AS O1
        CROSS APPLY (
                SELECT 
                    stuff((
                            SELECT ', ' + [operation]
                            FROM Routings AS O2
                            WHERE O1.routing = O2.routing
                            ORDER BY O2.seq_no
                            FOR XML PATH('')
                                , TYPE
                            ).value('.', 'varchar(max)'), 1, 1, '')
            ) AS CA(chain)
     ) AS derived
GROUP BY
   [chain]

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39243721
MSSQL express 2005 should be fine for the above, could also use recursive CTE approaches.

22k won't be an issue but 3m might be 'expensive' - but as its not recurring stuff I guess that won't be too much of an issue. Might stretch the available VM memory - might want to look at that before getting too ambitious.

I'd also suggest placing the output of whatever is decided upon into a table so you can analyse to your hearts content once the base data has been crunched.

with respect to having a 'variety of tables' I'll leave that to you - but perhaps massaging these into one common field structure may be beneficial for this exercise?

do have a look at that article - the logic contained in the functions discussed might translate quite readily to Delphi (a guess).

sleep well.
0
 
LVL 1

Author Comment

by:edbored
ID: 39243730
Ah - that link also helps - the code I wrote in 2001 dealt with bill-of-material data. I started from the bottom of a BOM tree and created a 'where-used' list by concatenating all the child-to-parent-to-parent-etc part numbers.

The leaf nodes were all  flagged as purchased parts.

Starting with the list of leaf nodes, I 'drilled back' with some screwy " select parent from BOM where exists (select l2.child  from item L2 where L2.parent=L1.child ..." - all hand-coded (no recursion) to about 5 levels deep.

Hmm, I think I did a group by and count - then did a union of results of concatenation levels based the count  - that is, hand unrolled 2 levels of concatenation, 3 levels of concats, etc.

That's not quite right, but close enough for 4:00am....

Might give that sort of thing a try.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39262465
Hi, how is this going? (Selecting "distinct multirow sets") Any progress?
0
 
LVL 1

Author Comment

by:edbored
ID: 39266907
Have not had a chance to even attempt the brute-force code (other fires with other customers).

I'm happy to accept your last answer as solution though - if it's doing what I think it does, it would get me most of the way to what I need (let me know).

When I get back  (hopefully next week) to this problem, I'll post what I did.

Cheers,
EdB
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39267374
>>if it's doing what I think it does, it would get me most of the way to what I need (let me know).
Yes, I believe it will produce the 'operation chains' for further evaluation.
Cheers, Paul
0
 
LVL 1

Author Closing Comment

by:edbored
ID: 39304252
Thanks for the many posts on this - I've been slammed with work and can't put it into a "production solution" (there are so many variations where I need to do this that I can't test it properly) but the quick and dirty testing I did indicated this is the right direction...
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Is your Office 365 signature not working the way you want it to? Are signature updates taking up too much of your time? Let's run through the most common problems that an IT administrator can encounter when dealing with Office 365 email signatures.
What is Backup? Backup software creates one or more copies of the data on your digital devices in case your original data is lost or damaged. Different backup solutions protect different kinds of data and different combinations of devices. For e…
Sending a Secure fax is easy with eFax Corporate (http://www.enterprise.efax.com). First, Just open a new email message.  In the To field, type your recipient's fax number @efaxsend.com. You can even send a secure international fax — just include t…
Here's a very brief overview of the methods PRTG Network Monitor (https://www.paessler.com/prtg) offers for monitoring bandwidth, to help you decide which methods you´d like to investigate in more detail.  The methods are covered in more detail in o…

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now