?
Solved

Query to compare data sets

Posted on 2011-09-21
14
Medium Priority
?
407 Views
Last Modified: 2012-05-12
What I am trying to do is compare two data sets and only return rows that have differing values or are add/deleted  example:

Series    item num    price
TTT    1234    1.23
TTT    3456    2.45
TTT    6789    6.78
TTT    8765    9.01
ZZZ    1234    1.23
ZZZ    3456    2.45
ZZZ    6789    6.78
ZZZ    4321    7.89
ZZZ    8765    9.02
ZZZ    9999    9.88

I want to compare series ZZZ in the same table to previous series TTT and find what has been added, removed, and changed so the quries would produce

Deleted:  ZZZ    4321    7.89
Added: ZZZ    9999    9.88
Changed:  TTT    8765    9.01, ZZZ    8765    9.02

I know how to do basic queries but I think this would require joins of some sort.
0
Comment
Question by:jason987
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 5
  • 2
14 Comments
 
LVL 3

Expert Comment

by:dkellner
ID: 36575732
Not completely clear but if I get it right, you're looking for this:

(assume you have a MyTable with a field ListNr that's 1 for the first recordset and 2 for the second; there are some kind of products in the table and they're identified by a productID...  then you can adapt this to your actual situation and queries)

So, the trick is to join the same table to itself:

SELECT * FROM MyTable AS T1
    LEFT JOIN MyTable AS T2 ON T1.productID=T2.productID
    WHERE 1
    AND T1.ListNr=1
    AND T2.ListNr=2
    AND T1.productID IS NOT NULL
    AND T2.productID IS NULL
;

Open in new window


Wonder if this helps and the logic is clear.  I didn't test this example as it's time consuming to build the necessary tables but it illustrates the way to go.
0
 
LVL 5

Author Comment

by:jason987
ID: 36575867
I made a table using those field names and ran your query and it returned an empty set.
0
 
LVL 3

Expert Comment

by:dkellner
ID: 36575893
Yes because it's only the way to do it.
But if you give me an sql export I'll try to tell what's missing.
0
Optimize your web performance

What's in the eBook?
- Full list of reasons for poor performance
- Ultimate measures to speed things up
- Primary web monitoring types
- KPIs you should be monitoring in order to increase your ROI

 
LVL 5

Author Comment

by:jason987
ID: 36575975

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";


/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;


-- Table structure for table `test`
--

CREATE TABLE IF NOT EXISTS `test` (
  `ListNr` varchar(11) NOT NULL,
  `productID` varchar(256) NOT NULL,
  `PRICE` decimal(11,8) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `test`
--

INSERT INTO `test` (`ListNr`, `productID`, `PRICE`) VALUES
('TTT', '1234', '1.23000000'),
('TTT', '3456', '2.45000000'),
('TTT', '6789', '6.78000000'),
('TTT', '8765', '9.01000000'),
('ZZZ', '1234', '1.23000000'),
('ZZZ', '3456', '2.45000000'),
('ZZZ', '6789', '6.78000000'),
('ZZZ', '4321', '7.89000000'),
('ZZZ', '8765', '9.02000000'),
('ZZZ', '9999', '9.00000000');

Open in new window

0
 
LVL 3

Expert Comment

by:dkellner
ID: 36576255
Okay, now my query tried to find all the rows missing from list 2, existing in list 1 only. There are no such rows as I see. You need several queries for several kinds of differences, like "in list1 but not in list2" (this is what you have now), "in list2 but not in list1" (this would return a nonzero result; swap t1/t2 in my last 2 lines in query), and "in list1 and in list2 but not the same data in other fields", which I have not implemented in detail but it's quite obvious.
0
 
LVL 5

Author Comment

by:jason987
ID: 36576348
I ran both:

SELECT * FROM test AS T1
   LEFT JOIN test AS T2 ON T1.productID=T2.productID
   WHERE 1
   AND T1.ListNr='ZZZ'
   AND T2.ListNr='TTT'

   AND T1.productID IS NOT NULL
   AND T2.productID IS NULL

and

SELECT * FROM test AS T1
   LEFT JOIN test AS T2 ON T1.productID=T2.productID
   WHERE 1
   AND T1.ListNr='TTT'
   AND T2.ListNr='ZZZ'
   AND T1.productID IS NOT NULL
   AND T2.productID IS NULL

Both returned nothing.  Am I misunderstanding you?
0
 
LVL 3

Expert Comment

by:dkellner
ID: 36576992
No, sorry, I'm telling the wrong thing.
You can't do this left join thing with both lists in one table because join will always find the product in the same list and say "hey, I've got it".  Won't be NULL.  Damn thing.

Okay then.
Shall we use temporary tables or solve the problem in PHP?

TMP tables will make this join possible but you must do ay INSERT ... SELECT for each comparison which is rarely wise.  With PHP you'll have to fetch both record sets and do the comparison in the result arrays - which is easy unless you have millions of records.

OR, maybe someone else has a better solution.

Which way to go?
0
 
LVL 5

Author Comment

by:jason987
ID: 36577074
Yea I'm on the verge of the PHP solution which I can do but I hoped to make something I could distribute easily.
0
 
LVL 3

Expert Comment

by:dkellner
ID: 36577086
Okay, sorry again.
0
 
LVL 5

Author Comment

by:jason987
ID: 36577110
No need to apologize I appreciate the help.
0
 
LVL 4

Accepted Solution

by:
Fugas earned 2000 total points
ID: 36579117
Here is a simple query. If you have  many records, and it takes long time to execute, it is possible to optimize it.
select *,'not in ZZZ' as result from test where listNr='TTT' and productID not in (select productID from test where ListNr='ZZZ')
union 
select *,'not in TTT' as result from test where listNr='ZZZ' and productID not in (select productID from test where ListNr='TTT')

Open in new window

0
 
LVL 5

Author Comment

by:jason987
ID: 36584274
THanks that seems to do the trick.  I can't get it to work on my large DB for some reason but that is probably my fault.  What would be an optimization for that query?
0
 
LVL 5

Author Comment

by:jason987
ID: 36584284
Also, would you know how to get changed prices?
0
 
LVL 4

Expert Comment

by:Fugas
ID: 36585334
For optimalization, I have to see, how many records you have in the table and the count of those categories like 'TTT' and 'ZZZ'. But now is the query ok for less than a milion records.
0

Featured Post

Want to be a Web Developer? Get Certified Today!

Enroll in the Certified Web Development Professional course package to learn HTML, Javascript, and PHP. Build a solid foundation to work toward your dream job!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This post contains step-by-step instructions for setting up alerting in Percona Monitoring and Management (PMM) using Grafana.
In this article, we’ll look at how to deploy ProxySQL.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Director of Solution Engineering Jon Tobin discusses the function and features of Percona Server for MongoDB. How Percona can help Percona can help you determine if Percona Server for MongoDB is the right solution for …

770 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question