Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Query to compare data sets

Posted on 2011-09-21
14
Medium Priority
?
415 Views
Last Modified: 2012-05-12
What I am trying to do is compare two data sets and only return rows that have differing values or are add/deleted  example:

Series    item num    price
TTT    1234    1.23
TTT    3456    2.45
TTT    6789    6.78
TTT    8765    9.01
ZZZ    1234    1.23
ZZZ    3456    2.45
ZZZ    6789    6.78
ZZZ    4321    7.89
ZZZ    8765    9.02
ZZZ    9999    9.88

I want to compare series ZZZ in the same table to previous series TTT and find what has been added, removed, and changed so the quries would produce

Deleted:  ZZZ    4321    7.89
Added: ZZZ    9999    9.88
Changed:  TTT    8765    9.01, ZZZ    8765    9.02

I know how to do basic queries but I think this would require joins of some sort.
0
Comment
Question by:jason987
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 5
  • 2
14 Comments
 
LVL 3

Expert Comment

by:dkellner
ID: 36575732
Not completely clear but if I get it right, you're looking for this:

(assume you have a MyTable with a field ListNr that's 1 for the first recordset and 2 for the second; there are some kind of products in the table and they're identified by a productID...  then you can adapt this to your actual situation and queries)

So, the trick is to join the same table to itself:

SELECT * FROM MyTable AS T1
    LEFT JOIN MyTable AS T2 ON T1.productID=T2.productID
    WHERE 1
    AND T1.ListNr=1
    AND T2.ListNr=2
    AND T1.productID IS NOT NULL
    AND T2.productID IS NULL
;

Open in new window


Wonder if this helps and the logic is clear.  I didn't test this example as it's time consuming to build the necessary tables but it illustrates the way to go.
0
 
LVL 5

Author Comment

by:jason987
ID: 36575867
I made a table using those field names and ran your query and it returned an empty set.
0
 
LVL 3

Expert Comment

by:dkellner
ID: 36575893
Yes because it's only the way to do it.
But if you give me an sql export I'll try to tell what's missing.
0
Veeam Task Manager for Hyper-V

Task Manager for Hyper-V provides critical information that allows you to monitor Hyper-V performance by displaying real-time views of CPU and memory at the individual VM-level, so you can quickly identify which VMs are using host resources.

 
LVL 5

Author Comment

by:jason987
ID: 36575975

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";


/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;


-- Table structure for table `test`
--

CREATE TABLE IF NOT EXISTS `test` (
  `ListNr` varchar(11) NOT NULL,
  `productID` varchar(256) NOT NULL,
  `PRICE` decimal(11,8) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `test`
--

INSERT INTO `test` (`ListNr`, `productID`, `PRICE`) VALUES
('TTT', '1234', '1.23000000'),
('TTT', '3456', '2.45000000'),
('TTT', '6789', '6.78000000'),
('TTT', '8765', '9.01000000'),
('ZZZ', '1234', '1.23000000'),
('ZZZ', '3456', '2.45000000'),
('ZZZ', '6789', '6.78000000'),
('ZZZ', '4321', '7.89000000'),
('ZZZ', '8765', '9.02000000'),
('ZZZ', '9999', '9.00000000');

Open in new window

0
 
LVL 3

Expert Comment

by:dkellner
ID: 36576255
Okay, now my query tried to find all the rows missing from list 2, existing in list 1 only. There are no such rows as I see. You need several queries for several kinds of differences, like "in list1 but not in list2" (this is what you have now), "in list2 but not in list1" (this would return a nonzero result; swap t1/t2 in my last 2 lines in query), and "in list1 and in list2 but not the same data in other fields", which I have not implemented in detail but it's quite obvious.
0
 
LVL 5

Author Comment

by:jason987
ID: 36576348
I ran both:

SELECT * FROM test AS T1
   LEFT JOIN test AS T2 ON T1.productID=T2.productID
   WHERE 1
   AND T1.ListNr='ZZZ'
   AND T2.ListNr='TTT'

   AND T1.productID IS NOT NULL
   AND T2.productID IS NULL

and

SELECT * FROM test AS T1
   LEFT JOIN test AS T2 ON T1.productID=T2.productID
   WHERE 1
   AND T1.ListNr='TTT'
   AND T2.ListNr='ZZZ'
   AND T1.productID IS NOT NULL
   AND T2.productID IS NULL

Both returned nothing.  Am I misunderstanding you?
0
 
LVL 3

Expert Comment

by:dkellner
ID: 36576992
No, sorry, I'm telling the wrong thing.
You can't do this left join thing with both lists in one table because join will always find the product in the same list and say "hey, I've got it".  Won't be NULL.  Damn thing.

Okay then.
Shall we use temporary tables or solve the problem in PHP?

TMP tables will make this join possible but you must do ay INSERT ... SELECT for each comparison which is rarely wise.  With PHP you'll have to fetch both record sets and do the comparison in the result arrays - which is easy unless you have millions of records.

OR, maybe someone else has a better solution.

Which way to go?
0
 
LVL 5

Author Comment

by:jason987
ID: 36577074
Yea I'm on the verge of the PHP solution which I can do but I hoped to make something I could distribute easily.
0
 
LVL 3

Expert Comment

by:dkellner
ID: 36577086
Okay, sorry again.
0
 
LVL 5

Author Comment

by:jason987
ID: 36577110
No need to apologize I appreciate the help.
0
 
LVL 4

Accepted Solution

by:
Fugas earned 2000 total points
ID: 36579117
Here is a simple query. If you have  many records, and it takes long time to execute, it is possible to optimize it.
select *,'not in ZZZ' as result from test where listNr='TTT' and productID not in (select productID from test where ListNr='ZZZ')
union 
select *,'not in TTT' as result from test where listNr='ZZZ' and productID not in (select productID from test where ListNr='TTT')

Open in new window

0
 
LVL 5

Author Comment

by:jason987
ID: 36584274
THanks that seems to do the trick.  I can't get it to work on my large DB for some reason but that is probably my fault.  What would be an optimization for that query?
0
 
LVL 5

Author Comment

by:jason987
ID: 36584284
Also, would you know how to get changed prices?
0
 
LVL 4

Expert Comment

by:Fugas
ID: 36585334
For optimalization, I have to see, how many records you have in the table and the count of those categories like 'TTT' and 'ZZZ'. But now is the query ok for less than a milion records.
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This post contains step-by-step instructions for setting up alerting in Percona Monitoring and Management (PMM) using Grafana.
Instead of error trapping or hard-coding for non-updateable fields when using QODBC, let VBA automatically disable them when forms open. This way, users can view but not change the data. Part 1 explained how to use schema tables to do this. Part 2 h…
This is a high-level webinar that covers the history of enterprise open source database use. It addresses both the advantages companies see in using open source database technologies, as well as the fears and reservations they might have. In this…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
Suggested Courses

604 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question