[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

Count distinct over Union of two tables?

Posted on 2005-05-14
6
Medium Priority
?
1,817 Views
Last Modified: 2012-06-27
Hello all,

I have tables A and B.  Each has only one column, column r.  A and B may have some duplicate entries.  I want to count the total distinct rows in A and B.  I've fiddled with various unions, to no avail.

Suggestions will be appreciated.
0
Comment
Question by:xfvgdrthbdtyvhgscv
  • 3
  • 3
6 Comments
 
LVL 5

Expert Comment

by:ljw87505
ID: 14003913
Hello xfv...

I have two questions:

1) Is this homework?
2) From your last post I assume that you are < mysql 4.1, but can you check by executing the following at the prompt:
  select version();

Thanks,
LJ
0
 
LVL 1

Author Comment

by:xfvgdrthbdtyvhgscv
ID: 14003918
Hi ljw87505,

1. This is not homework.  This is a Perl programmer spending his Saturday evening writing SQL queries that feed his Perl script with data to crunch.

2. Version is 4.0.17.

Appreciate your help.
0
 
LVL 5

Accepted Solution

by:
ljw87505 earned 1500 total points
ID: 14003959
Actully, as soon as I asked the homework question, I was thinking I'd like to find the school that actually gives their students mysql homework.  Sorry to misjudge.

Anyway, there's no subquery support in that version, so the easy solution isn't available.

I admit I'm having difficulty myself figuring out how to do this in one sql without subqueries.  Assuming the the rows within a particular table are distinct (the tables' rows just aren't distinct when combined), the following will give you two rows whose sum will be what you want:

select count(a.r)-count(b.r) from a left join b on a.r=b.r union select count(*) from b;

Of course, the following will give the all the columns (not the count) which you can then count in perl or use the DBI stuff to get a count of rows:
select r from a union select r from b;
Of course, you'll have to pull all the rows over to perl when you do that and if there a bunch of them, that won't be very efficient.  The earlier suggestion would be more efficient.

I'm sorry...it's been too long since I've had to try and work around the no-subquery limitation of mysql.  If you upgrade to >=4.1, you can do the obvious thing of:
select count(*) from (select r from a union select r from b) x;

LJ
0
Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

 
LVL 1

Author Comment

by:xfvgdrthbdtyvhgscv
ID: 14004131
OK, I see - well, how about another tack then -

How about first count all the orws in table A.  Then all the rows in Table that are not in table A.  Then add them.  But I'll need the sql for table B.
0
 
LVL 5

Expert Comment

by:ljw87505
ID: 14004151
That's in theory what the
select count(a.r)-count(b.r) from a left join b on a.r=b.r union select count(*) from b;
is doing.

count(a.r) is the total rows in a,
count(b.r) is the total rows in b that are also in a
count(a.r)-count(b.r) is the total rows in a that are distinct (no corresponding row in b).
union
select count(*) from b is the total rows in b

so you get two rows from that statement: the first row represents the rows in a that don't have corresponding rows in b and the second row is the total number of rows in b.  The sum of those two (the summing has to be done in Perl, not in sql because we don't have subqueries) is what you want (I think).

0
 
LVL 1

Author Comment

by:xfvgdrthbdtyvhgscv
ID: 14005712
Well, I ended up just doing

select  r from A
while (results) {
     $temp_hash{result} = 1
}
select  r from B
while (results) {
     $temp_hash{result} = 1
}
$count = keys(%temp_hash)

%temp_hash is the union of tables A & B.  Then just count the number of elements in the hash.  It's not pretty, but it works pretty quickly I think in part thanks to the simplicity of the queries.

Anyway, thanks for the lesson in UNION.  That alone was woth it.
0

Featured Post

Transaction-level recovery for Oracle database

Veeam Explore for Oracle delivers low RTOs and RPOs with agentless transaction log backup and transaction-level recovery of Oracle databases. You can restore the database to a precise point in time, even to a specific transaction.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this blog post, we’ll look at how ClickHouse performs in a general analytical workload using the star schema benchmark test.
In this article, we’ll look at how to deploy ProxySQL.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
Suggested Courses

834 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question