Count distinct over Union of two tables?

Hello all,

I have tables A and B.  Each has only one column, column r.  A and B may have some duplicate entries.  I want to count the total distinct rows in A and B.  I've fiddled with various unions, to no avail.

Suggestions will be appreciated.
LVL 1
xfvgdrthbdtyvhgscvAsked:
Who is Participating?
 
ljw87505Connect With a Mentor Commented:
Actully, as soon as I asked the homework question, I was thinking I'd like to find the school that actually gives their students mysql homework.  Sorry to misjudge.

Anyway, there's no subquery support in that version, so the easy solution isn't available.

I admit I'm having difficulty myself figuring out how to do this in one sql without subqueries.  Assuming the the rows within a particular table are distinct (the tables' rows just aren't distinct when combined), the following will give you two rows whose sum will be what you want:

select count(a.r)-count(b.r) from a left join b on a.r=b.r union select count(*) from b;

Of course, the following will give the all the columns (not the count) which you can then count in perl or use the DBI stuff to get a count of rows:
select r from a union select r from b;
Of course, you'll have to pull all the rows over to perl when you do that and if there a bunch of them, that won't be very efficient.  The earlier suggestion would be more efficient.

I'm sorry...it's been too long since I've had to try and work around the no-subquery limitation of mysql.  If you upgrade to >=4.1, you can do the obvious thing of:
select count(*) from (select r from a union select r from b) x;

LJ
0
 
ljw87505Commented:
Hello xfv...

I have two questions:

1) Is this homework?
2) From your last post I assume that you are < mysql 4.1, but can you check by executing the following at the prompt:
  select version();

Thanks,
LJ
0
 
xfvgdrthbdtyvhgscvAuthor Commented:
Hi ljw87505,

1. This is not homework.  This is a Perl programmer spending his Saturday evening writing SQL queries that feed his Perl script with data to crunch.

2. Version is 4.0.17.

Appreciate your help.
0
Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

 
xfvgdrthbdtyvhgscvAuthor Commented:
OK, I see - well, how about another tack then -

How about first count all the orws in table A.  Then all the rows in Table that are not in table A.  Then add them.  But I'll need the sql for table B.
0
 
ljw87505Commented:
That's in theory what the
select count(a.r)-count(b.r) from a left join b on a.r=b.r union select count(*) from b;
is doing.

count(a.r) is the total rows in a,
count(b.r) is the total rows in b that are also in a
count(a.r)-count(b.r) is the total rows in a that are distinct (no corresponding row in b).
union
select count(*) from b is the total rows in b

so you get two rows from that statement: the first row represents the rows in a that don't have corresponding rows in b and the second row is the total number of rows in b.  The sum of those two (the summing has to be done in Perl, not in sql because we don't have subqueries) is what you want (I think).

0
 
xfvgdrthbdtyvhgscvAuthor Commented:
Well, I ended up just doing

select  r from A
while (results) {
     $temp_hash{result} = 1
}
select  r from B
while (results) {
     $temp_hash{result} = 1
}
$count = keys(%temp_hash)

%temp_hash is the union of tables A & B.  Then just count the number of elements in the hash.  It's not pretty, but it works pretty quickly I think in part thanks to the simplicity of the queries.

Anyway, thanks for the lesson in UNION.  That alone was woth it.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.