Access Not correlating subquery correctly

dqmq
dqmq used Ask the Experts™
on
myTable has a unique ID column and an EmployeeID column that allows duplicates. The following query is expected to return 1 row for each employee and does exactly that in most SQL databases.  But in Access (2007), it only returns 1 employee row.  The cause has to do with the T1 alias, which confuses Access.   If you change or remove the capitalized T1 alias, then the query works fine.

select * from mytable as t1
where id = (select max(id) from (SELECT * FROM MYTABLE AS T1) as t2
where  t1.[employeeid] = t2.[employeeid]);

To my understanding the capitalized T1 alias is out of scope for the outer where clause on employeeid and, therefore, should not be misinterpreted. Is Access out of compliance with the SQL standard or am I missing something?

Please do not answer with suggestions about how to "fix" the SQL and, yes, I know the inner most select is really unnecessary in this simplified example.

 Thanks









   
create table MyTable (id int, employeeid int, tries int);

	insert into MyTable  select 1, 1, 1;
	insert into MyTable  select 2, 1, 2;
	insert into MyTable  select 3, 1, 4;
	insert into MyTable  select 4, 1, 4;
	insert into MyTable  select 5, 2, 2;
	insert into MyTable  select 6, 1, 2;
	insert into MyTable  select 7, 2, 3;
	insert into MyTable  select 8, 3, 1;

--this produces 1 row per employee
select * from mytable as t1
where id = (select max(id) from (SELECT * FROM MYTABLE AS T1) as t2
where  t1.[employeeid] = t2.[employeeid]);

--this produces 1 employee row
select * from mytable as t1
where id = (select max(id) from (SELECT * FROM MYTABLE) as t2
where  t1.[employeeid] = t2.[employeeid]);

Open in new window

Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
> Is Access out of compliance with the SQL standard or am I missing something?
It is out of compliance.
I can say it as many times as you like. :)
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
Not sure any DB is in full compliance :-)

mx
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
The query fragments don't even run, not just access but sql server.
Maybe fix the sql and we can try the question again?
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
This works in SQL Server but Access will not even have a bar of it.

--this produces 1 row per employee
select *
from mytable as t1
where id = (
      select id
      from (
            select employeeid, max(id) as id
            from (
                  SELECT * FROM MYTABLE) AS T1
            group by employeeid) AS T2
      where t1.[employeeid] = t2.[employeeid])

Access is never and has never been a fully ANSI compliant industry grade DBMS.
So, what can I say...
I tried your example with Access 2010 and get
1 line for query 1 and
3 lines for query 2

But the result is the same, nevertheless I use T1 or t1
So here it may have something to do with just the order.

Is Access out of compliance with the SQL standard or am I missing something:
Access is in general not 100% SQL conform, but the question is also, which SQL is the standard?
In my mind, the usuage of "T1" for two different parts of the query is subject of interpretation.
Expert of the Quarter 2010
Expert of the Year 2010
Commented:
And if you want to see Access doing something weird, have a look at this query

select *
from mytable2 as t1
where id in (    '''''' <<<< change this to "=" first
      select id2
      from (
            select employeeid, max(t1.id) as id2
            from (
                  SELECT * FROM MYTABLE2) AS T1
            group by employeeid) AS T2
      where t1.[employeeid] = t2.[employeeid])

For the equal version, Access complains that it must return 1 row only.  Doing it by hand, I am absolutely certain it returns only one record per employeeid.

The in version works, but I assure you not in the way you would think.
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
I've emulated this in A2003:

"--this produces 1 row per employee"
Mine produces 1 row period.
id      employeeid      tries
8      3      1

"this produces 1 employee row
Mine produces 3 employee rows:
id      employeeid      tries
6      1      2
7      2      3
8      3      1

Seems you have stated it backwards ?

mx
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
dbmx.. which queries are you using!?
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
The two he posted, in that order.

mx
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
mx,

Thanks!

dqmq,

You know what? I need a break :)
Sorry for the comment http:#a33119442 .  There was a PEBKAC between copying and pasting your code.
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
Soooo ... what's happening ?

mx

Author

Commented:
mx,
My goof: I labeled the queries backwards.   Nonetheless, my contention is that the queries are logically equivalent, and produce identical results on SQL server, but different results on Access.


cyber,
I'm using the two queries in the snippet.  They both run on SQL server and Access...just produce different results.




Expert of the Quarter 2010
Expert of the Year 2010

Commented:
>cyber,
>I'm using the two queries in the snippet.  They both run on SQL server and Access...just produce different results.

Yes I figured that out (in the end).
I provided some more tests for you up to to see other Access "Quirks", to reveal more of what the engine is doing.

If you want traction and it bothers you, log a bug with Microsoft or go to a Microsoft forum.
Otherwise, just always use unique aliases!
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
So the only difference is:

FROM MYTABLE) as t2

vs

FROM MYTABLE AS T1) as t2

?

Author

Commented:
Right
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
this

FROM MYTABLE AS T1) as t2

seems confusing to me.  What would be the reason for doing this ?

mx

Author

Commented:
In the actual query, that's a join to another nested select and yet another correlating where clause.  In isolating the cause, I methodically reduced the query to the minimal structure needed to illustrate the problem.  I was trying to make it LESS confusing. :>)  
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
dqmq,

I think you have been poisoned *ahem* spoilt by Sql Server.
Doubly nested queries are ok in Sql Server but even Oracle/MySQL will give up.  I haven't read the Ansi SQL standard, but I guess it doesn't mention much about nesting.

select *, (Select min(a) from (select a union all select b union all select c))
from tbl

where a,b,c are columns in the table "tbl". This query finds the least non-null value from the 3 columns.
I don't know what you mean by “give up”...

Oracle® Database SQL Reference — Using Subqueries
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/queries007.htm

“A subquery can contain another subquery. Oracle Database imposes no limit on the number of subquery levels in the FROM clause of the top-level query. You can nest up to 255 levels of subqueries in the WHERE clause.”

The following page starts explaining how the optimiser can often “unnest” subqueries to produce a better execution plan. Scoping of aliases is formalised as well, along with general name resolution.

Access also handles sub-queries quite gracefully, in its own way. It doesn't optimise them, and apparently has no formal scoping of aliases, but they do work. I can't speak for MySQL, but are you saying it would choke on your query above?

I don't believe you can get spoiled by SQL Server. It's just one engine among many.

(°v°)
dqmq,

I was following this thread out of curiosity. Subqueries and even nested subqueries in the WHERE are supported by Access since version 2.0 at least. Subqueries in the FROM clause came in a little later (first as an undocumented hack).

However, I have never seen a document about scoping rules of aliases in Jet. It seems logical to us that the inner alias (T1) should not be visible outside of the subquery, and this is formalised in most if not all major SQL engines. I don't know if ANSI has any specifications about that, though. I have looked for it a while yesterday, but all examples systematically use different aliases, typically t1, t2, t3, not stating if the choice is made for the human reader or for the engine.

So it's not really a bug in Jet, since scoping isn't documented.

Cheers!
(°v°)
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
"I was following this thread out of curiosity"
And I was about to ping you in case you were not :-)

mx
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
@harfanq

Have you tried that particular subquery?
A subquery 2 levels deep trying to access columns from the outer query?
Oracle throws its hands in the air and says "wtf?!"
Expert of the Quarter 2010
Expert of the Year 2010

Commented:
Just a note, in Sql Server, you need to alias the subquery, so the above should have been

select *, (Select min(a) from (select a union all select b union all select c) x)
from tbl

To help you, here's a test including table creation and population:

create table tbl_ck(a int, b int, c int)
insert into tbl_ck values (1,2,3)
insert into tbl_ck values (4,5,6)
select *, (Select min(a) from (select a union all select b union all select c) x)
from tbl_ck

Try in Sql Server, then Oracle and MySql
cyberkiwi,

My bad. I read your comment to mean that “Doubly nested queries are ok in Sql Server but even Oracle/MySQL will give up.” I didn't link that sentence to the query below, which I admit I hadn't studied in detail (I didn't see how it was addressing alias scoping issues), and I haven't tried it in Oracle. Is this syntax part of the “SELECT overloading” principle?

Rereading, I do believe you when you say it will fail. Sorry about overreacting.

(°v°)

Author

Commented:
>And if you want to see Access doing something weird, have a look at this query
>(see query several frames back)

I agree, that's weird alright.  And it's also the same effect that I experienced in my query with the T1 alias.  Change  the inner most (captialized query) alias to T3 and it works the same as the IN version, which is how it should work.  It also works the properly on SQL server.    

Columns from the inner most SELECT are of scope for the last  WHERE clause.  You can prove that by running this query:

select *
from mytable as t1
where id = (  
      select id
      from (
            select employeeid, max(id)
            from ( SELECT * FROM MYTABLE) AS T1
            group by employeeid ) AS T2
      where t1.[id] = t2.[employeeid])   <==note the change here

In this case, access properly cannot resolve t1.id in the final line because it is out of scope. But change it to t1.employeeid and access improperly resolves it.

-----
FWIW, it's Access that's spoiled me, not SQL Server. :>)    



 
DatabaseMX (Joe Anderson - Microsoft Access MVP)Database Architect / Application Developer
Top Expert 2007

Commented:
"Access that's spoiled me"
10-4 And Copy That :-)

mx

Author

Commented:
Actually, the 92 standard is painfully detailed about nesting and scope.  My reading of the standard says that:

1).  The scope of the inner select does not extend to an outer where clause
2).  The same alias is NOT allowed twice within the same scope
3).  The same alias is permitted in different scopes

Soooo...the query should fail to compile if both T1's are in the same scope and it should return a different result if they are not in the same scope.  I rather respect Oracle for balking at the SQL, and as much as I fear Access's occasional complaint that a query is too complex, that's better than happily returning nonsensical results.

For your reading pleasure, here are some excerpts from the stancard:

            An <identifier> that is a <correlation name> is associated with
            a table within a particular scope. The scope of a <correlation
            name> is either a <select statement: single row>, <subquery>, or
            <query specification> (see Subclause 6.3, "<table reference>").
            Scopes may be nested. In different scopes, the same <correlation
            name> may be associated with different tables or with the same
            table.

       A <correlation name> immediately contained in a <table refer-
            ence> TR is exposed by TR. A <table name> immediately contained
            in a <table reference> TR is exposed by TR if and only if TR
            does not specify a <correlation name>.

         2) Case:

            a) If a <table reference> TR is contained in a <from clause> FC
              with no intervening <derived table>, then the scope clause
              SC of TR is the <select statement: single row> or innermost
              <query specification> that contains FC. The scope clause of
              the exposed <correlation name> or exposed <table name> of TR
              is the <select list>, <where clause>, <group by clause>, and
              <having clause> of SC, together with the <join condition> of
              all <joined table>s contained in SC that contains TR.

            b) Otherwise, the scope clause SC of TR is the outermost <joined
              table> that contains TR with no intervening <derived table>.
              The scope of the exposed <correlation name> or exposed <table
              name> of TR is the <join condition> of SC and of all <joined
              table>s contained in SC that contain TR.

         3) A <table name> that is exposed by a <table reference> TR shall
            not be the same as any other <table name> that is exposed by a
            <table reference> with the same scope clause as TR.

Expert of the Quarter 2010
Expert of the Year 2010

Commented:
I don't know about you, but the fact that "innermost" and "outermost" is used implies that nesting and scoping goes multi-level.  I cannot make more sense out of that without diagrams and solid examples - so I bet the engineers at different companies also cannot agree.
In plain english, if scoping is restricted to one level in or out, then it would have used different words or phrases that do not contain "-most", such as "immediately outer", "outer".

IMHO
dqmq,

I can't seem to make as much sense as you do from these specification. When you express it like this, I tend to agree: if the T1 alias collides, it should be an error (there is no rule on alias precedence).

(°v°)
cyberkiwi,

I did the test in Oracle. Indeed, the column aliases from the main query are not available at the second level of nesting. Also, Oracle doesn't support the null FROM clause (I used FROM DUAL), but that wasn't the issue.

Cheers!
(°v°)

Author

Commented:
Thanks all for sharing your experience on various platforms

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial