Link to home
Start Free TrialLog in
Avatar of PeteEngineer
PeteEngineerFlag for India

asked on

What is the difference between where and exists statement

What is the difference between where and exists statement

Please don't post any links , i need a clear cut answer with one or two sentence !

This is for the purpose of interview!
ASKER CERTIFIED SOLUTION
Avatar of pdd1lan
pdd1lan

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of pdd1lan
pdd1lan

they are both doing the same job, but WHERE EXISTS is doing better job on the perform as my opinion.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Avatar of NerdsOfTech
NerdsOfTech
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
EXISTS i mean ;)
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
<<In relational terms, SELECT WHERE is a direct implementation of the projection operation, while EXISTS is an indirect implementation of the natural join.>>
In other words, they are different relational operators implementations in SQL Server.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
<<EXISTS specifies a subquery that determines whether or not a row(s) matching one or more specified conditions is found.  It returns TRUE if row found, FALSE otherwise.>>
Defining what the EXISTS operator does according to the subquery conditions is wrong and misleading.  

In the top level query, the  EXISTS *always* evaluates to TRUE.  The subquery has *no* influence over what the operator does.  If the subquery returns FALSE, the rows are simply discarded in the top query, not evaluated to FALSE.  

Additionally, in relational algebra, an operator is NOT defined according to the data but according to relations.An EXISTS is a natural/semi JOIN implementation on the relation defined by the subquery, the subquery being some other relation.

And that is the precise reason why any EXISTS SQL operator can be written as a part of a INNER JOIN condition, not the way around.  

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
In other words, if you don't already know the answer to that question, you are probably not qualified...
>> In the top level query, the  EXISTS *always* evaluates to TRUE.  The subquery has *no* influence over what the operator does. <<

?  What is your justification for claiming that ?

EXISTS() is boolean -- it simply returns TRUE or FALSE to the upper query.  The EXISTS() does *not* control row selection in the upper query; it can't possibly do that, since it's completely unaware of all the other WHERE conditions in the query.


>>  If the subquery returns FALSE, the rows are simply discarded in the top query, not evaluated to FALSE. <<

I don't see how it could possibly work that way.  How then would NOT EXISTS *ever* work?  If a FALSE return from the subquery "simply discarded (the row) in the top query", a NOT EXISTS could *never* work ... but it does.

One could also have OR EXISTS() OR EXISTS() OR EXISTS() ... if one false EXISTS() discarded the row, it could falsify the final result set.


>> An EXISTS is a natural/semi JOIN implementation <<

A "natural JOIN" has a specific meaning in relational that has nothing to do with EXISTS.  
<<?  What is your justification for claiming that ?>>
From BOL...

EXISTS: Returns TRUE if a subquery contains any rows(TRUE).
NOT EXISTS: works the opposite of EXISTS. The WHERE clause in NOT EXISTS is satisfied(TRUE) if *no* rows(FALSE) are returned by the subquery.

Both evaluate to TRUE at top query level if they are to be satisfied.  You confuse relational operators and row-level constructors.  EXISTS/NOT EXISTS are relational operators implementations of natural/semi join(EXISTS) and antijoin (NOT EXISTS)

<<The EXISTS() does *not* control row selection in the upper query; >>
Total nonsense.
You are wrong and stuck into low level abstraction.  It is because you only understand EXISTS as a function and row constructor that you are confused, you simply can't see that it is an operator as any operator in the upper query.

<<it can't possibly do that, since it's completely unaware of all the other WHERE conditions in the query.>>
You are wrong.  

EXISTS and NOT EXISTS are treated as any predicate in the top level query.  They both are set level operators.

<<I don't see how it could possibly work that way. >>
Whether you see it or not is a different matter.  

<<How then would NOT EXISTS *ever* work?
 If a FALSE return from the subquery "simply discarded (the row) in the top query", a NOT EXISTS could *never* work ... but it does.>>
EXISTS evaluates to TRUE when rows within the subquery evaluate to TRUE.
NOT EXISTS evaluates to TRUE when rows within the subquery evaluate to FALSE.

You only view EXISTS/NOT EXISTS as a function when they are in fact a relational operator implementation.

<<One could also have OR EXISTS() OR EXISTS() OR EXISTS() ... if one false EXISTS() discarded the row, it could falsify the final result set. >>
You confuse what a set level relational operator with a function.

<<A "natural JOIN" has a specific meaning in relational that has nothing to do with EXISTS. >>
See above.  
Mod please destroy this question >:) JK
>> In the top level query, the  EXISTS *always* evaluates to TRUE. <<

From BOL?  Please state where BOL states that EXISTS *always* evaluates to TRUE.
I think it's mistaken and misleading to include anything about a JOIN in the definition of an EXISTS.  

While it's possible that the EXISTS is part of a relational relationship to the "parent"/upper query, it's obviously not a requirement of an EXISTS.  Thus, to mention "join" in the *base definition* of an EXISTS is inaccurate.

For example, this is perfectly valid T-SQL:

IF EXISTS(SELECT TOP 1 * FROM tablename)
    PRINT 'tablename has at least one row.'
ELSE
    PRINT 'tablename is empty.'

Clearly there's no "join" involved here; indeed, there's no "parent"/upper/outer query at all.

Yes, sometimes SQL will implement the EXISTS as a join/anti-join operator, if the conditions in the EXISTS make it possible and the optimizer determines that it's better for performance.  I don't see that as fundamentally affecting that an EXISTS is simply a boolean check for the existence of a condition(s).  *Sometimes* that's equivalent to a join of some type; other times it's *not*.

So, some EXISTS are joins, but not all EXISTS are joins.  Therefore, EXISTS is not by *base definition* a "join".
<<I think it's mistaken and misleading to include anything about a JOIN in the definition of an EXISTS.  >>
As a function, EXISTS is only *one* possible implementation of the relational JOIN, SQL JOIN being another.  Since you don't make the difference between the former and the latter, the chances that you understand what I am referring to are very slim.

<<For example, this is perfectly valid T-SQL:

IF EXISTS(SELECT TOP 1 * FROM tablename)
    PRINT 'tablename has at least one row.'
ELSE
    PRINT 'tablename is empty.'
>>
Nobody claimed that EXISTS is not ,or can not, be used as separate function to test for the existence of a specific row.  However, in the context of being part of an *upper query*, it necessarily is nothing else than the implementation of a relational operator between two relations.  

<<Clearly there's no "join" involved here; indeed, there's no "parent"/upper/outer query at all.>>
See above.

<<So, some EXISTS are joins, but not all EXISTS are joins.  Therefore, EXISTS is not by *base definition* a "join>>
You simply established that EXISTS is a SQL function, which is obvious.  What you still fail to understand, is that, when used as an operator in a SQL query predicate, EXISTS clause can be systematically be replaced by INNER JOIN with the subquery.  In such context, EXISTS (not NOT EXISTS) and INNER JOIN/ ON are simply two ways of doing the same relational operation: natural/semi join
>> Since you don't make the difference between the former and the latter, the chances that you understand what I am referring to are very slim. <<

Sad you feel the need to keep doing those cheap shots.


>> However, in the context of being part of an *upper query*, it necessarily is nothing else than the implementation of a relational operator between two relations. <<

There was no such context in the original q.

Even so, what you've said is still simply not true.  It's *not* "necessary" that the EXISTS have *any relation whatsoever* to the query above it.  It's certainly allowed, but it is *not* required.

SELECT 1
FROM tableA a
WHERE EXISTS (
    SELECT 1
    FROM tableC c
    WHERE c.somecolumn = 'somevalue'
    )

Is perfectly acceptable.  There is an upper query, but the EXISTS subquery has no relationship whatsoever to the query above it.


[Btw, why do you keep insisting on calling EXISTS a "function"?  It's clearly *not* a function.]
<<.... cheap shots....>>
*ad hominem* comments.  Totally irrelevant to the topic.

All I can say is that recognizing our limitations is our responsibility and burden.  Don't put yours on me.

<<Is perfectly acceptable.  There is an upper query, but the EXISTS subquery has no relationship whatsoever to the query above it.>>
OK, this is my last attempt...Run attached snipplet...

<<http://www.java2s.com/Code/SQLServer/Subquery/EXISTSFunctionandSubqueries.htm>>
*Within the subquery*, EXISTS acts primarily a row constructor function that can take one or several columns of the *subquery* as an input and return a boolean, that can be either TRUE or FALSE within the subquery.  

On the other hand, in the *top query*, EXISTS is *also* a possible implementation of the semi-join relational operator between two relations R1 and R2.  In that situation, SELECT ...FROM R1 WHERE EXISTS(SELECT ...FROM R2) is simply a relational operation, known as semijoin between set R1 and set  R2. EXISTS(SELECT ...FROM R2) is a predicate on which R1 gets *filtered* as it would with any other predicate in the WHERE clause.



if object_id('relationA') is not null drop table relationA; 
	create table relationA(c1 int, c2  varchar(1));
	create unique index uix1 on relationA(c1); 
	insert relationA select 1, 'A' union all select 2, 'B';
	
if object_id('relationB') is not null drop table relationB; 
	create table relationB(c1 int, c2 varchar(1));
	create unique index uix2 on relationB(c1);
	insert relationB select 1, 'A' union all select 3, 'B' union all select 4, 'C'

/*RELATIONAL ALGEBRA SEMIJOIN - All SQL operations below are an implementation of the same relational operation */
select A.c1 
from relationA A
where exists(select c1 from relationB where c1=A.c1)

select A.c1 
from relationA A
where A.c1 in (select c1 from relationB) 

select A.c1 
from relationA A inner join (select c1 from relationB) B on A.c1 = B.c1 

select c1 from relationA
intersect
select c1 from relationB

/*RELATIONAL ALGEBRA ANTIJOIN - All SQL operations below are an implementation of the same relational operation*/
select A.c1 
from relationA A
where not exists(select c1 from relationB where c1=A.c1) 

select A.c1 
from relationA A
where A.c1 not in (select c1 from relationB) 

select c1 from relationA
except
select c1 from relationB

Open in new window

?

I already stated that EXISTS *could* be used in a relation.  Just that it's *NOT* "necessary" that an EXISTS is a relation, as you kept repeating.  

EXISTS is 100% NOT a "row constructor" by anyone's account.  Whoever wrote that is also plain wrong.
<<I already stated that EXISTS *could* be used in a relation.>>
Do you know what a relation is ?  From the tenure of your comment, I believe you don't.  
<<EXISTS is 100% NOT a "row constructor" by anyone's account.  Whoever wrote that is also plain wrong.>>
Your claim is baseless.  

By redefining the meaning of relation as a pointer, you are redefining an established term that has been around for the past 40 years.  A relation is a *logical* entity which is implemented physically as a table or a view under SQL.   In the provided example, tables relationA and relationB are physical implementations of relationA and relationB.  In relational model, that's what the *relational* stands for, not some meaningless pointer.
A quick wilkipedia...

A table is an accepted visual representation of a relation; a tuple is similar to the concept of row, but note that in the database language SQL the columns and the rows of a table are ordered.[citation needed]
@JDettman

<<OK guys, I think Pete knows more then he probably ever wanted to about WHERE vs EXISTS<g>.>>
LOL...
Thanks for the reminder. ;)

<<If you can keep it civil, your certainly welcome to thrash it out, but I think the original question has been answered many times over.>>
You are correct.  We are beginning to become off topic.

Regards...
Sorry.  I don't like allowing falsehoods to stand, but clearly the time has come to do just that.
Scott,
<<I already stated that EXISTS *could* be used in a relation>>
<<Sorry.  I don't like allowing falsehoods to stand, but clearly the time has come to do just that. >>
 Well I don't want to re-ignite this, but one thing you should be aware of is that  a "relation" and a "relationship" are two different things.   As Racimo pointed out, a relation is a logical construct.  In physical database terms it's represented by a table.   A relationship (a join) is something between two relations.
JimD.
<<Sorry.  I don't like allowing falsehoods to stand, but clearly the time has come to do just that.>>
Well, if you claim what I write is falsehood then you should be confident.  

After all, truth stands on its own and anybody with database education should easily separate the wheat from the chaff.
<<Well I don't want to re-ignite this, but one thing you should be aware of is that  a "relation" and a "relationship" are two different things.   As Racimo pointed out, a relation is a logical construct.  In physical database terms it's represented by a table.  >>

Thanks for (re)clarifying that.  Allow me to elaborate on few issues you pointed.

First, there is a need to be cautious about the *Relationship* term as it is not a part of the relational model: they are purely SQL concept to designate a physical pointers between indices.  Under logical perceptive of relational model, they are refered to as a specific type of constraint.  Once could say that relationships are SQL implementations of that specific constraint type.

Second, though relations are physically represented as tables on SQL systems, they can also be represented is not mandatory:

> Lists in CHECK IN constraints can represent a relation.  
> Views are physical representations of relations resulting from the operation between one or more relations.
> Table functions are similar to views.

Regards...
Wow, you really just want to pull out one word, huh?

OK, relation was the wrong term there.

But this statement:

"EXISTS is an indirect implementation of the natural join."

is still false as a *definition* of EXISTS.  [Btw, you should look up "natural join"; it has a specific meanng you're apparently not aware of.  Remember, we're in a *physical implementation* within SQL Server, not in a theoretical model.]

That aside, an EXISTS is not required to be a join of any type, natural or otherwise.

EXISTS tests for the existence (or not) of a row; hence the term "EXISTS", one supposes.

A given EXISTS *may* be equivalent to some type of join, but that is not "necessary", as Racimo repeatedly stated.  Viz:

SELECT 'EXISTS was TRUE!'
WHERE EXISTS(SELECT NULL)

A valid use of EXISTS -- absolutely no semblance of any join whatsoever.
Of course I meant "relation" in context to my other statements, i.e., the EXISTS had no relation to the upper query.  Relation as in normal English, not relation in relational-speak.

Even an EXISTS that is subordinated to an upper/owning query is *not* required to be join of any type, natural or otherwise.

it is simply mistaken to include "join" in the definition of EXISTS; they are two separate things.
<<EXISTS is an indirect implementation of the natural join.>>
semi join to be more precise.  semi join is a particular natural join.

<<Remember, we're in a *physical implementation* within SQL Server, not in a theoretical model.>>
Are  *at least* you aware that SQL performs relational operations, and that each table is a relation physical implementation.  It is the logical that defines the physical, not the way around.

<<That aside, an EXISTS is not required to be a join of any type, natural or otherwise.>>
EXISTS is one *possible* *physical* *implementation* of the relational semijoin.  Not the same as saying that the EXISTS is a semijoin.  The subtlety eludes you completely since you do not distinguish the logical from the physical layers.

<<EXISTS tests for the existence (or not) of a row; hence the term "EXISTS", one supposes.>>
As I said before, EXISTS acts *both* as a row constructor within the subquery and as a semijoin implementation within the upper query.

<<A given EXISTS *may* be equivalent to some type of join, but that is not "necessary", as Racimo repeatedly stated>>
You are wrong (again).  Here is what I said: <<in the context of being part of an *upper query*, it (EXISTS)necessarily is nothing else than the implementation of a relational operator between two relations>>.  meaning that when EXISTS is a part of a predicate in the WHERE clause of some query, it can systematically be written as an INNER JOIN.  I have never claimed that it can not be used otherwise.

<<
SELECT 'EXISTS was TRUE!'
WHERE EXISTS(SELECT NULL)
>>

Jeez. You chose the only example of EXISTS usage that is totally meaningless: use of EXISTS with NULL is meaningless since NULL is 3VL(= operator is meaningless).  

Even the following returns nothing

SELECT 'EXISTS was TRUE!'
WHERE NULL=NULL  --> yields nothing

And you failed to see that
SELECT 'EXISTS was TRUE!'
WHERE EXISTS(SELECT 1)

SELECT 'EXISTS was TRUE!' from
(SELECT 1 as c1) A
INNER JOIN (SELECT 1 as c1) B ON A.c1=B.c1

they are different simply different physical implementations of logical semijoin (NOT the physical JOIN)

<<Relation as in normal English, not relation in relational-speak.>>
General speak is irrelevant to understand logical operations.

<<Even an EXISTS that is subordinated to an upper/owning query is *not* required to be join of any type, natural or otherwise

it is simply mistaken to include "join" in the definition of EXISTS; they are two separate things.
>>
I see the source of your confusion now.

Since you ignore the relational model, and how SQL relates to relational algebra, you simply can't make the difference between the logical JOIN and the physical SQL JOIN, a logical database layer and a physical layer: The two are intermixed in your mind.

I am trying to tell you the two are *NOT* the same and that the latter is one possible* implementation of the former, another possible implementation being the EXISTS.    But I did not realize you totally ignored the difference between physical and logical layer.

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Racimo, I posted before I saw your reply. In case it isn't obvious the odd "¿" character above is supposed to be the EXISTS quantifier (reverse E).
>>  (ie the two tables returned by the two SELECT statements) <<

The SELECT in an EXISTS does not return a "table"; it simply checks for the existence of a row and returns TRUE or FALSE.

EXISTS in an IF works the same way -- it returns TRUE or FALSE.


>> EXISTS certainly is a join when it is used as part of a query (ie when it's not being used as a function in procedural code) <<

Hmm ... an EXISTS used in an IF is not a join, but that exact same EXISTS used in a WHERE clause of a SELECT would de facto always be a join??  That's just not logical.


And EXISTS is NOT a "function".  That term has a specific meaning in SQL Server as well.


EXISTS introduces a subquery that checks for the existence of a row.  Thus it is clearly not a "row constructor".  How bizarre to claim that it is.


EXISTS is not merely another way to express a join.  Sometimes the EXISTS() condition could not be added in any normal join syntax to the upper query.  Or, as MS puts it in BOL:

"
When a subquery is introduced with the keyword EXISTS, the subquery functions as an existence test. The WHERE clause of the outer query tests whether the rows that are returned by the subquery exist. The subquery does not actually produce any data; it returns a value of TRUE or FALSE.
...
The EXISTS keyword is important because frequently there is no alternative, nonsubquery formulation. ... some queries that are created with EXISTS cannot be expressed any other way
"

Presumably "cannot be expressed any other way" would include "cannot be expressed as a join".
>> Hmm ... an EXISTS used in an IF is not a join, but that exact same EXISTS used in >> a WHERE clause of a SELECT would de facto always be a join??  That's just not logical.

Clearly the EXISTS keyword is being used for two different things in T-SQL. Certainly not the only keyword where that is the case. There's no point trying to compare procedural code to a query.

>> The SELECT in an EXISTS does not return a "table"; it simply checks for the existence of a row and returns TRUE or FALSE.
The SQL standard disagrees with you. The definition of EXISTS in the 2003 standard (section 8.9) begins: "Let T be the result of the <table subquery>...". Logically speaking that's how EXISTS is defined. As Racimo says, you seem to be confusing logical and physical operations. Logically speaking, EXISTS is a join operator.

As for functions. Microsoft jargon notwithstanding, the word "function" has a proper mathematical meaning in the English language and has done for hundreds of years. http://mathworld.wolfram.com/Function.html. EXISTS does not stop being a function whatever Microsoft says!

Ok.  Fair enough, logically, internally it's a join.

I still strongly suspect the interviewer isn't primarily interested in the relational calculus involved or the internal logical operations.  I suspect that if you tell an interviewer that an EXISTS is a join you won't be there long.  "Next peson please...".
@dportas

<<Specifically, EXISTS is a semi-join.>>
Indeed.  See above posted code for examples of different implementations of semi join in SQL.

<<This is a join between two relations >>
Not quite.  relations are 2VL.  NULL is not.

<<In case it isn't obvious the odd "¿" character above is supposed to be the EXISTS quantifier (reverse E).>>
Thank you for pointing that out.  I traditionally rather use relational algebra and D&D algebra then relational calculus.

<<In Tutorial D syntax

t{ExistsWasTrue} JOIN s{foo} WHERE (1=1)
>> 
You mean :

t{ExistsWasTrue} SEMIJOIN s{foo} WHERE (1=1)
or
t{ExistsWasTrue} MATCHING s{foo} WHERE (1=1)
or
( ExistsWasTrue} JOIN {foo} ) { 1 }

<<Regretably, that's hardly surprising given that many people working with SQL are largely ignorant of the relational model, algebra or calculus.>>
Exactly.

@Scott
<<That's just not logical.>>
On the contrary, what dportas and JDettman makes *perfect* sense.  

<<EXISTS introduces a subquery that checks for the existence of a row.  Thus it is clearly not a "row constructor".  How bizarre to claim that it is.>>
The claim is not bizarre. It actually is obvious to the educated.

<<Presumably "cannot be expressed any other way" would include "cannot be expressed as a join".>>

The fellow exactly said:

<*SOME* queries that are created with EXISTS cannot be expressed any other way>  which means that SOME concerns a limited type of queries using EXISTS.>

....but did *not* specify which.  

I told you  that *ANY* query that uses EXISTS as a part of a WHERE clause can be expressed as an INNER JOIN, if the row constructor does not involve comparing NULL values.  
Maybe I'm not the one that's confused.

If someone wants to know how to start their car, I tell them to turn the key, not how starters, fuel injectors, batteries and spark plugs work.
>> Not quite.  relations are 2VL.  NULL is not
Agreed. I wanted to make that point and take the null out but I suspected Scott would cry foul. The null in this case could easily be replaced by a value because the point is just that there is a join between two tuples.

I also made an error in my Tutorial D query by not projecting away the "foo" attribute. I could even have used DEE instead of s{foo} in this particular case but I think that would have made the point about joining less clear.

@dportas

As I already explained, in the beginning of the thread I invite to read entirely, to Scott,  EXISTS is *both* the SQL implementation of logical semijoin(at upper level query), and a physical row constructor function in SQL Server internals(subquery).  Since Scott confuses the two layers, that subtlety totally eludes him.

As a row constructor function, EXISTS validates physical rows at run time that meet the inner query predicate and can return TRUE or FALSE on a row. Scott can not see it otherwise since he ignores relational model.

As a relational operator implementation in the upper query, well, even talking about an evaluation for an operator is a moot concept.  

Regards...
Scott, I've no idea what answer the interviewer might expect. Your assumptions about what the interviewer wanted are a poor excuse for you to hijack this thread, making it a confrontation rather than a discussion and only then to concede grudgingly that Racimo was right all along.
<<Maybe I'm not the one that's confused.>>
3 people already pointed out to you where your misunderstanding comes from but you seem to ignore their call.  

Are you in a learning perspective or are we wasting our time ?

<<If someone wants to know how to start their car, I tell them to turn the key, not how starters, fuel injectors, batteries and spark plugs work.>>
Here we go with analogies.   Since when are cars of any relevance in databases ?
I don't ignore the relational model.  I just don't think it applies to an interviewer's q about EXISTS.

I don't believe they want an answer that says it's a join, just because it's a *logical* join.

I think they want to know whether or not you understand the purpose of using EXISTS and what it *conceptually* does in a query.

[Obviously the answer would be different in a relational algebra context.]


So the OP can make his choice.  Tell the interviewer that EXISTS:

1) Is a logical join / row constructor because, based on the relational model, ..... [I don't think the rest matters; they're not listening anymore anyway]

2) is used to determine if a row exists that meets specified conditions, and is TRUE if such a row does exist and FALSE if it doesn't.
<<I also made an error in my Tutorial D query by not projecting away the "foo" attribute. I could even have used DEE instead of s{foo} in this particular case but I think that would have made the point about joining less clear.>>
Yep.
Scott:
1) is a better answer because it demonstrates that the interviewee probably understands conceptually what EXISTS does in a query. 2) reads more like a straight repetition of what the manual says.
<<I think they want to know whether or not you understand the purpose of using EXISTS and what it *conceptually* does in a query.>>
Here we go with vague and obscure terminology.  

On a DBMS such as SQL there are *only* two ways to look at a query: *logically* and *physically*.  I clarified both.  Beside there is no such thing as a *relational context*.  Relational model is pure set logic applied to physical SQL systems.  They are not to be separated.
Suggested:
Accept http:#a33989634 

(500 points for the SOLUTION)
NerdsOfTech,

I see no merit in your suggestion whatsoever, so I can only assume you wrote that in jest.  It is precisely this type of thread that serves the purpose to enlighten other readers, such as myself.

Thank you Racimo and dportas for a constructive argument.  No doubt, the author has abandoned the question, however it has been very thought-provoking and for that I appreciate your explanations and above all else your patience in clarifying the definition of the different terms.
Avatar of PeteEngineer

ASKER

Thanks for all the comments . i have not abandoned the question . i was waiting for more answers and different perspectives.

 
@acperkins, forgive me, I did do that blog hijack in jest. I wouldn't doubt with your Spock-like logic you would find that funny -- you are too smart for that :)

This is in fact I agree that this is an informative blog... the volume of info experts can pull out just from to SQL keywords is proof >:p

I am even more astonished by the fact the asker wants MORE.



OKAY ASKER we can give you MORE EXISTS and MORE WHERE info!
It's just that it is going to be the SAME info OVER AND OVER again!

Is there anything you need us to clarify?
Do you have examples of output you expect from a query with EXIST and/or WHERE?

Thanks!

Looks like everyone might get 17 points each. All 29 of us!


@dportas

<< EXISTS is always a join - actually a truer form of join than SQL's "INNER JOIN". An "inner" join, unlike the real relational join operation, can generate additional duplicate rows in the result. EXISTS does not do that.>>
With...
C0: cardinality 0 - no dupplicates in the output.
C1: cardinality of R1 dupplicates
C2: cardinality of R2 dupplicates
C3: cardinality of R1 dupplicates * cardinality of R2 dupplicates

..., the SQL Server dupplicate cardinality of R1 SEMI-JOIN R2 implentation for the operators  is the following (sorry for any tabular format loss)

                     Only R1 Dups   Only R2 Dups   Both R1 and R2 dups  
EXISTS                   C1                   C0                           C1                                      
IN                            C1                   C0                           C1
INNER JOIN             C1                   C2                           C3
INTERSECT             C0                   C0                           C0

--> When *either* R1 or R2 have dupplicates, INNER JOIN returns as many dupplicates that *either* R1 or R2 have in the first place.  
--> When *both* R1 or R2 have dupplicates, INNER JOIN returns a cartesian product number of dupplicates between the cardinality of R1 dupplicates and the cardinality of R2.   INNER JOIN is the worst implementation of SEMI JOIN.
--> EXISTS and IN are dependent of R1 dupplicates and ignore R2 duplicates.  Normal, considering they are handled internally the same (the same showplan).
--> INTERSECT is R1 and R2 dupplicate independent.  

Observations:
--> INTERSECT is the only SEMI JOIN SQL implementation, independent from the dupplicates cardinalities of any table involved in the operation.
--> INNER JOIN is the worst implementation of SEMI-JOIN.  When there are dupplicates on both sides, INNER JOIN returns a cartesian product between the dups cardinality of R1 and the dup cardinality of R2, the use of DISTINCT in the top query is mandatory to insure data correctness in the results.  

Conclusion
--> The only way to insure dupplicate-free results in SEMI-JOIN SQL Server's implementations is *either* to put DISTINCT in the top query or directly use the INTERSECT operator .
 
@PeteEngineer:
<<Thanks for all the comments . i have not abandoned the question . i was waiting for more answers and different perspectives.>>
I am going to (attempt) synthetize the points evoked in this thread by getting back to your initial question:

What the difference beween WHERE and EXISTS ?

In regards to the context of ignorance prevailing nowadays in database industry, the best answer could well be: It depends what kind of candidate you are looking for.  

If you are looking for somebody who tries to understand what he is doing, then you should be looking for somebody who can demonstrate *not only*  show SQL proficiency, but some basic level of database education.  For this profile, the obvious answer is, as indicated by dportas and myself: NONE since they are *both* implementations of the relational SEMI JOIN.  

That profile could be more relevant for a database designer or database analyst post types.

But as a second grade option, you may simply need the candidate to establish a limited scope mastery of SQL Server development for specific tasks to be performed, such as troubleshooting or code maintainance.  In such perspective, you may want him to list the consequences and differences between using the two operators in SQL Server.  

The answers could then be then in the format:
> Difference in usage: Traditionally, in SQL Server, WHERE is used to establish predicates with single values (in fact cardinality one relations) to achive simple filtering.  , while EXISTS is more used to establish predicates with column subsets.
> Difference in readibility, WHERE clauses usually make the whole query easier to read.  WHERE is probably more *natural language* like than EXISTS.  Perhaps, EXISTS is less intuitive.
> Difference in performance, EXISTS has a higher risk of provoking table scans than WHERE.
> Difference in behavior: compile/run time differences, dupplicates handling ...

That profile could be more relevant for a job requiring some limited scope SQL Server database administration design skills.

Hope this answers better your question...
>>  Relational model is pure set logic applied to physical SQL systems. <<

Actually no existing relational dbms product comes even close to being "pure".  Not one product really meets Dr. Codd's "12" rules as he first specified them for a relational dbms.

Luckily for us, the vendors weren't pedantic enough to insist on relational purity before releasing products, as some other people might have been.
@NerdsOfTech

Do you have any particular reason for keeping on pushing for the thread closure ?

You need to be made aware of that your insistance not only is disrespectful to acperkins, but also to the ASKER, who clearly expressed he wants to know more, and to all the people who put some *serious* effort into clarifying this complex question.  

<<I am even more astonished by the fact the asker wants MORE. >>
Since you underestimate the complexity of the question, it is no surprise, you underestimate the complexity of the answer.

<<It's just that it is going to be the SAME info OVER AND OVER again!>>
How do you know ?

Have you considered that, if the questionner requested more information, it might mean he's actually learning something, as all of us do, once in a blue moon.  

Eveybody who wants to learn can benefit from that.  So why the insistance ?

<<Looks like everyone might get 17 points each. All 29 of us!>>
Oh boy !  My poinks are gone !

LOL :))
<<Actually no existing relational dbms product comes even close to being "pure".  >>
First there is no such thing as a RDBMS, there are SQL-DBMS's.  

Second, there are some past and present implementations under development, that are light years from SQL systems and are closer to what an RDBMS would do, as fas as operations are concerned.  Most of them are Tutorial D variations implemented in Python, DOT NET and Perl.  

Here is a list

http://www.dcs.warwick.ac.uk/~hugh/TTM/Projects.html

<<Luckily for us, the vendors weren't pedantic enough to insist on relational purity before releasing products, as some other people might have been. >>
First time I hear we should feel lucky because Oracle, Microsoft DBMS's snake oil salesmen sold uneducated audiences systems returning dupplicates without a warning.  Considering that as an opportunity is simply saying to the salesman:

We don't know your DBMS sucks (return dupps without a warning) but that's fine: we'll pay both the license AND do the dupp cleaning for you.  All that should and could have been taken car of by the DBMS is not putting ton of unecessary work on database practictionners.

Also, there is a huge difference between being *pedantic* and being *educated*.  Among other things, database education (relational theory), allows one to recognize more systematically proprietary product limitations and cope with them proactively.  For example, when one knows what a SEMI JOIN is supposed to do, it is easier to see how not to do it or to do it poorly.  

In the end, only ignorants consider education with such disdain.
<<All that should and could have been taken car of by the DBMS is not putting ton of unecessary work on database practictionners.>>
I meant, all that should and could have been taken care of by the DBMS to not put tons of unecessary work on database practictionners.
>> Also, there is a huge difference between being *pedantic* and being *educated*. <<

They are not mutually exclusive, as anyone who's suffered through reading what you wrote convincingly knows.

The world would in terrible shape with such pendantry everywhere.  

The first cars sucked too.  So did the first light bulbs.  The first forks even.  The first car tires.  [Btw, "don't reinvent the wheel" is the dumbest expression in the world.  The wheel has been reinvented countless times: thankfully, otherwise we'd all be riding around on wooden/stone wheels.]
<<They are not mutually exclusive, as anyone who's suffered through reading what you wrote convincingly knows.>>
You are the one claiming they are mutually inclusive.  A new baseless claim.

<<The world would in terrible shape with such pendantry everywhere.  >>
You promote ignorance.  

As Dijjkstra claims...

The competent programmer is fully aware of the limited size of his own skull. He therefore approaches his task with full humility, and avoids clever tricks like the plague.

<<They are not mutually exclusive, as anyone who's suffered through reading what you wrote convincingly knows.>>
Right.  I don't have the pretension on educating everybody.  Appart from you, those who see interest in what I wrote have expressed themselves favorably.  

So, for what it's worth,I will console myself with the thought that I have not been the one rebuked by already 3 of my peers.

<<The first cars sucked too.  So did the first light bulbs.  The first forks even.  The first car tires.  [Btw, "don't reinvent the wheel" is the dumbest expression in the world.  The wheel has been reinvented countless times: thankfully, otherwise we'd all be riding around on wooden/stone wheels.]>>
Those who view database technology as cars, screws and lightbulbs should stick to what they know.  

Besides, rediscovering an existing concept is not the same as promoting ignorance.  Ignorants are condemned to repeat the same mistakes over and over.
>. He therefore approaches his task with full humility <<

Coming from you?   ROFLOL


There is a huge difference between promoting ignorance -- a truly bizarre claim here -- and being practical.

When you go in for an interview for a SQL job, and they ask you about "EXISTS", I don't think they want a slew of relation algebra or a relational concepts seminar.


>> Those who view database technology as cars, screws and lightbulbs should stick to what they know. <<

You can't see a simple analogy of technologies that all had to evolve?  If you didn't consider yourself so much above everyone else, maybe you'd understand things a little better.

<<There is a huge difference between promoting ignorance -- a truly bizarre claim here -- and being practical.>>
Failure to identify and recognize the extent of one's own ignorance, negative assimilation of knowledge to pedantry, obscure analogies and hasty irrelevant generalizations are effective ways to promote of ignorance .

Database practitioners, who bother educating themselves in database science, instead of discarding knowledge, are known to be more effective in their practice since they can identify more easily their shortcomings.  Others keep on repeating the same error without *ever* understanding why.

If you *still* think what I am saying is *bizarre*, I suggest you begin reading about the Dunning-Kruger effect

http://en.wikipedia.org/wiki/Dunning–Kruger_effect

--------------------------------------------------------------------------------------------------------------------
> Kruger and Dunning proposed that, for a given skill, incompetent people will:
-->tend to overestimate their own level of skill;
--> fail to recognize genuine skill in others;
--> fail to recognize the extremity of their inadequacy;
--> recognize and acknowledge their own previous lack of skill, if they can be trained to substantially improve.
--------------------------------------------------------------------------------------------------------------------

<<When you go in for an interview for a SQL job, and they ask you about "EXISTS", I don't think they want a slew of relation algebra or a relational concepts seminar.>>
You are the one saying that an interview is a place for educating people.  I have provided many practical enough answers, links and constructive suggestions to the questioner.  

Apart from continuous personal character attacks attempts, blowing wind, and continuous proofs of ignorance what have you produced in this thread  ?

<<You can't see a simple analogy of technologies that all had to evolve?>>
Oh, I understand *analogism*, I just find such too sloppy and irrelevant to talk about database technology.   Effective database technology communication and relevance respectively relies on well established terminonology and science, not on mumbo jumbo.

If you consider screws, tires, wheels, vague evolution concepts, and other gibberish  as a part of database technology, that should give you how much you *really* know about it.  If you can't talk about database technology in recognized terminology and keep on making stuff up, I will just begin ignoring your sentences.  


<<If you didn't consider yourself so much above everyone else, maybe you'd understand things a little better.>>
After the *you-are-pedantic-cause-you-like-science* attack, now the *thinks-he-knows-best* lectures.  

I do not consider myself better or worse than anybody, but I like educating myself about database technology and occasionally exposing nonsense and ignorance when I see it: thanks for the opportunity.

If you don't have anything else of value to produce then personal attacks, I am going to simply ignore any irrelevant mumbo jumbo from now on.
>> After the *you-are-pedantic-cause-you-like-science* attack <<

Sigh, all you do is set up straw men and then trumpet that you knocked them down.

I never said you were pedantic because you liked science.  I don't even know that you like science, just that you are *always* pedantic.
<<Snipped mumbo jumbo and personal attacks>>
As I said, you have nothing of value to add.  Case closed.
For the love of...close this.. oh wait.. Thanks @PeteEngineer!

I think we all learned something from this thread; maybe a little too much, but, hey --- Thanks for the 56 points anyways!

That's 229.4% more points than I anticipated 27 posts ago. Sweet!

:)