# What Codd once said... part 2

The following tables, one in Standard SQL and the other in T-SQL to continue the discussion on sets used,

Relation     (in T-SQL called Table)
-----------------------------------------------
Name             City              <-- Heading: set of attributes with attribute name and type
Jeff               New York       \    in T-SQL attribute is known as column. (no order for columns)
Mike             Los Angeles    >
Hans            Pasadena       /   Body: set of distinct tuples (in T-SQL distinct rows)
(no order for tuples)

Body set: set{{Jeff, New York}, {Mike, Los Angeles}, {Hans, Pasadena}}

Here we have the same table adopted to T-SQL with an identical row added to it (because TSQL allows it):
Table1
-----------------------------------------------
RID       Name             City              <-- Heading: set of attributes (in T-SQL called columns)
100        Jeff               New York      \                                       (There is no order for columns)
102        Mike             Los Angeles   \
103        Mike             Los Angeles    >  Body: set of distinct rows, T-SQL allows repeated rows
104        Hans            Pasadena       /   thus, hidden row id (RID) is added to make it distinct.

Body set: set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}
(Column RID exluded intentionally)

Other than Heading set and Body set, we also have Predicate set used in WHERE clause like:

SELECT * FROM TABLE_1 WHERE Name IN('Jeff', 'Mike')

The Predicate set is not a part of this discussion because the partial text below discusses predicate many lines after the quote from Codd.

Question: Why Codd says set {a, b, c} is considered equal to the set {a, a, b, c, c, c}?

We understand that a set used as predicate (in WHERE clause) turns a set like IN(a, a, b, c, c, c) to IN(a, b, c) hence we can see why they are equal. But we are talking about sets used in heading or body not predicate.

--------------------------------------------
Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 4
.
.
Remember that the heading of a relation is a set of attributes, and the body a set of tuples.
So what is a set? According to the creator of mathematical set theory, Georg Cantor, a set is
described as follows:

By a “set” we mean any collection M into a whole of definite, distinct objects
m (which are called the “elements” of M) of our perception or of our thought.

There are a number of very important principles in this definition that, if understood,
should have direct implications on your T-SQL coding practices. For one, notice the term
whole. A set should be considered as a whole. This means that you do not interact with the
individual elements of the set, rather with the set as a whole.

Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 5
Notice the term distinct—a set has no duplicates. Codd once remarked on the no duplicates
aspect: ”If something is true, then saying it twice won't make it any truer.“ For example,
the set {a, b, c} is considered equal to the set {a, a, b, c, c, c}.

Another critical aspect of a set doesn’t explicitly appear in the aforementioned definition
by Cantor, but rather is implied—there’s no relevance to the order of elements in a set. In
contrast, a sequence (which is an ordered set), for example, does have an order to its elements.
Combining the no duplicates and no relevance to order aspects means that the set
{a, b, c} is equal to the set {b, a, c, c, a, c}.

The other branch of mathematics that the relational model is based on is called predicate
logic. A predicate is an expression that when attributed to some object, makes a proposition
either true or false. For example, “salary greater than \$50,000” is a predicate. You can evaluate
this predicate for a specific employee, in which case you have a proposition. For example,
suppose that for a particular employee, the salary is \$60,000. When you evaluate the proposition
for that employee, you get a true proposition. In other words, a predicate is a parameterized
proposition.
------------------------------------------------------
BookCover.png
LVL 34
###### Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Director, Practice Manager and Computing ConsultantCommented:
And what's the question?
0

Billing EngineerCommented:
Yes, what do you want to know now?
0

freelancerCommented:
Perhaps this should be a discussion?
0

Database and Application DeveloperAuthor Commented:
As stated in the original post:

Question: Why Codd says set {a, b, c} is considered equal to the set {a, a, b, c, c, c} ?

The second sets has duplicates thus it is not distinct.

We understand that a set used as predicate (in WHERE clause) turns a set like IN(a, a, b, c, c, c) to IN(a, b, c) hence we can see why they are equal. But we are talking about sets used in heading or body not predicate.
0

Billing EngineerCommented:
because as "set", they are indeed the same. Speaking purely "maths"
consider the EXISTS( ) condition, for example.

select * from sometable t
where exists ( select * from lookup l where l.value = t.value )

with the lookup table being the "set" mentioned.

in database term (implementation), it's a different world, as you most often have to consider each row with a "row id" (or row number) concept, hence a "database table" is not a "set", in regards to the maths concepts.

it will work in the EXISTS () concept
it will work when using a proper 'DISTINCT or GROUP BY' concept
it will eventually give issues in a plain JOIN, if either "sets (aka tables)" contains "duplicates"
0

Database and Application DeveloperAuthor Commented:
Guy,

EXISTS( )  is predicate. Predicates is not a part of this discussion.

Would you say that:

set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}

is equal to

set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}

They are not to me.
0

Director, Practice Manager and Computing ConsultantCommented:
In set theory, they are.

In T-SQL, they are not.
0

Experts Exchange Solution brought to you by ConnectWise

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Billing EngineerCommented:
Egthebas:
if you read my full comment, I said that in the context of that predicate, it behaves the same, and that in database implementation, they are NOT the same, but that you can make them (the tables/data) behavethe same.
0

Database and Application DeveloperAuthor Commented:
Guy,

I will read it again, it just seemed to me my emphases to exclude predicate is not noticed the same way my question statement was not noticed although in bold fonts.

Sorry for making you repeat your post. Basically, I know what Codd has said must be true. Based on what Phillip is saying, I have started reading about how sets work to see it all clearly.

Mike
0

Billing EngineerCommented:
you had put plenty of things in bold, sorry if it was overlooked. happens to everybody...
0

Database and Application DeveloperAuthor Commented:
Good news. I found the following:

Sets are unordered and we do not distinguish between repeated elements. So:
{1, 1, 1} = {1}, and {a, b, c} = {b, a, c}.

Philip,
Your comment was very helpful. It led me to do some reading on the sets and I found the above. For now, I know enough to move on. Disentanglement of T-SQL table principles from set theory was the key for my understanding.

Thanks,

Mike
0

Billing EngineerCommented:
> I know what Codd has said must be true.
it's good to challenge the old theories, even if afterwards all you come up is that they are (still) true.
The effect is to study and learn

>Body: set of distinct rows, T-SQL allows repeated rows thus, hidden row id (RID) is added to make it distinct.
0

Database and Application DeveloperAuthor Commented:
True they all are related in one way or another but, with simple math thinking, I had problem justifying why
set {a, b, c} is considered equal to the set {a, a, b, c, c, c}

Thank you for bearing with me and guiding me through.

Mike
0

Commented:
Also note the questions are a little confusing, since the critical piece you are asking about isn't something Codd actually said; and isn't correct anyway.

Look again at the text you have cited.  The  equivalence piece is not part of the Codd quote, they are something the other author added.

That addition isn't correct because {a,a,b,c,c,c} is not a set (by definition, which is what Codd was commenting on.)

It's like saying:  this apple is equivalent to this orange because they are both apples.

Except that's not true.

Now that I've read more of the rest of the source context.  The author introduced the problem with sloppy math.  The whole point of that section was about the fact that a set is distinct.  So constructing a new collection that has duplicates is, again by definition, not a set and therefore not really relevant to the rest of text.

But - the point the author was trying to connect was about how the property of distinctness applies to "relations" of sets which is, specifically the condition of membership or in sql context an IN or EXISTS predicate, and nothing more.
0

Billing EngineerCommented:
sdstuber, I think I have to disagree at some point: http://en.wikipedia.org/wiki/Set_%28mathematics%29
There are two important points to note about sets. First, a set can have two or more members which are identical, for example, {11, 6, 6}. However, we say that two sets which differ only in that one has duplicate members are in fact exactly identical (see Axiom of extensionality). Hence, the set {11, 6, 6} is exactly identical to the set {11, 6}.

now, the starting of (all) those articles about sets start or contain something like:
> In mathematics, a set is a collection of distinct objects <

the difference is about "object" and "value"
you have have several "objects" having the same "value", hence have distinct objects but not distinct values.
0

Commented:
you are correct, in my commenting, I was sloppy myself and probably added to the confusion.

thank you for the correction
0

Database and Application DeveloperAuthor Commented:
Thank you for bonus explanations.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.