Solved

# What Codd once said... part 2

Posted on 2015-02-06
136 Views
The following tables, one in Standard SQL and the other in T-SQL to continue the discussion on sets used,

Relation     (in T-SQL called Table)
-----------------------------------------------
Name             City              <-- Heading: set of attributes with attribute name and type
Jeff               New York       \    in T-SQL attribute is known as column. (no order for columns)
Mike             Los Angeles    >
Hans            Pasadena       /   Body: set of distinct tuples (in T-SQL distinct rows)
(no order for tuples)

Body set: set{{Jeff, New York}, {Mike, Los Angeles}, {Hans, Pasadena}}

Here we have the same table adopted to T-SQL with an identical row added to it (because TSQL allows it):
Table1
-----------------------------------------------
RID       Name             City              <-- Heading: set of attributes (in T-SQL called columns)
100        Jeff               New York      \                                       (There is no order for columns)
102        Mike             Los Angeles   \
103        Mike             Los Angeles    >  Body: set of distinct rows, T-SQL allows repeated rows
104        Hans            Pasadena       /   thus, hidden row id (RID) is added to make it distinct.

Body set: set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}
(Column RID exluded intentionally)

Other than Heading set and Body set, we also have Predicate set used in WHERE clause like:

SELECT * FROM TABLE_1 WHERE Name IN('Jeff', 'Mike')

The Predicate set is not a part of this discussion because the partial text below discusses predicate many lines after the quote from Codd.

Question: Why Codd says set {a, b, c} is considered equal to the set {a, a, b, c, c, c}?

We understand that a set used as predicate (in WHERE clause) turns a set like IN(a, a, b, c, c, c) to IN(a, b, c) hence we can see why they are equal. But we are talking about sets used in heading or body not predicate.

--------------------------------------------
Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 4
.
.
Remember that the heading of a relation is a set of attributes, and the body a set of tuples.
So what is a set? According to the creator of mathematical set theory, Georg Cantor, a set is
described as follows:

By a “set” we mean any collection M into a whole of definite, distinct objects
m (which are called the “elements” of M) of our perception or of our thought.

There are a number of very important principles in this definition that, if understood,
should have direct implications on your T-SQL coding practices. For one, notice the term
whole. A set should be considered as a whole. This means that you do not interact with the
individual elements of the set, rather with the set as a whole.

Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 5
Notice the term distinct—a set has no duplicates. Codd once remarked on the no duplicates
aspect: ”If something is true, then saying it twice won't make it any truer.“ For example,
the set {a, b, c} is considered equal to the set {a, a, b, c, c, c}.

Another critical aspect of a set doesn’t explicitly appear in the aforementioned definition
by Cantor, but rather is implied—there’s no relevance to the order of elements in a set. In
contrast, a sequence (which is an ordered set), for example, does have an order to its elements.
Combining the no duplicates and no relevance to order aspects means that the set
{a, b, c} is equal to the set {b, a, c, c, a, c}.

The other branch of mathematics that the relational model is based on is called predicate
logic. A predicate is an expression that when attributed to some object, makes a proposition
either true or false. For example, “salary greater than \$50,000” is a predicate. You can evaluate
this predicate for a specific employee, in which case you have a proposition. For example,
suppose that for a particular employee, the salary is \$60,000. When you evaluate the proposition
for that employee, you get a true proposition. In other words, a predicate is a parameterized
proposition.
------------------------------------------------------
BookCover.png
0
Question by:Mike Eghtebas
• 6
• 6
• 2
• +2

LVL 24

Expert Comment

And what's the question?
0

LVL 142

Expert Comment

Yes, what do you want to know now?
0

LVL 48

Expert Comment

Perhaps this should be a discussion?
0

LVL 33

Author Comment

As stated in the original post:

Question: Why Codd says set {a, b, c} is considered equal to the set {a, a, b, c, c, c} ?

The second sets has duplicates thus it is not distinct.

We understand that a set used as predicate (in WHERE clause) turns a set like IN(a, a, b, c, c, c) to IN(a, b, c) hence we can see why they are equal. But we are talking about sets used in heading or body not predicate.
0

LVL 142

Assisted Solution

Guy Hengel [angelIII / a3] earned 250 total points
because as "set", they are indeed the same. Speaking purely "maths"
consider the EXISTS( ) condition, for example.

select * from sometable t
where exists ( select * from lookup l where l.value = t.value )

with the lookup table being the "set" mentioned.

in database term (implementation), it's a different world, as you most often have to consider each row with a "row id" (or row number) concept, hence a "database table" is not a "set", in regards to the maths concepts.

it will work in the EXISTS () concept
it will work when using a proper 'DISTINCT or GROUP BY' concept
it will eventually give issues in a plain JOIN, if either "sets (aka tables)" contains "duplicates"
0

LVL 33

Author Comment

Guy,

EXISTS( )  is predicate. Predicates is not a part of this discussion.

Would you say that:

set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}

is equal to

set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}

They are not to me.
0

LVL 24

Accepted Solution

Phillip Burton earned 250 total points
In set theory, they are.

In T-SQL, they are not.
0

LVL 142

Expert Comment

Egthebas:
if you read my full comment, I said that in the context of that predicate, it behaves the same, and that in database implementation, they are NOT the same, but that you can make them (the tables/data) behavethe same.
0

LVL 33

Author Comment

Guy,

I will read it again, it just seemed to me my emphases to exclude predicate is not noticed the same way my question statement was not noticed although in bold fonts.

Sorry for making you repeat your post. Basically, I know what Codd has said must be true. Based on what Phillip is saying, I have started reading about how sets work to see it all clearly.

Mike
0

LVL 142

Expert Comment

you had put plenty of things in bold, sorry if it was overlooked. happens to everybody...
0

LVL 33

Author Comment

Good news. I found the following:

Sets are unordered and we do not distinguish between repeated elements. So:
{1, 1, 1} = {1}, and {a, b, c} = {b, a, c}.

Philip,
Your comment was very helpful. It led me to do some reading on the sets and I found the above. For now, I know enough to move on. Disentanglement of T-SQL table principles from set theory was the key for my understanding.

Thanks,

Mike
0

LVL 142

Expert Comment

> I know what Codd has said must be true.
it's good to challenge the old theories, even if afterwards all you come up is that they are (still) true.
The effect is to study and learn

>Body: set of distinct rows, T-SQL allows repeated rows thus, hidden row id (RID) is added to make it distinct.
0

LVL 33

Author Comment

True they all are related in one way or another but, with simple math thinking, I had problem justifying why
set {a, b, c} is considered equal to the set {a, a, b, c, c, c}

Thank you for bearing with me and guiding me through.

Mike
0

LVL 73

Expert Comment

Also note the questions are a little confusing, since the critical piece you are asking about isn't something Codd actually said; and isn't correct anyway.

Look again at the text you have cited.  The  equivalence piece is not part of the Codd quote, they are something the other author added.

That addition isn't correct because {a,a,b,c,c,c} is not a set (by definition, which is what Codd was commenting on.)

It's like saying:  this apple is equivalent to this orange because they are both apples.

Except that's not true.

Now that I've read more of the rest of the source context.  The author introduced the problem with sloppy math.  The whole point of that section was about the fact that a set is distinct.  So constructing a new collection that has duplicates is, again by definition, not a set and therefore not really relevant to the rest of text.

But - the point the author was trying to connect was about how the property of distinctness applies to "relations" of sets which is, specifically the condition of membership or in sql context an IN or EXISTS predicate, and nothing more.
0

LVL 142

Expert Comment

sdstuber, I think I have to disagree at some point: http://en.wikipedia.org/wiki/Set_%28mathematics%29
There are two important points to note about sets. First, a set can have two or more members which are identical, for example, {11, 6, 6}. However, we say that two sets which differ only in that one has duplicate members are in fact exactly identical (see Axiom of extensionality). Hence, the set {11, 6, 6} is exactly identical to the set {11, 6}.

now, the starting of (all) those articles about sets start or contain something like:
> In mathematics, a set is a collection of distinct objects <

the difference is about "object" and "value"
you have have several "objects" having the same "value", hence have distinct objects but not distinct values.
0

LVL 73

Expert Comment

you are correct, in my commenting, I was sloppy myself and probably added to the confusion.

thank you for the correction
0

LVL 33

Author Comment

Thank you for bonus explanations.
0

## Featured Post

### Suggested Solutions

This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be \$37.1B.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Via a live example, show how to shrink a transaction log file down to a reasonable size.