Solved

What Codd once said... part 2

Posted on 2015-02-06
17
139 Views
Last Modified: 2015-02-06
The following tables, one in Standard SQL and the other in T-SQL to continue the discussion on sets used,

Relation     (in T-SQL called Table)
-----------------------------------------------
Name             City              <-- Heading: set of attributes with attribute name and type
Jeff               New York       \    in T-SQL attribute is known as column. (no order for columns)
Mike             Los Angeles    >
Hans            Pasadena       /   Body: set of distinct tuples (in T-SQL distinct rows)
                                                                                          (no order for tuples)

Heading set: set{Name,  City}  
Body set: set{{Jeff, New York}, {Mike, Los Angeles}, {Hans, Pasadena}}

Here we have the same table adopted to T-SQL with an identical row added to it (because TSQL allows it):
Table1
-----------------------------------------------
RID       Name             City              <-- Heading: set of attributes (in T-SQL called columns)
100        Jeff               New York      \                                       (There is no order for columns)
102        Mike             Los Angeles   \
103        Mike             Los Angeles    >  Body: set of distinct rows, T-SQL allows repeated rows
104        Hans            Pasadena       /   thus, hidden row id (RID) is added to make it distinct.

Heading set: set{Name,  City}  
Body set: set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}
(Column RID exluded intentionally)

Other than Heading set and Body set, we also have Predicate set used in WHERE clause like:

SELECT * FROM TABLE_1 WHERE Name IN('Jeff', 'Mike')

The Predicate set is not a part of this discussion because the partial text below discusses predicate many lines after the quote from Codd.

Question: Why Codd says set {a, b, c} is considered equal to the set {a, a, b, c, c, c}?

We understand that a set used as predicate (in WHERE clause) turns a set like IN(a, a, b, c, c, c) to IN(a, b, c) hence we can see why they are equal. But we are talking about sets used in heading or body not predicate.

--------------------------------------------
Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 4
.
.
Remember that the heading of a relation is a set of attributes, and the body a set of tuples.
So what is a set? According to the creator of mathematical set theory, Georg Cantor, a set is
described as follows:

By a “set” we mean any collection M into a whole of definite, distinct objects
m (which are called the “elements” of M) of our perception or of our thought.

There are a number of very important principles in this definition that, if understood,
should have direct implications on your T-SQL coding practices. For one, notice the term
whole. A set should be considered as a whole. This means that you do not interact with the
individual elements of the set, rather with the set as a whole.

Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 5
Notice the term distinct—a set has no duplicates. Codd once remarked on the no duplicates
aspect: ”If something is true, then saying it twice won't make it any truer.“ For example,
the set {a, b, c} is considered equal to the set {a, a, b, c, c, c}.

Another critical aspect of a set doesn’t explicitly appear in the aforementioned definition
by Cantor, but rather is implied—there’s no relevance to the order of elements in a set. In
contrast, a sequence (which is an ordered set), for example, does have an order to its elements.
Combining the no duplicates and no relevance to order aspects means that the set
{a, b, c} is equal to the set {b, a, c, c, a, c}.

The other branch of mathematics that the relational model is based on is called predicate
logic. A predicate is an expression that when attributed to some object, makes a proposition
either true or false. For example, “salary greater than $50,000” is a predicate. You can evaluate
this predicate for a specific employee, in which case you have a proposition. For example,
suppose that for a particular employee, the salary is $60,000. When you evaluate the proposition
for that employee, you get a true proposition. In other words, a predicate is a parameterized
proposition.
------------------------------------------------------
BookCover.png
0
Comment
Question by:Mike Eghtebas
  • 6
  • 6
  • 2
  • +2
17 Comments
 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40593132
And what's the question?
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 40593134
Yes, what do you want to know now?
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40593144
Perhaps this should be a discussion?
0
Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40593145
As stated in the original post:

Question: Why Codd says set {a, b, c} is considered equal to the set {a, a, b, c, c, c} ?


The second sets has duplicates thus it is not distinct.

We understand that a set used as predicate (in WHERE clause) turns a set like IN(a, a, b, c, c, c) to IN(a, b, c) hence we can see why they are equal. But we are talking about sets used in heading or body not predicate.
0
 
LVL 142

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 250 total points
ID: 40593148
because as "set", they are indeed the same. Speaking purely "maths"
consider the EXISTS( ) condition, for example.

select * from sometable t
where exists ( select * from lookup l where l.value = t.value )

with the lookup table being the "set" mentioned.


in database term (implementation), it's a different world, as you most often have to consider each row with a "row id" (or row number) concept, hence a "database table" is not a "set", in regards to the maths concepts.

it will work in the EXISTS () concept
it will work when using a proper 'DISTINCT or GROUP BY' concept
it will eventually give issues in a plain JOIN, if either "sets (aka tables)" contains "duplicates"
0
 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40593152
Guy,

EXISTS( )  is predicate. Predicates is not a part of this discussion.

Would you say that:

set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}

is equal to

set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}

They are not to me.
0
 
LVL 24

Accepted Solution

by:
Phillip Burton earned 250 total points
ID: 40593158
In set theory, they are.

In T-SQL, they are not.
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 40593163
Egthebas:
if you read my full comment, I said that in the context of that predicate, it behaves the same, and that in database implementation, they are NOT the same, but that you can make them (the tables/data) behavethe same.
0
 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40593170
Guy,

I will read it again, it just seemed to me my emphases to exclude predicate is not noticed the same way my question statement was not noticed although in bold fonts.

Sorry for making you repeat your post. Basically, I know what Codd has said must be true. Based on what Phillip is saying, I have started reading about how sets work to see it all clearly.

Mike
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 40593186
you had put plenty of things in bold, sorry if it was overlooked. happens to everybody...
0
 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40593250
Good news. I found the following:

Sets are unordered and we do not distinguish between repeated elements. So:
{1, 1, 1} = {1}, and {a, b, c} = {b, a, c}.

Philip,
Your comment was very helpful. It led me to do some reading on the sets and I found the above. For now, I know enough to move on. Disentanglement of T-SQL table principles from set theory was the key for my understanding.

Thanks,

Mike
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 40593261
> I know what Codd has said must be true.
it's good to challenge the old theories, even if afterwards all you come up is that they are (still) true.
The effect is to study and learn

btw, your question had the "answer" also:
>Body: set of distinct rows, T-SQL allows repeated rows thus, hidden row id (RID) is added to make it distinct.
0
 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40593268
True they all are related in one way or another but, with simple math thinking, I had problem justifying why
set {a, b, c} is considered equal to the set {a, a, b, c, c, c}

Thank you for bearing with me and guiding me through.

Mike
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 40593601
Also note the questions are a little confusing, since the critical piece you are asking about isn't something Codd actually said; and isn't correct anyway.

Look again at the text you have cited.  The  equivalence piece is not part of the Codd quote, they are something the other author added.

That addition isn't correct because {a,a,b,c,c,c} is not a set (by definition, which is what Codd was commenting on.)

It's like saying:  this apple is equivalent to this orange because they are both apples.

Except that's not true.

Now that I've read more of the rest of the source context.  The author introduced the problem with sloppy math.  The whole point of that section was about the fact that a set is distinct.  So constructing a new collection that has duplicates is, again by definition, not a set and therefore not really relevant to the rest of text.

But - the point the author was trying to connect was about how the property of distinctness applies to "relations" of sets which is, specifically the condition of membership or in sql context an IN or EXISTS predicate, and nothing more.
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 40593703
sdstuber, I think I have to disagree at some point: http://en.wikipedia.org/wiki/Set_%28mathematics%29
There are two important points to note about sets. First, a set can have two or more members which are identical, for example, {11, 6, 6}. However, we say that two sets which differ only in that one has duplicate members are in fact exactly identical (see Axiom of extensionality). Hence, the set {11, 6, 6} is exactly identical to the set {11, 6}.

now, the starting of (all) those articles about sets start or contain something like:
> In mathematics, a set is a collection of distinct objects <

the difference is about "object" and "value"
you have have several "objects" having the same "value", hence have distinct objects but not distinct values.
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 40593738
you are correct, in my commenting, I was sloppy myself and probably added to the confusion.

thank you for the correction
0
 
LVL 33

Author Comment

by:Mike Eghtebas
ID: 40594163
Thank you for bonus explanations.
0

Featured Post

VMware Disaster Recovery and Data Protection

In this expert guide, you’ll learn about the components of a Modern Data Center. You will use cases for the value-added capabilities of Veeam®, including combining backup and replication for VMware disaster recovery and using replication for data center migration.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
This article provides a brief introduction to tissue engineering, the process by which organs can be grown artificially. It covers the problems with organ transplants, the tissue engineering process, and the current successes and problems of the tec…
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

776 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question