The following tables, one in Standard SQL and the other in T-SQL to continue the discussion on sets used,
Relation (in T-SQL called Table)
--------------------------
---------------------
Name City <-- Heading: set of attributes with attribute name and type
Jeff New York \ in T-SQL attribute is known as column. (no order for columns)
Mike Los Angeles >
Hans Pasadena / Body: set of distinct tuples (in T-SQL distinct rows)
(no order for tuples)
Heading set: set{Name, City}
Body set: set{{Jeff, New York}, {Mike, Los Angeles}, {Hans, Pasadena}}
Here we have the same table adopted to T-SQL with an identical row added to it (because TSQL allows it):
Table1
-----------------------------------------------
RID Name City <-- Heading: set of attributes (in T-SQL called columns)
100 Jeff New York \ (There is no order for columns)
102 Mike Los Angeles \
103 Mike Los Angeles > Body: set of distinct rows, T-SQL allows repeated rows
104 Hans Pasadena / thus, hidden row id (RID) is added to make it distinct.
Heading set: set{Name, City}
Body set: set{{Jeff, New York} , {Mike, Los Angeles}, {Mike, Los Angeles}, {Hans, Pasadena}}
(Column RID exluded intentionally)
Other than Heading set and Body set, we also have Predicate set used in WHERE clause like:
SELECT * FROM TABLE_1 WHERE Name IN('Jeff', 'Mike')
The Predicate set is not a part of this discussion because the partial text below discusses predicate many lines after the quote from Codd.
Question: Why Codd says set {a, b, c} is considered equal to the set {a, a, b, c, c, c}?
We understand that a set used as predicate (in WHERE clause) turns a set like IN(a, a, b, c, c, c) to IN(a, b, c) hence we can see why they are equal. But we are talking about sets used in heading or body not predicate.
--------------------------------------------
Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 4
.
.
Remember that the heading of a relation is a set of attributes, and the body a set of tuples.
So what is a set? According to the creator of mathematical set theory, Georg Cantor, a set is
described as follows:
By a “set” we mean any collection M into a whole of definite, distinct objects
m (which are called the “elements” of M) of our perception or of our thought.
There are a number of very important principles in this definition that, if understood,
should have direct implications on your T-SQL coding practices. For one, notice the term
whole. A set should be considered as a whole. This means that you do not interact with the
individual elements of the set, rather with the set as a whole.
Lesson 1: Understanding the Foundations of T-SQL Chapter 1 Page 5
Notice the term distinct—a set has no duplicates. Codd once remarked on the no duplicates
aspect: ”If something is true, then saying it twice won't make it any truer.“ For example,
the set {a, b, c} is considered equal to the set {a, a, b, c, c, c}.
Another critical aspect of a set doesn’t explicitly appear in the aforementioned definition
by Cantor, but rather is implied—there’s no relevance to the order of elements in a set. In
contrast, a sequence (which is an ordered set), for example, does have an order to its elements.
Combining the no duplicates and no relevance to order aspects means that the set
{a, b, c} is equal to the set {b, a, c, c, a, c}.
The other branch of mathematics that the relational model is based on is called predicate
logic. A predicate is an expression that when attributed to some object, makes a proposition
either true or false. For example, “salary greater than $50,000” is a predicate. You can evaluate
this predicate for a specific employee, in which case you have a proposition. For example,
suppose that for a particular employee, the salary is $60,000. When you evaluate the proposition
for that employee, you get a true proposition. In other words, a predicate is a parameterized
proposition.
------------------------------------------------------
BookCover.png