Реклама: |

In the previous section (*Relational Data Model Formalities*)
we defined the mathematical notion of
the relational model. Now we know how the data can be stored using a
relational data model but we do not know what to do with all these
tables to retrieve something from the database yet. For example somebody
could ask for the names of all suppliers that sell the part
'Screw'. Therefore two rather different kinds of notations for
expressing operations on relations have been defined:

The

*Relational Algebra*which is an algebraic notation, where queries are expressed by applying specialized operators to the relations.The

*Relational Calculus*which is a logical notation, where queries are expressed by formulating some logical restrictions that the tuples in the answer must satisfy.

The *Relational Algebra* was introduced by
E. F. Codd in 1972. It consists of a set of operations on relations:

SELECT (σ): extracts

*tuples*from a relation that satisfy a given restriction. Letbe a table that contains an attribute*R*. σ*A*_{A=a}(R) = {t ∈ R ∣ t(A) = a} where`t`denotes a tuple ofand*R*`t(A)`denotes the value of attributeof tuple*A*`t`.PROJECT (π): extracts specified

*attributes*(columns) from a relation. Let`R`be a relation that contains an attribute`X`. π_{X}(`R`) = {t(X) ∣ t ∈`R`}, where`t`(`X`) denotes the value of attribute`X`of tuple`t`.PRODUCT (×): builds the Cartesian product of two relations. Let

`R`be a table with arity`k`_{1}and let`S`be a table with arity`k`_{2}.`R`×`S`is the set of all`k`_{1}+`k`_{2}-tuples whose first`k`_{1}components form a tuple in`R`and whose last`k`_{2}components form a tuple in`S`.UNION (∪): builds the set-theoretic union of two tables. Given the tables

`R`and`S`(both must have the same arity), the union`R`∪`S`is the set of tuples that are in`R`or`S`or both.INTERSECT (∩): builds the set-theoretic intersection of two tables. Given the tables

`R`and`S`,`R`∪`S`is the set of tuples that are in`R`and in`S`. We again require that`R`and`S`have the same arity.DIFFERENCE (− or ∖): builds the set difference of two tables. Let

`R`and`S`again be two tables with the same arity.`R`-`S`is the set of tuples in`R`but not in`S`.JOIN (∏): connects two tables by their common attributes. Let

`R`be a table with the attributes`A`,`B`and`C`and let`S`be a table with the attributes`C`,`D`and`E`. There is one attribute common to both relations, the attribute`C`. R ∏ S = π_{R.A,R.B,R.C,S.D,S.E}(σ_{R.C=S.C}(R × S)). What are we doing here? We first calculate the Cartesian product`R`×`S`. Then we select those tuples whose values for the common attribute`C`are equal (σ_{R.C = S.C}). Now we have a table that contains the attribute`C`two times and we correct this by projecting out the duplicate column.**Example 59-2. An Inner Join**Let's have a look at the tables that are produced by evaluating the steps necessary for a join. Let the following two tables be given:

R A | B | C S C | D | E ---+---+--- ---+---+--- 1 | 2 | 3 3 | a | b 4 | 5 | 6 6 | c | d 7 | 8 | 9

First we calculate the Cartesian product

`R`×`S`and get:R x S A | B | R.C | S.C | D | E ---+---+-----+-----+---+--- 1 | 2 | 3 | 3 | a | b 1 | 2 | 3 | 6 | c | d 4 | 5 | 6 | 3 | a | b 4 | 5 | 6 | 6 | c | d 7 | 8 | 9 | 3 | a | b 7 | 8 | 9 | 6 | c | d

After the selection σ

_{R.C=S.C}(R × S) we get:A | B | R.C | S.C | D | E ---+---+-----+-----+---+--- 1 | 2 | 3 | 3 | a | b 4 | 5 | 6 | 6 | c | d

To remove the duplicate column

`S`.`C`we project it out by the following operation: π_{R.A,R.B,R.C,S.D,S.E}(σ_{R.C=S.C}(R × S)) and get:A | B | C | D | E ---+---+---+---+--- 1 | 2 | 3 | a | b 4 | 5 | 6 | c | d

DIVIDE (÷): Let

`R`be a table with the attributes A, B, C, and D and let`S`be a table with the attributes C and D. Then we define the division as: R ÷ S = {t ∣ ∀ t_{s}∈ S ∃ t_{r}∈ R such that t_{r}(A,B)=t∧t_{r}(C,D)=t_{s}} where t_{r}(x,y) denotes a tuple of table`R`that consists only of the components`x`and`y`. Note that the tuple`t`only consists of the components`A`and`B`of relation`R`.Given the following tables

R A | B | C | D S C | D ---+---+---+--- ---+--- a | b | c | d c | d a | b | e | f e | f b | c | e | f e | d | c | d e | d | e | f a | b | d | e

R ÷ S is derived asA | B ---+--- a | b e | d

For a more detailed description and definition of the relational
algebra refer to [*Ullman, 1988*] or
[*Date, 1994*].

**Example 59-3. A Query Using Relational Algebra**

Recall that we formulated all those relational operators to be able to
retrieve data from the database. Let's return to our example from
the previous
section (*Operations in the Relational Data Model*)
where someone wanted to know the names of all
suppliers that sell the part `Screw`.
This question can be answered
using relational algebra by the following operation:

π_{SUPPLIER.SNAME}(σ_{PART.PNAME='Screw'}(SUPPLIER ∏ SELLS ∏ PART))

We call such an operation a query. If we evaluate the above query
against the our example tables
(*The Suppliers and Parts Database*)
we will obtain the following result:

SNAME ------- Smith Adams

The relational calculus is based on the
*first order logic*. There are
two variants of the relational calculus:

The

*Domain Relational Calculus*(DRC), where variables stand for components (attributes) of the tuples.The

*Tuple Relational Calculus*(TRC), where variables stand for tuples.

We want to discuss the tuple relational calculus only because it is
the one underlying the most relational languages. For a detailed
discussion on DRC (and also
TRC) see
[*Date, 1994*]
or
[*Ullman, 1988*].

The queries used in TRC are of the following
form:
x(A) ∣ F(x)
where `x` is a tuple variable
`A` is a set of attributes and `F` is a
formula. The resulting relation consists of all tuples
`t(A)` that satisfy `F(t)`.

If we want to answer the question from example
*A Query Using Relational Algebra*
using TRC we formulate the following query:

{x(SNAME) ∣ x ∈ SUPPLIER ∧ \nonumber ∃ y ∈ SELLS ∃ z ∈ PART (y(SNO)=x(SNO) ∧ \nonumber z(PNO)=y(PNO) ∧ \nonumber z(PNAME)='Screw')} \nonumber

Evaluating the query against the tables from
*The Suppliers and Parts Database*
again leads to the same result
as in
*A Query Using Relational Algebra*.

The relational algebra and the relational calculus have the same
*expressive power*; i.e. all queries that
can be formulated using relational algebra can also be formulated
using the relational calculus and vice versa.
This was first proved by E. F. Codd in
1972. This proof is based on an algorithm ("Codd's reduction
algorithm") by which an arbitrary expression of the relational
calculus can be reduced to a semantically equivalent expression of
relational algebra. For a more detailed discussion on that refer to
[*Date, 1994*]
and
[*Ullman, 1988*].

It is sometimes said that languages based on the relational calculus are "higher level" or "more declarative" than languages based on relational algebra because the algebra (partially) specifies the order of operations while the calculus leaves it to a compiler or interpreter to determine the most efficient order of evaluation.