Basic Structure of SQL Queries



Rating - 3/5
513 views

The basic structure of an SQL query consists of three clauses: select, from, and
where. The query takes as its input the relations listed in the from clause, operates
on them as specified in the where and select clauses, and then produces a relation
as the result. We introduce the SQL syntax through examples, and describe the
general structure of SQL queries later.


1. Qeries on a Single Relation

Let us consider a simple query using our university example, “Find the names
of all instructors.” Instructor names are found in the instructor relation, so we 

putthat relation in the from clause. The instructor’s name appears in the name
attribute, so we put that in the select clause.

These Topics Are Also In Your Syllabus
1 Initial Conceptual Design of the COMPANY Database link
2 Relationship Types, Relationship Sets, Roles, and Structural Constraints link
You May Find Something Very Interesting Here. link
3 Attribute Specification in Select Clause link
4 Weak Entity Types link
5 Types Of Systems link

select name
from instructor;

The result is a relation consisting of a single attribute with the heading name. If
the instructor relation is as shown in Figure 2.1, then the relation that results from
the preceding query is shown in Figure 3.2.

Now consider another query, “Find the department names of all instructors,”
which can be written as:

select dept_name
from instructor;

Since more than one instructor can belong to a department, a department name
could appear more than once in the instructor relation. The result of the above
query is a relation containing the department names, shown in Figure 3.3.

These Topics Are Also In Your Syllabus
1 Overview of Object Database Concepts link
2 Overview of the SQL Query Language link
You May Find Something Very Interesting Here. link
3 SQL Data Definition link
4 Basic Structure of SQL Queries link
5 Types Of Systems link

In the formal, mathematical definition of the relational model, a relation is a
set. Thus, duplicate tuples would never appear in relations. In practice, duplicate
elimination is time-consuming. Therefore, SQL allows duplicates in relations as
well as in the results of SQL expressions. Thus, the preceding SQL query lists
each department name once for every tuple in which it appears in the instructor
relation.

In those cases where we want to force the elimination of duplicates, we insert
the keyword distinct after select. We can rewrite the preceding query as:

select distinct dept_name
from instructor;

if we want duplicates removed. The result of the above query would contain each
department name at most once.

These Topics Are Also In Your Syllabus
1 Sensitive Data and Types of Disclosures link
2 Access Control, User Accounts, and Database Audits link
You May Find Something Very Interesting Here. link
3 Database Security and the DBA link
4 Where Clause Predicates link
5 Types Of Systems link

SQL allows us to use the keyword all to specify explicitly that duplicates are
not removed:

select all dept_name
from instructor;

Since duplicate retention is the default, we shall not use all in our examples. To
ensure the elimination of duplicates in the results of our example queries, we
shall use distinct whenever it is necessary.

The select clause may also contain arithmetic expressions involving the op?erators +, −, ∗, and / operating on constants or attributes of tuples. For example,
the query:

select ID, name, dept_name, salary * 1.1
from instructor;

These Topics Are Also In Your Syllabus
1 Where Clause Predicates link
2 Control Measures link
You May Find Something Very Interesting Here. link
3 Set Operations: Introduction link
4 Null Values link
5 Types Of Systems link

returns a relation that is the same as the instructorrelation, except that the attribute
salary is multiplied by 1.1. This shows what would result if we gave a 10% raise
to each instructor; note, however, that it does not result in any change to the
instructor relation.

SQL also provides special data types, such as various forms of the date type,
and allows several arithmetic functions to operate on these types. We discuss this
further in Section 4.5.1.

The where clause allows us to select only those rows in the result relation of
the from clause that satisfy a specified predicate. Consider the query “Find the
names of all instructors in the Computer Science department who have salary
greater than $70,000.” This query can be written in SQL as:

select name
from instructor

These Topics Are Also In Your Syllabus
1 Keys link
2 Introduction to Database Security Issues link
You May Find Something Very Interesting Here. link
3 Discretionary Access Control Based on Granting and Revoking Privileges link
4 Introduction to Database Security Issues link
5 Types Of Systems link

where dept_name = ’Comp. Sci.’ and salary > 70000;

If the instructor relation is as shown in Figure 2.1, then the relation that results
from the preceding query is shown in Figure 3.4.

SQL allows the use of the logical connectives and, or, and not in the where
clause. The operands of the logical connectives can be expressions involving
the comparison operators <, <=, >, >=, =, and <>. SQL allows us to use the
comparison operators to compare strings and arithmetic expressions, as well as
special types, such as date types.

We shall explore other features of where clause predicates later in this topic

2. Queries on Multiple Relations

These Topics Are Also In Your Syllabus
1 Data Abstraction, Knowledge Representation, and Ontology Concepts link
2 Using High-Level Conceptual Data Models for Database Design link
You May Find Something Very Interesting Here. link
3 Using High-Level Conceptual Data Models for Database Design link
4 Using High-Level Conceptual Data Models for Database Design link
5 Types Of Systems link

So far our example queries were on a single relation. Queries often need to access
information from multiple relations. We now study how to write such queries.

An an example, suppose we want to answer the query “Retrieve the names
of all instructors, along with their department names and department building
name.”

Looking at the schema of the relation instructor, we realize that we can get
the department name from the attribute dept_name, but the department building
name is present in the attribute building of the relation department. To answer the
query, each tuple in the instructor relation must be matched with the tuple in
the department relation whose dept_name value matches the dept_name value of the instructor tuple.

In SQL, to answer the above query, we list the relations that need to be accessed
in the from clause, and specify the matching condition in the where clause. The
above query can be written in SQL as

select name, instructor.dept_name, building
from instructor, department
where instructor.dept_name= department.dept_name;

These Topics Are Also In Your Syllabus
1 Introduction to Database Security Issues link
2 Introduction to Database Security Issues link
You May Find Something Very Interesting Here. link
3 The Natural Join link
4 Additional Basic Operations: The Rename Operation link
5 Types Of Systems link

If the instructor and department relations are as shown in Figures 2.1 and 2.5
respectively, then the result of this query is shown in Figure 3.5.

Note that the attribute dept_name occurs in both the relations instructor and
department, and the relation name is used as a prefix (in instructor.dept_name, and department.dept_name) to make clear to which attribute we are referring. In contrast, the attributes name and building appear in only one of the relations, and therefore do not need to be prefixed by the relation name.

This naming convention requires that the relations that are present in the from
clause have distinct names. This requirement causes problems in some cases,
such as when information from two different tuples in the same relation needs to
be combined. In Section 3.4.1, we see how to avoid these problems by using the
rename operation.

We now consider the general case of SQL queries involving multiple relations.
As we have seen earlier, an SQL query can contain three types of clauses, the
select clause, the from clause, and the where clause. The role of each clause is as
follows:

These Topics Are Also In Your Syllabus
1 Attribute Specification in Select Clause link
2 Weak Entity Types link
You May Find Something Very Interesting Here. link
3 Refining the ER Design for the COMPANY Database link
4 ER Diagrams, Naming Conventions, and Design Issues link
5 Types Of Systems link

• The select clause is used to list the attributes desired in the result of a query.

• The from clause is a list of the relations to be accessed in the evaluation of
the query.

• The where clause is a predicate involving attributes of the relation in the
from clause.

A typical SQL query has the form

select A1, A2,..., An
from r1, r2,...,rm
where P;

These Topics Are Also In Your Syllabus
1 Relational Query Languages link
2 Relational Operations link
You May Find Something Very Interesting Here. link
3 Overview of Object Database Concepts link
4 Overview of the SQL Query Language link
5 SQL Data Definition link

Each Ai represents an attribute, and each ri a relation. P is a predicate. If the where
clause is omitted, the predicate P is true.

Although the clauses must be written in the order select, from, where, the
easiest way to understand the operations specified by the query is to consider the
clauses in operational order: first from, then where, and then select.

The from clause by itself defines a Cartesian product of the relations listed
in the clause. It is defined formally in terms of set theory, but is perhaps best
understood as an iterative process that generates tuples for the result relation of
the from clause.

for each tuple t1 in relation r1

for each tuple t2 in relation r2

These Topics Are Also In Your Syllabus
1 Relational Query Languages link
2 Relational Operations link
You May Find Something Very Interesting Here. link
3 Overview of Object Database Concepts link
4 Overview of the SQL Query Language link
5 SQL Data Definition link

...
for each tuple tm in relation rm

Concatenate t1, t2,..., tm into a single tuple t
Add t into the result relation

The result relation has all attributes from all the relations in the from clause.
Since the same attribute name may appear in both ri and rj , as we saw earlier,
we prefix the the name of the relation from which the attribute originally came,
before the attribute name.

For example, the relation schema for the Cartesian product of relations instructor and teaches is:

(instructor.ID, instructor.name, instructor.dept_name, instructor.salary
teaches.ID, teaches.course_id, teaches.sec_id, teaches.semester, teaches.year)

With this schema, we can distinguish instructor.ID from teaches.ID. For those attributes that appear in only one of the two schemas, we shall usually drop the
relation-name prefix. This simplification does not lead to any ambiguity. We can
then write the relation schema as:

(instructor.ID, name, dept_name, salary
teaches.ID, course_id, sec_id, semester, year)

To illustrate, consider the instructor relation in Figure 2.1 and the teaches
relation in Figure 2.7. Their Cartesian product is shown in Figure 3.6, which
includes only a portion of the tuples that make up the Cartesian product result.2
The Cartesian product by itself combines tuples from instructor and teaches
that are unrelated to each other. Each tuple in instructor is combined with every
tuple in teaches, even those that refer to a different instructor. The result can be
an extremely large relation, and it rarely makes sense to create such a Cartesian
product.

Instead, the predicate in the where clause is used to restrict the combinations
created by the Cartesian product to those that are meaningful for the desired
answer. We would expect a query involving instructor and teaches to combine a
particular tuple t in instructor with only those tuples in teaches that refer to the
same instructor to which t refers. That is, we wish only to match teaches tuples with
instructor tuples that have the same ID value. The following SQL query ensures
this condition, and outputs the instructor name and course identifiers from such
matching tuples.

select name, course_id
from instructor, teaches
where instructor.ID= teaches.ID;

Note that the above query outputs only instructors who have taught some course.
Instructors who have not taught any course are not output; if we wish to output
such tuples, we could use an operation called the outer join.

If the instructor relation is as shown in Figure 2.1 and the teaches relation is
as shown in Figure 2.7, then the relation that results from the preceding query
is shown in Figure 3.7. Observe that instructors Gold, Califieri, and Singh, who
have not taught any course, do not appear in the above result.

If we only wished to find instructor names and course identifiers for instructors in the Computer Science department, we could add an extra predicate to the

where clause, as shown below.

select name, course_id
from instructor, teaches
where instructor.ID= teaches.ID and instructor.dept_name = ’Comp. Sci.’;

Note that since the dept_name attribute occurs only in the instructor relation, we
could have used just dept_name, instead of instructor.dept_name in the above query.

In general, the meaning of an SQL query can be understood as follows:

1. Generate a Cartesian product of the relations listed in the from clause

2. Apply the predicates specified in the where clause on the result of Step 1.

3. For each tuple in the result of Step 2, output the attributes (or results of
expressions) specified in the select clause.


The above sequence of steps helps make clear what the result of an SQL query
should be, not how it should be executed. A real implementation of SQL would
not execute the query in this fashion; it would instead optimize evaluation by
generating (as far as possible) only elements of the Cartesian product that satisfy
the where clause predicates.


When writing queries, you should be careful to include appropriate where
clause conditions. If you omit the where clause condition in the preceding SQL
query, it would output the Cartesian product, which could be a huge relation.
For the example instructor relation in Figure 2.1 and the example teaches relation
in Figure 2.7, their Cartesian product has 12 ∗ 13 = 156 tuples — more than we
can show in the text! To make matters worse, suppose we have a more realistic
number of instructors than we show in our sample relations in the figures, say 200
instructors. Let’s assume each instructor teaches 3 courses, so we have 600 tuples
in the teaches relation. Then the above iterative process generates 200 ∗ 600 =
120,000 tuples in the result.


Rating - 3/5
529 views

Advertisements
Rating - 3/5