chapter 11 Object and Object-Relational Databases



Rating - 3/5
539 views

.I n this chapter, we discuss the features of objectoriented data models and show how some of these features have been incorporated in relational database systems. Object-oriented databases are now referred to as object databases (ODB) (previously called OODB), and the database systems are referred to as object data management systems (ODMS) (formerly referred to as ODBMS or OODBMS). Traditional data models and systems, such as relational, network, and hierarchical, have been quite successful in developing the database technologies required for many traditional business database applications. However, they have certain shortcomings when more complex database applications must be designed and implemented—for example, databases for engineering design and manufacturing (CAD/CAM and CIM1), scientific experiments, telecommunications, geographic information systems, and multimedia.2 These newer applications have requirements and characteristics that differ from those of traditional business applications, such as more complex structures for stored objects; the need for new data types for storing images, videos, or large textual items; longer-duration transactions; and the need to define nonstandard application-specific operations. Object databases were proposed to meet some of the needs of these more complex applications. A key feature of object databases is the power they give the designer to specify both the structure of complex objects and the operations that can be applied to these objects.

Another reason for the creation of object-oriented databases is the vast increase in the use of object-oriented programming languages for developing software applications. Databases are fundamental components in many software systems, and traditional databases are sometimes difficult to use with software applications that are developed in an object-oriented programming language such as C++ or Java. Object databases are designed so they can be directly—or seamlessly—integrated with software that is developed using object-oriented programming languages.

Relational DBMS (RDBMS) vendors have also recognized the need for incorporating features that were proposed for object databases, and newer versions of relational systems have incorporated many of these features. This has led to database systems that are characterized as object-relational or ORDBMSs. The latest version of the SQL standard (2008) for RDBMSs includes many of these features, which were originally known as SQL/Object and they have now been merged into the main SQL specification, known as SQL/Foundation.

Although many experimental prototypes and commercial object-oriented database systems have been created, they have not found widespread use because of the popularity of relational and object-relational systems. The experimental prototypes included the Orion system developed at MCC,3 OpenOODB at Texas Instruments, the Iris system at Hewlett-Packard laboratories, the Ode system at AT&T Bell Labs,4 and the ENCORE/ObServer project at Brown University. Commercially available systems included GemStone Object Server of GemStone Systems, ONTOS DB of Ontos, Objectivity/DB of Objectivity Inc., Versant Object Database and FastObjects by Versant Corporation (and Poet), ObjectStore of Object Design, and Ardent Database of Ardent.5 These represent only a partial list of the experimental prototypes and commercial object-oriented database systems that were created.

As commercial object DBMSs became available, the need for a standard model and language was recognized. Because the formal procedure for approval of standards normally takes a number of years, a consortium of object DBMS vendors and users, called ODMG,6 proposed a standard whose current specification is known as the ODMG 3.0 standard.

These Topics Are Also In Your Syllabus
1 Structure of Relational Databases link
2 Database Schema link
You May Find Something Very Interesting Here. link
3 Keys link
4 Introduction to Database Security Issues link
5 Types Of Systems link

Object-oriented databases have adopted many of the concepts that were developed originally for object-oriented programming languages.7 In Section 11.1, we describe the key concepts utilized in many object database systems and that were later incorporated into object-relational systems and the SQL standard. These include object identity, object structure and type constructors, encapsulation of operations and the definition of methods as part of class declarations, mechanisms for storing objects in a database by making them persistent, and type and class hierarchies and inheritance. Then, in Section 11.2 we see how these concepts have been incorporated into the latest SQL standards, leading to object-relational databases. Object features were originally introduced in SQL:1999, and then updated in the latest version (SQL:2008) of the standard. In Section 11.3, we turn our attention to “pure” object database standards by presenting features of the object database standard ODMG 3.0 and the object definition language ODL. Section 11.4 presents an overview of the database design process for object databases. Section 11.5 discusses the object query language (OQL), which is part of the ODMG 3.0 standard. In Section 11.6, we discuss programming language bindings, which specify how to extend objectoriented programming languages to include the features of the object database standard. Section 11.7 summarizes the chapter. Sections 11.5 and 11.6 may be left out if a less thorough introduction to object databases is desired.

11.1 Overview of Object Database Concepts

11.1.1 Introduction to Object-Oriented Concepts and Features

The term object-oriented—abbreviated OO or O-O—has its origins in OO programming languages, or OOPLs. Today OO concepts are applied in the areas of databases, software engineering, knowledge bases, artificial intelligence, and computer systems in general. OOPLs have their roots in the SIMULA language, which was proposed in the late 1960s. The programming language Smalltalk, developed at Xerox PARC8 in the 1970s, was one of the first languages to explicitly incorporate additional OO concepts, such as message passing and inheritance. It is known as a pure OO programming language, meaning that it was explicitly designed to be object-oriented. This contrasts with hybrid OO programming languages, which incorporate OO concepts into an already existing language. An example of the latter is C++, which incorporates OO concepts into the popular C programming language.

An object typically has two components: state (value) and behavior (operations). It can have a complex data structure as well as specific operations defined by the programmer.9 Objects in an OOPL exist only during program execution; therefore, they are called transient objects. An OO database can extend the existence of objects so that they are stored permanently in a database, and hence the objects become persistent objects that exist beyond program termination and can be retrieved later and shared by other programs. In other words, OO databases store persistent objects permanently in secondary storage, and allow the sharing of these objects among multiple programs and applications. This requires the incorporation of other well-known features of database management systems, such as indexing mechanisms to efficiently locate the objects, concurrency control to allow object sharing among concurrent programs, and recovery from failures. An OO database system will typically interface with one or more OO programming languages to provide persistent and shared object capabilities.

The internal structure of an object in OOPLs includes the specification of instance variables, which hold the values that define the internal state of the object. An instance variable is similar to the concept of an attribute in the relational model, except that instance variables may be encapsulated within the object and thus are not necessarily visible to external users. Instance variables may also be of arbitrarily complex data types. Object-oriented systems allow definition of the operations or functions (behavior) that can be applied to objects of a particular type. In fact, some OO models insist that all operations a user can apply to an object must be predefined. This forces a complete encapsulation of objects. This rigid approach has been relaxed in most OO data models for two reasons. First, database users often need to know the attribute names so they can specify selection conditions on the attributes to retrieve specific objects. Second, complete encapsulation implies that any simple retrieval requires a predefined operation, thus making ad hoc queries difficult to specify on the fly.

To encourage encapsulation, an operation is defined in two parts. The first part, called the signature or interface of the operation, specifies the operation name and arguments (or parameters). The second part, called the method or body, specifies the implementation of the operation, usually written in some general-purpose programming language. Operations can be invoked by passing a message to an object, which includes the operation name and the parameters. The object then executes the method for that operation. This encapsulation permits modification of the internal structure of an object, as well as the implementation of its operations, without the need to disturb the external programs that invoke these operations. Hence, encapsulation provides a form of data and operation independence

These Topics Are Also In Your Syllabus
1 Overview of Object Database Concepts link
2 Overview of the SQL Query Language link
You May Find Something Very Interesting Here. link
3 SQL Data Definition link
4 Basic Structure of SQL Queries link
5 Types Of Systems link

Another key concept in OO systems is that of type and class hierarchies and inheritance. This permits specification of new types or classes that inherit much of their structure and/or operations from previously defined types or classes. This makes it easier to develop the data types of a system incrementally, and to reuse existing type definitions when creating new types of objects.

One problem in early OO database systems involved representing relationships among objects. The insistence on complete encapsulation in early OO data models led to the argument that relationships should not be explicitly represented, but should instead be described by defining appropriate methods that locate related objects. However, this approach does not work very well for complex databases with many relationships because it is useful to identify these relationships and make them visible to users. The ODMG object database standard has recognized this need and it explicitly represents binary relationships via a pair of inverse references, as we will describe in Section 11.3.

Another OO concept is operator overloading, which refers to an operation’s ability to be applied to different types of objects; in such a situation, an operation name may refer to several distinct implementations, depending on the type of object it is applied to. This feature is also called operator polymorphism. For example, an operation to calculate the area of a geometric object may differ in its method (implementation), depending on whether the object is of type triangle, circle, or rectangle. This may require the use of late binding of the operation name to the appropriate method at runtime, when the type of object to which the operation is applied becomes known. In the next several sections, we discuss in some detail the main characteristics of object databases. Section 11.1.2 discusses object identity; Section 11.1.3 shows how the types for complex-structured objects are specified via type constructors; Section 11.1.4 discusses encapsulation and persistence; and Section 11.1.5 presents inheritance concepts. Section 11.1.6 discusses some additional OO concepts, and Section 11.1.7 gives a summary of all the OO concepts that we introduced. In Section 11.2, we show how some of these concepts have been incorporated into the SQL:2008 standard for relational databases. Then in Section 11.3, we show how these concepts are realized in the ODMG 3.0 object database standard.

11.1.2 Object Identity, and Objects versus Literals

One goal of an ODMS (Object Data Management System) is to maintain a direct correspondence between real-world and database objects so that objects do not lose their integrity and identity and can easily be identified and operated upon. Hence, an ODMS provides a unique identity to each independent object stored in the database. This unique identity is typically implemented via a unique, system-generated object identifier (OID). The value of an OID is not visible to the external user, but is used internally by the system to identify each object uniquely and to create and manage inter-object references. The OID can be assigned to program variables of the appropriate type when needed.

These Topics Are Also In Your Syllabus
1 Relational Operations link
2 Overview of Object Database Concepts link
You May Find Something Very Interesting Here. link
3 Overview of the SQL Query Language link
4 SQL Data Definition link
5 Basic Structure of SQL Queries link

The main property required of an OID is that it be immutable; that is, the OID value of a particular object should not change. This preserves the identity of the real-world object being represented. Hence, an ODMS must have some mechanism for generating OIDs and preserving the immutability property. It is also desirable that each OID be used only once; that is, even if an object is removed from the database, its OID should not be assigned to another object. These two properties imply that the OID should not depend on any attribute values of the object, since the value of an attribute may be changed or corrected. We can compare this with the relational model, where each relation must have a primary key attribute whose value identifies each tuple uniquely. In the relational model, if the value of the primary key is changed, the tuple will have a new identity, even though it may still represent the same real-world object. Alternatively, a real-world object may have different names for key attributes in different relations, making it difficult to ascertain that the keys represent the same real-world object (for example, the object identifier may be represented as Emp_id in one relation and as Ssn in another).

It is inappropriate to base the OID on the physical address of the object in storage, since the physical address can change after a physical reorganization of the database. However, some early ODMSs have used the physical address as the OID to increase the efficiency of object retrieval. If the physical address of the object changes, an indirect pointer can be placed at the former address, which gives the new physical location of the object. It is more common to use long integers as OIDs and then to use some form of hash table to map the OID value to the current physical address of the object in storage.

Some early OO data models required that everything—from a simple value to a complex object—was represented as an object; hence, every basic value, such as an integer, string, or Boolean value, has an OID. This allows two identical basic values to have different OIDs, which can be useful in some cases. For example, the integer value 50 can sometimes be used to mean a weight in kilograms and at other times to mean the age of a person. Then, two basic objects with distinct OIDs could be created, but both objects would represent the integer value 50. Although useful as a theoretical model, this is not very practical, since it leads to the generation of too many OIDs. Hence, most OO database systems allow for the representation of both objects and literals (or values). Every object must have an immutable OID, whereas a literal value has no OID and its value just stands for itself. Thus, a literal value is typically stored within an object and cannot be referenced from other objects. In many systems, complex structured literal values can also be created without having a corresponding OID if needed.

11.1.3 Complex Type Structures for Objects and Literals

Another feature of an ODMS (and ODBs in general) is that objects and literals may have a type structure of arbitrary complexity in order to contain all of the necessary information that describes the object or literal. In contrast, in traditional database systems, information about a complex object is often scattered over many relations or records, leading to loss of direct correspondence between a real-world object and its database representation. In ODBs, a complex type may be constructed from other types by nesting of type constructors. The three most basic constructors are atom, struct (or tuple), and collection.

These Topics Are Also In Your Syllabus
1 Additional Basic Operations: String Operations link
2 A Sample Database Application link
You May Find Something Very Interesting Here. link
3 Entity Types, Entity Sets, Attributes, and Keys-1 link
4 Entity Types, Entity Sets, Keys, and Value Sets-2 link
5 Types Of Systems link

1. One type constructor has been called the atom constructor, although this term is not used in the latest object standard. This includes the basic built-in data types of the object model, which are similar to the basic types in many programming languages: integers, strings, floating point numbers, enumerated types, Booleans, and so on. They are called single-valued or atomic types, since each value of the type is considered an atomic (indivisible) single value.

2. A second type constructor is referred to as the struct (or tuple) constructor. This can create standard structured types, such as the tuples (record types) in the basic relational model. A structured type is made up of several components, and is also sometimes referred to as a compound or composite type. More accurately, the struct constructor is not considered to be a type, but rather a type generator, because many different structured types can be created. For example, two different structured types that can be created are: struct Name, and struct CollegeDegree. To create complex nested type structures in the object model, the collection type constructors are needed, which we discuss next. Notice that the type constructors atom and struct are the only ones available in the original (basic) relational model.

3. Collection (or multivalued) type constructors include the set(T), list(T), bag(T), array(T), and dictionary(K,T) type constructors. These allow part of an object or literal value to include a collection of other objects or values when needed. These constructors are also considered to be type generators because many different types can be created. For example, set(string), set(integer), and set(Employee) are three different types that can be created from the set type constructor. All the elements in a particular collection value must be of the same type. For example, all values in a collection of type set(string) must be string values.

The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character strings, Booleans, and any other basic data types that the system supports directly. The tuple constructor can create structured values and objects of the form , where each aj is an attribute name10 and each i j is a value or an OID.

The other commonly used constructors are collectively referred to as collection types, but have individual differences among them. The set constructor will create objects or literals that are a set of distinct elements {i 1, i 2, ..., i n}, all of the same type. The bag constructor (sometimes called a multiset) is similar to a set except that the elements in a bag need not be distinct. The list constructor will create an ordered list [i 1, i 2, ..., i n] of OIDs or values of the same type. A list is similar to a bag except that the elements in a list are ordered, and hence we can refer to the first, second, or jth element. The array constructor creates a single-dimensional array of elements of the same type. The main difference between array and list is that a list can have an arbitrary number of elements whereas an array typically has a maximum size. Finally, the dictionary constructor creates a collection of two tuples (K, V), where the value of a key K can be used to retrieve the corresponding value V.

These Topics Are Also In Your Syllabus
1 Subclasses, Superclasses, and Inheritance link
2 what is database management system link
You May Find Something Very Interesting Here. link
3 Database-System Applications link
4 Specialization and Generalization link
5 Constraints and Characteristics of Specialization and Generalization Hierarchies link

The main characteristic of a collection type is that its objects or values will be a collection of objects or values of the same type that may be unordered (such as a set or a bag) or ordered (such as a list or an array). The tuple type constructor is often called a structured type, since it corresponds to the struct construct in the C and C++ programming languages.

An object definition language (ODL)11 that incorporates the preceding type constructors can be used to define the object types for a particular database application. In Section 11.3 we will describe the standard ODL of ODMG, but first we introduce the concepts gradually in this section using a simpler notation. The type constructors can be used to define the data structures for an OO database schema. Figure 11.1 shows how we may declare EMPLOYEE and DEPARTMENT types. In Figure 11.1, the attributes that refer to other objects—such as Dept of EMPLOYEE or Projects of DEPARTMENT—are basically OIDs that serve as references to other objects to represent relationships among the objects. For example, the attribute Dept of EMPLOYEE is of type DEPARTMENT, and hence is used to refer to a specific DEPARTMENT object (the DEPARTMENT object where the employee works). The value of such an attribute would be an OID for a specific DEPARTMENT object. A binary relationship can be represented in one direction, or it can have an inverse reference. The latter representation makes it easy to traverse the relationship in both directions. For example, in Figure 11.1 the attribute Employees of DEPARTMENT has as its value a set of references (that is, a set of OIDs) to objects of type EMPLOYEE; these are the employees who work for the DEPARTMENT. The inverse is the reference attribute Dept of EMPLOYEE. We will see in Section 11.3 how the ODMG standard allows inverses to be explicitly declared as relationship attributes to ensure that inverse references are consistent.

 

 

These Topics Are Also In Your Syllabus
1 View of Data link
2 Introduction to Transaction Processing link
You May Find Something Very Interesting Here. link
3 Database Security link
4 Database Languages link
5 Types Of Systems link

11.1.4 Encapsulation of Operations and Persistence of Objects

Encapsulation of Operations? The concept of encapsulation is one of the main characteristics of OO languages and systems. It is also related to the concepts of abstract data types and information hiding in programming languages. In traditional database models and systems this concept was not applied, since it is customary to make the structure of database objects visible to users and external programs. In these traditional models, a number of generic database operations are applicable to objects of all types. For example, in the relational model, the operations for selecting, inserting, deleting, and modifying tuples are generic and may be applied to any relation in the database. The relation and its attributes are visible to users and to external programs that access the relation by using these operations. The concepts of encapsulation is applied to database objects in ODBs by defining the behavior of a type of object based on the operations that can be externally applied to objects of that type. Some operations may be used to create (insert) or destroy (delete) objects; other operations may update the object state; and others may be used to retrieve parts of the object state or to apply some calculations. Still other operations may perform a combination of retrieval, calculation, and update. In general, the implementation of an operation can be specified in a general-purpose programming language that provides flexibility and power in defining the operations.

The external users of the object are only made aware of the interface of the operations, which defines the name and arguments (parameters) of each operation. The implementation is hidden from the external users; it includes the definition of any hidden internal data structures of the object and the implementation of the operations that access these structures. The interface part of an operation is sometimes called the signature, and the operation implementation is sometimes called the method.

For database applications, the requirement that all objects be completely encapsulated is too stringent. One way to relax this requirement is to divide the structure of an object into visible and hidden attributes (instance variables). Visible attributes can be seen by and are directly accessible to the database users and programmers via the query language. The hidden attributes of an object are completely encapsulated and can be accessed only through predefined operations. Most ODMSs employ high-level query languages for accessing visible attributes. In Section 11.5 we will describe the OQL query language that is proposed as a standard query language for ODBs.

The term class is often used to refer to a type definition, along with the definitions of the operations for that type.12 Figure 11.2 shows how the type definitions in Figure 11.1 can be extended with operations to define classes. A number of operations are 

These Topics Are Also In Your Syllabus
1 Relational Query Languages link
2 Relational Operations link
You May Find Something Very Interesting Here. link
3 Overview of Object Database Concepts link
4 Overview of the SQL Query Language link
5 SQL Data Definition link

declared for each class, and the signature (interface) of each operation is included in the class definition. A method (implementation) for each operation must be defined elsewhere using a programming language. Typical operations include the object constructor operation (often called new), which is used to create a new object, and the destructor operation, which is used to destroy (delete) an object. A number of object modifier operations can also be declared to modify the states (values) of various attributes of an object. Additional operations can retrieve information about the object.

An operation is typically applied to an object by using the dot notation. For example, if d is a reference to a DEPARTMENT object, we can invoke an operation such as no_of_emps by writing d.no_of_emps. Similarly, by writing d.destroy_dept, the object referenced by d is destroyed (deleted). The only exception is the constructor operation, which returns a reference to a new DEPARTMENT object. Hence, it is customary in some OO models to have a default name for the constructor operation that is the name of the class itself, although this was not used in Figure 11.2.13 The dot notation is also used to refer to attributes of an object—for example, by writing d.Dnumber or d.Mgr_Start_date.

Specifying Object Persistence via Naming and Reachability?

An ODBS is often closely coupled with an object-oriented programming language (OOPL). The OOPL is used to specify the method (operation) implementations as well as other application code. Not all objects are meant to be stored permanently in the database. Transient objects exist in the executing program and disappear once the program terminates. Persistent objects are stored in the database and persist after program termination. The typical mechanisms for making an object persistent are naming and reachability.

The naming mechanism involves giving an object a unique persistent name within a particular database. This persistent object name can be given via a specific statement or operation in the program, as shown in Figure 11.3. The named persistent objects are used as entry points to the database through which users and applications can start their database access. Obviously, it is not practical to give names to all objects in a large database that includes thousands of objects, so most objects are made persistent by using the second mechanism, called reachability. The reachability mechanism works by making the object reachable from some other persistent object. An object B is said to be reachable from an object A if a sequence of references in the database lead from object A to object B.

These Topics Are Also In Your Syllabus
1 chapter 11 Object and Object-Relational Databases link
2 XML: Extensible Markup Language link
You May Find Something Very Interesting Here. link
3 Schema Diagrams link
4 Relational Query Languages link
5 Types Of Systems link

If we first create a named persistent object N, whose state is a set (or possibly a bag) of objects of some class C, we can make objects of C persistent by adding them to the set, thus making them reachable from N. Hence, N is a named object that defines a persistent collection of objects of class C. In the object model standard, N is called the extent of C (see Section 11.3).

For example, we can define a class DEPARTMENT_SET (see Figure 11.3) whose objects are of type set(DEPARTMENT).14 We can create an object of type DEPARTMENT_SET, and give it a persistent name ALL_DEPARTMENTS, as shown in Figure 11.3. Any DEPARTMENT object that is added to the set of ALL_DEPARTMENTS by using the add_dept operation becomes persistent by virtue of its being reachable from ALL_DEPARTMENTS. As we will see in Section 11.3, the ODMG ODL standard gives the schema designer the option of naming an extent as part of class definition.

Notice the difference between traditional database models and ODBs in this respect In traditional database models, such as the relational model, all objects are assumed to be persistent. Hence, when a table such as EMPLOYEE is created in a relational database, it represents both the type declaration for EMPLOYEE and a persistent set of all EMPLOYEE records (tuples). In the OO approach, a class declaration of EMPLOYEE specifies only the type and operations for a class of objects. The user must separately define a persistent object of type set(EMPLOYEE) or bag(EMPLOYEE) whose value is the collection of references (OIDs) to all persistent EMPLOYEE objects, if this is desired, as shown in Figure 11.3.15 This allows transient and persistent objects to follow the same type and class declarations of the ODL and the OOPL. In general, it is possible to define several persistent collections for the same class definition, if desired.

11.1.5 Type Hierarchies and Inheritance

Simplified Model for Inheritance Another main characteristic of ODBs is that they allow type hierarchies and inheritance. We use a simple OO model in this section—a model in which attributes and operations are treated uniformly—since both attributes and operations can be inherited. In Section 11.3, we will discuss the inheritance model of the ODMG standard, which differs from the model discussed here because it distinguishes between two types of inheritance. Inheritance allows the definition of new types based on other predefined types, leading to a type (or class) hierarchy.

These Topics Are Also In Your Syllabus
1 Relational Operations link
2 Overview of Object Database Concepts link
You May Find Something Very Interesting Here. link
3 Overview of the SQL Query Language link
4 SQL Data Definition link
5 Basic Structure of SQL Queries link

A type is defined by assigning it a type name, and then defining a number of attributes (instance variables) and operations (methods) for the type.16 In the simplified model we use in this section, the attributes and operations are together called functions, since attributes resemble functions with zero arguments. A function name can be used to refer to the value of an attribute or to refer to the resulting value of an operation (method). We use the term function to refer to both attributes and operations, since they are treated similarly in a basic introduction to inheritance.

A type in its simplest form has a type name and a list of visible (public) functions. When specifying a type in this section, we use the following format, which does not specify arguments of functions, to simplify the discussion:

     TYPE_NAME: function, function, ..., function

For example, a type that describes characteristics of a PERSON may be defined as follows:

         PERSON: Name, Address, Birth_date, Age, Ssn

These Topics Are Also In Your Syllabus
1 Database Languages link
2 Database Design link
You May Find Something Very Interesting Here. link
3 Relational Databases link
4 Relational Databases link
5 Relational Databases link

In the PERSON type, the Name, Address, Ssn, and Birth_date functions can be implemented as stored attributes, whereas the Age function can be implemented as an operation that calculates the Age from the value of the Birth_date attribute and the current date.

The concept of subtype is useful when the designer or user must create a new type that is similar but not identical to an already defined type. The subtype then inherits all the functions of the predefined type, which is referred to as the supertype. For example, suppose that we want to define two new types EMPLOYEE and STUDENT as follows:

   EMPLOYEE: Name, Address, Birth_date, Age,Ssn,Salary,Hire_date,Seniority

   STUDENT: Name, Address, Birth_date, Age, Ssn, Major, Gpa

Since both STUDENT and EMPLOYEE include all the functions defined for PERSON plus some additional functions of their own, we can declare them to be subtypes of PERSON. Each will inherit the previously defined functions of PERSON—namely, Name, Address, Birth_date, Age, and Ssn. For STUDENT, it is only necessary to define the new (local) functions Major and Gpa, which are not inherited. Presumably, Major can be defined as a stored attribute, whereas Gpa may be implemented as an operation that calculates the student’s grade point average by accessing the Grade values that are internally stored (hidden) within each STUDENT object as hidden attributes. For EMPLOYEE, the Salary and Hire_date functions may be stored attributes, whereas Seniority may be an operation that calculates Seniority from the value of Hire_date.

Therefore, we can declare EMPLOYEE and STUDENT as follows:

  EMPLOYEE subtype-of PERSON: Salary, Hire_date, Seniority

  STUDENT subtype-of PERSON: Major, Gpa

In general, a subtype includes all of the functions that are defined for its supertype plus some additional functions that are specific only to the subtype. Hence, it is possible to generate a type hierarchy to show the supertype/subtype relationships among all the types declared in the system.

As another example, consider a type that describes objects in plane geometry, which may be defined as follows:

GEOMETRY_OBJECT: Shape, Area, Reference_point

For the GEOMETRY_OBJECT type, Shape is implemented as an attribute (its domain can be an enumerated type with values ‘triangle’, ‘rectangle’, ‘circle’, and so on), and Area is a method that is applied to calculate the area. Reference_point specifies the coordinates of a point that determines the object location. Now suppose that we want to define a number of subtypes for the GEOMETRY_OBJECT type, as follows:

RECTANGLE subtype-of GEOMETRY_OBJECT: Width, Height

TRIANGLE S subtype-of GEOMETRY_OBJECT: Side1, Side2, Angle

CIRCLE subtype-of GEOMETRY_OBJECT: Radius

Notice that the Area operation may be implemented by a different method for each subtype, since the procedure for area calculation is different for rectangles, triangles, and circles. Similarly, the attribute Reference_point may have a different meaning for each subtype; it might be the center point for RECTANGLE and CIRCLE objects, and the vertex point between the two given sides for a TRIANGLE object.

Notice that type definitions describe objects but do not generate objects on their own. When an object is created, typically it belongs to one or more of these types that have been declared. For example, a circle object is of type CIRCLE and GEOMETRY_OBJECT (by inheritance). Each object also becomes a member of one or more persistent collections of objects (or extents), which are used to group together collections of objects that are persistently stored in the database.

Constraints on Extents Corresponding to a Type Hierarchy

In most ODBs, an extent is defined to store the collection of persistent objects for each type or subtype. In this case, the constraint is that every object in an extent that corresponds to a subtype must also be a member of the extent that corresponds to its supertype. Some OO database systems have a predefined system type (called the ROOT class or the OBJECT class) whose extent contains all the objects in the system.18 Classification then proceeds by assigning objects into additional subtypes that are meaningful to the application, creating a type hierarchy (or class hierarchy) for the system. All extents for system- and user-defined classes are subsets of the extent corresponding to the class OBJECT, directly or indirectly. In the ODMG model (see Section 11.3), the user may or may not specify an extent for each class (type), depending on the application.

An extent is a named persistent object whose value is a persistent collection that holds a collection of objects of the same type that are stored permanently in the database. The objects can be accessed and shared by multiple programs. It is also possible to create a transient collection, which exists temporarily during the execution of a program but is not kept when the program terminates. For example, a transient collection may be created in a program to hold the result of a query that selects some objects from a persistent collection and copies those objects into the transient collection. The program can then manipulate the objects in the transient collection, and once the program terminates, the transient collection ceases to exist. In general, numerous collections—transient or persistent—may contain objects of the same type.

The inheritance model discussed in this section is very simple. As we will see in Section 11.3, the ODMG model distinguishes between type inheritance—called interface inheritance and denoted by a colon (:)—and the extent inheritance constraint—denoted by the keyword EXTEND.

11.1.6 Other Object-Oriented Concepts?

Polymorphism of Operations (Operator Overloading).?Another characteristic of OO systems in general is that they provide for polymorphism of operations, which is also known as operator overloading. This concept allows the same operator name or symbol to be bound to two or more different implementations of the operator, depending on the type of objects to which the operator is applied. A simple example from programming languages can illustrate this concept. In some languages, the operator symbol “+” can mean different things when applied to operands (objects) of different types. If the operands of “+” are of type integer, the operation invoked is integer addition. If the operands of “+” are of type floating point, the operation invoked is floating point addition. If the operands of “+” are of type set, the operation invoked is set union. The compiler can determine which operation to execute based on the types of operands supplied.

In OO databases, a similar situation may occur. We can use the GEOMETRY_OBJECT example presented in Section 11.1.5 to illustrate operation polymorphism19 in ODB.

In this example, the function Area is declared for all objects of type GEOMETRY_OBJECT. However, the implementation of the method for Area may differ for each subtype of GEOMETRY_OBJECT. One possibility is to have a general implementation for calculating the area of a generalized GEOMETRY_OBJECT (forexample, by writing a general algorithm to calculate the area of a polygon) and then to rewrite more efficient algorithms to calculate the areas of specific types of geometric objects, such as a circle, a rectangle, a triangle, and so on. In this case, the Area function is overloaded by different implementations.

The ODMS must now select the appropriate method for the Area function based on the type of geometric object to which it is applied. In strongly typed systems, this can be done at compile time, since the object types must be known. This is termed early (or static) binding. However, in systems with weak typing or no typing (such as Smalltalk and LISP), the type of the object to which a function is applied may not be known until runtime. In this case, the function must check the type of object at runtime and then invoke the appropriate method. This is often referred to as late (or dynamic) binding.

Multiple Inheritance and Selective Inheritance. Multiple inheritance occurs when a certain subtype T is a subtype of two (or more) types and hence inherits the functions (attributes and methods) of both supertypes. For example, we may create a subtype ENGINEERING_MANAGER that is a subtype of both MANAGER and ENGINEER. This leads to the creation of a type lattice rather than a type hierarchy. One problem that can occur with multiple inheritance is that the supertypes from which the subtype inherits may have distinct functions of the same name, creating an ambiguity. For example, both MANAGER and ENGINEER may have a function called Salary. If the Salary function is implemented by different methods in the MANAGER and ENGINEER supertypes, an ambiguity exists as to which of the two is inherited by the subtype ENGINEERING_MANAGER. It is possible, however, that both ENGINEER and MANAGER inherit Salary from the same supertype (such as EMPLOYEE) higher up in the lattice. The general rule is that if a function is inherited from some common supertype, then it is inherited only once. In such a case, there is no ambiguity; the problem only arises if the functions are distinct in the two supertypes.

There are several techniques for dealing with ambiguity in multiple inheritance. One solution is to have the system check for ambiguity when the subtype is created, and to let the user explicitly choose which function is to be inherited at this time. A second solution is to use some system default. A third solution is to disallow multiple inheritance altogether if name ambiguity occurs, instead forcing the user to change the name of one of the functions in one of the supertypes. Indeed, some OO systems do not permit multiple inheritance at all. In the object database standard (see Section 11.3), multiple inheritance is allowed for operation inheritance of interfaces, but is not allowed for EXTENDS inheritance of classes.

Selective inheritance occurs when a subtype inherits only some of the functions of a supertype. Other functions are not inherited. In this case, an EXCEPT clause may be used to list the functions in a supertype that are not to be inherited by the subtype. The mechanism of selective inheritance is not typically provided in ODBs, but it is used more frequently in artificial intelligence applications.2

11.1.7 Summary of Object Database Concepts

To conclude this section, we give a summary of the main concepts used in ODBs and object-relational systems:

? Object identity. Objects have unique identities that are independent of their attribute values and are generated by the ODMS.

? Type constructors. Complex object structures can be constructed by applying in a nested manner a set of basic constructors, such as tuple, set, list, array, and bag.

? Encapsulation of operations. Both the object structure and the operations that can be applied to individual objects are included in the type definitions.

? Programming language compatibility. Both persistent and transient objects are handled seamlessly. Objects are made persistent by being reachable from a persistent collection (extent) or by explicit naming.

? Type hierarchies and inheritance. Object types can be specified by using a type hierarchy, which allows the inheritance of both attributes and methods (operations) of previously defined types. Multiple inheritance is allowed in some models.

? Extents. All persistent objects of a particular type can be stored in an extent. Extents corresponding to a type hierarchy have set/subset constraints enforced on their collections of persistent objects.

? Polymorphism and operator overloading. Operations and method names can be overloaded to apply to different object types with different implementations. In the following sections we show how these concepts are realized in the SQL standard (Section 11.2) and the ODMG standard (Section 11.3).


Rating - 3/5
475 views

Advertisements
Rating - 3/5