
Data Abstraction, Knowledge Representation, and Ontology Concepts


In this section we discuss in general terms some of the modeling concepts that we described quite specifically in our presentation of the ER and EER models in Chapter 7 and earlier in this chapter. This terminology is used not only in conceptual data modeling but also in artificial intelligence literature when discussing knowledge representation (KR). This section discusses the similarities and differences between conceptual modeling and knowledge representation, and introduces some of the alternative terminology and a few additional concepts. The goal of KR techniques is to develop concepts for accurately modeling some domain of knowledge by creating an ontology that describes the concepts of the domain and how these concepts are interrelated. Such an ontology is used to store and manipulate knowledge for drawing inferences, making decisions, or answering questions. The goals of KR are similar to those of semantic data models, but there are some important similarities and differences between the two disciplines:
- Both disciplines use an abstraction process to identify common properties and important aspects of objects in the miniworld (also known as domain of discourse in KR) while suppressing insignificant differences and unimportant details.
- Both disciplines provide concepts, relationships, constraints, operations, and languages for defining data and representing knowledge.
- KR is generally broader in scope than semantic data models. Different forms of knowledge, such as rules (used in inference, deduction, and search), incomplete and default knowledge, and temporal and spatial knowledge, are represented in KR schemes.
- KR schemes include reasoning mechanisms that deduce additional facts from the facts stored in a database. Hence, whereas most current database systems are limited to answering direct queries, knowledge-based systems using KR schemes can answer queries that involve inferences over the stored data.
- Whereas most data models concentrate on the representation of database schemas, or meta-knowledge, KR schemes often mix up the schemas with the instances themselves in order to provide flexibility in representing exceptions. This often results in inefficiencies when these KR schemes are implemented, especially when compared with databases and when a large amount of data (facts) needs to be stored.
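
The difference between answering a direct query and answering a query by inference can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any particular KR system: the facts, the two rules, and the names `facts` and `infer` are all assumptions made for the example.

```python
# Stored facts as (subject, relation, object) triples. All names are illustrative.
facts = {
    ("SECRETARY", "IS-A", "EMPLOYEE"),
    ("EMPLOYEE", "IS-A", "PERSON"),
    ("alice", "INSTANCE-OF", "SECRETARY"),
}

def infer(facts):
    """Apply two rules repeatedly until no new fact appears (forward chaining)."""
    derived = set(facts)
    while True:
        new = set()
        for (a, r1, b) in derived:
            for (c, r2, d) in derived:
                if b != c:
                    continue
                # Rule 1: IS-A is transitive.
                if r1 == "IS-A" and r2 == "IS-A":
                    new.add((a, "IS-A", d))
                # Rule 2: an instance of a subclass is also an instance of the superclass.
                if r1 == "INSTANCE-OF" and r2 == "IS-A":
                    new.add((a, "INSTANCE-OF", d))
        if new <= derived:
            return derived
        derived |= new

closure = infer(facts)
query = ("alice", "INSTANCE-OF", "PERSON")
direct = query in facts       # direct lookup over the stored facts only
inferred = query in closure   # lookup after inference over the stored facts
```

Here the direct lookup fails because the fact was never stored, while the lookup over the inferred closure succeeds.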

ABSTRACTION CONCEPTS

We now discuss four abstraction concepts that are used in semantic data models, such as the EER model, as well as in KR schemes:

(1) classification and instantiation

(2) identification

(3) specialization and generalization

(4) aggregation and association.

The paired concepts of classification and instantiation are inverses of one another, as are generalization and specialization. The concepts of aggregation and association are also related.

1. Classification and Instantiation : The process of classification involves systematically assigning similar objects/entities to object classes/entity types. We can now describe (in DB) or reason about (in KR) the classes rather than the individual objects. Collections of objects that share the same types of attributes, relationships, and constraints are classified into classes in order to simplify the process of discovering their properties. Instantiation is the inverse of classification and refers to the generation and specific examination of distinct objects of a class. An object instance is related to its object class by the IS-AN-INSTANCE-OF or IS-A-MEMBER-OF relationship. Although EER diagrams do not display instances, the UML diagrams allow a form of instantiation by permitting the display of individual objects. We did not describe this feature in our introduction to UML class diagrams. In general, the objects of a class should have a similar type structure. However, some objects may display properties that differ in some respects from the other objects of the class; these exception objects also need to be modeled, and KR schemes allow more varied exceptions than do database models.

In addition, certain properties apply to the class as a whole and not to the individual objects; KR schemes allow such class properties. UML diagrams also allow specification of class properties. In the EER model, entities are classified into entity types according to their basic attributes and relationships. Entities are further classified into subclasses and categories based on additional similarities and differences (exceptions) among them. Relationship instances are classified into relationship types. Hence, entity types, subclasses, categories, and relationship types are the different concepts that are used for classification in the EER model. The EER model does not provide explicitly for class properties, but it may be extended to do so. In UML, objects are classified into classes, and it is possible to display both class properties and individual objects. Knowledge representation models allow multiple classification schemes in which one class is an instance of another class (called a meta-class). Notice that this cannot be represented directly in the EER model, because we have only two levels: classes and instances. The only relationship among classes in the EER model is a superclass/subclass relationship, whereas in some KR schemes an additional class/instance relationship can be represented directly in a class hierarchy. An instance may itself be another class, allowing multiple-level classification schemes.
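
Python happens to support this kind of multiple-level classification, so it can serve as a rough illustration of instantiation, class properties, and meta-classes. The sketch below is an analogy only; the names `EntityType`, `Employee`, and `instance_count` are hypothetical and do not come from the EER model or from UML.

```python
class EntityType(type):
    """Hypothetical meta-class: each of its instances is itself a class."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.instance_count = 0  # class property: applies to the class as a whole
        return cls

class Employee(metaclass=EntityType):
    def __init__(self, name):
        self.name = name                # property of the individual object
        type(self).instance_count += 1  # update the class-level property

# Instantiation: generate distinct objects of the class.
e1 = Employee("Smith")
e2 = Employee("Wong")

is_instance_of_class = isinstance(e1, Employee)           # object IS-AN-INSTANCE-OF class
class_is_instance = isinstance(Employee, EntityType)      # class IS-AN-INSTANCE-OF meta-class
```

The object `e1` is an instance of `Employee`, and `Employee` is in turn an instance of `EntityType`, which gives three levels rather than the two available in the EER model.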

2. Identification : Identification is the abstraction process whereby classes and objects are made uniquely identifiable by means of some identifier. For example, a class name uniquely identifies a whole class within a schema. An additional mechanism is necessary for telling distinct object instances apart by means of object identifiers. Moreover, it is necessary to identify multiple manifestations in the database of the same real-world object. For example, we may have a tuple <‘Matthew Clarke’, ‘610618’, ‘376-9821’> in a PERSON relation and another tuple <‘301-54-0836’, ‘CS’, 3.8> in a STUDENT relation that happen to represent the same real-world entity. There is no way to identify the fact that these two database objects (tuples) represent the same real-world entity unless we make a provision at design time for appropriate cross-referencing to supply this identification. Hence, identification is needed at two levels:
- To distinguish among database objects and classes
- To identify database objects and to relate them to their real-world counterparts

In the EER model, identification of schema constructs is based on a system of unique names for the constructs in a schema. For example, every class in an EER schema, whether it is an entity type, a subclass, a category, or a relationship type, must have a distinct name. The names of attributes of a particular class must also be distinct. Rules for unambiguously identifying attribute name references in a specialization or generalization lattice or hierarchy are needed as well. At the object level, the values of key attributes are used to distinguish among entities of a particular entity type. For weak entity types, entities are identified by a combination of their own partial key values and the entities they are related to in the owner entity type(s). Relationship instances are identified by some combination of the entities that they relate to, depending on the cardinality ratio specified.
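
The PERSON/STUDENT example can be made concrete with a short sketch. It assumes one possible design-time provision, namely adding a cross-referencing `ssn` attribute to PERSON; that attribute, the field names (including reading ‘610618’ as a birth date), and the helper `student_for` are all assumptions, since the text gives only the tuple values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    name: str
    birth_date: str  # assumed meaning of '610618'
    phone: str
    ssn: str         # cross-reference added at design time (assumption)

@dataclass(frozen=True)
class Student:
    ssn: str         # key attribute: distinguishes STUDENT entities
    major: str
    gpa: float

person = Person("Matthew Clarke", "610618", "376-9821", ssn="301-54-0836")

# Index STUDENT tuples by their key so each one is uniquely identifiable.
students_by_ssn = {s.ssn: s for s in [Student("301-54-0836", "CS", 3.8)]}

def student_for(p, index):
    """Return the STUDENT tuple for the same real-world entity, or None."""
    return index.get(p.ssn)

match = student_for(person, students_by_ssn)
```

Without the shared `ssn` value, nothing in the two original tuples would connect them, which is exactly the problem the text describes.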

3. Specialization and Generalization : Specialization is the process of classifying a class of objects into more specialized subclasses. Generalization is the inverse process of generalizing several classes into a higher-level abstract class that includes the objects in all these classes. Specialization is conceptual refinement, whereas generalization is conceptual synthesis. Subclasses are used in the EER model to represent specialization and generalization. We call the relationship between a subclass and its superclass an IS-A-SUBCLASS-OF relationship, or simply an IS-A relationship. This is the same as the IS-A relationship discussed earlier in Section 8.5.3.
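
As a loose analogy, class inheritance in an object-oriented language behaves like the IS-A relationship. The sketch below uses hypothetical class names, attributes, and data values chosen only for illustration.

```python
class Employee:                    # superclass (result of generalization)
    def __init__(self, ssn, name):
        self.ssn = ssn
        self.name = name

class Secretary(Employee):         # specialized subclass
    def __init__(self, ssn, name, typing_speed):
        super().__init__(ssn, name)
        self.typing_speed = typing_speed

class Engineer(Employee):          # another specialized subclass
    def __init__(self, ssn, name, eng_type):
        super().__init__(ssn, name)
        self.eng_type = eng_type

s = Secretary("123-45-6789", "Joyce", typing_speed=70)  # made-up values

subclass_is_a = issubclass(Secretary, Employee)  # Secretary IS-A-SUBCLASS-OF Employee
member_of_super = isinstance(s, Employee)        # every secretary is also an employee
member_of_sibling = isinstance(s, Engineer)      # but not a member of a sibling subclass
```

Each subclass inherits the superclass attributes and adds its own, which mirrors conceptual refinement; reading the hierarchy upward mirrors conceptual synthesis.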

4. Aggregation and Association : Aggregation is an abstraction concept for building composite objects from their component objects. There are three cases where this concept can be related to the EER model. The first case is the situation in which we aggregate attribute values of an object to form the whole object. The second case is when we represent an aggregation relationship as an ordinary relationship. The third case, which the EER model does not provide for explicitly, involves the possibility of combining objects that are related by a particular relationship instance into a higher-level aggregate object. This is sometimes useful when the higher-level aggregate object is itself to be related to another object. We call the relationship between the primitive objects and their aggregate object IS-A-PART-OF; the inverse is called IS-A-COMPONENT-OF. UML provides for all three types of aggregation. The abstraction of association is used to associate objects from several independent classes. Hence, it is somewhat similar to the second use of aggregation. It is represented in the EER model by relationship types, and in UML by associations. This abstract relationship is called IS-ASSOCIATED-WITH. In order to understand the different uses of aggregation better, consider the ER schema shown in Figure 8.11(a), which stores information about interviews by job applicants to various companies.

The class COMPANY is an aggregation of the attributes (or component objects) Cname (company name) and Caddress (company address), whereas JOB_APPLICANT is an aggregate of Ssn, Name, Address, and Phone.
The relationship attributes Contact_name and Contact_phone represent the name and phone number of the person in the company who is responsible for the interview.
Suppose that some interviews result in job offers, whereas others do not. We would like to treat INTERVIEW as a class to associate it with JOB_OFFER. The schema shown in Figure 8.11(b) is incorrect because it requires each interview relationship instance to have a job offer. The schema shown in Figure 8.11(c) is not allowed because the ER model does not allow relationships among relationships.
One way to represent this situation is to create a higher-level aggregate class composed of COMPANY, JOB_APPLICANT, and INTERVIEW and to relate this class to JOB_OFFER, as shown in Figure 8.11(d). Although the EER model as described in this book does not have this facility, some semantic data models do allow it and call the resulting object a composite or molecular object. Other models treat entity types and relationship types uniformly and hence permit relationships among relationships, as illustrated in Figure 8.11(c).
To represent this situation correctly in the ER model as described here, we need to create a new weak entity type INTERVIEW, as shown in Figure 8.11(e), and relate it to JOB_OFFER. Hence, we can always represent these situations correctly in the ER model by creating additional entity types, although it may be conceptually more desirable to allow direct representation of aggregation, as in Figure 8.11(d), or to allow relationships among relationships, as in Figure 8.11(c).
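
The weak-entity solution can be sketched with plain data classes. This is a hypothetical rendering, since the figure is not reproduced here: the use of the interview date as the partial key, the `position` attribute of the offer, the helper `offer_for`, and all data values are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass(frozen=True)
class Company:
    cname: str
    caddress: str

@dataclass(frozen=True)
class JobApplicant:
    ssn: str
    name: str
    address: str
    phone: str

# A weak entity is identified by its owners' keys plus its own partial key.
# Using the interview date as the partial key is an assumption.
InterviewKey = Tuple[str, str, str]  # (company name, applicant ssn, date)

@dataclass
class Interview:
    company: Company
    applicant: JobApplicant
    date: str
    contact_name: str
    contact_phone: str

    @property
    def key(self) -> InterviewKey:
        return (self.company.cname, self.applicant.ssn, self.date)

@dataclass
class JobOffer:
    interview_key: InterviewKey  # relates the offer to exactly one interview
    position: str                # hypothetical attribute

offers: Dict[InterviewKey, JobOffer] = {}

def offer_for(interview: Interview) -> Optional[JobOffer]:
    """Participation is optional: an interview may have no job offer."""
    return offers.get(interview.key)

# Made-up sample data.
acme = Company("Acme", "1 Main St")
ann = JobApplicant("111-22-3333", "Ann", "2 Oak Ave", "555-0100")
first = Interview(acme, ann, "2024-01-10", "Bob", "555-0199")
second = Interview(acme, ann, "2024-02-15", "Bob", "555-0199")
offers[second.key] = JobOffer(second.key, "Analyst")
```

Because INTERVIEW is now an object with its own identity, it can be related to JOB_OFFER directly, and an interview without an offer is simply absent from `offers`.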

The main structural distinction between aggregation and association is that when an association instance is deleted, the participating objects may continue to exist. However, if we support the notion of an aggregate object—for example, a CAR that is made up of objects ENGINE, CHASSIS, and TIRES—then deleting the aggregate CAR object amounts to deleting all its component objects.
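
The two deletion behaviors can be contrasted in a toy in-memory store. The `Store` class, its method names, and the OWNER object are hypothetical; the cascade in `delete_object` is one simple way to model what the text describes, not a prescribed implementation.

```python
class Store:
    """Tiny in-memory object store; all names are illustrative."""
    def __init__(self):
        self.objects = set()       # ids of existing objects
        self.parts = {}            # aggregate id -> set of component ids
        self.associations = set()  # (id, id) pairs

    def add(self, oid):
        self.objects.add(oid)

    def aggregate(self, whole, components):
        self.add(whole)
        for c in components:
            self.add(c)
        self.parts[whole] = set(components)

    def associate(self, a, b):
        self.associations.add((a, b))

    def delete_association(self, a, b):
        # Only the link disappears; both participating objects survive.
        self.associations.discard((a, b))

    def delete_object(self, oid):
        # Deleting an aggregate cascades to all of its component objects.
        for c in self.parts.pop(oid, set()):
            self.delete_object(c)
        self.objects.discard(oid)
        # Links that mention the deleted object are dropped as well.
        self.associations = {p for p in self.associations if oid not in p}

store = Store()
store.aggregate("CAR", ["ENGINE", "CHASSIS", "TIRES"])
store.add("OWNER")
store.associate("OWNER", "CAR")

store.delete_association("OWNER", "CAR")
after_unlink = set(store.objects)  # all five objects still exist

store.delete_object("CAR")
after_delete = set(store.objects)  # only the independent OWNER object remains
```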
