Relational Models II SS2 Data Processing Lesson Note

Download Lesson Note
Lesson Notes

Topic: Relational Models II

RELATIONAL MODEL: INTEGRITY CONSTRAINTS

Integrity Constraints Over Relations

An integrity constraint (IC) is a condition specified on a database schema and restricts the data that can be stored in an instance of the database. If a database instance satisfies all the integrity constraints specified on the database schema, it is a legal instance. A DBMS permits only legal instances to be stored in the database.

Many kinds of integrity constraints can be specified in the relational model:

Database integrity refers to the validity and consistency of stored data. Integrity is usually expressed in terms of constraints, which are consistency rules that the database is not permitted to violate.

Constraints may apply to each attribute or they may apply to relationships between tables.

Integrity constraints ensure that changes (update deletion, insertion) made to the database by authorized users do not result in a loss of data consistency. Thus, integrity constraints guard against accidental damage to the database.

EXAMPLE- A brood group must be ‘A’ or ‘B’ or ‘AB’ or ‘O’ only (can not be any other values else).

TYPES OF INTEGRITY CONSTRAINTS

Various types of integrity constraints are-

  1. Domain Integrity
  2. Entity Integrity Constraint
  3. Referential Integrity Constraint
  4. Key Constraints
  1. Domain Integrity- Domain integrity means the definition of a valid set of values for an attribute. You define a data type, length or size, whether is nulis l value allowed, is the value unique or not for an attribute, the default value, the range (values in between) and/or specific values for the attribute. 
  1. Entity Integrity Constraint- This rule states that in any database relation value of an attribute of a primary key can’t be null.

EXAMPLE- Consider a relation “STUDENT” Where “Stu_id” is a primary key and it must not contain any null value whereas other attributes may contain null value e.g.  “Branch” in the following relation contains one null value.

Stu_id Name Branch
11255234 Aman CSE
11255369 Kapil ECE
11255324 Ajay
11255237 Raman CSE
11255678 Aastha ECE

3.  Referential Integrity Constraint states that if a foreign key exists in a relation then either the foreign key value must match a primary key value of some tuple in its home relation or the foreign key value must be null.

  1. Key Constraints- A Key Constraint is a statement/condition that a certain minimal subset of the fields of a relation is a unique identifier for a tuple.

There are 4 types of  key constraints-

  • Candidate key.
  • Super key
  • Primary key
  • Foreign key

QUERYING RELATIONAL DATABASE

Several RDBMS vendors provide a GUI to aid their users in developing queries. These can be particularly helpful to novice users as it enables them to learn the overarching concepts involved in query development without getting bogged down in syntax details. 

For this reason, we will start the course with Microsoft Access, which provides perhaps the most user-friendly interface.

A. Download an Access database and review its tables:

Throughout this lesson, we’ll use a database of baseball statistics to help demonstrate the basics of SELECT queries.

  1. Open the database in MS Access.

One part of the Access interface that you’ll use frequently is the “Navigation Pane,” which is situated on the left side of the application window. The top of the Navigation Pane is just beneath the “Ribbon” (the strip of controls that runs horizontally along the top of the window).

The Navigation Pane provides access to the objects stored in the database, such as tables, queries, forms and reports. When you first open the baseball_stats.accdb database, the Navigation Pane should appear with the word Tables at the top, indicating that it is listing the tables stored in the database (PLAYERS, STATS and TEAMS).

  1. Double-click on a table’s name in the Navigation Pane to open it. Open all three tables and review the content. Note that the STATS table contains an ID for each player rather than his name. The names associated with the IDs are stored in the PLAYERS table.

B. Write a simple SELECT query

With our first query, we’ll retrieve data from selected fields in the STATS table.

  1. Click on the Create tab near the top of the application window.
  2. Next, click on the Query Design button (found on the left side of the Create Ribbon in the group of commands labelled Queries).
  3. In the Show Table dialog, double-click on the STATS table to add it to the query and click Close, (when you do this in Access 2010 the ribbon switches to the Design ribbon).
  4. Double-click on PLAYER_ID in the list of fields in the STATS table to add that field to the design grid below.
  5. Repeat this step to add the YEAR and RBI fields.

6. At any time, you can view the SQL that’s created by your GUI settings by accessing the View drop-down list on the far-left side of the Design Ribbon, (it is also available when you have the Home tab selected, as shown below).

As you go through the next steps look at the SQL that corresponds to queries you are building.

C. Restrict the returned records to a desired subset

  1. From the same View drop-down list, select Design View to return to the query design GUI.
  2. In the design grid, set the Criteria value for the RBI field to >99.
  3. Test the query by clicking on the red exclamation point, (it should return 103 records).

D. Sort the returned records

  1. Return to Design View.
  2. In the design grid click in the Sort cell under the RBI column and select Descending from the drop-down list.
  3.  This will sort the records from highest RBI total to lowest.
  4. Test the query.

E. Add additional criteria to the selection

  1. Return to Design View and set the Criteria value for the YEAR field to >1989. This will limit the results to seasons of over 100 RBI since 1990.
  2. Test the query, (it should return 53 records).
  3. Return to Design View and modify the Criteria value for the YEAR field to >1989 And<2000, which will further limit the results to just the 1990s.
  4. Test the query, (it should return 34 records).
  5. Return to Design View and change the Criteria value for the YEAR field back to >1989, beneath that cell (in the:or cell) add <1960.

As you should be able to guess, I’m asking you to write a query that identifies 100 RBI seasons since 1989 OR before 1960. However, the query as written at this point doesn’t quite yield that result, look at the WHERE line in the SQL view. Instead, it would return 100 RBI seasons since 1989 and all seasons before 1960 (not just the 100 RBI ones). To produce the desired result you need to repeat the >99 criterion in the RBI field’s:or cell, and check the SQL view to see the change.

F. Test the query, (it should return 74 records).

You’ve probably recognized by now that the output from these queries is not particularly human-friendly. In the next part of the lesson, we’ll see how to use a join between the two tables to add the names of the players to the query output.

Lesson Notes for Other Classes