'Traditional/file-based' systems (p. 196-198) (misnomer!):
Data reside in files.
Applications write/read data to and from the files.
Files can be ASCII (text) or binary.
Data in files maintain no or few relationships!!
Problems with 'file approach':
Different functions in an organization use separate versions of
the same data files:
Duplication of data.
Duplication of work when data must be changed/updated.
Error prone.
Data integrity difficult to maintain.
Database systems:
Data still in files (see figure 5.1)!!!
Applications
send requests for writing/reading data to an intervening database
mangement
system that controls the files (figure 5.4).
Files are mostly binary.
Applications do never(!!) directly access the file data. They
interact with the database software.
The database system maintains relationships between data.
Therefore:
Data can be kept in a single place.
Data can be in a single repository but spatially distributed.
Applications do not have to implement complicated file/data
access.
Less error prone.
The database can maintain data integrity.
Databases are self-describing (data dictionary).
Relational model of data (p. 203-204) (E.F. Codd, 1970s):
Relations express things and their characteristics or
attributes (states!).
Tables represent collections or sets of
things; e.g.,
A collection of students.
A collection of professors.
A collection of courses.
A collection of parking permits.
Etc.
A table row or record represents a single
thing or relation from the set; e.g., a single student, course,
a parking permit, etc.
Table columns or fields represent the
characteristics or attributes of the things:
A student:
Has an id.
Has a name.
Has an address.
A course:
Has a course number.
Has a name.
Has a time.
Has a place.
A professor:
Has an id.
Is rostered in a department.
Relationships express the association between
relations; e.g.,
Students and professors can have parking
permits.
A student can be enrolled in one or more courses.
A course in taught by a professor.
A professor works in a department.
Relationships are expressed by one relation referencing another;
The parking permit references a student
or
a professor.
Let's see if we can draw this up in a relational manner:
person:
person_id
name
address
type (student or professor).
student:
student_id (must exist as a person_id)
college_name
gpa
etc.
professor:
professor_id (must exist as a person_id)
dept_name
parking_permit:
id
holder_id (must exist as a person_id)
license
course:
name
time
place
professor_id
enrollment:
student_id
course_id
college:
college_name
dean_name
address
etc.
department:
dept_name
college_name
Notice how, through the department table, the
professor
is linked to a college. If the dept. changes college,
only one change must be made for all professors of that dept.
to move to the new college. The same applies when trying to find
the professor's dean or when a college changes its dean.
Database integrity constraints:
Primary key constraints:
Each person id should be unique.
Each course id should be unique.
Each college and department id should
be
unique.
Each parking permit id should be unique.
Etc.
Nullability: Certain fields in a table row may
remain
unused; e.g., although a
course might have
been scheduled, it might not yet have
a professor.
Referential (Foreign key) constraints:
Each student_id entered into the student
table must exist as a person_id in the person table.
Each professor_id entered into the professor
table must exist as a person_id in the person table.
Each holder_id entered into the parking_permit
table must exist in the person table.
Each student_id entered into the enrollment
table must exist in the student table.
Etc.
In relational databases, you specify these constraints
once
after which the database management system guards against their
violation.
Who or what interacts with a DBMS?
Database administrators.
Database application programmers.
Students of databases.
Application software (almost 100% of the interactions).
How to interact with a DBMS?
Direct manipulation: pick, click and drag; e.g., Ms. Access.
Structured Query Language (SQL):
Find out which professors have a parking permit through
the
application of a join:
select
name
from
professor,
person, parking_permit
where
person_id
= professor_id
and
professor_id
= holder_id
Problem:
Why use a language if you
can pick, click and drag?
Some recent developments:
Object-oriented vs. relational.
Memory-based/resident relational databases.
Distributed databases; e.g., what about data-carrying
RFIDs?
Use the proper tools in the proper places:
Relational databases are not the best choice for spatial
or
temporal information.
Relational databases are not the best choice for
full-text
searching.