-
In this video, we'll learn about the Relational Model.
-
The Relational Model is more than
-
35 years old, and it's
-
really the foundation of database management systems.
-
It's spawned a many billion dollar industry.
-
The relational model underlies all
-
commercial database systems at this point in time.
-
It's actually an extremely simple model and that's one of its benefits.
-
Furthermore, it can be queried.
-
By that I mean we can ask
-
questions of databases in the
-
model using High Level Languages.
-
High Level Languages are simple, yet
-
extremely expressive for asking questions over the database.
-
And finally, very importantly there
-
are extremely efficient implementations of
-
the relational model and of
-
the query languages on that model.
-
So let's move ahead and
-
understand the basic constructs in the relational model.
-
So, the primary construct is in fact, the relation.
-
A database consists of a
-
set of relations or sometimes
-
referred to as "tables", each of which has a name.
-
So, we're gonna use two relations in our example.
-
Our example is gonna be a fictitious database
-
about students applying to colleges.
-
For now we're just gonna look at the students and colleges themselves.
-
So we're gonna have two tables,
-
and let's call those tables
-
the Student table and the College table.
-
Now, as an aside, there's
-
a healthy debate in the
-
database world about whether tables
-
relations ought to be named using the singular or the plural.
-
I personally don't have a stake in that debate.
-
I'm going to use the singular.
-
Next, we have the concept of attributes.
-
So every relation and relational
-
database has a predefined set
-
of columns or attributes each of which has a name.
-
So, for our student table,
-
let's say that each student
-
is gonna have an ID, a
-
name, a GPA and a photo.
-
And for our college
-
table, let's say that every
-
college is going to have
-
a name, a state, and an enrollment.
-
We'll just abbreviate that ENR.
-
So those are the labeled columns.
-
Now the actual data itself
-
is stored in what are
-
called the tuples (or the rows) in the tables.
-
So let's put a couple
-
of the data tables, data tuples in our tables.
-
So let's start with the
-
students and lets say that
-
our first student has ID 123.
-
Name is Amy -
-
GPA 3.9, and she's happy with that.
-
So she has a smiley photo.
-
And our second student is 234,
-
Bob - his GPA
-
is 3.4.
-
He's not quite as happy.
-
And typically of course a
-
table will have thousands, maybe
-
millions, even sometimes billions of
-
rows, each row containing a value for each attribute.
-
In our college table, let's suppose
-
we have .... well of course
-
we're going to start with Stanford in
-
the state of California and Stanford's enrollment is 15,000.
-
We'll include our cross-bay rival Berkeley
-
again, in the state of California.
-
Berkeley's enrollment is a whopping 36,000.
-
And last of all,
-
we are going to not be West Coast biased.
-
We'll include MIT in the
-
state of Massachusetts with an enrollment of 10,000.
-
Now of course there's
-
gonna be again many more
-
tuples in the college table, and
-
many more tuples in the student table.
-
Okay, next, let me
-
mention that in a relational
-
database, typically each attribute or
-
column has a type
-
sometimes referred to as a domain.
-
For example, the ID might
-
be an integer, the name
-
might be a string, GPA might
-
be a float, photo might
-
be a jpeg file.
-
We do also in most relational
-
databases have a concept of enumerated domain.
-
So for example, the state might
-
be an enumerated domain for
-
the 50 abbreviations for states.
-
Now, it's typical for relational
-
databases to have just
-
atomic types in their
-
attributes as we have
-
here, but many database
-
systems do also support structured
-
types inside attributes.
-
Okay, a little bit more terminology.
-
The schema of a
-
database is the structure of the relation.
-
So the schema includes the name
-
of the relation and the attributes
-
of the relation and the types of those attributes.
-
Where the instance is the
-
actual contents of the table at a given point in time.
-
So, typically you set up
-
a schema in advance, then the
-
instances of the data will change over time.
-
Now, I mentioned that most columns have types.
-
But, there's also a special value
-
that's in any type of
-
any column and that's a
-
special value known as null, and nulls
-
are actually quite important in relational databases.
-
Null values are used to
-
denote that a particular
-
value is maybe unknown or undefined.
-
And, so let's suppose, let's add
-
another tuple to our database.
-
Let's say 345, another student
-
named Craig, and for,
-
whatever reason Craig doesn't have a GPA.
-
Maybe Craig is home schooled,
-
maybe Craig doesn't want to reveal his GPA.
-
So then the database would contain
-
a null value for Craig
-
and we'll just put a neutral face there.
-
Or, for example, maybe Bob doesn't
-
want to have his photo in
-
the database, so then Bob
-
would have a null value
-
for his photo, again nulls can go anywhere.
-
Now null values are useful
-
but one has to be very
-
careful in a database system
-
when you run queries
-
over relations that have null values.
-
In a later video we'll go
-
into this in more detail, but I
-
just wanted to give a just
-
sort of example of what can happen.
-
So, let's suppose we're
-
asking a query over our
-
student table of all students
-
whose GPA is greater than 3.5.
-
So when we run
-
that query on our database
-
obviously we'll get Amy out,
-
obviously we won't get Bob out, but should we get Craig?
-
The answer is No.
-
We don't know for a fact
-
that Craig's GPA is greater
-
than 3.5, so we'll only get
-
one student out from that query.
-
Now let's suppose we
-
had another query, where we
-
were gonna ask for the GPA
-
less than or equal to 3.5.
-
So, similarly where we
-
would not have Amy in result
-
and we would certainly have Bob in
-
the result and similarly would
-
not have Craig in the
-
result because we don't know that
-
his GPA is less than or equal to 3.5.
-
So far so good, but
-
it gets a little weird is
-
when we add an or here
-
in our query, we say I
-
want everyone who's GPA is greater
-
than 3.5 or who's GPA is less than or equal to 3.5.
-
And even though it
-
looks like every tuple should
-
satisfy this condition, that it's
-
always true, that's not the case when we have null values.
-
So, that's why one has to
-
be careful when one uses
-
null values in relational databases.
-
Let me erase this now and
-
let's move on to our next concept
-
which is the concept of Key.
-
Key is again another important concept in relational databases.
-
And, a key is an
-
attribute in of a relation
-
where every value for that attribute is unique.
-
So if we look at the
-
student relation, we can feel
-
pretty confident that the ID is going to be a key.
-
In other words, every tuple is
-
going to have a unique for ID.
-
Thinking about the college relation, it's a little less clear.
-
We might be tempted to say
-
that the name of the college is
-
an ID, that actually college
-
names probably are not unique across the country.
-
There's probably a lot of or
-
several colleges named Washington college for example.
-
You know what, we're allowed to
-
have sets of attributes that
-
are unique and that makes sense in the college relation.
-
Most likely the combination of
-
the name and state of
-
a college is unique, and that's
-
what we would identify as the key for the college relation.
-
Now, you might wonder why it's
-
even important to have attributes that are identified as keys.
-
There's actually several uses for them.
-
One of them is just to identify specific tuples.
-
So if you want to run
-
a query to get a specific
-
tuple out of the database you
-
would do that by asking for that tuple by its key.
-
And related to that
-
database systems for efficiency
-
tend to build special index
-
structures or store the database in a particular way.
-
So it's very fast to find a tuple based on its key.
-
And lastly, if one relation
-
in a relational database wants
-
to refer to tuples of another,
-
there 's no concept of pointer in relational databases.
-
Therefore, the first relation
-
will typically refer to a
-
tuple in the second relation by its unique key.
-
As our videos develop we'll see the importance of keys.
-
Okay, just to wrap up,
-
I'll mention how one creates relations
-
or tables in the SQL language.
-
It's very simple, you just
-
say "create table," give the
-
name of the relation and a list of the attributes.
-
And if you want to give types for the attributes.
-
It's similar except you follow
-
each attribute name with its type.
-
So to wrap up, the relational
-
model has been around a long time.
-
Has started a huge industry.
-
It's used by all database systems.
-
As you've seen it's a very
-
simple model and will shortly
-
see that it can be queried with very nice languages.
-
And, finally, it's been implemented very efficiently.