Return to Video

02-01-relational-model.mp4

  • 0:01 - 0:03
    In this video, we'll learn about the Relational Model.
  • 0:04 - 0:05
    The Relational Model is more than
  • 0:06 - 0:07
    35 years old, and it's
  • 0:08 - 0:09
    really the foundation of database management systems.
  • 0:10 - 0:12
    It's spawned a many billion dollar industry.
  • 0:14 - 0:15
    The relational model underlies all
  • 0:16 - 0:18
    commercial database systems at this point in time.
  • 0:19 - 0:21
    It's actually an extremely simple model and that's one of its benefits.
  • 0:22 - 0:24
    Furthermore, it can be queried.
  • 0:24 - 0:25
    By that I mean we can ask
  • 0:26 - 0:27
    questions of databases in the
  • 0:27 - 0:29
    model using High Level Languages.
  • 0:30 - 0:31
    High Level Languages are simple, yet
  • 0:32 - 0:34
    extremely expressive for asking questions over the database.
  • 0:35 - 0:37
    And finally, very importantly there
  • 0:37 - 0:40
    are extremely efficient implementations of
  • 0:40 - 0:42
    the relational model and of
  • 0:42 - 0:43
    the query languages on that model.
  • 0:44 - 0:45
    So let's move ahead and
  • 0:45 - 0:47
    understand the basic constructs in the relational model.
  • 0:49 - 0:51
    So, the primary construct is in fact, the relation.
  • 0:52 - 0:53
    A database consists of a
  • 0:53 - 0:55
    set of relations or sometimes
  • 0:55 - 0:58
    referred to as "tables", each of which has a name.
  • 0:58 - 1:01
    So, we're gonna use two relations in our example.
  • 1:02 - 1:03
    Our example is gonna be a fictitious database
  • 1:04 - 1:05
    about students applying to colleges.
  • 1:05 - 1:07
    For now we're just gonna look at the students and colleges themselves.
  • 1:08 - 1:09
    So we're gonna have two tables,
  • 1:10 - 1:11
    and let's call those tables
  • 1:12 - 1:14
    the Student table and the College table.
  • 1:15 - 1:16
    Now, as an aside, there's
  • 1:17 - 1:18
    a healthy debate in the
  • 1:18 - 1:20
    database world about whether tables
  • 1:20 - 1:23
    relations ought to be named using the singular or the plural.
  • 1:23 - 1:25
    I personally don't have a stake in that debate.
  • 1:25 - 1:26
    I'm going to use the singular.
  • 1:28 - 1:30
    Next, we have the concept of attributes.
  • 1:30 - 1:32
    So every relation and relational
  • 1:32 - 1:34
    database has a predefined set
  • 1:34 - 1:37
    of columns or attributes each of which has a name.
  • 1:38 - 1:39
    So, for our student table,
  • 1:40 - 1:41
    let's say that each student
  • 1:41 - 1:42
    is gonna have an ID, a
  • 1:43 - 1:46
    name, a GPA and a photo.
  • 1:47 - 1:48
    And for our college
  • 1:49 - 1:50
    table, let's say that every
  • 1:50 - 1:51
    college is going to have
  • 1:51 - 1:54
    a name, a state, and an enrollment.
  • 1:55 - 1:56
    We'll just abbreviate that ENR.
  • 1:58 - 1:59
    So those are the labeled columns.
  • 2:00 - 2:01
    Now the actual data itself
  • 2:02 - 2:02
    is stored in what are
  • 2:02 - 2:05
    called the tuples (or the rows) in the tables.
  • 2:06 - 2:07
    So let's put a couple
  • 2:07 - 2:09
    of the data tables, data tuples in our tables.
  • 2:09 - 2:12
    So let's start with the
  • 2:12 - 2:14
    students and lets say that
  • 2:14 - 2:15
    our first student has ID 123.
  • 2:16 - 2:18
    Name is Amy -
  • 2:18 - 2:22
    GPA 3.9, and she's happy with that.
  • 2:22 - 2:23
    So she has a smiley photo.
  • 2:24 - 2:25
    And our second student is 234,
  • 2:26 - 2:28
    Bob - his GPA
  • 2:30 - 2:30
    is 3.4.
  • 2:30 - 2:31
    He's not quite as happy.
  • 2:32 - 2:33
    And typically of course a
  • 2:33 - 2:35
    table will have thousands, maybe
  • 2:35 - 2:37
    millions, even sometimes billions of
  • 2:37 - 2:40
    rows, each row containing a value for each attribute.
  • 2:41 - 2:43
    In our college table, let's suppose
  • 2:43 - 2:44
    we have .... well of course
  • 2:44 - 2:47
    we're going to start with Stanford in
  • 2:47 - 2:50
    the state of California and Stanford's enrollment is 15,000.
  • 2:52 - 2:54
    We'll include our cross-bay rival Berkeley
  • 2:55 - 2:56
    again, in the state of California.
  • 2:56 - 2:58
    Berkeley's enrollment is a whopping 36,000.
  • 2:59 - 3:01
    And last of all,
  • 3:01 - 3:04
    we are going to not be West Coast biased.
  • 3:04 - 3:06
    We'll include MIT in the
  • 3:06 - 3:08
    state of Massachusetts with an enrollment of 10,000.
  • 3:08 - 3:10
    Now of course there's
  • 3:10 - 3:11
    gonna be again many more
  • 3:11 - 3:12
    tuples in the college table, and
  • 3:12 - 3:14
    many more tuples in the student table.
  • 3:15 - 3:17
    Okay, next, let me
  • 3:17 - 3:18
    mention that in a relational
  • 3:19 - 3:20
    database, typically each attribute or
  • 3:20 - 3:21
    column has a type
  • 3:22 - 3:23
    sometimes referred to as a domain.
  • 3:24 - 3:25
    For example, the ID might
  • 3:25 - 3:26
    be an integer, the name
  • 3:27 - 3:28
    might be a string, GPA might
  • 3:28 - 3:30
    be a float, photo might
  • 3:30 - 3:32
    be a jpeg file.
  • 3:33 - 3:34
    We do also in most relational
  • 3:35 - 3:37
    databases have a concept of enumerated domain.
  • 3:37 - 3:38
    So for example, the state might
  • 3:39 - 3:41
    be an enumerated domain for
  • 3:41 - 3:43
    the 50 abbreviations for states.
  • 3:44 - 3:45
    Now, it's typical for relational
  • 3:45 - 3:46
    databases to have just
  • 3:47 - 3:48
    atomic types in their
  • 3:48 - 3:50
    attributes as we have
  • 3:50 - 3:52
    here, but many database
  • 3:52 - 3:54
    systems do also support structured
  • 3:54 - 3:55
    types inside attributes.
  • 3:59 - 3:59
    Okay, a little bit more terminology.
  • 4:00 - 4:01
    The schema of a
  • 4:01 - 4:04
    database is the structure of the relation.
  • 4:05 - 4:07
    So the schema includes the name
  • 4:07 - 4:09
    of the relation and the attributes
  • 4:09 - 4:11
    of the relation and the types of those attributes.
  • 4:12 - 4:13
    Where the instance is the
  • 4:13 - 4:16
    actual contents of the table at a given point in time.
  • 4:16 - 4:17
    So, typically you set up
  • 4:17 - 4:19
    a schema in advance, then the
  • 4:19 - 4:22
    instances of the data will change over time.
  • 4:24 - 4:26
    Now, I mentioned that most columns have types.
  • 4:27 - 4:28
    But, there's also a special value
  • 4:28 - 4:30
    that's in any type of
  • 4:30 - 4:31
    any column and that's a
  • 4:31 - 4:33
    special value known as null, and nulls
  • 4:33 - 4:34
    are actually quite important in relational databases.
  • 4:35 - 4:37
    Null values are used to
  • 4:37 - 4:38
    denote that a particular
  • 4:39 - 4:41
    value is maybe unknown or undefined.
  • 4:43 - 4:44
    And, so let's suppose, let's add
  • 4:45 - 4:47
    another tuple to our database.
  • 4:48 - 4:50
    Let's say 345, another student
  • 4:50 - 4:52
    named Craig, and for,
  • 4:52 - 4:53
    whatever reason Craig doesn't have a GPA.
  • 4:54 - 4:55
    Maybe Craig is home schooled,
  • 4:55 - 4:57
    maybe Craig doesn't want to reveal his GPA.
  • 4:58 - 4:59
    So then the database would contain
  • 4:59 - 5:01
    a null value for Craig
  • 5:02 - 5:03
    and we'll just put a neutral face there.
  • 5:04 - 5:07
    Or, for example, maybe Bob doesn't
  • 5:07 - 5:08
    want to have his photo in
  • 5:08 - 5:10
    the database, so then Bob
  • 5:11 - 5:12
    would have a null value
  • 5:13 - 5:15
    for his photo, again nulls can go anywhere.
  • 5:16 - 5:17
    Now null values are useful
  • 5:17 - 5:18
    but one has to be very
  • 5:18 - 5:20
    careful in a database system
  • 5:20 - 5:21
    when you run queries
  • 5:22 - 5:23
    over relations that have null values.
  • 5:24 - 5:25
    In a later video we'll go
  • 5:25 - 5:26
    into this in more detail, but I
  • 5:26 - 5:27
    just wanted to give a just
  • 5:27 - 5:28
    sort of example of what can happen.
  • 5:29 - 5:30
    So, let's suppose we're
  • 5:30 - 5:32
    asking a query over our
  • 5:32 - 5:33
    student table of all students
  • 5:33 - 5:35
    whose GPA is greater than 3.5.
  • 5:36 - 5:37
    So when we run
  • 5:37 - 5:38
    that query on our database
  • 5:39 - 5:40
    obviously we'll get Amy out,
  • 5:41 - 5:43
    obviously we won't get Bob out, but should we get Craig?
  • 5:43 - 5:44
    The answer is No.
  • 5:45 - 5:46
    We don't know for a fact
  • 5:46 - 5:47
    that Craig's GPA is greater
  • 5:47 - 5:49
    than 3.5, so we'll only get
  • 5:49 - 5:51
    one student out from that query.
  • 5:51 - 5:53
    Now let's suppose we
  • 5:53 - 5:54
    had another query, where we
  • 5:54 - 5:55
    were gonna ask for the GPA
  • 5:56 - 5:57
    less than or equal to 3.5.
  • 5:58 - 5:59
    So, similarly where we
  • 6:00 - 6:01
    would not have Amy in result
  • 6:01 - 6:02
    and we would certainly have Bob in
  • 6:02 - 6:04
    the result and similarly would
  • 6:04 - 6:05
    not have Craig in the
  • 6:05 - 6:06
    result because we don't know that
  • 6:06 - 6:08
    his GPA is less than or equal to 3.5.
  • 6:09 - 6:10
    So far so good, but
  • 6:10 - 6:12
    it gets a little weird is
  • 6:12 - 6:13
    when we add an or here
  • 6:13 - 6:14
    in our query, we say I
  • 6:15 - 6:16
    want everyone who's GPA is greater
  • 6:16 - 6:19
    than 3.5 or who's GPA is less than or equal to 3.5.
  • 6:19 - 6:21
    And even though it
  • 6:21 - 6:22
    looks like every tuple should
  • 6:22 - 6:24
    satisfy this condition, that it's
  • 6:24 - 6:26
    always true, that's not the case when we have null values.
  • 6:27 - 6:28
    So, that's why one has to
  • 6:28 - 6:29
    be careful when one uses
  • 6:29 - 6:31
    null values in relational databases.
  • 6:32 - 6:34
    Let me erase this now and
  • 6:34 - 6:35
    let's move on to our next concept
  • 6:36 - 6:37
    which is the concept of Key.
  • 6:38 - 6:41
    Key is again another important concept in relational databases.
  • 6:42 - 6:44
    And, a key is an
  • 6:44 - 6:45
    attribute in of a relation
  • 6:45 - 6:49
    where every value for that attribute is unique.
  • 6:50 - 6:50
    So if we look at the
  • 6:50 - 6:52
    student relation, we can feel
  • 6:52 - 6:55
    pretty confident that the ID is going to be a key.
  • 6:55 - 6:56
    In other words, every tuple is
  • 6:56 - 6:58
    going to have a unique for ID.
  • 7:00 - 7:02
    Thinking about the college relation, it's a little less clear.
  • 7:03 - 7:04
    We might be tempted to say
  • 7:04 - 7:05
    that the name of the college is
  • 7:05 - 7:06
    an ID, that actually college
  • 7:07 - 7:09
    names probably are not unique across the country.
  • 7:10 - 7:11
    There's probably a lot of or
  • 7:11 - 7:13
    several colleges named Washington college for example.
  • 7:14 - 7:15
    You know what, we're allowed to
  • 7:15 - 7:17
    have sets of attributes that
  • 7:17 - 7:19
    are unique and that makes sense in the college relation.
  • 7:20 - 7:22
    Most likely the combination of
  • 7:22 - 7:23
    the name and state of
  • 7:23 - 7:24
    a college is unique, and that's
  • 7:24 - 7:27
    what we would identify as the key for the college relation.
  • 7:28 - 7:29
    Now, you might wonder why it's
  • 7:29 - 7:32
    even important to have attributes that are identified as keys.
  • 7:33 - 7:34
    There's actually several uses for them.
  • 7:35 - 7:37
    One of them is just to identify specific tuples.
  • 7:38 - 7:39
    So if you want to run
  • 7:39 - 7:40
    a query to get a specific
  • 7:40 - 7:42
    tuple out of the database you
  • 7:42 - 7:45
    would do that by asking for that tuple by its key.
  • 7:45 - 7:47
    And related to that
  • 7:47 - 7:48
    database systems for efficiency
  • 7:49 - 7:50
    tend to build special index
  • 7:51 - 7:53
    structures or store the database in a particular way.
  • 7:53 - 7:56
    So it's very fast to find a tuple based on its key.
  • 7:57 - 7:58
    And lastly, if one relation
  • 7:59 - 8:00
    in a relational database wants
  • 8:00 - 8:01
    to refer to tuples of another,
  • 8:02 - 8:04
    there 's no concept of pointer in relational databases.
  • 8:05 - 8:07
    Therefore, the first relation
  • 8:07 - 8:08
    will typically refer to a
  • 8:09 - 8:11
    tuple in the second relation by its unique key.
  • 8:11 - 8:13
    As our videos develop we'll see the importance of keys.
  • 8:15 - 8:16
    Okay, just to wrap up,
  • 8:17 - 8:19
    I'll mention how one creates relations
  • 8:19 - 8:21
    or tables in the SQL language.
  • 8:21 - 8:23
    It's very simple, you just
  • 8:23 - 8:24
    say "create table," give the
  • 8:24 - 8:26
    name of the relation and a list of the attributes.
  • 8:27 - 8:29
    And if you want to give types for the attributes.
  • 8:30 - 8:31
    It's similar except you follow
  • 8:31 - 8:33
    each attribute name with its type.
  • 8:34 - 8:35
    So to wrap up, the relational
  • 8:36 - 8:37
    model has been around a long time.
  • 8:37 - 8:39
    Has started a huge industry.
  • 8:39 - 8:41
    It's used by all database systems.
  • 8:41 - 8:42
    As you've seen it's a very
  • 8:42 - 8:44
    simple model and will shortly
  • 8:44 - 8:46
    see that it can be queried with very nice languages.
  • 8:47 - 8:48
    And, finally, it's been implemented very efficiently.
Title:
02-01-relational-model.mp4
Video Language:
English
Duration:
08:50
Alberto Gonzalez Rouille added a translation

English subtitles

Revisions