03-02-dtds-ids-idrefs.mp4

  • 0:00 - 0:02
    In the previous video, we learned the basics of XML.
  • 0:02 - 0:04
    In this video, we're
  • 0:04 - 0:06
    going to learn about Document Type Descriptors,
  • 0:06 - 0:11
    also known as DTDs, and also ID and IDREF attributes.
  • 0:11 - 0:13
    We learned that well-formed XML
  • 0:13 - 0:14
    is XML that adheres to
  • 0:14 - 0:16
    basic structural requirements: a single
  • 0:16 - 0:18
    root element, matched tags with
  • 0:18 - 0:20
    proper nesting, and unique
  • 0:20 - 0:23
    attributes within each element.
  • 0:23 - 0:26
    Now we're going to learn about what's known as valid XML.
  • 0:26 - 0:27
    Valid XML has to adhere
  • 0:27 - 0:30
    to the same basic structural requirements
  • 0:30 - 0:32
    as well-formed XML, but it
  • 0:32 - 0:35
    also adheres to content-specific specifications.
  • 0:35 - 0:38
    And we're going to learn two languages for those specifications.
  • 0:38 - 0:39
    One of them is Document Type
  • 0:39 - 0:41
    Descriptors or DTDs, and the
  • 0:41 - 0:44
    other, a more powerful language, is XML Schema.
  • 0:44 - 0:46
    Specifications in XML
  • 0:46 - 0:50
    schema are known as XSDs, for XML Schema Descriptions.
  • 0:50 - 0:52
    So as a reminder, here's how
  • 0:52 - 0:54
    things worked with well-formed XML documents.
  • 0:54 - 0:55
    We sent the document to a
  • 0:55 - 0:57
    parser and the parser would
  • 0:57 - 0:58
    either return that the document
  • 0:58 - 1:02
    was not well-formed or it would return parsed XML.
  • 1:02 - 1:03
    Now let's consider what happens with valid XML.
  • 1:03 - 1:05
    Now we use a validating
  • 1:05 - 1:07
    XML parser, and we have
  • 1:07 - 1:08
    an additional input to the
  • 1:08 - 1:10
    process, which is a
  • 1:10 - 1:12
    specification, either a DTD or an XSD.
  • 1:12 - 1:15
    So that's also fed to the parser, along with the document.
  • 1:15 - 1:17
    The parser can again
  • 1:17 - 1:18
    say the document is
  • 1:18 - 1:22
    not well-formed if it doesn't meet the basic structural requirements.
  • 1:22 - 1:23
    It could also say that the
  • 1:23 - 1:24
    document is not valid, meaning
  • 1:24 - 1:26
    the structure of the document doesn't
  • 1:26 - 1:28
    match the content-specific specification.
  • 1:28 - 1:30
    If everything is good, then
  • 1:30 - 1:33
    once again "parsed XML" is returned.
  • 1:33 - 1:36
    Now let's talk about the document-type descriptors, or DTDs.
  • 1:36 - 1:37
    We see a DTD in
  • 1:37 - 1:38
    the lower-left corner of the
  • 1:38 - 1:39
    video, but we won't look
  • 1:39 - 1:40
    at it in any detail, because we'll
  • 1:40 - 1:44
    be doing demos of DTDs a little later on.
  • 1:44 - 1:45
    A DTD is a language
  • 1:45 - 1:47
    that's kind of like a grammar, and
  • 1:47 - 1:49
    what you can specify in that language is for
  • 1:49 - 1:51
    a particular document what elements
  • 1:51 - 1:52
    you want that document to contain,
  • 1:52 - 1:54
    the tags of the elements,
  • 1:54 - 1:55
    what attributes can be in
  • 1:55 - 1:59
    the elements, how the different types of elements can be nested.
  • 1:59 - 2:00
    Sometimes the ordering of the
  • 2:00 - 2:01
    elements might want to be
  • 2:01 - 2:06
    specified, and sometimes the number of occurrences of different elements.
  • 2:06 - 2:07
    DTDs also allow the
  • 2:07 - 2:09
    introduction of special types of
  • 2:09 - 2:11
    attributes, called ID and IDREFS.
  • 2:11 - 2:13
    And, effectively, what these allow you
  • 2:13 - 2:15
    to do is specify pointers within
  • 2:15 - 2:19
    a document, although these pointers are untyped.
  • 2:19 - 2:20
    Before moving to the demo,
  • 2:20 - 2:21
    let's talk a little bit about
  • 2:21 - 2:22
    the positives and negatives of
  • 2:22 - 2:24
    choosing to use a DTD
  • 2:24 - 2:26
    or an XSD for one's XML data.
  • 2:26 - 2:27
    After all, if you're
  • 2:27 - 2:29
    building an application that encodes
  • 2:29 - 2:30
    its data in XML, you'll have
  • 2:30 - 2:32
    to decide whether you want the
  • 2:32 - 2:33
    XML to just be well-formed
  • 2:33 - 2:34
    or whether you want to
  • 2:34 - 2:37
    have specifications and require the
  • 2:37 - 2:40
    XML to be valid to satisfy those specifications.
  • 2:40 - 2:41
    So, let's put a few positives
  • 2:41 - 2:44
    of choosing the latter, of requiring a DTD or an XSD.
  • 2:44 - 2:46
    First of all, one of
  • 2:46 - 2:47
    them is that when you write your
  • 2:47 - 2:49
    program, you can assume
  • 2:49 - 2:52
    that the data adheres to a specific structure.
  • 2:52 - 2:54
    So programs can assume a
  • 2:54 - 2:56
    structure and so the
  • 2:56 - 2:57
    programs themselves are simpler because they don't
  • 2:57 - 3:00
    have to be doing a lot of error checking on the data.
  • 3:00 - 3:01
    They'll know that before the data
  • 3:01 - 3:03
    reaches the program, it's been
  • 3:03 - 3:07
    run through a validator and it does satisfy a particular structure.
  • 3:07 - 3:08
    Second of all, we talked
  • 3:08 - 3:10
    some time ago about
  • 3:10 - 3:13
    the cascading style sheet language
  • 3:13 - 3:15
    and the extensible style sheet languages.
  • 3:15 - 3:17
    These are languages that take XML
  • 3:17 - 3:19
    and they run rules on it
  • 3:19 - 3:22
    to process it into a different form, often HTML.
  • 3:22 - 3:24
    When you write those rules, if
  • 3:24 - 3:25
    you know that the data
  • 3:25 - 3:26
    has a certain structure, then those
  • 3:26 - 3:28
    rules can be simpler, so like
  • 3:28 - 3:30
    the programs they also can
  • 3:30 - 3:33
    assume particular structure and it makes them simpler.
  • 3:33 - 3:35
    Now, another use for DTDs
  • 3:35 - 3:36
    or XSDs is as a
  • 3:36 - 3:39
    specification language for conveying
  • 3:39 - 3:41
    what XML might need to look like.
  • 3:41 - 3:43
    So, as an example if you're
  • 3:43 - 3:45
    performing data exchange using
  • 3:45 - 3:47
    XML, maybe a company is
  • 3:47 - 3:48
    going to receive purchase orders in
  • 3:48 - 3:50
    XML, the company can
  • 3:50 - 3:51
    actually use the DTD as
  • 3:51 - 3:53
    a specification for what
  • 3:53 - 3:54
    the XML needs to look
  • 3:54 - 3:56
    like when it arrives at
  • 3:56 - 3:59
    the program that's going to operate on it.
  • 3:59 - 4:01
    Also documentation, it can
  • 4:01 - 4:02
    be useful to use one of
  • 4:02 - 4:04
    the specifications to just document
  • 4:04 - 4:06
    what the data itself looks like.
  • 4:06 - 4:08
    In general, really what
  • 4:08 - 4:11
    we have here is the benefits of typing.
  • 4:11 - 4:13
    We're talking about strongly typed data
  • 4:13 - 4:17
    versus loosely-typed data, if you want to think of it that way.
  • 4:17 - 4:21
    Now let's look at when we might prefer not to use a DTD.
  • 4:21 - 4:22
    So what I'm going to describe down
  • 4:22 - 4:25
    here is the benefits of not using a DTD.
  • 4:25 - 4:27
    So the biggest benefit is flexibility.
  • 4:27 - 4:30
    So a DTD makes your
  • 4:30 - 4:33
    XML data have to conform to a specification.
  • 4:33 - 4:34
    If you want more flexibility or
  • 4:34 - 4:36
    you want ease of change
  • 4:36 - 4:37
    in the way that the data is
  • 4:37 - 4:39
    formatted without running into
  • 4:39 - 4:40
    a lot of errors, then, if
  • 4:40 - 4:42
    that's what you want,
  • 4:42 - 4:45
    then the DTD can be constraining.
  • 4:45 - 4:46
    Another fact is that DTDs can
  • 4:46 - 4:48
    be fairly messy and this
  • 4:48 - 4:49
    is not going to be obvious
  • 4:49 - 4:50
    to you yet until we get
  • 4:50 - 4:52
    into the demo, but if
  • 4:52 - 4:55
    the data is irregular, very irregular, then
  • 4:55 - 4:57
    specifying its structure can
  • 4:57 - 5:00
    be hard, especially for irregular documents.
  • 5:00 - 5:02
    Actually, when we see
  • 5:02 - 5:04
    the schema language, we'll
  • 5:04 - 5:06
    discover that XSDs can be,
  • 5:06 - 5:10
    I would say, really messy, so they can actually get very large.
  • 5:10 - 5:11
    It's possible to have a
  • 5:11 - 5:13
    document where the specification of
  • 5:13 - 5:14
    the structure of the document is
  • 5:14 - 5:16
    much, much larger than the
  • 5:16 - 5:18
    document itself, which seems not
  • 5:18 - 5:19
    entirely intuitive, but when we get to
  • 5:19 - 5:22
    learn about XSDs, I think you'll see how that can happen.
  • 5:22 - 5:23
    So, overall, this is
  • 5:23 - 5:26
    the benefits of no typing.
  • 5:26 - 5:28
    It's really quite similar to
  • 5:28 - 5:31
    the analogy in programming languages.
  • 5:31 - 5:33
    The remainder of this video will
  • 5:33 - 5:35
    teach about the DTDs themselves through a set of examples.
  • 5:35 - 5:36
    We'll have a separate video
  • 5:36 - 5:39
    for learning about XML schema and XSDs.
  • 5:39 - 5:41
    So, here we are
  • 5:41 - 5:43
    with our first document that we're
  • 5:43 - 5:45
    going to look at with a document type descriptor.
  • 5:45 - 5:47
    We have on the left the document itself.
  • 5:47 - 5:49
    We have on the right the document-type
  • 5:49 - 5:50
    descriptor, and then we have
  • 5:50 - 5:51
    in the lower right a command
  • 5:51 - 5:55
    line shell that we're going to use to validate the document.
  • 5:55 - 5:56
    So this is similar data to
  • 5:56 - 5:57
    what we saw on the last video,
  • 5:57 - 5:59
    but let's go through it just to see what we have.
  • 5:59 - 6:01
    We have an outermost element called
  • 6:01 - 6:04
    bookstore, and we have two books in our bookstore.
  • 6:04 - 6:08
    The first book has an ISBN number, price, and edition
  • 6:08 - 6:09
    as attributes, and then it
  • 6:09 - 6:12
    has a sub-element called title, another
  • 6:12 - 6:13
    sub-element called authors with two
  • 6:13 - 6:16
    authors underneath; first names and last names.
  • 6:16 - 6:18
    The second book element is
  • 6:18 - 6:20
    similar, except it doesn't have an edition.
  • 6:20 - 6:23
    It also has, as we see, a remark.
  • 6:23 - 6:24
    Now let's take a look at
  • 6:24 - 6:25
    the DTD and I'm just going
  • 6:25 - 6:27
    to walk through the DTD, not
  • 6:27 - 6:29
    too slowly, not too fast, and
  • 6:29 - 6:30
    explain exactly what it's doing.
  • 6:30 - 6:31
    So the start of the
  • 6:31 - 6:33
    DTD says this is a
  • 6:33 - 6:35
    DTD named bookstore and the
  • 6:35 - 6:37
    root element is called bookstore,
  • 6:37 - 6:40
    and now we have the first grammar-like construct.
  • 6:40 - 6:42
    So these constructs, in fact, are
  • 6:42 - 6:44
    a little bit like regular expressions if you know them.
  • 6:44 - 6:45
    What this says is that
  • 6:45 - 6:47
    a bookstore element has as
  • 6:47 - 6:49
    its sub-element any number
  • 6:49 - 6:51
    of elements that are called book or magazine.
  • 6:51 - 6:53
    We have book or magazine.
  • 6:53 - 6:55
    We don't have any magazines yet but we'll add one.
  • 6:55 - 6:58
    And then this star says, zero or more instances.
  • 6:58 - 7:02
    It's the Kleene closure operator, for those of you familiar with regular expressions.
  • 7:02 - 7:04
    Now let's talk about
  • 7:04 - 7:07
    what the book element
    has, so that's our next specification.
  • 7:07 - 7:09
    The book element has a
  • 7:09 - 7:11
    title followed by authors,
  • 7:11 - 7:13
    followed by an optional remark.
  • 7:13 - 7:14
    So now we don't have an
  • 7:14 - 7:15
    "or", we have a comma, and
  • 7:15 - 7:16
    that says that these are going to
  • 7:16 - 7:17
    be in that order - title,
  • 7:17 - 7:19
    authors, and remark and the
  • 7:19 - 7:22
    question mark says that the remark is optional.
  • 7:22 - 7:24
    Next we have the attributes of our book elements.
  • 7:24 - 7:26
    So this bang attribute list
  • 7:26 - 7:27
    says we're going to describe
  • 7:27 - 7:28
    the attributes and we're going
  • 7:28 - 7:31
    to have three of them: the ISBN,
  • 7:31 - 7:33
    the price, and the edition.
  • 7:33 - 7:35
    CDATA is the type of the attribute.
  • 7:35 - 7:36
    It's just a string.
  • 7:36 - 7:37
    And then required says that
  • 7:37 - 7:39
    the attribute must be present, whereas
  • 7:39 - 7:41
    implied says it doesn't have to be present.
  • 7:41 - 7:45
    As you may remember, we have one book that doesn't have an edition.
  • 7:45 - 7:46
    Our magazines are simply going
  • 7:46 - 7:47
    to have titles and they're going
  • 7:47 - 7:49
    to have attributes that are month and year.
  • 7:49 - 7:51
    Again, we don't have any magazines yet.
  • 7:51 - 7:53
    A title is going to
  • 7:53 - 7:55
    consist of string data.
  • 7:55 - 7:58
    So here we see our title, A First Course in Database Systems.
  • 7:58 - 8:02
    You can think of that as the leaf data in the XML tree.
  • 8:02 - 8:03
    And when you have a leaf that
  • 8:03 - 8:05
    consists of text data, this is
  • 8:05 - 8:06
    what you put in the DTD
  • 8:06 - 8:08
    - just take my word for it:
  • 8:08 - 8:10
    hash PCDATA in parentheses.
  • 8:10 - 8:14
    Now our authors are an element that still has structure.
  • 8:14 - 8:16
    Our authors element has
  • 8:16 - 8:18
    author sub-elements,
  • 8:18 - 8:19
    and we're going to
  • 8:19 - 8:21
    specify here that the
  • 8:21 - 8:23
    authors element must have one
  • 8:23 - 8:25
    or more author subelements.
  • 8:25 - 8:26
    So that's what the plus
  • 8:26 - 8:29
    is saying here, again taken from regular expressions.
  • 8:29 - 8:32
    "Plus" means one or more instances.
  • 8:32 - 8:33
    We have the remark, which
  • 8:33 - 8:36
    is just going to be PCDATA, or string data.
  • 8:36 - 8:38
    We have our authors which consist
  • 8:38 - 8:40
    of a first name sub-element and
  • 8:40 - 8:42
    a last-name sub-element, and in that order.
  • 8:42 - 8:46
    And then finally, our first names and last names are also strings.
  • 8:46 - 8:47
    So, this is the entire
  • 8:47 - 8:49
    DTD and it describes
  • 8:49 - 8:51
    in detail the structure
  • 8:51 - 8:53
    of our document.
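
    For readers without the video on screen, here is a rough sketch of what the document and DTD just described might look like, written as one XML file with the DTD as an internal subset. The element and attribute names, their capitalization, the ISBN and price values, and the remark text are all assumptions for illustration; only the overall structure comes from the walkthrough above.

    ```xml
    <?xml version="1.0"?>
    <!-- Sketch only: names and values are illustrative assumptions. -->
    <!DOCTYPE Bookstore [
      <!-- Zero or more children, each either a Book or a Magazine. -->
      <!ELEMENT Bookstore (Book | Magazine)*>
      <!-- Title, then Authors, then an optional Remark, in that order. -->
      <!ELEMENT Book (Title, Authors, Remark?)>
      <!ATTLIST Book ISBN    CDATA #REQUIRED
                     Price   CDATA #REQUIRED
                     Edition CDATA #IMPLIED>
      <!ELEMENT Magazine (Title)>
      <!ATTLIST Magazine Month CDATA #REQUIRED
                         Year  CDATA #REQUIRED>
      <!-- Leaf elements holding text. -->
      <!ELEMENT Title (#PCDATA)>
      <!-- One or more Author children. -->
      <!ELEMENT Authors (Author+)>
      <!ELEMENT Remark (#PCDATA)>
      <!ELEMENT Author (First_Name, Last_Name)>
      <!ELEMENT First_Name (#PCDATA)>
      <!ELEMENT Last_Name (#PCDATA)>
    ]>
    <Bookstore>
      <Book ISBN="ISBN-0001" Price="85" Edition="3rd">
        <Title>A First Course in Database Systems</Title>
        <Authors>
          <Author>
            <First_Name>Jeffrey</First_Name>
            <Last_Name>Ullman</Last_Name>
          </Author>
          <Author>
            <First_Name>Jennifer</First_Name>
            <Last_Name>Widom</Last_Name>
          </Author>
        </Authors>
      </Book>
      <!-- The second book omits the optional Edition and adds a Remark. -->
      <Book ISBN="ISBN-0002" Price="100">
        <Title>Database Systems: The Complete Book</Title>
        <Authors>
          <Author>
            <First_Name>Hector</First_Name>
            <Last_Name>Garcia-Molina</Last_Name>
          </Author>
          <Author>
            <First_Name>Jeffrey</First_Name>
            <Last_Name>Ullman</Last_Name>
          </Author>
          <Author>
            <First_Name>Jennifer</First_Name>
            <Last_Name>Widom</Last_Name>
          </Author>
        </Authors>
        <Remark>Placeholder remark text.</Remark>
      </Book>
    </Bookstore>
    ```

    A validating parser should accept a file like this. With xmllint, one way to check it would be `xmllint --valid --noout` followed by the file name, though the exact options used in the video are not stated.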
  • 8:53 - 8:54
    Now we have a command, we're
  • 8:54 - 8:57
    using something called xmllint,
  • 8:57 - 9:00
    that will check to see if the document meets the structure.
  • 9:00 - 9:02
    We'll just run that command
  • 9:02 - 9:03
    here with a couple of options, and
  • 9:03 - 9:05
    it doesn't give us any output
  • 9:05 - 9:09
    which actually means that our document is correct.
  • 9:09 - 9:13
    We'll be making some edits and seeing what happens when we run the command on a document that is not correct.
  • 9:13 - 9:14
    So let's make our first edit,
  • 9:14 - 9:16
    let's say that we decide that
  • 9:16 - 9:17
    we want the edition attribute
  • 9:17 - 9:21
    of our books to be "required" rather than "implied".
  • 9:21 - 9:23
    So we'll change the DTD.
  • 9:23 - 9:27
    We'll save the file, and now we run our command.
  • 9:27 - 9:28
    So as expected we got an
  • 9:28 - 9:30
    error, and the error said
  • 9:30 - 9:33
    that one of our book elements does not have the attribute edition.
  • 9:33 - 9:36
    Now that edition is required, every book element ought to have it.
  • 9:36 - 9:39
    So let's add an edition to our second book.
  • 9:39 - 9:41
    Let's say that it's
  • 9:41 - 9:43
    the second edition, save the
  • 9:43 - 9:44
    file, we'll validate our
  • 9:44 - 9:48
    document again, and now everything is good. Let's
  • 9:48 - 9:49
    do an edit to the document
  • 9:49 - 9:51
    this time to see what
  • 9:51 - 9:52
    happens when we change the
  • 9:52 - 9:54
    order of the first name and the last name.
  • 9:54 - 9:58
    So we've swapped Jeffrey Ullman to be Ullman Jeffrey.
  • 9:58 - 10:00
    We validate our document, and now
  • 10:00 - 10:02
    we see we got an error
  • 10:02 - 10:04
    because the elements are not in the correct order.
  • 10:04 - 10:06
    In this case, let's undo that
  • 10:06 - 10:09
    change, rather than change our DTD.
  • 10:09 - 10:11
    Let's try another edit to our document.
  • 10:11 - 10:13
    Let's add a remark to our first book.
  • 10:13 - 10:14
    But what we'll do is
  • 10:14 - 10:16
    we'll leave the remark empty, so
  • 10:16 - 10:18
    we'll add an opening and then
  • 10:18 - 10:24
    directly a closing tag, and let's see if that validates.
  • 10:24 - 10:25
    So, it did validate.
  • 10:25 - 10:26
    And in fact when we have
  • 10:26 - 10:27
    PCDATA as the type
  • 10:27 - 10:32
    of an element, it's perfectly acceptable to have an empty element.
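
    As a minimal sketch of that last point, the following stand-alone file declares a single text-only element and leaves it empty. Using Remark as the root element is an assumption made only to keep the example short; in the video the remark sits inside a book.

    ```xml
    <?xml version="1.0"?>
    <!DOCTYPE Remark [
      <!-- Text-only leaf element, declared as in the walkthrough. -->
      <!ELEMENT Remark (#PCDATA)>
    ]>
    <!-- An opening tag directly followed by a closing tag still matches (#PCDATA). -->
    <Remark></Remark>
    ```

    The design point is that (#PCDATA) permits any amount of text, including none at all.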
  • 10:32 - 10:34
    As a final change, let's add a magazine to our database.
  • 10:34 - 10:37
    You'll have to bear with me as I type.
  • 10:37 - 10:39
    I'm always a little bit slow.
  • 10:39 - 10:40
    So we see over here that
  • 10:40 - 10:41
    when we have a magazine there are
  • 10:41 - 10:44
    two required attributes, the month and the year.
  • 10:44 - 10:45
    So, let's say the month is
  • 10:45 - 10:48
    January and the year,
  • 10:48 - 10:50
    let's make that 2011,
  • 10:50 - 10:53
    and then we have a title for our magazine.
  • 10:53 - 10:54
    Here.
  • 10:54 - 10:55
    We'll go down here.
  • 10:55 - 11:00
    Our title, let's make it National Geographic.
  • 11:00 - 11:03
    We'll close the tag, title tag.
  • 11:03 - 11:05
    And then, sorry again about my typing.
  • 11:05 - 11:08
    Let's go ahead and validate the document.
  • 11:08 - 11:11
    We saw premature end of something or other.
  • 11:11 - 11:13
    We forgot our closing tag for
  • 11:13 - 11:17
    magazine, let's put that in.
  • 11:17 - 11:19
    My terrible typing, and here we go.
  • 11:19 - 11:23
    Let's validate, and we're done.
  • 11:23 - 11:26
    Now we're going to learn about ID and IDREF attributes.
  • 11:26 - 11:28
    The document on the left side
  • 11:28 - 11:29
    contains the same data as
  • 11:29 - 11:32
    our previous document but completely restructured.
  • 11:32 - 11:33
    Instead of having authors as
  • 11:33 - 11:35
    subelements of book elements,
  • 11:35 - 11:37
    we're going to have our authors listed separately,
  • 11:37 - 11:41
    and then effectively point from the books to the authors of the book.
  • 11:41 - 11:42
    We'll take a look at the
  • 11:42 - 11:43
    data first, and then
  • 11:43 - 11:47
    we'll look at the DTD that describes the data.
  • 11:47 - 11:48
    Let's actually start with the
  • 11:48 - 11:51
    author, so our bookstore element
  • 11:51 - 11:55
    here has two subelements that are books and three that are authors.
  • 11:55 - 11:56
    So, looking at the authors, we have
  • 11:56 - 11:58
    the first name and last name
  • 11:58 - 11:59
    as sub-elements as usual, but
  • 11:59 - 12:02
    we've added what we call the ident attribute.
  • 12:02 - 12:03
    That's not a keyword; we've just
  • 12:03 - 12:05
    called the attribute ident, and
  • 12:05 - 12:07
    then for each of the three authors,
  • 12:07 - 12:08
    we've given a string value
  • 12:08 - 12:10
    to that attribute that we're going
  • 12:10 - 12:12
    to use effectively for the pointers in the book.
  • 12:12 - 12:16
    So we have our three authors, now let's take a look at the books.
  • 12:16 - 12:18
    Our book has the ISBN number and price.
  • 12:18 - 12:21
    I've taken the edition out for now.
  • 12:21 - 12:23
    And it has a special attribute called authors.
  • 12:23 - 12:25
    Authors is an IDREFS
  • 12:25 - 12:27
    attribute, and its value
  • 12:27 - 12:28
    can refer to one or
  • 12:28 - 12:31
    more strings that are ID
  • 12:31 - 12:32
    attributes in another element.
  • 12:32 - 12:33
    So that's what we're doing here.
  • 12:33 - 12:36
    We're referring to the two author elements here.
  • 12:36 - 12:40
    And in our second book we're referring to the three author elements.
  • 12:40 - 12:41
    We still have the title subelement
  • 12:41 - 12:44
    and we still have the remark subelement.
  • 12:44 - 12:46
    And furthermore, we have one
  • 12:46 - 12:47
    other cute thing here, which is,
  • 12:47 - 12:49
    instead of referring to
  • 12:49 - 12:51
    the book by name within the
  • 12:51 - 12:52
    remark when we're talking about
  • 12:52 - 12:56
    the other book, we have another type of pointer.
  • 12:56 - 12:57
    So we'll specify that the
  • 12:57 - 12:59
    ISBN is an ID
  • 12:59 - 13:01
    for books and then this
  • 13:01 - 13:03
    is an IDREF attribute
  • 13:03 - 13:07
    that's referring to the ID of the other book.
  • 13:07 - 13:11
    The DTD on the right describes the structure of this document.
  • 13:11 - 13:12
    This time our bookstore is
  • 13:12 - 13:14
    going to contain zero or more
  • 13:14 - 13:17
    books followed by zero or more authors.
  • 13:17 - 13:18
    Our books contain a title and
  • 13:18 - 13:20
    an optional remark as subelements, and
  • 13:20 - 13:22
    now they contain three attributes,
  • 13:22 - 13:24
    the ISBN, which is
  • 13:24 - 13:26
    now a special type of
  • 13:26 - 13:28
    attribute called an ID, the
  • 13:28 - 13:30
    price, which is a string
  • 13:30 - 13:31
    value as usual, and the
  • 13:31 - 13:32
    authors, which is the special type
  • 13:32 - 13:34
    called IDREFS. Let's keep
  • 13:34 - 13:37
    going. Our title is just a string value as usual.
  • 13:37 - 13:41
    A remark, here, this is actually an interesting construct.
  • 13:41 - 13:43
    A remark consists of
  • 13:43 - 13:46
    PCDATA, which is a string,
  • 13:46 - 13:47
    or a book reference, and then
  • 13:47 - 13:50
    zero or more instances of those.
  • 13:50 - 13:51
    This is the type of construct
  • 13:51 - 13:52
    that can be used to mix
  • 13:52 - 13:55
    strings and sub-elements within an element.
  • 13:55 - 13:56
    So anytime you want an
  • 13:56 - 13:57
    element that might have some
  • 13:57 - 14:00
    strings and then another element and then more string values,
  • 14:00 - 14:01
    That's how it's done.
  • 14:01 - 14:05
    PCDATA or the element type, zero or more.
  • 14:05 - 14:08
    Then we have our book reference,
  • 14:08 - 14:09
    which is actually an empty element. It's
  • 14:09 - 14:11
    only interesting because it has
  • 14:11 - 14:12
    an attribute. So let's go
  • 14:12 - 14:13
    back here. We see our book
  • 14:13 - 14:14
    ref here. It actually doesn't
  • 14:14 - 14:16
    have any data or
  • 14:16 - 14:17
    subelements, but it has an
  • 14:17 - 14:20
    attribute called book, and that is an IDREF.
  • 14:20 - 14:22
    That means it refers to an
  • 14:22 - 14:26
    ID attribute of another
  • 14:26 - 14:27
    element.
  • 14:27 - 14:28
    Now we have our authors, with the first
  • 14:28 - 14:30
    name and the last name, and
  • 14:30 - 14:33
    our author attributes have, again,
  • 14:33 - 14:35
    an ID, and we're calling it the ident.
  • 14:35 - 14:39
    And finally the first name and last name are string values.
  • 14:39 - 14:40
    This may seem overwhelming but the
  • 14:40 - 14:43
    key points in this DTD
  • 14:43 - 14:44
    are the ID attributes.
  • 14:44 - 14:46
    So the ID attributes, the ISBN
  • 14:46 - 14:48
    attributes in the book, and
  • 14:48 - 14:50
    the ident, wherever it
  • 14:50 - 14:52
    went, ident attribute in the author
  • 14:52 - 14:53
    are special attributes, and by
  • 14:53 - 14:54
    the way, they do need to be
  • 14:54 - 14:57
    unique values for those attributes,
  • 14:57 - 14:58
    and they're special in that
  • 14:58 - 15:01
    IDREFS attributes can refer
  • 15:01 - 15:03
    to them, and that will be checked as well.
  • 15:03 - 15:04
    Now, I did want to
  • 15:04 - 15:05
    point out that the book
  • 15:05 - 15:08
    reference here says IDREF, singular.
  • 15:08 - 15:09
    When you have a singular
  • 15:09 - 15:11
    IDREF, then the string has
  • 15:11 - 15:13
    to be exactly one ID value.
  • 15:13 - 15:15
    When you have the plural IDREFS,
  • 15:15 - 15:17
    then the string of the
  • 15:17 - 15:19
    attribute is one or
  • 15:19 - 15:21
    more IDREF values, I'm
  • 15:21 - 15:24
    sorry, one or more ID values, separated by spaces.
  • 15:24 - 15:27
    So it's a little bit clunky, but it does seem to work.
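
    To make the restructured version concrete, here is a sketch of how the second document and its DTD might be written, again as one file with an internal subset. The names and capitalization (BookRef, Ident, and so on), the ISBN and price values, the remark text, and the choice of #REQUIRED for every attribute are assumptions; the use of ID, IDREF, and IDREFS follows the description above.

    ```xml
    <?xml version="1.0"?>
    <!-- Sketch only: names and values are illustrative assumptions. -->
    <!DOCTYPE Bookstore [
      <!-- Zero or more books, followed by zero or more authors. -->
      <!ELEMENT Bookstore (Book*, Author*)>
      <!ELEMENT Book (Title, Remark?)>
      <!-- ISBN is an ID; Authors holds one or more space-separated ID values. -->
      <!ATTLIST Book ISBN    ID     #REQUIRED
                     Price   CDATA  #REQUIRED
                     Authors IDREFS #REQUIRED>
      <!ELEMENT Title (#PCDATA)>
      <!-- Mixed content: text and BookRef elements in any order. -->
      <!ELEMENT Remark (#PCDATA | BookRef)*>
      <!-- Empty element whose only job is to carry a single IDREF. -->
      <!ELEMENT BookRef EMPTY>
      <!ATTLIST BookRef book IDREF #REQUIRED>
      <!ELEMENT Author (First_Name, Last_Name)>
      <!ATTLIST Author Ident ID #REQUIRED>
      <!ELEMENT First_Name (#PCDATA)>
      <!ELEMENT Last_Name (#PCDATA)>
    ]>
    <Bookstore>
      <Book ISBN="ISBN-0001" Price="85" Authors="JU JW">
        <Title>A First Course in Database Systems</Title>
      </Book>
      <Book ISBN="ISBN-0002" Price="100" Authors="HG JU JW">
        <Title>Database Systems: The Complete Book</Title>
        <Remark>Placeholder text mentioning <BookRef book="ISBN-0001"/> in the middle.</Remark>
      </Book>
      <Author Ident="HG">
        <First_Name>Hector</First_Name>
        <Last_Name>Garcia-Molina</Last_Name>
      </Author>
      <Author Ident="JU">
        <First_Name>Jeffrey</First_Name>
        <Last_Name>Ullman</Last_Name>
      </Author>
      <Author Ident="JW">
        <First_Name>Jennifer</First_Name>
        <Last_Name>Widom</Last_Name>
      </Author>
    </Bookstore>
    ```

    Note that ID values have to be XML names, so they cannot start with a digit. That is one reason the placeholder ISBN values here begin with letters.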
  • 15:27 - 15:31
    Now let's go to our command line, and let's validate the document.
  • 15:31 - 15:33
    So the document is in fact valid.
  • 15:33 - 15:34
    That's what it means when we
  • 15:34 - 15:35
    get nothing back, and let's
  • 15:35 - 15:36
    make some changes, as we did
  • 15:36 - 15:39
    before, to explore what structure
  • 15:39 - 15:42
    is imposed and what's checked with this DTD in the presence
  • 15:42 - 15:44
    of IDs and IDREFs.
  • 15:44 - 15:46
    As a first change, let's change
  • 15:46 - 15:48
    this ID, this identifier
  • 15:48 - 15:51
    HG to JU.
  • 15:51 - 15:52
    That should actually cause a couple of problems
  • 15:52 - 15:53
    when we do that. Let's
  • 15:53 - 15:56
    validate the document and see what happens.
  • 15:56 - 15:58
    And we do in fact get two different errors.
  • 15:58 - 16:00
    The first error says that
  • 16:00 - 16:03
    we have two instances of "JU".
  • 16:03 - 16:04
    As you can see here, we
  • 16:04 - 16:06
    now have JU twice, but
  • 16:06 - 16:08
    ID values do have to be unique.
  • 16:08 - 16:10
    They have to be globally unique throughout the document.
  • 16:10 - 16:12
    The second error that occurred
  • 16:12 - 16:14
    when we changed HG to JU
  • 16:14 - 16:17
    is we effectively have a dangling pointer.
  • 16:17 - 16:19
    We refer to HG here
  • 16:19 - 16:21
    in this IDREFS attribute, but there's
  • 16:21 - 16:24
    no longer an element whose ID value is HG.
  • 16:24 - 16:25
    So that's an error as well.
  • 16:25 - 16:27
    So let's change it back to
  • 16:27 - 16:31
    HG just so our document is valid again.
  • 16:31 - 16:34
    Now let's make another change, let's take our book reference.
  • 16:34 - 16:37
    We can see that our book reference is referring to the other book.
  • 16:37 - 16:39
    We're in The Complete Book here,
  • 16:39 - 16:40
    and the comment, the remark, is
  • 16:40 - 16:41
    referring to A First Course
  • 16:41 - 16:44
    through the ISBN number, but let's
  • 16:44 - 16:47
    change this string instead to refer to HG.
  • 16:47 - 16:49
    So now we're actually referring
  • 16:49 - 16:51
    to an author rather than another book.
  • 16:51 - 16:54
    Let's check if the document validates.
  • 16:54 - 16:55
    In fact it does.
  • 16:55 - 16:56
    And that shows that the
  • 16:56 - 16:59
    pointers when you have a DTD are untyped.
  • 16:59 - 17:01
    So it does check to make
  • 17:01 - 17:02
    sure that this is an
  • 17:02 - 17:03
    ID of another element, but we
  • 17:03 - 17:05
    weren't able to specify that
  • 17:05 - 17:07
    it should be a book element
  • 17:07 - 17:08
    in our DTD, and since we're
  • 17:08 - 17:10
    not able to specify it, of
  • 17:10 - 17:11
    course it's not possible to check it.
  • 17:11 - 17:13
    We will see that in XML
  • 17:13 - 17:14
    Schema, we can have typed
  • 17:14 - 17:17
    pointers but it's not possible to have them in DTDs.
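
    Here is a stripped-down sketch of the untyped-pointer behavior just shown. It is not the video's document: the content models are shortened and all names and values are assumptions, chosen only so the example fits in a few lines.

    ```xml
    <?xml version="1.0"?>
    <!-- Sketch only: a reduced, hypothetical schema. -->
    <!DOCTYPE Bookstore [
      <!ELEMENT Bookstore (Book*, Author*)>
      <!ELEMENT Book (BookRef?)>
      <!ATTLIST Book ISBN ID #REQUIRED>
      <!ELEMENT BookRef EMPTY>
      <!-- IDREF says "some ID in this document"; it cannot say "the ID of a Book". -->
      <!ATTLIST BookRef book IDREF #REQUIRED>
      <!ELEMENT Author EMPTY>
      <!ATTLIST Author Ident ID #REQUIRED>
    ]>
    <Bookstore>
      <Book ISBN="ISBN-0001">
        <!-- Meant to point at a book, but HG is an author's ID. -->
        <BookRef book="HG"/>
      </Book>
      <Author Ident="HG"/>
    </Bookstore>
    ```

    A DTD validator should accept this, because HG is an ID somewhere in the document. Changing the value to a string that is not an ID of any element should produce the dangling-pointer error described earlier.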
  • 17:17 - 17:19
    The last change I'm going to
  • 17:19 - 17:20
    show is to add a
  • 17:20 - 17:22
    second book reference within our remark.
  • 17:22 - 17:24
    So as I pointed out over
  • 17:24 - 17:26
    here, when we write PCDATA
  • 17:26 - 17:28
    or an element type
  • 17:28 - 17:29
    followed by the Kleene closure, the
  • 17:29 - 17:31
    zero-or-more star, that
  • 17:31 - 17:34
    means we can freely mix text and sub-elements.
  • 17:34 - 17:39
    So just right in the middle here, let's put a book reference.
  • 17:39 - 17:41
    And we can put, let's say
  • 17:41 - 17:45
    book equals JU, and that
  • 17:45 - 17:46
    will be the end of our reference
  • 17:46 - 17:48
    there and now we
  • 17:48 - 17:50
    see that we have text followed
  • 17:50 - 17:51
    by a subelement followed by more
  • 17:51 - 17:53
    text, and so on.
  • 17:53 - 17:56
    That should validate fine, and in fact it does.
  • 17:56 - 17:58
    That completes our demonstration of
  • 17:58 -
    XML documents with DTDs.
Title:
03-02-dtds-ids-idrefs.mp4
Video Language:
English
Duration:
18:01