-
In this video, we'll see a demonstration of JSON data.
-
As a reminder, JSON stands for
-
JavaScript Object Notation, and
-
it's a standard for writing
-
data objects in a human-readable format, typically in a file.
-
It's useful for exchanging data
-
between programs, and generally
-
because it's quite flexible, it's useful
-
for representing and for storing data that's semi-structured.
-
A reminder of the
-
basic constructs in JSON, we
-
have atomic values, such
-
as integers and strings and so on.
-
And then we have two types of
-
composite things; we have
-
objects that are sets of
-
label-value pairs and then we have arrays that are lists of values.
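As a minimal sketch of these three constructs, the Python snippet below parses a small JSON text with the standard-library `json` module. The sample text and its labels are made up for illustration and are not the course's data file.

```python
import json

# Hypothetical sample: one object holding atomic values and one array.
text = '{"title": "Example", "year": 2009, "inPrint": true, "tags": ["a", "b"]}'

data = json.loads(text)

# A JSON object becomes a Python dict, an array becomes a list,
# and atomic values become str, int, float, bool, or None.
kinds = {label: type(value).__name__ for label, value in data.items()}
print(kinds)
```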
-
In the demonstration, we'll go through
-
in more detail the basic constructs
-
of JSON and we'll look at
-
syntactic correctness, we'll demonstrate
-
the flexibility of the data
-
model and then we'll
-
look briefly at JSON Schema,
-
not widely used yet but
-
still fairly interesting to look at
-
and we'll look at some validation
-
of JSON data against a particular schema.
-
So, here's the JSON
-
data that we're gonna be working with during this demo.
-
It's the same data that appeared
-
in the slides, in the introduction
-
to JSON, but now we're going
-
to look into the components of the data.
-
It's also by the way, the
-
same example pretty much that we
-
used for XML, it's reformatted
-
of course to meet the JSON
-
data model, but you can compare the two directly.
-
Lastly, we do have
-
the file for the data on
-
the website, and I do
-
suggest that you download the
-
file so that you can
-
take a look at it closely on your own computer.
-
All right.
-
So, let's see what we have,
-
right now we're in
-
an editor for JSON data.
-
It happens to be the Eclipse
-
editor and we're going to
-
make some edits to the
-
file after we look through
-
the constructs of the file.
-
So, this is JSON
-
data representing books and
-
magazines, and we have
-
a little more information about our books and our magazines.
-
So, at the outermost, the
-
curly brace indicates that this is a JSON object.
-
And as a reminder, an object
-
is a set of label-value
-
pairs, separated by commas.
-
So, the first element in
-
the object is the label "books"
-
and this big value, and the
-
second (there are only two label-value
-
pairs here) is the
-
label "magazines" and this big value here.
-
And let's take a look first at magazines.
-
So magazines, again, is the
-
label and the value we
-
can see with the square
-
brackets here is an array.
-
An array is a list of
-
values and here we
-
have two values in our array.
-
They're still composite values.
-
So, we have two values, each
-
of which is an object,
-
a set of label-value pairs.
-
Let me mention, sometimes people call these labels 'properties', by the way.
-
Okay. So, now we are inside
-
our 2 objects that are
-
the 2 elements in the array that's the value of magazines.
-
And each one of those has
-
3 labels and 3 values.
-
And now we're finally down to the base values.
-
So, we have the title being "National
-
Geographic", a string, the
-
month being January, a string
-
and the year 2009, where 2009 is an integer.
-
And again, we have
-
another object here that's a different magazine
-
with a different name, month and happens to be the same year.
-
Now, these two have exactly the
-
same structure but they don't
-
have to and we will
-
see that as we start editing the file.
-
But before we edit the file,
-
let's go and look at
-
our books here.
-
The value of our other
-
label-value pair inside the
-
outermost object, "books" is
-
also an array, and
-
the array in this case also
-
has just two elements, so we've represented two books here.
-
It's a little more complicated than the
-
magazines, but those elements
-
are still objects that are label-value pairs.
-
So, we have now the ISBN,
-
the price, the edition, the title,
-
all either integers or strings,
-
and then we have one nested composite
-
value, which is the authors,
-
and that's an array again.
-
So, the array again, is indicated by the square brackets.
-
And inside this array, we
-
have two authors and each
-
of the authors has a first
-
name and a last name,
-
but again, that uniformity is
-
not required by the model itself, as we'll see.
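To make the nesting concrete, here is a small sketch in Python. The excerpt only imitates the shape of the file described above: the exact label spellings and most of the values are assumptions, not the contents of the course's data file.

```python
import json

# Hypothetical excerpt shaped like the data described above; the label
# spellings and most values are assumptions, not the course file.
text = """
{
  "books": [
    {
      "ISBN": "ISBN-0000000000",
      "price": 100,
      "edition": 1,
      "title": "An Example Book",
      "authors": [
        {"first_name": "Jeffrey", "last_name": "Ullman"},
        {"first_name": "Jane", "last_name": "Doe"}
      ]
    }
  ],
  "magazines": [
    {"title": "National Geographic", "month": "January", "year": 2009}
  ]
}
"""

store = json.loads(text)

# Outermost object -> array -> object -> atomic value.
print(store["magazines"][0]["title"])
# Outermost object -> array -> object -> nested array -> object -> atomic value.
print(store["books"][0]["authors"][0]["last_name"])
```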
-
So, as I mentioned,
-
this is actually an editor for
-
JSON data and we're going to come back to this editor in a moment.
-
But what I wanted to do is
-
show the same data
-
in a browser because browsers
-
actually offer some nice features
-
for navigating in JSON.
-
So here we are in the
-
Chrome browser, which has nice
-
features for navigating JSON,
-
and other browsers do as well.
-
We can see here again that we
-
have an object in
-
our JSON data, that consists
-
of two label-value pairs;
-
books and magazines, which are
-
currently closed, and then
-
this plus allows us to open them up, and see the structure.
-
For example, we open magazines
-
and we see that magazines is an array containing two objects.
-
We can open one of those
-
objects and see the three label-value pairs.
-
Now we're at the lowest levels and similarly for the other object.
-
We can see here that Books
-
is also an array, and we go ahead and open it up.
-
It's an array of two objects.
-
We open one of those
-
objects and we see again
-
the set of label-value pairs,
-
where one of the values
-
is a further nesting.
-
It's an array and we open
-
that array, and we see
-
two objects, and we open
-
them and finally see the data at the lowest levels.
-
So again, the browser
-
here gives us a nice way
-
to navigate the JSON data and see its structure.
-
So now we're back to our JSON editor.
-
By the way, this editor, Eclipse, does
-
also have some features for
-
opening and closing the structure
-
of the data, but it's
-
not quite as nice as the browser that we use.
-
So we decided to use the browser instead.
-
What we are going to
-
use the editor for is to
-
make some changes to the
-
JSON data and see which
-
changes are legal and which aren't.
-
So, let's take a look at the first change, a very simple one.
-
What if we forgot a comma.
-
Well, when we try to
-
save that file, we get a
-
little notice that we have an
-
error, we expected an
-
N value, so that's a
-
pretty straightforward mistake, let's put that comma back.
-
Let's say we insert an
-
extra brace somewhere here, for whatever reason.
-
We accidentally put in an extra brace.
-
Again we see that that's marked as an error.
-
So an error that can
-
be fairly common to make is
-
to forget to put quotes around strings.
-
So, for example, this ISBN
-
number here, if we don't quote it, we're gonna get an error.
-
As we'll see the only things that can
-
be unquoted are numbers and
-
the values null, true and false.
-
So, let's put our quotes back there.
-
Now, actually, even more
-
common is to forget to
-
put quotes around the labels in label-value pairs.
-
But if we forget to quote that, that's going to be an error as well.
-
You might have noticed, by the
-
way, when we use the browser
-
that the browser didn't even show
-
us the quotes in the labels.
-
But when you write
-
the raw JSON data, you do need to include those quotes.
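The same mistakes can be reproduced outside the editor. This is only a sketch using Python's standard `json` parser, not the Eclipse editor from the demo, and the sample strings are made up; the error messages you see in another tool will be worded differently.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Hypothetical one-line samples, each showing one mistake from above.
samples = {
    "well formed":     '{"ISBN": "ISBN-1", "price": 100}',
    "missing comma":   '{"ISBN": "ISBN-1" "price": 100}',
    "extra brace":     '{"ISBN": "ISBN-1", "price": 100}}',
    "unquoted string": '{"ISBN": ISBN-1, "price": 100}',
    "unquoted label":  '{ISBN: "ISBN-1", "price": 100}',
}

results = {name: is_valid_json(text) for name, text in samples.items()}
for name, ok in results.items():
    print(name, "->", "valid" if ok else "error")
```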
-
Speaking of quotes, what if we quoted our price here.
-
Well that's actually not an
-
error, because now we've simply turned
-
price into a string, and
-
string values are perfectly well allowed anywhere.
-
Now we'll see when we use
-
JSON Schema that we
-
can make restrictions that don't allow
-
strings in certain places, but
-
just for syntactic correctness of
-
JSON data any of our values can be strings.
-
Now, as I mentioned, there are
-
a few values that are
-
sort of reserved words in JSON.
-
For example, true is a
-
reserved word for a Boolean value.
-
That means we don't need to
-
quote it because it's actually
-
its own special type of value.
-
And so is false.
-
And the third one is null,
-
so there's a built-in concept of null.
-
Now, if we wanted to
-
use nil for whatever reason
-
instead of null, well, now
-
we're going to get an error because
-
nil is not a reserved word,
-
and if we really wanted nil
-
then we would need to actually make it a quoted string.
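A short sketch of the reserved words, again with Python's standard `json` module. The labels here are placeholders chosen for the example.

```python
import json

# Unquoted true, false, and null are built-in values; the quoted
# price is simply a string, which is syntactically fine.
parsed = json.loads('{"a": true, "b": false, "c": null, "price": "100"}')
print(parsed)

# An unquoted nil is not a JSON value, so parsing fails.
try:
    json.loads('{"c": nil}')
    nil_ok = True
except json.JSONDecodeError:
    nil_ok = False

# Quoting it turns it into an ordinary string, which is allowed.
quoted = json.loads('{"c": "nil"}')
print(nil_ok, quoted)
```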
-
Now, let's take a look inside our author list.
-
And I'm going to show you
-
that arrays do not have
-
to have the same type of
-
value for every element in the array.
-
So here we have a homogeneous
-
list of authors. Both of them
-
are objects with a first
-
name and a last name as
-
separate label-value pairs,
-
but if I change that
-
first one, the entire value
-
to be, instead of a
-
composite one, simply the string,
-
Jeffrey Ullman. Oops, sorry
-
about my typing there, and that
-
is not an error, it
-
is allowed to have a string,
-
and then a composite object.
-
And we could even have an array, and anything we want.
-
In an array, when you
-
have a list of values, all
-
you need is for each one
-
to be syntactically a correct value in JSON.
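Here is a small sketch of a mixed array. The second author's name is a placeholder, and the extra elements are there only to show that any syntactically correct value is accepted.

```python
import json

# A string, an object, an array, a number, and null in one array.
text = '["Jeffrey Ullman", {"first_name": "Jane", "last_name": "Doe"}, [1, 2], 3, null]'

authors = json.loads(text)
element_types = [type(value).__name__ for value in authors]
print(element_types)
```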
-
Now let's go visit our magazines
-
for a moment here and let
-
me show that empty objects are okay.
-
So a list of label
-
value pairs, comprising an object, can be the empty list.
-
And so now I've turned this magazine
-
into having no information about
-
it, but that is legal in JSON.
-
And similarly, arrays are allowed to be of zero length.
-
So I can take these authors
-
here and I can just take
-
out all of the authors, and
-
make that an empty list, but that's still valid JSON.
-
Now, what if I took this array out altogether?
-
In that case, now we
-
have an error because this is
-
an object where we have
-
label-value pairs and every
-
label-value pair has to
-
have both a label and a value.
-
So let's put our array back
-
and we can have anything in
-
there so let's just make it
-
"foo", and that corrects the error.
-
What if we didn't want an
-
array here, and instead we
-
tried to make it, say, an object?
-
Well, we're going to see an
-
error there. As a reminder,
-
and this is an
-
easy mistake to make, objects
-
are always label-value pairs.
-
So if you want just a value,
-
that should be an array if
-
you want an object, then we're
-
talking about a label-value pair, so
-
we can just add "foo" as
-
our value, and then we're all set.
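The empty and missing cases can be checked the same way. This is a sketch with made-up one-line samples, using Python's standard `json` parser rather than the editor in the demo.

```python
import json

def parses(text):
    """Return True if text is syntactically valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

checks = {
    "empty object":            parses('{"magazines": [{}]}'),
    "empty array":             parses('{"authors": []}'),
    "label with no value":     parses('{"authors": }'),
    "array holding a value":   parses('{"authors": ["foo"]}'),
    "object holding a value":  parses('{"authors": {"foo"}}'),
    "object with label-value": parses('{"authors": {"name": "foo"}}'),
}

for name, ok in checks.items():
    print(name, "->", "valid" if ok else "error")
```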
-
So what we've seen so far is syntactic correctness.
-
Again, there's no required
-
uniformity across values in
-
arrays or in the
-
label-value pairs in objects we
-
just need to ensure that
-
all of our values, our basic
-
values, are of the right types,
-
and things like our commas and
-
curly braces are all in place.
-
What we're gonna do next is look
-
at JSON Schema, where we
-
have a mechanism for enforcing certain
-
constraints beyond simple syntactic correctness.
-
If you've been very observant, you
-
might even have noticed that we
-
have a second tab up
-
here in our editor for a
-
second JSON file, and this file
-
is going to be the schema
-
for our bookstore data. We're using
-
JSON schema, and JSON
-
schema, like XML Schema,
-
is expressed in the data model itself.
-
So, our schema description for
-
this JSON data is itself
-
JSON data, and here it is.
-
And it's going to take a bit of time to explain.
-
Now the first thing that you might
-
notice is wow, the schema
-
looks more complicated and in
-
fact longer than the data itself.
-
Well, that is true, but that's mostly because our data file is tiny.
-
So, if we had thousands, you know, tens
-
of thousands of books and magazines,
-
our schema file wouldn't
-
change, but our data file would
-
be much longer and that's the typical case, in reality.
-
Now, this video is not a
-
complete tutorial about JSON Schema.
-
There are many constructs in JSON
-
Schema that weren't needed to
-
describe the bookstore data, for example.
-
And even this file here,
-
I'm not gonna go through every detail of it right here.
-
You can download the file and
-
take a look, read a little more about JSON schema.
-
I'm just going to give the
-
flavor of the schema
-
specification and then we're
-
going to work with validating the data
-
itself to see how the schema and data work together.
-
But to give you the flavor here, let's go through at least some portions of the schema.
-
So, in some sense,
-
the structure of the schema file
-
reflects the structure of the data file that it's describing.
-
So, the outermost constructs in
-
the schema file are the
-
outermost in the data file and
-
as we nest it parallels the nesting.
-
Let me just show a little
-
bit here, we'll probably look at most of it in the context of validation.
-
So, we see here that our outermost construct in our data file is an object.
-
And that's told to us,
-
because we have "type" as
-
one of our built-in labels for the schema.
-
So we have an
-
object with two properties, as
-
we can see here, the books property
-
and the magazines property.
-
And I use the word
-
"labels" frequently for label-value
-
pairs; that's synonymous with property-value pairs.
-
Then inside the books property
-
for example, we see that
-
the type of that is array,
-
so we've got a label-value pair where the value is an array.
-
And then we follow the nesting and see that it's an array of objects.
-
And we go further down and we
-
see the different label-value pairs
-
of the object that make up
-
the books and nesting further into the authors and so on.
-
We see similarly for magazines
-
that the value of the
-
label-value pair for
-
magazines is an array, and
-
that array consists of objects with further nesting.
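To give a feel for how the schema parallels the data, here is a hypothetical skeleton written in the style described above, plus a small Python helper that walks its nesting. The property names and the helper `outline` are assumptions made for this sketch; the real schema file has more detail.

```python
import json

# Hypothetical skeleton of the schema; property names are assumptions.
schema_text = """
{
  "type": "object",
  "properties": {
    "books": {
      "type": "array",
      "items": {"type": "object", "properties": {"title": {"type": "string"}}}
    },
    "magazines": {
      "type": "array",
      "items": {"type": "object", "properties": {"title": {"type": "string"}}}
    }
  }
}
"""
schema = json.loads(schema_text)

def outline(node, depth=0):
    """Collect (depth, type) pairs, following properties and items."""
    rows = [(depth, node.get("type"))]
    for sub in node.get("properties", {}).values():
        rows += outline(sub, depth + 1)
    if "items" in node:
        rows += outline(node["items"], depth + 1)
    return rows

rows = outline(schema)
for depth, kind in rows:
    print("  " * depth + str(kind))
```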
-
So what we're looking at here is
-
an online JSON schema validator. We have two windows.
-
On the left we have our
-
schema and on the
-
right we have our data, and
-
this is exactly the same data
-
file and schema file that we were looking at earlier.
-
If we hit the validate button,
-
hopefully everything should work and it does.
-
This tells us that the
-
JSON data is valid with respect to the schema.
-
Now, this system will of
-
course find basic syntactic errors
-
so I can take away a comma
-
just like I did before and
-
when I validate I'll get a
-
parsing error that really has nothing to do with the schema.
-
What I'm going to focus on
-
now is actually validating
-
semantic correctness of the JSON
-
with respect to the constructs
-
that we've specified in this schema.
-
Let me first put that comma back so we start with a valid file.
-
So, the first thing I'll show is
-
the ability to constrain basic
-
types, and then the ability
-
to constrain the range of values of those basic types.
-
And let's focus on price.
-
So here we're talking about the
-
price property inside books and
-
we specify in our schema
-
that the type of the price must be an integer.
-
So, for example, if our
-
price were instead a string
-
and we went ahead and try
-
to validate that we would get an error.
-
Let's make it back into an
-
integer but let's make
-
it into the integer 300 now instead of 100.
-
And why am I doing that?
-
Because the JSON schema also
-
lets me constrain the range of
-
values that are allowed if we have a numeric value.
-
So, not only in price did I
-
say that it's an integer but
-
I also said that it
-
has a minimum and maximum value,
-
the integer of prices must
-
be between 0 and 200.
-
So, if I try to make
-
the price of 300, and I
-
validate, I'm again getting an error.
-
Now it's not a type error,
-
but it's an error that my
-
integer was outside of the allowed range.
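The two kinds of error can be told apart with a hand-written check. This is only a sketch of the idea, not the online validator used in the demo; the function `check_integer` and the exact spelling of the schema fragment are assumptions.

```python
# Hypothetical schema fragment for the price property.
price_schema = {"type": "integer", "minimum": 0, "maximum": 200}

def check_integer(value, schema):
    """Return an error message, or None if the value satisfies the fragment."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        return "type error: expected integer"
    if "minimum" in schema and value < schema["minimum"]:
        return "range error: below minimum"
    if "maximum" in schema and value > schema["maximum"]:
        return "range error: above maximum"
    return None

for price in (100, "100", 300):
    print(repr(price), "->", check_integer(price, price_schema))
```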
-
I've put the price back to
-
a hundred, and now let's
-
look at constraints on string values.
-
JSON schema actually has
-
a little pattern matching language that
-
can be used to constrain the
-
allowable strings for a specific type of value.
-
We'll look at ISBN number here as an example of that.
-
We've said that ISBN is
-
of type string, and then
-
we've further constrained in the
-
schema that the string values for
-
ISBN must satisfy a certain pattern.
-
I'm not gonna go into the details of this pattern-matching language.
-
I'm just gonna give an example.
-
And in fact, this entire demo is
-
really just an example; there are lots of
-
things in JSON Schema that we're not seeing.
-
What this pattern here says is
-
that the string value for
-
ISBN must start with
-
the four characters ISBN and then can be followed by anything else.
-
So, if we go over to our
-
data and we look at
-
the ISBN number here and
-
say we have a typo, we
-
forgot the "I" and we try to validate.
-
Then we'll see that our data
-
no longer matches our schema specification.
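Patterns in JSON Schema are regular expressions, so the check can be sketched with Python's `re` module. The pattern `^ISBN` is an assumption that captures "starts with the four characters ISBN"; the pattern in the actual schema file may be written differently, and the ISBN values below are placeholders.

```python
import re

# Hypothetical schema fragment for the ISBN property.
isbn_schema = {"type": "string", "pattern": "^ISBN"}

def matches_pattern(value, schema):
    """True if value is a string and the pattern is found in it."""
    return isinstance(value, str) and re.search(schema["pattern"], value) is not None

print(matches_pattern("ISBN-0000000000", isbn_schema))
print(matches_pattern("SBN-0000000000", isbn_schema))
```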
-
Now let's look at some other constraints we can specify in JSON Schema.
-
We can constrain the number of elements in an array.
-
We can give a minimum or maximum or both.
-
And I've done that here in the context of the authors array.
-
Remember the authors are
-
an array that's a list of
-
objects and here I've said that
-
we have a minimum number of
-
items of 1 and a
-
maximum number of items of 10.
-
In other words, every book
-
has to have between one and ten authors.
-
So let's try, for example,
-
taking out all of our authors here in our first book.
-
We actually looked at this before in terms
-
of syntactic validity, and it
-
was perfectly valid to have an empty array.
-
But when we try to validate
-
now we do get an
-
error, and the reason is
-
that we said that we needed
-
between one and ten array elements in the case of authors.
-
Now let's fix that,
-
not by putting our authors back
-
but let's say we actually decide
-
we would like to be able to have books that have no authors.
-
So, we can simply fix
-
that by changing that minimum
-
item to zero and that
-
makes our data valid again and
-
in fact, we could actually take that
-
minimum constraint out all together,
-
and if we do that our data is still going to be valid.
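The array-length constraint can be sketched like this. The helper `check_items` is hypothetical and only mimics what a validator does with `minItems` and `maxItems`; a missing minimum is treated as zero.

```python
def check_items(array, schema):
    """Return an error message, or None if the array length is allowed."""
    count = len(array)
    if count < schema.get("minItems", 0):
        return "too few items"
    if "maxItems" in schema and count > schema["maxItems"]:
        return "too many items"
    return None

no_authors = []

# Between one and ten authors: the empty array is rejected.
print(check_items(no_authors, {"minItems": 1, "maxItems": 10}))
# Minimum lowered to zero, or removed altogether: the empty array is fine.
print(check_items(no_authors, {"minItems": 0, "maxItems": 10}))
print(check_items(no_authors, {"maxItems": 10}))
```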
-
Now let's see what happens when we
-
add something to our data that isn't mentioned in the schema.
-
If you look carefully you'll see
-
that everything that we have
-
in the data so far has been specified in the schema.
-
Let's say we come along
-
and decide we're gonna also have ratings for our books.
-
So let's add here a
-
rating label property with the value 5.
-
We go ahead and validate, you
-
probably think it's not going to
-
validate properly but actually it did.
-
The definition of JSON
-
schema is that it can constrain things by
-
describing them but you
-
can also have components in
-
the data that aren't present in this schema.
-
If we want to insist
-
that every property that is
-
present in the data is
-
also described in this
-
schema, then we can
-
actually add a constraint to the schema that tells us that.
-
Specifically, under the object
-
here, we can put in
-
a special flag which itself
-
is specified as a label called additional properties.
-
And this flag if we
-
set it to false and remember
-
false is actually a keyword
-
in JSON, tells us
-
that in our data we're not
-
allowed to have any properties
-
beyond those that are specified in the schema.
-
So now we validate and we
-
get an error, because the property
-
rating hasn't been defined in the schema.
-
If additional properties is missing,
-
or has the default value
-
of "true", then the validation goes through.
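Here is a rough sketch of that flag's effect. The helper `extra_properties` is hypothetical and handles only the true/false form of the flag discussed here; the property names are assumptions.

```python
def extra_properties(obj, schema):
    """List properties in obj that the schema does not describe,
    unless the schema allows additional properties (the default)."""
    if schema.get("additionalProperties", True):
        return []
    return sorted(set(obj) - set(schema.get("properties", {})))

# Hypothetical fragment of the book schema and a book with a rating.
book_schema = {"type": "object",
               "properties": {"title": {"type": "string"},
                              "price": {"type": "integer"}}}
book = {"title": "An Example Book", "price": 100, "rating": 5}

print(extra_properties(book, book_schema))  # flag missing: nothing reported

strict_schema = dict(book_schema, additionalProperties=False)
print(extra_properties(book, strict_schema))  # rating is reported
```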
-
Now let's take a look at our authors that are still here.
-
Let's suppose that we don't
-
have a first name for our middle author here.
-
If we take that away and
-
we try to validate, we do
-
get an error, because we specified
-
in our schema, and it's right
-
down here, that author objects must
-
have both a first name and a last name.
-
It turns out that we can
-
specify for every property that the property is optional.
-
So, we can add to the
-
description of the first
-
name, not only that the
-
type is a string but that that
-
property is optional so we
-
say optional, true.
-
Now let's validate, and now we're in good shape.
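A sketch of the rule as it is described here: every listed property must be present unless it is marked optional. The helper `missing_properties` and the property names are assumptions. Later drafts of JSON Schema express the same idea differently, with a `required` list on the object.

```python
def missing_properties(obj, schema):
    """List described properties that are absent from obj and not optional."""
    return [name
            for name, sub in schema.get("properties", {}).items()
            if not sub.get("optional", False) and name not in obj]

# Hypothetical author schema in the style used in this demo.
author_schema = {"type": "object",
                 "properties": {"first_name": {"type": "string"},
                                "last_name": {"type": "string"}}}
author = {"last_name": "Ullman"}

print(missing_properties(author, author_schema))

# Same schema, but with first_name marked optional.
relaxed_schema = {"type": "object",
                  "properties": {"first_name": {"type": "string", "optional": True},
                                 "last_name": {"type": "string"}}}
print(missing_properties(author, relaxed_schema))
```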
-
Now, let's take a look
-
at what happens when we have
-
an object that has more than
-
one instance of the same label or same property.
-
So let's suppose, for example, in
-
our magazine, the magazine
-
has two different years, 2009 and 2011.
-
This is syntactically valid JSON;
-
it meets the structure of having a list of label-value pairs.
-
When we validate it, we
-
see that we can't add a second property, year.
-
So this validator doesn't permit
-
two copies of the same
-
property, and it's actually kind
-
of a parsing thing and not
-
so much related to JSON Schema.
-
Many parsers actually do enforce
-
that labels or properties need
-
to be unique within objects, even
-
though technically syntactically correct
-
JSON does allow multiple copies.
-
So that's just something to remember,
-
the typical use of objects is
-
to have unique labels; they're sometimes
-
even called keys, which evokes the concept of them being unique.
-
So typically they are unique.
-
They don't have to be for syntactic validity.
-
Usually when you wanna have
-
repeated values, it actually makes more sense to create an array.
-
I've taken away the second year in order to make the JSON valid again.
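Parsers differ on repeated labels. As one illustration, Python's standard `json` module accepts them and keeps the last value, and a small hook can be added to reject them instead. The function name `reject_duplicates` is made up for this sketch.

```python
import json

text = '{"title": "National Geographic", "year": 2009, "year": 2011}'

# The default parser accepts the text and keeps the last value seen.
last_year = json.loads(text)["year"]
print(last_year)

def reject_duplicates(pairs):
    """Build a dict from label-value pairs, refusing repeated labels."""
    seen = {}
    for label, value in pairs:
        if label in seen:
            raise ValueError(f"duplicate label: {label}")
        seen[label] = value
    return seen

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```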
-
Now let's take a look at months.
-
I've used months to illustrate
-
the enumeration constraint so we
-
saw that we could constrain the
-
values of integers, and we
-
saw that we can constrain strings
-
using a pattern, but we can
-
also constrain any type by
-
enumerating the values that are allowed.
-
So, for the month, we've set
-
it to a string type, which it
-
is but we've further constrained it
-
by saying that string must be
-
either January or February.
-
So, if we try to say
-
put in the string March, we
-
validate and we get the obvious error here.
-
We can fix that by changing the
-
month back, but maybe it
-
makes more sense that March
-
would be part of our enumeration type,
-
so we'll add March to
-
the possible values for months, and now we're good.
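The enumeration check is simple to sketch. The helper `in_enum` is hypothetical; it just tests membership in the list of allowed values.

```python
# Hypothetical schema fragment for the month property.
month_schema = {"type": "string", "enum": ["January", "February"]}

def in_enum(value, schema):
    """True if value is one of the enumerated values."""
    return value in schema["enum"]

print(in_enum("January", month_schema))
print(in_enum("March", month_schema))

# Adding March to the enumeration makes the same data valid.
wider_schema = {"type": "string", "enum": ["January", "February", "March"]}
print(in_enum("March", wider_schema))
```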
-
As a next example, let's take
-
a look at something that we
-
saw was syntactically correct but
-
isn't going to be semantically
-
correct, which is when
-
we have the author list
-
be a mixture of objects and strings.
-
So, let's put Jeffrey Ullman here just as a string.
-
We saw that that was still
-
valid JSON, but when we
-
try to validate now, we're gonna
-
get an error because we expected
-
to see an object, we have
-
specified that the authors
-
are objects, and instead we got a string.
-
Now JSON schema does allow
-
us to specify that we
-
can have different types of data
-
in the same context, and I'm
-
going to show that with a little bit of a simpler example here.
-
So, let's first take away our
-
author there so that we're back with a valid file.
-
And what I am going to look at is simply the year values.
-
So, let's suppose for whatever
-
reason that in our
-
magazines, one of the
-
years was a string and the other year was an integer.
-
So that's not gonna work out
-
right now because we have
-
specified clearly that the year must be an integer.
-
In JSON schema specifications, when we
-
want to allow multiple types
-
for values that are
-
used in the same context, we
-
actually make the type be an array.
-
So instead of just saying
-
integer, if we put
-
an array here that has
-
both integer and string that's
-
telling us that our year
-
value can be either an
-
integer or a string
-
and now when we validate,
-
we get a correct JSON file.
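A sketch of how a type given as an array can be read: the value is accepted if it matches any of the listed types. The table `TYPE_CHECKS` and the helper `type_ok` are assumptions for this example and cover only the two types used here.

```python
# Only the two types needed for this example are covered.
TYPE_CHECKS = {
    # bool is a subclass of int in Python, so exclude it explicitly.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}

def type_ok(value, schema):
    """True if value matches the declared type, or any type in a list of types."""
    declared = schema["type"]
    names = declared if isinstance(declared, list) else [declared]
    return any(TYPE_CHECKS[name](value) for name in names)

single = {"type": "integer"}
union = {"type": ["integer", "string"]}

for year in (2009, "2009"):
    print(repr(year), type_ok(year, single), type_ok(year, union))
```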
-
That concludes our demo of JSON schema validation.
-
Again, we've just seen
-
one example with a number
-
of the constructs that are available
-
in JSON schema, but it's not
-
nearly exhaustive, there are many
-
others, and I encourage you
-
to read a bit more about it.
-
You can download this data and
-
this schema as a starting
-
point, and start adding things and playing around,
-
and I think you'll get a
-
good feel for how JSON
-
schema can be used to
-
constrain the allowable data in a JSON file.