-
So here's how the deal is going to work.
Official computer science terminology.
-
Alright, so, I've got this signal. And
to make it digital, what I want to do
-
is I want to take the signal in with, like,
[inaudible], the digital equivalent
-
of a microphone. I'm going to take in the
sound signal. And what I want to do is
-
reduce it basically to a series of
numbers. So then it looks like the RGB data
-
that we played with, so that once I've got
those numbers, I could put them in a file
-
or send them over a network or whatever. So
that process is called digitization:
-
taking in analog, reducing it to numbers. So
I'm going to show you how that works. So
-
the way digitization is done is,
I've got this signal, which I've drawn sort of
-
big, and I want to match it. So this
signal is analog. Let's say this is
-
really in the air and that is really
perfectly what the signal looks like, and
-
so I want to try and capture that signal.
So, the way that this is done with
-
digitization is called sampling. And what
I'm going to do is I'm going to run a
-
little system which is going to measure
what the height of the curve is, very
-
quickly, over time. And so I'm going to use
as my example the audio CD format. So the
-
audio CD format digitizes sound, and it
samples 44,100 times per second. So I've drawn the
-
samples fairly coarsely on the curve here,
but in reality, for sounds you can actually
-
hear, the samples will be very tightly
spaced against the curve, so it's going to
-
come out pretty good. Alright, so here's
what sampling does. Let's say this is
-
going to be my first sample here. The way
it works, there's going to be a notional kind of zero
-
line going down the middle here. All right?
So that's going to be sort
-
of the baseline. And the way these sounds
work is, I've always talked about them
-
going above the zero line, but actually
half the time they're below. So, we
-
record it as a series of both positive and
negative numbers. So, the way that samples
-
work: let's say I start recording right
here, this is going to be my first sample.
-
So what I'm going to do is I'm going to
look at, like, well, how high is this above
-
the zero line, and I'm going to measure it.
And I'm just going to have some scale
-
where, let's say, way up here, for a really
loud noise, the way audio CDs work,
-
about 32,000 would be the max. And
so, on that scale, maybe this is kind of a
-
not very loud sound. So I'm just going to
measure on that scale: where
-
is that number? So, let's say that one
turns out to be 1003. So I'm going to
-
record that number. Okay, that was
1003, that's my first sample. So then,
-
1/44,100 of a second later, the curve... now,
in this example, the curve is
-
moving way farther than it would for
real sound here, just to show. So let's
-
say for this next sample I gauge the
height of that, like, that's about 1720, so
-
I'm going to record that number. That's
going to be my second sample. Now here's
-
my third sample, oh, 1939, and my next
sample, you know, 2102, and so on. So, I
-
just keep sampling this thing over time.
Now, up here it's going to be a bunch of positive
-
numbers, but actually down here can be a
bunch of negative numbers, that's fine.
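
A minimal Python sketch of the sampling step just described: measure the height of the curve at a fixed rate and record each height as a whole number. The `tone` function is a hypothetical stand-in for the analog signal, and the names `sample_signal`, `SAMPLE_RATE`, and `MAX_LEVEL` are illustrative assumptions, not anything from the lecture.

```python
import math

SAMPLE_RATE = 44_100   # audio CD: samples per second
MAX_LEVEL = 32_767     # largest value a signed two-byte sample can hold


def sample_signal(signal, num_samples, sample_rate=SAMPLE_RATE):
    """Measure signal(t) every 1/sample_rate seconds and round to an integer."""
    samples = []
    for n in range(num_samples):
        t = n / sample_rate                  # time of the n-th sample, in seconds
        height = signal(t)                   # "analog" height, assumed in -1.0..1.0
        level = round(height * MAX_LEVEL)    # put it in the nearest whole-number bucket
        # Clamp so a too-loud input still fits in the two-byte range.
        samples.append(max(-MAX_LEVEL - 1, min(MAX_LEVEL, level)))
    return samples


def tone(t):
    # Hypothetical input: a fairly quiet 441 Hz sine wave.
    return 0.1 * math.sin(2 * math.pi * 441 * t)


samples = sample_signal(tone, 100)   # one full cycle of the tone
print(samples[:5])
```

The list contains both positive and negative numbers, since the wave spends half its time below the zero line.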
-
The result is I take the signal in and I just
reduce it to this, like, [inaudible] of
-
numbers. So this is not a
perfect process, but it does work very
-
well. There are two sources of problems.
One source is, let's say that I was getting
-
this very first sample and I've said, well,
let's call that 1003. What if really
-
that signal was, like, a little bit
higher than 1003 but it wasn't quite so
-
far as 1004? But in my system I'm stuck
picking one of those two numbers. It's
-
either 1003 or 1004 in my system, and so
there has maybe been a tiny error there as
-
I kind of put it in the nearest bucket. And
audio CDs... a little bit of detail, but I
-
mentioned, so they use numbers
roughly between -32,000 and +32,000. And
-
the reason is, that's the
range of numbers you can store in two
-
bytes. So, the byte and its 8 bits
come back here. And so it turns out, on an
-
audio CD, for example, one sample is stored in two
bytes. And so that kind of gives us
-
how many different buckets,
you know, how many distinct numbers we can
-
have. Alright. So that's a small
[inaudible] I should just mention. So,
-
audio CDs, I think by most accounts, sound
fantastic. Like, you know, whatever, the
-
"error" I was mentioning here is, like,
well, it's pretty small. The other source
-
of error, which also for audio CDs is not
a problem, is, like, what if the signal
-
had some big excursion, like it went way up
and then it came down before my next
-
sample came in? I could miss
something. And it turns out for audio CDs,
-
just for the range of sound, a sound that
did that on an audio CD would be outside of human
-
hearing. So, for the most part,
that's not really a problem. All right, so
-
those are the two sources of technical
imperfection, but in reality this works
-
really well for [sound]. Alright, so what I
have done is I've taken in the signal and
-
now what I have is just, like, there's a
lot of numbers, and you can put them
-
in a file or burn them onto a CD or
whatever. So I think that this leads to
-
the natural question of, okay, well, how
does playback work? Well, you don't just want
-
to look at those numbers and be like, yeah.
No, of course you wanted... Alright, so
-
here's how it's going to work. It's just going to be the
reverse process. So, this is called
-
digital-to-analog conversion, and there's a
piece of hardware, a chip that's
-
specialized to do just this. What the
digital-to-analog converter is going to do is
-
it's going to take in the numbers and, I
swear, the phrase that comes to mind for
-
this is connect the dots. So, what this
thing wants to do is to actually make a
-
pattern of electricity that exactly
follows the original signal, and then
-
that electricity we can feed to a
speaker, and the speaker will make it back
-
into sound. So, what the
digital-to-analog converter is going to do is it's
-
going to look at the first number, 1003.
And so it puts a dot, like, that far
-
above the line. Like, okay, 1003, got it. And
then let's look at the next sample, oh,
-
1720, all right. So, let's kind of put a
dot at that height, and then it's going to
-
basically draw a line, so like, okay,
I'll connect those first two. And then it
-
takes the next sample and the next one and
the next one, and you can see we get
-
these dots, and I actually drew
these straight lines between those
-
samples. As you can see, even though the
original was in some sense curved, the
-
straight lines put together do
basically capture the shape of the sound.
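
The connect-the-dots picture can be sketched in Python as straight-line interpolation between consecutive samples. This is only an illustration of the drawing described here, using the four sample values from the example; the function name `connect_the_dots` and the `steps_between` parameter are assumptions, and a real converter chip does its work in hardware and not necessarily with straight lines.

```python
def connect_the_dots(samples, steps_between=4):
    """Return points along straight lines drawn between consecutive samples."""
    points = []
    for a, b in zip(samples, samples[1:]):
        for k in range(steps_between):
            # k/steps_between of the way from sample a to sample b
            points.append(a + (b - a) * k / steps_between)
    points.append(float(samples[-1]))   # finish on the last dot
    return points


curve = connect_the_dots([1003, 1720, 1939, 2102])
print(curve[:5])
```

Every original sample appears in the output unchanged, with evenly spaced in-between points filling the gaps.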
-
And as I was saying, for audio CDs, because
really the samples are so close together,
-
the straight lines work very well.
So I [inaudible] so the effect of going
-
through these numbers is we're able to
recreate the original signal. So this goes
-
the other direction: take in a bunch of
numbers, recreate the signal beautifully,
-
and then we can put that in the speaker,
and now you're hearing what the original
-
recorded sound was. So, that's how it
works. And also, I'll just sort of mention the
-
chip that does that, the D-to-A converter. I
see that sometimes in marketing materials
-
for, like, stereos or MP3 players: you know,
"extra awesome D-to-A converter." So you may
-
actually, just as proof I'm not making this
up, you may see that on, like, a
-
box or something that's nearby. Alright,
those are the two directions: sampling to
-
record it and D-to-A conversion to play it
back. So I think the natural question here
-
would be: why? Why go to all this trouble?
Like, my little telephone
-
diagram was actually simpler, right? I
just have the microphone; I hook that to
-
the wire to the speaker. So what is the
advantage of, as we
-
would say technically, taking the data and
putting it in the digital domain? It used
-
to be in the sound domain, and then it
was in the electricity domain, but now I've
-
put it in the digital domain of just numbers,
and we're going to see there's actually a
-
lot of advantages to having the data just
be in the digital domain. So, I'll talk
-
about those. All right, so the first thing
I want to think about is errors. It's
-
going to be one big advantage for digital.
And so, I described that I've
-
recorded this thing as numbers, 1003, 1720,
or whatever. You know, like we covered
-
earlier, really, when I say numbers,
that means in a computer it's just
-
the zeros and ones. It's just that, you
know, for the 1720, there is just a pattern of
-
zeros and ones to represent that. It
does take two bytes to do it, but it does
-
work. So, an audio CD: at the end of the
day, what's recorded on an audio CD is
-
really the pattern of zeros and ones that
makes up the series of numbers that then
-
feed into the D-to-A converter to make
the music. So, if you got a
-
microscope and actually looked at the CD, there
really are peaks and valleys. Like,
-
there's a physicality to the ones and the
zeros that you can actually see. Well,
-
in the microscope we can see. Alright, so,
let's think about
-
audio CD playback. So, the way it
works is there's a laser that's pointed at
-
the CD, and it's trying to kind of pull
off the pattern of ones and zeros so it
-
can remake the sample numbers so that you
can make the music. So, what is that going
-
to look like? And we talked about this a
little bit before in my networking section,
-
and I would say, well, it's going to look
kind of like this. The laser
-
is looking at the CD and, let's say, the
first number it sees is a one, you know,
-
I'll say that it's a one. And then the next
thing it sees is a zero. And then there's
-
a couple of ones and then a zero. So, it's
reading the ones and zeros off the CD.
-
That's what it looks like in the abstract.
Is that really what it's going to look
-
like coming off the CD? If I hook an
oscilloscope up to the laser and look at
-
the electricity coming out, is it going to
look like that, with perfect 90-degree
-
corners, right? No, there's going to be
noise. Right? All the little wires and
-
magnets and lasers, and the CD is probably
jiggling a little bit while it's spinning.
-
Here's what it's going to look like. It's
going to look like that. You've got the
-
ones and the zeros in there, but just as we
had for the analog scheme, there is noise
-
crowding up on top of our signal. That's
what we actually get. We get back signal
-
with noise. All right, now, this is
a little bit of a punchline moment
-
here. Suppose you're the CD player and
you're like, oh, no. My signal off the CD,
-
[laugh], it's got all this noise. [laugh]
But look. Do you have any problem picking
-
the ones and zeros out of there? No.
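
A small Python sketch of why the noise does not matter here. It assumes the ones and zeros arrive as voltages near 1.0 and 0.0, that the noise stays smaller than half the gap between them, and that the reader simply compares each reading to a halfway threshold. The names `add_noise` and `read_bits`, the noise level, and the threshold are all illustrative assumptions, not a description of how an actual CD player is built.

```python
import random


def add_noise(bits, noise_level=0.3, seed=0):
    """Simulate a noisy read: each bit's voltage is nudged up or down a bit."""
    rng = random.Random(seed)
    return [b + rng.uniform(-noise_level, noise_level) for b in bits]


def read_bits(voltages, threshold=0.5):
    """Decide each bit by asking only: is the reading above or below halfway?"""
    return [1 if v > threshold else 0 for v in voltages]


original = [1, 0, 1, 1, 0]
noisy = add_noise(original)     # wobbly values scattered around 1.0 and 0.0
recovered = read_bits(noisy)    # clean 1s and 0s again

print(noisy)
print(recovered == original)
```

As long as the noise is below the halfway point, the recovered bits match the original exactly, rather than just approximately.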
Right? You can have a lot of noise on that
-
signal, but the 1s and 0s, you can still pick
them out. Alright, and so the effect of
-
the noise on the digital signal, it's
nothing. Right, I can see 10111. So that
-
means I can have the CD with [inaudible]
noise on it, and on playback, it doesn't
-
come back, like, pretty close. The playback,
it's perfect. Right, it's just like the
-
signal came out ideally. So that's why,
this is it, that's why digital
-
sound is better. Right? And I'm going to
have to kind of string together all these
-
steps, but I can take the signal I care
about, encode it as zeros and ones, so
-
basically it's come to [inaudible], and it
gives me a lot of noise resistance. It's
-
not perfect, right? If you drill a
hole in the CD, you know, there could be a
-
mess-up big enough on the CD to
really mess things up. But it can stand a
-
lot. In particular, it's much better than
analog, because with analog, that hiss was just
-
mixed right in with the sound that I
wanted to hear. So this is the big jump up
-
in, yeah, lots of things digital. So
there, now you know. That's how it
-
works. Alright, so, noise reduction or, you
know, noise elimination really is one big
-
example. I'll just mention for
completeness, so the way CDs work,
-
obviously with the 0s and 1s it's greatly
resistant to little specks of dust or
-
whatever that cause noise. It is also the
case that the CD actually stores,
-
in a sense, multiple copies of the music.
It's a little bit like packets on the
-
network. Remember we talked about packets
and resends. The CD actually has multiple
-
copies, and the copies are marked with
checksums, which I talked about in the
-
networking section. And the CD can actually
notice if there's, like, a little tiny hole
-
or something, so one part of the music
didn't come out right, and it can go to a, I'm
-
glossing over some fine behavior, but basically it
can go to a fallback copy. And so it
-
can swap that one in and just keep
playing. And so that is another, higher
-
level of error resistance. It's called
error detection and correction, which CDs have
-
and actually DVDs do as well. So that's
how it, yeah, you know, with
-
someone else's DVDs you could figure out,
like, how big of a hole can you drill,
-
[laugh], in a DVD and still have it
play. It won't play for a big one, but it can
-
stand a little bit of just missing data
because it has another copy. And, you
-
know, so also, that works
because it's digital: we can have this
-
logic, these if statements, to kind of
check the checksum on one copy, and have an if statement
-
that says, oh, I'm going to go get the
other copy in some places. So yeah, anyway,
-
that's how that works. Alright. Let me show you
another thing that you can do with
-
digital. Alright, so suppose the numbers
that I had coming off of my audio CD were
-
these. So 12,000, 12,002, 12,006, 12,007. This
is actually pretty realistic, in that the
-
sampling is so fast on the audio CD that
the signal appears to just change very
-
slowly for sounds that, you know,
you can actually hear. Alright, so
-
what you notice about those numbers is
that they are pretty near to each other.
-
Right? Like, even though the base number
12,000 is big, the change from one sample
-
to the next is not very big,
right, it's under ten. In fact, it's just
-
sort of five. So, I'm going to propose a
scheme, a compression scheme that we could
-
use so that we could record the audio data
and have it take up less space. So,
-
here's the scheme I'm going to propose. What
if, at the start of the audio data, I just
-
put whatever the first number is, so I
record that first sample. And then after
-
that, I don't record any more samples.
What I do is I just record the difference
-
from each sample to the next. So, in this
case, I would end up with 12,000, and I
-
would say plus two because the next sample
is 12,002, and I would say plus four
-
because the difference between the third
sample and the second was going up by
-
four. And so, you know, +1, +3, -5,
+1, I've just got all these small numbers.
-
So now you just have to convince yourself
playback can still work, so long as
-
playback knows about my scheme. Like,
playback has to know the sample numbers.
-
So, playback would say, alright, this is
in this crazy new so-called delta scheme,
-
because you're just recording the deltas.
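
The delta scheme can be sketched in Python as follows. The sample list is reconstructed from the numbers and differences given in this example; the function names `delta_encode` and `delta_decode` are illustrative assumptions. The byte packing at the end assumes every difference fits in one signed byte (-128 to 127); a real format would need some way to handle bigger jumps, which is not shown.

```python
import struct
from itertools import accumulate


def delta_encode(samples):
    """Keep the first sample, then only the change from each sample to the next."""
    first = samples[0]
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    return first, deltas


def delta_decode(first, deltas):
    """Playback: start at the first sample and add the deltas back up."""
    return list(accumulate(deltas, initial=first))


samples = [12000, 12002, 12006, 12007, 12010, 12005, 12006]
first, deltas = delta_encode(samples)
print(first, deltas)

# Two bytes per sample the straightforward way ...
raw = struct.pack(f"<{len(samples)}h", *samples)
# ... versus two bytes for the first sample and one byte per delta.
packed = struct.pack(f"<h{len(deltas)}b", first, *deltas)
print(len(raw), len(packed))

assert delta_decode(first, deltas) == samples   # nothing was lost
```

Running the decoder on the encoder's output gives back exactly the original samples, while the packed form takes noticeably fewer bytes.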
So, playback, you know, alright, so 12,000
-
is the first sample, and then it'll just
have to do the arithmetic to recreate the
-
samples. So it could
just, you know, undo it to work out that the
-
samples were 12,002, 12,006, 12,007, and
then, using those samples, feed those into
-
the D-to-A converter to, like, really recreate
the sound. Okay, so, what's the advantage
-
of this? What's better about +2, +4,
+1, +3 than 12,002, 12,006,
-
12,007? I mean, it sort of gets back to
bytes. What this comes down to is those
-
numbers are smaller, a lot smaller, and the
reality is I could record them using
-
fewer bytes. Right? If you just think about
the amount of space on the CD, the number
-
of little peaks and valleys or whatever, if
I use this scheme, then I'm
-
[inaudible] a little bit of complexity.
But basically, I could take that sound and
-
I could record it maybe using just half
as much space, because I'm just being cagey
-
about making the numbers small in those
cases, so that maybe I could just use one
-
byte, whereas previously I had to use two
bytes for each sample. And I know that adds
-
complexity. So, that is compression. Right,
we talked about how all media, sound and
-
images, all of these sorts of things,
are typically stored in some compressed
-
format, and this is an example where,
instead of just storing the numbers in the
-
totally, like, straight-ahead obvious way,
we're going to have some scheme that takes
-
advantage of the fact, and this is true for
images as well, that the numbers don't
-
tend to just jump around randomly. So if
I look at one pixel in an image, and it's
-
got certain red, green, blue values, and then
suppose you look at the pixel right next
-
to it, super, super close, probably the red,
green, blue values of that second pixel
-
are really similar, really
close to the red, green, blue values for the
-
first pixel. And so maybe you could take
advantage of that and have some delta
-
encoding scheme where for each pixel maybe
you didn't record what the number was.
-
Maybe you recorded what the difference of
that pixel was versus the pixel to its
-
left, and so then suddenly the numbers tend
to get a lot smaller. So that's just an
-
example of essentially how compression
comes up a lot. So this is my kind of
-
zoomed-in, kind of simple example of how
that might work. Now this example, this
-
compression is called lossless, because
I've changed the data around to take
-
up less space, but I haven't given up any
fidelity at all. If you run through my
-
delta scheme here, the samples, they come
back exactly right. Right, I
-
haven't given up anything. I just added
complexity to take up less space. So, just
-
as an example, PNG, I talked about that, I
think, a little bit as an image format, so
-
PNG is also lossless. It takes a bunch of
pixels. It arranges them to take up less
-
space, but the image is not corrupted
in any way. It comes back perfectly. So
-
lossless formats, I think, are a
little bit more rare. The more
-
common approach to compression is called
lossy compression and so lossy compression
-
is going to take in the data, and it's
going to compress it, it's going to rearrange
-
it so it takes up a lot less space. But
it's not going to, on playback, it's
-
not going to reproduce the data exactly.
It's just going to be sort of
-
qualitatively very close. So you
pretty much can't tell. So JPEG, that's
-
the [inaudible] Lord knows we've been
using a lot of, JPEG is actually lossy,
-
and I'll demonstrate this a little bit.
So, you can give it an image with the
-
pixels just perfect, and it takes up a
certain amount of space. You encode it as
-
JPEG and it will take up a lot less space,
maybe ten times less space, but it will
-
make certain shortcuts in how the reds and
greens and blues are recorded. So, when
-
you look at it, it looks very good, but if
you put it under a microscope, you could
-
see where it had kind of fudged a little
bit to save space. So a
-
simple example of lossy compression,
sticking with my audio sample example here,
-
would be, what if... I think I have this, oh
yeah, what if we just threw out every
-
other number? We just said, well, whatever,
we've got about 44,000 of these a second. Let's
-
just only record 22,000. And then on
playback it's going to have to know
-
that. So we're going to record this number
and this number and this number, and then
-
on playback, what could playback do?
Playback is going to have all these
-
missing numbers, right? So it'll get this
number and get this number, then playback
-
is going to have to sort of fudge this
middle number. What could playback do
-
there? It could just guess, right? It
could say, well, let's guess that it's
-
halfway in between. We're going to just
guess 12,003. Well, maybe in reality it was
-
12,002. It's still going to sound pretty
good. All right. But so, this is a little
-
bit lossy. We're fudging the data a little
bit, right? So the next ones would be
-
12,006 and 12,010, so again maybe it would
guess 12,008 even though in reality the
-
sample was 12,007. Now this is a big
savings, right? So this halves the
-
space. My delta scheme also more or less
halves the space, maybe a little bit
-
better. So you can pile these techniques on
where your data was originally quite
-
large, and it could take up a lot less
space, and when you play it back, so I'll
-
do my JPEG example, it really does look
very good. But if you got out the
-
microscope and looked, you could see that
certain little, fudge is the word I want
-
to use, but there's probably a better term,
little adjustments have been made.
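
The throw-out-every-other-sample idea from the audio example can be sketched in Python like this. The five sample values are the ones used in the example; the function names `drop_every_other` and `fill_in` are illustrative assumptions, and guessing the halfway point with integer division is just one simple way playback might fudge the missing numbers.

```python
def drop_every_other(samples):
    """Recording: keep only every second sample."""
    return samples[::2]


def fill_in(kept):
    """Playback: guess each missing sample as halfway between its neighbors."""
    rebuilt = []
    for a, b in zip(kept, kept[1:]):
        rebuilt.append(a)
        rebuilt.append((a + b) // 2)   # the guess for the sample that was thrown out
    rebuilt.append(kept[-1])
    return rebuilt


original = [12000, 12002, 12006, 12007, 12010]
kept = drop_every_other(original)
rebuilt = fill_in(kept)

print(kept)
print(rebuilt)
print([r - o for r, o in zip(rebuilt, original)])   # how far off each sample is
```

The rebuilt list is close to the original but not identical, which is what makes this scheme lossy rather than lossless.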
-
[laugh]. So it takes up less space. Okay, so,
you know, any media data
-
that you've played with, so JPEG for images,
MP3 for sound, and all the video formats,
-
make heavy, heavy use of lossy
compression, especially video data. If you
-
just had all the samples and you just
recorded them in the raw, it's an enormous
-
amount of data. Fortunately, the data, like
I said with the pixel and the pixel
-
right next to it, the data is said
to have a lot of redundancy in
-
it. That is, there are these patterns
in the data that you can take advantage of
-
to compress it quite a lot and have it
still look very good. So, JPEG does this
-
for images. MP3, which is, like, it sounds
like the definition of being in college. I
-
guess not anymore. That's just being alive.
Anyway, so MP3 is aggressively
-
lossy. It starts with a lot of data, and
it has all these tricks to throw stuff out and
-
cut corners a little bit to get it down to
be pretty small. As I mentioned before, MP3
-
data works out to about one megabyte, about a
million bytes, per minute at, you know, pretty
-
high quality audio. I should mention MP3
was the result, that format is a result, of
-
a lot of research, where they had test
subjects and they would play sounds with
-
different compression schemes and really
home in on, for the human ear and brain,
-
what are kind of [inaudible] and omissions
that can be heard and what are omissions
-
that cannot be heard. And a
lot of creativity and research went into
-
having MP3 work pretty well. And that
was, like, I don't know, maybe
-
fifteen years ago or something? So in fact
there have been many advances since then,
-
where the formats have
gotten even better: they take up less
-
space than MP3 but also make their
tradeoffs in more clever ways, so they even
-
sound better than MP3. Alrighty. So,
lossy compression is just, like,
-
part of life.