Hi, and welcome to Module 9.3 of Digital Signal Processing. We are still talking about digital communication systems. In the previous module we addressed the bandwidth constraint; in this module we will tackle the power constraint. First we will introduce the concepts of noise and probability of error in a communication system, then we will look at signaling alphabets and their associated power, and finally we will introduce QAM signaling. We have seen that a transmitter sends a sequence of symbols a of n created by the mapper. Now we take the receiver into account. We don't yet know how, but it's safe to assume that in the end the receiver will obtain an estimate, a hat of n, of the original transmitted symbol sequence. It's an estimate because, even if there is no distortion introduced by the channel, even if nothing bad happens, there will always be a certain amount of noise that corrupts the original sequence. When the noise is very large, our estimate of the transmitted symbol will be off, and we will incur a decoding error. Now, this probability of error will depend on the power of the noise with respect to the power of the signal, and it will also depend on the decoding strategies that we put in place, that is, on how smart we are in circumventing the effects of the noise. One way we can maximize the probability of correctly guessing the transmitted symbol is by using suitable alphabets, and we will see in more detail what that means. Remember the scheme for the transmitter: we have a bitstream coming in, then the scrambler, and then the mapper, which produces a sequence of symbols a of n. These symbols have to be sent over the channel, and to do so we upsample, we interpolate, and then we transmit. Now, how do we go from bitstream to symbols in more detail? In other words, how does the mapper work? The mapper splits the incoming bitstream into chunks and assigns to each chunk a symbol a of n from a finite alphabet; we will decide later what the alphabet is composed of. To undo the mapping operation and recover the bitstream, the receiver performs a slicing operation. The receiver observes a value a hat of n, where the hat indicates that noise has leaked into the value of the signal, and it decides which symbol from the alphabet, which is known to the receiver as well, is closest to the received value. From there, it is extremely easy to piece back the original bitstream. As an example, let's look at simple two-level signaling. This generates signals of the kind we have seen in the examples so far, alternating between two levels. The mapper works by splitting the incoming bitstream into single bits; the output symbol sequence uses an alphabet composed of two symbols, G and minus G, and associates G to a bit of value 1 and minus G to a bit of value 0. The receiver, the slicer, looks at the sign of the incoming symbol sequence, which has been corrupted by noise, and decides that the nth bit is 1 if the sign of the nth symbol is positive, and 0 otherwise. Let's look at an example, with G equal to 1, so that the two-level signal alternates between plus 1 and minus 1. Suppose we have an input bit sequence that gives rise to this signal here; after transmission and decoding at the receiver, the resulting symbol sequence will look like this, where each symbol has been corrupted by a varying amount of noise.
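Concretely, a minimal numpy sketch of this two-level mapper and slicer might look as follows; the noise level, the amplitude G, and the function names are illustrative assumptions of mine, not something fixed by the scheme.

```python
import numpy as np

G = 1.0  # signal amplitude (gain)

def map_bits(bits):
    # two-level mapper: bit 1 -> +G, bit 0 -> -G
    return G * (2 * np.asarray(bits) - 1)

def slice_symbols(received):
    # slicer: decide bit 1 if the received symbol is positive, 0 otherwise
    return (np.asarray(received) > 0).astype(int)

# toy experiment: map a random bit sequence, corrupt it with Gaussian noise, slice
bits = np.random.randint(0, 2, 20)
received = map_bits(bits) + 0.4 * np.random.randn(bits.size)
decoded = slice_symbols(received)
num_errors = np.sum(decoded != bits)
```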
If we now slice this sequence by thresholding, as shown before, we recover a bit sequence like this, where we have indicated in red the errors incurred by the slicer because of the noise. If we want to analyze in more detail what the probability of error is, we have to make some hypotheses about the signals involved in this toy experiment. Assume that each received symbol can be modeled as the original symbol plus a noise sample. Assume also that the bits in the bitstream are equiprobable, so zero and one appear with probability 50% each. Assume that the noise and the signal are independent. And assume that the noise is additive white Gaussian noise with zero mean and known variance sigma 0 squared. With these hypotheses, the probability of error can be written out as follows. First of all, we split the probability of error into two conditional probabilities, conditioned on whether the nth bit is equal to 1 or equal to 0. In the first case, when the nth bit is equal to 1, remember, the transmitted symbol is equal to G, so the probability of error is equal to the probability for the noise sample to be less than minus G, because only in this case will the sum of the symbol plus the noise be negative. Similarly, when the nth bit is equal to 0, we have a negative symbol, and the only way for that to change sign is if the noise sample is greater than G. Since each bit value occurs with probability one half, and since by the symmetry of the Gaussian distribution the two conditional probabilities are equal, the overall probability of error is simply equal to the probability for the noise sample to be larger than G. We can compute this as the integral from G to infinity of the probability density function of the Gaussian distribution with the known variance. This function has a standard name: it's called the error function, and since the integral cannot be computed in closed form, the function is available in most numerical packages under this name. The important thing to notice here is that the probability of error is some function of the ratio between the amplitude of the signal and the standard deviation of the noise. We can carry this analysis further by considering the transmitted power. We have a bilevel signal and each level occurs with probability one half, so the variance of the signal, which corresponds to its power, is equal to G squared times the probability of the nth bit being equal to 1, plus G squared times the probability of the nth bit being equal to 0, which is equal to G squared. And so, if we rewrite the probability of error, it is equal to the error function of the ratio between the standard deviation of the transmitted signal and the standard deviation of the noise, which is equivalent to saying that it is the error function of the square root of the signal-to-noise ratio. We can plot this as a function of the signal-to-noise ratio in dBs, and I remind you that dBs here means that we compute 10 times the log in base 10 of the power of the signal divided by the power of the noise. Since we are on a log-log scale, we can see that the probability of error decays exponentially with the signal-to-noise ratio. This exponential decay is quite the norm in communication systems, and while the absolute rate of decay might change in terms of the constants involved in the curve, the trend stays the same even for more complex signaling schemes. So the lesson we learn from this simple example is that, in order to reduce the probability of error, we should increase G, the amplitude of the signal.
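As a rough numerical illustration of this relationship, one could evaluate the curve with scipy; note that I am using scipy's complementary error function, whose standard definition absorbs a factor of one half and a square root of two compared with the shorthand used on the slides, so the scaling constants below are my own bookkeeping.

```python
import numpy as np
from scipy.special import erfc

def bilevel_error_probability(snr_db):
    # P(err) = P(noise > G) for zero-mean Gaussian noise; since G / sigma_0
    # equals the square root of the SNR, this is Q(sqrt(SNR)), written here
    # via scipy's complementary error function
    snr = 10 ** (np.asarray(snr_db) / 10)
    return 0.5 * erfc(np.sqrt(snr / 2))

snr_db = np.arange(0, 16)
p_err = bilevel_error_probability(snr_db)  # decays roughly exponentially with the SNR
```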
But of course, increasing G also increases the power of the transmitted signal, and we know that we cannot go above the channel's power constraint. That's how the power constraint limits the reliability of transmission. The bilevel signaling scheme is very instructive, but it's also very limited, in the sense that we're sending just one bit per output symbol. To increase the throughput, the number of bits per second that we send over the channel, we can use multilevel signaling. There are very many ways to do so and we will just look at a few, but the fundamental idea is that we now take larger chunks of bits and therefore use alphabets with a higher cardinality. More values in the alphabet means more bits per symbol and therefore a higher data rate. But, not to give the ending away, we will see that the power of the signal also depends on the size of the alphabet, and so, in order not to exceed a certain probability of error given the channel's power constraint, we will not be able to grow the alphabet indefinitely. We can, however, be smart in the way we build this alphabet, and so we will look at some examples. The first example is PAM, Pulse Amplitude Modulation. We split the incoming bitstream into chunks of M bits, so that each chunk corresponds to an integer between 0 and 2 to the M minus 1. We call this sequence of integers k of n, and this sequence is mapped onto a sequence of symbols a of n like so: there is a gain factor G, as always, and then we use the 2 to the M odd integers around 0. So for instance, if M is equal to 2, we have 0, 1, 2, and 3 as potential values for k of n, and, assuming G equal to 1, a of n will be either minus 3, minus 1, 1, or 3. We will see why we use the odd integers in just a second. The receiver, the slicer, works by simply associating to the received symbol the closest odd integer, always taking the gain into account. Graphically, PAM for M equal to 2 and G equal to 1 looks like this. Here are the odd integers. The distance between two transmitted points, or transmitted symbols, is 2G; here G is equal to 1, but in general the spacing is twice the gain. Using odd integers creates a zero-mean sequence: if we assume that each symbol is equiprobable, which is likely given that we've used a scrambler in the transmitter, the resulting mean is zero. The analysis of the probability of error for PAM is very similar to what we carried out for bilevel signaling; as a matter of fact, binary signaling is simply PAM with M equal to 1. The end result is very similar: an exponentially decaying function of the ratio between the power of the signal and the power of the noise. The reason we don't analyze this further is that we have an improvement in store, and the improvement is aimed at increasing the throughput, the number of bits per symbol that we can send, without necessarily increasing the probability of error.
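Before moving on, here is a minimal numpy sketch of a PAM mapper and slicer along the lines just described; the function names and the noise level are illustrative assumptions of mine.

```python
import numpy as np

def pam_map(k, M, G=1.0):
    # map integers k in {0, ..., 2**M - 1} onto the 2**M odd integers
    # around zero, scaled by the gain G
    return G * (2 * np.asarray(k) - (2 ** M - 1))

def pam_slice(received, M, G=1.0):
    # slicer: associate each received value with the closest alphabet point
    levels = pam_map(np.arange(2 ** M), M, G)
    return np.argmin(np.abs(np.asarray(received)[:, None] - levels), axis=1)

# example for M = 2, G = 1: the alphabet is {-3, -1, +1, +3}
k = np.array([0, 3, 1, 2])
received = pam_map(k, 2) + 0.3 * np.random.randn(k.size)
k_hat = pam_slice(received, 2)
```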
So here's a wild idea: let's use complex numbers and build a complex-valued transmission system. This requires a certain suspension of disbelief for the time being, but believe me, it will work in the end. The name for this complex-valued mapping scheme is QAM, which is an acronym for Quadrature Amplitude Modulation, and it works like so. The mapper takes the incoming bitstream and splits it into chunks of M bits, with M even. Then it uses half of the bits to define a PAM sequence, which we call a r of n, and the remaining M over 2 bits to define another, independent PAM sequence, a i of n. The final symbol sequence is a sequence of complex numbers where the real part is the first PAM sequence and the imaginary part is the second PAM sequence; and of course, in front we have a gain factor G. So the transmission alphabet A is given by points in the complex plane with odd-valued coordinates around the origin. At the receiver, the slicer works by finding the symbol in the alphabet which is closest in Euclidean distance to the received symbol. Let's look at this graphically. This is the set of points for QAM transmission with M equal to 2, which corresponds to two bilevel PAM signals, one on the real axis and one on the imaginary axis; that results in four points. If we increase the number of bits per symbol and set M equal to 4, that corresponds to two PAM signals with 2 bits each, which makes for a constellation, which is what these arrangements of points in the complex plane are called: a constellation of four by four points at the odd-valued coordinates in the complex plane. If we increase M to 8, then we have a 256-point constellation, with 16 points per side. Let's look at what happens when a symbol is received, and how we derive an expression for the probability of error. If this is the nominal constellation, the transmitter will choose one of these values for transmission, say this one. This value will be corrupted by noise in the transmission and receiving process, and will appear somewhere in the complex plane, not necessarily exactly on the point it originated from. The way the slicer operates is by defining decision regions around each point in the constellation. So suppose that for this point here, the transmitted point, the decision region is a square of side 2G, centered on the constellation point. What happens is that, when we receive symbols, they will not fall exactly on the original point, but as long as they fall within the decision region, they will be decoded correctly. So for instance here we decode this correctly, here we decode this correctly, and same here. But this point, for instance, falls outside of the decision region, and therefore it will be associated to a different constellation point, thereby causing an error.
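Here is a small numpy sketch of such a square constellation and of a nearest-point slicer, just to make the idea concrete; the helper names and the noise level are my own assumptions, not part of the scheme itself.

```python
import numpy as np

def qam_alphabet(M, G=1.0):
    # square QAM constellation: two independent PAM alphabets of M/2 bits each,
    # one on the real axis and one on the imaginary axis (M must be even)
    side = G * (2 * np.arange(2 ** (M // 2)) - (2 ** (M // 2) - 1))
    return (side[:, None] + 1j * side[None, :]).ravel()

def qam_slice(received, alphabet):
    # slicer: pick the constellation point closest in Euclidean distance
    d = np.abs(np.asarray(received)[:, None] - alphabet[None, :])
    return alphabet[np.argmin(d, axis=1)]

# 16-QAM (M = 4): a 4-by-4 grid of points with odd coordinates
A = qam_alphabet(4)
sent = np.random.choice(A, 10)
noise = 0.4 * (np.random.randn(10) + 1j * np.random.randn(10))
decoded = qam_slice(sent + noise, A)
symbol_errors = np.sum(decoded != sent)
```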
To quantify the probability of error, we assume as usual that each received symbol is the sum of the transmitted symbol plus a noise sample. We further assume that this noise is complex-valued Gaussian noise with equal variance in the real and imaginary components. We're working on a completely digital system that operates with complex-valued quantities, so we're making a new model for the noise, and we will see later how to translate the physical, real noise into a complex variable. With these assumptions, the probability of error is the probability that the real part of the noise is larger than G in magnitude, or that the imaginary part of the noise is larger than G in magnitude. We assume that the real and imaginary components of the noise are independent, and that is what allows us to split the probability as shown. If you remember the shape of the decision region, this condition is equivalent to saying that the noise is pushing the real part of the point outside of the decision region, in either direction, and the same for the imaginary part. If we develop this, the probability of error is equal to 1 minus the probability that the real part of the noise is less than G in magnitude and the imaginary part of the noise is less than G in magnitude; this is the complementary condition to what we just wrote above. And so this is equal to 1 minus the integral over the decision region D of the complex-valued probability density function of the noise. In order to compute this integral, we are going to approximate the shape of the decision region with the inscribed circle: instead of using the square, we use a circle of radius G centered on the transmission point. When the constellation is very dense, this approximation is quite accurate, and with it we can compute the integral exactly for a Gaussian distribution. If we assume that the variance of the noise is sigma 0 squared over 2 in each component, real and imaginary, it turns out that the probability of error is equal to e to the minus G squared over sigma 0 squared. Now, to obtain the probability of error as a function of the signal-to-noise ratio, we have to compute the power of the transmitted signal. If all symbols are equiprobable and independent, the variance of the signal is G squared times 1 over 2 to the M, which is the probability of each symbol, times the sum over all symbols in the alphabet of the magnitude of the symbol squared. It's a little bit tedious, but we can compute this exactly as a function of M, and it turns out that the power of the transmitted signal is G squared times two thirds, times 2 to the M minus 1. Now, if you plug this into the formula for the probability of error that we've seen before, you get that the result is an exponential function whose argument is minus 3 times the signal-to-noise ratio, divided by twice the quantity 2 to the M minus 1; for large constellations this is approximately minus 3, times 2 to the minus M plus 1, times the signal-to-noise ratio. We can plot this probability of error on a log-log scale, like we did before, and we can parametrize the curves by the number of points in the constellation. So here you have the curve for a 4-point constellation, here is the curve for 16 points, and here is the curve for 64 points. You can see that, for a given signal-to-noise ratio, the probability of error increases with the number of points. Why is that? Well, if the signal-to-noise ratio remains the same, and we assume that the noise is always at the same level, then the power of the signal remains constant as well. In that case, if the number of points increases, G has to become smaller in order to accommodate a larger number of points for the same power. But if G becomes smaller, then the decision regions become smaller, the separation between points becomes smaller, and the decision process becomes more vulnerable to noise. So in the end, here's the final recipe to design a QAM transmitter. First, you pick a probability of error that you can live with; in general, 10 to the minus 6 is an acceptable probability of error at the symbol level. Then you find the signal-to-noise ratio that is imposed by the channel's power constraint. Once you have that, you can find the size of your constellation by finding M, which, based on the previous equations, is the log in base 2 of 1 minus 3 over 2 times the signal-to-noise ratio divided by the natural logarithm of the probability of error. Of course, you will have to round this to a suitable integer value, and potentially choose M even, so that the constellation size is an even power of 2 and the constellation is square.
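As a sketch of this recipe, using the formula above and assuming a target symbol error probability of 10 to the minus 6 (the function name and the example SNR value are mine), the constellation size could be computed like this:

```python
import numpy as np

def qam_bits_per_symbol(snr_db, p_err=1e-6):
    # invert P(err) ~ exp(-3 * SNR / (2 * (2**M - 1))) for M, then round down
    # to an even integer so that the constellation is square
    snr = 10 ** (snr_db / 10)
    M = np.log2(1 - 1.5 * snr / np.log(p_err))
    return int(np.floor(M / 2) * 2)

# for example, a channel allowing about 30 dB of SNR would give M = 6, i.e. 64-QAM
M = qam_bits_per_symbol(30.0)
```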
The final data rate of your system will be M, the number of bits per symbol, times W, which, if you remember, is the baud rate of the system and corresponds to the bandwidth allowed for by the channel. So we know how to fit the bandwidth constraint via upsampling, and with QAM we know how many bits per symbol we can use given the power constraint; therefore we know the theoretical throughput of the transmitter for a given reliability figure. However, the question remains: how are we going to send complex-valued symbols over a physical channel? It's time, therefore, to stop the suspension of disbelief and look at techniques to do complex signaling over a real-valued channel.