Hi and welcome to Module 9.3 of Digital
Signal Processing.
We are still talking about Digital
Communication Systems.
In the previous module we addressed the
bandwidth constraint.
In this module we will tackle the power
constraint. First we will introduce the
concepts of noise and probability of error
in a communication system.
We will then look at signaling alphabets
and their associated power.
And finally, we'll introduce QAM
signaling.
So we have seen that a transmitter sends
a sequence of symbols a of n.
Created by the mapper.
Now we take the receiver into account.
We don't yet know how, but it's safe to
assume that the receiver in the end will
obtain an estimation hat a of n.
Of the original transmitted symbol
sequence.
It's an estimation because even if there
is no distortion introduced by the
channel.
Even if nothing bad happens.
There will always be a certain amount of
noise, that will corrupt the original
sequence.
When the noise is very large, our estimate
for the transmitted symbol will be off,
and we will incur a decoding error.
Now, this probability of error will
depend on the power of the noise, with
respect to the power of the signal.
And will also depend on the decoding
strategies that we've put in place, how
smart we are in circumventing the effects
of the noise.
One way we can maximize the probability of
correctly guessing the transmitted symbol
is by using suitable alphabets.
And so we will see in more detail what
that means.
Remember the scheme for the transmitter.
We have a bitstream coming in.
And then we have the scrambler.
And then the mapper.
And here we have a sequence of symbols a
of n.
These symbols will have to be sent over
the channel.
And to do so, we upsample.
And we interpolate, and then we transmit.
Now, how do we go from bitstreams to
samples in more detail?
In other words, how does the mapper work?
The mapper will split the incoming
bitstreams into chunks and will assign a
symbol, a of n, from a finite alphabet to
each chunk.
The alphabet, we will decide later what
it is composed of.
To undo the mapping operation and recover
the bitstream, the receiver will perform
a slicing operation.
So the receiver will receive a value, hat
a of n, where the hat indicates the fact
that noise has leaked into the value of
the signal.
And the receiver will decide which symbol
from the alphabet, which is known to the
receiver as well, is closest to the
received symbol.
And from there, it will be extremely easy
to piece back the original bitstream.
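The slicing operation just described can be sketched in a few lines of Python. This is only an illustrative sketch; the function name `slice_symbol` is ours, not part of any standard library, and it simply picks the alphabet symbol at minimum distance from the noisy received value.

```python
def slice_symbol(received, alphabet):
    """Return the alphabet symbol closest to the received (noisy) value.

    Works for real-valued alphabets and, since abs() also measures the
    magnitude of complex numbers, for complex-valued ones as well.
    """
    return min(alphabet, key=lambda s: abs(s - received))
```

For instance, with the two-symbol alphabet {minus 1, plus 1}, a received value of 0.3 is sliced to plus 1.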
As an example, let's look at simple
two-level signaling.
This generates signals of the kind we
have seen in the example so far,
alternating between two levels.
The way the mapper works is by splitting
the incoming bitstream into single bits.
And the output symbol sequence uses an
alphabet composed of two symbols, G and
minus G, and associates G to a bit of
value 1 and minus G to a bit of value 0.
And the receiver, the slicer.
Looks at the sign of the incoming symbol
sequence which has been corrupted by
noise.
And decides that the nth bit will be 1 if
the sign of the nth symbol is positive,
and 0 otherwise.
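The two-level mapper and its sign-based slicer can be sketched as follows; this is a minimal illustration, assuming a gain G of 1, and the function names are just for this example.

```python
G = 1.0  # signal amplitude (gain)

def map_bits(bits):
    """Two-level mapper: bit 1 -> +G, bit 0 -> -G."""
    return [G if b == 1 else -G for b in bits]

def slice_bits(symbols):
    """Slicer: decide bit 1 if the noisy symbol is positive, 0 otherwise."""
    return [1 if s > 0 else 0 for s in symbols]
```

Note that the slicer only looks at the sign, so any noise that does not flip the sign of a symbol is harmless.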
Let's look at an example; let's assume G
is equal to 1.
So the two-level signal will alternate
between plus 1 and minus 1.
And suppose we have an input bit sequence
that gives rise to this signal here after
transmission and after decoding at the
receiver.
The resulting symbol sequence will look
like this, where each symbol has been
corrupted by a varying amount of noise.
If we now slice this sequence by
thresholding, as shown before, we recover
a bit sequence like this, where we have
indicated in red the errors incurred by
the slicer because of the noise.
So if we want to analyze in more detail
what the probability of error is, we have
to make some hypotheses on the signals
involved in this toy experiment.
Assume that each received symbol can be
modeled as the original symbol plus a
noise sample.
Assume also that the bits in the
bitstream are equiprobable.
So zero and one appear with probability
50% each.
Assume that the noise and the signal are
independent.
And assume that the noise is additive
white Gaussian noise with zero mean and
known variance sigma 0 squared.
With these hypotheses, the probability of
error can be written out as follows.
First of all, we split the probability of
error into two conditional probabilities.
Conditioned by whether the nth bit is
equal to 1, or the nth bit is equal to
zero.
In the first case, when the nth bit is
equal to 1.
Remember, the produced symbol will be
equal to G, so the probability of error
is equal to the probability for the noise
sample to be less than minus G.
Because only in that case will the sum of
the symbol plus the noise sample be
negative.
Similarly, when the nth bit is equal to
0, we have a negative sample.
And the only way for that to change sign
is if the noise sample is greater than G.
Since each bit value occurs with
probability one half, and because of the
symmetry of the Gaussian distribution,
this is equal to the probability for the
noise sample to be larger than G.
And we can compute this as the integral
from G to infinity of the probability
distribution function for the Gaussian
distribution with the known variance
here.
This function has a standard name.
It's called the error function (numerical
packages typically provide the closely
related complementary error function,
erfc).
And since this integral cannot be
computed in closed form, this is the
function available in most numerical
packages.
So the important thing to notice here is
that the probability of error is some
function of the ratio between the
amplitude of the signal and the standard
deviation of the noise.
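As a sketch, this tail probability can be evaluated with the complementary error function from Python's standard math module; the function name `p_error_two_level` is ours, and the change of variable inside the erfc is the standard one for a Gaussian tail.

```python
import math

def p_error_two_level(G, sigma):
    """P(noise > G) for zero-mean Gaussian noise with standard deviation
    sigma: the probability that a transmitted +G flips sign at the slicer.
    Uses the identity P(N(0, sigma^2) > G) = 0.5 * erfc(G / (sigma * sqrt(2)))."""
    return 0.5 * math.erfc(G / (sigma * math.sqrt(2)))
```

As expected, the error probability depends only on the ratio between G and sigma, and it drops very quickly as that ratio grows.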
And we can carry this analysis further by
considering the transmitted power.
We have a bi-level signal and each level
occurs with 1 half probability.
So the variance of the signal, which
corresponds to the power, is equal to G
squared times the probability of the nth
bit being equal to 1, plus G squared
times the probability of the nth bit
being equal to 0, which is equal to G
squared.
And so, if we rewrite the probability of
error, we can see that it is equal to
the error function of the ratio
Between the standard deviation of the
transmitted signal divided by the
standard deviation of the noise, which is
equivalent to saying that it is the error
function of the square root of the signal
to noise ratio.
If we plot this as a function of the
signal to noise ratio in dBs.
And I remind here that dBs here mean that
we compute 10 times the log in base 10 of
the power of the signal divided by the
power of the noise.
And since we are in a log log scale, we
can see that the probability of error
decays exponentially with the signal to
noise ratio.
This exponential decay is quite the norm
in communication systems.
And while the absolute rate of decay
might change in terms of the linear
constants involved in the curve.
The trend will stay the same even for
more complex signaling schemes.
So the lesson that we learn from the
simple example is that in order to reduce
the probability of error, we should
increase G, the amplitude of the signal.
But of course, increasing G also
increases the power of the transmitted
signal, and we know that we cannot go
above the channel's power constraint.
And so that's how the power constraint
limits the reliability of transmission.
The bilevel signalling scheme is very
instructive, but it's also very limited
in the sense that we're sending just one
bit per output symbol.
So to increase the throughput, to
increase the number of bits per second
that we send over a channel, we can use
multilevel signaling.
There are very many ways to do so and we
will just look at a few, but the
fundamental idea is that we take now.
Larger chunks of bits and therefore we
have alphabets that have a higher
cardinality.
So more values in the alphabet means more
bits per symbol and therefore a higher
data rate.
And, not to give away the ending, we will
see that the power of the signal will
also depend on the size of the alphabet.
And so, in order not to exceed a certain
probability of error, given the channel's
power constraint, we will not be able to
grow the alphabet indefinitely.
But we can be smart in the way we build
this alphabet and so we will look at some
examples.
The first example is PAM, Pulse Amplitude
Modulation.
We split the incoming bitstream into
chunks of M bits so that each chunk
corresponds to an integer between 0 and 2
to the M minus 1.
We can call this sequence of integers k
of n and this sequence is mapped onto a
sequence of symbols a of n like so.
There's a gain factor G, like always.
And then we use the 2 to the M odd
integers around 0, from minus (2 to the
M, minus 1) up to plus (2 to the M,
minus 1).
So for instance, if M is equal to 2, we
have 0, 1, 2, and 3 as potential items
for k of n.
And a of n, assuming G is equal to 1,
will be either minus 3, or minus 1, or 1,
or 3.
We will see why we use the odd integers
in just a second.
And the receiver's slicer will work by
simply associating to the received symbol
the closest odd integer, always taking
the gain into account.
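A minimal sketch of the PAM mapper and slicer, under the assumptions above (the names `pam_map` and `pam_slice` are ours): the mapper sends the integer k of n to the odd level 2k minus 2 to the M plus 1, scaled by G, and the slicer snaps a noisy symbol back to the nearest valid level.

```python
def pam_map(k, M, G=1.0):
    """Map an integer k in [0, 2**M - 1] onto one of the 2**M odd
    integer levels around zero, scaled by the gain G."""
    return G * (2 * k - 2**M + 1)

def pam_slice(a_hat, M, G=1.0):
    """Recover the integer index by snapping a noisy symbol to the
    nearest odd-integer level (the inverse of pam_map)."""
    k = round((a_hat / G + 2**M - 1) / 2)
    return min(max(k, 0), 2**M - 1)  # clamp to the valid index range
```

For M equal to 2 and G equal to 1, the four integers 0 through 3 map to the levels minus 3, minus 1, 1 and 3, as in the example above.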
So graphically, again, PAM for M equal to
2 and G equal to 1, will look like this.
Here are the odd integers.
The distance between two transmitted
points, or transmitted symbols, is 2G
right here.
G is equal to 1 here, but in general it
would be 2 times the gain.
And using odd integers creates a
zero-mean sequence: if we assume that
each symbol is equiprobable, which is
likely given that we've used a scrambler
in the transmitter, the resulting mean is
zero.
The analysis of the probability of error
for PAM is very similar to what we
carried out for bilevel signaling.
As a matter of fact, binary signaling is
simply PAM with M equal to 1.
The end result is very similar, and it's
an exponential decaying function of the
ratio between the power of the signal and
the power of the noise.
The reason why we don't analyze this
further is because we have an improvement
in store.
And the improvement is aimed at
increasing the throughput, increasing the
number of bits per symbol that we can
send without necessarily increasing the
probability of error.
So here's a wild idea.
Let's use complex numbers and build a
complex valued transmission system.
This requires a certain suspension of
disbelief for the time being, but believe
me, it will work in the end.
The name for this complex-valued mapping
scheme is QAM, which is an acronym for
Quadrature Amplitude Modulation, and it
works like so.
The mapper takes the incoming bitstream
and splits it into chunks of M bits, with
M even.
Then it uses half of the bits to define a
PAM sequence, which we call a r of n, and
the remaining M over 2 bits to define
another, independent PAM sequence,
a i of n.
The final symbol sequence is a sequence
of complex numbers, where the real part
is the first PAM sequence, and the
imaginary part is the second PAM
sequence.
And of course, in front we have a gain
factor, G.
So the transmission alphabet, a, is given
by points in the complex plane, with
odd-valued coordinates around the origin.
At the receiver, the slicer works by
finding the symbol in the alphabet, which
is closest in Euclidean distance to the
received symbol.
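As an illustrative sketch (function names ours), the QAM mapper pairs two PAM mappings into one complex symbol, and the slicer decides on the real and imaginary parts independently, which for a square constellation coincides with the nearest-neighbor rule in Euclidean distance.

```python
def qam_map(k, M, G=1.0):
    """Map k (0 <= k < 2**M, M even) to a complex symbol: the high
    M/2 bits pick the real PAM level, the low M/2 bits the imaginary one."""
    half = M // 2
    k_re, k_im = k >> half, k & (2**half - 1)
    level = lambda x: 2 * x - 2**half + 1  # odd-integer PAM levels
    return G * complex(level(k_re), level(k_im))

def qam_slice(a_hat, M, G=1.0):
    """Slice real and imaginary parts independently; for a square
    constellation this is the nearest-neighbor decision."""
    half = M // 2
    snap = lambda v: min(max(round((v / G + 2**half - 1) / 2), 0),
                         2**half - 1)
    return (snap(a_hat.real) << half) | snap(a_hat.imag)
```

Splitting the chunk into high and low bits is just one possible bit-to-axis assignment; any fixed convention shared by mapper and slicer works.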
Let's look at this graphically.
This is a set of points for QAM
transmission with M equal to 2, which
corresponds to two bilevel PAM signals on
the real axis and on the imaginary axis.
So that results into four points.
If we increase the number of bits per
symbol and set M equal to 4, that
corresponds to two PAM signals with 2
bits each, which makes for a
constellation, as these arrangements of
points in the complex plane are called,
of four by four points at the odd-valued
coordinates in the complex plane.
If we increase M to 8, then we have a 256
point constellation, with 16 points per
side.
Lets look at what happens when a symbol
is received, and how we derive an
expression for the probability of error.
If this is the nominal constellation, the
transmitter will choose one of these
values for transmission, say this one.
And this value will be corrupted by noise
in the transmission and the receiving
process.
And will appear somewhere in the complex
plane, not necessarily exactly on the
point it originates from.
The way the slicer operates, is by
defining decision regions around each
point in the constellation.
So suppose that for this point here, the
transmitted point, the decision region is
a square of side 2G, centered around the
nominal constellation point.
So what happens is that when we receive
symbols, they will not fall exactly on
the original point, but as long as they
fall within the decision region, they
will be decoded correctly.
So for instance here.
We will decode this correctly.
Here we will decode this correctly.
Same here.
But this point for instance falls outside
of the decision region and therefore it
will be associated to a different
constellation point, thereby causing an
error.
To quantify the probability of error, we
assume as per usual that each received
symbol is the sum of the transmitted
symbol.
Plus a noise sample theta of n.
And we further assume that this noise is
complex-valued Gaussian noise with equal
variance in the real and imaginary
components.
We're working on a completely digital
system that operates with complex-valued
quantities.
So we're making a new model for the
noise, and we will see later, how to
translate the physical real noise, into a
complex variable.
With these assumptions, the probability
of error is equal to the probability that
the real part of the noise is larger than
G in magnitude, or that the imaginary
part of the noise is larger than G in
magnitude.
We assume that the real and imaginary
components of the noise are independent,
and that's why we can split the
probability like so.
Now, if you remember the shape of the
decision region, this condition is
equivalent to saying that the noise is
pushing the real part of the point,
outside of the decision region, in either
direction, and same for the imaginary
part.
Now if we develop this, this is equal to
1 minus the probability that the real
part of the noise is less than G, and the
imaginary part of the noise is less than
G.
This is the complementary condition to
what we just wrote above.
And so this is equal to 1 minus the
integral over the decision region d of
the complex valued probability density
function for the noise.
In order to compute this integral, we're
going to approximate the shape of the
decision region with the inscribed
circle.
So instead of using the square, we're
going to use a circle centered around the
transmission point.
When the constellation is very dense,
this approximation is quite accurate.
With this approximation, we can compute
the integral exactly for a gaussian
distribution.
And if we assume that the variance of the
noise is sigma 0 squared over 2 in each
component, real and imaginary, it turns
out that the probability of error is
equal to e to the minus G squared over
sigma 0 squared.
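Under the inscribed-circle approximation, the magnitude of complex Gaussian noise with variance sigma 0 squared over 2 per component is Rayleigh distributed, which is what gives the closed form e to the minus G squared over sigma 0 squared. A quick Monte Carlo sketch (function name ours) can sanity-check that formula:

```python
import math, random

def mc_p_outside_circle(G, sigma0_sq, n=200_000, seed=1):
    """Monte Carlo estimate of P(|noise| > G) for complex Gaussian noise
    with variance sigma0_sq / 2 in each of the real and imaginary parts."""
    rng = random.Random(seed)
    s = math.sqrt(sigma0_sq / 2)
    outside = sum(math.hypot(rng.gauss(0, s), rng.gauss(0, s)) > G
                  for _ in range(n))
    return outside / n
```

With G equal to 1 and sigma 0 squared equal to 1, the estimate should land close to e to the minus 1.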
Now to obtain a probability of error as a
function of the signal to noise ratio we
have to compute the power of the
transmitted signal.
So if all symbols are equiprobable and
independent, it turns out that the
variance of the signal is G squared times
1 over 2 to the power of M.
Which is the probability of each symbol,
times the sum over all symbols in the
alphabet of the magnitude of the symbols
squared.
Now, it's a little bit tedious, but we
can solve it exactly for any M, and it
turns out that the power of the
transmitted signal is G squared times two
thirds times (2 to the M, minus 1).
Now, if we plug this into the formula for
the probability of error that we have
seen before, we get that the result is an
exponential function whose argument is
minus 3 times the signal to noise ratio,
divided by (2 to the M plus 1, minus 2).
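As a sketch (function name ours), this error probability can be evaluated directly, taking the exponent to be minus 3 times the SNR divided by 2 to the (M plus 1) minus 2, which follows from the power formula above, and converting the SNR from dB first:

```python
import math

def qam_symbol_error(snr_db, M):
    """Approximate QAM symbol error probability
    exp(-3 * SNR / (2**(M + 1) - 2)), with the SNR given in dB."""
    snr = 10 ** (snr_db / 10)
    return math.exp(-3 * snr / (2**(M + 1) - 2))
```

Evaluating this for a fixed SNR shows exactly the behavior of the plotted curves: the error probability grows as the constellation gets larger.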
We can plot this probability of error in
a log log scale, like we did before.
And we can parametrize the curve as a
function of the number of points in the
constellation.
So here you have the curve for a
four-point constellation, here's the
curve for 16 points, and here's the curve
for 64 points.
Now you can see that for a given signal
to noise ratio the probability of error
increases with the number of points.
Why is that?
Well, if the signal to noise ratio
remains the same, and we assume that the
noise is always at the same level, then
the power of the signal remains constant
as well.
In that case, if the number of points
increases, G has to become smaller in
order to accommodate a larger number of
points for the same power.
But if G becomes smaller, then the
decision regions become smaller, the
separation between points becomes
smaller, and the decision process becomes
more vulnerable to noise.
So in the end here's the final recipe to
design a QAM transmitter.
First you pick a probability of error
that you can live with.
In general, 10 to the minus 6 is an
acceptable probability of error at the
symbol level.
Then you find out the signal to noise
ratio that is imposed by the channel's
power constraint.
Once you have that, you can find the size
of your constellation, by finding M.
Which, based on the previous equations,
is the log in base 2 of 1 minus 3 over 2
times the signal to noise ratio, divided
by the natural logarithm of the
probability of error.
Of course, you will have to round this
down to a suitable integer value, and
potentially to an even value of M, in
order to have a square constellation.
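The recipe above can be sketched in a few lines (the function name is ours): compute M as the log in base 2 of 1 minus 3 over 2 times the SNR divided by the natural log of the target error probability, then round down to an even integer.

```python
import math

def qam_bits_per_symbol(p_err, snr_db):
    """Largest even M meeting the target symbol error probability at the
    given SNR in dB, from M = log2(1 - 1.5 * SNR / ln(p_err))."""
    snr = 10 ** (snr_db / 10)
    m = math.log2(1 - 1.5 * snr / math.log(p_err))
    return 2 * int(m / 2)  # round down to an even integer
```

For example, with a target error probability of 10 to the minus 6 and an SNR of 30 dB, this gives M equal to 6, that is, a 64-point constellation.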
The final data rate of your system will
be M, the number of bits per symbol,
times W, which, if you remember, is the
baud rate of the system, and corresponds
to the bandwidth allowed for by the
channel.
So we know how to fit the bandwidth
constraint via upsampling.
With QAM, we know how many bits per
symbol we can use given the power
constraint.
And so we know the theoretical throughput
of the transmitter for a given
reliability figure.
However, the question remains: how are we
going to send complex-valued symbols over
a physical channel?
It's time, therefore, to stop the
suspension of disbelief, and look at
techniques to do complex signaling over a
real-valued channel.