Hi, and welcome to Module 9.3 of Digital Signal Processing. We are still talking about digital communication systems. In the previous module we addressed the bandwidth constraint; in this module we will tackle the power constraint. First we will introduce the concepts of noise and probability of error in a communication system, then we will look at signaling alphabets and their associated power, and finally we will introduce QAM signaling. We have seen that a transmitter sends a sequence of symbols a of n created by the mapper. Now we take the receiver into account. We don't yet know how, but it's safe to assume that in the end the receiver will obtain an estimate, a hat of n, of the original transmitted symbol sequence. It's an estimate because, even if there is no distortion introduced by the channel, even if nothing bad happens, there will always be a certain amount of noise that corrupts the original sequence. When the noise is very large, our estimate of the transmitted symbol will be off, and we will incur a decoding error. Now, this probability of error will depend on the power of the noise with respect to the power of the signal, and it will also depend on the decoding strategies that we put in place, that is, on how smart we are in circumventing the effects of the noise. One way we can maximize the probability of correctly guessing the transmitted symbol is by using suitable alphabets, and we will see in more detail what that means. Remember the scheme for the transmitter: we have a bitstream coming in, then the scrambler, and then the mapper, which produces a sequence of symbols a of n. These symbols have to be sent over the channel, and to do so we upsample, we interpolate, and then we transmit. Now, how do we go from bitstream to symbols in more detail? In other words, how does the mapper work? The mapper splits the incoming bitstream into chunks and assigns to each chunk a symbol a of n from a finite alphabet; we will decide later what the alphabet is composed of. To undo the mapping operation and recover the bitstream, the receiver performs a slicing operation. The receiver observes a value a hat of n, where the hat indicates that noise has leaked into the value of the signal, and it decides which symbol from the alphabet, which is known to the receiver as well, is closest to the received value. From there, it is extremely easy to piece back the original bitstream. As an example, let's look at simple two-level signaling. This generates signals of the kind we have seen in the examples so far, alternating between two levels. The mapper works by splitting the incoming bitstream into single bits; the output symbol sequence uses an alphabet composed of two symbols, G and minus G, and associates G to a bit of value 1 and minus G to a bit of value 0. The receiver, the slicer, looks at the sign of the incoming symbol sequence, which has been corrupted by noise, and decides that the nth bit is 1 if the sign of the nth symbol is positive, and 0 otherwise. Let's look at an example, with G equal to 1, so that the two-level signal alternates between plus 1 and minus 1. Suppose we have an input bit sequence that gives rise to this signal here; after transmission and decoding at the receiver, the resulting symbol sequence will look like this, where each symbol has been corrupted by a varying amount of noise.
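Concretely, a minimal numpy sketch of this two-level mapper and slicer might look as follows; the noise level, the amplitude G, and the function names are illustrative assumptions of mine, not something fixed by the scheme.

```python
import numpy as np

G = 1.0  # signal amplitude (gain)

def map_bits(bits):
    # two-level mapper: bit 1 -> +G, bit 0 -> -G
    return G * (2 * np.asarray(bits) - 1)

def slice_symbols(received):
    # slicer: decide bit 1 if the received symbol is positive, 0 otherwise
    return (np.asarray(received) > 0).astype(int)

# toy experiment: map a random bit sequence, corrupt it with Gaussian noise, slice
bits = np.random.randint(0, 2, 20)
received = map_bits(bits) + 0.4 * np.random.randn(bits.size)
decoded = slice_symbols(received)
num_errors = np.sum(decoded != bits)
```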
If we now slice this sequence by thresholding, as shown before, we recover a bit sequence like this, where we have indicated in red the errors incurred by the slicer because of the noise. If we want to analyze in more detail what the probability of error is, we have to make some hypotheses about the signals involved in this toy experiment. Assume that each received symbol can be modeled as the original symbol plus a noise sample. Assume also that the bits in the bitstream are equiprobable, so zero and one appear with probability 50% each. Assume that the noise and the signal are independent. And assume that the noise is additive white Gaussian noise with zero mean and known variance sigma 0 squared. With these hypotheses, the probability of error can be written out as follows. First of all, we split the probability of error into two conditional probabilities, conditioned on whether the nth bit is equal to 1 or equal to 0. In the first case, when the nth bit is equal to 1, remember, the transmitted symbol is equal to G, so the probability of error is equal to the probability for the noise sample to be less than minus G, because only in this case will the sum of the symbol plus the noise be negative. Similarly, when the nth bit is equal to 0, we have a negative symbol, and the only way for that to change sign is if the noise sample is greater than G. Since each bit value occurs with probability one half, and since by the symmetry of the Gaussian distribution the two conditional probabilities are equal, the overall probability of error is simply equal to the probability for the noise sample to be larger than G. We can compute this as the integral from G to infinity of the probability density function of the Gaussian distribution with the known variance. This function has a standard name: it's called the error function, and since the integral cannot be computed in closed form, the function is available in most numerical packages under this name. The important thing to notice here is that the probability of error is some function of the ratio between the amplitude of the signal and the standard deviation of the noise. We can carry this analysis further by considering the transmitted power. We have a bilevel signal and each level occurs with probability one half, so the variance of the signal, which corresponds to its power, is equal to G squared times the probability of the nth bit being equal to 1, plus G squared times the probability of the nth bit being equal to 0, which is equal to G squared. And so, if we rewrite the probability of error, it is equal to the error function of the ratio between the standard deviation of the transmitted signal and the standard deviation of the noise, which is equivalent to saying that it is the error function of the square root of the signal-to-noise ratio. We can plot this as a function of the signal-to-noise ratio in dBs, and I remind you that dBs here means that we compute 10 times the log in base 10 of the power of the signal divided by the power of the noise. Since we are on a log-log scale, we can see that the probability of error decays exponentially with the signal-to-noise ratio. This exponential decay is quite the norm in communication systems, and while the absolute rate of decay might change in terms of the constants involved in the curve, the trend stays the same even for more complex signaling schemes. So the lesson we learn from this simple example is that, in order to reduce the probability of error, we should increase G, the amplitude of the signal.
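As a rough numerical illustration of this relationship, one could evaluate the curve with scipy; note that I am using scipy's complementary error function, whose standard definition absorbs a factor of one half and a square root of two compared with the shorthand used on the slides, so the scaling constants below are my own bookkeeping.

```python
import numpy as np
from scipy.special import erfc

def bilevel_error_probability(snr_db):
    # P(err) = P(noise > G) for zero-mean Gaussian noise; since G / sigma_0
    # equals the square root of the SNR, this is Q(sqrt(SNR)), written here
    # via scipy's complementary error function
    snr = 10 ** (np.asarray(snr_db) / 10)
    return 0.5 * erfc(np.sqrt(snr / 2))

snr_db = np.arange(0, 16)
p_err = bilevel_error_probability(snr_db)  # decays roughly exponentially with the SNR
```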
But of course, increasing G also increases the power of the transmitted signal, and we know that we cannot go above the channel's power constraint. That's how the power constraint limits the reliability of transmission. The bilevel signaling scheme is very instructive, but it's also very limited, in the sense that we're sending just one bit per output symbol. To increase the throughput, the number of bits per second that we send over the channel, we can use multilevel signaling. There are very many ways to do so and we will just look at a few, but the fundamental idea is that we now take larger chunks of bits and therefore use alphabets with a higher cardinality. More values in the alphabet means more bits per symbol and therefore a higher data rate. But, not to give the ending away, we will see that the power of the signal also depends on the size of the alphabet, and so, in order not to exceed a certain probability of error given the channel's power constraint, we will not be able to grow the alphabet indefinitely. We can, however, be smart in the way we build this alphabet, and so we will look at some examples. The first example is PAM, Pulse Amplitude Modulation. We split the incoming bitstream into chunks of M bits, so that each chunk corresponds to an integer between 0 and 2 to the M minus 1. We call this sequence of integers k of n, and this sequence is mapped onto a sequence of symbols a of n like so: there is a gain factor G, as always, and then we use the 2 to the M odd integers around 0. So for instance, if M is equal to 2, we have 0, 1, 2, and 3 as potential values for k of n, and, assuming G equal to 1, a of n will be either minus 3, minus 1, 1, or 3. We will see why we use the odd integers in just a second. The receiver, the slicer, works by simply associating to the received symbol the closest odd integer, always taking the gain into account. Graphically, PAM for M equal to 2 and G equal to 1 looks like this. Here are the odd integers. The distance between two transmitted points, or transmitted symbols, is 2G; here G is equal to 1, but in general the spacing is twice the gain. Using odd integers creates a zero-mean sequence: if we assume that each symbol is equiprobable, which is likely given that we've used a scrambler in the transmitter, the resulting mean is zero. The analysis of the probability of error for PAM is very similar to what we carried out for bilevel signaling; as a matter of fact, binary signaling is simply PAM with M equal to 1. The end result is very similar: an exponentially decaying function of the ratio between the power of the signal and the power of the noise. The reason we don't analyze this further is that we have an improvement in store, and the improvement is aimed at increasing the throughput, the number of bits per symbol that we can send, without necessarily increasing the probability of error.
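Before moving on, here is a minimal numpy sketch of a PAM mapper and slicer along the lines just described; the function names and the noise level are illustrative assumptions of mine.

```python
import numpy as np

def pam_map(k, M, G=1.0):
    # map integers k in {0, ..., 2**M - 1} onto the 2**M odd integers
    # around zero, scaled by the gain G
    return G * (2 * np.asarray(k) - (2 ** M - 1))

def pam_slice(received, M, G=1.0):
    # slicer: associate each received value with the closest alphabet point
    levels = pam_map(np.arange(2 ** M), M, G)
    return np.argmin(np.abs(np.asarray(received)[:, None] - levels), axis=1)

# example for M = 2, G = 1: the alphabet is {-3, -1, +1, +3}
k = np.array([0, 3, 1, 2])
received = pam_map(k, 2) + 0.3 * np.random.randn(k.size)
k_hat = pam_slice(received, 2)
```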
So here's a wild idea: let's use complex numbers and build a complex-valued transmission system. This requires a certain suspension of disbelief for the time being, but believe me, it will work in the end. The name for this complex-valued mapping scheme is QAM, which is an acronym for Quadrature Amplitude Modulation, and it works like so. The mapper takes the incoming bitstream and splits it into chunks of M bits, with M even. Then it uses half of the bits to define a PAM sequence, which we call a r of n, and the remaining M over 2 bits to define another, independent PAM sequence, a i of n. The final symbol sequence is a sequence of complex numbers where the real part is the first PAM sequence and the imaginary part is the second PAM sequence; and of course, in front we have a gain factor G. So the transmission alphabet A is given by points in the complex plane with odd-valued coordinates around the origin. At the receiver, the slicer works by finding the symbol in the alphabet which is closest in Euclidean distance to the received symbol. Let's look at this graphically. This is the set of points for QAM transmission with M equal to 2, which corresponds to two bilevel PAM signals, one on the real axis and one on the imaginary axis; that results in four points. If we increase the number of bits per symbol and set M equal to 4, that corresponds to two PAM signals with 2 bits each, which makes for a constellation, which is what these arrangements of points in the complex plane are called: a constellation of four by four points at the odd-valued coordinates in the complex plane. If we increase M to 8, then we have a 256-point constellation, with 16 points per side. Let's look at what happens when a symbol is received, and how we derive an expression for the probability of error. If this is the nominal constellation, the transmitter will choose one of these values for transmission, say this one. This value will be corrupted by noise in the transmission and receiving process, and will appear somewhere in the complex plane, not necessarily exactly on the point it originated from. The way the slicer operates is by defining decision regions around each point in the constellation. So suppose that for this point here, the transmitted point, the decision region is a square of side 2G, centered on the constellation point. What happens is that, when we receive symbols, they will not fall exactly on the original point, but as long as they fall within the decision region, they will be decoded correctly. So for instance here we decode this correctly, here we decode this correctly, and same here. But this point, for instance, falls outside of the decision region, and therefore it will be associated to a different constellation point, thereby causing an error.
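Here is a small numpy sketch of such a square constellation and of a nearest-point slicer, just to make the idea concrete; the helper names and the noise level are my own assumptions, not part of the scheme itself.

```python
import numpy as np

def qam_alphabet(M, G=1.0):
    # square QAM constellation: two independent PAM alphabets of M/2 bits each,
    # one on the real axis and one on the imaginary axis (M must be even)
    side = G * (2 * np.arange(2 ** (M // 2)) - (2 ** (M // 2) - 1))
    return (side[:, None] + 1j * side[None, :]).ravel()

def qam_slice(received, alphabet):
    # slicer: pick the constellation point closest in Euclidean distance
    d = np.abs(np.asarray(received)[:, None] - alphabet[None, :])
    return alphabet[np.argmin(d, axis=1)]

# 16-QAM (M = 4): a 4-by-4 grid of points with odd coordinates
A = qam_alphabet(4)
sent = np.random.choice(A, 10)
noise = 0.4 * (np.random.randn(10) + 1j * np.random.randn(10))
decoded = qam_slice(sent + noise, A)
symbol_errors = np.sum(decoded != sent)
```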
To quantify the probability of error, we assume as usual that each received symbol is the sum of the transmitted symbol plus a noise sample. We further assume that this noise is complex-valued Gaussian noise with equal variance in the real and imaginary components. We're working on a completely digital system that operates with complex-valued quantities, so we're making a new model for the noise, and we will see later how to translate the physical, real noise into a complex variable. With these assumptions, the probability of error is the probability that the real part of the noise is larger than G in magnitude, or that the imaginary part of the noise is larger than G in magnitude. We assume that the real and imaginary components of the noise are independent, and that is what allows us to split the probability as shown. If you remember the shape of the decision region, this condition is equivalent to saying that the noise is pushing the real part of the point outside of the decision region, in either direction, and the same for the imaginary part. If we develop this, the probability of error is equal to 1 minus the probability that the real part of the noise is less than G in magnitude and the imaginary part of the noise is less than G in magnitude; this is the complementary condition to what we just wrote above. And so this is equal to 1 minus the integral over the decision region D of the complex-valued probability density function of the noise. In order to compute this integral, we are going to approximate the shape of the decision region with the inscribed circle: instead of using the square, we use a circle of radius G centered on the transmission point. When the constellation is very dense, this approximation is quite accurate, and with it we can compute the integral exactly for a Gaussian distribution. If we assume that the variance of the noise is sigma 0 squared over 2 in each component, real and imaginary, it turns out that the probability of error is equal to e to the minus G squared over sigma 0 squared. Now, to obtain the probability of error as a function of the signal-to-noise ratio, we have to compute the power of the transmitted signal. If all symbols are equiprobable and independent, the variance of the signal is G squared times 1 over 2 to the M, which is the probability of each symbol, times the sum over all symbols in the alphabet of the magnitude of the symbol squared. It's a little bit tedious, but we can compute this exactly as a function of M, and it turns out that the power of the transmitted signal is G squared times two thirds, times 2 to the M minus 1. Now, if you plug this into the formula for the probability of error that we've seen before, you get that the result is an exponential function whose argument is minus 3 times the signal-to-noise ratio, divided by twice the quantity 2 to the M minus 1; for large constellations this is approximately minus 3, times 2 to the minus M plus 1, times the signal-to-noise ratio. We can plot this probability of error on a log-log scale, like we did before, and we can parametrize the curves by the number of points in the constellation. So here you have the curve for a 4-point constellation, here is the curve for 16 points, and here is the curve for 64 points. You can see that, for a given signal-to-noise ratio, the probability of error increases with the number of points. Why is that? Well, if the signal-to-noise ratio remains the same, and we assume that the noise is always at the same level, then the power of the signal remains constant as well. In that case, if the number of points increases, G has to become smaller in order to accommodate a larger number of points for the same power. But if G becomes smaller, then the decision regions become smaller, the separation between points becomes smaller, and the decision process becomes more vulnerable to noise. So in the end, here's the final recipe to design a QAM transmitter. First, you pick a probability of error that you can live with; in general, 10 to the minus 6 is an acceptable probability of error at the symbol level. Then you find the signal-to-noise ratio that is imposed by the channel's power constraint. Once you have that, you can find the size of your constellation by finding M, which, based on the previous equations, is the log in base 2 of 1 minus 3 over 2 times the signal-to-noise ratio divided by the natural logarithm of the probability of error. Of course, you will have to round this to a suitable integer value, and potentially choose M even, so that the constellation size is an even power of 2 and the constellation is square.
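As a sketch of this recipe, using the formula above and assuming a target symbol error probability of 10 to the minus 6 (the function name and the example SNR value are mine), the constellation size could be computed like this:

```python
import numpy as np

def qam_bits_per_symbol(snr_db, p_err=1e-6):
    # invert P(err) ~ exp(-3 * SNR / (2 * (2**M - 1))) for M, then round down
    # to an even integer so that the constellation is square
    snr = 10 ** (snr_db / 10)
    M = np.log2(1 - 1.5 * snr / np.log(p_err))
    return int(np.floor(M / 2) * 2)

# for example, a channel allowing about 30 dB of SNR would give M = 6, i.e. 64-QAM
M = qam_bits_per_symbol(30.0)
```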
The final data rate of your system will be M, the number of bits per symbol, times W, which, if you remember, is the baud rate of the system and corresponds to the bandwidth allowed for by the channel. So we know how to fit the bandwidth constraint via upsampling, and with QAM we know how many bits per symbol we can use given the power constraint; therefore we know the theoretical throughput of the transmitter for a given reliability figure. However, the question remains: how are we going to send complex-valued symbols over a physical channel? It's time, therefore, to stop the suspension of disbelief and look at techniques to do complex signaling over a real-valued channel.