Hi, and welcome to Module 5.2 of Digital Signal Processing, in which we will see some filtering in action by setting up a mock problem where we try to remove some noise from a signal. In particular, we will introduce two characters that will remain with us until the end of the class: the Moving Average filter and the Leaky Integrator.

In the previous module, we talked a lot about the properties of Linear Time-Invariant filters, but we haven't said much about the applications that can be addressed by using filters. So let me give you an example. Suppose you're taking measurements of a physical quantity that you know to be smooth and slowly varying: for instance, the tide level at the sea, or the blood pressure of a patient. Because of the measurement process, the values that you actually measure are corrupted by some form of additive noise, which at this point we can simply regard as an extraneous quantity that perturbs the measured value. So what you get in the end, instead of the smooth curve of the first panel, is a curve with really fast wiggles that you interpret as a noise corruption. The question is: can we process this data as a discrete-time signal so that most of the noise is removed, and we get back a measurement that is in line with our expectation of smoothness?

The first idea is to replace each sample of the input sequence with a local average. Averages are usually good when you want to eliminate random variations that you don't know much about. So, for instance, assume a situation where the processed signal is, at each sample, the simple two-point average of the current sample and the past sample: y[n] = (x[n] + x[n-1])/2. It could very well be that the two-point average is not enough to remove most of the noise, so, in general, we can take longer averages by considering each output sample to be the average of M points from the input.

If we try this experimentally, we can see immediately that the strategy works better for increasing values of M. If we take just a two-point average, we don't really get much smoothing: you can see in blue the original signal and in red the filtered output, and with a two-point average we still have a lot of wiggles. But as we increase the number of points that we use to compute the average, we start to remove a little bit more of the noise. With a 12-point average, we can already see that the strategy pays off: we have smoothed out the signal quite significantly. And if we're willing to increase the average up to, say, 100 points, then the signal we get at the output is actually very smooth. You will also notice that, with respect to the original signal, the output has been delayed by a certain amount. This is the price we pay because we are using up 100 past samples to compute each output value; in a sense, we accumulate a certain delay with respect to the input because we're averaging from the past. We will see this in more detail later on. What we have done is really a filtering operation.
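If you want to reproduce this experiment numerically, here is a minimal Python sketch; the test signal, the noise level, and the filter lengths are illustrative assumptions, not the exact data behind the plots in the lecture.

    import numpy as np

    # Illustrative stand-ins for the lecture's data: a smooth, slowly
    # varying signal corrupted by additive noise.
    n = np.arange(500)
    clean = np.sin(2 * np.pi * n / 500)
    noisy = clean + 0.2 * np.random.randn(len(n))

    def moving_average(x, M):
        # Causal M-point average: y[n] = (1/M) * (x[n] + ... + x[n-M+1]).
        h = np.ones(M) / M                 # M taps, each of height 1/M
        return np.convolve(x, h)[:len(x)]  # keep only the causal part

    for M in (2, 12, 100):
        smoothed = moving_average(noisy, M)

Plotting noisy against smoothed for each M reproduces the behavior described above: more smoothing, and more delay, as M grows.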
And if we want to compute the impulse response, we need to do nothing more than put the δ function inside our averaging machine. Remember, the averaging relation that we saw before is y[n] = (1/M) times the sum from k = 0 to M-1 of x[n-k]. If we put the δ function in here, it's very easy to see that the impulse response is h[n] = 1/M for n from 0 to M-1, and 0 otherwise. It's easy to convince yourself that in this summation there's only one non-zero term, the one where k is equal to n. Since the index goes from 0 to M-1, if your value of n is outside of this range, none of the terms will be non-zero and you will get 0. If we plot the impulse response, it looks like a string of samples equal to 1/M, from 0 to M-1.

So, what we have learned experimentally is that the smoothing effect of the filter is proportional to the length of the impulse response. As a consequence, the number of operations that we have to perform to compute each output sample, and also the storage (because we have to keep the past input values in the memory of our processing unit in order to compute the average) are both proportional to the length of the filter. So, as the filter becomes longer and its smoothing effect increases, the price we pay in terms of computational resources and memory is higher. We might not like that, and we might want to see whether there's a smarter way to achieve the same effect without using up so much computing power.

Let's look at the equation for the Moving Average: it's just 1/M times the sum of the input samples from x[n] down to x[n-M+1]. Now, we can split this into two parts. We have a first term, which is 1/M times the current input, plus 1/M times the sum of the past input samples from x[n-1] down to x[n-M+1], and this is a sum of M-1 terms. The second term looks suspiciously close to the Moving Average, computed this time over M-1 points, so one point less than before, and delayed by one. It's close, but it is not quite the same thing, so let's look at ways to formalize this relationship.

To do so, we first write out the standard equation for the Moving Average filter, which we have seen before, so nothing new here: y_M[n] = (1/M) times the sum from k = 0 to M-1 of x[n-k]. Now, we try to compute the delayed output: y_M[n-1] = (1/M) times the sum from k = 0 to M-1 of x[(n-1)-k]. We can rearrange the index of the signal as x[n-(k+1)], and now this +1, which affects only the summation index, can be propagated to the summation range. We find that if we delay the Moving Average by one, we basically just shift the range of the summation, which now goes from 1 to M instead of from 0 to M-1.

Now, let's look at the formula for the Moving Average over M-1 points; this is just y_{M-1}[n] = (1/(M-1)) times the sum from k = 0 to M-2 of x[n-k]. And now we do the same: we delay this by one, and again the delay propagates to the summation range, which now goes from k = 1 to M-1. So, here is the formula for the Moving Average over M points, and here is the formula for the Moving Average over M-1 points, delayed by one. But now, if we take the sum of the input from 0 to M-1, we can split it as the current sample plus M-1 samples in the past, and this second part is exactly the sum that appears in the delayed (M-1)-point average.
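Before we finish the derivation, as a quick aside, here is a short Python sketch (with arbitrary choices of M and of the observation length N) that feeds a discrete delta into the averaging machine and recovers exactly the impulse response we just computed.

    import numpy as np

    M, N = 5, 12
    delta = np.zeros(N)
    delta[0] = 1.0                      # delta[n]: 1 at n = 0, 0 elsewhere

    # Apply y[n] = (1/M) * sum_{k=0..M-1} x[n-k], with x[n] = 0 for n < 0.
    h = np.array([sum(delta[n - k] for k in range(M) if n - k >= 0) / M
                  for n in range(N)])
    print(h)   # 1/M for n = 0..M-1, then zeros: a string of samples of height 1/M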
So, we can swap terms and we obtain the relationship that we were looking for between the Moving Average over M points and the Moving Average over M-1 points delayed by one. After rearranging the terms, we have that the Moving Average over M points at n is equal to (M-1)/M times the Moving Average over M-1 points delayed by one, plus 1/M times x[n]. And if we want to write this a little bit better, we define a parameter λ = (M-1)/M, so that 1/M = 1-λ, and we obtain the relationship y_M[n] = λ y_{M-1}[n-1] + (1-λ) x[n].

Now, there's nothing new in what we have seen so far; we have just rearranged terms in finite sums. But here comes the trick that defines the Leaky Integrator filter. When M is very large, we can assume that the average over M-1 or over M points is going to be pretty much the same: not much is going to change if we add a single data point to a very long average. Also, when M is very large, λ, which was defined as (M-1)/M, is going to be very close to 1. So, we can rewrite the previous equation as: the current estimate for the Moving Average is equal to λ times the previous estimate for the Moving Average, plus 1-λ times the current input, that is, y[n] = λ y[n-1] + (1-λ) x[n]. And this defines a filter that we can use to process our signal. The filter is recursive, because we're using previous output values to compute the current value of the output: there's a feedback path that goes from the output back into the filter.

Before we even try to analyze this, let's see if it works. We take our smooth signal corrupted by noise, and we filter it with a Leaky Integrator for varying values of λ. If we start with λ rather small, we don't really see much happening. But as we increase the value of λ towards 1, which is the operating assumption behind the derivation of the filter anyway, we see that we start to smooth the signal as expected. And when λ is very close to 1, the smoothing power is comparable to that of a Moving Average filter with a very high number of taps. But now the difference is that each output value only requires three operations: one multiplication of the previous output by λ, one multiplication of the current input by 1-λ, and one addition. And this cost is independent of the value of λ. So: high smoothing power at a fixed price.

Of course, the natural question when we have a filter is what the impulse response is going to be, and to compute the impulse response of the Leaky Integrator, all we need to do is plug the δ function into the recursive equation and see what happens. When n is less than 0, since the δ function is 0 from minus infinity to 0, nothing non-zero has ever happened inside this equation, and so it's very safe to assume that the output will be zero for all values of n less than 0. Things start to happen when n reaches zero, at which point the δ function kicks in and assumes the value of one. If we compute the value of the output for n = 0, we have λ times the previous value of the output, which we know to be 0, plus 1-λ times δ[0], which is 1; and so the output will be 1-λ. At the next step, y[1] is equal to λ times the previous value of the output, which now is non-zero (it's 1-λ, as we computed before), plus 1-λ times δ[1], which we know to be 0, so this term cancels. And the final output is λ times 1-λ.
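Here is a minimal sketch of the Leaky Integrator in Python, run on the same kind of illustrative noisy signal as before (the test signal and the values of λ are assumptions for the demo, not the lecture's exact data):

    import numpy as np

    def leaky_integrator(x, lam):
        # y[n] = lam * y[n-1] + (1 - lam) * x[n], with y[-1] = 0.
        y = np.zeros(len(x))
        prev = 0.0
        for i, xn in enumerate(x):
            prev = lam * prev + (1.0 - lam) * xn  # 2 multiplications + 1 addition
            y[i] = prev
        return y

    n = np.arange(500)
    noisy = np.sin(2 * np.pi * n / 500) + 0.2 * np.random.randn(len(n))
    for lam in (0.5, 0.9, 0.99):       # smoothing increases as lam approaches 1
        smoothed = leaky_integrator(noisy, lam)

Note that the per-sample cost is the same whatever λ is: this is exactly the "high smoothing power at a fixed price" observation.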
At the next step, once again, the recursive equation kicks in, and we have λ times the previous value, which we know to be λ times 1-λ, plus 1-λ times δ[2]; but δ is going to be zero from now on, so we don't need to look at this term anymore. And so here we have λ squared times 1-λ. The next step is going to be the same, and by now you have pretty much understood what's going on. If we plot the impulse response, we have an exponentially decaying function where the base of the exponential is λ; the whole curve is weighted by 1-λ, and it is of course multiplied by the unit step, because everything is 0 for n less than 0. In other words, h[n] = (1-λ) λ^n u[n].

You might wonder why we call this filter a Leaky Integrator. The reason is that if you consider, for instance, the very simple sum of all input samples from minus infinity to the current time, this is really the discrete-time approximation to an integral, because you're accumulating all past values up to now. We can rewrite this summation recursively: at each step, we take what we have accumulated up to the previous step and we add the current input, y[n] = y[n-1] + x[n]. You see, this is very close to the equation for the Leaky Integrator, except that in the Leaky Integrator we scale the previous accumulated value by λ. Remember, λ is close to one, so that means we keep pretty much all of what we have accumulated before, but we "forget" a little bit: we leak a part of it. If λ is equal to 0.9, then 10% of what we have accumulated is lost, and we replace what we've lost with a fraction of the input, and this fraction is 1-λ. So, what we've lost from the accumulator, we replace with an equal percentage of the input. The idea is that by picking λ less than one, we are forgetting, we are leaking the past, and the accumulation will never blow up.
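To close the loop, here is a short sketch (λ and the number of samples are arbitrary choices) checking that the recursion really does produce the exponentially decaying impulse response we derived:

    import numpy as np

    lam, N = 0.9, 20
    delta = np.zeros(N)
    delta[0] = 1.0

    h_rec = np.zeros(N)
    prev = 0.0
    for n in range(N):
        prev = lam * prev + (1.0 - lam) * delta[n]  # the recursive equation
        h_rec[n] = prev

    h_closed = (1.0 - lam) * lam ** np.arange(N)    # (1-lam) * lam^n for n >= 0
    assert np.allclose(h_rec, h_closed)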