We have seen a lot of formulas so far in this module, and that is good, because we needed to set the stage for Fourier transforms. Now we are going to look at the computational side of the Fourier transform. Suppose I give you a signal of length N and you want to compute the DFT: how expensive is that going to be? Well, if you are not smart about it, it will cost you N² operations. And when N is a very large number, 100,000 or a million or so, N² is larger still, and at some point it becomes prohibitive to compute the Discrete Fourier Transform.

But there is a magic trick. This trick was first noticed by Gauss, even before Fourier invented the Fourier transform, which just shows how smart Gauss actually was. It was reinvented many times in the 19th century and in the early 20th century. But then there was a quantum leap with the publication of a paper by Cooley and Tukey describing a Fast Fourier Transform algorithm. This algorithm is both very famous and relatively simple to understand, so we are going to derive it here, in this module, by looking at divide and conquer methods in general, and then specifically at the Fast Fourier Transform computation. Once we have this algorithm, we will talk about some generalizations. We are not going to study them in detail; one could teach an entire class about Fast Fourier Transforms. That is not the purpose here, but you will understand the basics of one of the most famous algorithms of all time, and one that runs not only on supercomputers, but on your mobile phone, in everyday applications.

Module 4.9 is about the Fast Fourier Transform, its history and its algorithms. The overview of the module is that we are going to cover this history and see that Gauss actually invented the Fast Fourier Transform even before Fourier proposed the Fourier Transform. Then, in the second half of the module, we will describe the Cooley-Tukey Fast Fourier Transform algorithm in sufficient detail to understand how it works, and briefly survey other possible algorithms. The overview of this lecture, then, is that we first look at some history, from the earliest days to the current practice of FFTs; then we study in detail one particularly popular algorithm, the Cooley-Tukey FFT; then we look at some factorization examples, and conclude.

Joseph Fourier, the inventor of the Fourier transform, of course knew how to calculate Fourier series, but he did this longhand, with very extensive calculations. It turns out that Carl Friedrich Gauss, the prince of mathematicians, already had a fast way to compute Fourier series, even before Fourier proposed them. This points to an interesting history of computing Fourier series, Discrete Fourier Transforms and the like efficiently. Gauss, in 1805, came up with a neat trick to compute trigonometric series efficiently, which happened to be the same trick used much later in the Fast Fourier Transform. Just a bit later, Joseph Fourier invented Fourier series, with no particularly efficient way to compute them. During the 19th century, a number of people had to compute Fourier series for various applications, and many of them developed tricks to speed up the calculation, because the calculations were done by hand, or one had to construct machines to calculate Fourier series.
In the 20th century, a number of people in applied mathematics and numerical linear algebra came up with tricks and algorithms, but none of them was really very influential. Then in 1958, Good, a statistician, came up with an algorithm which turns out to be one of the classic algorithms used today. But the real quantum leap was Cooley and Tukey's rediscovery of the Fast Fourier Transform in 1965, in a very celebrated paper, for Fourier transform lengths that are a power of a prime. The concluding work on the Fast Fourier Transform was perhaps done by Winograd, who studied the complexity of the FFT in great detail, put all the tricks together, and added a few of his own, to come up with the most efficient ways to compute the Fast Fourier Transform.

Remember the Discrete Fourier Transform matrix. We have the Nth root of unity, W_N = e^(-j2π/N); when the context is clear, we drop the subscript N. The powers of W can be taken modulo N, because W^N = 1. So the DFT matrix of size N×N is a very structured matrix, with a first row and column of ones, and successive powers of the root of unity on the successive rows. The matrix is symmetric and, as can be seen, has a lot of structure in it.

Let's look at a few small matrices; a code sketch below lets you build and inspect them. N = 2 is the easiest one: the second root of unity is −1, and the matrix is a simple symmetric matrix. For N = 3, the matrix already looks more interesting. In the middle, we simplify the powers of W by taking them modulo 3, so we are left with only two powers, W and W², and on the right side we write out the actual values of the matrix. For N = 4, W₄ is given here; we simplify the powers by taking them modulo 4, and then write them out explicitly. We know that W in this case is −j, so this is a very simple matrix. Actually, it doesn't require any multiplication, only changes of sign and swaps of real and imaginary parts when we multiply by j or −j. Let's look at N = 5. W₅ is given here, and we simplify the powers by taking them modulo 5, so only the powers 0 to 4 are left. You see that this matrix is very structured, and we'll see that it has a hidden structure that can be taken advantage of. One more step: N = 6. Again the powers are simplified by taking them modulo 6, and again the matrix has a lot of structure, very different from the previous one. So the message so far is that for every N we have seen, the matrix looks interesting, but very different from one case to the next.

We have seen all of these matrices; now we need an algorithm to compute the transform efficiently. The idea behind a fast algorithm, as with many fast algorithms, is "divide et impera" in Latin, "divide and conquer" in English. This goes back to the Roman Empire; it is usually attributed to Julius Caesar as a way to conduct war, which is not quite what we want to pursue here. But it is also a standard strategy for developing algorithms, and the idea is very simple. You take a problem and split it into two subproblems, say the upper and the lower halves here; these can be split again, down to simple subproblems, of which we have four here in the middle. Then you merge the subproblem solutions into intermediate solutions, merge again, and finally obtain the full solution.
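Before we apply this to the DFT, here is a minimal Python/NumPy sketch that makes the preceding discussion concrete (the function names dft_matrix and naive_dft are my own illustration, not from the lecture): it builds the W matrix for any N, and computes the DFT as a direct matrix-vector product, which is precisely the N²-operation baseline that divide and conquer will beat.

```python
import numpy as np

def dft_matrix(N):
    """N x N DFT matrix with entries W^(n*k), where W = exp(-j*2*pi/N)."""
    n = np.arange(N)
    # The exponent n*k can be taken modulo N because W^N = 1; the complex
    # exponential handles this wrap-around implicitly.
    return np.exp(-2j * np.pi * np.outer(n, n) / N)

def naive_dft(x):
    """Direct O(N^2) DFT: a plain matrix-vector product."""
    return dft_matrix(len(x)) @ x

# For N = 4 the entries are only 1, -1, j, -j: no true multiplications needed.
print(np.round(dft_matrix(4)))

# Sanity check against NumPy's FFT.
x = np.random.randn(8)
assert np.allclose(naive_dft(x), np.fft.fft(x))
```

Printing the rounded N = 4 matrix confirms the observation above: its entries are only ±1 and ±j, so computing that DFT needs only sign changes and real/imaginary swaps.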
So as long as the cost of the splitting and the merging, plus the total cost of the simple subproblems, is lower than the cost of solving the entire problem directly, you have a faster algorithm for the problem at hand.

Let's apply the idea of divide and conquer to the Discrete Fourier Transform, starting with a single step. Recall that the direct computation of the DFT has complexity N². The idea is to take the problem of size N, where N is a power of 2 (this will later allow us to recurse), and cut it into two problems of size N/2. Each one has complexity N²/4, since the complexity is quadratic, and there may be some additional cost to recover the full solution, say of order N. So the one-step divide and conquer solution has a complexity of 2·(N²/4) + N = N²/2 + N, which, for large enough N, is better than N². Doing a single step of divide and conquer is already a gain. Graphically: split the DFT into two pieces of size N/2, compute two DFTs of size N/2, and merge the two results. We see this in the diagram, where we start with an input from x[0] to x[N−1], do a splitting, compute two DFTs of size N/2, merge the results of the two DFTs, and obtain the final result, X[0] to X[N−1]. That is one step of divide and conquer for the DFT.

Now let's do divide and conquer in multiple steps; the idea is very simple: if it worked once, we apply it again, and this gives more gains. Cut the two problems of size N/2 into four problems of size N/4. We can do this because N is a power of 2, so we can keep cutting into halves. Again there may be some cost to recover the full solution after each cut; let's say it is linear in N, so there is a cost of order N at each step. The division into subproblems can be done on the order of log N times, precisely log₂ N − 1 times, until problems of size 2 are obtained, and those are trivial, as we saw when we looked at the matrix W₂. Each cutting-and-merging stage requires order-N complexity, so the total complexity of this divide and conquer solution is of order N log₂ N. For N large enough, this is a tremendous gain with respect to N², as we can easily see by plugging in some numbers: for N = 10⁶, N² is 10¹², while N log₂ N is about 2·10⁷, a gain of roughly a factor of 50,000. This is the power of the Fast Fourier Transform algorithm.

Let's summarize the steps we went through. We split the DFT into two, four and eight pieces of sizes N/2, N/4 and N/8, respectively. Then we compute eight DFTs of size N/8. We merge the results successively into DFTs of size N/4, then N/2, and finally N to obtain the result. This is shown in the diagram, where we see the recursive splits into two halves, then four pieces, finally eight pieces; in the center, the DFTs of size N/8; then the mergers into DFTs of size N/4, then N/2; and finally the result, the total DFT of size N.
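To connect the diagram to working code, here is a compact recursive sketch of the radix-2 Cooley-Tukey FFT in Python/NumPy. This is my own illustrative rendering, assuming the standard decimation-in-time split into even- and odd-indexed samples (the lecture only says the problem is cut into two halves of size N/2); the merge uses the twiddle factors W_N^k, and the recursion bottoms out at the trivial size-2 problem, the W₂ matrix we saw earlier.

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 FFT. Assumes len(x) is a power of 2, at least 2."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 2:
        # Trivial base case: the W_2 matrix is [[1, 1], [1, -1]].
        return np.array([x[0] + x[1], x[0] - x[1]])
    # Divide: two DFTs of size N/2, on the even- and odd-indexed samples.
    E = fft_radix2(x[0::2])
    O = fft_radix2(x[1::2])
    # Merge: order-N butterflies with twiddle factors W_N^k.
    W = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([E + W * O, E - W * O])

# Sanity check against NumPy's FFT on a length-1024 signal.
x = np.random.randn(1024)
assert np.allclose(fft_radix2(x), np.fft.fft(x))
```

Each recursion level does order-N work in the merge, and there are log₂ N − 1 levels, which is exactly the N log₂ N total complexity derived above.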