Welcome to this cloud computing concepts course. In this lecture we'll be covering several concepts that are assumed as background for the topics we'll be discussing in the cloud computing concepts course. These are considered basics in computer science; typically, they are concepts that computer science students learn in their first couple of years of an undergraduate computer science degree program. For those of you who are already familiar with this material, this can be a refresher, or you can simply skip it if you already know it. In any case, you can always use this orientation lecture as a reference to keep coming back to if you encounter a term during the regular lectures that doesn't make sense to you, or that you'd like to be reminded about. We'll discuss a few topics: some basic data structures, what processes are, a little bit about computer architecture, and then a little bit of math with Big O notation, basic probability, and a few other miscellaneous topics. So, let's get started.

First, basic data structures. We'll discuss two different data structures. The first one is a queue. A queue is essentially a First-in, First-out (FIFO) data structure. The elements of a queue can be of any data type; in this example, the elements are integers. Right now the queue has five elements: 3, 5, 8, 0, 7. This is an ordered list of elements. When you remove an element from the queue, you remove it from the head of the queue; in this case, the head is the element 3. When you insert a new element, you insert it at the tail of the queue, which means that if I insert a new element, say 1, it gets inserted right after 7 in this queue. So removal always happens at the head, while insertion always happens at the tail. For instance, given this particular queue of five elements, if you dequeue (remove) an item, that item will be 3. At that point the queue consists of only 5, 8, 0, and 7, with the head pointing to 5. The next item that you dequeue will be 5; after that the queue consists of only 8, 0, and 7. The next item that you dequeue will be 8, and so on and so forth. Dequeues (removals) and insertions can occur concurrently: you can remove an element and at the same time insert an element at the tail. So that's a queue, a First-in, First-out data structure.

A different data structure is the stack, which is a Last-in, First-out (LIFO) data structure, or equivalently, First-in, Last-out. Imagine a stack of dishes that you place on your table: the dish that you put on top, the one you added last, will be the first one that you can remove. That's essentially a stack. A removal, also known as a pop, happens from the top; an insert, or push, also happens at the top. So in this case, if you push a new element, say 9, onto this particular stack, the 9 goes above the 3. After that, if you pop (remove) an element, it will be the last element that you added, which is 9 in this case. The next pop after that will get 3, because 9 was just removed and 3 is now the top of the stack. After you've removed 3, the next pop will get you 5, and so on and so forth; after that you'll get 8, then 0, then 7. So, once again, a removal always returns the last item that was added, and an insertion, or push, adds an item right at the top of the stack.
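Here is how the two walkthroughs above look in code. This is a minimal sketch using the C++ standard library's std::queue and std::stack, not something from the lecture slides themselves; the element values are the ones from the example.

```cpp
#include <iostream>
#include <queue>
#include <stack>

int main() {
    // Queue: FIFO. Build the queue 3, 5, 8, 0, 7 (3 at the head).
    std::queue<int> q;
    for (int v : {3, 5, 8, 0, 7}) q.push(v);  // insertions happen at the tail
    q.push(1);                                // 1 goes in right after 7

    std::cout << "dequeue: " << q.front() << "\n";  // 3 (removed from the head)
    q.pop();
    std::cout << "dequeue: " << q.front() << "\n";  // 5
    q.pop();

    // Stack: LIFO. Push 7, 0, 8, 5, 3 so that 3 ends up on top.
    std::stack<int> s;
    for (int v : {7, 0, 8, 5, 3}) s.push(v);
    s.push(9);                                // 9 now sits above 3

    std::cout << "pop: " << s.top() << "\n";  // 9 (the last element added)
    s.pop();
    std::cout << "pop: " << s.top() << "\n";  // 3
    s.pop();
    std::cout << "pop: " << s.top() << "\n";  // 5
    s.pop();
}
```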
These two data structures, queue and stack, are used very widely in computer science, and in fact we'll use the notion of a stack right away when we discuss processes, later in this orientation lecture itself.

Speaking of processes, let's discuss a process next. A process is essentially a program in action. You may write a program in a programming language such as C or C++; an example is shown here. This example code consists of a main function, which calls a function called f1. f1 itself has a local integer variable x, and then f1 calls a different function, f2. This is shown pictorially here in the circle, where main calls f1 and f1 calls f2. This is the code that you would write; you would then compile the code and execute it. When you're executing it, when your program is in action, that is a process. So a process consists of your code, but also of several other elements that help your code execute on the computer on which you are running it. We are going to discuss some elements of a process that are critical. Here are some of these elements.

First, let's look at the code. The code itself is static: once you write it, nothing changes it (we are not considering the variable values to be part of the code; I'll talk about those in a little bit). But there is a program counter, typically maintained by the computer on which your process is running, which points to the line number inside the code where the program, or rather the process, is currently executing. This is called the Program Counter, or PC, and it is an important part of the process state. If you were to take a core dump, or stop the process right now and take a snapshot of that process, that snapshot must include the program counter, because it says where in the code the execution currently is.

Next, when functions call each other, or when methods call each other in an object-oriented program, they use a stack. So every process contains a stack. More specifically, a process might contain multiple threads, and each thread contains its own stack; but let's say there is one thread in this process, so the process contains one stack. This stack is used by the functions or methods to pass arguments and return result values. For instance, when main calls f1, main pushes the arguments to f1 on top of the stack; when f1 starts executing, it pops (removes) those elements from the top of the stack and uses them to execute. Similarly, when f1 calls f2, f1 pushes the arguments for f2 on top of the stack, and then f2 pops them, executes, and pushes its result values on top of the stack. The result values are then popped by f1; this is the way f1 gets back the final int return value from f2. And finally, when f1 needs to return a result value to main, it pushes it on top of the stack; when execution returns to main, main pops it from the top of the stack, and main has the final return value from f1.
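For concreteness, here is a minimal sketch of the slide's example in C++. The call structure (main calls f1, f1 has a local int x and calls f2) is from the lecture; the function bodies and the specific values are made up, since the slide only shows the structure.

```cpp
#include <iostream>

int f2(int arg) {
    return arg * 2;               // result pushed on the stack for f1 to pop
}

int f1() {
    int x = 21;                   // local variable x, in f1's stack frame
    return f2(x);                 // argument to f2 passed via the process stack
}

int main() {
    int result = f1();            // main pops f1's return value off the stack
    std::cout << result << "\n";  // prints 42
    return 0;
}
```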
So the stack is an important part of the process state as well, because it tells you where in the program execution you are with respect to functions calling each other.

Finally, functions have local variables, like x. There might also be global variables, and of course in object-oriented programs you have objects, which store many fields. Much of this data is stored in what is known as a heap. A heap essentially holds data that was created by functions, methods, or objects: for instance, while f1 is executing, an object that it allocates is created inside the heap, and when f1 is done with that object, it is removed from the heap. There are also registers, and I'll say more about registers when we discuss computer architecture, but registers hold some of the recent values that have been accessed by the process, P1 in this case.

Speaking of computer architecture, let's discuss a simplified version of it. Computer architectures today can be very complicated, but typically, when we think of how a process executes on a computer architecture, we can use a simplified view. There is a CPU, which executes the actual instructions present in your code. There is also a bunch of registers co-located with the CPU; these are small pieces of memory that the CPU can access very quickly, and typically there is only a small number of them, not more than a few tens. There is also a cache, which is a slightly larger memory than the collection of registers; and the larger a piece of memory is, the slower it is to access. So accessing the cache is slower than accessing registers, but it is still pretty fast. Beyond the cache there is the main memory, or Random Access Memory (RAM), which is even larger than the cache and therefore slower, but still fast. And finally there is the hard disk, or if you like a solid-state drive or flash drive, which holds a lot more data than main memory but is, of course, much slower. So as you go up from disk to memory to cache to registers, the speed increases and the total amount of storage space decreases.
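The speed gap between the levels of this hierarchy is easy to observe for yourself. The following small experiment is not from the lecture, just an illustration: it sums the same matrix twice, once walking memory sequentially (cache-friendly) and once jumping across it with a large stride (cache-unfriendly). On most machines the second loop is noticeably slower, even though both perform the same arithmetic.

```cpp
#include <chrono>
#include <iostream>
#include <vector>

int main() {
    const int n = 4096;  // matrix size chosen arbitrarily for illustration
    std::vector<int> m(static_cast<size_t>(n) * n, 1);

    auto t0 = std::chrono::steady_clock::now();
    long long rowSum = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            rowSum += m[static_cast<size_t>(i) * n + j];  // consecutive addresses
    auto t1 = std::chrono::steady_clock::now();

    long long colSum = 0;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            colSum += m[static_cast<size_t>(i) * n + j];  // jumps n elements per step
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::milliseconds;
    std::cout << "row-major:    "
              << std::chrono::duration_cast<ms>(t1 - t0).count()
              << " ms (sum " << rowSum << ")\n"
              << "column-major: "
              << std::chrono::duration_cast<ms>(t2 - t1).count()
              << " ms (sum " << colSum << ")\n";
}
```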
When you write a program in a language such as C++, Java, or C and compile it, it gets compiled to low-level machine instructions. These machine instructions might be specific to the architecture of the machine on which you are running, or they might be code for a virtual machine, like the JVM. In any case, these low-level machine instructions are the executable version of your program, and they are stored in the file system on your disk.

When your program starts executing, when it becomes a process, the CPU loads instructions in batches or, in a more simplified view, one by one. It loads them into memory, and then into the cache and into registers. Typically, the cache and the registers contain the last few accessed instructions; they follow what is known as the least-recently-used principle. As it executes each instruction, the CPU loads the data necessary for that instruction into memory and then, if necessary, into the cache and the registers. If any changes happen to a variable like x, these are stored first into the cache and then into memory. Once in a while, you may need to store something on disk so that you have permanent storage, so that if the process goes offline, you still have the data. For instance, the program might open a file and write something into it; in that case, you may want to write those updates to the file onto the disk. Of course, this is a very highly simplified picture, and computer architectures can be far more complex than this today, but for our practical purposes in this course, and as far as explaining how processes work, it will suffice.

Next, we move on to a few mathematical topics. The first is Big O notation, one of the most basic ways of analyzing algorithms. When I give you an algorithm and ask you to analyze it mathematically, I essentially expect you to come up with at least a Big O analysis of that algorithm. Big O notation describes an upper bound on the behavior of an algorithm as some variable is scaled, or increased in value, all the way to infinity. Typically, this variable is the size of the input, perhaps the number of input elements; you'll see a few examples of this. Essentially, a Big O analysis tells us how the algorithm itself scales as the size of the input is increased. This analyzes the running time or, sometimes, another performance metric. Most of the time we analyze running time; sometimes, as you'll see in the course, we'll analyze things like the number of messages, or even the bandwidth exchanged in the total network. The important thing about Big O notation is that it captures worst-case performance. This is not expected performance or best-case performance; Big O is worst-case performance: given this algorithm, what is the worst it can do in terms of the performance metric we're measuring? Essentially, when I say an algorithm A is order of foo, written O(foo), this means that algorithm A takes less than c*foo time to complete (assuming time is the metric we are measuring), for some fixed constant c, beyond some input size N. Typically the foo in this definition is some function of the input size N. For instance, I could say that an algorithm is order N, O(N), which means it takes less than c*N time to complete, for some constant c, beyond some input size N. Similarly, an algorithm might be order N^2, O(N^2), which means it is quadratic in time: it takes time less than c*N^2 for some constant c. Once again, when we say order N or order N^2, notice that we do not include the constant c in the definition itself. This is because the constant is irrelevant, as long as there is some fixed constant. That is good enough for us as far as Big O notation is concerned, so we do not state the constants in Big O notation.
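Written out formally, the definition above says:

```latex
% A is O(f(N)) if there exist a fixed constant c and an input size N_0
% such that A's running time T_A(N) is bounded by c * f(N) beyond N_0:
\[
  A \text{ is } O(f(N)) \iff \exists\, c > 0,\ N_0 :\quad
  T_A(N) < c \cdot f(N) \quad \text{for all } N > N_0.
\]
% Example: if T_A(N) = 3N + 2, then A is O(N),
% since 3N + 2 < 4N whenever N > 2 (take c = 4, N_0 = 2).
```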
So let's see a couple of examples. First, I give you an unsorted list of, say, integers, and I ask you to search for a specific element in the list. This essentially means that you have to iterate through the list one item at a time, and search for the element by trying to match it against each element present in the list. What is the worst-case performance? That's what we need for Big O notation, right? The worst case happens either when the element is not in the list at all, so you return false, or when the element is the very last one in the list, the very last one that you match against. Therefore, the worst case involves N operations, where N is the size of the list, and the number of operations is less than c*N with c = 2. So, assuming each operation takes one unit of time, both the time taken by this algorithm, which iterates through the list, and its number of operations are order N. That's why we see O(N) here.

Let's take a different example to see how Big O notation works. This one deals with insertion sorting of an unsorted list. Suppose I give you a list of elements, say integers, that is unsorted, and I ask you to sort them, say in increasing order. The insertion sort algorithm works as follows. First, you create a new list that is empty. Then you iterate through the elements in the original unsorted list, remove one element at a time, and insert that element into the new list at the appropriate position; essentially, you search through the new list linearly for the right position where this element should go. If you do this, the first element takes one operation to insert: one insert operation. The second element, in the worst case, takes two operations to insert: one operation for the comparison with the one element already present in the new sorted list, and one more operation to insert. Imagine the worst case, where this second element is inserted right at the very end of the list. Similarly, the third element takes three operations to insert: the first two compare it with the two already-present elements, and the third operation inserts it at the very end of the list. If you go on like this, the i-th element takes i operations to insert, and you can calculate that the total time required to insert all N elements is 1 + 2 + 3 + ... + N. This is a well-known sum that comes to N(N+1)/2, which is at most 1*N^2. So, with a value of c equal to one, this insertion sort example is an algorithm that takes order N^2 in time, and also order N^2 in number of operations.
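Here are both examples as code: a minimal C++ sketch of the linear search and of insertion sort exactly as described above, building a new sorted list by linear scanning. The helper names are mine, not from the lecture.

```cpp
#include <iostream>
#include <vector>

// Linear search through an unsorted list. The worst case (element absent, or
// matching only the very last element) examines all N elements: O(N).
bool contains(const std::vector<int>& list, int target) {
    for (int v : list)
        if (v == target) return true;
    return false;
}

// Insertion sort as described above: build a new, sorted list by inserting
// each element at its appropriate position, found by a linear scan. The i-th
// insertion costs up to i operations, so the total is 1+2+...+N = N(N+1)/2,
// which is O(N^2).
std::vector<int> insertionSort(const std::vector<int>& unsorted) {
    std::vector<int> sorted;
    for (int v : unsorted) {
        auto pos = sorted.begin();
        while (pos != sorted.end() && *pos < v) ++pos;  // find the right spot
        sorted.insert(pos, v);                          // insert there
    }
    return sorted;
}

int main() {
    std::vector<int> list = {3, 5, 8, 0, 7};
    std::cout << "contains 8: " << contains(list, 8) << "\n";  // prints 1
    for (int v : insertionSort(list)) std::cout << v << " ";   // 0 3 5 7 8
    std::cout << "\n";
}
```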
Next, we look at a little bit of basic probability. Before probability, we need to discuss a few definitions. The first definition is a set. A set is essentially a collection of things. For instance, a set S might be the set of all humans who live in this world at this current moment. A subset is another set, a collection of things that is a part of the larger set. So I might describe a set S2 as the set of all humans who currently live in Europe. S2 is a subset of S, because every element present in S2 is also present in S, but the reverse is not necessarily true.

Okay, so now, probability. Any event has a probability of happening. Let me give you an example. Suppose you wake up at a random hour of the day, say because you're jet-lagged, and I ask you: what is the probability that the time is between 10 a.m. and 11 a.m.? Let's try to calculate this. Well, there are 24 hours in a day, so you have a set that consists of 24 elements, one for each hour: 12 a.m., 1 a.m., 2 a.m., and so on through 10 a.m. and 11 a.m., all the way to 11 p.m. When I say an element like 1 a.m., I mean the hour between 1:00 a.m. and 1:59 a.m.; similarly, 10 a.m. means 10:00 a.m. through 10:59 a.m. Out of this set of 24 elements, you pick one at random. The probability that you pick 10 a.m. is 1 divided by 24: you are picking one element at random, there are 24 elements in the set, and you have to be lucky enough to pick that one special element, 10 a.m. So, if you wake up at a random hour of the day, the probability that the time is between 10 a.m. and 11 a.m. is 1/24.

Now, there might be multiple events, and you might need to calculate the probability of conjunctions or disjunctions of these events; I'll describe what these are soon. Say E1 is one event and E2 is another event, and E1 and E2 are independent of each other. This means that E1 does not influence E2, and E2 does not influence E1. Then the probability that E1 and E2 both happen is the multiplication of their probabilities: you take the probability of E1, multiply it by the probability of E2, and that gives you the probability that both E1 and E2 are true.

Let's discuss an example of this. Suppose you have three shirts, colored blue, green, and red, and when you wake up at that random hour you also blindly pick a shirt: you put your hand into your closet and pull out a shirt without looking. What is the probability that you woke up between 10 a.m. and 11 a.m. and that you picked the green shirt to wear? Well, the probability of the first event, that you woke up between 10 and 11, is 1/24, as we calculated on the previous slide. The probability that you pick the green shirt is one out of three, because there are three shirts and you have to be lucky enough to pick the green one. So the probability that both these events are true, that you woke up between 10 and 11 and that you wore the green shirt, is the multiplication of the probabilities of those events: (1/24)*(1/3) = 1/72. One of the things you have to be careful about here is that if E1 and E2 are not independent, meaning they influence each other in some way, then you cannot multiply their probabilities like this. For instance, if the shirts in your closet change from one hour to another, because, say, someone in your house moves them around, then you can't necessarily multiply these probabilities as we did here.

A different thing that we need to do with two event probabilities is to OR them. If I give you two events E1 and E2 and I ask you the probability of either E1 or E2 happening, then Prob(E1 OR E2) = Prob(E1) + Prob(E2) - Prob(E1 AND E2); this last term is the intersection probability. If you do not know the probability of E1 AND E2, then you can write the inequality Prob(E1 OR E2) <= Prob(E1) + Prob(E2), the sum of the constituent probabilities. Sometimes we use this inequality to upper-bound the probability of a disjunction, as we see here.
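The two rules used above, written out:

```latex
% Independent events multiply, as in the wake-up/green-shirt example:
\[
  P(E_1 \cap E_2) = P(E_1) \cdot P(E_2),
  \qquad \text{e.g.}\quad \tfrac{1}{24} \cdot \tfrac{1}{3} = \tfrac{1}{72}.
\]
% Disjunction (inclusion-exclusion), and the upper bound that holds
% even when the intersection probability is unknown:
\[
  P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)
  \;\le\; P(E_1) + P(E_2).
\]
```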
Okay, let's discuss a few miscellaneous topics. The first miscellaneous topic is the Domain Name System, or DNS. This is a collection of servers spread throughout the world, and the DNS is very critical to the operation of the web. Typically, the input to the DNS system is a URL, say coursera.org. The URL is a name: a human-readable string that uniquely identifies an object. Typically, when you go to your browser, whether it's Safari, Chrome, Internet Explorer, Netscape, Firefox, or whatever your favorite browser is, you might enter a URL like coursera.org. The moment you hit Enter, your browser goes off and contacts the DNS system and gives it the name in this URL, coursera.org. What does the DNS give back to your browser? It gives back the IP address of a web server that hosts that content, so that your browser can then send a request to that IP address and fetch the actual content of that web page. The IP address is an ID: a unique string that points to that particular object, for instance the coursera.org web page, and the ID itself may not be human-readable. So, essentially, DNS is a system that translates a human-readable name, such as a URL, into a unique ID, which may or may not be human-readable, such as an IP address. In this case, the IP address may refer either to the actual web server storing the content, or to an indirection server, such as a content distribution network (CDN) server, for instance an Akamai server that hosts or caches the content on your behalf.

Next, graphs. When we say graphs in computer science, we don't necessarily mean plots with x and y axes. A graph is essentially a network. Here I am showing you a graph of some cities, Amsterdam, Boston, Chicago, Delhi, and Edmonton, and the travel times between those cities by air, rounded up to the nearest hour. Traveling from Amsterdam to Boston takes about six hours; traveling from Chicago to Delhi takes about fourteen hours. Say these are the flights that are available on some particular airline. So you see cities, connections between pairs of cities, and values along those connections, which are the number of hours it takes to travel between those cities. These all have names in the world of graphs. Each city is a node, or vertex. Each connection between a pair of nodes is called an edge, and the value along an edge is called the edge weight. Also, when I say that A is adjacent to B, this means that A has an edge that connects it to B directly. Typically, an edge only goes between two nodes, or two vertices; an edge does not cross three nodes. So here, for instance, I have five vertices, and how many edges do I have? I have an A-E edge, an A-B edge, a B-E edge, a B-C edge, and a C-D edge, so I also have five edges, each with its own edge weight.
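Here is one common way to store a graph like this in code, as a weighted adjacency list in C++. Note that the lecture gives only two of the five edge weights (Amsterdam-Boston, 6 hours; Chicago-Delhi, 14 hours); the other three weights below are made up for illustration.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main() {
    // Each city (node) maps to a list of (neighbor, travel hours) pairs.
    std::map<std::string, std::vector<std::pair<std::string, int>>> graph;

    auto addEdge = [&](const std::string& a, const std::string& b, int hours) {
        graph[a].push_back({b, hours});  // undirected: store the edge both ways
        graph[b].push_back({a, hours});
    };

    addEdge("Amsterdam", "Boston", 6);    // weight given in the lecture
    addEdge("Chicago", "Delhi", 14);      // weight given in the lecture
    addEdge("Amsterdam", "Edmonton", 8);  // hypothetical weight
    addEdge("Boston", "Edmonton", 5);     // hypothetical weight
    addEdge("Boston", "Chicago", 3);      // hypothetical weight

    // "Adjacent" means directly connected by an edge: list each node's neighbors.
    for (const auto& [city, neighbors] : graph) {
        std::cout << city << " is adjacent to:";
        for (const auto& [other, hours] : neighbors)
            std::cout << " " << other << " (" << hours << "h)";
        std::cout << "\n";
    }
}
```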
So, in this lecture we have covered basic data structures, such as the queue and the stack. We have seen that processes are programs in action, and that they consist of multiple different pieces. Processes, of course, run on a computer architecture, and it's important for you to know at least a simplified version of that architecture, so that you know what is operating where when you run a process. We've seen some mathematical topics, such as Big O notation, which analyzes the worst-case performance of any given algorithm, and then some basic probability. And finally, we have seen a few miscellaneous topics. I hope you keep coming back to this lecture as a reference, whenever you need it, throughout the course.