[Script Info] Title: [Events] Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text Dialogue: 0,0:00:00.20,0:00:01.66,Default,,0000,0000,0000,,Welcome to Chapter Seven. Dialogue: 0,0:00:01.66,0:00:03.65,Default,,0000,0000,0000,,Python for Informatics: Exploring\NInformation. Dialogue: 0,0:00:03.65,0:00:04.53,Default,,0000,0000,0000,,I'm Charles Severance. Dialogue: 0,0:00:04.53,0:00:09.72,Default,,0000,0000,0000,,I'm the author of the book and your host.\NAnd, as always, this is brought to you by. Dialogue: 0,0:00:09.72,0:00:10.41,Default,,0000,0000,0000,,No, I'm sorry. Dialogue: 0,0:00:10.41,0:00:14.65,Default,,0000,0000,0000,,It's all creative copyright, Creative\NCommons Attribution. Dialogue: 0,0:00:14.65,0:00:18.68,Default,,0000,0000,0000,,The audio, the video, the slides, and even\Nthe book. Dialogue: 0,0:00:18.68,0:00:21.08,Default,,0000,0000,0000,,So, here we go. Dialogue: 0,0:00:21.08,0:00:25.42,Default,,0000,0000,0000,,Oh, and so, frankly, where\Nwe've been working Dialogue: 0,0:00:25.42,0:00:34.28,Default,,0000,0000,0000,,all along is, we have been writing code\Nand talking to the CPU. Dialogue: 0,0:00:34.28,0:00:37.48,Default,,0000,0000,0000,,Hang on, let me, let me go get\Nmy CPU and stuff. Dialogue: 0,0:00:37.48,0:00:42.15,Default,,0000,0000,0000,,Hang on, be right back. Dialogue: 0,0:00:44.15,0:00:49.67,Default,,0000,0000,0000,,[SOUND]\NOkay. Dialogue: 0,0:00:49.67,0:00:53.73,Default,,0000,0000,0000,,Here we go. Here we go. Dialogue: 0,0:00:53.73,0:00:58.83,Default,,0000,0000,0000,,Here's all the stuff. Remember the stuff\Nfrom the first lecture? Dialogue: 0,0:01:00.98,0:01:01.71,Default,,0000,0000,0000,,There we go with that. Dialogue: 0,0:01:02.86,0:01:05.84,Default,,0000,0000,0000,,Remember the motherboard from the first\Nlecture? Dialogue: 0,0:01:05.84,0:01:08.10,Default,,0000,0000,0000,,This is kind of the picture of what's on\Nthe screen.
Dialogue: 0,0:01:08.80,0:01:12.16,Default,,0000,0000,0000,,The motherboard, the CPU plugs in here,\Nmemory plugs in here. Dialogue: 0,0:01:12.16,0:01:18.12,Default,,0000,0000,0000,,And remember how the CPU is sort of the\Nbrains, as Dialogue: 0,0:01:18.12,0:01:22.79,Default,,0000,0000,0000,,much brains as there is, for the operation.\NThe CPU is asking what next. Dialogue: 0,0:01:22.79,0:01:26.24,Default,,0000,0000,0000,,The instructions come in through these\Nlittle pins. Dialogue: 0,0:01:26.24,0:01:30.13,Default,,0000,0000,0000,,There's data inside, and it stores sort of\Nsemi-permanent Dialogue: 0,0:01:30.13,0:01:33.48,Default,,0000,0000,0000,,data, variables, are all stored pretty\Nmuch here in RAM. Dialogue: 0,0:01:34.82,0:01:37.97,Default,,0000,0000,0000,,And we write our programs, and so your\NPython programs, they're sitting here Dialogue: 0,0:01:37.97,0:01:44.14,Default,,0000,0000,0000,,in this RAM, and they're being fed to this\NCPU through those chips. Dialogue: 0,0:01:44.14,0:01:45.28,Default,,0000,0000,0000,,Through those pins, right? Dialogue: 0,0:01:45.28,0:01:47.85,Default,,0000,0000,0000,,The pins, I mean it doesn't really connect\Nlike that. Dialogue: 0,0:01:47.85,0:01:51.85,Default,,0000,0000,0000,,And so, so frankly, up to now, everything\Nthat we've been doing Dialogue: 0,0:01:51.85,0:01:54.52,Default,,0000,0000,0000,,is just the Python programming language. Dialogue: 0,0:01:54.52,0:01:58.21,Default,,0000,0000,0000,,And so the only place we've really been\Noperating is here. Dialogue: 0,0:01:59.65,0:02:02.80,Default,,0000,0000,0000,,We have been putting Python into the main\Nmemory. Dialogue: 0,0:02:02.80,0:02:05.87,Default,,0000,0000,0000,,And the main memory. And we have Dialogue: 0,0:02:05.87,0:02:09.60,Default,,0000,0000,0000,,been effectively feeding instructions to\Nthe CPU, Dialogue: 0,0:02:09.60,0:02:14.08,Default,,0000,0000,0000,,the central processing unit, as it needed\Nthem, and then the program would stop. 
Dialogue: 0,0:02:14.08,0:02:15.58,Default,,0000,0000,0000,,And everything we've done so far Dialogue: 0,0:02:15.58,0:02:17.26,Default,,0000,0000,0000,,everything Dialogue: 0,0:02:17.26,0:02:22.29,Default,,0000,0000,0000,,is just sort of fiddling around here.\NWe have never escaped it. Dialogue: 0,0:02:22.29,0:02:25.50,Default,,0000,0000,0000,,So now we are finally going to escape Dialogue: 0,0:02:25.50,0:02:28.16,Default,,0000,0000,0000,,from the central processing unit and the\Nmemory. Dialogue: 0,0:02:29.18,0:02:31.72,Default,,0000,0000,0000,,We'll still write programs and have\Nvariables in here. Dialogue: 0,0:02:32.92,0:02:38.66,Default,,0000,0000,0000,,But now we're going to use the disk,\Nthe secondary storage, the Dialogue: 0,0:02:38.66,0:02:44.30,Default,,0000,0000,0000,,permanent media, right?\NSo if I go grab my Raspberry Pi, Dialogue: 0,0:02:44.30,0:02:46.00,Default,,0000,0000,0000,,alright, that goes right there. Dialogue: 0,0:02:46.00,0:02:51.29,Default,,0000,0000,0000,,Here's my Raspberry Pi, so here we've got\Nthe Raspberry Pi, which is the small version, Dialogue: 0,0:02:51.29,0:02:55.99,Default,,0000,0000,0000,,which of course has a CPU, memory, and Dialogue: 0,0:02:55.99,0:02:58.77,Default,,0000,0000,0000,,graphics processor, all in this little chip\Nright here. Dialogue: 0,0:02:58.77,0:03:02.85,Default,,0000,0000,0000,,But the secondary memory for the,\Nis this little Dialogue: 0,0:03:02.85,0:03:05.71,Default,,0000,0000,0000,,SD card that is the secondary memory for\NRaspberry Pi. Dialogue: 0,0:03:05.71,0:03:07.50,Default,,0000,0000,0000,,So the structure of the Raspberry Pi is Dialogue: 0,0:03:07.50,0:03:09.30,Default,,0000,0000,0000,,exactly the same as the structure\Nof any other Dialogue: 0,0:03:09.30,0:03:13.37,Default,,0000,0000,0000,,personal computer, it's just smaller and\Nless expensive. 
Dialogue: 0,0:03:13.37,0:03:14.63,Default,,0000,0000,0000,,And so in the Raspberry Pi, if you're Dialogue: 0,0:03:14.63,0:03:17.78,Default,,0000,0000,0000,,programming the Raspberry Pi, you're sort\Nof finally escaping. Dialogue: 0,0:03:17.78,0:03:19.55,Default,,0000,0000,0000,,All your programs were in here. Dialogue: 0,0:03:19.55,0:03:24.35,Default,,0000,0000,0000,,Your CPU is in here and that's pretty much\Nhow, how far you've got to run. Dialogue: 0,0:03:24.35,0:03:28.73,Default,,0000,0000,0000,,But now, of course when you save your files,\Nyou save them to here. Dialogue: 0,0:03:28.73,0:03:34.62,Default,,0000,0000,0000,,But now we are going to start looking at\Ndata on the disk drive and so it's time Dialogue: 0,0:03:34.62,0:03:38.88,Default,,0000,0000,0000,,to escape to the secondary memory.\NOkay? Dialogue: 0,0:03:38.88,0:03:41.00,Default,,0000,0000,0000,,Time to escape to the secondary memory. Dialogue: 0,0:03:41.00,0:03:43.93,Default,,0000,0000,0000,,And Raspberry Pi, you can go right there.\NOkay? Dialogue: 0,0:03:43.93,0:03:45.79,Default,,0000,0000,0000,,So it's time to find some data to mess\Nwith. Dialogue: 0,0:03:45.79,0:03:48.69,Default,,0000,0000,0000,,So a lot of what we've been doing so far\Nis just Dialogue: 0,0:03:48.69,0:03:52.74,Default,,0000,0000,0000,,kind of the pre-work to get to the point\Nwhere we can do this. Dialogue: 0,0:03:52.74,0:03:54.60,Default,,0000,0000,0000,,And in here we're going to have data files. Dialogue: 0,0:03:54.60,0:03:55.76,Default,,0000,0000,0000,,Now, we've been making data files. Dialogue: 0,0:03:55.76,0:04:00.01,Default,,0000,0000,0000,,You've been writing, every Python program\Nthat you write on your computer gets saved Dialogue: 0,0:04:00.01,0:04:03.18,Default,,0000,0000,0000,,as a file. Then Python reads the file and runs it. Dialogue: 0,0:04:04.38,0:04:07.20,Default,,0000,0000,0000,,But now we're actually going to start\Nmessing with some data. 
Dialogue: 0,0:04:09.06,0:04:11.53,Default,,0000,0000,0000,,And so, files are where we're going to be\Nworking. Dialogue: 0,0:04:11.53,0:04:16.75,Default,,0000,0000,0000,,And so, one of the things about secondary memory\Nis it's much larger. Dialogue: 0,0:04:18.78,0:04:21.48,Default,,0000,0000,0000,,And this is, main memory of the computer\Nis pretty large, it's just Dialogue: 0,0:04:21.48,0:04:26.09,Default,,0000,0000,0000,,not large enough to hold everything that\Nthe computer is capable of holding. Dialogue: 0,0:04:26.09,0:04:28.01,Default,,0000,0000,0000,,So the files that we're going to work with. Dialogue: 0,0:04:28.01,0:04:32.23,Default,,0000,0000,0000,,Now we're not talking about image files or\NQuicktime movies or things like that. Dialogue: 0,0:04:32.23,0:04:34.39,Default,,0000,0000,0000,,We're going to work with text files\Nbecause the Dialogue: 0,0:04:34.39,0:04:37.54,Default,,0000,0000,0000,,theme of this course is digging through\Ntext. Dialogue: 0,0:04:37.54,0:04:39.09,Default,,0000,0000,0000,,Sometimes we'll pull it off the Internet. Dialogue: 0,0:04:39.09,0:04:42.17,Default,,0000,0000,0000,,Sometimes we'll read files, but it's\Ndigging through and Dialogue: 0,0:04:42.17,0:04:44.03,Default,,0000,0000,0000,,using all the things that we've learned so\Nfar, Dialogue: 0,0:04:44.03,0:04:46.04,Default,,0000,0000,0000,,looping and strings, and all those things, Dialogue: 0,0:04:46.04,0:04:49.40,Default,,0000,0000,0000,,to make sense of a sequence of\Ninformation. Dialogue: 0,0:04:50.52,0:04:51.54,Default,,0000,0000,0000,,Okay? Dialogue: 0,0:04:51.54,0:04:57.67,Default,,0000,0000,0000,,Now, to access file information, we have\Nto do this thing called opening the file. Dialogue: 0,0:04:57.67,0:05:02.40,Default,,0000,0000,0000,,We can't just say, yo, the information is\Njust omnipresent because there is Dialogue: 0,0:05:02.40,0:05:06.17,Default,,0000,0000,0000,,so much data that you can't have Python\Nsort of know all the data.
Dialogue: 0,0:05:06.17,0:05:09.22,Default,,0000,0000,0000,,You literally have hundreds of thousands\Nof files on Dialogue: 0,0:05:09.22,0:05:12.42,Default,,0000,0000,0000,,your computer's hard drive.\NAnd you, Dialogue: 0,0:05:12.42,0:05:13.50,Default,,0000,0000,0000,,which one are you going to read? Dialogue: 0,0:05:13.50,0:05:15.77,Default,,0000,0000,0000,,So there's a step that you have to do, Dialogue: 0,0:05:15.77,0:05:19.03,Default,,0000,0000,0000,,that you call this built-in function\Ncalled open. Dialogue: 0,0:05:19.03,0:05:21.88,Default,,0000,0000,0000,,And say, oh, this is the file that I\Nwant to work with, Dialogue: 0,0:05:21.88,0:05:23.85,Default,,0000,0000,0000,,of the hundreds of thousands, and then\Nonce you do, Dialogue: 0,0:05:23.85,0:05:27.51,Default,,0000,0000,0000,,you've kind of got this little\Nconnector into it. Dialogue: 0,0:05:27.51,0:05:31.52,Default,,0000,0000,0000,,And the open is a built-in function inside\NPython. Dialogue: 0,0:05:31.52,0:05:34.30,Default,,0000,0000,0000,,Hang on a sec, let's say good bye to that.\NThe open Dialogue: 0,0:05:34.30,0:05:39.68,Default,,0000,0000,0000,,function is a built-in function in Python,\Nand you, it takes two parameters. Dialogue: 0,0:05:39.68,0:05:45.81,Default,,0000,0000,0000,,The first parameter is the name of the\Nfile, like mbox.txt, Dialogue: 0,0:05:45.81,0:05:48.60,Default,,0000,0000,0000,,and then the second is how you're going to\Nread it. Dialogue: 0,0:05:48.60,0:05:49.27,Default,,0000,0000,0000,,Are you going to read it? Dialogue: 0,0:05:49.27,0:05:50.28,Default,,0000,0000,0000,,are you going to write it? et cetera. Dialogue: 0,0:05:50.28,0:05:53.19,Default,,0000,0000,0000,,Now most of the time we'll be reading our\Nfiles. Dialogue: 0,0:05:53.19,0:05:55.73,Default,,0000,0000,0000,,So we call the open function and pass it\Nin the name of Dialogue: 0,0:05:55.73,0:05:59.31,Default,,0000,0000,0000,,the file we want to open, and then how we\Nwant to read it. 
Dialogue: 0,0:05:59.31,0:06:02.30,Default,,0000,0000,0000,,Now you can leave this second parameter\Noff and it Dialogue: 0,0:06:02.30,0:06:04.63,Default,,0000,0000,0000,,assumes that you're going to want to read\Nthe file. Dialogue: 0,0:06:04.63,0:06:05.13,Default,,0000,0000,0000,,Now. Dialogue: 0,0:06:08.92,0:06:11.93,Default,,0000,0000,0000,,When the open is successful, it doesn't\Nactually read all Dialogue: 0,0:06:11.93,0:06:16.98,Default,,0000,0000,0000,,of the data because the memory is small,\Nsmall compared to Dialogue: 0,0:06:16.98,0:06:19.14,Default,,0000,0000,0000,,the hard drive, and so you have to sort of Dialogue: 0,0:06:19.14,0:06:22.18,Default,,0000,0000,0000,,step through the data, you'll tell it when\Nto read it. Dialogue: 0,0:06:22.18,0:06:26.70,Default,,0000,0000,0000,,So the act of opening it is not\Nactually reading all data. Dialogue: 0,0:06:26.70,0:06:30.51,Default,,0000,0000,0000,,It is creating kind of like a connection\Nbetween the Dialogue: 0,0:06:30.51,0:06:33.22,Default,,0000,0000,0000,,memory and the data that's on the hard\Ndrive, right? Dialogue: 0,0:06:33.22,0:06:34.47,Default,,0000,0000,0000,,It's connecting Dialogue: 0,0:06:34.47,0:06:38.45,Default,,0000,0000,0000,,between, oh listen to this.\NOh that's going to fall down. Dialogue: 0,0:06:38.45,0:06:42.01,Default,,0000,0000,0000,,Is it going to stand up that way? Dialogue: 0,0:06:42.01,0:06:45.33,Default,,0000,0000,0000,,Oh, I should come up with a way to\Nmake that stand. Dialogue: 0,0:06:46.37,0:06:48.08,Default,,0000,0000,0000,,So it's a connection. Dialogue: 0,0:06:48.08,0:06:50.29,Default,,0000,0000,0000,,So the, your program's kind of running in\Nhere. Dialogue: 0,0:06:50.29,0:06:53.91,Default,,0000,0000,0000,,And the, and the file handle is just sort\Nof it's Dialogue: 0,0:06:53.91,0:06:57.76,Default,,0000,0000,0000,,like a phone call between your memory and\Nyour disk drive. 
Dialogue: 0,0:06:57.76,0:07:00.15,Default,,0000,0000,0000,,It's not the actual data.\NThe actual data is still Dialogue: 0,0:07:00.15,0:07:06.01,Default,,0000,0000,0000,,sitting on the disk drive, okay?\NSo, a graphical way to take a look at this Dialogue: 0,0:07:06.01,0:07:11.68,Default,,0000,0000,0000,,is, the file handle, the thing that comes\Nback from the open request. Dialogue: 0,0:07:11.68,0:07:14.99,Default,,0000,0000,0000,,The open goes and finds the file out on\Nthe disk drive and Dialogue: 0,0:07:14.99,0:07:20.25,Default,,0000,0000,0000,,yada, yada, yada, and then the handle is\Nsomething that lives in the memory. Dialogue: 0,0:07:20.25,0:07:22.10,Default,,0000,0000,0000,,that is sort of like the thing that Dialogue: 0,0:07:22.10,0:07:25.83,Default,,0000,0000,0000,,maintains its connection to where all the\Ndata is Dialogue: 0,0:07:25.83,0:07:28.91,Default,,0000,0000,0000,,on the disk or on the SD RAM that's in it. Dialogue: 0,0:07:28.91,0:07:30.82,Default,,0000,0000,0000,,So the handle is not all the data, but it is Dialogue: 0,0:07:30.82,0:07:34.28,Default,,0000,0000,0000,,a mechanism that you can use to get at the\Ndata. Dialogue: 0,0:07:34.28,0:07:37.99,Default,,0000,0000,0000,,So if you print it out, it doesn't have\Nall the data from the file, Dialogue: 0,0:07:37.99,0:07:44.10,Default,,0000,0000,0000,,it says, I am a file handle that's opened\Nthis file and we're in read mode. Dialogue: 0,0:07:44.10,0:07:46.12,Default,,0000,0000,0000,,So, that doesn't actually have the data, Dialogue: 0,0:07:46.12,0:07:48.44,Default,,0000,0000,0000,,even though this is the data that's \Nin the file. Dialogue: 0,0:07:48.44,0:07:51.05,Default,,0000,0000,0000,,And then we have operations that we do to\Nthe handle like open it, Dialogue: 0,0:07:51.05,0:07:53.47,Default,,0000,0000,0000,,close it, read it, write it.\NSo we do things. 
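The point about the handle can be checked with a few lines of Python. This is only a minimal sketch, written for Python 3 (where print is a function) rather than the version used in the lecture. The file name mbox-demo.txt and its contents are made up so the example can run on its own; they are not the course's mbox.txt.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
# The name mbox-demo.txt and its two lines are invented for illustration.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'mbox-demo.txt')
with open(path, 'w') as out:
    out.write('From: alice@example.com\nSubject: hello\n')

# open() returns a handle; the second argument 'r' asks for read mode.
fhand = open(path, 'r')

# Printing the handle describes the connection (file name and mode).
# It does not show the data that is sitting in the file.
handle_text = repr(fhand)
print(handle_text)

# Closing the handle ends the "phone call" to the disk.
fhand.close()
```

Leaving off the second argument, as in open(path), also opens the file for reading.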
Dialogue: 0,0:07:53.47,0:07:56.37,Default,,0000,0000,0000,,So, so the handle and then through the\Nhandle it actually changes Dialogue: 0,0:07:56.37,0:07:58.86,Default,,0000,0000,0000,,what's on the disk or reads\Nwhat's on the disk. Dialogue: 0,0:07:58.86,0:08:01.56,Default,,0000,0000,0000,,So the handle is kind of a thing that's\Nnot there. Dialogue: 0,0:08:02.89,0:08:06.46,Default,,0000,0000,0000,,If you attempt to open a file and the name\Nof the file. Dialogue: 0,0:08:06.46,0:08:08.66,Default,,0000,0000,0000,,Now the way we're going to do these is\Nthese need to be Dialogue: 0,0:08:08.66,0:08:14.49,Default,,0000,0000,0000,,in the same folder on your computer as in,\Nas your Python code. Dialogue: 0,0:08:14.49,0:08:16.11,Default,,0000,0000,0000,,Now, there are trickier ways to do it, but Dialogue: 0,0:08:16.11,0:08:17.18,Default,,0000,0000,0000,,we're going to keep it simple. Dialogue: 0,0:08:17.18,0:08:18.67,Default,,0000,0000,0000,,This is the name of a file in the Dialogue: 0,0:08:18.67,0:08:21.59,Default,,0000,0000,0000,,same folder as the Python code that you're\Nrunning. Dialogue: 0,0:08:21.59,0:08:28.10,Default,,0000,0000,0000,,[SOUND] And if it's not, then we get, of\Ncourse, a traceback and we're Dialogue: 0,0:08:28.10,0:08:32.29,Default,,0000,0000,0000,,used to using, reading tracebacks by\Nnow, no such file or directory stuff.txt. Dialogue: 0,0:08:32.29,0:08:34.71,Default,,0000,0000,0000,,Oh, of course, I forgot to save it or I\Ntyped it wrong. Dialogue: 0,0:08:37.82,0:08:38.84,Default,,0000,0000,0000,,So. Dialogue: 0,0:08:38.84,0:08:42.66,Default,,0000,0000,0000,,The next thing we have to learn is the\Nnotion of the newline character. Dialogue: 0,0:08:42.66,0:08:44.39,Default,,0000,0000,0000,,You haven't seen this so far, Dialogue: 0,0:08:44.39,0:08:47.96,Default,,0000,0000,0000,,but there's a special character in files Dialogue: 0,0:08:47.96,0:08:52.03,Default,,0000,0000,0000,,that is used to indicate the end of a line. 
Dialogue: 0,0:08:52.03,0:08:53.78,Default,,0000,0000,0000,,Because these text files that we've been\Nwriting, Dialogue: 0,0:08:53.78,0:08:57.72,Default,,0000,0000,0000,,including Python programs that you have,\Nare organized into lines. Dialogue: 0,0:08:57.72,0:08:59.69,Default,,0000,0000,0000,,Each line has variable length and there is Dialogue: 0,0:08:59.69,0:09:02.87,Default,,0000,0000,0000,,a special non-printing character that you\Njust don't see. Dialogue: 0,0:09:02.87,0:09:05.84,Default,,0000,0000,0000,,Now you see it because you see a line, Dialogue: 0,0:09:05.84,0:09:10.71,Default,,0000,0000,0000,,multiple lines, but you don't see the\Ncharacter itself. Dialogue: 0,0:09:10.71,0:09:13.13,Default,,0000,0000,0000,,So it turns out that this character is\Nvery Dialogue: 0,0:09:13.13,0:09:15.69,Default,,0000,0000,0000,,important because the data is just a\Nstream of Dialogue: 0,0:09:15.69,0:09:18.85,Default,,0000,0000,0000,,characters on disk and then it's\Npunctuated by newlines Dialogue: 0,0:09:18.85,0:09:22.20,Default,,0000,0000,0000,,that tell it when it's time to end the\Nline. Dialogue: 0,0:09:22.20,0:09:29.37,Default,,0000,0000,0000,,So if we are building a string, the\Nconstant for newline is backslash n. Dialogue: 0,0:09:29.37,0:09:32.97,Default,,0000,0000,0000,,And so, when we make a string that we\Nwant to Dialogue: 0,0:09:32.97,0:09:38.38,Default,,0000,0000,0000,,have a newline in it, we'll say Hello\Nbackslash n World. Dialogue: 0,0:09:38.38,0:09:41.37,Default,,0000,0000,0000,,And then if you print it out one way, you\Nactually see the backslash n. Dialogue: 0,0:09:41.37,0:09:44.23,Default,,0000,0000,0000,,But then if you use the print to print it\Nout, you see sort of Dialogue: 0,0:09:44.23,0:09:49.58,Default,,0000,0000,0000,,like the, it moves back down, you know,\Nto the left margin and down. Dialogue: 0,0:09:49.58,0:09:55.90,Default,,0000,0000,0000,,So, so, sometimes you see the slash n\Nand sometimes it's shown as movement. 
Dialogue: 0,0:09:55.90,0:09:57.14,Default,,0000,0000,0000,,Right? You, it moves it. Dialogue: 0,0:09:58.67,0:10:00.00,Default,,0000,0000,0000,,The other thing that's important is even Dialogue: 0,0:10:00.00,0:10:02.13,Default,,0000,0000,0000,,though we represent this as two\Ncharacters, Dialogue: 0,0:10:02.13,0:10:06.30,Default,,0000,0000,0000,,the backslash n is represented as two characters\Nin a string, it's actually one character. Dialogue: 0,0:10:06.30,0:10:10.25,Default,,0000,0000,0000,,So if we print it out, we see\NX newline Y Dialogue: 0,0:10:10.25,0:10:13.28,Default,,0000,0000,0000,,and if we ask how many characters are\Nin stuff, Dialogue: 0,0:10:13.28,0:10:17.44,Default,,0000,0000,0000,,which is this string, it says 3.\NThat's important. Dialogue: 0,0:10:17.44,0:10:18.12,Default,,0000,0000,0000,,Okay? Dialogue: 0,0:10:18.12,0:10:22.07,Default,,0000,0000,0000,,There is one, two, three.\NThe newline is a single character. Dialogue: 0,0:10:22.07,0:10:26.89,Default,,0000,0000,0000,,This is a just a syntax that we use to\Nsort of encode a newline in a string. Dialogue: 0,0:10:27.89,0:10:28.39,Default,,0000,0000,0000,,Okay? Dialogue: 0,0:10:29.45,0:10:33.71,Default,,0000,0000,0000,,So, even though these are just a Dialogue: 0,0:10:33.71,0:10:36.59,Default,,0000,0000,0000,,long sequence of characters punctuated by\Nnewlines, Dialogue: 0,0:10:36.59,0:10:40.93,Default,,0000,0000,0000,,visually, text editors and operating\Nsystems show them, show Dialogue: 0,0:10:40.93,0:10:43.95,Default,,0000,0000,0000,,these files to us as a sequence of lines. Dialogue: 0,0:10:43.95,0:10:46.28,Default,,0000,0000,0000,,And it doesn't take very long to just\Nstart thinking about them Dialogue: 0,0:10:46.28,0:10:47.65,Default,,0000,0000,0000,,as a sequence of lines. Dialogue: 0,0:10:47.65,0:10:50.57,Default,,0000,0000,0000,,As a matter of fact, maybe you never, wish\NI'd never told you about newlines. 
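The newline behavior described above is easy to try out. A minimal sketch, written for Python 3; the variable names stuff and length are chosen here for illustration, and repr() stands in for "printing it one way" so that the escape stays visible.

```python
# A string with a newline in the middle, written with the backslash-n syntax.
stuff = 'Hello\nWorld'
print(repr(stuff))   # repr keeps the escape visible as backslash-n
print(stuff)         # print shows the newline as movement to the next line

# Backslash-n takes two keystrokes to type but is stored as one character.
stuff = 'X\nY'
print(stuff)
length = len(stuff)
print(length)        # counts X, the newline, and Y
```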
Dialogue: 0,0:10:51.83,0:10:53.08,Default,,0000,0000,0000,,But when we start reading files, we're Dialogue: 0,0:10:53.08,0:10:54.99,Default,,0000,0000,0000,,going to have to deal with these newlines. Dialogue: 0,0:10:54.99,0:10:59.26,Default,,0000,0000,0000,,So the way that we sort of have to\Nmentally visualize what these text Dialogue: 0,0:10:59.26,0:11:03.98,Default,,0000,0000,0000,,files look like is they have a newline\Nthat punctuates the end of the line. Dialogue: 0,0:11:03.98,0:11:09.20,Default,,0000,0000,0000,,Now in reality, if we look at this, this\NR really comes right after it. Dialogue: 0,0:11:09.20,0:11:09.44,Default,,0000,0000,0000,,Right? Dialogue: 0,0:11:09.44,0:11:13.41,Default,,0000,0000,0000,,This is all a bunch of characters and the\Nnewlines are punctuation, okay? Dialogue: 0,0:11:13.41,0:11:16.72,Default,,0000,0000,0000,,To say this is first line, second line,\Nthird line, and fourth line. Dialogue: 0,0:11:16.72,0:11:18.73,Default,,0000,0000,0000,,So, you gotta think that each of these\Nthings Dialogue: 0,0:11:18.73,0:11:21.71,Default,,0000,0000,0000,,is here, sitting at the end of the line. Dialogue: 0,0:11:21.71,0:11:24.95,Default,,0000,0000,0000,,And so the number of characters in this\Nline include that newline. Dialogue: 0,0:11:24.95,0:11:26.93,Default,,0000,0000,0000,,Now the newline is one character. Dialogue: 0,0:11:26.93,0:11:31.92,Default,,0000,0000,0000,,Okay? So, how do we read these files? Dialogue: 0,0:11:31.92,0:11:36.13,Default,,0000,0000,0000,,Well, we've already talked about doing an\Nopen xfile. Dialogue: 0,0:11:36.13,0:11:39.10,Default,,0000,0000,0000,,And I'm just, this xfile, again that's\Njust a mnemonic Dialogue: 0,0:11:39.10,0:11:41.84,Default,,0000,0000,0000,,name that I made up. This is a handle. Dialogue: 0,0:11:41.84,0:11:43.69,Default,,0000,0000,0000,,Remember, it's not all the data. Dialogue: 0,0:11:43.69,0:11:46.07,Default,,0000,0000,0000,,But the handle is the way that we can read\Nthe data.
Dialogue: 0,0:11:46.07,0:11:48.69,Default,,0000,0000,0000,,We can use it as an access point. Dialogue: 0,0:11:48.69,0:11:52.00,Default,,0000,0000,0000,,The coolest way to read a file, if it's a\Ntext file in multiple Dialogue: 0,0:11:52.00,0:11:58.27,Default,,0000,0000,0000,,lines, is to use a determinate loop, a\Nfor loop. for cheese in xfile. Dialogue: 0,0:11:58.27,0:12:03.09,Default,,0000,0000,0000,,So this, remember we would put a list of\Nnumbers or a string here. Dialogue: 0,0:12:03.09,0:12:04.15,Default,,0000,0000,0000,,Now we've put a file Dialogue: 0,0:12:04.15,0:12:05.20,Default,,0000,0000,0000,,handle here. Dialogue: 0,0:12:05.20,0:12:09.32,Default,,0000,0000,0000,,Python knows automatically that each time\Nwe are going to run this Dialogue: 0,0:12:09.32,0:12:11.98,Default,,0000,0000,0000,,loop, it's going to go to the next line of\Nthe file. Dialogue: 0,0:12:11.98,0:12:16.09,Default,,0000,0000,0000,,Automatically, for, cheese is just a\Nstupid name that I came up with. Dialogue: 0,0:12:16.09,0:12:20.23,Default,,0000,0000,0000,,It would be better to call it line rather than\Ncheese, but for cheese in and then it goes Dialogue: 0,0:12:20.23,0:12:22.81,Default,,0000,0000,0000,,dot, dot, dot, dot, dot, dot, dot,\Neach file Dialogue: 0,0:12:22.81,0:12:25.76,Default,,0000,0000,0000,,and then it stops when it reads\Nthe whole file. Dialogue: 0,0:12:25.76,0:12:29.27,Default,,0000,0000,0000,,So this line will print out every line Dialogue: 0,0:12:29.27,0:12:33.84,Default,,0000,0000,0000,,in the file, that's how you do it.\NThese three lines open a file, Dialogue: 0,0:12:35.51,0:12:42.40,Default,,0000,0000,0000,,read every line in the file, okay?\NSo a file handle itself is a special kind Dialogue: 0,0:12:42.40,0:12:47.17,Default,,0000,0000,0000,,of a sequence, much like a list of numbers\Nor a string is a sequence of characters.
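The three-line pattern described here can be sketched as follows. It is written for Python 3, and the demo file lines-demo.txt and the list seen are additions made only so the example runs on its own and its result can be inspected.

```python
import os
import tempfile

# Build a three-line demo file (name and contents are invented).
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'lines-demo.txt')
with open(path, 'w') as out:
    out.write('first line\nsecond line\nthird line\n')

# The three lines from the lecture: open, loop, print.
xfile = open(path)
seen = []                # extra bookkeeping, not part of the lecture's code
for cheese in xfile:     # each pass hands over the next line of the file
    seen.append(cheese)
    print(cheese)        # each line still carries its trailing newline
xfile.close()
```

The loop stops by itself once the last line has been read.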
Dialogue: 0,0:12:47.17,0:12:48.86,Default,,0000,0000,0000,,So one of the things we can do to combine\None of Dialogue: 0,0:12:48.86,0:12:51.93,Default,,0000,0000,0000,,our counting idioms is count the number of\Nlines in a file. Dialogue: 0,0:12:53.34,0:12:54.34,Default,,0000,0000,0000,,Okay? And so how we Dialogue: 0,0:12:54.34,0:12:56.97,Default,,0000,0000,0000,,would do that is we would open\Nthe file, set a Dialogue: 0,0:12:56.97,0:13:00.58,Default,,0000,0000,0000,,counter to zero, this time I'll use a\Nmnemonic variable called count. Dialogue: 0,0:13:00.58,0:13:02.95,Default,,0000,0000,0000,,For line in fhand, that says run this Dialogue: 0,0:13:02.95,0:13:05.74,Default,,0000,0000,0000,,indented text once for each line in the\Nfile. Dialogue: 0,0:13:05.74,0:13:08.41,Default,,0000,0000,0000,,For each line in the file, add count equals\Ncount plus 1. Dialogue: 0,0:13:08.41,0:13:10.76,Default,,0000,0000,0000,,When the for loop is done, print the\Ncount. Dialogue: 0,0:13:13.01,0:13:14.48,Default,,0000,0000,0000,,Pretty straightforward. Dialogue: 0,0:13:14.48,0:13:18.24,Default,,0000,0000,0000,,Very few other languages are capable of\Nwriting that program in Dialogue: 0,0:13:18.24,0:13:22.16,Default,,0000,0000,0000,,as quick and as dense and succinct a way as\NPython is. Dialogue: 0,0:13:22.16,0:13:25.08,Default,,0000,0000,0000,,Python does a really, really nice\Njob of this. Dialogue: 0,0:13:25.08,0:13:28.14,Default,,0000,0000,0000,,Okay? So that's how you count the lines. Dialogue: 0,0:13:28.14,0:13:31.25,Default,,0000,0000,0000,,Open it, write a for loop, and then add\None. Dialogue: 0,0:13:31.25,0:13:35.98,Default,,0000,0000,0000,,Now we, we can't just say, so what you\Ncan't do, and this gives you a sense. Dialogue: 0,0:13:35.98,0:13:37.12,Default,,0000,0000,0000,,You can't say len, Dialogue: 0,0:13:37.12,0:13:40.30,Default,,0000,0000,0000,,fhand. Dialogue: 0,0:13:40.30,0:13:42.62,Default,,0000,0000,0000,,And that's because this isn't really the\Ndata. 
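A minimal sketch of the counting idiom, written for Python 3. The demo file count-demo.txt is invented, and the try/except at the end is only there to show, without stopping the program, that len() does not work on a handle.

```python
import os
import tempfile

# Build a four-line demo file (name and contents are invented).
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'count-demo.txt')
with open(path, 'w') as out:
    out.write('one\ntwo\nthree\nfour\n')

# The counting idiom: open, set a counter to zero, add one per line.
fhand = open(path)
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)
fhand.close()

# len() cannot be applied to the handle, because the handle is not the data.
fhand = open(path)
try:
    len(fhand)
    len_worked = True
except TypeError:
    len_worked = False
fhand.close()
print('len(fhand) worked:', len_worked)
```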
Dialogue: 0,0:13:42.62,0:13:45.39,Default,,0000,0000,0000,,That's sort of, you have to like pull the,\Npull it Dialogue: 0,0:13:45.39,0:13:48.08,Default,,0000,0000,0000,,and read it to get the data out of it. Dialogue: 0,0:13:48.08,0:13:49.99,Default,,0000,0000,0000,,Although we'll see another way of reading\Nit later. Dialogue: 0,0:13:51.02,0:13:53.27,Default,,0000,0000,0000,,Okay? So that's counting the lines in a\Nfile. Dialogue: 0,0:13:55.10,0:13:57.46,Default,,0000,0000,0000,,It turns out you can also read the entire\Nfile. Dialogue: 0,0:13:58.98,0:14:02.10,Default,,0000,0000,0000,,Now if you read the entire file, it's not\Nbroken into lines. Dialogue: 0,0:14:02.10,0:14:04.00,Default,,0000,0000,0000,,You're getting all the characters\Npunctuated Dialogue: 0,0:14:04.00,0:14:06.32,Default,,0000,0000,0000,,by newlines and you get everything. Dialogue: 0,0:14:06.32,0:14:09.82,Default,,0000,0000,0000,,Now you don't want to read this if it's\Ntoo big, so it's Dialogue: 0,0:14:09.82,0:14:12.61,Default,,0000,0000,0000,,going to all try to read it into the\Nmemory of the computer. Dialogue: 0,0:14:12.61,0:14:16.08,Default,,0000,0000,0000,,And if the memory is not big enough,\Nyou're going to slow down to a crawl. Dialogue: 0,0:14:16.08,0:14:18.90,Default,,0000,0000,0000,,But if it's a real tiny file, this works\Njust fine. Dialogue: 0,0:14:18.90,0:14:22.12,Default,,0000,0000,0000,,And so, so we have sort of real, we open Dialogue: 0,0:14:22.12,0:14:26.99,Default,,0000,0000,0000,,a file and we say fhand.read, this is\Nbasically saying, hey, Dialogue: 0,0:14:26.99,0:14:30.84,Default,,0000,0000,0000,,dear fhand, read it all and return it to\Nme as a string. Dialogue: 0,0:14:31.95,0:14:34.35,Default,,0000,0000,0000,,So that's a string with all the lines of\Nthe file concatenated Dialogue: 0,0:14:34.35,0:14:38.79,Default,,0000,0000,0000,,together with newlines, which is actually\Nexactly what's in the file. 
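Reading a whole file in one go can be sketched like this, in Python 3. The file read-demo.txt and its contents are made up for illustration; the name inp follows the lecture. Because the result is an ordinary string, the usual string operations such as len() and slicing apply to it.

```python
import os
import tempfile

# Build a small demo file (name and contents are invented).
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'read-demo.txt')
contents = 'From: alice@example.com\nSubject: hi\n'
with open(path, 'w') as out:
    out.write(contents)

# read() returns the entire file as one string, newlines included.
fhand = open(path)
inp = fhand.read()
fhand.close()

print(len(inp))     # number of characters in the file, counting newlines
print(inp[:20])     # characters 0 up to, but not including, 20
```

This is fine for a small file; for a very large one, the line-by-line for loop avoids holding everything in memory at once.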
Dialogue: 0,0:14:38.79,0:14:39.80,Default,,0000,0000,0000,,It's the raw data. Dialogue: 0,0:14:39.80,0:14:42.40,Default,,0000,0000,0000,,That for loop sort of looks for the newline Dialogue: 0,0:14:42.40,0:14:44.41,Default,,0000,0000,0000,,and does all of the stuff\Nautomatically for us. Dialogue: 0,0:14:44.41,0:14:45.16,Default,,0000,0000,0000,,It's quite nice. Dialogue: 0,0:14:46.41,0:14:49.67,Default,,0000,0000,0000,,So then we can, like, because inp is a\Nstring at this point, Dialogue: 0,0:14:49.67,0:14:50.59,Default,,0000,0000,0000,,we can just print the length of it. Dialogue: 0,0:14:50.59,0:14:53.11,Default,,0000,0000,0000,,And we can say, oh, there's 94,626 Dialogue: 0,0:14:53.11,0:14:56.78,Default,,0000,0000,0000,,characters that came from that file. Dialogue: 0,0:14:56.78,0:15:01.91,Default,,0000,0000,0000,,It reads the whole thing, whole file,\Nreads the whole file. Dialogue: 0,0:15:01.91,0:15:04.45,Default,,0000,0000,0000,,We can also do things like, you know, slice\Nit now. Dialogue: 0,0:15:04.45,0:15:10.33,Default,,0000,0000,0000,,And so this is the first 20 characters,\Nup from zero up to, but not including, 20. Dialogue: 0,0:15:10.33,0:15:12.79,Default,,0000,0000,0000,,So this, this is our file. Okay? Dialogue: 0,0:15:12.79,0:15:15.64,Default,,0000,0000,0000,,So that's reading through the whole file. Dialogue: 0,0:15:15.64,0:15:18.39,Default,,0000,0000,0000,,So, let me go back a little bit, this is\Nthe file that we're Dialogue: 0,0:15:18.39,0:15:19.23,Default,,0000,0000,0000,,going to play with. Dialogue: 0,0:15:20.37,0:15:24.92,Default,,0000,0000,0000,,This file here that we're going to play\Nwith in this class is a mailbox file. Dialogue: 0,0:15:24.92,0:15:27.12,Default,,0000,0000,0000,,And this is actual real data.\NAnd these are real people. 
Dialogue: 0,0:15:27.12,0:15:28.89,Default,,0000,0000,0000,,And these are real dates, having to do\Nwith Dialogue: 0,0:15:28.89,0:15:31.68,Default,,0000,0000,0000,,an open source project that I worked on\Ncalled Sakai. Dialogue: 0,0:15:31.68,0:15:35.99,Default,,0000,0000,0000,,I actually have a tattoo of Sakai here on\Nmy shoulder. Dialogue: 0,0:15:35.99,0:15:38.22,Default,,0000,0000,0000,,Maybe in an upcoming lecture, I'll have a Dialogue: 0,0:15:38.22,0:15:40.43,Default,,0000,0000,0000,,short-sleeved shirt, and show you my\Ntattoo. Dialogue: 0,0:15:40.43,0:15:44.50,Default,,0000,0000,0000,,But for now, I can't because I've got, got\Nclothes on. Dialogue: 0,0:15:44.50,0:15:52.48,Default,,0000,0000,0000,,So, but this is real data.\NIt's the mbox.txt, mbox.txt file. Dialogue: 0,0:15:52.48,0:15:56.27,Default,,0000,0000,0000,,So, so that's the file that we're going to\Nuse for most of the next few assignments. Dialogue: 0,0:15:56.27,0:15:57.96,Default,,0000,0000,0000,,It'll be the same file. You'll get tired of it. Dialogue: 0,0:15:57.96,0:16:00.33,Default,,0000,0000,0000,,And you'll get to know all these people,\NStephen, Dialogue: 0,0:16:00.33,0:16:02.11,Default,,0000,0000,0000,,Chen Wen, and all the people in the file. Dialogue: 0,0:16:05.36,0:16:06.02,Default,,0000,0000,0000,,Okay, so. Dialogue: 0,0:16:07.44,0:16:10.47,Default,,0000,0000,0000,,We can search for lines that have a\Nprefix. Dialogue: 0,0:16:10.47,0:16:14.38,Default,,0000,0000,0000,,This is kind of the find pattern from the\Nlooping lecture. Dialogue: 0,0:16:14.38,0:16:17.86,Default,,0000,0000,0000,,So we're going to go through a list of, of\Nlines in a file, Dialogue: 0,0:16:17.86,0:16:20.91,Default,,0000,0000,0000,,and we're going to only print out the ones\Nthat match a certain thing. Dialogue: 0,0:16:20.91,0:16:22.81,Default,,0000,0000,0000,,So again, we open the file up. 
Dialogue: 0,0:16:22.81,0:16:25.41,Default,,0000,0000,0000,,We're going to write a for loop that's\Ngoing to say, for each line in the Dialogue: 0,0:16:25.41,0:16:30.42,Default,,0000,0000,0000,,file, if the line and then we can call a,\Na utility function Dialogue: 0,0:16:30.42,0:16:32.66,Default,,0000,0000,0000,,inside of string, because line is a string. Dialogue: 0,0:16:32.66,0:16:35.23,Default,,0000,0000,0000,,If line startswith From, print it out. Dialogue: 0,0:16:35.23,0:16:37.86,Default,,0000,0000,0000,,So this means it's going to loop through\Nall of the lines in the Dialogue: 0,0:16:37.86,0:16:43.18,Default,,0000,0000,0000,,file and it's going to print the ones that\Nstart with the string 'From:' Dialogue: 0,0:16:44.53,0:16:45.70,Default,,0000,0000,0000,,Okay? Dialogue: 0,0:16:45.70,0:16:49.78,Default,,0000,0000,0000,,Again, four lines, complete Python program\Nto read this Dialogue: 0,0:16:49.78,0:16:52.76,Default,,0000,0000,0000,,file and print the lines that have a\Nprefix of from. Dialogue: 0,0:16:54.95,0:16:59.02,Default,,0000,0000,0000,,So, if you run this program, and I suggest\Nthat you do, Dialogue: 0,0:17:01.05,0:17:02.71,Default,,0000,0000,0000,,this is what the output's going to look like. Dialogue: 0,0:17:03.84,0:17:07.16,Default,,0000,0000,0000,,And it's like, wait a second, I'm seeing\Nthe lines, Dialogue: 0,0:17:09.68,0:17:13.99,Default,,0000,0000,0000,,seeing the lines that have the froms, but\Nthen I get these blank lines. Dialogue: 0,0:17:16.53,0:17:18.95,Default,,0000,0000,0000,,And why is that?\NWhy are these blank lines there? Dialogue: 0,0:17:18.95,0:17:24.39,Default,,0000,0000,0000,,If I look at the program, I mean, I'm not\Nprinting blank lines. Dialogue: 0,0:17:24.39,0:17:26.37,Default,,0000,0000,0000,,I'm only printing lines that\Nstart with from. Dialogue: 0,0:17:26.37,0:17:27.52,Default,,0000,0000,0000,,I'm not doing that, so why? Dialogue: 0,0:17:30.52,0:17:31.02,Default,,0000,0000,0000,,What do you think? 
Dialogue: 0,0:17:31.79,0:17:32.53,Default,,0000,0000,0000,,I'll give you a second. Dialogue: 0,0:17:34.74,0:17:38.08,Default,,0000,0000,0000,,I've certainly done enough foreshadowing\Nin this lecture. Dialogue: 0,0:17:38.08,0:17:41.10,Default,,0000,0000,0000,,Well it turns out these newlines are the\Nproblem. Dialogue: 0,0:17:41.10,0:17:43.85,Default,,0000,0000,0000,,So it turns out that the print, we've been\Ndoing this Dialogue: 0,0:17:43.85,0:17:46.58,Default,,0000,0000,0000,,all along, you just, we didn't make a fuss\Nabout it. Dialogue: 0,0:17:46.58,0:17:49.93,Default,,0000,0000,0000,,The print adds a newline at the end of\Neverything that it prints. Dialogue: 0,0:17:49.93,0:17:53.27,Default,,0000,0000,0000,,So the yellow newlines are coming from\Nthe print statement. Dialogue: 0,0:17:53.27,0:17:57.50,Default,,0000,0000,0000,,But when we read the file, each line ends\Nin a newline. Dialogue: 0,0:17:57.50,0:18:00.49,Default,,0000,0000,0000,,So these green newlines are actually from\Nthe file. Dialogue: 0,0:18:03.17,0:18:05.96,Default,,0000,0000,0000,,They're the ones from the file. Dialogue: 0,0:18:05.96,0:18:08.06,Default,,0000,0000,0000,,So what's happening is we're seeing two Dialogue: 0,0:18:08.06,0:18:10.53,Default,,0000,0000,0000,,newlines, and so that turns into a\Nblank line. Dialogue: 0,0:18:11.87,0:18:14.14,Default,,0000,0000,0000,,So, how do we deal with that? Dialogue: 0,0:18:14.14,0:18:19.14,Default,,0000,0000,0000,,Well, we've got a string function that\Nconveniently solves that problem, okay? Dialogue: 0,0:18:19.14,0:18:21.04,Default,,0000,0000,0000,,And that is we're going to call rstrip. Dialogue: 0,0:18:21.04,0:18:25.20,Default,,0000,0000,0000,,If you recall, we had strip, lstrip, and\Nrstrip to strip Dialogue: 0,0:18:25.20,0:18:28.38,Default,,0000,0000,0000,,white space on one side, on the other\Nside, or on both sides. 
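A minimal sketch of the blank-line effect just described, written for Python 3 (the lecture itself uses the Python 2 print statement). The sample text and the names `sample`, `lines`, and `stripped` are made up for illustration, and `io.StringIO` stands in for an opened file so the example runs without mbox.txt.

```python
import io

# Hypothetical stand-in for a few lines of a mailbox file (not the real mbox.txt).
sample = "From: alice@example.com\nSubject: hello\nFrom: bob@example.com\n"

# Reading line by line keeps the newline that ends each line in the file.
fhand = io.StringIO(sample)
lines = [line for line in fhand]
print(repr(lines[0]))  # repr makes the trailing newline visible

# print() adds its own newline, so each unstripped line is followed by a blank line.
for line in lines:
    if line.startswith('From:'):
        print(line)

# rstrip() removes whitespace, including the newline, from the right side only.
stripped = [line.rstrip() for line in lines]
for line in stripped:
    if line.startswith('From:'):
        print(line)
```

The first loop shows the doubled newline; the second loop prints only the newline that print itself adds.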
Dialogue: 0,0:18:28.38,0:18:29.51,Default,,0000,0000,0000,,So in this one, Dialogue: 0,0:18:29.51,0:18:30.57,Default,,0000,0000,0000,,we're going to use rstrip. Dialogue: 0,0:18:30.57,0:18:33.13,Default,,0000,0000,0000,,We're going to say, we're going to read\Nthe line, and Dialogue: 0,0:18:33.13,0:18:35.54,Default,,0000,0000,0000,,this line is going to have a newline in it. Dialogue: 0,0:18:35.54,0:18:40.20,Default,,0000,0000,0000,,rstrip says pull white space off the right side, and\Nnewlines are also counted as white space. Dialogue: 0,0:18:40.20,0:18:42.87,Default,,0000,0000,0000,,Blanks or newlines are white space. Dialogue: 0,0:18:42.87,0:18:46.61,Default,,0000,0000,0000,,And then we're going to replace the line with\na copy that has no newline in it. Dialogue: 0,0:18:46.61,0:18:50.11,Default,,0000,0000,0000,,Then we're going to ask if it starts with\Na from and then we're going to print it Dialogue: 0,0:18:50.11,0:18:51.80,Default,,0000,0000,0000,,out, and then we're going to Dialogue: 0,0:18:51.80,0:18:55.13,Default,,0000,0000,0000,,see exactly what we're looking for\Nin this file. Dialogue: 0,0:18:55.13,0:18:56.04,Default,,0000,0000,0000,,And there are no blank lines. Dialogue: 0,0:18:56.04,0:19:01.36,Default,,0000,0000,0000,,So the newline that's coming out here\Nis the one from the print, not the Dialogue: 0,0:19:01.36,0:19:03.93,Default,,0000,0000,0000,,one from the file, because the one from Dialogue: 0,0:19:03.93,0:19:06.61,Default,,0000,0000,0000,,the file got wiped out by that particular\Nline. Dialogue: 0,0:19:07.95,0:19:08.45,Default,,0000,0000,0000,,Okay? Dialogue: 0,0:19:09.72,0:19:13.36,Default,,0000,0000,0000,,So another general pattern for these\Nfile-based loops Dialogue: 0,0:19:13.36,0:19:17.51,Default,,0000,0000,0000,,that we have been doing is a skipping\Npattern.
Dialogue: 0,0:19:17.51,0:19:20.49,Default,,0000,0000,0000,,Now, the non-skipping\Npattern Dialogue: 0,0:19:20.49,0:19:22.96,Default,,0000,0000,0000,,is where you're saying, I'm going to look\Nfor lines Dialogue: 0,0:19:22.96,0:19:25.64,Default,,0000,0000,0000,,that start with from and do something to\Nthem. Dialogue: 0,0:19:25.64,0:19:30.42,Default,,0000,0000,0000,,Sometimes, instead, you want to say, Dialogue: 0,0:19:30.42,0:19:32.79,Default,,0000,0000,0000,,here's a bunch of lines I'm going to\Nskip, and then I'm going to do something. Dialogue: 0,0:19:32.79,0:19:36.52,Default,,0000,0000,0000,,So the skipping pattern uses continue. Dialogue: 0,0:19:36.52,0:19:38.84,Default,,0000,0000,0000,,And so the first few lines here are the\Nsame. Dialogue: 0,0:19:38.84,0:19:41.76,Default,,0000,0000,0000,,We open a file, we read each line\Nin the file, Dialogue: 0,0:19:41.76,0:19:43.78,Default,,0000,0000,0000,,and we strip off the white\Nspace. Dialogue: 0,0:19:43.78,0:19:45.64,Default,,0000,0000,0000,,You're going to get tired of typing these\Nthree lines, Dialogue: 0,0:19:45.64,0:19:47.28,Default,,0000,0000,0000,,because you're going to do it a lot. Dialogue: 0,0:19:47.28,0:19:51.89,Default,,0000,0000,0000,,Open the file, start reading the file,\Nstrip the whitespace for each line. Dialogue: 0,0:19:51.89,0:19:57.74,Default,,0000,0000,0000,,And then you can test\Nfor some condition. Dialogue: 0,0:19:57.74,0:20:01.26,Default,,0000,0000,0000,,In this case, I'm going to say, if not\Nline startswith From, this Dialogue: 0,0:20:01.26,0:20:05.22,Default,,0000,0000,0000,,means this is true for all the lines that\Ndon't start with from, Dialogue: 0,0:20:05.22,0:20:08.60,Default,,0000,0000,0000,,continue. And if you remember, continue\Ngoes back up to the top of the loop.
Dialogue: 0,0:20:08.60,0:20:10.96,Default,,0000,0000,0000,,So the continue says I'm done, it\Nfinishes Dialogue: 0,0:20:10.96,0:20:14.23,Default,,0000,0000,0000,,the iteration, and it doesn't do anything\Ndown here. Dialogue: 0,0:20:14.23,0:20:15.13,Default,,0000,0000,0000,,Okay? Dialogue: 0,0:20:15.13,0:20:18.21,Default,,0000,0000,0000,,And then, we can do\Nsomething. Dialogue: 0,0:20:18.21,0:20:21.11,Default,,0000,0000,0000,,So, I've kind of flipped this, where I\Nsaid, these are the Dialogue: 0,0:20:21.11,0:20:24.83,Default,,0000,0000,0000,,things I'm interested in,\Nthat's lines that start with from. Dialogue: 0,0:20:24.83,0:20:26.27,Default,,0000,0000,0000,,So, I'm going to skip the lines that\Ndon't. Dialogue: 0,0:20:26.27,0:20:27.88,Default,,0000,0000,0000,,So I'm going to use continue. Dialogue: 0,0:20:27.88,0:20:32.42,Default,,0000,0000,0000,,Either way you can do it, depending on the\Ncomplexity or how much code there is. Dialogue: 0,0:20:32.42,0:20:34.10,Default,,0000,0000,0000,,Often, this is a good pattern\Nwhen Dialogue: 0,0:20:34.10,0:20:36.40,Default,,0000,0000,0000,,you have lots of lines of code down here Dialogue: 0,0:20:36.40,0:20:37.85,Default,,0000,0000,0000,,that you're going to do a lot of cool\Nstuff with. Dialogue: 0,0:20:39.29,0:20:42.78,Default,,0000,0000,0000,,You can also use things like in to select\Nlines. Dialogue: 0,0:20:42.78,0:20:43.32,Default,,0000,0000,0000,,Right? Dialogue: 0,0:20:43.32,0:20:51.20,Default,,0000,0000,0000,,So I'm going to look for\Nlines that have @uct.ac.za in them. Dialogue: 0,0:20:51.20,0:20:53.07,Default,,0000,0000,0000,,So again, I'm going to open it up. Dialogue: 0,0:20:53.07,0:20:55.83,Default,,0000,0000,0000,,I'm going to go through each\Nline in the file.
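The skipping pattern with continue described above can be sketched as follows. This is a Python 3 illustration, not the slide's exact code: the helper name `from_lines` and the sample text are hypothetical, and `io.StringIO` stands in for the opened file.

```python
import io

def from_lines(fhand):
    """Return the lines that start with 'From:', skipping all the others."""
    kept = []
    for line in fhand:
        line = line.rstrip()              # drop the newline from the file
        if not line.startswith('From:'):
            continue                      # uninteresting line: go to the next iteration
        # Anything below here runs only for lines that were not skipped.
        kept.append(line)
    return kept

# Hypothetical sample data standing in for the mailbox file.
sample = io.StringIO("From: a@example.com\nSubject: hi\nFrom: b@example.com\n")
result = from_lines(sample)
for line in result:
    print(line)
```

Putting the test at the top keeps the "lots of lines of code down here" at a single indentation level instead of nesting them inside an if.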
Dialogue: 0,0:20:55.83,0:21:00.51,Default,,0000,0000,0000,,I'm going to strip the white space out,\Nand [COUGH] Dialogue: 0,0:21:00.51,0:21:03.07,Default,,0000,0000,0000,,if not u-c-t, Dialogue: 0,0:21:03.07,0:21:07.93,Default,,0000,0000,0000,,if this string is not in line, then I'm\Ngoing to continue. Dialogue: 0,0:21:07.93,0:21:12.27,Default,,0000,0000,0000,,So it's a way for me to skip all of the\Nlines that don't have this string in them. Dialogue: 0,0:21:14.00,0:21:19.26,Default,,0000,0000,0000,,So these lines do, that one has it too,\Nand then we're going to print it out. Dialogue: 0,0:21:19.26,0:21:23.75,Default,,0000,0000,0000,,It will print out the ones that make it past\Nhere, okay? Dialogue: 0,0:21:23.75,0:21:28.44,Default,,0000,0000,0000,,So, in is another way to do searching,\Nright, like startswith, Dialogue: 0,0:21:28.44,0:21:28.94,Default,,0000,0000,0000,,et cetera. Dialogue: 0,0:21:30.64,0:21:37.55,Default,,0000,0000,0000,,So one more thing that you might want to\Ntry is, so we can count, right? Dialogue: 0,0:21:37.55,0:21:40.27,Default,,0000,0000,0000,,Now, this is a pattern for prompting\Nfor a file name. Dialogue: 0,0:21:41.92,0:21:45.85,Default,,0000,0000,0000,,And so here you'll get tired of\Nsort of Dialogue: 0,0:21:45.85,0:21:48.77,Default,,0000,0000,0000,,changing your code every time you want to\Nopen a different file, Dialogue: 0,0:21:48.77,0:21:50.85,Default,,0000,0000,0000,,because you probably want to run the\Nprogram Dialogue: 0,0:21:50.85,0:21:53.62,Default,,0000,0000,0000,,with mbox once and mbox-short once,\Njust so you Dialogue: 0,0:21:53.62,0:21:57.85,Default,,0000,0000,0000,,can test it with different sets of data.\NSo here's just another pattern. Dialogue: 0,0:21:57.85,0:22:01.91,Default,,0000,0000,0000,,We add this line to say raw_input, enter\Nthe file name. Dialogue: 0,0:22:01.91,0:22:04.70,Default,,0000,0000,0000,,And there you go, we'll type in the file\Nname.
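A minimal sketch of selecting lines with in, as described above, assuming Python 3. The helper name `lines_containing` and the sample addresses are made up for illustration; only the search string @uct.ac.za comes from the lecture.

```python
import io

def lines_containing(fhand, text):
    """Return the lines that contain text anywhere, skipping the rest."""
    kept = []
    for line in fhand:
        line = line.rstrip()
        # 'text not in line' means the same as the spoken 'if not text in line'.
        if text not in line:
            continue
        kept.append(line)
    return kept

# Hypothetical sample data; the addresses are invented.
sample = io.StringIO(
    "From: someone@uct.ac.za\n"
    "From: other@example.com\n"
    "X-Authentication-Warning: set sender to someone@uct.ac.za\n"
)
matches = lines_containing(sample, '@uct.ac.za')
for line in matches:
    print(line)
```

Unlike startswith, in matches the string anywhere in the line, so the third sample line is selected even though it does not begin with From.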
Dialogue: 0,0:22:04.70,0:22:08.24,Default,,0000,0000,0000,,And then the thing that we open is\Nwhatever we entered as the file name. Dialogue: 0,0:22:08.24,0:22:11.28,Default,,0000,0000,0000,,And then the rest of it is pretty much\Nyada yada. Dialogue: 0,0:22:11.28,0:22:14.06,Default,,0000,0000,0000,,So here it's reading the whole file. Dialogue: 0,0:22:14.06,0:22:17.23,Default,,0000,0000,0000,,If the line starts with subject, count\Nequals count plus one. Dialogue: 0,0:22:17.23,0:22:19.34,Default,,0000,0000,0000,,And then there were 1797 subject Dialogue: 0,0:22:19.34,0:22:21.85,Default,,0000,0000,0000,,lines in mbox.txt. Dialogue: 0,0:22:21.85,0:22:26.50,Default,,0000,0000,0000,,There were 27 subject lines in\Nmbox-short.txt, okay? Dialogue: 0,0:22:26.50,0:22:29.02,Default,,0000,0000,0000,,So that's prompting for the file names. Dialogue: 0,0:22:29.02,0:22:31.31,Default,,0000,0000,0000,,Now, open. Dialogue: 0,0:22:31.31,0:22:35.45,Default,,0000,0000,0000,,The open statement fails if the file\Ndoesn't exist. Dialogue: 0,0:22:35.45,0:22:37.29,Default,,0000,0000,0000,,So, you might want to add a try and Dialogue: 0,0:22:37.29,0:22:39.84,Default,,0000,0000,0000,,except around that if you want to. If\Nyou're just writing Dialogue: 0,0:22:39.84,0:22:42.53,Default,,0000,0000,0000,,code for yourself and you assume that\Neverything's okay, Dialogue: 0,0:22:42.53,0:22:44.61,Default,,0000,0000,0000,,then you don't have to write try except, but if Dialogue: 0,0:22:44.61,0:22:50.61,Default,,0000,0000,0000,,you want to catch it [SOUND]\Nand catch a bad file name, Dialogue: 0,0:22:50.61,0:22:55.86,Default,,0000,0000,0000,,then you take the open and turn it\Ninto these four lines. Dialogue: 0,0:22:55.86,0:22:58.48,Default,,0000,0000,0000,,So this is the code that we think might\Nblow up, Dialogue: 0,0:22:59.50,0:23:01.33,Default,,0000,0000,0000,,and it's going to blow up, we know it's\Ngoing to blow up.
Dialogue: 0,0:23:01.33,0:23:03.51,Default,,0000,0000,0000,,If they enter a bad file name like Dialogue: 0,0:23:03.51,0:23:06.51,Default,,0000,0000,0000,,na na boo boo, right, this is going to\Nblow up. Dialogue: 0,0:23:06.51,0:23:08.94,Default,,0000,0000,0000,,So what do we do?\NWe use try and except. Dialogue: 0,0:23:08.94,0:23:09.78,Default,,0000,0000,0000,,We put try Dialogue: 0,0:23:09.78,0:23:10.39,Default,,0000,0000,0000,,around that. Dialogue: 0,0:23:10.39,0:23:14.21,Default,,0000,0000,0000,,We're going to take out some insurance on\Nthat particular line. Dialogue: 0,0:23:14.21,0:23:16.54,Default,,0000,0000,0000,,And then, if it fails, we're going to\Nprint Dialogue: 0,0:23:16.54,0:23:20.50,Default,,0000,0000,0000,,this message and then say exit, to get\Nout. Dialogue: 0,0:23:20.50,0:23:22.92,Default,,0000,0000,0000,,So if you get a good file, Dialogue: 0,0:23:25.50,0:23:27.93,Default,,0000,0000,0000,,if you get a good file, it works, skips the Dialogue: 0,0:23:27.93,0:23:31.61,Default,,0000,0000,0000,,except, then runs the thing, prints out\Nthe count. Dialogue: 0,0:23:31.61,0:23:35.93,Default,,0000,0000,0000,,That's what's happening here. If, on the\Nother hand, you get a bad file, Dialogue: 0,0:23:36.99,0:23:41.94,Default,,0000,0000,0000,,it comes here, open blows up, runs the\Nexcept, prints this out, and then quits. Dialogue: 0,0:23:43.21,0:23:45.68,Default,,0000,0000,0000,,So that's how this one works with a bad\Nfile. Dialogue: 0,0:23:46.82,0:23:48.54,Default,,0000,0000,0000,,And now, no traceback, right? Dialogue: 0,0:23:53.54,0:23:55.39,Default,,0000,0000,0000,,So we are Dialogue: 0,0:23:56.69,0:24:00.27,Default,,0000,0000,0000,,It's kind of a short lecture.\NWe're done with Chapter Seven. Dialogue: 0,0:24:01.48,0:24:03.94,Default,,0000,0000,0000,,We open a file. Dialogue: 0,0:24:03.94,0:24:05.67,Default,,0000,0000,0000,,We read the file. Dialogue: 0,0:24:05.67,0:24:09.38,Default,,0000,0000,0000,,We take out white space at the end with\Nrstrip.
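The counting program with try and except around open, as described above, might look like this in Python 3. It is a sketch under assumptions rather than the slide's code: the function name `count_subject_lines` is hypothetical, it catches `OSError` instead of using a bare except, and it returns `None` instead of calling exit so that it can be run and checked without a prompt. The demo writes a small made-up file to a temporary directory in place of mbox.txt.

```python
import os
import tempfile

def count_subject_lines(fname):
    """Count lines starting with 'Subject:'; return None if the file cannot be opened."""
    try:
        fhand = open(fname)            # the one line that might blow up
    except OSError:
        print('File cannot be opened:', fname)
        return None                    # the lecture's version exits here instead
    count = 0
    with fhand:                        # close the file when the loop is done
        for line in fhand:
            if line.startswith('Subject:'):
                count = count + 1
    return count

# Demo on a small made-up file, then on a name that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')
    with open(path, 'w') as out:
        out.write("From: a@example.com\nSubject: one\nSubject: two\n")
    good = count_subject_lines(path)
    bad = count_subject_lines(os.path.join(tmp, 'na-na-boo-boo.txt'))

print('Good file:', good)
print('Bad file:', bad)
```

For interactive use, the file name could come from `input('Enter the file name: ')`, which is the Python 3 counterpart of the raw_input call mentioned in the lecture.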
Dialogue: 0,0:24:09.38,0:24:11.67,Default,,0000,0000,0000,,We used string functions. Dialogue: 0,0:24:11.67,0:24:14.65,Default,,0000,0000,0000,,So, this is kind of putting it all\Ntogether. Dialogue: 0,0:24:14.65,0:24:17.28,Default,,0000,0000,0000,,And these are kind of short little programs\Nnow. Dialogue: 0,0:24:17.28,0:24:22.10,Default,,0000,0000,0000,,So, it's not.\NAnd you know, starting now, Dialogue: 0,0:24:22.10,0:24:25.26,Default,,0000,0000,0000,,we are going to start putting these things\Ntogether and start actually doing work. Dialogue: 0,0:24:25.26,0:24:28.10,Default,,0000,0000,0000,,Because now, from the first few\Nchapters, Dialogue: 0,0:24:28.10,0:24:32.39,Default,,0000,0000,0000,,we have basic capabilities of Python.\NNow we have some data to work with. Dialogue: 0,0:24:32.39,0:24:33.18,Default,,0000,0000,0000,,Now going forward Dialogue: 0,0:24:33.18,0:24:36.57,Default,,0000,0000,0000,,we are going to do increasingly\Nsophisticated things with that data. Dialogue: 0,0:24:36.57,0:24:38.24,Default,,0000,0000,0000,,So I can't wait to see you in the next\Nlecture.