Python for Informatics - Chapter 8 - Lists
-
0:00 - 0:05Hello, and welcome to Chapter Eight:
Python Lists. -
0:05 - 0:08So now we're sort of going to start taking
care of business. -
0:08 - 0:11We are going to make lists and
-
0:11 - 0:13dictionaries and tuples and really start
manipulating this data, -
0:13 - 0:16and doing real data analysis,
starting the, -
0:16 - 0:18laying the proper work for real data
analysis. -
0:18 - 0:22As always, these lectures, audio, video,
slides, -
0:22 - 0:26and even book are copyright Creative Commons
Attribution. -
0:26 - 0:31So, lists, dictionaries, and tuples, the
next real three big topics we're going to -
0:31 - 0:36talk about, are collections.
And we've been doing lists already, right? -
0:37 - 0:41We've been doing lists when we were doing
for loops. -
0:41 - 0:44A list in Python is something that has
square brackets. -
0:44 - 0:45This is a constant list.
-
0:47 - 0:48Now, when I first talked to you
-
0:48 - 0:51about variables, I sort of oversimplified
things. -
0:51 - 0:51I said
-
0:51 - 0:54if you put like x equals two, and then put
-
0:54 - 0:58x equals four, the two and the four
overwrite each other. -
0:58 - 1:02A collection is where you can put a bunch
of things in the same variable. -
1:02 - 1:04Now, I have to have a way to find those
things. -
1:06 - 1:09But it allows us to put multiple things in
-
1:09 - 1:12more, more things, more than one thing in
the variable. -
1:12 - 1:15So, here we have friends, that has three
strings, Joseph, Glenn, and Sally. -
1:15 - 1:16And we have carryon
-
1:16 - 1:20that has socks, shirt, and perfume.
So that's the basic idea. -
1:20 - 1:22So what's not a collection?
-
1:22 - 1:23Well, simple variables.
-
1:23 - 1:27Simple variables are not collections, just
like this example. -
1:27 - 1:30I say x equals 2, x equals 4, and print x,
-
1:30 - 1:33and the 4's in there and the 2 is somehow
gone. -
1:33 - 1:36It was there for a moment, and then it's
gone. -
1:37 - 1:38And so that's a normal variable.
-
1:38 - 1:41They're not collections.
You can't put more than one thing in it. -
1:41 - 1:44But when you put more than one thing in
it, then you -
1:44 - 1:47have to have a way to find the things that
are in there. -
1:47 - 1:47We'll, we'll get to that.
-
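A minimal sketch of the difference between a simple variable and a collection, assuming Python 3. The variable names and values follow the ones mentioned in the lecture; the exact slide code is an assumption.

```python
# A simple variable holds one value; the second assignment replaces the first.
x = 2
x = 4
print(x)  # 4

# A list holds several values in one variable.
friends = ['Joseph', 'Glenn', 'Sally']
carryon = ['socks', 'shirt', 'perfume']
print(friends)
print(carryon)
```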
1:49 - 1:52So, we've been using list constants for
the last couple -
1:52 - 1:55of chapters just because we have to use
list constants. -
1:55 - 1:59You know, so we used, in the for loop
chapter, we did lists of numbers. -
2:01 - 2:05We have done lists of strings, that's
strings, red, yellow, and blue. -
2:06 - 2:11And you don't have to necessarily, you
don't necessarily -
2:11 - 2:14have to have things all of the same type.
-
2:14 - 2:18This is a three-item list, that has
a string red, -
2:18 - 2:23the number integer 24, and 98.6, which is
a floating point number. -
2:23 - 2:26And here's an interesting thing, just as a
side note. -
2:26 - 2:28This shows that floating point numbers are
-
2:28 - 2:32not always perfectly represented inside of
the computer. -
2:32 - 2:35It's sort of an artifact of how they work.
-
2:35 - 2:37And this is an example of 98.6 is really
98 point -
2:37 - 2:39na, na, na, na, na.
-
2:39 - 2:41So, but, don't, when you see something
like that, don't freak out. -
2:41 - 2:44Floating point numbers are the ones that
show this behavior. -
2:45 - 2:48So, interestingly, you can always,
although we won't put a lot of energy into -
2:48 - 2:53this, you can also have an element of a
list be a list itself. -
2:53 - 2:56So this is an outer list that's got three
elements. -
2:56 - 2:581, 7, and then
-
2:58 - 3:00a list that's 5 and 6.
-
3:00 - 3:04So, if you look at the length of this,
there are three things in it. -
3:04 - 3:06Not four, three.
-
3:06 - 3:09Because the outer list has 1, 2, 3 things
in it. -
3:09 - 3:12And an empty list is bracket, bracket.
-
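The list constants described above can be sketched as follows, assuming Python 3. The names mixed, nested, and empty are illustrative, and the element order of the nested list follows the spoken description rather than a confirmed slide.

```python
mixed = ['red', 24, 98.6]   # a string, an integer, and a float in one list
nested = [1, 7, [5, 6]]     # the third element is itself a list
empty = []

print(len(mixed))   # 3
print(len(nested))  # 3, because the inner list counts as one element
print(len(empty))   # 0

# Asking for more digits shows that 98.6 is stored only approximately.
print(format(98.6, '.15f'))
```

Recent Python versions print 98.6 as 98.6 by default, so the extra digits only appear when they are requested explicitly.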
3:12 - 3:13Okay?
-
3:13 - 3:17Like I said, we have been going through
lists all along. -
3:17 - 3:20We have iteration variables for i in.
-
3:20 - 3:22This is a list.
We've been using it all along. -
3:22 - 3:27Similarly, we've been using lists in
definite loops, which are a -
3:27 - 3:30great way to go through lists. For friend
in friends: the loop -
3:30 - 3:34goes through three times, and out come
three lines, with the -
3:34 - 3:39variable friend advancing through the
three successive items in the list. -
3:39 - 3:40And away we go.
-
3:40 - 3:44So, again, lists are not completely
foreign to us. -
3:44 - 3:46Now,
-
3:46 - 3:53just like in a string, we can use the
index operator, -
3:53 - 3:57the square bracket operator, and
we can look up items in the list. -
3:57 - 3:59Sub one, friends, sub one.
-
4:00 - 4:04Not surprisingly, using the European
elevator rule, -
4:06 - 4:09the first item in a list is sub zero,
the second -
4:09 - 4:12item is sub one and the third one is sub
two. -
4:12 - 4:15So here when I print friends sub one I
get Glenn. -
4:15 - 4:18Which is the second element.
Just like strings. -
4:18 - 4:21So once you kind of know it for strings,
lists -
4:21 - 4:23and the rest of these things make a lot
more sense. -
4:23 - 4:26Just, remember that we're in Europe, and
things start with zero. -
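A short sketch of the index operator on the friends list from the lecture, assuming Python 3:

```python
friends = ['Joseph', 'Glenn', 'Sally']
print(friends[0])  # Joseph, the first item
print(friends[1])  # Glenn, the second item
print(friends[2])  # Sally, the third item
```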
4:28 - 4:32Some things in these data items that we
work with are not mutable. -
4:32 - 4:34So for example, strings, when we ask for a
lower case -
4:34 - 4:37version of a string, we're given a copy of
that string. -
4:37 - 4:42And that's because strings are not
mutable, and we can see this -
4:42 - 4:47by doing something like saying fruit
sub 0 equals lowercase b. -
4:47 - 4:50Now you'd think that that would just
change this -
4:50 - 4:54to be a lower case b, but it doesn't,
okay? -
4:54 - 4:57It says string object does not support
item assignment -
4:57 - 5:00which means that you're not allowed to
reassign. -
5:00 - 5:03You can make a new string and put
different things in -
5:03 - 5:07that new string, but once the strings are
made, they're not changeable. -
5:07 - 5:12And that's why when we call fruit.lower, we
get a copy of it in lower case. -
5:12 - 5:15And so x is a copy of the original
string, but -
5:15 - 5:18the original string, once we assign it
into fruit, is unchanged. -
5:18 - 5:19It can't be changed.
-
5:20 - 5:22Lists, on the other hand, can be changed,
and we -
5:22 - 5:23can change them in the middle.
-
5:23 - 5:26This is one of the things we like about
them. -
5:26 - 5:29So here we have a list: 2, 14, 26, 41, and
63. -
5:29 - 5:31Then we say lotto sub two.
-
5:31 - 5:34Of course, that's going to be the third
item. -
5:34 - 5:36Lotto sub two is equal to 28.
-
5:36 - 5:38Then we print it and we see the new number
there. -
5:38 - 5:41So all this is saying is that we can
change them, right? -
5:41 - 5:45Strings no, and lists yes.
-
5:45 - 5:48You can change lists, but you can't change
strings. -
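The contrast between immutable strings and mutable lists can be sketched like this, assuming Python 3. The string value 'Banana' is an assumption, since the transcript names only the variable fruit.

```python
fruit = 'Banana'
try:
    fruit[0] = 'b'          # strings do not support item assignment
except TypeError as err:
    print('Error:', err)

x = fruit.lower()           # lower() returns a new string
print(x)      # banana
print(fruit)  # Banana, the original is unchanged

lotto = [2, 14, 26, 41, 63]
lotto[2] = 28               # lists can be changed in place
print(lotto)  # [2, 14, 28, 41, 63]
```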
5:49 - 5:52So the len function, we've used it for
several -
5:52 - 5:56things, we can say you know, use, len is
-
5:56 - 5:58used for, for strings and it's used for
lists as well. -
5:58 - 6:01So the same function knows
when its -
6:01 - 6:03parameter is a string. And when its
parameter is a string, -
6:03 - 6:05it gives us the number of characters
in the string. -
6:05 - 6:07And when it is a list, it gives us
-
6:07 - 6:11the number of elements in the list.
-
6:11 - 6:14And just because one of them is a string,
it's still one element from the point -
6:14 - 6:16of view of this list.
-
6:16 - 6:21So it has one, two, three, four - four
items in the list, okay? -
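A small sketch of len on a string and on a list, assuming Python 3. The sample values are assumptions chosen to match the counts described in the lecture.

```python
greet = 'Hello Bob'
print(len(greet))  # 9 characters

x = [1, 2, 'joe', 99]
print(len(x))      # 4 elements; the string 'joe' counts as one
```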
6:25 - 6:28So, the range function is a special
function. -
6:28 - 6:30It's probably about time to talk about the
range function. -
6:31 - 6:34The range function is a function that
generates a list, that -
6:34 - 6:37produces a list and gives it back to us.
-
6:37 - 6:39And so you give the range function a
-
6:39 - 6:42parameter, how many items you want, and
the range -
6:42 - 6:46function creates and gives us back a list
that -
6:46 - 6:50is four numbers starting at zero, which is
zero -
6:50 - 6:54up to, but not including four.
Sound familiar? -
6:54 - 6:54Yeah.
-
6:54 - 6:58Zero, one, two, three: zero up to, but
not including four. -
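A minimal sketch of range, with one caveat. The lecture uses Python 2, where range returns a list. This sketch assumes Python 3, where range produces its numbers lazily, so list() is used to show them.

```python
numbers = list(range(4))
print(numbers)  # [0, 1, 2, 3]
```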
6:58 - 7:05And, and so the same thing is true here.
So, we can combine the len and the range -
7:05 - 7:10to say, you know, to say okay, well len
friends, that's three -
7:10 - 7:15items, and range len friends is 0, 1, 2.
And it also -
7:15 - 7:23corresponds exactly to these items.
So we can actually use this -
7:23 - 7:31to construct loops to go through a list.
We already have a basic for loop, right? -
7:31 - 7:34We basically have a for loop that is our,
-
7:34 - 7:39that, that said that for each friend in
friends. -
7:39 - 7:41And out comes, Happy New Year, Glenn and
Joseph. -
7:41 - 7:45If we also want to know where, what
position we're at as -
7:45 - 7:50the loop progresses, we can rewrite the
exact same loop a different way. -
7:50 - 7:53And make i be our iteration variable.
-
7:53 - 7:59And say i in range(len(friends)), that
turns this into zero, one, two. -
7:59 - 8:02And then i goes zero, one, two.
-
8:02 - 8:03And then, we can in the loop, look up the
-
8:03 - 8:07particular friend that is the particular
one we are interested in, -
8:07 - 8:11using the index operator, friends sub i.
-
8:11 - 8:12And then print Happy New Year.
-
8:12 - 8:14So these two loops,
-
8:16 - 8:20these two loops are equivalent.
These, oop, not that one. -
8:20 - 8:25[SOUND] This loop and this loop.
The first loop, for friend in friends, is -
8:25 - 8:31preferred, unless you happen to need this
value i, which tells you where you're at. -
8:31 - 8:32In case maybe you're going to change
something, you're -
8:32 - 8:35going to look through something and then
change it. -
8:35 - 8:39So, but, but, for what I've written here,
they're exactly equivalent. -
8:39 - 8:41Prefer the simpler one, unless you need
-
8:41 - 8:44the more complex one.
They both produce the same kind of output. -
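The two equivalent loops can be sketched as follows, assuming Python 3 and the friends list from earlier in the lecture. The exact greeting text is an assumption.

```python
friends = ['Joseph', 'Glenn', 'Sally']
print(len(friends))               # 3
print(list(range(len(friends))))  # [0, 1, 2]

# Simpler form: iterate over the items directly.
for friend in friends:
    print('Happy New Year:', friend)

# Index form: useful when the position i is also needed.
for i in range(len(friends)):
    friend = friends[i]
    print('Happy New Year:', friend)
```

The index form earns its extra line only when i itself is used, for example to assign back into friends[i].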
8:46 - 8:50We can concatenate lists, much like we
concatenate strings, with plus. -
8:53 - 9:00And you can think of the Python operator as
looking to its right and to its left and -
9:00 - 9:02saying oh, those are both lists, I know
what -
9:02 - 9:05to do with lists, I'm going to put those
together. -
9:05 - 9:08And so two three-long
lists become a six-long -
9:08 - 9:12list with the first one followed by
the second one concatenated. -
9:12 - 9:16It didn't hurt the original, a. c is a new
list, basically. -
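A short sketch of list concatenation, assuming Python 3. The names a and c come from the lecture; b and the values are assumptions.

```python
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
print(c)  # [1, 2, 3, 4, 5, 6]
print(a)  # [1, 2, 3], the original is unchanged
```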
9:19 - 9:23We can also slice lists.
Feels a lot like strings, right? -
9:23 - 9:24Everything's kind of like strings.
-
9:24 - 9:28For loops like strings, concatenation like
strings, and now slicing like strings. -
9:28 - 9:30And it is exactly the same.
-
9:32 - 9:38So one up to, but not including.
Just remember, up to, but not including. -
9:38 - 9:42The second parameter is up to but not
including, so that starts at sub one, -
9:42 - 9:48which is the second item, up to but not
including sub three, so. -
9:48 - 9:51This is 1, 2, and 3, so that's 41 comma 12.
-
9:51 - 9:55Starting at sub one, up to but not
including sub three. -
9:59 - 10:02We can similarly leave out the first number,
-
10:02 - 10:04so that's from the start up to but not including
sub four. -
10:04 - 10:09Starting at zero, one, two, three, but not
including four. -
10:09 - 10:14So that's this one.
If we go three to the end, and again, -
10:14 - 10:21remember that there, starting at 0, so 3
to the end is 0, 1, 2, 3 to the end. -
10:21 - 10:24The number 3 doesn't matter.
So that's 3, 74, 15. -
10:24 - 10:24And the
-
10:26 - 10:29whole thing, that's the whole thing, so
these two things are the same. -
10:29 - 10:33So slicing works like strings, starting
and up -
10:33 - 10:35to but not including is the second
parameter. -
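The slices discussed above can be sketched as follows, assuming Python 3. The list t is reconstructed from the numbers mentioned in the lecture, so its exact contents are an assumption.

```python
t = [9, 41, 12, 3, 74, 15]
print(t[1:3])  # [41, 12]
print(t[:4])   # [9, 41, 12, 3]
print(t[3:])   # [3, 74, 15]
print(t[:])    # [9, 41, 12, 3, 74, 15]
```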
10:36 - 10:39There are some methods, and you can
-
10:39 - 10:43read about these online in the Python
documentation. -
10:43 - 10:45We can use the built-in dir function.
-
10:45 - 10:48It doesn't have a lot of use in sort of how
-
10:48 - 10:51we run, when we're running programs but
it's kind of useful. -
10:51 - 10:52I like it when I'm typing
-
10:52 - 10:54interactively. Like, what can this thing do?
-
10:54 - 10:58So I make a list, list is a unique type, and
-
10:58 - 11:00I say, with dir I say what can we do with it?
-
11:00 - 11:04Well, we can append, we can count, extend,
index, insert, pop, remove, reverse -
11:04 - 11:08and sort. And then you can sort of read up
on all these things. -
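A minimal sketch of exploring a list with dir, assuming Python 3. The filtering step and the name methods are additions for readability, not something shown in the lecture, and newer Python versions list a few more methods than the ones named above.

```python
x = list()
print(type(x))  # <class 'list'>

# dir() also returns special names with underscores; filter those out here.
methods = [name for name in dir(x) if not name.startswith('_')]
print(methods)
```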
11:08 - 11:14I'll show you just a couple.
We can build a list with the append. -
11:15 - 11:16So this syntax here,
-
11:16 - 11:19stuff equals list, that's called a
constructor -
11:19 - 11:21which says give me an empty list.
-
11:22 - 11:26You could also say bracket, bracket for an
empty list. -
11:26 - 11:30Whatever, you gotta make an empty list and
then you call the append. -
11:30 - 11:33Remember that lists are mutable, so it's
okay to change it. -
11:33 - 11:36So we're saying, okay, we started with an
empty list. -
11:36 - 11:38Now append to the end of that, the word
book. -
11:38 - 11:40And then append to that, 99.
-
11:40 - 11:44Wait a sec.
-
11:44 - 11:45That's a mistake.
-
11:49 - 11:52That's a mistake.
So I have to fix this mistake. -
11:52 - 11:55So watch me fix the mistake.
Poof. -
11:58 - 12:01Now my thing is magically fixed.
Isn't that amazing. -
12:01 - 12:04I have magic powers when it comes to slide
fixing. -
12:04 - 12:07I just snap my fingers and the slides are
fixed. -
12:07 - 12:08So here we go.
-
12:08 - 12:10We append the 99, and we print it out.
-
12:10 - 12:14And it's got book and 99, emphasizing the
fact that they don't -
12:14 - 12:17have to be the exact same kind of thing in
a list. -
12:17 - 12:20Then later we append cookie and then it's
book, 99, cookie. -
12:20 - 12:23Okay? So this append, we won't do it in line
-
12:23 - 12:26like this so often, we'll tend to do it in
a loop as we're building up a -
12:26 - 12:27list, but that's the way you start with
-
12:27 - 12:31an empty list and then [SOUND]
programmatically grow it. -
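Building a list with append can be sketched like this, assuming Python 3 and the values mentioned in the lecture:

```python
stuff = list()        # same effect as stuff = []
stuff.append('book')
stuff.append(99)
print(stuff)          # ['book', 99]
stuff.append('cookie')
print(stuff)          # ['book', 99, 'cookie']
```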
12:33 - 12:38We can ask, much like we do in a string,
we can ask if an item is in a list. -
12:38 - 12:41So here is a list called some, with these
numbers in it. -
12:41 - 12:43It's got five numbers in it.
-
12:43 - 12:46Is nine in some? True, yes it is.
-
12:46 - 12:49Is 15 in some? False.
-
12:49 - 12:55Is 20 not in, that's a leg, a legal
syntax, that is legal syntax. -
12:55 - 12:58Is 20 not in some, yes it's not there,
okay? -
12:58 - 13:03They don't modify the list, don't modify
the list, they're just asking questions. -
13:03 - 13:06These are logical operations often used in
if statements or -
13:06 - 13:10while, some kind of a logic that you might
be building. -
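A short sketch of the in and not in operators, assuming Python 3. The five values in some are assumptions chosen to match the answers given in the lecture.

```python
some = [1, 9, 21, 10, 16]
print(9 in some)       # True
print(15 in some)      # False
print(20 not in some)  # True
print(some)            # the list itself is not modified
```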
13:12 - 13:15Okay, so lists have order.
-
13:15 - 13:17So when we were appending them, the first
thing went -
13:17 - 13:21in first, the second thing went in second,
et cetera, et cetera. -
13:21 - 13:23And we can also tell the list to sort
itself. -
13:23 - 13:26So one of the things that we can do with a
list, -
13:26 - 13:29now we're starting to see some power here,
is say, sort yourself. -
13:29 - 13:30This is a list of strings.
-
13:30 - 13:33It can sort numbers, it can sort lots of
things. -
13:33 - 13:39friends.sort, that says hey there, dear
friends, sort yourself. -
13:39 - 13:40This makes a change.
-
13:43 - 13:45It alters the list, and puts it, in
-
13:45 - 13:48this case, in alphabetical order, Glenn,
Joseph, and Sally. -
13:48 - 13:52It is mutated, it was, it's, it's been
modified, and so -
13:52 - 13:55friends sub one is now Joseph because
that's the second one. -
13:55 - 13:56Okay?
-
13:56 - 14:00So the sort method says sort yourself now,
-
14:00 - 14:04sort yourself, and it sorts and then
it stays sorted. -
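A minimal sketch of sorting a list in place, assuming Python 3 and the friends list from the lecture:

```python
friends = ['Joseph', 'Glenn', 'Sally']
friends.sort()     # changes the list itself
print(friends)     # ['Glenn', 'Joseph', 'Sally']
print(friends[1])  # Joseph
```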
14:07 - 14:11So [COUGH]
-
14:11 - 14:13you're going to be kind of ticked about
this particular slide. -
14:13 - 14:17Because there's a whole bunch of built-in
functions that help with lists. -
14:17 - 14:22And, there's max, there's min, there's
len, various things. -
14:22 - 14:25And so we could, all those loops that I
told you how to -
14:25 - 14:30do, I was just showing you that stuff
because I thought it was important. -
14:30 - 14:32This is the simplest way to go through and
-
14:32 - 14:35find the largest, smallest, and sum,
et cetera. -
14:35 - 14:37So here's a list of numbers.
-
14:38 - 14:40We can say how many are there.
-
14:40 - 14:43That's the count.
We can say what's the largest, it's 74. -
14:43 - 14:46What's the smallest, that'd be 3.
-
14:46 - 14:49What is the sum of the running total of
them all? 154. -
14:49 - 14:52If you remember from a few lectures
ago, these are the same numbers. -
14:52 - 14:57And what is the average, which is, sum of
them over the length of them, -
14:57 - 14:58Okay?
-
14:58 - 15:01So this makes a lot more sense and if you
had a list of numbers -
15:01 - 15:05like this, you would simply say what's the
max, you wouldn't write a max loop. -
15:05 - 15:07I just did that to kind of demonstrate how
loops work. -
15:07 - 15:10[COUGH] Demonstrate how loops work.
-
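The built-in functions mentioned above can be sketched as follows, assuming Python 3. The list nums is reconstructed from the results quoted in the lecture (largest 74, smallest 3, sum 154), so its exact contents and order are an assumption.

```python
nums = [3, 41, 12, 9, 74, 15]
print(len(nums))              # 6
print(max(nums))              # 74
print(min(nums))              # 3
print(sum(nums))              # 154
print(sum(nums) / len(nums))  # the average, as a float in Python 3
```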
15:10 - 15:12So here is a way that you can sort
-
15:12 - 15:17of change those kind of programs that we
wrote. -
15:17 - 15:20So there's two ways to write a summing
program. -
15:20 - 15:22Let's just say instead of the data being
-
15:22 - 15:26in a list, we're going to write a while
loop that's going to read a -
15:26 - 15:31set of numbers until we say done, and then
compute the average of those numbers. -
15:31 - 15:33Okay, so let's say this is our problem.
-
15:33 - 15:38Read a list of numbers, wait till the word
done comes in, and then average them. -
15:38 - 15:40So here's a little program that does that.
-
15:40 - 15:43We create total equals zero, count equals
zero. -
15:43 - 15:46Make an infinite loop with while True.
-
15:46 - 15:48And then we ask
-
15:48 - 15:49to enter a number.
-
15:49 - 15:52We get a string back from this, remember
raw_input always -
15:52 - 15:57gives us strings back, and then if it's
done, we're going to break. -
15:57 - 16:00This is the version of the if that does
not require an indent. -
16:00 - 16:02We just put the break up there.
-
16:02 - 16:04And so that gets us out of the loop when
the time is right. -
16:04 - 16:06So when the time is right over here.
-
16:06 - 16:10And then, we convert the value to float.
-
16:10 - 16:13We use a float to convert the input to a
floating point number. -
16:13 - 16:15And then we do our accumulation pattern,
-
16:15 - 16:18total equals total plus value, count equals
count plus one. -
16:18 - 16:19So this is going to run.
-
16:19 - 16:21These numbers are going to go up and up
and up and up. -
16:21 - 16:23And then we're going to break out of it,
-
16:23 - 16:26calculate the average, and then print the
average. -
16:26 - 16:30Because that's a floating point number, so now
the average is a floating point number. -
16:30 - 16:31So that's one way to do it.
-
16:31 - 16:31Right?
-
16:31 - 16:35That would be one way to write a program
-
16:35 - 16:38that does an average, is keep a running
total and count -
16:38 - 16:39as you're reading the numbers.
-
16:40 - 16:44But there's another way to do it, that
would exact, work exactly -
16:44 - 16:48the same way, and this is when you can
start using lists. -
16:48 - 16:52So you come in, you say I'm going to
make a list -
16:52 - 16:57of numbers, just a mnemonic name, numlist,
is an empty list. -
16:57 - 17:02Then I create another infinite loop
that's going to read for enter a number. -
17:02 - 17:03And if it's done, break.
-
17:03 - 17:09That gets us out of it.
-
17:09 - 17:12Convert the value to a float,
the input value to a float. -
17:12 - 17:14And then append it to the list.
-
17:14 - 17:17So now the list is going to grow, each
time -
17:17 - 17:19we read a number the list is going to
grow. -
17:19 - 17:21However many times we add the number is
-
17:21 - 17:23how many things are going to be in the
list. -
17:23 - 17:26So in this case, when we're at this point
and we -
17:26 - 17:29type done, there will be three numbers in
the list, because we -
17:29 - 17:33will have run append three times.
We'll have appended 3, 9, and 5. -
17:33 - 17:37We'll have them sitting in a list.
And we will have exited the loop. -
17:37 - 17:39So now you say, oh add up all the numbers
in -
17:39 - 17:43that list, and then divide it by the
length of the list. -
17:43 - 17:44And print the average.
-
17:44 - 17:47So these two programs are basically
equivalent. -
17:47 - 17:49The only time that they might not be
-
17:49 - 17:54equivalent was if there was ten million
numbers. -
17:54 - 17:59This would use up 40 megabytes of your
memory, which -
17:59 - 18:01is actually not a lot of memory on some
computers. -
18:01 - 18:05But if memory mattered, this does store
all those numbers. -
18:05 - 18:08This one actually just runs the
calculation. -
18:08 - 18:12So if there's a really large number of
numbers, this would make a difference, -
18:12 - 18:16because the list is growing and keeping
them all, summing them all at the end. -
18:16 - 18:17This is actually storing very little data.
-
18:18 - 18:21But for a reasonable quantity of numbers,
-
18:21 - 18:24like thousands or even hundreds of thousands
of numbers, these -
18:24 - 18:29two approaches are kind of equivalent.
And then sometimes you actually -
18:29 - 18:32want to accumulate something a little more
complex than this, you want to -
18:32 - 18:35sort them or look for the maximum and look
for something else. -
18:35 - 18:37Who knows what, but the notion of make a
-
18:37 - 18:40list and then append something to the list
-
18:40 - 18:42each time through the iteration, and then do
something with -
18:42 - 18:45the list at the end is a rather powerful
pattern. -
18:45 - 18:49So this is also a powerful pattern,
this is the accumulator -
18:49 - 18:52pattern where we just have the variables
accumulating in the loop. -
18:52 - 18:55This one is one where we accumulate the
data in -
18:55 - 18:58the loop and then do the computations all
at the end. -
18:58 - 19:02Certain situations will make use of
these different techniques. -
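The two averaging programs can be sketched side by side, assuming Python 3. To keep the sketch runnable without typing, the entries come from a list of strings instead of raw_input; the function names and the typed list are hypothetical, and both functions assume at least one number arrives before 'done'.

```python
def average_running(entries):
    """Accumulator pattern: keep only a running total and a count."""
    total = 0.0
    count = 0
    for inp in entries:
        if inp == 'done':
            break
        total = total + float(inp)
        count = count + 1
    return total / count


def average_with_list(entries):
    """List pattern: store every number, then compute at the end."""
    numlist = []
    for inp in entries:
        if inp == 'done':
            break
        numlist.append(float(inp))
    return sum(numlist) / len(numlist)


typed = ['3', '9', '5', 'done']   # stand-in for what a user would type
print(average_running(typed))
print(average_with_list(typed))
```

The first version stores two numbers no matter how long the input is; the second keeps every value, which costs memory but leaves the data available for sorting, max, or other work afterwards.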
19:03 - 19:09Okay.
So, connecting strings and lists. -
19:09 - 19:12So there's a method, a capability
-
19:12 - 19:16of strings that is really powerful when it
comes to tearing data apart. -
19:19 - 19:23It's called the split.
So here is a string -
19:23 - 19:27with three words and it has blanks in between
here. -
19:27 - 19:34And abc.split says parse this string,
-
19:34 - 19:39look for the blanks, break the string into
pieces, and give me back a -
19:39 - 19:44list with one item for each of the words
in the string as -
19:44 - 19:47defined by the spaces. Okay?
-
19:47 - 19:53So, it takes, breaks it into three pieces
and gives us that back in a list. -
19:53 - 19:56This is very powerful. Okay?
-
19:56 - 19:58So we're going to split it and we get back
a list. -
19:58 - 20:04There are three words, and the first word,
stuff sub zero, is With. -
20:04 - 20:06So there's a lot of parsing going on here.
-
20:06 - 20:09We could do this with for loops and a lot
of other things. -
20:09 - 20:11There would be a lot of work in this
split. -
20:11 - 20:14Given that this is a really common task,
it's really -
20:14 - 20:18great that this has been put into Python
for us. -
20:18 - 20:19Okay?
-
20:19 - 20:23So split breaks a string into parts and
produces a list of strings. -
20:23 - 20:26We think of these as words, we can access a
-
20:26 - 20:28particular word or we can loop through all
the words. -
20:28 - 20:31So here we have stuff again and now we
have a, a for loop -
20:32 - 20:35for each of the, that's going to go
through each of the three words. -
20:35 - 20:36And then it's going to run three times.
-
20:36 - 20:37Now chances are good we're going to do
-
20:37 - 20:40something different other than just print
them out. -
20:40 - 20:44But you see how you can quickly take
a split followed by a for, and then write -
20:44 - 20:46a loop that's going to go through each of
the -
20:46 - 20:48words, without working too hard to find
the spaces. -
20:48 - 20:53You let Python do all the hard work of
finding the spaces. -
20:53 - 20:53Okay?
-
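A minimal sketch of split followed by a for loop, assuming Python 3. The names abc and stuff come from the lecture; the string value is inferred from the first word being With and there being three words.

```python
abc = 'With three words'
stuff = abc.split()
print(stuff)       # ['With', 'three', 'words']
print(len(stuff))  # 3
print(stuff[0])    # With

# Loop through each word without searching for the spaces by hand.
for w in stuff:
    print(w)
```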
20:53 - 20:56So let's take a look at a couple of
samples. -
20:58 - 21:00Just a couple of things to teach you a
little more about split. -
21:02 - 21:06Split looks at many spaces as equal to one
space. -
21:08 - 21:11So, if you split a lot blank, blank, blank
of spaces, it -
21:11 - 21:14still just throws away all the spaces and
gives us four words. -
21:16 - 21:20One, two, three, four and throws away
all the spaces, -
21:20 - 21:22because it assumes that's what we
want done. -
21:22 - 21:23So that's nice.
-
21:23 - 21:27You can also have split, you can also have
split, -
21:27 - 21:30split on some other character. Sometimes
you'll be getting data -
21:30 - 21:33and they'll have used a semicolon, or a
comma, or -
21:33 - 21:36a colon, or a tab character, who knows
what they've -
21:36 - 21:39used, and your job is to dig that data
out. -
21:39 - 21:43So you can split, based on the different
character. -
21:43 - 21:47Here, if we're splitting normally with,
with this is a normal split. -
21:47 - 21:50It's not going to see the semicolons, it's
looking for a space. -
21:50 - 21:53And so all we get back is one
-
21:53 - 21:55item in the list, with the semicolons.
-
21:55 - 21:59But, if we switch, and we pass semicolon
-
21:59 - 22:01as a parameter, in as a parameter to
split, -
22:01 - 22:03then it will know to split it based on
-
22:03 - 22:06semicolons, and gives us first, second, and
third back. -
22:08 - 22:08Okay?
-
22:08 - 22:10And then it gives us three words.
-
22:10 - 22:14So you can split either on spaces, or you
-
22:14 - 22:17can split on a character other than a
space. -
22:17 - 22:18Okay?
-
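The two split behaviors can be sketched as follows, assuming Python 3. The sample strings are assumptions that match the description (four words separated by many spaces, and first, second, third separated by semicolons).

```python
line = 'A lot               of spaces'
spaced = line.split()
print(spaced)           # ['A', 'lot', 'of', 'spaces']

line = 'first;second;third'
whole = line.split()    # no spaces, so one item comes back
print(whole)            # ['first;second;third']
parts = line.split(';') # split on semicolons instead
print(parts)            # ['first', 'second', 'third']
```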
22:18 - 22:20[COUGH]
-
22:20 - 22:25So, let's take a look at how we might turn
this into some of our common assignments -
22:25 - 22:32that we have in this chapter, where we're
going to read some of the mailbox data. Okay? -
22:33 - 22:37So, here we go with a little program.
-
22:37 - 22:41First three lines, we write these a lot.
Open the file. -
22:41 - 22:43Write a for loop to loop through each
-
22:43 - 22:45line in the file.
-
22:45 - 22:48Then we're going to strip off the white
space at the end of the line. -
22:48 - 22:51One, two, three.
Do those all the time. -
22:51 - 22:55And we're looking for lines, if you look
at the whole file, -
22:55 - 22:58we're looking for lines that start with
from, followed by a space. -
22:58 - 23:00So if the line does not start with from
-
23:00 - 23:04followed by a space, that's a space right
there, continue. -
23:04 - 23:08So that's a way to skip all the lines that
don't look like this. -
23:08 - 23:12There're thousands of lines in this file
and just a few that look like this. Okay? -
23:12 - 23:17So we're going to look and we're
going to try -
23:17 - 23:23to find what day of the week this thing
happened on. -
23:23 - 23:28So, so we're throwing away all the lines
with this little bit of code. -
23:28 - 23:33Then what we do is we take the line, which
is all of this text, and then we split it. -
23:34 - 23:38And we know that the day of the week is
words sub two. -
23:38 - 23:43So this is words sub zero, this is words sub
one, and this is words sub two. -
23:43 - 23:46So this is words sub zero, sub one, and sub
two. -
23:46 - 23:49And so, all we have to do is print out the
sub two -
23:49 - 23:54and we get, we throw away all the lines
except the from lines. -
23:54 - 23:57We split them and take the sec, uh, the,
-
23:57 - 23:59the third word or words sub two and we
-
23:59 - 24:02can quickly create something
that's extracting -
24:02 - 24:04the day of the week out of these.
-
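The day-of-the-week extraction can be sketched like this, assuming Python 3. Instead of opening the mailbox file, the sketch loops over a short list of sample lines; those lines, and the days list used to collect results, are illustrative assumptions.

```python
# Stand-in for lines read from the mailbox file (sample data, assumed).
lines = [
    'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008\n',
    'Return-Path: <postmaster@collab.sakaiproject.org>\n',
    'From: stephen.marquard@uct.ac.za\n',
]

days = []
for line in lines:
    line = line.rstrip()               # strip whitespace at the end
    if not line.startswith('From '):   # note the space after From
        continue
    words = line.split()
    days.append(words[2])              # the third word is the day
    print(words[2])
```

Only the first sample line passes the filter, because 'From:' with a colon does not start with 'From ' followed by a space.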
24:06 - 24:07Okay?
-
24:07 - 24:12So it's, it's, I mean, it's quick, because
split does the tricky work. -
24:12 - 24:15If you go back to the strings chapter, you
saw that -
24:15 - 24:17we did a lot of work to get this to
happen. -
24:18 - 24:21So here's even another tricky pattern.
-
24:21 - 24:27So let's say we want to do what we did at
the end of Chapter Six, -
24:27 - 24:28the string chapter.
-
24:28 - 24:31Let's say we wanted to get back this little
bit of data. -
24:32 - 24:33Okay?
-
24:33 - 24:37So, can look at this and say, okay, let's
split this. -
24:37 - 24:42And this will be zero, one, and two, and
three, and four, and five, and six. -
24:42 - 24:45We're splitting it based on spaces.
-
24:45 - 24:50Then the email address is words sub one,
right? -
24:51 - 24:55So that email address is this little bit
of stuff -
24:55 - 24:59because it's in between spaces, right?
So that's what we pull out. -
24:59 - 25:02The email address is words sub one.
-
25:02 - 25:05We've got that.
-
25:05 - 25:08So that's sitting in this email address
variable. -
25:08 - 25:10Then we really, all we want, we don't
really want the whole thing, -
25:10 - 25:12we just want the part after the
-
25:12 - 25:14at sign, and we can do a lookup for the, oop.
-
25:14 - 25:16We can do a lookup of the at sign.
-
25:17 - 25:22But you can also then do a second, come
back, come back. -
25:22 - 25:25[SOUND] There we come.
-
25:25 - 25:29You can also do a second split.
Okay? -
25:29 - 25:31So we're taking this variable here, email,
-
25:31 - 25:34which is merely this little part right
here. -
25:34 - 25:37And we are splitting it again, except this
-
25:37 - 25:38time we're splitting it based on an at
sign. -
25:38 - 25:43Which means it's going to bust it right
here, and find -
25:43 - 25:44us two pieces.
-
25:44 - 25:50So pieces now is a list where the sub zero
item is the -
25:50 - 25:56person's name and sub one item is the host
that their mail address is held from. -
25:56 - 26:01Okay?
And so then all we need is pieces -
26:01 - 26:06sub one, and pieces sub one is this
guy right here. -
26:08 - 26:11So that's pieces sub one, and so we
pulled it out. -
26:11 - 26:13So if you go back to how we did it before,
we were -
26:13 - 26:17doing searching, we were searching some
more, and then we were taking slices. -
26:17 - 26:19This is a little more elegant, okay?
-
26:19 - 26:21Because really, we split it and then we
split it, -
26:21 - 26:23and we knew what piece we were looking at.
-
26:23 - 26:27So this is what I call the Double Split
Pattern, where you split a string -
26:27 - 26:31into a list, then you take a thing out,
and then you split it again. -
26:32 - 26:33Depending on what data you're looking for.
-
26:33 - 26:35This is just a technique, it's not the
only technique. -
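The Double Split Pattern can be sketched as follows, assuming Python 3. The names words, email, and pieces come from the lecture; the sample line is an assumption in the format the lecture describes.

```python
line = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'

words = line.split()        # first split, on whitespace
email = words[1]            # the address sits between the first two spaces
pieces = email.split('@')   # second split, on the at sign
print(pieces)               # ['stephen.marquard', 'uct.ac.za']
print(pieces[1])            # uct.ac.za
```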
26:35 - 26:40Okay, so that's lists.
-
26:40 - 26:42We talked about the concept of a
-
26:42 - 26:45collection where lists have multiple
things in them. -
26:45 - 26:47Definite loops, again, we've seen these
things. -
26:47 - 26:50We're kind of, it looks a lot like strings
-
26:50 - 26:53except the elements are more powerful and
they're more mutable. -
26:53 - 26:59We still use the bracket operator and we
redid the max, min, and sum. -
26:59 - 27:02Except we did it in, like, one line rather
than a whole loop. -
27:02 - 27:06And something we're going to play with a
lot is using split to parse strings, -
27:06 - 27:09the single split, and then the double
split -
27:09 - 27:11is the natural extension of the single
split. -
27:11 - 27:15So, see you in the next lecture, looking
forward to talking about dictionaries.
- Description:
-
This is from Python for Informatics Chapter 8 - Lists. www.pythonlearn.com
All Lectures: http://www.youtube.com/playlist?list=PLlRFEj9H3Oj4JXIwMwN1_ss1Tk8wZShEJ - Video Language:
- English
- Duration:
- 27:15
Claude Almansi edited English subtitles for Python for Informatics - Chapter 8 - Lists