Python for Informatics - Chapter 8 - Lists

  • 0:00 - 0:05
    Hello, and welcome to Chapter Eight:
    Python Lists.
  • 0:05 - 0:08
    So now we're sort of going to start taking
    care of business.
  • 0:08 - 0:11
    We are doing, make lists and
  • 0:11 - 0:13
    dictionaries and tuples and really start
    manipulating this data,
  • 0:13 - 0:16
    and doing real data analysis,
    starting the,
  • 0:16 - 0:18
    laying the proper work for real data
    analysis.
  • 0:18 - 0:22
    As always, these lectures, audio, video,
    slides,
  • 0:22 - 0:26
    and even book are copyright Creative Commons
    Attribution.
  • 0:26 - 0:31
    So, lists, dictionaries, and tuples, the
    next real three big topics we're going to
  • 0:31 - 0:36
    talk about, are collections.
    And we've been doing lists already, right?
  • 0:37 - 0:41
    We've been doing lists when we were doing
    for loops.
  • 0:41 - 0:44
    A list in Python is something that has
    square brackets.
  • 0:44 - 0:45
    This is a constant list.
  • 0:47 - 0:48
    Now, when I first talked to you
  • 0:48 - 0:51
    about variables, I sort of oversimplified
    things.
  • 0:51 - 0:51
    I said
  • 0:51 - 0:54
    if you put like x equals two, and then put
  • 0:54 - 0:58
    x equals four, the two and the four
    overwrite each other.
  • 0:58 - 1:02
    A collection is where you can put a bunch
    of things in the same variable.
  • 1:02 - 1:04
    Now, I have to have a way to find those
    things.
  • 1:06 - 1:09
    But it allows us to put multiple things in
  • 1:09 - 1:12
    more, more things, more than one thing in
    the variable.
  • 1:12 - 1:15
    So, here we have friends, that has three
    strings, Joseph, Glenn, and Sally.
  • 1:15 - 1:16
    And we have carryon
  • 1:16 - 1:20
    that has socks, shirt, and perfume.
    So that's the basic idea.
  • 1:20 - 1:22
    So what's not a collection?
  • 1:22 - 1:23
    Well, simple variables.
  • 1:23 - 1:27
    Simple variables are not collections, just
    like this example.
  • 1:27 - 1:30
    I say x equals 2, x equals 4, and print x,
  • 1:30 - 1:33
    and the 4's in there and the 2 is somehow
    gone.
  • 1:33 - 1:36
    It was there for a moment, and then it's
    gone.
  • 1:37 - 1:38
    And so that's a normal variable.
  • 1:38 - 1:41
    They're not collections.
    You can't put more than one thing in it.
  • 1:41 - 1:44
    But when you put more than one thing in
    it, then you
  • 1:44 - 1:47
    have to have a way to find the things that
    are in there.
  • 1:47 - 1:47
    We'll, we'll get to that.
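The contrast between a simple variable and a collection can be sketched in a few lines. This is a minimal illustration, assuming the variable names and values mentioned in the lecture (x, friends, carryon):

```python
# A simple variable holds one value; the second assignment replaces the first.
x = 2
x = 4
print(x)

# A list holds several values under a single variable name.
friends = ['Joseph', 'Glenn', 'Sally']
carryon = ['socks', 'shirt', 'perfume']
print(friends)
print(carryon)
```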
  • 1:49 - 1:52
    So, we've been using list constants for
    the last couple
  • 1:52 - 1:55
    of chapters just because we have to use
    list constants.
  • 1:55 - 1:59
    You know, so we used, in the for loop
    chapter, we did lists of numbers.
  • 2:01 - 2:05
    We have done lists of strings, that's
    strings, red, yellow, and blue.
  • 2:06 - 2:11
    And you don't have to necessarily, you
    don't necessarily
  • 2:11 - 2:14
    have to have things all of the same type.
  • 2:14 - 2:18
    This is a three-item list, that has
    a string red,
  • 2:18 - 2:23
    the number integer 24, and 98.6, which is
    a floating point number.
  • 2:23 - 2:26
    And here's an interesting thing, just as a
    side note.
  • 2:26 - 2:28
    This shows that floating point numbers are
  • 2:28 - 2:32
    not always perfectly represented inside of
    the computer.
  • 2:32 - 2:35
    It's sort of an artifact of how they work.
  • 2:35 - 2:37
    And this is an example of 98.6 is really
    98 point
  • 2:37 - 2:39
    na, na, na, na, na.
  • 2:39 - 2:41
    So, but, don't, when you see something
    like that, don't freak out.
  • 2:41 - 2:44
    Floating point numbers are the ones that
    show this behavior.
  • 2:45 - 2:48
    So, interestingly, you can always,
    although we won't put a lot of energy into
  • 2:48 - 2:53
    this, you can also have an element of a
    list be a list itself.
  • 2:53 - 2:56
    So this is an outer list that's got three
    elements.
  • 2:56 - 2:58
    1, 7, and then
  • 2:58 - 3:00
    a list that's 5 and 6.
  • 3:00 - 3:04
    So, if you look at the length of this,
    there are three things in it.
  • 3:04 - 3:06
    Not four, three.
  • 3:06 - 3:09
    Because the outer list has 1, 2, 3 things
    in it.
  • 3:09 - 3:12
    And an empty list is bracket, bracket.
  • 3:12 - 3:13
    Okay?
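The kinds of list constants described here might look like the following. The variable names are made up for illustration, and the element order of the nested list follows the spoken description rather than the slide:

```python
colors = ['red', 'yellow', 'blue']   # a list of strings
mixed = ['red', 24, 98.6]            # string, integer, and float together
nested = [1, 7, [5, 6]]              # the third element is itself a list
empty = []                           # an empty list

# The inner list counts as a single element of the outer list.
print(len(nested))
print(len(empty))
```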
  • 3:13 - 3:17
    Like I said, we have been going through
    lists all along.
  • 3:17 - 3:20
    We have iteration variables for i in.
  • 3:20 - 3:22
    This is a list.
    We've been using it all along.
  • 3:22 - 3:27
    Similarly, we've been using lists in
    definite loops, are a
  • 3:27 - 3:30
    great way to go through lists, for friend
    in friends, there we have
  • 3:30 - 3:34
    goes through three times, out come
    three lines, with the
  • 3:34 - 3:39
    variable friend advancing through the
    three successive items in the list.
  • 3:39 - 3:40
    And away we go.
  • 3:40 - 3:44
    So, again, lists are not completely
    foreign to us.
  • 3:44 - 3:46
    Now,
  • 3:46 - 3:53
    just like in a string, we can use the
    index operator,
  • 3:53 - 3:57
    the square bracket operator, and
    we can look up items in the list.
  • 3:57 - 3:59
    Sub one, friends, sub one.
  • 4:00 - 4:04
    Not surprisingly, using the European
    elevator rule,
  • 4:06 - 4:09
    the first item in a list is sub zero,
    the second
  • 4:09 - 4:12
    item is sub one and the third one is sub
    two.
  • 4:12 - 4:15
    So here when I print friends sub one I
    get Glenn.
  • 4:15 - 4:18
    Which is the second element.
    Just like strings.
  • 4:18 - 4:21
    So once you kind of know it for strings,
    lists
  • 4:21 - 4:23
    and the rest of these things make a lot
    more sense.
  • 4:23 - 4:26
    Just, remember that we're in Europe, and
    things start with zero.
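A short sketch of the index operator on the friends list from the lecture:

```python
friends = ['Joseph', 'Glenn', 'Sally']

# Positions start at zero, just as they do for strings.
print(friends[0])
print(friends[1])
print(friends[2])
```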
  • 4:28 - 4:32
    Some things in these data items that we
    work with are not mutable.
  • 4:32 - 4:34
    So for example, strings, when we ask for a
    lower case
  • 4:34 - 4:37
    version of a string, we're given a copy of
    that string.
  • 4:37 - 4:42
    And that's because strings are not
    mutable, and we can see this
  • 4:42 - 4:47
    by doing something like saying fruit
    sub 0 equals lowercase b.
  • 4:47 - 4:50
    Now you'd think that that would just
    change this
  • 4:50 - 4:54
    to be a lower case b, but it doesn't,
    okay?
  • 4:54 - 4:57
    It says string object does not support
    item assignment
  • 4:57 - 5:00
    which means that you're not allowed to
    reassign.
  • 5:00 - 5:03
    You can make a new string and put
    different things in
  • 5:03 - 5:07
    that new string, but once the strings are
    made, they're not changeable.
  • 5:07 - 5:12
    And that's why when we call fruit.lower, we
    get a copy of it in lower case.
  • 5:12 - 5:15
    And so x is a copy of the original
    string, but
  • 5:15 - 5:18
    the original string, once we assign it
    into fruit, is unchanged.
  • 5:18 - 5:19
    It can't be changed.
  • 5:20 - 5:22
    Lists, on the other hand, can be changed,
    and we
  • 5:22 - 5:23
    can change them in the middle.
  • 5:23 - 5:26
    This is one of the things we like about
    them.
  • 5:26 - 5:29
    So here we have a list: 2, 14, 26, 41, and
    63.
  • 5:29 - 5:31
    Then we say lotto sub two.
  • 5:31 - 5:34
    Of course, that's going to be the third
    item.
  • 5:34 - 5:36
    Lotto sub two is equal to 28.
  • 5:36 - 5:38
    Then we print it and we see the new number
    there.
  • 5:38 - 5:41
    So all this is saying is that we can
    change them, right?
  • 5:41 - 5:45
    Strings no, and lists yes.
  • 5:45 - 5:48
    You can change lists, but you can't change
    strings.
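The difference between strings and lists can be tried out as below. This is a sketch: the starting value 'Banana' is an assumption, and the try/except is added only so the example keeps running after the error.

```python
fruit = 'Banana'   # assumed starting value

# Strings are not mutable, so item assignment raises a TypeError.
try:
    fruit[0] = 'b'
except TypeError as err:
    msg = str(err)
    print(msg)

# lower() returns a new string; the original is unchanged.
x = fruit.lower()
print(x, fruit)

# Lists are mutable, so item assignment works.
lotto = [2, 14, 26, 41, 63]
lotto[2] = 28
print(lotto)
```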
  • 5:49 - 5:52
    So the len function, we've used it for
    several
  • 5:52 - 5:56
    things, we can say you know, use, len is
  • 5:56 - 5:58
    used for, for strings and it's used for
    lists as well.
  • 5:58 - 6:01
    So the same function knows
    when its
  • 6:01 - 6:03
    parameter is a string. And when its
    parameter is a string,
  • 6:03 - 6:05
    it gives us the number of characters
    in the string.
  • 6:05 - 6:07
    And when it is a list, it gives us
  • 6:07 - 6:11
    the number of elements in the list.
  • 6:11 - 6:14
    And just because one of them is a string,
    it's still one element from the point
  • 6:14 - 6:16
    of view of this list.
  • 6:16 - 6:21
    So it has one, two, three, four - four
    items in the list, okay?
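A minimal sketch of len on a string and on a list. The specific values are assumptions chosen to match the description (a four-item list where one item is a string):

```python
greet = 'Hello Bob'        # assumed example string
x = [1, 2, 'joe', 99]      # assumed example list

print(len(greet))   # number of characters in the string
print(len(x))       # number of elements; 'joe' counts as one element
```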
  • 6:25 - 6:28
    So, the range function is a special
    function.
  • 6:28 - 6:30
    It's probably about time to talk about the
    range function.
  • 6:31 - 6:34
    The range function is a function that
    generates a list, that
  • 6:34 - 6:37
    produces a list and gives it back to us.
  • 6:37 - 6:39
    And so you give the range function a
  • 6:39 - 6:42
    parameter, how many items you want, and
    the range
  • 6:42 - 6:46
    function creates and gives us back a list
    that
  • 6:46 - 6:50
    is four numbers starting at zero, which is
    zero
  • 6:50 - 6:54
    up to, but not including three.
    Sound familiar?
  • 6:54 - 6:54
    Yeah.
  • 6:54 - 6:58
    Zero up to but not, I mean zero up to, but
    not including four.
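The lecture uses Python 2, where range returns a list directly. In Python 3 range returns a range object, so this sketch wraps it in list() to show the same numbers:

```python
# range(4) covers zero up to, but not including, four.
numbers = list(range(4))
print(numbers)

friends = ['Joseph', 'Glenn', 'Sally']
positions = list(range(len(friends)))
print(positions)
```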
  • 6:58 - 7:05
    And, and so the same thing is true here.
    So, we can combine the len and the range
  • 7:05 - 7:10
    to say, you know, to say okay, well len
    friends, that's three
  • 7:10 - 7:15
    items, and range len friends is 0, 1, 2.
    And it also
  • 7:15 - 7:23
    corresponds exactly to these items.
    So we can actually use this
  • 7:23 - 7:31
    to construct loops to go through a list.
    We already have a basic for loop, right?
  • 7:31 - 7:34
    We basically have a for loop that is our,
  • 7:34 - 7:39
    that, that said that for each friend in
    friends.
  • 7:39 - 7:41
    And out comes, Happy New Year, Glenn and
    Joseph.
  • 7:41 - 7:45
    If we also want to know where, what
    position we're at as
  • 7:45 - 7:50
    the loop progresses, we can rewrite the
    exact same loop a different way.
  • 7:50 - 7:53
    And make i be our iteration variable.
  • 7:53 - 7:59
    And say i in range(len(friends)), that
    turns this into zero, one, two.
  • 7:59 - 8:02
    And then i goes zero, one, two.
  • 8:02 - 8:03
    And then, we can in the loop, look up the
  • 8:03 - 8:07
    particular friend that is the particular
    one we are interested in,
  • 8:07 - 8:11
    using the index operator, friends sub i.
  • 8:11 - 8:12
    And then print Happy New Year.
  • 8:12 - 8:14
    So these two loops,
  • 8:16 - 8:20
    these two loops are equivalent.
    These, oop, not that one.
  • 8:20 - 8:25
    [SOUND] This loop and this loop.
    This loop is
  • 8:25 - 8:31
    preferred, unless you happen to need this
    value i, which tells you where you're at.
  • 8:31 - 8:32
    In case maybe you're going to change
    something, you're
  • 8:32 - 8:35
    going to look through something and then
    change it.
  • 8:35 - 8:39
    So, but, but, for what I've written here,
    they're exactly equivalent.
  • 8:39 - 8:41
    Prefer the simpler one, unless you need
  • 8:41 - 8:44
    the more complex one.
    They both produce the same kind of output.
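The two equivalent loops can be sketched as follows. The exact greeting text is an assumption based on what is read aloud:

```python
friends = ['Joseph', 'Glenn', 'Sally']

# Simpler form: the iteration variable takes each item in turn.
for friend in friends:
    print('Happy New Year:', friend)

# Index form: i takes 0, 1, 2 and the item is looked up by position.
for i in range(len(friends)):
    friend = friends[i]
    print('Happy New Year:', friend)
```

The second form is only worth the extra code when the position i is actually needed, for example to change an item in place.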
  • 8:46 - 8:50
    We can concatenate lists, much like we
    concatenate strings, with plus.
  • 8:53 - 9:00
    And you can think of the Python operator's
    looking to its right and to its left and
  • 9:00 - 9:02
    saying oh, those are both lists, I know
    what
  • 9:02 - 9:05
    to do with lists, I'm going to put those
    together.
  • 9:05 - 9:08
    And so two three-long
    lists become a six-long
  • 9:08 - 9:12
    list with the first one followed by
    the second one concatenated.
  • 9:12 - 9:16
    It didn't hurt the original, a. c is a new
    list, basically.
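A small sketch of list concatenation, assuming two three-item lists named a and b:

```python
a = [1, 2, 3]
b = [4, 5, 6]

# The plus operator builds a new list; a and b are left untouched.
c = a + b
print(c)
print(a)
```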
  • 9:19 - 9:23
    We can also slice lists.
    Feels a lot like strings, right?
  • 9:23 - 9:24
    Everything's kind of like strings.
  • 9:24 - 9:28
    For loops like strings, concatenation like
    strings, and now slicing like strings.
  • 9:28 - 9:30
    And it is exactly the same.
  • 9:32 - 9:38
    So one up to, but not including.
    Just remember, up to, but not including.
  • 9:38 - 9:42
    the second parameter, is up to but not
    including, so that starts at the sub one,
  • 9:42 - 9:48
    which is the second one up to but not
    including 3, the third one, so.
  • 9:48 - 9:51
    This is 1, 2, and 3 so that's 41 comma 12.
  • 9:51 - 9:55
    Starting at the first one, up to but not
    including the third one.
  • 9:59 - 10:02
    We can similarly eliminate the first one,
  • 10:02 - 10:04
    so that's up to but not including the fourth
    one.
  • 10:04 - 10:09
    Starting at zero, one, two, three, but not
    including four.
  • 10:09 - 10:14
    So that's this one.
    If we go three to the end, and again,
  • 10:14 - 10:21
    remember that we're starting at 0, so 3
    to the end is 0, 1, 2, 3 to the end.
  • 10:21 - 10:24
    The number 3 doesn't matter.
    So that's 3, 74, 15.
  • 10:24 - 10:24
    And the
  • 10:26 - 10:29
    whole thing, that's the whole thing, so
    these two things are the same.
  • 10:29 - 10:33
    So slicing works like strings, starting
    and up
  • 10:33 - 10:35
    to but not including is the second
    parameter.
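The slices walked through above can be reproduced as below. The list contents are inferred from the numbers mentioned in the lecture (41 and 12, then 3, 74, 15), so treat the exact values as an assumption:

```python
t = [9, 41, 12, 3, 74, 15]

print(t[1:3])   # starts at position 1, up to but not including 3
print(t[:4])    # from the start, up to but not including 4
print(t[3:])    # from position 3 to the end
print(t[:])     # the whole list
```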
  • 10:36 - 10:39
    There are some methods, and you can
  • 10:39 - 10:43
    read about these online in the Python
    documentation.
  • 10:43 - 10:45
    We can use the dir built-in function.
  • 10:45 - 10:48
    It doesn't have a lot of use in sort of how
  • 10:48 - 10:51
    we run, when we're running programs but
    it's kind of useful.
  • 10:51 - 10:52
    I like it when I'm typing
  • 10:52 - 10:54
    interactively. Like, what can this thing do?
  • 10:54 - 10:58
    So I make a list, list is a unique type, and
  • 10:58 - 11:00
    I say, with dir I say what can we do with it?
  • 11:00 - 11:04
    Well, we can append, we can count, extend,
    index, insert, pop, remove, reverse
  • 11:04 - 11:08
    and sort. And then you can sort of read up
    on all these things.
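Asking a list what it can do might look like this at the interactive prompt. The filter on underscore names is an addition for readability, and Python 3 reports a couple of methods (such as clear and copy) beyond those named in the lecture:

```python
x = list()
print(type(x))

# dir() lists the names available on the object; skip the internal ones.
methods = [name for name in dir(x) if not name.startswith('_')]
print(methods)
```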
  • 11:08 - 11:14
    I'll show you just a couple.
    We can build a list with the append.
  • 11:15 - 11:16
    So this syntax here,
  • 11:16 - 11:19
    stuff equals list, that's called a
    constructor
  • 11:19 - 11:21
    which says give me an empty list.
  • 11:22 - 11:26
    You could also say bracket, bracket for an
    empty list.
  • 11:26 - 11:30
    Whatever, you gotta make an empty list and
    then you call the append.
  • 11:30 - 11:33
    Remember that lists are mutable, so it's
    okay to change it.
  • 11:33 - 11:36
    So we're saying, okay, we started with an
    empty list.
  • 11:36 - 11:38
    Now append to the end of that, the word
    book.
  • 11:38 - 11:40
    And then append to that, 99.
  • 11:40 - 11:44
    Wait a sec.
  • 11:44 - 11:45
    That's a mistake.
  • 11:49 - 11:52
    That's a mistake.
    So I have to fix this mistake.
  • 11:52 - 11:55
    So watch me fix the mistake.
    Poof.
  • 11:58 - 12:01
    Now my thing is magically fixed.
    Isn't that amazing.
  • 12:01 - 12:04
    I have magic powers when it comes to slide
    fixing.
  • 12:04 - 12:07
    I just snap my fingers and the slides are
    fixed.
  • 12:07 - 12:08
    So here we go.
  • 12:08 - 12:10
    We append the 99, and we print it out.
  • 12:10 - 12:14
    And it's got book and 99, emphasizing the
    fact that they don't
  • 12:14 - 12:17
    have to be the exact same kind of thing in
    a list.
  • 12:17 - 12:20
    Then later we append cookie and then it's
    book, 99, cookie.
  • 12:20 - 12:23
    Okay? So this append, we won't do it in line
  • 12:23 - 12:26
    like this so often, we'll tend to do it in
    a loop as we're building up a
  • 12:26 - 12:27
    list, but that's the way you start with
  • 12:27 - 12:31
    an empty list and then [SOUND]
    programmatically grow it.
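Building a list with append, following the steps just described:

```python
stuff = list()          # the constructor gives an empty list; [] also works
stuff.append('book')
stuff.append(99)
print(stuff)

stuff.append('cookie')
print(stuff)
```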
  • 12:33 - 12:38
    We can ask, much like we do in a string,
    we can ask if an item is in a list.
  • 12:38 - 12:41
    So here is a list called some, with these
    numbers in it.
  • 12:41 - 12:43
    It's got five numbers in it.
  • 12:43 - 12:46
    Is nine in some? True, yes it is.
  • 12:46 - 12:49
    Is 15 in some? False.
  • 12:49 - 12:55
    Is 20 not in, that's a leg, a legal
    syntax, that is legal syntax.
  • 12:55 - 12:58
    Is 20 not in some, yes it's not there,
    okay?
  • 12:58 - 13:03
    They don't modify the list, don't modify
    the list, they're just asking questions.
  • 13:03 - 13:06
    These are logical operations often used in
    if statements or
  • 13:06 - 13:10
    while, some kind of a logic that you might
    be building.
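A sketch of the in and not in operators. The lecture only states that the list has five numbers, that 9 is in it, and that 15 and 20 are not, so the other values here are assumptions:

```python
some = [1, 9, 21, 10, 16]   # assumed contents

print(9 in some)
print(15 in some)
print(20 not in some)

# These are questions only; the list itself is not modified.
print(some)
```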
  • 13:12 - 13:15
    Okay, so lists have order.
  • 13:15 - 13:17
    So when we were appending them, the first
    thing went
  • 13:17 - 13:21
    in first, the second thing went in second,
    et cetera, et cetera.
  • 13:21 - 13:23
    And we can also tell the list to sort
    itself.
  • 13:23 - 13:26
    So one of the things that we can do with a
    list,
  • 13:26 - 13:29
    now we're starting to see some power here,
    is say, sort yourself.
  • 13:29 - 13:30
    This is a list of strings.
  • 13:30 - 13:33
    It can sort numbers, it can sort lots of
    things.
  • 13:33 - 13:39
    friends.sort, that says hey there, dear
    friends, sort yourself.
  • 13:39 - 13:40
    This makes a change.
  • 13:43 - 13:45
    It alters the list, and puts it, in
  • 13:45 - 13:48
    this case, in alphabetical order, Glenn,
    Joseph, and Sally.
  • 13:48 - 13:52
    It is mutated, it was, it's, it's been
    modified, and so
  • 13:52 - 13:55
    friends sub one is now Joseph because
    that's the second one.
  • 13:55 - 13:56
    Okay?
  • 13:56 - 14:00
    So the sort method says sort yourself now,
  • 14:00 - 14:04
    sort yourself, and it sorts and then
    it stays sorted.
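Sorting the friends list in place can be sketched as:

```python
friends = ['Joseph', 'Glenn', 'Sally']

# sort() changes the list itself rather than returning a new one.
friends.sort()
print(friends)
print(friends[1])
```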
  • 14:07 - 14:11
    So [COUGH]
  • 14:11 - 14:13
    you're going to be kind of ticked about
    this particular slide.
  • 14:13 - 14:17
    Because there's a whole bunch of built-in
    functions that help with lists.
  • 14:17 - 14:22
    And, there's max, there's min, there's
    len, various things.
  • 14:22 - 14:25
    And so we could, all those loops that I
    told you how to
  • 14:25 - 14:30
    do, I was just showing you that stuff
    because I thought it was important.
  • 14:30 - 14:32
    This is the simplest way to go through and
  • 14:32 - 14:35
    find the largest, smallest, and sum,
    et cetera.
  • 14:35 - 14:37
    So here's a list of numbers.
  • 14:38 - 14:40
    We can say how many are there.
  • 14:40 - 14:43
    That's the count.
    We can say what's the largest, it's 74.
  • 14:43 - 14:46
    What's the smallest, that'd be 3.
  • 14:46 - 14:49
    What is the sum of the running total of
    them all? 154.
  • 14:49 - 14:52
    If you remember from a few lectures
    ago, these are the same numbers.
  • 14:52 - 14:57
    And what is the average, which is, sum of
    them over the length of them,
  • 14:57 - 14:58
    Okay?
  • 14:58 - 15:01
    So this makes a lot more sense and if you
    had a list of numbers
  • 15:01 - 15:05
    like this, you would simply say what's the
    max, you wouldn't write a max loop.
  • 15:05 - 15:07
    I just did that to kind of demonstrate how
    loops work.
  • 15:07 - 15:10
    [COUGH] Demonstrate how loops work.
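The built-in functions mentioned here can be tried on a list of numbers. The list below is inferred from the values quoted in the lecture (largest 74, smallest 3, sum 154), so its exact contents are an assumption:

```python
nums = [9, 41, 12, 3, 74, 15]

print(len(nums))               # how many
print(max(nums))               # largest
print(min(nums))               # smallest
print(sum(nums))               # total
average = sum(nums) / len(nums)
print(average)
```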
  • 15:10 - 15:12
    So here is a way that you can sort
  • 15:12 - 15:17
    of change those kind of programs that we
    wrote.
  • 15:17 - 15:20
    So there's two ways to write a summing
    program.
  • 15:20 - 15:22
    Let's just say instead of the data being
  • 15:22 - 15:26
    in a list, we're going to write a while
    loop that's going to read a
  • 15:26 - 15:31
    set of numbers until we say done, and then
    compute the average of those numbers.
  • 15:31 - 15:33
    Okay, so let's say this is our problem.
  • 15:33 - 15:38
    Read a list of numbers, wait till the word
    done comes in, and then average them.
  • 15:38 - 15:40
    So here's a little program that does that.
  • 15:40 - 15:43
    We create total equals zero, count equals
    zero.
  • 15:43 - 15:46
    Make an infinite loop with while True.
  • 15:46 - 15:48
    And then we ask
  • 15:48 - 15:49
    to enter a number.
  • 15:49 - 15:52
    We get a string back from this, remember
    raw_input always
  • 15:52 - 15:57
    gives us strings back, and then if it's
    done, we're going to break.
  • 15:57 - 16:00
    This is the version of the if that does
    not require an indent.
  • 16:00 - 16:02
    We just put the break up there.
  • 16:02 - 16:04
    And so that gets us out of the loop when
    the time is right.
  • 16:04 - 16:06
    So when the time is right over here.
  • 16:06 - 16:10
    And then, we convert the value to float.
  • 16:10 - 16:13
    We use a float to convert the input to a
    floating point number.
  • 16:13 - 16:15
    And then we do our accumulation pattern,
  • 16:15 - 16:18
    total equals total plus value, count equals
    count plus one.
  • 16:18 - 16:19
    So this is going to run.
  • 16:19 - 16:21
    These numbers are going to go up and up
    and up and up.
  • 16:21 - 16:23
    And then we're going to break out of it,
  • 16:23 - 16:26
    calculate the average, and then print the
    average.
  • 16:26 - 16:30
    Because that's a floating point number, so now
    the average is a floating point number.
  • 16:30 - 16:31
    So that's one way to do it.
  • 16:31 - 16:31
    Right?
  • 16:31 - 16:35
    That would be one way to write a program
  • 16:35 - 16:38
    that does an average, is keep a running
    average
  • 16:38 - 16:39
    as you're reading the numbers.
  • 16:40 - 16:44
    But there's another way to do it, that
    would exact, work exactly
  • 16:44 - 16:48
    the same way, and this is when you can
    start using lists.
  • 16:48 - 16:52
    So you come in, you say I'm going to
    make a list
  • 16:52 - 16:57
    of numbers, just a mnemonic name, numlist,
    is an empty list.
  • 16:57 - 17:02
    Then I create another infinite loop
    that's going to read for enter a number.
  • 17:02 - 17:03
    And if it's done, break.
  • 17:03 - 17:09
    That gets us out of it.
    Convert the value to an int.
  • 17:09 - 17:12
    Convert the value to a float,
    the input value to a float.
  • 17:12 - 17:14
    And then append it to the list.
  • 17:14 - 17:17
    So now the list is going to grow, each
    time
  • 17:17 - 17:19
    we read a number the list is going to
    grow.
  • 17:19 - 17:21
    However many times we add the number is
  • 17:21 - 17:23
    how many things are going to be in the
    list.
  • 17:23 - 17:26
    So in this case, when we're at this point
    and we
  • 17:26 - 17:29
    type done, there will be three numbers in
    the list, because we
  • 17:29 - 17:33
    will have run append three times.
    We'll have appended 3, 9, and 5.
  • 17:33 - 17:37
    We'll have them sitting in a list.
    And we will have exited the loop.
  • 17:37 - 17:39
    So now you say, oh add up all the numbers
    in
  • 17:39 - 17:43
    that list, and then divide it by the
    length of the list.
  • 17:43 - 17:44
    And print the average.
  • 17:44 - 17:47
    So these two programs are basically
    equivalent.
  • 17:47 - 17:49
    The only time that they might not be
  • 17:49 - 17:54
    equivalent was if there was ten million
    numbers.
  • 17:54 - 17:59
    This would use up 40 megabytes of your
    memory, which
  • 17:59 - 18:01
    is actually not a lot of memory on some
    computers.
  • 18:01 - 18:05
    But if memory mattered, this does store
    all those numbers.
  • 18:05 - 18:08
    This one actually just runs the
    calculation.
  • 18:08 - 18:12
    So if there's a really large number of
    numbers, this would make a difference,
  • 18:12 - 18:16
    because the list is growing and keeping
    them all, summing them all at the end.
  • 18:16 - 18:17
    This is actually storing very little data.
  • 18:18 - 18:21
    But for reasonably sized numbers,
  • 18:21 - 18:24
    like thousands or even hundreds of thousands
    of numbers, these
  • 18:24 - 18:29
    two approaches are kind of equivalent.
    And then sometimes you actually
  • 18:29 - 18:32
    want to accumulate something a little more
    complex than this, you want to
  • 18:32 - 18:35
    sort them or look for the maximum and look
    for something else.
  • 18:35 - 18:37
    Who knows what, but the notion of make a
  • 18:37 - 18:40
    list and then append something to the list
  • 18:40 - 18:42
    each time through the iteration, and then do
    something with
  • 18:42 - 18:45
    the list at the end is a rather powerful
    pattern.
  • 18:45 - 18:49
    So this is also a powerful pattern,
    this is the accumulator
  • 18:49 - 18:52
    pattern where we just have the variables
    accumulating in the loop.
  • 18:52 - 18:55
    This one is one where we accumulate the
    data in
  • 18:55 - 18:58
    the loop and then do the computations all
    at the end.
  • 18:58 - 19:02
    The, certain situations will make use of
    these different techniques.
  • 19:03 - 19:09
    Okay.
    So, connecting strings and lists.
  • 19:09 - 19:12
    So there's a method, a capability
  • 19:12 - 19:16
    of strings that is really powerful when it
    comes to tearing data apart.
  • 19:19 - 19:23
    It's called the split.
    So here is a string
  • 19:23 - 19:27
    with three words and it has blanks in between
    here.
  • 19:27 - 19:34
    And abc.split says parse this string,
  • 19:34 - 19:39
    look for the blanks, break the string into
    pieces, and give me back a
  • 19:39 - 19:44
    list with one item for each of the words
    in the string as
  • 19:44 - 19:47
    defined by the spaces. Okay?
  • 19:47 - 19:53
    So, it takes, breaks it into three pieces
    and gives us that back in a list.
  • 19:53 - 19:56
    This is very powerful. Okay?
  • 19:56 - 19:58
    So we're going to split it and we get back
    a list.
  • 19:58 - 20:04
    There are three words, and the first word,
    stuff sub zero, is With.
  • 20:04 - 20:06
    So there's a lot of parsing going on here.
  • 20:06 - 20:09
    We could do this with for loops and a lot
    of other things.
  • 20:09 - 20:11
    There would be a lot of work in this
    split.
  • 20:11 - 20:14
    Given that this is a really common task,
    it's really
  • 20:14 - 20:18
    great that this has been put into Python
    for us.
  • 20:18 - 20:19
    Okay?
  • 20:19 - 20:23
    So split breaks a string into parts and
    produces a list of strings.
  • 20:23 - 20:26
    We think of these as words, we can access a
  • 20:26 - 20:28
    particular word or we can loop through all
    the words.
  • 20:28 - 20:31
    So here we have stuff again and now we
    have a, a for loop
  • 20:32 - 20:35
    for each of the, that's going to go
    through each of the three words.
  • 20:35 - 20:36
    And then it's going to run three times.
  • 20:36 - 20:37
    Now chances are good we're going to do
  • 20:37 - 20:40
    something different other than just print
    them out.
  • 20:40 - 20:44
    But you see how you can quickly take
    a split followed by a for, and then write
  • 20:44 - 20:46
    a loop that's going to go through each of
    the
  • 20:46 - 20:48
    words, without working too hard to find
    the spaces.
  • 20:48 - 20:53
    You let Python do all the hard work of
    finding the spaces.
  • 20:53 - 20:53
    Okay?
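A minimal sketch of split followed by a for loop. The string 'With three words' is an assumption consistent with the first word being With:

```python
abc = 'With three words'

# split() with no argument breaks the string on whitespace.
stuff = abc.split()
print(stuff)
print(len(stuff))
print(stuff[0])

# Loop through the words without searching for spaces by hand.
for w in stuff:
    print(w)
```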
  • 20:53 - 20:56
    So let's take a look at a couple of
    samples.
  • 20:58 - 21:00
    Just a couple of things to teach you a
    little more about split.
  • 21:02 - 21:06
    Split looks at many spaces as equal to one
    space.
  • 21:08 - 21:11
    So, if you split a lot blank, blank, blank
    of spaces, it
  • 21:11 - 21:14
    still just throws away all the spaces and
    gives us four words.
  • 21:16 - 21:20
    One, two, three, four and throws away
    all the spaces,
  • 21:20 - 21:22
    because it assumes that's what we
    want done.
  • 21:22 - 21:23
    So that's nice.
  • 21:23 - 21:27
    You can also have split, you can also have
    split,
  • 21:27 - 21:30
    split on some other character. Sometimes
    you'll be getting data
  • 21:30 - 21:33
    and they'll have used a semicolon, or a
    comma, or
  • 21:33 - 21:36
    a colon, or a tab character, who knows
    what they've
  • 21:36 - 21:39
    used, and your job is to dig that data
    out.
  • 21:39 - 21:43
    So you can split, based on the different
    character.
  • 21:43 - 21:47
    Here, if we're splitting normally with,
    with this is a normal split.
  • 21:47 - 21:50
    It's not going to see the semicolons, it's
    looking for a space.
  • 21:50 - 21:53
    And so all we get back is one
  • 21:53 - 21:55
    item in the list, with the semicolons.
  • 21:55 - 21:59
    But, if we switch, and we pass semicolon
  • 21:59 - 22:01
    as a parameter, in as a parameter to
    split,
  • 22:01 - 22:03
    then it will know to split it based on
  • 22:03 - 22:06
    semicolons, and gives us first, second, and
    third back.
  • 22:08 - 22:08
    Okay?
  • 22:08 - 22:10
    And then it gives us three words.
  • 22:10 - 22:14
    So you can split either on spaces, or you
  • 22:14 - 22:17
    can split on a character other than a
    space.
  • 22:17 - 22:18
    Okay?
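Both behaviours of split can be sketched as below. The example strings are assumptions chosen to match the description (four words separated by runs of spaces, and three fields separated by semicolons):

```python
line = 'A lot               of spaces'
etc = line.split()          # runs of spaces are treated as one separator
print(etc)

line = 'first;second;third'
thing = line.split()        # no spaces, so the whole string is one item
print(thing)

parts = line.split(';')     # pass the delimiter to split on semicolons
print(parts)
```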
  • 22:18 - 22:20
    [COUGH]
  • 22:20 - 22:25
    So, let's take a look at how we might turn
    this into some of our common assignments
  • 22:25 - 22:32
    that we have in this chapter, where we're
    going to read some of the mailbox data. Okay?
  • 22:33 - 22:37
    So, here we go with a little program.
  • 22:37 - 22:41
    First three lines, we write these a lot.
    Open the file.
  • 22:41 - 22:43
    Write a for loop to loop through each
  • 22:43 - 22:45
    line in the file.
  • 22:45 - 22:48
    Then we're going to strip off the white
    space at the end of the line.
  • 22:48 - 22:51
    One, two, three.
    Do those all the time.
  • 22:51 - 22:55
    And we're looking for lines, if you look
    at the whole file,
  • 22:55 - 22:58
    we're looking for lines that start with
    from, followed by a space.
  • 22:58 - 23:00
    So if the line does not start with from
  • 23:00 - 23:04
    followed by a space, that's a space right
    there, continue.
  • 23:04 - 23:08
    So that's a way to skip all the lines that
    don't look like this.
  • 23:08 - 23:12
    There're thousands of lines in this file
    and just a few that look like this. Okay?
  • 23:12 - 23:17
    So we're going to look and we're
    going to try
  • 23:17 - 23:23
    to find what day of the week this thing
    happened on.
  • 23:23 - 23:28
    So, so we're throwing away all the lines
    with this little bit of code.
  • 23:28 - 23:33
    Then what we do is we take the line, which
    is all of this text, and then we split it.
  • 23:34 - 23:38
    And we know that the day of the week is
    words sub two.
  • 23:38 - 23:43
    So this is words sub zero, this is words sub
    one, and this is words sub two.
  • 23:43 - 23:46
    So this is words sub zero, sub one, and sub
    two.
  • 23:46 - 23:49
    And so, all we have to do is print out the
    sub two
  • 23:49 - 23:54
    and we get, we throw away all the lines
    except the from lines.
  • 23:54 - 23:57
    We split them and take the sec, uh, the,
  • 23:57 - 23:59
    the third word or words sub two and we
  • 23:59 - 24:02
    can quickly create something
    that's extracting
  • 24:02 - 24:04
    the day of the week out of these.
  • 24:06 - 24:07
    Okay?
  • 24:07 - 24:12
    So it's quick, because
    split does the tricky work.
  • 24:12 - 24:15
    If you go back to the strings chapter, you
    saw that
  • 24:15 - 24:17
    we did a lot of work to get this to
    happen.
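A minimal sketch of the program described here, assuming Python 3. The sample lines are made up to stand in for the mail file, which the lecture does not name; with a real file you would loop over an open file handle instead of the list.

```python
# Assumed sample data standing in for the lines of the mail file.
lines = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008\n",
    "Return-Path: <postmaster@collab.sakaiproject.org>\n",
    "From: stephen.marquard@uct.ac.za\n",
    "From louis@media.berkeley.edu Fri Jan  4 18:10:48 2008\n",
]

days = []
for line in lines:
    line = line.rstrip()              # strip the whitespace at the end of the line
    if not line.startswith("From "):  # note the space after "From"
        continue                      # skip every line that does not match
    words = line.split()              # split on runs of whitespace
    days.append(words[2])             # the third word is the day of the week
    print(words[2])
# prints:
# Sat
# Fri
```

The line beginning with "From:" is skipped because the character after "From" is a colon, not a space.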
  • 24:18 - 24:21
    So here's another, even trickier pattern.
  • 24:21 - 24:27
    So let's say we want to do what we did at
    the end of Chapter Six,
  • 24:27 - 24:28
    the string chapter.
  • 24:28 - 24:31
    Let's say we wanted to get back this little
    bit of data.
  • 24:32 - 24:33
    Okay?
  • 24:33 - 24:37
    So, we can look at this and say, okay, let's
    split this.
  • 24:37 - 24:42
    And this will be zero, one, and two, and
    three, and four, and five, and six.
  • 24:42 - 24:45
    We're splitting it based on spaces.
  • 24:45 - 24:50
    Then the email address is words sub one,
    right?
  • 24:51 - 24:55
    So that email address is this little bit
    of stuff
  • 24:55 - 24:59
    because it's in between spaces, right?
    So that's what we pull out.
  • 24:59 - 25:02
    The email address is words sub one.
  • 25:02 - 25:05
    We've got that.
  • 25:05 - 25:08
    So that's sitting in this email address
    variable.
  • 25:08 - 25:10
    Then we don't
    really want the whole thing,
  • 25:10 - 25:12
    we just want the part after the
  • 25:12 - 25:14
    at sign, and we could do a lookup
  • 25:14 - 25:16
    of the at sign.
  • 25:17 - 25:22
    But you can also then do a second, come
    back, come back.
  • 25:22 - 25:25
    [SOUND] There we come.
  • 25:25 - 25:29
    You can also do a second split.
    Okay?
  • 25:29 - 25:31
    So we're taking this variable here, email,
  • 25:31 - 25:34
    which is merely this little part right
    here.
  • 25:34 - 25:37
    And we are splitting it again, except this
  • 25:37 - 25:38
    time we're splitting it based on an at
    sign.
  • 25:38 - 25:43
    Which means it's going to bust it right
    here, and find
  • 25:43 - 25:44
    us two pieces.
  • 25:44 - 25:50
    So pieces now is a list where the sub zero
    item is the
  • 25:50 - 25:56
    person's name and the sub one item is the host
    where their mail address is held.
  • 25:56 - 26:01
    Okay?
    And so then all we need is pieces
  • 26:01 - 26:06
    sub one, and pieces sub one is this
    guy right here.
  • 26:08 - 26:11
    So that's pieces sub one, and so we
    pulled it out.
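The two splits described here can be sketched as follows, assuming Python 3. The sample line is illustrative data; the variable names words, email, and pieces follow the lecture, while host is a name added for this sketch.

```python
# Illustrative "From" line (assumed sample data).
line = "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008"

words = line.split()        # first split: on whitespace, giving items 0 through 6
email = words[1]            # the email address is words sub one
pieces = email.split("@")   # second split: on the at sign, giving two pieces
host = pieces[1]            # pieces sub one is the part after the at sign

print(host)
# prints: uct.ac.za
```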
  • 26:11 - 26:13
    So if you go back to how we did it before,
    we were
  • 26:13 - 26:17
    doing searching, we were searching some
    more, and then we were taking slices.
  • 26:17 - 26:19
    This is a little more elegant, okay?
  • 26:19 - 26:21
    Because really, we split it and then we
    split it,
  • 26:21 - 26:23
    and we knew what piece we were looking at.
  • 26:23 - 26:27
    So this is what I call the Double Split
    Pattern, where you split a string
  • 26:27 - 26:31
    into a list, then you take a thing out,
    and then you split it again.
  • 26:32 - 26:33
    Depending on what data you're looking for.
  • 26:33 - 26:35
    This is just a technique, it's not the
    only technique.
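For comparison, the earlier search-and-slice approach might look roughly like this. It is a sketch in Python 3 on the same illustrative line; the names at_pos and space_pos are chosen for this example.

```python
# Illustrative "From" line (assumed sample data).
line = "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008"

at_pos = line.find("@")             # search for the at sign
space_pos = line.find(" ", at_pos)  # search again for the first space after it
host = line[at_pos + 1:space_pos]   # slice out what lies between the two

print(host)
# prints: uct.ac.za
```

Both approaches give the same answer here. The double split avoids tracking character positions by hand.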
  • 26:35 - 26:40
    Okay, so that's lists.
  • 26:40 - 26:42
    We talked about the concept of a
  • 26:42 - 26:45
    collection, where a list has multiple
    things in it.
  • 26:45 - 26:47
    Definite loops, again, we've seen these
    things.
  • 26:47 - 26:50
    Lists look a lot like strings
  • 26:50 - 26:53
    except the elements are more powerful and
    lists are mutable.
  • 26:53 - 26:59
    We still use the bracket operator and we
    redid the max, min, and sum.
  • 26:59 - 27:02
    Except we did it in, like, one line rather
    than a whole loop.
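As a small illustration of that point, here is a hand-written loop for the largest value next to the built-in functions. This is a sketch in Python 3, and the numbers are made-up sample data.

```python
nums = [3, 41, 12, 9, 74, 15]   # made-up sample data

# The whole-loop version: track the largest value seen so far.
largest = None
for value in nums:
    if largest is None or value > largest:
        largest = value

# The one-line versions using built-in functions on the list.
print(largest, max(nums), min(nums), sum(nums))
# prints: 74 74 3 154
```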
  • 27:02 - 27:06
    And something we're going to play with a
    lot is using split to parse strings,
  • 27:06 - 27:09
    the single split, and then the double
    split
  • 27:09 - 27:11
    is the natural extension of the single
    split.
  • 27:11 - 27:15
    So, see you in the next lecture, looking
    forward to talking about dictionaries.
Title:
Python for Informatics - Chapter 8 - Lists
Description:

This is from Python for Informatics Chapter 8 - Lists. www.pythonlearn.com
All Lectures: http://www.youtube.com/playlist?list=PLlRFEj9H3Oj4JXIwMwN1_ss1Tk8wZShEJ

Video Language:
English
Team:
Captions Requested
Duration:
27:15
