The curly fry conundrum: Why social media “likes” say more than you might think

  • 0:01 - 0:03
    If you remember that first decade of the web,
  • 0:03 - 0:05
    it was really a static place.
  • 0:05 - 0:07
    You could go online, you could look at pages,
  • 0:07 - 0:10
    and they were put up either by organizations
  • 0:10 - 0:11
    who had teams to do it
  • 0:11 - 0:14
    or by individuals who were really tech-savvy
  • 0:14 - 0:15
    for the time.
  • 0:15 - 0:17
    And with the rise of social media
  • 0:17 - 0:20
    and social networks in the early 2000s,
  • 0:20 - 0:21
    the web was completely changed
  • 0:21 - 0:25
    to a place where now the vast majority of content
  • 0:25 - 0:28
    we interact with is put up by average users,
  • 0:28 - 0:31
    either in YouTube videos or blog posts
  • 0:31 - 0:34
    or product reviews or social media postings.
  • 0:34 - 0:37
    And it's also become a much more interactive place,
  • 0:37 - 0:39
    where people are interacting with others,
  • 0:39 - 0:41
    they're commenting, they're sharing,
  • 0:41 - 0:43
    they're not just reading.
  • 0:43 - 0:44
    So Facebook is not the only place you can do this,
  • 0:44 - 0:46
    but it's the biggest
  • 0:46 - 0:48
    and it serves to illustrate the numbers.
  • 0:48 - 0:51
    Facebook has 1.2 billion users per month.
  • 0:51 - 0:53
    So half the Earth's internet population
  • 0:53 - 0:55
    is using Facebook.
  • 0:55 - 0:57
    They are a site, along with others,
  • 0:57 - 1:00
    that has allowed people to create an online persona
  • 1:00 - 1:01
    with very little technical skill,
  • 1:01 - 1:04
    and people responded by putting huge amounts
  • 1:04 - 1:06
    of personal data online.
  • 1:06 - 1:08
    So the result is that we have behavioral,
  • 1:08 - 1:10
    preference, demographic data
  • 1:10 - 1:12
    for hundreds of millions of people,
  • 1:12 - 1:14
    which is unprecedented in history.
  • 1:14 - 1:17
    And as a computer scientist, what this means is that
  • 1:17 - 1:19
    I've been able to build models
  • 1:19 - 1:21
    that can predict all sorts of hidden attributes
  • 1:21 - 1:23
    for all of you that you don't even know
  • 1:23 - 1:25
    you're sharing information about.
  • 1:25 - 1:28
    As scientists, we use that to help
  • 1:28 - 1:30
    the way people interact online,
  • 1:30 - 1:33
    but there are less altruistic applications,
  • 1:33 - 1:35
    and there's a problem in that users don't really
  • 1:35 - 1:37
    understand these techniques and how they work,
  • 1:37 - 1:41
    and even if they did, they don't
    have a lot of control over it.
  • 1:41 - 1:42
    So what I want to talk to you about today
  • 1:42 - 1:45
    is some of these things that we're able to do,
  • 1:45 - 1:47
    and then give us some ideas
    of how we might go forward
  • 1:47 - 1:50
    to move some control back into the hands of users.
  • 1:50 - 1:52
    So this is Target, the company.
  • 1:52 - 1:53
    I didn't just put that logo
  • 1:53 - 1:55
    on this poor, pregnant woman's belly.
  • 1:55 - 1:57
    You may have seen this anecdote that was printed
  • 1:57 - 1:59
    in Forbes Magazine where Target
  • 1:59 - 2:02
    sent a flyer to this 15-year-old girl
  • 2:02 - 2:03
    with advertisements and coupons
  • 2:03 - 2:06
    for baby bottles and diapers and cribs
  • 2:06 - 2:08
    two weeks before she told her parents
  • 2:08 - 2:09
    that she was pregnant.
  • 2:09 - 2:12
    Yeah, the dad was really upset.
  • 2:12 - 2:14
    He said, "How did Target figure out
  • 2:14 - 2:16
    that this high school girl was pregnant
  • 2:16 - 2:18
    before she told her parents?"
  • 2:18 - 2:20
    It turns out that they have the purchase history
  • 2:20 - 2:23
    for hundreds of thousands of customers
  • 2:23 - 2:25
    and they compute what they call a pregnancy score,
  • 2:25 - 2:28
    which is not just whether or not a woman's pregnant,
  • 2:28 - 2:29
    but what her due date is.
  • 2:29 - 2:31
    And they compute that
  • 2:31 - 2:33
    not by looking at, like, the obvious things,
  • 2:33 - 2:35
    like, she's buying a crib or baby clothes,
  • 2:35 - 2:38
    but things like, she bought more vitamins
  • 2:38 - 2:40
    than she normally did,
  • 2:40 - 2:41
    or she bought a handbag
  • 2:41 - 2:43
    that's big enough to hold diapers.
  • 2:43 - 2:45
    And by themselves, those purchases don't seem
  • 2:45 - 2:47
    like they might reveal a lot,
  • 2:47 - 2:49
    but it's a pattern of behavior that,
  • 2:49 - 2:52
    when you take it in the context of other people,
  • 2:52 - 2:55
    starts to actually reveal some insights.
  • 2:55 - 2:57
    So that's the kind of thing that we do
  • 2:57 - 3:00
    when we're predicting stuff
    about you on social media.
  • 3:00 - 3:02
    We're looking for little
    patterns of behavior that,
  • 3:02 - 3:05
    when you detect them among millions of people,
  • 3:05 - 3:08
    lets us find out all kinds of things.
  • 3:08 - 3:09
    So in my lab and with colleagues,
  • 3:09 - 3:11
    we've developed mechanisms where we can
  • 3:11 - 3:13
    quite accurately predict things
  • 3:13 - 3:15
    like your political preference,
  • 3:15 - 3:18
    your personality score, gender, sexual orientation,
  • 3:18 - 3:21
    religion, age, intelligence,
  • 3:21 - 3:23
    along with things like
  • 3:23 - 3:24
    how much you trust the people you know
  • 3:24 - 3:26
    and how strong those relationships are.
  • 3:26 - 3:28
    We can do all of this really well.
  • 3:28 - 3:30
    And again, it doesn't come from what you might
  • 3:30 - 3:32
    think of as obvious information.
  • 3:32 - 3:34
    So my favorite example is from this study
  • 3:34 - 3:36
    that was published this year
  • 3:36 - 3:38
    in the Proceedings of the National Academy of Sciences.
  • 3:38 - 3:39
    If you Google this, you'll find it.
  • 3:39 - 3:41
    It's four pages, easy to read.
  • 3:41 - 3:44
    And they looked at just people's Facebook likes,
  • 3:44 - 3:45
    so just the things you like on Facebook,
  • 3:45 - 3:48
    and used that to predict all these attributes,
  • 3:48 - 3:49
    along with some other ones.
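
A minimal sketch of that kind of likes-based prediction, assuming a binary user-by-page matrix and self-reported trait labels; the data below is random and purely illustrative, so only the mechanics matter, not the numbers:

    # Sketch: predicting a trait from which pages users have liked.
    # X is a binary user-by-page matrix, y a self-reported label
    # (e.g., 1 = scored high on an intelligence test). Illustrative only.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_users, n_pages = 1000, 200
    X = rng.integers(0, 2, size=(n_users, n_pages))  # 1 = user liked the page
    y = rng.integers(0, 2, size=n_users)             # 1 = has the trait

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))

    # The largest positive coefficients identify the pages most
    # indicative of the trait, regardless of what the pages are about.
    top_pages = np.argsort(model.coef_[0])[-5:]
    print("most indicative pages:", top_pages)
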
  • 3:49 - 3:52
    And in their paper they listed the five likes
  • 3:52 - 3:55
    that were most indicative of high intelligence.
  • 3:55 - 3:57
    And among those was liking a page
  • 3:57 - 3:59
    for curly fries. (Laughter)
  • 3:59 - 4:01
    Curly fries are delicious,
  • 4:01 - 4:04
    but liking them does not necessarily mean
  • 4:04 - 4:06
    that you're smarter than the average person.
  • 4:06 - 4:09
    So how is it that one of the strongest indicators
  • 4:09 - 4:11
    of your intelligence
  • 4:11 - 4:12
    is liking this page
  • 4:12 - 4:14
    when the content is totally irrelevant
  • 4:14 - 4:17
    to the attribute that's being predicted?
  • 4:17 - 4:19
    And it turns out that we have to look at
  • 4:19 - 4:21
    a whole bunch of underlying theories
  • 4:21 - 4:23
    to see why we're able to do this.
  • 4:23 - 4:26
    One of them is a sociological
    theory called homophily,
  • 4:26 - 4:29
    which basically says people are
    friends with people like them.
  • 4:29 - 4:31
    So if you're smart, you tend to
    be friends with smart people,
  • 4:31 - 4:33
    and if you're young, you tend
    to be friends with young people,
  • 4:33 - 4:35
    and this has been well established
  • 4:35 - 4:37
    for hundreds of years.
  • 4:37 - 4:39
    We also know a lot
  • 4:39 - 4:41
    about how information spreads through networks.
  • 4:41 - 4:42
    It turns out things like viral videos
  • 4:42 - 4:45
    or Facebook likes or other information
  • 4:45 - 4:47
    spreads in exactly the same way
  • 4:47 - 4:49
    that diseases spread through social networks.
  • 4:49 - 4:51
    So this is something we've studied for a long time.
  • 4:51 - 4:53
    We have good models of it.
  • 4:53 - 4:55
    And so you can put those things together
  • 4:55 - 4:58
    and start seeing why things like this happen.
  • 4:58 - 5:00
    So if I were to give you a hypothesis,
  • 5:00 - 5:03
    it would be that a smart guy started this page,
  • 5:03 - 5:05
    or maybe one of the first people who liked it
  • 5:05 - 5:07
    would have scored high on that test.
  • 5:07 - 5:09
    And they liked it, and their friends saw it,
  • 5:09 - 5:12
    and by homophily, we know that
    he probably had smart friends,
  • 5:12 - 5:15
    and so it spread to them, and some of them liked it,
  • 5:15 - 5:16
    and they had smart friends,
  • 5:16 - 5:17
    and so it spread to them,
  • 5:17 - 5:19
    and so it propagated through the network
  • 5:19 - 5:21
    to kind of a host of smart people,
  • 5:21 - 5:23
    so that by the end, the action
  • 5:23 - 5:26
    of liking the curly fries page
  • 5:26 - 5:28
    is indicative of high intelligence,
  • 5:28 - 5:30
    not because of the content,
  • 5:30 - 5:32
    but because the actual action of liking
  • 5:32 - 5:34
    reflects back the common attributes
  • 5:34 - 5:36
    of other people who have done it.
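
A toy simulation of that hypothesis, under made-up assumptions (a binary "smart" label, friendships that are roughly 80 percent within-group, and a fixed chance that a like spreads along each friendship): the people who end up liking the seeded page skew smarter than the population, even though the page content is irrelevant.

    # Toy homophily + spread simulation; all parameters are illustrative.
    import random

    random.seed(1)
    N = 2000
    smart = [random.random() < 0.5 for _ in range(N)]

    # Homophilous friendship graph: most of a person's friends share their label.
    friends = {i: set() for i in range(N)}
    for i in range(N):
        while len(friends[i]) < 10:
            j = random.randrange(N)
            if j != i and (smart[j] == smart[i] or random.random() < 0.2):
                friends[i].add(j)
                friends[j].add(i)

    # Seed the page with one smart user, then let the like cascade a few hops.
    liked = {next(i for i in range(N) if smart[i])}
    frontier = set(liked)
    for _ in range(3):
        nxt = set()
        for u in frontier:
            for v in friends[u]:
                if v not in liked and random.random() < 0.3:
                    liked.add(v)
                    nxt.add(v)
        frontier = nxt

    share_smart_likers = sum(smart[u] for u in liked) / len(liked)
    share_smart_overall = sum(smart) / N
    print(f"smart among likers: {share_smart_likers:.2f}")
    print(f"smart overall:      {share_smart_overall:.2f}")
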
  • 5:36 - 5:39
    So this is pretty complicated stuff, right?
  • 5:39 - 5:41
    It's a hard thing to sit down and explain
  • 5:41 - 5:44
    to an average user, and even if you do,
  • 5:44 - 5:46
    what can the average user do about it?
  • 5:46 - 5:48
    How do you know that you've liked something
  • 5:48 - 5:50
    that indicates a trait for you
  • 5:50 - 5:54
    that's totally irrelevant to the
    content of what you've liked?
  • 5:54 - 5:56
    There's a lot of power that users don't have
  • 5:56 - 5:58
    to control how this data is used.
  • 5:58 - 6:02
    And I see that as a real problem going forward.
  • 6:02 - 6:03
    So I think there are a couple of paths
  • 6:03 - 6:05
    that we want to look at
  • 6:05 - 6:07
    if we want to give users some control
  • 6:07 - 6:08
    over how this data is used,
  • 6:08 - 6:10
    because it's not always going to be used
  • 6:10 - 6:12
    for their benefit.
  • 6:12 - 6:13
    An example I often give is that,
  • 6:13 - 6:15
    if I ever get bored being a professor,
  • 6:15 - 6:16
    I'm going to go start a company
  • 6:16 - 6:18
    that predicts all of these attributes
  • 6:18 - 6:19
    and things like how well you work in teams
  • 6:19 - 6:22
    and if you're a drug user, if you're an alcoholic.
  • 6:22 - 6:23
    We know how to predict all that.
  • 6:23 - 6:25
    And I'm going to sell reports
  • 6:25 - 6:27
    to HR companies and big businesses
  • 6:27 - 6:29
    that want to hire you.
  • 6:29 - 6:31
    We totally can do that now.
  • 6:31 - 6:33
    I could start that business tomorrow,
  • 6:33 - 6:34
    and you would have absolutely no control
  • 6:34 - 6:37
    over me using your data like that.
  • 6:37 - 6:39
    That seems to me to be a problem.
  • 6:39 - 6:41
    So one of the paths we can go down
  • 6:41 - 6:43
    is the policy and law path.
  • 6:43 - 6:46
    And in some respects, I think
    that that would be most effective,
  • 6:46 - 6:48
    but the problem is we'd actually have to do it.
  • 6:48 - 6:51
    Observing our political process in action
  • 6:51 - 6:54
    makes me think it's highly unlikely
  • 6:54 - 6:55
    that we're going to get a bunch of representatives
  • 6:55 - 6:57
    to sit down, learn about this,
  • 6:57 - 6:59
    and then enact sweeping changes
  • 6:59 - 7:02
    to intellectual property law in the U.S.
  • 7:02 - 7:04
    so users control their data.
  • 7:04 - 7:06
    We could go the self-policing route,
  • 7:06 - 7:07
    where social media companies say,
  • 7:07 - 7:09
    you know what? You own your data.
  • 7:09 - 7:11
    You have total control over how it's used.
  • 7:11 - 7:13
    The problem is that the revenue models
  • 7:13 - 7:14
    for most social media companies
  • 7:14 - 7:18
    rely on sharing or exploiting users' data in some way.
  • 7:18 - 7:20
    It's sometimes said of Facebook that the users
  • 7:20 - 7:23
    aren't the customer, they're the product.
  • 7:23 - 7:25
    And so how do you get a company
  • 7:25 - 7:28
    to cede control of their main asset
  • 7:28 - 7:29
    back to the users?
  • 7:29 - 7:31
    It's possible, but I don't think it's something
  • 7:31 - 7:33
    that we're going to see change quickly.
  • 7:33 - 7:35
    So I think the other path
  • 7:35 - 7:37
    that we can go down that's
    going to be more effective
  • 7:37 - 7:39
    is one of more science.
  • 7:39 - 7:41
    It's doing science that allowed us to develop
  • 7:41 - 7:43
    all these mechanisms for computing
  • 7:43 - 7:45
    this personal data in the first place.
  • 7:45 - 7:47
    And it's actually very similar research
  • 7:47 - 7:49
    that we'd have to do
  • 7:49 - 7:51
    if we want to develop mechanisms
  • 7:51 - 7:52
    that can say to a user,
  • 7:52 - 7:55
    "Here's the risk of that action you just took."
  • 7:55 - 7:57
    You know, by liking that Facebook page,
  • 7:57 - 7:59
    or by sharing this piece of personal information,
  • 7:59 - 8:01
    you've now improved my ability
  • 8:01 - 8:03
    to predict whether or not you're using drugs
  • 8:03 - 8:05
    or whether or not you get
    along well in the workplace.
  • 8:05 - 8:07
    And that, I think, can affect whether or not
  • 8:07 - 8:09
    people want to share something,
  • 8:09 - 8:12
    keep it private, or just keep it offline altogether.
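
A rough sketch of what such a warning could compute, using a hypothetical likes-based trait predictor with made-up weights:

    # Sketch of "here's the risk of that action you just took":
    # how much does one more like move a (hypothetical) trait predictor?
    import math

    weights = {"curly_fries_page": 1.1, "mozart_page": 0.8, "cat_videos": 0.05}
    bias = -1.0

    def trait_probability(likes):
        score = bias + sum(weights.get(page, 0.0) for page in likes)
        return 1 / (1 + math.exp(-score))  # logistic link

    likes = {"cat_videos"}
    before = trait_probability(likes)
    after = trait_probability(likes | {"curly_fries_page"})
    print(f"Liking that page raises the predicted probability "
          f"from {before:.2f} to {after:.2f}")
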
  • 8:12 - 8:14
    We can also look at things like
  • 8:14 - 8:16
    allowing people to encrypt data that they upload,
  • 8:16 - 8:18
    so it's kind of invisible and worthless
  • 8:18 - 8:20
    to sites like Facebook
  • 8:20 - 8:22
    or third party services that access it,
  • 8:22 - 8:26
    but so that the select users whom the person who posted it
  • 8:26 - 8:28
    wants to see it can access it.
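
One way that kind of selective access could work in principle, sketched with Python's cryptography package; the hybrid scheme, key handling, and names below are illustrative assumptions, not a description of any existing site's mechanism:

    # Client-side encryption before upload: the platform stores only
    # ciphertext, and just the chosen friends can decrypt.
    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    # Each friend has a keypair (normally generated on their own device).
    friends = {name: rsa.generate_private_key(public_exponent=65537, key_size=2048)
               for name in ["alice", "bob"]}

    post = b"personal update meant only for Alice and Bob"

    # Encrypt the post once with a fresh symmetric key...
    content_key = Fernet.generate_key()
    ciphertext = Fernet(content_key).encrypt(post)

    # ...then wrap that key for each intended reader.
    wrap = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    wrapped_keys = {name: key.public_key().encrypt(content_key, wrap)
                    for name, key in friends.items()}

    # What gets uploaded: ciphertext plus wrapped keys, opaque to the site.
    upload = {"ciphertext": ciphertext, "keys": wrapped_keys}

    # A chosen friend can recover the post; the platform cannot.
    bob_key = friends["bob"].decrypt(upload["keys"]["bob"], wrap)
    print(Fernet(bob_key).decrypt(upload["ciphertext"]).decode())
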
  • 8:28 - 8:30
    This is all super-exciting research
  • 8:30 - 8:32
    from an intellectual perspective,
  • 8:32 - 8:34
    and so scientists are going to be willing to do it.
  • 8:34 - 8:38
    So that gives us an advantage over the law side.
  • 8:38 - 8:39
    One of the problems that people bring up
  • 8:39 - 8:41
    when I talk about this is, they say,
  • 8:41 - 8:44
    you know, if people start
    keeping all this data private,
  • 8:44 - 8:45
    all those methods that you've been developing
  • 8:45 - 8:48
    to predict their traits are going to fail.
  • 8:48 - 8:52
    And I say, absolutely, and for me, that's success,
  • 8:52 - 8:54
    because as a scientist,
  • 8:54 - 8:57
    my goal is not to infer information about users,
  • 8:57 - 9:00
    it's to improve the way people interact online.
  • 9:00 - 9:03
    And sometimes that involves
    inferring things about them,
  • 9:03 - 9:06
    but if users don't want me to use that data,
  • 9:06 - 9:08
    I think they should have the right to do that.
  • 9:08 - 9:11
    I want users to be informed and consenting
  • 9:11 - 9:13
    users of the tools that we develop.
  • 9:13 - 9:16
    And so I think encouraging this kind of science
  • 9:16 - 9:18
    and supporting researchers
  • 9:18 - 9:20
    who want to cede some of that control back to users
  • 9:20 - 9:23
    and away from the social media companies
  • 9:23 - 9:25
    means that going forward as these tools evolve
  • 9:25 - 9:27
    and advance
  • 9:27 - 9:28
    we're going to have an educated
  • 9:28 - 9:30
    and empowered user base,
  • 9:30 - 9:31
    and I think all of us can agree
  • 9:31 - 9:33
    that that's a pretty ideal way to go forward.
  • 9:33 - 9:36
    Thank you.
  • 9:36 - 9:39
    (Applause)
Title:
The curly fry conundrum: Why social media “likes” say more than you might think
Speaker:
Jennifer Golbeck
Description:

Video Language:
English
Team:
closed TED
Project:
TEDTalks
Duration:
10:01
  • do down -> go down

  • The description is different from the one on the TED page.
    http://www.ted.com/talks/jennifer_golbeck_the_curly_fry_conundrum_why_social_media_likes_say_more_than_you_might_think

  • He said, "How did Target figure out...
    # These are not the father's actual words.
    http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/

English subtitles
