For a culture of critical thought about data | Samuel Goëta | TEDxUTCompiègne

  • 0:25 - 0:29
    Each week I come across
    an article or a report
  • 0:29 - 0:33
    that asserts that data is the new oil,
  • 0:33 - 0:37
    that the use of data will lead
    to a new era of knowledge,
  • 0:37 - 0:39
    or even that it can predict the future.
  • 0:40 - 0:44
    This has been particularly true since
    everyone started talking about big data.
  • 0:44 - 0:47
    You know, the use of large-scale data,
    mega data.
  • 0:48 - 0:51
    For example, Sergey Brin,
    the co-founder of Google,
  • 0:51 - 0:53
    who is focusing on the use of medical data
  • 0:53 - 0:56
    to cure Parkinson's disease,
    for which he is at risk.
  • 0:57 - 1:00
    During the World Cup, many people said
  • 1:00 - 1:04
    that the German team was able to
    beat the Brazilian team 7-1
  • 1:04 - 1:06
    thanks to the use of match data.
  • 1:07 - 1:08
    It's clear
  • 1:08 - 1:12
    that there is no field or
    type of organization
  • 1:12 - 1:13
    for which Big Data
  • 1:13 - 1:16
    isn't supposed to be
    a magic wand that will enable
  • 1:16 - 1:19
    the resolution of extremely
    complex problems.
  • 1:20 - 1:23
    And I must admit that I feel uneasy
  • 1:23 - 1:26
    about these kinds
    of simplistic statements,
  • 1:26 - 1:29
    which I see as overshadowing
    a number of issues, including the economy,
  • 1:29 - 1:31
    the environment,
  • 1:31 - 1:31
    politics,
  • 1:31 - 1:34
    and the ethics
    of the massive production of data.
  • 1:35 - 1:38
    Please don't think that I am skeptical
  • 1:38 - 1:40
    or doubtful about data,
  • 1:40 - 1:43
    or that I am opposed
    to all forms of quantification.
  • 1:43 - 1:44
    On the contrary,
  • 1:44 - 1:46
    I live surrounded by data.
  • 1:46 - 1:50
    During the day, I'm working on a thesis
    in sociology at Telecom ParisTech
  • 1:50 - 1:51
    where I study Open Data.
  • 1:51 - 1:54
    The important effort
    to provide open access to public data.
  • 1:54 - 1:56
    And I study the consequences of Open Data
  • 1:56 - 1:59
    for the operation of government.
  • 1:59 - 2:02
    At night, I am the administrator
    for an association,
  • 2:02 - 2:04
    Open Knowledge Foundation France,
  • 2:04 - 2:08
    which campaigns for open knowledge
    and for data that benefits everyone.
  • 2:08 - 2:10
    Today, I would like to persuade you
  • 2:10 - 2:13
    that, at this time, when data
    is becoming obtrusive,
  • 2:13 - 2:16
    we need to take a step back.
  • 2:16 - 2:18
    This coronation of data
    that we are witnessing
  • 2:18 - 2:20
    during the era of Open Data and Big Data
  • 2:20 - 2:22
    demands a new culture of
    critical thought about data.
  • 2:22 - 2:27
    We must be able to understand
    how it is produced and used,
  • 2:27 - 2:29
    and how we can become
    independent from it.
  • 2:29 - 2:32
    I also want to share
    the results of an experiment
  • 2:32 - 2:35
    that we did at
    Open Knowledge Foundation France
  • 2:35 - 2:37
    called "the School of Data."
  • 2:37 - 2:40
    I hope to show that,
    through the use of data,
  • 2:40 - 2:44
    we can manage to develop
    this culture of critical thought
  • 2:44 - 2:47
    and that we can develop
    new checks and balances.
  • 2:47 - 2:49
    So, what are the problems with data?
  • 2:51 - 2:54
    The first problem is that
    data is always right.
  • 2:54 - 2:57
    Now, don't believe that this is
    anything new.
  • 2:57 - 3:01
    Historically, the word 'data'
    comes from the Latin word "datum"
  • 3:01 - 3:04
    which, in mathematics and theology
  • 3:04 - 3:06
    in the 15th century, referred to
  • 3:06 - 3:08
    the facts taken as given in an argument
  • 3:08 - 3:10
    and which were not to be
    called into question.
  • 3:10 - 3:12
    Today, as you know, data refers
  • 3:12 - 3:14
    to everything that flows
    in your computer.
  • 3:14 - 3:18
    That is to say, the 1's and 0's
    that pass from USB stick to hard disk
  • 3:18 - 3:19
    are considered data.
  • 3:20 - 3:21
    On the other hand,
  • 3:21 - 3:22
    the sense that data is a given,
  • 3:22 - 3:24
    that it is factual,
  • 3:24 - 3:25
    that it is not to be questioned,
  • 3:25 - 3:26
    has remained.
  • 3:28 - 3:30
    The second problem with data
  • 3:30 - 3:32
    is that we don't really know
    where it comes from.
  • 3:33 - 3:35
    In general, when someone uses data,
  • 3:35 - 3:37
    he or she has very little information
    about the way
  • 3:37 - 3:39
    in which it was produced.
  • 3:39 - 3:41
    At best, you will have access to metadata,
  • 3:41 - 3:43
    that is to say, data about the data,
  • 3:43 - 3:47
    which will tell you the contents
    of the file and, occasionally,
  • 3:47 - 3:49
    how the data was produced.
  • 3:50 - 3:52
    However, that data has a long history.
  • 3:53 - 3:55
    It was collected.
  • 3:55 - 3:57
    It was processed, formatted,
  • 3:57 - 4:00
    aggregated, processed by algorithms,
  • 4:00 - 4:03
    and visualized before reaching you.
  • 4:03 - 4:05
    This is why sociologist
    Bruno Latour asserts
  • 4:05 - 4:08
    that we should say 'obtaineds'
    instead of data
  • 4:08 - 4:10
    to accurately reflect this long history
  • 4:10 - 4:13
    which will constrain a number of uses.
  • 4:13 - 4:16
    Finally, the third problem with data
  • 4:16 - 4:18
    is that we can't really see it.
  • 4:18 - 4:20
    Have you ever seen a data center,
  • 4:20 - 4:23
    even if only from outside,
    or from the road?
  • 4:23 - 4:26
    Do you have any idea
    of where your data is stored?
  • 4:26 - 4:29
    I mean, physically, where it is stored?
  • 4:29 - 4:32
    Do you have any idea what will happen
    to it in 10 years?
  • 4:32 - 4:34
    In any case, I have no answer
    for these three questions.
  • 4:34 - 4:38
    However, even if we can't see our data,
    we can measure its effects.
  • 4:38 - 4:41
    At the individual level,
  • 4:41 - 4:44
    when Facebook changes its terms of service
  • 4:44 - 4:48
    or modifies its algorithm, it has
    consequences for your private life
  • 4:48 - 4:51
    and for the way in which you present
    yourself as an individual.
  • 4:51 - 4:54
    And at the macroscopic level,
    the Snowden affair has shown
  • 4:54 - 4:56
    that the massive production of data
  • 4:56 - 4:59
    can have consequences
    for the sovereignty of the State
  • 4:59 - 5:00
    or for diplomacy.
  • 5:02 - 5:04
    This is why we must develop a culture
  • 5:04 - 5:05
    of critical thought about data.
  • 5:05 - 5:07
    To encourage myself,
  • 5:07 - 5:12
    I was inspired by a book
    called "Statactivism."
  • 5:12 - 5:14
    Statactivism is a neologism
  • 5:14 - 5:17
    proposed by researchers and artists
  • 5:17 - 5:21
    that refers to those experiences
    that permit one to liberate oneself
  • 5:21 - 5:23
    from the power of data.
  • 5:23 - 5:25
    The fundamental basis of statactivism
  • 5:25 - 5:27
    is that data controls us,
  • 5:27 - 5:30
    and that it imposes on us
    like an argument from authority.
  • 5:30 - 5:34
    The goal of statactivism
    is almost revolutionary.
  • 5:34 - 5:36
    It asserts that other kinds of data
    must be possible.
  • 5:36 - 5:39
    It is not necessary to be opposed
    to all data.
  • 5:39 - 5:41
    Instead, we should use the power of data
  • 5:41 - 5:43
    to propose other realities,
  • 5:43 - 5:45
    to critique data more effectively,
  • 5:45 - 5:47
    or to propose other measures.
  • 5:47 - 5:49
    In short, to propose other data.
  • 5:49 - 5:53
    There is a motif in the book
    which I find particularly meaningful,
  • 5:53 - 5:54
    that of the judoka.
  • 5:55 - 5:59
    Judoka use the strength of their opponents
    in order to turn it back on them.
  • 5:59 - 6:02
    That is what I want to invite you
    to do today:
  • 6:02 - 6:07
    think about how to use data
    to better analyze it.
  • 6:08 - 6:12
    I think, precisely at this moment
    in the development of Open Data,
  • 6:12 - 6:14
    the need to develop a culture
    of critical thought about data
  • 6:14 - 6:17
    is increasingly crucial.
  • 6:17 - 6:21
    Don't be misled: Open Data represents
    an extraordinary opportunity.
  • 6:21 - 6:23
    The volume of data is exploding
  • 6:23 - 6:26
    and data is no longer
    the privilege of the powerful.
  • 6:26 - 6:28
    Today, you can use data
  • 6:28 - 6:30
    without asking anyone's permission.
  • 6:30 - 6:34
    And this is a good idea,
    because public data is available.
  • 6:34 - 6:35
    But I think that there is a risk
  • 6:35 - 6:38
    to thinking
    that the simple diffusion of data
  • 6:38 - 6:40
    will be enough to emancipate society,
  • 6:40 - 6:44
    that individuals can emancipate
    themselves from the power of data
  • 6:44 - 6:46
    just because they have access to data.
  • 6:46 - 6:49
    There is a Canadian sociologist
    named Michael Gurstein
  • 6:49 - 6:53
    who has proposed an expression
    that sums up a risk of Open Data,
  • 6:53 - 6:57
    namely, "Empower the Empowered,"
  • 6:57 - 6:59
    meaning to give more power
    to those who already have it.
  • 6:59 - 7:03
    That is why it's crucial
    to develop a culture of critical thought
  • 7:03 - 7:06
    to be able to understand how data
    is produced,
  • 7:06 - 7:09
    used, and how you can use it
    to take a step back.
  • 7:10 - 7:11
    Well, that's the theory.
  • 7:11 - 7:15
    I would like to share with you
    the first results from an experiment
  • 7:15 - 7:18
    that we did in my association:
    Open Knowledge Foundation France.
  • 7:18 - 7:22
    We are part of a worldwide network
    dedicated to open knowledge and open data.
  • 7:22 - 7:24
    We have groups in more than 50 countries.
  • 7:24 - 7:27
    And the idea of our association
    and of this worldwide movement
  • 7:27 - 7:29
    is that each person can benefit,
    can profit,
  • 7:29 - 7:33
    from works, scientific articles,
    and content,
  • 7:33 - 7:37
    to create, play, educate,
    or to start up a business.
  • 7:38 - 7:41
    Open Knowledge has
    a large number of projects.
  • 7:41 - 7:44
    I'm going to talk about one project,
    the "School of Data."
  • 7:44 - 7:47
    We participated together
    in the translation of this project,
  • 7:47 - 7:49
    this "School of Data."
  • 7:49 - 7:52
    The School of Data consists
    of online resources
  • 7:52 - 7:54
    that are free and accessible to all,
  • 7:54 - 7:55
    and also events.
  • 7:56 - 7:58
    We first proposed classes.
  • 7:59 - 8:02
    In these classes, you do not even have
    to know what data is.
  • 8:02 - 8:05
    Or how to use a spreadsheet,
    which is really the tool of choice.
  • 8:05 - 8:07
    You will be taught about that
    in our class.
  • 8:08 - 8:10
    No expertise is required,
  • 8:10 - 8:15
    you are guided step by step
    in the use of data.
  • 8:15 - 8:18
    We also use another format
    which is particularly educational,
  • 8:18 - 8:19
    namely, the recipe.
  • 8:19 - 8:22
    Recipes are just like in cooking -
    you have ingredients
  • 8:22 - 8:23
    and steps.
  • 8:23 - 8:25
    The ingredients will be data,
  • 8:25 - 8:28
    software - free if possible,
  • 8:28 - 8:31
    so that you can use data.
  • 8:31 - 8:34
    The idea is that making a map
    of electoral results,
  • 8:34 - 8:37
    or a graph of results
    of the French soccer team
  • 8:37 - 8:40
    should be as easy to do as making
    a tarte Tatin or Bechamel sauce.
  • 8:40 - 8:42
    You find the resources online
  • 8:42 - 8:45
    and we walk you through the project
    step by step.
  • 8:45 - 8:48
    We also have tried to develop
    another format for in-person sessions,
  • 8:48 - 8:50
    which we call expeditions.
  • 8:50 - 8:53
    For expeditions, it's like
    mountain climbing:
  • 8:53 - 8:55
    you have a guide, a "data sherpa,"
  • 8:55 - 8:57
    who will accompany you,
  • 8:57 - 8:58
    attached by a rope.
  • 8:58 - 9:01
    There will be 10 or 20 participants
  • 9:01 - 9:05
    who work together during a weekend
    or sometimes for a few hours.
  • 9:06 - 9:08
    Our first data expedition
  • 9:08 - 9:11
    focused on the question of air pollution
    in Île-de-France.
  • 9:11 - 9:13
    I don't know if you have seen
  • 9:13 - 9:15
    these images of Paris
    with black clouds of pollution.
  • 9:15 - 9:18
    They left their mark on us,
    and we said to ourselves:
  • 9:18 - 9:22
    "Well, let's dig into this set of data."
  • 9:22 - 9:25
    The first step, when we undertook
    this data expedition,
  • 9:25 - 9:28
    was to identify the available data.
  • 9:28 - 9:31
    We realized that there is
    no available data
  • 9:31 - 9:35
    that is freely reusable, that is to say,
    that you have the right to reuse
  • 9:35 - 9:37
    without asking for permission,
    on this crucial question.
  • 9:37 - 9:41
    Therefore, we had to extract data
    from websites,
  • 9:41 - 9:43
    reports, or even from graphics.
  • 9:43 - 9:47
    Imagine what a mess it is to extract
    data that is in a graphic.
  • 9:48 - 9:50
    We also realized that Airparif,
  • 9:50 - 9:53
    the organization responsible
    for the production of data
  • 9:53 - 9:56
    relevant to the question
    of air pollution in Île-de-France
  • 9:56 - 10:00
    does not allow you to use
    its data freely.
  • 10:00 - 10:02
    One must ask permission, or pay.
  • 10:03 - 10:05
    We were able to overcome
    these constraints
  • 10:05 - 10:08
    and to conduct this expedition
  • 10:08 - 10:11
    guided by our sherpa, Pierre.
  • 10:11 - 10:14
    During this data expedition
  • 10:14 - 10:18
    we broke into small groups,
    and each group was assigned an angle.
  • 10:18 - 10:22
    One of the principles of the expeditions:
    you have an angle, like in journalism;
  • 10:22 - 10:25
    we ask ourselves questions that could be
    the title of an article.
  • 10:25 - 10:27
    The first group asked itself
  • 10:27 - 10:31
    if bicycle riding had led to a decrease
    in air pollution in Paris.
  • 10:31 - 10:33
    The second group,
  • 10:33 - 10:34
    since it was during a strike,
  • 10:34 - 10:37
    asked itself if public transport strikes
  • 10:37 - 10:41
    cause air pollution in Île-de-France
    to increase.
  • 10:41 - 10:44
    And the third group asked if
    all regions are equal
  • 10:44 - 10:48
    with regard to air pollution,
    or if geography and environment
  • 10:48 - 10:51
    could have an effect, and if so,
    could be seen in the data.
  • 10:54 - 10:55
    The results of this expedition,
  • 10:55 - 10:57
    I am sorry to say, will be a bit
    disappointing.
  • 10:57 - 11:01
    We did not find any correlation
    or causal connection
  • 11:01 - 11:04
    with nice data points,
    a fitting curve, or a straight line,
  • 11:04 - 11:07
    that proves that our hypotheses
    are correct.
  • 11:07 - 11:10
    We did not succeed at that,
    but we worked for four hours.
  • 11:10 - 11:12
    What we did manage to show,
    on the other hand,
  • 11:12 - 11:14
    is that it is extremely difficult
  • 11:14 - 11:18
    to use data concerning a question
    as crucial as air pollution,
  • 11:18 - 11:20
    to understand how it is produced,
  • 11:20 - 11:23
    extremely difficult to use it,
  • 11:23 - 11:26
    that the simplest measurements
    are not accessible,
  • 11:26 - 11:29
    and that you do not necessarily have
    the right to reuse them.
  • 11:29 - 11:32
    That is just what we tried
    to do at this event:
  • 11:32 - 11:35
    to develop a culture of critical thought
    on the way in which data
  • 11:35 - 11:38
    is used concerning the question
    of air pollution.
  • 11:38 - 11:43
    We also tried to develop this format
    of expeditions and training events
  • 11:43 - 11:45
    with another group
  • 11:45 - 11:46
    that is less expected,
  • 11:46 - 11:48
    that of children.
  • 11:48 - 11:53
    We asked ourselves the question
    during an event that we did with Etalab,
  • 11:53 - 11:57
    the government institution
    in charge of data.gouv.fr,
  • 11:57 - 12:00
    the open data portal
    of the French government.
  • 12:00 - 12:04
    We suggested the idea
    of radically different open data portals.
  • 12:04 - 12:08
    They were fictional projects,
    just prototypes.
  • 12:08 - 12:13
    One group came up
    with a prototype called Tada.gouv.fr.
  • 12:13 - 12:17
    Tada.gouv.fr is a fictional portal,
    a bit idealistic, intended for children.
  • 12:18 - 12:22
    The data is presented
    not by government department or ministry,
  • 12:22 - 12:25
    but by discipline, that is to say
    that you have data
  • 12:25 - 12:28
    about history and geography,
    physics and chemistry,
  • 12:28 - 12:29
    or life and Earth sciences.
  • 12:29 - 12:32
    On this occasion, we realized
    that open data
  • 12:32 - 12:35
    can be a fantastic resource
    for school
  • 12:35 - 12:38
    because it allows the development
    of inter-disciplinary work,
  • 12:38 - 12:41
    and this culture of critical thought
    about data I have mentioned.
  • 12:42 - 12:44
    We did not stop at that observation.
  • 12:44 - 12:46
    We tried to do a first experiment
  • 12:46 - 12:49
    and I would like to tell you
    about the first results.
  • 12:50 - 12:52
    We joined with Silicon Banlieue,
  • 12:52 - 12:54
    which is a site dedicated
    to data in Argenteuil,
  • 12:54 - 12:56
    and we proposed to do an event
  • 12:56 - 12:58
    with children between 8 and 14 years old
  • 12:58 - 13:00
    who came to the Open World Forum,
  • 13:00 - 13:03
    an event dedicated
    to open computing in Paris.
  • 13:03 - 13:05
    There, you can see me from the back.
  • 13:05 - 13:09
    With the 8 to 14 year old children,
    we worked on the question of cinema,
  • 13:09 - 13:12
    because that interested them,
    and it is a simple enough subject.
  • 13:12 - 13:14
    First we collected data,
  • 13:14 - 13:17
    nothing very complicated,
    it was just a paper form.
  • 13:17 - 13:20
    We asked them how many times a month
    they go to the cinema,
  • 13:20 - 13:24
    which movies they saw from a list;
    then we compared that with data
  • 13:24 - 13:27
    that is available from the survey
    of French cultural practices,
  • 13:27 - 13:30
    on which you have
    exactly the same type of data.
  • 13:30 - 13:34
    With the children, we produced
    an infographic on that occasion.
  • 13:34 - 13:38
    Now, I am really bad at math:
    I got a 7.5 on the Bac, and yet
  • 13:38 - 13:41
    I found myself explaining
    the concept and calculation
  • 13:41 - 13:44
    of averages using a spreadsheet,
    which was rather surprising.
  • 13:44 - 13:45
    I explained how it works.
  • 13:45 - 13:50
    We emerged with an infographic
    and we were able on this occasion,
  • 13:50 - 13:53
    I think that this is the important point,
    to develop a culture of critical thought.
  • 13:53 - 13:57
    I explained to them about data,
    how it is used,
  • 13:57 - 13:58
    how they can use it,
  • 13:58 - 14:01
    how it controls us in a certain way,
  • 14:01 - 14:04
    but that we can also take back
    the power over data.
  • 14:04 - 14:07
    I assure you that with a topic
    as attractive as cinema
  • 14:07 - 14:09
    we can deliver this kind of message
  • 14:09 - 14:11
    and have a discussion on these questions.
  • 14:12 - 14:15
    I hope that I have convinced you
    that it is necessary today
  • 14:15 - 14:17
    to take a step back with regard to data,
  • 14:17 - 14:21
    to develop a culture of critical thought,
    to understand
  • 14:21 - 14:24
    how it is produced
    and how you can use it,
  • 14:24 - 14:27
    to prevent data from being forced on you.
  • 14:27 - 14:30
    So from today,
    get your hands dirty,
  • 14:30 - 14:32
    find a sherpa,
    all of the resources are online,
  • 14:32 - 14:34
    and go on a data expedition.
  • 14:34 - 14:35
    Thank you.
  • 14:35 - 14:37
    (Applause)
Title:
For a culture of critical thought about data | Samuel Goëta | TEDxUTCompiègne
Description:

This presentation was given at a local TEDx event, produced independently of the TED Conferences.

What is data? How do we collect it? How should we approach it? Who has it? What can we do with it? How can we take control of our data and the data at our disposal?

A current doctoral student in sociology at Telecom ParisTech, Samuel Goëta studies the impact of open data policies on organizations and the production of data. Co-founder and administrator of the association Open Knowledge France, he campaigns for open knowledge (content and data) that will benefit everyone. In particular, he participated in the launch of the School of Data project (http://ecoledesdonnes.org), which allows anyone to use open data without any prior experience.

Video Language:
French
Team:
closed TED
Project:
TEDxTalks
Duration:
14:38
