0:00:24.767,0:00:29.430 Each week I come across [br]an article or a report 0:00:29.430,0:00:33.476 that asserts that data is the new oil, 0:00:33.476,0:00:36.702 that the use of data will lead[br]to a new era of knowledge, 0:00:37.372,0:00:39.300 or even that it can predict the future. 0:00:39.700,0:00:43.504 This has been particularly true since[br]everyone started talking about Big Data. 0:00:43.504,0:00:46.804 You know, the use of large-scale data, [br]mega data. 0:00:47.723,0:00:50.673 For example, Sergey Brin, [br]co-founder of Google, 0:00:50.673,0:00:53.490 who is focusing on the use of medical data 0:00:53.490,0:00:56.290 to cure Parkinson's disease, [br]for which he is at risk. 0:00:57.044,0:01:00.014 During the World Cup, many people said 0:01:00.014,0:01:03.673 that the German team was able to[br]beat the Brazilian team 7-1 0:01:03.673,0:01:06.033 thanks to the use of match data. 0:01:07.263,0:01:08.263 It's clear 0:01:08.263,0:01:11.525 that there is no field or [br]type of organization 0:01:11.525,0:01:13.082 for which Big Data 0:01:13.082,0:01:16.352 isn't supposed to be[br]a magic wand that will enable 0:01:16.352,0:01:18.642 the resolution of extremely [br]complex problems. 0:01:19.741,0:01:22.741 And I must admit that I feel uneasy 0:01:22.743,0:01:25.537 about these kinds[br]of simplistic statements, 0:01:25.537,0:01:29.187 which I see as overshadowing [br]a number of issues, including the economy, 0:01:29.195,0:01:30.505 the environment, 0:01:30.507,0:01:31.402 politics, 0:01:31.402,0:01:33.802 and the ethics [br]of the massive production of data. 0:01:35.233,0:01:37.638 Please don't think that I am skeptical 0:01:37.638,0:01:39.918 or doubtful about data, 0:01:39.925,0:01:42.505 or that I am opposed [br]to all forms of quantification. 0:01:42.803,0:01:43.803 On the contrary, 0:01:43.803,0:01:45.912 I live surrounded by data.
0:01:45.912,0:01:49.870 During the day, I'm working on a thesis[br]in sociology at Telecom ParisTech 0:01:49.870,0:01:51.102 where I study Open Data. 0:01:51.102,0:01:54.186 The important effort[br]to provide open access to public data. 0:01:54.186,0:01:56.496 And I study the consequences of Open Data 0:01:56.496,0:01:58.772 for the operation of government. 0:01:58.772,0:02:01.627 At night, I am the administrator[br]for an association, 0:02:01.627,0:02:03.653 Open Knowledge Foundation France, 0:02:03.653,0:02:08.310 which campaigns for open knowledge[br]and for data that benefits everyone. 0:02:08.310,0:02:10.227 Today, I would like to persuade you 0:02:10.227,0:02:13.114 that, at this time, when data[br]is becoming obtrusive, 0:02:13.114,0:02:15.583 we need to take a step back. 0:02:15.583,0:02:17.574 This coronation of data [br]that we are witnessing 0:02:17.574,0:02:19.548 during the era of Open Data and Big Data 0:02:19.548,0:02:22.364 demands a new culture of[br]critical thought about data. 0:02:22.364,0:02:26.784 We must be able to understand[br]how it is produced and used, 0:02:26.784,0:02:28.999 and how we can become[br]independent from it. 0:02:28.999,0:02:32.468 I also want to share [br]the results of an experiment 0:02:32.468,0:02:35.279 that we did at[br]Open Knowledge Foundation France 0:02:35.279,0:02:37.246 called "the School of Data." 0:02:37.246,0:02:40.468 I hope to show that, [br]through the use of data, 0:02:40.468,0:02:43.528 we can manage to develop[br]this culture of critical thought 0:02:43.533,0:02:46.513 and that we can develop [br]new checks and balances. 0:02:46.513,0:02:49.291 So, what are the problems with data? 0:02:50.659,0:02:53.711 The first problem is that[br]data is always right. 0:02:53.711,0:02:56.858 Now, don't believe that this is[br]anything new. 
0:02:56.858,0:03:00.511 Historically, the word "data"[br]comes from the Latin word "datum" 0:03:00.511,0:03:03.791 which, in mathematics and theology 0:03:03.791,0:03:05.534 in the 15th century, referred to 0:03:05.534,0:03:07.689 the facts taken as given in an argument 0:03:07.689,0:03:09.919 and which were not to be [br]called into question. 0:03:09.919,0:03:12.311 Today, as you know, data refers 0:03:12.321,0:03:14.331 to everything that flows [br]in your computer. 0:03:14.331,0:03:17.830 That is to say, the 1's and 0's[br]that pass from USB stick to hard disk 0:03:17.830,0:03:18.830 are considered data. 0:03:19.774,0:03:20.802 On the other hand, 0:03:20.802,0:03:22.432 the sense that data is a given, 0:03:22.432,0:03:23.598 that it is factual, 0:03:23.598,0:03:25.307 that it is not to be questioned, 0:03:25.307,0:03:26.007 has remained. 0:03:27.825,0:03:29.865 The second problem with data 0:03:29.865,0:03:32.407 is that we don't really know[br]where it comes from. 0:03:32.574,0:03:34.885 In general, when someone uses data, 0:03:34.885,0:03:37.418 he or she has very little information[br]about the way 0:03:37.418,0:03:38.918 in which it was produced. 0:03:38.918,0:03:41.068 At best, you will have access to metadata, 0:03:41.068,0:03:43.094 that is to say, data about the data, 0:03:43.094,0:03:46.984 which will tell you the contents[br]of the file and, occasionally, 0:03:46.984,0:03:49.077 how the data was produced. 0:03:49.938,0:03:52.458 However, that data has a long history. 0:03:53.494,0:03:54.744 It was collected. 0:03:54.744,0:03:57.073 It was processed, formatted, 0:03:57.073,0:03:59.871 aggregated, processed by algorithms, 0:03:59.871,0:04:03.083 and visualized before reaching you. 0:04:03.083,0:04:05.383 This is why sociologist [br]Bruno Latour asserts 0:04:05.393,0:04:07.833 that we should say "obtaineds" [br]instead of data 0:04:07.833,0:04:10.167 to accurately reflect this long history 0:04:10.167,0:04:12.557 which will constrain a number of uses.
0:04:13.268,0:04:15.939 Finally, the third problem with data 0:04:15.939,0:04:17.869 is that we can't really see it. 0:04:17.869,0:04:19.915 Have you ever seen a data center, 0:04:19.915,0:04:22.871 even if only from outside, [br]or from the road? 0:04:22.871,0:04:25.840 Do you have any idea [br]of where your data is stored? 0:04:25.840,0:04:28.521 I mean, physically, where it is stored? 0:04:28.521,0:04:31.512 Do you have any idea what will happen[br]to it in 10 years? 0:04:31.512,0:04:34.483 In any case, I have no answer [br]for these three questions. 0:04:34.483,0:04:38.173 However, even if we can't see our data,[br]we can measure its effects. 0:04:38.173,0:04:41.251 At the individual level, 0:04:41.251,0:04:43.771 when Facebook changes its terms of service 0:04:43.771,0:04:47.685 or modifies its algorithm, it has [br]consequences for your private life 0:04:47.685,0:04:50.720 and for the way in which you present [br]yourself as an individual. 0:04:50.720,0:04:53.620 And on the most macroscopic level,[br]the Snowden affair has shown 0:04:53.620,0:04:56.120 that the massive production of data 0:04:56.120,0:04:58.943 can have consequences [br]for the sovereignty of the State 0:04:58.943,0:05:00.433 or for diplomacy. 0:05:01.509,0:05:03.699 This is why we must develop a culture 0:05:03.699,0:05:05.465 of critical thought about data. 0:05:05.465,0:05:06.715 To encourage myself, 0:05:06.715,0:05:11.675 I was inspired by a book [br]called "Statactivism." 0:05:11.675,0:05:13.811 Statactivism is a neologism 0:05:13.811,0:05:16.811 proposed by researchers and artists 0:05:16.811,0:05:21.273 that refers to those experiences [br]that permit one to liberate oneself 0:05:21.273,0:05:22.989 from the power of data. 0:05:22.989,0:05:25.268 The fundamental basis of statactivism 0:05:25.268,0:05:27.456 is that data controls us, 0:05:27.456,0:05:30.379 and that it imposes on us[br]like an argument from authority. 
0:05:30.379,0:05:33.787 The goal of statactivism [br]is almost revolutionary. 0:05:33.787,0:05:36.293 It asserts that other kinds of data[br]must be possible. 0:05:36.293,0:05:38.697 It is not necessary to be opposed[br]to all data. 0:05:38.697,0:05:41.197 Instead, we should use the power of data 0:05:41.197,0:05:42.728 to propose other realities, 0:05:42.730,0:05:44.557 to critique data more effectively, 0:05:44.557,0:05:46.744 or to propose other measures. 0:05:46.744,0:05:49.054 In short, to propose other data. 0:05:49.056,0:05:52.516 There is a motif in the book[br]which I find particularly meaningful, 0:05:52.520,0:05:53.950 that of the judoka. 0:05:54.947,0:05:58.617 Judoka use the strength of their opponents[br]in order to turn it back on them. 0:05:59.490,0:06:02.273 That is what I want to invite you [br]to do today: 0:06:02.273,0:06:06.723 think about how to use data [br]to better analyze it. 0:06:08.372,0:06:11.529 I think, precisely at this moment[br]in the development of Open Data, 0:06:11.529,0:06:14.413 the need to develop a culture [br]of critical thought about data 0:06:14.413,0:06:16.633 is increasingly crucial. 0:06:17.462,0:06:20.982 Don't be misled: Open Data represents[br]an extraordinary opportunity. 0:06:20.987,0:06:23.027 The volume of data is exploding 0:06:23.027,0:06:25.906 and data is no longer [br]the privilege of the powerful. 0:06:25.906,0:06:28.141 Today, you can use data 0:06:28.141,0:06:30.119 without asking anyone's permission. 0:06:30.119,0:06:33.727 And this is a good idea, [br]because public data is available. 0:06:33.727,0:06:35.381 But I think that there is a risk 0:06:35.381,0:06:38.066 to thinking [br]that the simple diffusion of data 0:06:38.066,0:06:40.106 will be enough to emancipate society, 0:06:40.112,0:06:43.539 that individuals can emancipate [br]themselves from the power of data 0:06:43.539,0:06:45.510 just because they have access to data.
0:06:46.390,0:06:49.253 There is a Canadian sociologist[br]named Michael Gurstein 0:06:49.253,0:06:53.473 who has proposed an expression [br]that sums up a risk of Open Data, 0:06:53.477,0:06:56.587 namely, "Empower the Empowered," 0:06:56.587,0:06:59.475 meaning to give more power [br]to those who already have it. 0:06:59.475,0:07:02.938 That is why it's crucial[br]to develop a culture of critical thought 0:07:02.938,0:07:05.578 to be able to understand how data[br]is produced, 0:07:05.578,0:07:09.267 used, and how you can use it [br]to take a step back. 0:07:10.005,0:07:11.360 Well, that's the theory. 0:07:11.360,0:07:14.550 I would like to share with you[br]the first results from an experiment 0:07:14.550,0:07:18.254 that we did in my association: [br]Open Knowledge Foundation France. 0:07:18.254,0:07:22.028 We are part of a worldwide network [br]dedicated to open knowledge and open data. 0:07:22.028,0:07:24.005 We have groups in more than 50 countries. 0:07:24.005,0:07:26.926 And the idea of our association[br]and of this worldwide movement 0:07:26.926,0:07:29.483 is that each person can benefit,[br]can profit, 0:07:29.483,0:07:33.483 from works, scientific articles, [br]and content, 0:07:33.483,0:07:37.344 to create, play, educate, [br]or to start up a business. 0:07:38.225,0:07:40.913 Open Knowledge has [br]a large number of projects. 0:07:40.913,0:07:43.909 I'm going to talk about one project,[br]the "School of Data." 0:07:43.909,0:07:47.433 We participated together [br]in the translation of this project, 0:07:47.433,0:07:48.885 this "School of Data." 0:07:48.885,0:07:51.744 The School of Data consists [br]of online resources 0:07:51.744,0:07:53.834 that are free and accessible to all, 0:07:53.834,0:07:55.339 and also events. 0:07:56.306,0:07:58.097 We first proposed classes. 0:07:58.603,0:08:02.016 In these classes, you do not even have [br]to know what data is. 0:08:02.016,0:08:05.162 Or how to use a spreadsheet,[br]which is really the tool of choice.
0:08:05.162,0:08:07.444 You will be taught about that[br]in our class. 0:08:07.931,0:08:10.159 No expertise is required, 0:08:10.159,0:08:14.519 you are guided step by step[br]in the use of data. 0:08:14.519,0:08:17.739 We also use another format[br]which is particularly educational, 0:08:17.739,0:08:19.279 namely, the recipe. 0:08:19.287,0:08:22.290 Recipes are just like in cooking - [br]you have ingredients 0:08:22.290,0:08:23.293 and steps. 0:08:23.293,0:08:25.284 The ingredients will be data, 0:08:25.284,0:08:27.994 software - free if possible, 0:08:27.994,0:08:31.329 so that you can use data. 0:08:31.329,0:08:34.440 The idea is that making a map[br]of electoral results, 0:08:34.440,0:08:36.879 or a graph of results [br]of the French soccer team 0:08:36.879,0:08:40.142 should be as easy to do as making[br]a tarte Tatin or Bechamel sauce. 0:08:40.142,0:08:41.895 You find the resources online 0:08:41.895,0:08:44.595 and we walk you through the project[br]step by step. 0:08:44.595,0:08:48.040 We also have tried to develop[br]another format for in-person sessions, 0:08:48.040,0:08:49.741 which we call expeditions. 0:08:50.117,0:08:52.644 For expeditions, it's like[br]mountain climbing: 0:08:52.644,0:08:55.103 you have a guide, a "data sherpa," 0:08:55.103,0:08:56.993 who will accompany you, 0:08:56.993,0:08:58.233 attached by a rope. 0:08:58.233,0:09:01.432 There will be 10 or 20 participants 0:09:01.432,0:09:05.267 who work together during a weekend[br]or sometimes for a few hours. 0:09:05.769,0:09:07.631 Our first data expedition 0:09:07.631,0:09:10.991 focused on the question of air pollution[br]in Île-de-France. 0:09:10.991,0:09:12.675 I don't know if you have seen 0:09:12.675,0:09:15.325 these images of Paris [br]with black clouds of pollution. 0:09:15.325,0:09:18.060 They left their mark on us, [br]and we said to ourselves: 0:09:18.060,0:09:21.729 "Well, let's dig into this set of data." 
0:09:22.458,0:09:25.469 The first step, when we undertook[br]this data expedition, 0:09:25.469,0:09:27.741 was to identify the available data. 0:09:27.749,0:09:31.041 We realized that there is[br]no available data 0:09:31.041,0:09:34.801 that is freely reusable, that is to say,[br]that you have the right to reuse 0:09:34.801,0:09:37.466 without asking for permission,[br]on this crucial question. 0:09:37.466,0:09:40.679 Therefore, we had to extract data[br]from websites, 0:09:40.679,0:09:43.299 reports, or even from graphics. 0:09:43.299,0:09:47.127 Imagine what a mess it is to extract[br]data that is in a graphic. 0:09:47.811,0:09:50.402 We also realized that Airparif, 0:09:50.402,0:09:53.161 the organization responsible [br]for the production of data 0:09:53.161,0:09:56.411 relevant to the question [br]of air pollution in Île-de-France 0:09:56.411,0:09:59.581 does not allow you to use[br]its data freely. 0:09:59.581,0:10:02.127 One must ask permission, or pay. 0:10:03.130,0:10:05.164 We were able to overcome[br]these constraints 0:10:05.164,0:10:07.701 and to conduct this expedition 0:10:07.701,0:10:10.571 guided by our sherpa, Pierre. 0:10:10.571,0:10:13.987 During this data expedition 0:10:13.987,0:10:18.227 we broke into small groups,[br]and each group was assigned an angle. 0:10:18.227,0:10:22.257 One of the principles of the expeditions:[br]you have an angle, like in journalism, 0:10:22.257,0:10:25.316 we ask ourselves questions that could be[br]the title of an article. 0:10:25.316,0:10:26.996 The first group asked itself 0:10:26.999,0:10:31.484 if bicycle riding had led to a decrease[br]in air pollution in Paris. 0:10:31.484,0:10:32.524 The second group, 0:10:32.524,0:10:34.254 since it was during a strike, 0:10:34.254,0:10:37.464 asked itself if public transport strikes 0:10:37.464,0:10:40.616 cause air pollution in Île-de-France [br]to increase.
0:10:40.984,0:10:44.384 And the third group asked if[br]all regions are equal 0:10:44.384,0:10:48.424 with regard to air pollution,[br]or if geography and environment 0:10:48.424,0:10:51.438 could have an effect, and if so, [br]could be seen in the data. 0:10:53.617,0:10:55.173 The results of this expedition, 0:10:55.173,0:10:57.373 I am sorry to say, will be a bit [br]disappointing. 0:10:57.373,0:11:01.440 We did not find any correlation[br]or causal connection 0:11:01.440,0:11:04.320 with nice data points, [br]a fitting curve, or a straight line, 0:11:04.320,0:11:06.546 that proves that our hypotheses[br]are correct. 0:11:06.546,0:11:10.186 We did not succeed at that,[br]but we worked for four hours. 0:11:10.186,0:11:12.196 What we did manage to show,[br]on the other hand, 0:11:12.196,0:11:14.007 is that it is extremely difficult 0:11:14.007,0:11:17.838 to use data concerning a question[br]as crucial as air pollution, 0:11:17.838,0:11:20.438 to understand how it is produced, 0:11:20.438,0:11:23.228 extremely difficult to use it, 0:11:23.228,0:11:26.084 that the most simple measurements[br]are not accessible, 0:11:26.084,0:11:29.336 and that you do not necessarily have[br]the right to reuse them. 0:11:29.336,0:11:32.346 That is just what we tried [br]to do at this event: 0:11:32.346,0:11:35.426 to develop a culture of critical thought[br]on the way in which data 0:11:35.426,0:11:38.491 is used concerning the question[br]of air pollution. 0:11:38.491,0:11:43.359 We also tried to develop this format [br]of expeditions and training events 0:11:43.359,0:11:45.149 with another group 0:11:45.149,0:11:46.239 that is less expected, 0:11:46.239,0:11:48.131 that of children. 0:11:48.131,0:11:53.061 We asked ourselves the question[br]during an event that we did with Etalab, 0:11:53.061,0:11:56.661 the government institution[br]in charge of data.gouv.fr, 0:11:56.661,0:11:59.500 the open data portal[br]of the French government. 
0:11:59.500,0:12:03.737 We suggested the idea [br]of radically different open data portals. 0:12:03.737,0:12:07.767 They were fictional projects,[br]just prototypes. 0:12:07.767,0:12:12.951 There is a group that has come out[br]with a prototype called Tada.gouv.fr. 0:12:12.951,0:12:17.132 Tada.gouv.fr is a fictional portal,[br]a bit idealistic, destined for children. 0:12:17.775,0:12:21.898 The data is presented[br]not by government department or minister, 0:12:21.898,0:12:24.798 but by discipline, that is to say[br]that you have data 0:12:24.798,0:12:27.517 about history and geography,[br]physics and chemistry, 0:12:27.517,0:12:29.477 or life and Earth sciences. 0:12:29.477,0:12:32.473 On this occasion, we realized[br]that open data 0:12:32.473,0:12:34.995 can be a fantastic resource[br]for school 0:12:34.995,0:12:38.295 because it allows the development[br]of inter-disciplinary work, 0:12:38.305,0:12:41.370 and this culture of critical thought[br]about data I have mentioned. 0:12:41.916,0:12:43.877 We did not leave things at observation. 0:12:43.877,0:12:46.217 We tried to do a first experiment 0:12:46.217,0:12:48.751 and I would like to tell you [br]about the first results. 0:12:49.745,0:12:51.668 We joined with Silicon Banlieue, 0:12:51.668,0:12:54.058 which is a site dedicated[br]to data in Argenteuil, 0:12:54.058,0:12:56.089 and we proposed to do an event 0:12:56.089,0:12:58.479 with children between 8 and 14 years old 0:12:58.479,0:13:00.258 who came to the Open World Forum, 0:13:00.258,0:13:02.546 an event dedicated [br]to open computing in Paris. 0:13:02.546,0:13:05.388 There, you can see me from the back. 0:13:05.388,0:13:08.828 With the 8 to 14 year old children,[br]we worked on the question of cinema, 0:13:08.828,0:13:12.152 because that interested them,[br]and it is a simple enough subject. 0:13:12.152,0:13:13.779 First we collected data, 0:13:13.779,0:13:16.619 nothing very complicated,[br]it was just a paper form. 
0:13:16.619,0:13:20.026 We asked them how many times a month[br]they go to the cinema, 0:13:20.026,0:13:23.808 which movies they saw from a list;[br]then we compared that with data 0:13:23.808,0:13:27.368 that is available from the survey[br]of French cultural practices, 0:13:27.368,0:13:30.243 on which you have [br]exactly the same type of data. 0:13:30.243,0:13:33.873 With the children, we produced[br]an infographic at this time. 0:13:33.873,0:13:37.510 Now, I am really bad at math,[br]I got a 7.5 on the Bac, 0:13:37.510,0:13:40.590 and yet I found myself explaining[br]the concept and calculation 0:13:40.590,0:13:43.512 of averages using a spreadsheet,[br]which was rather surprising. 0:13:43.512,0:13:45.362 I explained how it works. 0:13:45.362,0:13:49.505 We emerged with an infographic[br]and we were able on this occasion, 0:13:49.505,0:13:53.135 I think that this is the important point,[br]to develop a culture of critical thought. 0:13:53.135,0:13:56.521 I explained to them about data,[br]how it is used, 0:13:56.521,0:13:58.304 how they can use it, 0:13:58.304,0:14:00.914 how it controls us in a certain way, 0:14:00.914,0:14:03.726 but that we can also take back[br]the power over data. 0:14:03.726,0:14:06.731 I assure you that with a topic[br]as attractive as cinema 0:14:06.731,0:14:08.695 we can deliver this kind of message 0:14:08.695,0:14:10.798 and have a discussion on these questions. 0:14:12.058,0:14:15.198 I hope that I have convinced you[br]that it is necessary today 0:14:15.198,0:14:17.217 to take a step back with regard to data, 0:14:17.217,0:14:20.523 to develop a culture of critical thought,[br]to understand 0:14:20.523,0:14:23.574 how it is produced [br]and how you can use it, 0:14:23.574,0:14:26.806 to prevent data from being forced on you. 0:14:26.806,0:14:29.554 So from today, [br]get your hands dirty, 0:14:29.554,0:14:32.294 find a sherpa,[br]all of the resources are online, 0:14:32.294,0:14:34.123 and go on a data expedition.
0:14:34.123,0:14:35.313 Thank you. 0:14:35.313,0:14:36.765 (Applause)