Each week I come across an article or a report asserting that data is the new oil, that the use of data will lead to a new era of knowledge, or even that it can predict the future. This has been particularly true since everyone started talking about big data: you know, the use of data on a very large scale, "mega data." Take Sergey Brin, a co-founder of Google, who is focusing on the use of medical data to cure Parkinson's disease, for which he is at risk. Or the World Cup, when many people said that the German team was able to beat the Brazilian team 7-1 thanks to the use of match data. It seems that there is no field or type of organization for which Big Data isn't supposed to be a magic wand that will solve extremely complex problems. And I must admit that I feel uneasy about these kinds of simplistic statements, which I see as overshadowing a number of issues, including the economics, the environmental cost, the politics, and the ethics of the massive production of data. Please don't think that I am skeptical about data, or that I am opposed to all forms of quantification. On the contrary, I live surrounded by data. During the day, I'm working on a thesis in sociology at Telecom ParisTech, where I study Open Data, the major effort to provide open access to public data, and its consequences for the operation of government. At night, I am an administrator of an association, Open Knowledge Foundation France, which campaigns for open knowledge and for data that benefits everyone. Today, I would like to persuade you that, at this time, when data is becoming obtrusive, we need to take a step back. This coronation of data that we are witnessing in the era of Open Data and Big Data demands a new culture of critical thought about data. We must be able to understand how it is produced and used, and how we can become independent from it.
I also want to share the results of an experiment that we did at Open Knowledge Foundation France called "the School of Data." I hope to show that, through the use of data, we can manage to develop this culture of critical thought and that we can develop new checks and balances. So, what are the problems with data? The first problem is that data is always right. Now, don't believe that this is anything new. Historically, the word "data" comes from the Latin word "datum" which, in mathematics and theology in the 15th century, referred to the facts taken as given in an argument and which were not to be called into question. Today, as you know, data refers to everything that flows through your computer. That is to say, the 1's and 0's that pass from USB stick to hard disk are considered data. Yet the sense that data is a given, that it is factual, that it is not to be questioned, has remained. The second problem with data is that we don't really know where it comes from. In general, when someone uses data, he or she has very little information about the way in which it was produced. At best, you will have access to metadata, that is to say, data about the data, which will tell you the contents of the file and, occasionally, how the data was produced. However, that data has a long history. It was collected, processed, formatted, aggregated, run through algorithms, and visualized before reaching you. This is why sociologist Bruno Latour asserts that we should say "obtaineds" instead of "data," to accurately reflect this long history, which will constrain a number of uses. Finally, the third problem with data is that we can't really see it. Have you ever seen a data center, even if only from outside, or from the road? Do you have any idea of where your data is stored? I mean, physically, where it is stored? Do you have any idea what will happen to it in 10 years? In any case, I have no answer for these three questions.
However, even if we can't see our data, we can measure its effects. At the individual level, when Facebook changes its terms of service or modifies its algorithm, it has consequences for your private life and for the way in which you present yourself as an individual. And at the macroscopic level, the Snowden affair has shown that the massive production of data can have consequences for the sovereignty of the State or for diplomacy. This is why we must develop a culture of critical thought about data. For inspiration, I turned to a book called "Statactivism." Statactivism is a neologism proposed by researchers and artists that refers to those experiences that permit one to liberate oneself from the power of data. The fundamental premise of statactivism is that data controls us, and that it imposes itself on us like an argument from authority. The goal of statactivism is almost revolutionary. It asserts that other kinds of data must be possible. It is not necessary to be opposed to all data. Instead, we should use the power of data to propose other realities, to critique data more effectively, or to propose other measures. In short, to propose other data. There is a motif in the book which I find particularly meaningful, that of the judoka. Judoka take the strength of their opponents and turn it back on them. That is what I want to invite you to do today: think about how to use data to better analyze it. I think that, precisely at this moment in the development of Open Data, the need to develop a culture of critical thought about data is increasingly crucial. Don't get me wrong: Open Data represents an extraordinary opportunity. The volume of data is exploding and data is no longer the privilege of the powerful. Today, you can use data without asking anyone's permission. And that is a good thing, because public data is now available.
But I think that there is a risk in thinking that the simple diffusion of data will be enough to emancipate society, that individuals can emancipate themselves from the power of data just because they have access to data. There is a Canadian sociologist named Michael Gurstein who has proposed an expression that sums up a risk of Open Data, namely, "Empower the Empowered," meaning to give more power to those who already have it. That is why it's crucial to develop a culture of critical thought, to be able to understand how data is produced and used, and how you can use it to take a step back. Well, that's the theory. I would like to share with you the first results from an experiment that we did in my association, Open Knowledge Foundation France. We are part of a worldwide network dedicated to open knowledge and open data. We have groups in more than 50 countries. And the idea of our association and of this worldwide movement is that each person can benefit, can profit, from works, scientific articles, and content, in order to create, play, educate, or start up a business. Open Knowledge has a large number of projects. I'm going to talk about one project, the "School of Data." We worked together on the translation of this project, this "School of Data." The School of Data consists of online resources that are free and accessible to all, and also events. We first proposed classes. In these classes, you do not even have to know what data is, or how to use a spreadsheet, which is really the tool of choice. You will be taught about that in our class. No expertise is required; you are guided step by step in the use of data. We also use another format which is particularly educational, namely, the recipe. Recipes are just like in cooking: you have ingredients and steps. The ingredients are data and software, free software if possible, so that you can use the data.
The idea is that making a map of electoral results, or a graph of the results of the French soccer team, should be as easy as making a tarte Tatin or a béchamel sauce. You find the resources online and we walk you through the project step by step. We have also tried to develop another format for in-person sessions, which we call expeditions. An expedition is like mountain climbing: you have a guide, a "data sherpa," who accompanies you, roped together with you. There will be 10 or 20 participants who work together during a weekend or sometimes for a few hours. Our first data expedition focused on the question of air pollution in Île-de-France. I don't know if you have seen these images of Paris with black clouds of pollution. They left their mark on us, and we said to ourselves: "Well, let's dig into this set of data." The first step, when we undertook this data expedition, was to identify the available data. We realized that, on this crucial question, there is no available data that is freely reusable, that is to say, that you have the right to reuse without asking for permission. Therefore, we had to extract data from websites, reports, or even from graphics. Imagine what a chore it is to extract data that is locked in a graphic. We also realized that Airparif, the organization responsible for the production of data on air pollution in Île-de-France, does not allow you to use its data freely. One must ask permission, or pay. We were able to overcome these constraints and to conduct this expedition guided by our sherpa, Pierre. During this data expedition we broke into small groups, and each group was assigned an angle. That is one of the principles of the expeditions: you have an angle, like in journalism, and you ask yourself a question that could be the title of an article. The first group asked itself if bicycle riding had led to a decrease in air pollution in Paris.
The second group, since it was during a strike, asked itself if public transport strikes cause air pollution in Île-de-France to increase. And the third group asked if all regions are equal with regard to air pollution, or if geography and environment could have an effect and, if so, whether that could be seen in the data. The results of this expedition, I am sorry to say, are a bit disappointing. We did not find any correlation or causal connection, with nice data points, a fitted curve, or a straight line, that proves that our hypotheses are correct. We did not succeed at that, but then we worked for only four hours. What we did manage to show, on the other hand, is that on a question as crucial as air pollution it is extremely difficult to use the data and to understand how it is produced, that the simplest measurements are not accessible, and that you do not necessarily have the right to reuse them. That is just what we tried to do at this event: to develop a culture of critical thought about the way in which data is used on the question of air pollution. We also tried to develop this format of expeditions and training events with another, less expected audience: children. The question came up during an event that we did with Etalab, the government institution in charge of data.gouv.fr, the open data portal of the French government. We suggested the idea of radically different open data portals. They were fictional projects, just prototypes. One group came up with a prototype called Tada.gouv.fr. Tada.gouv.fr is a fictional portal, a bit idealistic, intended for children. The data is presented not by government department or ministry, but by school subject, that is to say that you have data about history and geography, physics and chemistry, or life and Earth sciences.
On this occasion, we realized that open data can be a fantastic resource for schools, because it allows the development of interdisciplinary work and of this culture of critical thought about data that I have mentioned. We did not stop at that observation. We tried a first experiment, and I would like to tell you about the first results. We joined with Silicon Banlieue, which is a site dedicated to data in Argenteuil, and we proposed an event with children between 8 and 14 years old who came to the Open World Forum, an event dedicated to open computing in Paris. There, you can see me from the back. With the 8- to 14-year-old children, we worked on the question of cinema, because that interested them, and it is a simple enough subject. First we collected data; nothing very complicated, it was just a paper form. We asked them how many times a month they go to the cinema and which movies they had seen from a list; then we compared that with the data available from the survey of French cultural practices, which contains exactly the same type of data. With the children, we then produced an infographic. Now, I am really bad at math, I got a 7.5 on the Bac, and yet I found myself explaining the concept and calculation of averages using a spreadsheet, which was rather surprising. I explained how it works. We came away with an infographic and, I think that this is the important point, we were able on this occasion to develop a culture of critical thought. I explained to them about data, how it is used, how they can use it, how it controls us in a certain way, but also how we can take back power over data. I assure you that with a topic as attractive as cinema we can deliver this kind of message and have a discussion on these questions.
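If you want to try the averaging step yourself outside a spreadsheet, here is a minimal sketch in Python. It is only an illustration, not what we ran in the workshop: the answers and the reference figure below are invented, and the names `visits_per_month`, `reference_average`, and `average` are hypothetical.

```python
from statistics import mean

# Hypothetical answers to "How many times a month do you go to the cinema?"
# These numbers are made up; they are not the children's actual responses.
visits_per_month = [0, 1, 1, 2, 2, 3, 4, 1]

# Hypothetical reference value standing in for a figure from a national survey.
reference_average = 1.5


def average(values):
    """Add up all the answers, then divide by how many answers there are."""
    return sum(values) / len(values)


group_average = average(visits_per_month)
difference = group_average - reference_average

print(f"Group average: {group_average:.2f} visits per month")
print(f"Difference from the reference figure: {difference:+.2f}")

# The standard library gives the same result as the hand-written version.
assert average(visits_per_month) == mean(visits_per_month)
```

Writing the average out as a sum divided by a count mirrors what the spreadsheet does behind its built-in function, which is exactly the step worth explaining before comparing a small group with a national figure.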
I hope that I have convinced you that it is necessary today to take a step back with regard to data, to develop a culture of critical thought, to understand how it is produced and how you can use it, and to prevent data from being forced on you. So starting today, get your hands dirty and find a sherpa; all of the resources are online. Go on a data expedition. Thank you. (Applause)