How statistics can be misleading - Mark Liddell
0:07 - 0:09 Statistics are persuasive.
0:09 - 0:13 So much so that people, organizations, and whole countries
0:13 - 0:18 base some of their most important decisions on organized data.
0:18 - 0:19 But there's a problem with that.
0:19 - 0:23 Any set of statistics might have something lurking inside it,
0:23 - 0:27 something that can turn the results completely upside down.
0:27 - 0:31 For example, imagine you need to choose between two hospitals
0:31 - 0:34 for an elderly relative's surgery.
0:34 - 0:36 Out of each hospital's last 1000 patients,
0:36 - 0:40 900 survived at Hospital A,
0:40 - 0:43 while only 800 survived at Hospital B.
0:43 - 0:46 So it looks like Hospital A is the better choice.
0:46 - 0:48 But before you make your decision,
0:48 - 0:51 remember that not all patients arrive at the hospital
0:51 - 0:54 with the same level of health.
0:54 - 0:57 And if we divide each hospital's last 1000 patients
0:57 - 1:01 into those who arrived in good health and those who arrived in poor health,
1:01 - 1:04 the picture starts to look very different.
1:04 - 1:08 Hospital A had only 100 patients who arrived in poor health,
1:08 - 1:10 of which 30 survived.
1:10 - 1:15 But Hospital B had 400, and they were able to save 210.
1:15 - 1:17 So Hospital B is the better choice
1:17 - 1:21 for patients who arrive at hospital in poor health,
1:21 - 1:25 with a survival rate of 52.5%.
1:25 - 1:28 And what if your relative's health is good when she arrives at the hospital?
1:28 - 1:32 Strangely enough, Hospital B is still the better choice,
1:32 - 1:36 with a survival rate of over 98%.
1:36 - 1:39 So how can Hospital A have a better overall survival rate
1:39 - 1:45 if Hospital B has better survival rates for patients in each of the two groups?
1:45 - 1:49 What we've stumbled upon is a case of Simpson's paradox,
1:49 - 1:52 where the same set of data can appear to show opposite trends
1:52 - 1:55 depending on how it's grouped.
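The hospital figures quoted in the narration can be checked with a short calculation. The following is a minimal Python sketch, not part of the lesson itself: the dictionary layout and the function name `survival_rates` are illustrative assumptions, and the good-health counts are derived by subtraction because the narration states only the totals and the poor-health group.

```python
# Figures stated in the narration: per hospital, total patients, total
# survivors, and the poor-health subgroup (patients and survivors).
hospitals = {
    "A": {"total": 1000, "survived": 900, "poor": 100, "poor_survived": 30},
    "B": {"total": 1000, "survived": 800, "poor": 400, "poor_survived": 210},
}


def survival_rates(h):
    """Return overall, poor-health, and good-health survival rates."""
    # The good-health group is not stated directly; derive it by subtraction.
    good = h["total"] - h["poor"]
    good_survived = h["survived"] - h["poor_survived"]
    return {
        "overall": h["survived"] / h["total"],
        "poor": h["poor_survived"] / h["poor"],
        "good": good_survived / good,
    }


rates = {name: survival_rates(h) for name, h in hospitals.items()}
for name, r in rates.items():
    # Print each rate as a percentage with one decimal place.
    print(name, {group: f"{value:.1%}" for group, value in r.items()})
```

Hospital B comes out ahead in both subgroups (52.5% against 30% for poor health, and 590 of 600 against 870 of 900 for good health), while Hospital A is ahead overall (90% against 80%).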
1:55 - 1:59 This often occurs when aggregated data hides a conditional variable,
1:59 - 2:01 sometimes known as a lurking variable,
2:01 - 2:07 which is a hidden additional factor that significantly influences results.
2:07 - 2:10 Here, the hidden factor is the relative proportion of patients
2:10 - 2:13 who arrive in good or poor health.
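One way to see why the patient mix matters is to write each hospital's overall survival rate as a weighted average of its two subgroup rates. The sketch below is an illustration under stated assumptions, not something from the lesson: `overall_rate` is a hypothetical helper, the good-health counts (870 of 900 and 590 of 600) are derived from the narration's totals, and the last case, Hospital B with Hospital A's patient mix, is a made-up what-if.

```python
def overall_rate(share_poor, rate_poor, rate_good):
    """Overall survival rate as a weighted mix of the two subgroup rates."""
    return share_poor * rate_poor + (1 - share_poor) * rate_good


# Subgroup rates implied by the counts in the narration.
a = overall_rate(share_poor=100 / 1000, rate_poor=30 / 100, rate_good=870 / 900)
b = overall_rate(share_poor=400 / 1000, rate_poor=210 / 400, rate_good=590 / 600)

# Hypothetical what-if: Hospital B's subgroup rates with Hospital A's
# patient mix (10% of patients arriving in poor health).
b_with_a_mix = overall_rate(
    share_poor=100 / 1000, rate_poor=210 / 400, rate_good=590 / 600
)

print(f"A: {a:.4f}  B: {b:.4f}  B with A's mix: {b_with_a_mix:.4f}")
```

With its real mix (40% poor health), Hospital B's overall rate is 0.80, below Hospital A's 0.90. Given Hospital A's mix, the same subgroup rates would yield about 0.9375, so the gap in the overall figures comes from who arrives, not from how well each group is treated.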
2:13 - 2:17 Simpson's paradox isn't just a hypothetical scenario.
2:17 - 2:19 It pops up from time to time in the real world,
2:19 - 2:22 sometimes in important contexts.
2:22 - 2:24 One study in the UK appeared to show
2:24 - 2:28 that smokers had a higher survival rate than nonsmokers
2:28 - 2:30 over a twenty-year time period.
2:30 - 2:33 That is, until dividing the participants by age group
2:33 - 2:38 showed that the nonsmokers were significantly older on average,
2:38 - 2:41 and thus, more likely to die during the trial period,
2:41 - 2:44 precisely because they were living longer in general.
2:44 - 2:47 Here, the age groups are the lurking variable,
2:47 - 2:50 and are vital to correctly interpret the data.
2:50 - 2:52 In another example,
2:52 - 2:54 an analysis of Florida's death penalty cases
2:54 - 2:58 seemed to reveal no racial disparity in sentencing
2:58 - 3:02 between black and white defendants convicted of murder.
3:02 - 3:06 But dividing the cases by the race of the victim told a different story.
3:06 - 3:08 In either situation,
3:08 - 3:11 black defendants were more likely to be sentenced to death.
3:11 - 3:15 The slightly higher overall sentencing rate for white defendants
3:15 - 3:19 was due to the fact that cases with white victims
3:19 - 3:21 were more likely to elicit a death sentence
3:21 - 3:24 than cases where the victim was black,
3:24 - 3:28 and most murders occurred between people of the same race.
3:28 - 3:31 So how do we avoid falling for the paradox?
3:31 - 3:35 Unfortunately, there's no one-size-fits-all answer.
3:35 - 3:39 Data can be grouped and divided in any number of ways,
3:39 - 3:42 and overall numbers may sometimes give a more accurate picture
3:42 - 3:47 than data divided into misleading or arbitrary categories.
3:47 - 3:52 All we can do is carefully study the actual situations the statistics describe
3:52 - 3:56 and consider whether lurking variables may be present.
3:56 - 3:59 Otherwise, we leave ourselves vulnerable to those who would use data
3:59 - 4:03 to manipulate others and promote their own agendas.
- Speaker:
- Mark Liddell
- Description:
-
View full lesson: http://ed.ted.com/lessons/how-statistics-can-be-misleading-mark-liddell
Statistics are persuasive. So much so that people, organizations, and whole countries base some of their most important decisions on organized data. But any set of statistics might have something lurking inside it that can turn the results completely upside down. Mark Liddell investigates Simpson’s paradox.
Lesson by Mark Liddell, animation by Tinmouse Animation Studio.
- Video Language:
- English
- Team:
- closed TED
- Project:
- TED-Ed
- Duration:
- 04:19